Shared Flashcard Set

Details

Title

Math 146 Chapter 3

Description

Intro to Statistics

Total Cards

Subject

Mathematics

Level

Undergraduate 1

Created

10/01/2017


Term

Arithmetic mean

Definition

This is computed by adding all of the values of the variable in the data set and dividing by the number of observations. Also known as the mean, or the average.

Term

Population arithmetic mean
(μ)

Definition

This is computed using all of the individuals in a population. It is a parameter. The average of a population.

μ = (x1+x2+...+xN)/N = (Σxi)/N

Term

Sample Arithmetic mean

Definition

This is computed using sample data. The sample mean is a statistic. Average of the sample.
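
A minimal Python sketch of both means. The scores are made up for illustration, the helper name arithmetic_mean is an assumption, and treating the first four scores as the sample is only for demonstration.

```python
def arithmetic_mean(values):
    """Add all the values and divide by the number of observations."""
    if len(values) == 0:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)

# Hypothetical exam scores for an entire (small) population.
population = [82, 77, 90, 71, 62, 68, 74, 84, 94, 88]
mu = arithmetic_mean(population)   # parameter: uses every individual

# A sample is a subset of the population; here, the first four scores.
sample = population[:4]
x_bar = arithmetic_mean(sample)    # statistic: uses sample data only

print(mu, x_bar)  # 79.0 80.0
```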

Term

Median

Definition

The value that lies in the middle of the data when arranged in ascending order - represented by M
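
A short sketch of finding M in Python. It assumes the usual convention that, when the number of observations is even, M is the mean of the two middle values; the function name is hypothetical.

```python
def median(values):
    """Return the middle value of the data once sorted in ascending order."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # odd n: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2   # even n: mean of the two middle values

print(median([5, 1, 3]))      # 3
print(median([4, 1, 3, 2]))   # 2.5
```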

Term

Resistant

Definition

A numerical summary is said to be ________ if extreme values (very large or very small) relative to the data do not affect its value substantially.
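
To illustrate the idea, the sketch below compares the mean and the median of a made-up data set before and after one value is made extreme. It uses Python's standard statistics module; the data are hypothetical.

```python
import statistics

data = [1, 2, 3, 4, 5]
with_extreme = [1, 2, 3, 4, 500]   # same data, but the largest value is now extreme

# Before the change: the mean is 3 and the median is 3.
mean_before = statistics.mean(data)
median_before = statistics.median(data)

# After the change: the mean jumps to 102, the median stays at 3.
mean_after = statistics.mean(with_extreme)
median_after = statistics.median(with_extreme)

print(mean_before, median_before, mean_after, median_after)
```

The median barely reacts to the extreme value, while the mean is pulled toward it.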

Term

Mode

Definition

The most frequent observation of a variable that occurs in a data set. There can be multiple.

Term

Bimodal

Definition

When the data set has two modes

Term

Multimodal

Definition

When the data set has 3 or more modes

Term

No mode

Definition

When no observation in a data set occurs more than once.
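
The mode-related cards above can be sketched in Python as follows. The function names and the label "one mode" are assumptions made for illustration.

```python
from collections import Counter

def modes(values):
    """Return every most-frequent observation, or [] when nothing repeats."""
    if len(values) == 0:
        raise ValueError("an empty data set has no observations")
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:
        return []   # no observation occurs more than once: no mode
    return sorted(v for v, c in counts.items() if c == highest)

def describe_modes(values):
    """Label the data set by how many modes it has."""
    k = len(modes(values))
    if k == 0:
        return "no mode"
    if k == 1:
        return "one mode"
    if k == 2:
        return "bimodal"
    return "multimodal"

print(modes([1, 1, 2, 2, 3]), describe_modes([1, 1, 2, 2, 3]))  # [1, 2] bimodal
```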

Term

Dispersion

Definition

The degree to which the data are spread out. Includes: the range, standard deviation, variance, and the interquartile range

Term

Range

Definition

The difference between the largest and the smallest data value. Represented by R.

R = largest data value - smallest data value

Term

Deviation about the mean

Definition

Population: For the ith observation, it is xi - μ
Sample: For the ith observation, it is xi - x̄

Term

Population Standard Deviation (σ)

Definition

The ___ of a variable is the square root of the sum of squared deviations about the population mean, divided by the number of observations in the population N. That is, the square root of the mean of the squared deviations about the population mean.

σ = √( Σ(xi - μ)^2 / N )

Term

Conceptual formula

Definition

Using this formula:
1. Create a table with four columns: enter pop. data in column 1, in column 2 enter the pop. mean.
2. Compute the deviation about the mean for each data value and enter the result in column 3.
3. In column 4, enter the squares of the values in Column 3.
4. Sum the entries in Column 4 and divide this result by the size of the population.
5. Determine the square root of the value found in step 4.
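
The five steps above can be sketched in Python, with one list per table column. The data set is made up and the function name is hypothetical.

```python
import math

def population_std_conceptual(data):
    """Population standard deviation via the table-based conceptual formula."""
    N = len(data)
    mu = sum(data) / N                        # column 2: the population mean
    deviations = [x - mu for x in data]       # column 3: deviations about the mean
    squared = [d ** 2 for d in deviations]    # column 4: squared deviations
    mean_square = sum(squared) / N            # step 4: sum column 4, divide by N
    return math.sqrt(mean_square)             # step 5: take the square root

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical population, mean 5
sigma = population_std_conceptual(data)
print(sigma)  # 2.0
```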

Term

Computational Formula

Definition

A formula that is equal to the population standard deviation formula:

σ = √( ( Σxi^2 - (Σxi)^2 / N ) / N )

Using this formula:
1. Create a table with two columns: enter the population data in column 1, then square each value in column 1 and enter the result in column 2.
2. Sum the entries in column 1 and sum the entries in column 2.
3. Substitute these values into the computational formula and simplify.
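
A sketch of these steps in Python, assuming the computational formula has the standard form σ = √( ( Σxi^2 - (Σxi)^2 / N ) / N ). The data set is made up and the function name is hypothetical.

```python
import math

def population_std_computational(data):
    """Population standard deviation from the two column sums only."""
    N = len(data)
    sum_x = sum(data)                        # sum of column 1
    sum_x_sq = sum(x ** 2 for x in data)     # sum of column 2 (squared values)
    return math.sqrt((sum_x_sq - sum_x ** 2 / N) / N)

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical population
# Column sums: Σx = 40 and Σx^2 = 232, so σ = √((232 - 1600/8) / 8) = √4
sigma = population_std_computational(data)
print(sigma)  # 2.0
```

For the same data, this agrees with the result of the conceptual (deviation-based) approach.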

Term

Sample standard deviation (s)

Definition

The ____ of a variable is the square root of the sum of squared deviations about the sample mean, divided by n-1, where n is the sample size.

s = √( Σ(xi - x̄)^2 / (n - 1) )
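
A minimal Python sketch of the sample standard deviation, dividing by n - 1 rather than n. The sample values are made up; the comparison with statistics.stdev from the standard library is included only as a cross-check.

```python
import math
import statistics

def sample_std(sample):
    """Square root of the summed squared deviations about x-bar, over n - 1."""
    n = len(sample)
    if n < 2:
        raise ValueError("at least two observations are needed")
    x_bar = sum(sample) / n
    sum_sq_dev = sum((x - x_bar) ** 2 for x in sample)
    return math.sqrt(sum_sq_dev / (n - 1))

sample = [1, 2, 3, 4, 5]   # hypothetical sample: x-bar = 3, squared deviations sum to 10
s = sample_std(sample)     # sqrt(10 / 4) = sqrt(2.5)
print(s, statistics.stdev(sample))
```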

Term

Degrees of freedom

Definition

(n-1) because the first n-1 observations have the freedom to be whatever value they wish, but the nth value has no freedom. It must be whatever value forces the sum of the deviations about the mean to equal zero.

In other words, we have n-1 degrees of freedom in the computation of s because an unknown parameter, μ, is estimated with x̄. For each parameter estimated we lose 1 degree of freedom.
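
The sketch below illustrates why the last deviation has no freedom: once the first n-1 deviations about the sample mean are known, the nth is forced. The sample is made up for illustration.

```python
sample = [4, 8, 6, 2]                 # hypothetical sample, n = 4
x_bar = sum(sample) / len(sample)     # 5.0

deviations = [x - x_bar for x in sample]   # [-1.0, 3.0, 1.0, -3.0]
total = sum(deviations)                    # deviations about the mean sum to zero

# Knowing only the first n - 1 deviations, the last one is determined:
free_part = deviations[:-1]
forced_last = -sum(free_part)              # must make the total equal zero

print(total, forced_last, deviations[-1])  # 0.0 -3.0 -3.0
```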

Term

The larger the standard deviation, the more dispersion that distribution has

Definition

When comparing two populations, __________, provided that the populations use the same units of measure. You want to compare apples with apples.

Term

Variance

Definition

The ___ of a variable is the square of the standard deviation.

Term

Population variance

Definition

σ^2

Term

Sample variance

Definition

s^2

Term

Biased statistic

Definition

This is used to describe a statistic when it consistently under or overestimates a parameter.

Term

Empirical rule

Definition

If the data have a distribution that is bell shaped, then this rule can be used to determine the percentage of data that will lie within k standard deviations of the mean.

If the distribution is roughly bell shaped, then:
- Appx. 68% of data will lie within 1 standard deviation of the mean. Meaning appx. 68% of data will lie between μ-1σ and μ+1σ
- Appx. 95% of the data will lie within 2 standard deviations of the mean, between μ-2σ and μ+2σ
- Appx. 99.7% of the data will lie within 3 standard deviations of the mean, between μ-3σ and μ+3σ

When the distribution is bell shaped, this rule gives more precise results than Chebyshev’s Inequality.
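
A small Python sketch that turns the rule into intervals. The mean of 100 and standard deviation of 15 are made-up values, and the function name is hypothetical.

```python
def empirical_rule_intervals(mu, sigma):
    """Map k = 1, 2, 3 to the interval (mu - k*sigma, mu + k*sigma)."""
    return {k: (mu - k * sigma, mu + k * sigma) for k in (1, 2, 3)}

# Approximate percentages stated by the rule for a bell-shaped distribution.
approx_percent = {1: 68, 2: 95, 3: 99.7}

intervals = empirical_rule_intervals(100, 15)
for k, (low, high) in intervals.items():
    print(f"about {approx_percent[k]}% of the data lie between {low} and {high}")
```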

Term

Chebyshev’s Inequality

Definition

An inequality that determines a minimum percentage of observations that lie within k standard deviations of the mean, where k>1 regardless of the basic shape of the distribution (skewed left, skewed right, or symmetric).

- For any data set or distribution, at least (1 - 1/k^2) x 100% of the observations lie within k standard deviations of the mean, where k is any number greater than 1. That is, at least (1 - 1/k^2) x 100% of the observations lie between μ-kσ and μ+kσ for k>1.

- Can also be used based on sample data
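
A minimal Python sketch of the bound, using the standard form (1 - 1/k^2) x 100%. The function name is hypothetical.

```python
def chebyshev_min_percent(k):
    """Minimum percentage of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's Inequality requires k > 1")
    return (1 - 1 / k ** 2) * 100

print(chebyshev_min_percent(2))   # 75.0
print(chebyshev_min_percent(3))   # roughly 88.9
```

For k = 2 the guarantee is at least 75%, which holds for any shape of distribution and is lower than the roughly 95% stated for bell-shaped data.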

Term

Grouped data

Definition

Data that has been summarized in frequency distributions.

Term

Weighted mean

Definition

This is found by multiplying each value of the variable by its corresponding weight, adding these products, and dividing this sum by the sum of its weights. It can be expressed using the formula:

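x̄w = (w1x1 + w2x2 + ... + wnxn) / (w1 + w2 + ... + wn) = (Σwixi) / (Σwi)

Where wi is the weight of the ith observation xi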
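
A short Python sketch of the weighted mean. The grade points and credit hours are made up, and the function name is hypothetical.

```python
def weighted_mean(values, weights):
    """Sum of value * weight, divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("each value needs exactly one weight")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(x * w for x, w in zip(values, weights)) / total_weight

grade_points = [4, 3, 2]   # hypothetical course grades
credits = [3, 4, 3]        # hypothetical weights (credit hours)
gpa = weighted_mean(grade_points, credits)   # (12 + 12 + 6) / 10
print(gpa)  # 3.0
```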

Term

Approximate Standard Deviation of a Variable from a Frequency Distribution

Definition

Population standard deviation - σ = √ ((Σ((xi - μ)^2)fi) / (Σfi))
Sample standard deviation - s = √ ((Σ((xi - x̄)^2)fi) / ((Σfi) - 1))

Where xi is the midpoint or value of the ith class, fi is the frequency of the ith class
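
A Python sketch of both approximations. It assumes the mean is itself approximated from the frequency distribution as (Σxifi) / (Σfi); the class midpoints and frequencies are made up, and the function name is hypothetical.

```python
import math

def grouped_std(midpoints, frequencies, sample=False):
    """Approximate standard deviation from class midpoints and frequencies."""
    total = sum(frequencies)                                        # Σfi
    mean = sum(x * f for x, f in zip(midpoints, frequencies)) / total
    weighted_sq_dev = sum((x - mean) ** 2 * f
                          for x, f in zip(midpoints, frequencies))
    denominator = total - 1 if sample else total                    # (Σfi) - 1 for a sample
    return math.sqrt(weighted_sq_dev / denominator)

midpoints = [5, 15, 25]     # hypothetical class midpoints
frequencies = [2, 5, 3]     # hypothetical class frequencies

sigma = grouped_std(midpoints, frequencies)            # treat the data as a population
s = grouped_std(midpoints, frequencies, sample=True)   # treat the data as a sample
print(sigma, s)
```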

Term

Z-score

Definition

Represents the distance that a data value is from the mean in terms of the number of standard deviations. We find it by subtracting the mean from the data value and dividing this result by the standard deviation. There is both a population ___ and a sample ___:

Population: z = (x - μ) / σ
Sample: z = (x - x̄) / s
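
A minimal Python sketch. The same arithmetic applies to the population and sample versions; only the mean and standard deviation passed in differ. The numbers are made up and the function name is hypothetical.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    if std_dev <= 0:
        raise ValueError("the standard deviation must be positive")
    return (x - mean) / std_dev

# Hypothetical values: mean 100, standard deviation 15.
print(z_score(130, 100, 15))   # 2.0  (two standard deviations above the mean)
print(z_score(85, 100, 15))    # -1.0 (one standard deviation below the mean)
```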

Term

kth Percentile

Definition

The kth percentile, denoted Pk, of a set of data is a value such that k percent of the observations are less than or equal to the value

- Percentiles divide a set of data written in ascending order into 100 parts, so 99 percentiles can be determined
- Used to give the relative standing of an observation

Term

Quartiles

Definition

Divide data sets into fourths, or four equal parts. There are three quartiles: Q1, Q2, and Q3.

Q1 - the first quartile, divides the bottom 25% from the top 75%; this is equivalent to the 25th percentile

Q2 - the second quartile, divides the bottom 50% of the data from the top 50%; equivalent to the 50th percentile or the median

Q3 - the third quartile, divides the bottom 75% of the data from the top 25%; equivalent to the 75th percentile

Term

Interquartile Range (IQR)

Definition

The range of the middle 50% of the observations in a data set. The IQR is the difference between the third and first quartiles and is found using the formula: IQR = Q3 - Q1
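
A Python sketch of the quartiles and the IQR. It assumes one common classroom method: Q1 is the median of the bottom half of the sorted data and Q3 is the median of the top half, with the overall median left out of both halves when the number of observations is odd. Software packages may use other conventions and give slightly different quartiles. The data and function name are made up.

```python
import statistics

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("at least two observations are needed")
    half = n // 2
    lower = ordered[:half]     # bottom half
    upper = ordered[-half:]    # top half (skips the median when n is odd)
    return (statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper))

data = [1, 3, 5, 7, 9, 11, 13, 15]   # hypothetical data
q1, q2, q3 = quartiles(data)
iqr = q3 - q1
print(q1, q2, q3, iqr)  # 4.0 8.0 12.0 8.0
```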

Term

Describe the distribution

Definition

___ means to describe a distribution's shape (skewed left, skewed right, or symmetric), its center (mean or median), and its spread (standard deviation or interquartile range).

Term

Outliers

Definition

Extreme observations in the data set; can occur by chance, or error.

Checking for ____:
1. Determine the first and third quartiles of the data
2. Compute the interquartile range
3. Determine the fences.
4. If a data value is less than the lower fence or greater than the upper fence, it is considered an outlier.

Term

Fences

Definition

Serve as cutoff points for determining outliers
Lower ___ = Q1 - 1.5(IQR)
Upper ___ = Q3 + 1.5(IQR)
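
A short Python sketch of the fences and the outlier check. The quartiles are passed in directly, so no particular quartile method is assumed; the data, the quartile values, and the function names are made up for illustration.

```python
def fences(q1, q3):
    """Return (lower fence, upper fence) from the first and third quartiles."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(values, q1, q3):
    """Values below the lower fence or above the upper fence."""
    lower, upper = fences(q1, q3)
    return [x for x in values if x < lower or x > upper]

data = [1, 3, 5, 7, 9, 11, 13, 40]   # hypothetical data
q1, q3 = 4.0, 12.0                   # hypothetical quartiles for this data

print(fences(q1, q3))                # (-8.0, 24.0)
print(find_outliers(data, q1, q3))   # [40]
```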

Term

Exploratory Data Analysis

Definition

Exploring the data to see if they contain interesting information that may be useful in our research; goal is to collect and present evidence NOT to make conclusions.

Term

Five number summary

Definition

This consists of the smallest data value of a set, Q1, the median, Q3, and the largest data value of the set. Organized as so:
MINIMUM Q1 M Q3 MAXIMUM
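
A Python sketch of the summary, ordered as minimum, Q1, M, Q3, maximum. It assumes Q1 and Q3 are the medians of the bottom and top halves of the sorted data, with the overall median excluded from both halves when the number of observations is odd. The data and function name are made up.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, M, Q3, maximum)."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("at least two observations are needed")
    half = n // 2
    q1 = statistics.median(ordered[:half])    # median of the bottom half
    m = statistics.median(ordered)            # overall median
    q3 = statistics.median(ordered[-half:])   # median of the top half
    return ordered[0], q1, m, q3, ordered[-1]

data = [1, 3, 5, 7, 9, 11, 13, 40]   # hypothetical data
summary = five_number_summary(data)
print(summary)  # (1, 4.0, 8.0, 12.0, 40)
```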

Term

Boxplot

Definition

A graph that is made using the five-number summary.

1. Determine the lower and upper fences
2. Draw a number line long enough to include the maximum and minimum values. Insert vertical lines at Q1, M, and Q3. Enclose these vertical lines in a box.
3. Label the lower and upper fences.
4. Draw a line from Q1 to the smallest data value that is larger than the lower fence. Draw a line from Q3 to the largest data value that is smaller than the upper fence. These lines are called whiskers.
5. Any data values less than the lower fence or greater than the upper fence are outliers and are marked with an asterisk.

Term

Whiskers

Definition

Lines on the outside of the box plot, that display the distance from the outer quartiles to the outer data values.
