Term
|
Definition
the objects described by a set of data. Individuals may be people, animals, or things |
|
|
Term
|
Definition
Any characteristic of an individual. A variable can take different values for different individuals |
|
|
Term
|
Definition
Places an individual into one of several groups or categories |
|
|
Term
|
Definition
Takes numerical values for which it makes sense to find an average |
|
|
Term
|
Definition
the pattern of variation of a variable. It tells us what values the variable takes and how often it takes these values |
|
|
Term
|
Definition
Drawing conclusions that go beyond the data at hand |
|
|
Term
|
Definition
Displays the count (frequency) of observations in each category |
|
|
Term
|
Definition
Shows the percent of observations in each category |
|
|
Term
|
Definition
The exact percents would add to 100, but the rounded percents only come close |
|
|
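The counts, percents, and roundoff error described in the three cards above can be sketched in a few lines. This is a minimal illustration, not a standard recipe: the function names and the three-item data set are made up for the example.

```python
from collections import Counter


def frequency_table(values):
    """Count how many observations fall in each category."""
    return dict(Counter(values))


def relative_frequency_table(values, ndigits=1):
    """Percent of observations in each category, rounded to ndigits places."""
    counts = Counter(values)
    total = sum(counts.values())
    return {cat: round(100 * n / total, ndigits) for cat, n in counts.items()}


# Hypothetical data: one observation in each of three categories.
observations = ["rock", "news", "country"]

counts = frequency_table(observations)
percents = relative_frequency_table(observations)

print(counts)
print(percents)
# The exact percents (33.33...) add to 100; the rounded ones only come close.
print(sum(percents.values()))
```

With equal counts in three categories, each rounded percent is 33.3, so the rounded column falls slightly short of 100.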
Term
|
Definition
Display the distribution of a categorical variable more vividly |
|
|
Term
|
Definition
it describes two categorical variables |
|
|
Term
|
Definition
For one of the categorical variables in a two-way table of counts, the distribution of values of that variable among all individuals described by the table |
|
|
Term
|
Definition
describes the values of that variable among individuals who have a specific value of another variable. There is a separate conditional distribution for each value of the other variable |
|
|
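As a rough sketch of the two cards above, the following computes one distribution over all individuals in a two-way table and one distribution per row. The table, its labels, and the function names are hypothetical, chosen only so the arithmetic is easy to follow.

```python
# Hypothetical two-way table of counts: rows are values of one categorical
# variable, columns are values of the other.
table = {
    "group_a": {"yes": 30, "no": 20},
    "group_b": {"yes": 10, "no": 40},
}


def marginal_distribution(table):
    """Percent of ALL individuals in the table that fall in each column value."""
    grand_total = sum(sum(row.values()) for row in table.values())
    totals = {}
    for row in table.values():
        for col, n in row.items():
            totals[col] = totals.get(col, 0) + n
    return {col: 100 * n / grand_total for col, n in totals.items()}


def conditional_distributions(table):
    """Percent in each column value, computed separately within each row."""
    result = {}
    for row_name, row in table.items():
        row_total = sum(row.values())
        result[row_name] = {col: 100 * n / row_total for col, n in row.items()}
    return result


marginal = marginal_distribution(table)
conditional = conditional_distributions(table)
print(marginal)      # distribution among all 100 individuals
print(conditional)   # one distribution per row value
```

The conditional distributions differ between the two rows here, which is the kind of pattern the association card refers to.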
Term
|
Definition
We say that there is an association between two variables if specific values of one variable tend to occur in common with specific values of the other. |
|
|
Term
|
Definition
An association between two variables that holds for each individual value of a third variable can be changed or even reversed when the data for all values of the third variable are combined. |
|
|
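The reversal described in the card above is easier to believe with numbers. The sketch below uses made-up counts (not real data) for two options, A and B, within two levels of a third variable; all names are illustrative.

```python
# Made-up (successes, trials) counts for options A and B within each value
# of a third variable.
data = {
    "level_1": {"A": (90, 100), "B": (800, 1000)},
    "level_2": {"A": (300, 1000), "B": (20, 100)},
}


def rate(successes, trials):
    """Success rate as a proportion."""
    return successes / trials


def combined_rate(data, option):
    """Success rate for one option after pooling all levels together."""
    successes = sum(group[option][0] for group in data.values())
    trials = sum(group[option][1] for group in data.values())
    return successes / trials


# Within each level, A has the higher success rate...
for level, group in data.items():
    print(level, rate(*group["A"]), rate(*group["B"]))

# ...but when the levels are combined, B has the higher rate.
print("combined", combined_rate(data, "A"), combined_rate(data, "B"))
```

The reversal happens because most of A's trials sit in the level where success is rare for both options, while most of B's trials sit in the level where success is common.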
Term
|
Definition
you can describe the overall pattern of a distribution by its shape, center and spread |
|
|
Term
|
Definition
An important kind of departure is an outlier, an individual value that falls outside the overall pattern |
|
|
Term
|
Definition
Symmetric: the right and left sides of the graph are approximately mirror images. Skewed: one side extends much farther out than the other |
|
|
Term
|
Definition
Has a single peak; minor ups and downs in a graph, like small "bumps", do not count as separate peaks |
|
|
Term
|
Definition
|
|
Term
|
Definition
More than two clear peaks |
|
|
Term
|
Definition
give us a quick picture of the shape of a distribution while including the actual numerical values in the graph. |
|
|
Term
|
Definition
The most common graph of the distribution of one quantitative variable |
|
|
Term
|
Definition
adding the values together and dividing by the number of observations |
|
|
Term
|
Definition
is the midpoint of a distribution, the number such that half the observations are smaller and the other half are larger. |
|
|
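The two measures of center in the cards above can be written out directly. This is a small sketch with hypothetical data; for an even number of observations it takes the midpoint of the two middle values, which is a common convention rather than something stated on the card.

```python
def mean(values):
    """Add the values together and divide by the number of observations."""
    return sum(values) / len(values)


def median(values):
    """Midpoint of the sorted values.

    Assumption: with an even count, average the two middle values.
    """
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical data with one unusually large value.
data = [1, 2, 3, 4, 100]
print(mean(data))    # pulled upward by the large value
print(median(data))  # half the observations are smaller, half are larger
```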
Term
The 1.5 X IQR rule for outliers |
|
Definition
Call an observation an outlier if it falls more than 1.5 x IQR above the third quartile or below the first quartile |
|
|
Term
|
Definition
The five-number summary of a distribution consists of the smallest observation, the first quartile, the median, the third quartile, and the largest observation, written in order from smallest to largest. In symbols, the five-number summary is
Minimum, Q1, M, Q3, Maximum |
|
|
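A possible sketch of the five-number summary and the 1.5 x IQR rule follows. The cards do not say how quartiles are computed, so this assumes one common textbook convention: Q1 and Q3 are the medians of the lower and upper halves of the sorted data, leaving out the overall median when the count is odd. Other conventions give slightly different quartiles. The data and function names are hypothetical.

```python
def median(values):
    """Midpoint of the values; averages the two middle values if n is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def five_number_summary(values):
    """Return (minimum, Q1, M, Q3, maximum). Needs at least two observations.

    Assumption: quartiles are medians of the lower and upper halves,
    excluding the overall median when the number of observations is odd.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])


def outliers(values):
    """Observations more than 1.5 x IQR above Q3 or below Q1."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in values if x < low_fence or x > high_fence]


# Hypothetical data.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]
print(five_number_summary(data))
print(outliers(data))
```

For this data set Q1 = 3 and Q3 = 8, so the IQR is 5 and anything above 15.5 or below -4.5 is flagged.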
Term
|
Definition
measures the average distance of the observations from their mean. It is calculated by finding an average of the squared distances and then taking the square root. |
|
|
Term
|
Definition
The average squared distance of the observations from their mean |
|
|
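The two cards above describe a two-step calculation: average the squared distances from the mean, then take the square root. A minimal sketch is below. One assumption to flag: the "average" here divides by n - 1 rather than n, which is the usual convention for a sample; the cards themselves do not say which divisor is meant. The data are hypothetical.

```python
import math


def variance(values):
    """Average squared distance from the mean.

    Assumption: divides by n - 1 (sample convention), so n must be >= 2.
    """
    n = len(values)
    center = sum(values) / n
    return sum((x - center) ** 2 for x in values) / (n - 1)


def standard_deviation(values):
    """Square root of the variance."""
    return math.sqrt(variance(values))


# Hypothetical data with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(variance(data))
print(standard_deviation(data))
```

The squared distances for this data set sum to 32, so the variance is 32 / 7.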
Term
|
Definition
The value in a distribution with p percent of the observations less than it |
|
|
Term
Standardized value (z-score) |
|
Definition
If x is an observation from a distribution that has known mean and standard deviation, the standardized value of x is
z = (x - mean) / standard deviation
A standardized value is often called a z-score |
|
|
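Both ways of locating a value in a distribution from the last two cards can be sketched briefly. The function names and scores are hypothetical; the percentile function follows the "percent of observations less than it" wording, so ties with the value itself are not counted.

```python
def percentile_of(value, values):
    """Percent of observations that are strictly less than the given value."""
    below = sum(1 for x in values if x < value)
    return 100 * below / len(values)


def z_score(x, mean, standard_deviation):
    """Standardized value: distance from the mean in standard deviations."""
    return (x - mean) / standard_deviation


# Hypothetical scores.
scores = [50, 60, 70, 80, 90]
print(percentile_of(80, scores))  # three of the five scores are below 80

# Hypothetical distribution with known mean 70 and standard deviation 5.
print(z_score(80, 70, 5))
```

Note the parentheses in `(x - mean) / standard_deviation`: the subtraction happens before the division.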
Term
|
Definition
is a curve that is always on or above the horizontal axis, and has area exactly 1 underneath it. It describes the overall pattern of a distribution. |
|
|
Term
Normal distribution and normal curve |
|
Definition
is described by a Normal density curve. Any particular Normal distribution is completely specified by two numbers: its mean and standard deviation. The mean of a Normal distribution is at the center of the symmetric Normal curve. |
|
|
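To make the density-curve and Normal-curve cards concrete, the sketch below evaluates the standard formula for a Normal density and numerically checks that the area under it is very close to 1. The trapezoid-rule helper, the integration range of 8 standard deviations on each side, and the mean and standard deviation used are all choices made for this illustration.

```python
import math


def normal_density(x, mean=0.0, sd=1.0):
    """Height of the Normal density curve with the given mean and sd at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def area_under(f, a, b, steps=10_000):
    """Trapezoid-rule approximation of the area under f between a and b."""
    width = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * width)
    return total * width


# Hypothetical Normal distribution: mean 64.5, standard deviation 2.5.
mean, sd = 64.5, 2.5


def curve(x):
    return normal_density(x, mean, sd)


# Nearly all of the area lies within 8 standard deviations of the mean.
area = area_under(curve, mean - 8 * sd, mean + 8 * sd)
print(area)

# The curve is symmetric about the mean and highest there.
print(curve(mean - 1), curve(mean), curve(mean + 1))
```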