Term
Resistant Measure |
|
Definition
a measure that resists the influence of extreme observations,
e.g. the median |
|
|
Term
Median |
|
Definition
midpoint of a distribution, i.e. the number such that half the observations are smaller and the other half are larger; in the ordered list of n observations it sits at position (n+1)/2 |
|
|
Term
Quartiles |
|
Definition
1st quartile (Q1) is larger than 25% of the observations; 2nd quartile (Q2) = median; 3rd quartile (Q3) is larger than 75% of the observations |
|
|
Term
Quartiles (Freund/Perles) |
|
Definition
the lower quartile (Q1) is the ¼(n+3)th observation
the second quartile (median) is the ½(n+1)th observation
the upper quartile (Q3) is the ¼(3n+1)th observation |
|
|
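A minimal Python sketch of the Freund/Perles positions above. The helper names and the sample data are made up for illustration, and the linear interpolation used when a position is not a whole number is an assumption about how fractional positions are handled.

```python
def value_at_position(sorted_values, position):
    """Value at a 1-based, possibly fractional, position (linear interpolation)."""
    lower = int(position)            # whole part of the position
    fraction = position - lower      # fractional part of the position
    value = sorted_values[lower - 1]
    if fraction > 0:
        # Move part of the way toward the next ordered observation.
        value += fraction * (sorted_values[lower] - sorted_values[lower - 1])
    return value


def freund_perles_quartiles(values):
    """Return (Q1, median, Q3) using positions (n+3)/4, (n+1)/2, (3n+1)/4."""
    data = sorted(values)
    n = len(data)
    q1 = value_at_position(data, (n + 3) / 4)
    q2 = value_at_position(data, (n + 1) / 2)
    q3 = value_at_position(data, (3 * n + 1) / 4)
    return q1, q2, q3


scores = [2, 4, 4, 5, 7, 9, 10, 12]      # hypothetical data, n = 8
print(freund_perles_quartiles(scores))   # (4.0, 6.0, 9.25)
```

With n = 8 the positions are 2.75, 4.5 and 6.25, so each quartile falls between two ordered observations.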
Term
Choosing a Summary (center/spread) |
|
Definition
The five-number summary is usually better than the mean and standard deviation for describing a skewed distribution or one with strong outliers |
|
|
Term
Density Curve |
|
Definition
A curve that has area exactly 1 underneath it. The area under the curve and above any range of values is the proportion of values that fall in that range |
|
|
Term
Mean of skewed distribution |
|
Definition
The mean of a skewed distribution is pulled toward the long tail |
|
|
Term
Normal Curve/Distribution |
|
Definition
Symmetric, single-peaked, and bell-shaped |
|
|
Term
68-95-99.7 Rule |
|
Definition
In a normal distribution, about 68% of values fall within 1 std dev of the mean; about 95% fall within 2 std dev of the mean; about 99.7% fall within 3 std dev of the mean |
|
|
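A quick check of these percentages, as a sketch. It relies on the standard fact that the proportion of a normal distribution within k standard deviations of the mean is erf(k / sqrt(2)); the function name is made up for illustration.

```python
import math


def normal_proportion_within(k):
    """Proportion of a normal distribution within k std dev of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(k, round(100 * normal_proportion_within(k), 1))
# 1 68.3
# 2 95.4
# 3 99.7
```

The rule's 68, 95 and 99.7 are rounded versions of these values.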
Term
Standardizing |
|
Definition
subtract the mean of the distribution from the value and divide by the standard deviation: z = (x - mean)/(std dev); the result is the z-score |
|
|
Term
z-score (Standardized Value) |
|
Definition
tells us how many standard deviations the original value falls away from the mean and in what direction |
|
|
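A small sketch of standardizing in Python. The function name and the numbers (a distribution with mean 64 and standard deviation 3) are hypothetical, chosen only to show the sign of the result.

```python
def z_score(value, mean, std_dev):
    """Standardize a value: how many std dev it lies from the mean."""
    return (value - mean) / std_dev


print(z_score(70, 64, 3))   # 2.0  -> two std dev above the mean
print(z_score(61, 64, 3))   # -1.0 -> one std dev below the mean
```

A positive z-score means the value is above the mean; a negative one means it is below.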
Term
Standard Normal Distribution |
|
Definition
The normal distribution with mean 0 and standard deviation 1 |
|
|
Term
Behavior of Mean of Skewed Distribution |
|
Definition
Mean moves farther toward long tail for a skewed curve |
|
|
Term
Five-Number Summary |
|
Definition
minimum, Q1, Q2 (median), Q3, maximum |
|
|
Term
Standard Deviation (s) |
|
Definition
s is zero when there is no spread and gets larger as spread increases |
|
|
Term
|
Definition
|
|
Term
Variance (s^2) |
|
Definition
sum of the squared deviations of the observations from their mean, divided by the degrees of freedom (i.e. n-1) |
|
|
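The definition above, sketched in Python with made-up data. The function name is illustrative; the standard deviation s is the square root of this quantity, which is why s is zero only when every observation is the same.

```python
import math


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


data = [2, 4, 4, 4, 5, 5, 7, 9]          # hypothetical observations
variance = sample_variance(data)          # 32 / 7
std_dev = math.sqrt(variance)
print(round(variance, 3), round(std_dev, 3))

print(sample_variance([3, 3, 3]))         # 0.0 -> no spread at all
```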
Term
Interquartile Range (IQR) |
|
Definition
Q3 - Q1 (a suspected outlier lies more than 1.5 × IQR above Q3 or below Q1) |
|
|
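A sketch of the 1.5 × IQR rule in Python. Textbooks differ on how quartiles are computed; this example assumes the `inclusive` method of the standard library's `statistics.quantiles`, and the data and function name are made up.

```python
import statistics


def iqr_outliers(values):
    """Return the observations flagged by the 1.5 x IQR rule."""
    # quantiles(..., n=4) returns the three cut points Q1, median, Q3.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [x for x in values if x < low_fence or x > high_fence]


data = [1, 2, 3, 4, 5, 6, 7, 8, 50]   # hypothetical data with one extreme value
print(iqr_outliers(data))             # [50]
```

Here Q1 = 3 and Q3 = 7, so IQR = 4 and the fences are -3 and 13.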
Term
Response Variable |
|
Definition
Measures outcome of a study |
|
|
Term
Explanatory Variable |
|
Definition
explains or influences changes in a response variable |
|
|
Term
Scatterplot |
|
Definition
Plot explanatory variable on x-axis and response variable on the y-axis |
|
|
Term
Positive Association |
|
Definition
when above-average values of one variable tend to accompany above-average values of the other, and below-average values also tend to occur together |
|
|
Term
Negative Association |
|
Definition
when above-average values of one variable tend to accompany below-average values of the other, and vice versa |
|
|
Term
Linear Relationship |
|
Definition
when the points in a scatterplot lie in a straight-line pattern |
|
|
Term
Correlation (r) |
|
Definition
the sum, over all observations, of the standardized x value times the standardized y value, divided by n-1: r = (1/(n-1)) * sum of [(x - xbar)/sx] * [(y - ybar)/sy] |
|
|
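The correlation formula, sketched in Python with made-up paired data. The function name is illustrative; `statistics.stdev` is the sample standard deviation (n-1 in the denominator), matching sx and sy.

```python
import statistics


def correlation(xs, ys):
    """r = 1/(n-1) * sum of (standardized x) * (standardized y)."""
    n = len(xs)
    x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
    s_x, s_y = statistics.stdev(xs), statistics.stdev(ys)
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return total / (n - 1)


x = [1, 2, 3, 4, 5]          # hypothetical explanatory values
y = [2, 4, 5, 4, 5]          # hypothetical response values
print(round(correlation(x, y), 3))                       # 0.775
print(round(correlation(y, x), 3))                       # same r with x and y swapped
print(round(correlation([100 * v for v in x], y), 3))    # same r after a change of units
```

Swapping x and y or rescaling either variable leaves r unchanged, because only standardized values enter the sum.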
Term
|
Definition
Correlation makes no distinction between x and y |
|
|
Term
|
Definition
Because r uses standardized values, r does not change when the units of measurement of x, y, or both are changed |
|
|
Term
|
Definition
Positive r indicates positive association and negative r indicates negative association |
|
|
Term
|
Definition
r is always between -1 and 1, and the strength of the relationship increases as r moves away from 0 in either direction (when r = ±1 the points lie exactly on a straight line) |
|
|
Term
|
Definition
correlation measures the strength of linear relationships only, not curved ones |
|
|
Term
|
Definition
correlation is not resistant, i.e. it is strongly affected by outliers |
|
|
Term
Regression Line |
|
Definition
a straight line that describes how a response variable changes as an explanatory variable changes |
|
|
Term
Least-squares regression line |
|
Definition
the line that makes the sum of the squares of the vertical distances of the data points from the line as small as possible; slope b = r*(sy/sx), intercept a = ybar - b*xbar |
|
|
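A minimal sketch of the slope and intercept formulas in Python. The data and the function name are made up; r is computed inline so the example stands on its own.

```python
import statistics


def least_squares_line(xs, ys):
    """Return (slope, intercept) using b = r * sy / sx and a = ybar - b * xbar."""
    n = len(xs)
    x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
    s_x, s_y = statistics.stdev(xs), statistics.stdev(ys)
    # Correlation r from deviations and the sample standard deviations.
    r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)
    slope = r * s_y / s_x
    intercept = y_bar - slope * x_bar
    return slope, intercept


x = [1, 2, 3, 4, 5]          # hypothetical explanatory values
y = [2, 4, 5, 4, 5]          # hypothetical response values
b, a = least_squares_line(x, y)
print(round(b, 3), round(a, 3))     # 0.6 2.2
print(round(a + b * 3, 3))          # 4.0, the predicted y at x = 3
```

The fitted line always passes through the point (xbar, ybar), which is (3, 4) here.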
Term
|
Definition
along the regression line, a change of one std dev in x corresponds to a change of r std dev in y; in other words, as the correlation grows less strong, the prediction moves less in response to changes in x |
|
|
Term
r^2 |
|
Definition
the fraction of the variation in the values of y that is explained by the least-squares regression of y on x; equal to the square of the correlation r |
|
|
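One way to see this fraction is to divide the variation of the predicted values by the variation of the observed values. The sketch below does that in Python with made-up data; the function name is illustrative, and the slope is computed as (sum of products of deviations) / (sum of squared x deviations), which is algebraically the same as r*(sy/sx).

```python
def r_squared(xs, ys):
    """Fraction of the variation in y explained by the least-squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    intercept = y_bar - slope * x_bar
    predicted = [intercept + slope * x for x in xs]
    explained = sum((p - y_bar) ** 2 for p in predicted)   # variation along the line
    total = sum((y - y_bar) ** 2 for y in ys)              # total variation in y
    return explained / total


x = [1, 2, 3, 4, 5]          # hypothetical explanatory values
y = [2, 4, 5, 4, 5]          # hypothetical response values
print(round(r_squared(x, y), 3))    # 0.6
```

For these data r is about 0.775, and 0.775 squared is about 0.6, so 60% of the variation in y is explained by the line.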
Term
Residual |
|
Definition
The difference between an observed value of the response variable and the value predicted by the regression line: residual = observed y - predicted y |
|
|
Term
Mean of least-squares residuals |
|
Definition
The mean of the least-squares residuals is always zero |
|
|
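A short sketch that computes residuals for made-up data and checks that they sum to zero, up to floating-point rounding. Function names are illustrative; the slope formula used here is equivalent to r*(sy/sx).

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return slope, y_bar - slope * x_bar


def residuals(xs, ys):
    """residual = observed y - predicted y, for each observation."""
    slope, intercept = fit_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


x = [1, 2, 3, 4, 5]          # hypothetical explanatory values
y = [2, 4, 5, 4, 5]          # hypothetical response values
res = residuals(x, y)
print([round(e, 2) for e in res])        # [-0.8, 0.6, 1.0, -0.6, -0.2]
print(abs(sum(res) / len(res)) < 1e-9)   # True: the mean residual is zero
```

Plotting these residuals against x gives the residual plot.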
Term
Residual Plot |
|
Definition
a scatterplot of the regression residuals against the explanatory variable |
|
|
Term
Influential Observation |
|
Definition
a point that is extreme in the x direction and has a strong influence on the position of the regression line |
|
|
Term
Outlier |
|
Definition
observation that lies outside the overall pattern of the other observations |
|
|
Term
Extrapolation |
|
Definition
the use of a regression line for prediction far outside the range of values of the explanatory variable |
|
|
Term
|
Definition
correlations based on averages are usually too high when applied to individuals |
|
|
Term
Lurking Variable |
|
Definition
a variable that has an important effect on the relationship among the variables in a study but is not included among the variables studied |
|
|
Term
|
Definition
changing one of the variables causes changes in the other - usually caused by lurking variable |
|
|
Term
|
Definition
an association between an explanatory variable and a response variable is not by itself good evidence that changes in x cause changes in y even if that association is strong |
|
|
Term
|
Definition
The association is strong; the association is consistent; higher doses are associated with stronger responses; the cause precedes the effect in time; the cause is plausible |
|
|
Term
Two-Way Table |
|
Definition
table that describes two categorical variables |
|
|
Term
|
Definition
row and column totals that appear at the right and bottom margins of a two-way table |
|
|
Term
Simpson's Paradox |
|
Definition
an association or comparison that holds for all of several groups can reverse direction when the data are combined to form a single group |
|
|
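A small numeric illustration of the reversal. All counts below are invented for this sketch: two options, A and B, are each tried on "easy" and "hard" cases, and each entry is (successes, total).

```python
# Hypothetical counts, made up to show the reversal: (successes, total).
counts = {
    "easy": {"A": (90, 100), "B": (800, 1000)},
    "hard": {"A": (300, 1000), "B": (20, 100)},
}


def group_rate(group, option):
    """Success rate of one option within one group."""
    successes, total = counts[group][option]
    return successes / total


def combined_rate(option):
    """Success rate of one option after pooling all groups."""
    successes = sum(counts[g][option][0] for g in counts)
    total = sum(counts[g][option][1] for g in counts)
    return successes / total


for g in counts:
    print(g, group_rate(g, "A"), group_rate(g, "B"))   # A is higher in each group
print(round(combined_rate("A"), 3), round(combined_rate("B"), 3))   # 0.355 0.745
```

A wins within both groups, yet B wins overall, because most of A's cases are hard and most of B's are easy. Case difficulty acts as a lurking variable.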
Term
Observational Study |
|
Definition
observes individuals and measures variables of interest but does not attempt to influence responses e.g. sampling |
|
|
Term
Experiment |
|
Definition
study that deliberately imposes some treatment on individuals in order to observe their responses |
|
|
Term
Confounding |
|
Definition
when the effects of two variables (explanatory or lurking) on a response variable cannot be distinguished from each other |
|
|
Term
Population |
|
Definition
entire group of individuals we want info about |
|
|
Term
Sample |
|
Definition
subset of population that we actually examine in order to gather information |
|
|
Term
Sample Design |
|
Definition
method used to choose sample from population |
|
|
Term
Voluntary Response Sample |
|
Definition
sample in which people choose themselves by responding to a general appeal; biased because people with strong opinions, especially negative ones, are most likely to respond |
|
|
Term
Convenience Sample |
|
Definition
sample design that chooses the individuals easiest to reach |
|
|
Term
Bias |
|
Definition
systematic error; i.e. sample design that favors certain outcomes |
|
|
Term
Simple Random Sample (SRS) |
|
Definition
consists of n individuals from a population chosen such that every set of n individuals has an equal chance to be selected |
|
|
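A minimal sketch of drawing an SRS in Python. The population labels, the function name and the `seed` parameter are made up for illustration; `random.sample` draws without replacement, so every set of n individuals is equally likely.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct individuals so that every set of n is equally likely."""
    rng = random.Random(seed)        # a fixed seed makes the draw repeatable
    return rng.sample(population, n)


population = [f"person{i}" for i in range(1, 21)]   # hypothetical sampling frame
sample = simple_random_sample(population, 5, seed=1)
print(sample)
```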
Term
Probability Sample |
|
Definition
sample technique that gives each member of the population a known chance of being selected |
|
|
Term
Stratified Random Sample |
|
Definition
divide the population into groups of similar individuals called strata, then choose a separate SRS from each stratum and combine these SRSs to form the full sample |
|
|
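The same idea as a Python sketch: group individuals by stratum, take an SRS within each stratum, and combine. The roster, the function name and the equal sample size per stratum are assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_sample(individuals, stratum_of, n_per_stratum, seed=None):
    """Take an SRS of size n_per_stratum from each stratum and combine them."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for individual in individuals:
        strata[stratum_of(individual)].append(individual)   # group similar individuals
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))   # SRS within the stratum
    return sample


# Hypothetical roster: (name, class year) pairs, with class year as the stratum.
roster = [(f"student{i}", "first-year" if i < 6 else "senior") for i in range(10)]
chosen = stratified_sample(roster, lambda person: person[1], 2, seed=1)
print(chosen)
```

Every stratum is guaranteed to appear in the sample, which a single SRS of the whole roster would not guarantee.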
Term
Strata |
|
Definition
groups of similar individuals within a population used in stratified random sampling |
|
|
Term
Multistage Sample |
|
Definition
Stage 1: divide the population into groups and select a sample of the groups; Stage 2: divide the groups selected in stage 1 into smaller areas called blocks and take a stratified sample of the blocks; Stage 3: sort the individuals in the selected blocks into clusters and take a random sample of clusters |
|
|
Term
Undercoverage |
|
Definition
when some groups in the population are left out of the process of choosing the sample, e.g. a phone survey leaves out the 6% without phones |
|
|
Term
Nonresponse |
|
Definition
when an individual chosen for the sample can't be contacted or refuses to cooperate |
|
|
Term
Response Bias |
|
Definition
bias caused by behavior of respondent or interviewer e.g. respondent lying, race or sex of interviewer |
|
|
Term
Telescoping |
|
Definition
recalling past events as more recent than they really were, e.g. someone who saw the dentist 8 months ago says yes to having seen a dentist in the last 6 months |
|
|
Term
Wording Effects |
|
Definition
the wording of questions in sample surveys can introduce bias |
|
|
Term
Sampling Frame |
|
Definition
list of individuals from which a sample is selected |
|
|
Term
Experimental Units |
|
Definition
The individuals on which an experiment is done |
|
|
Term
Subjects |
|
Definition
the experimental units when dealing with human beings |
|
|
Term
Treatment |
|
Definition
experimental condition applied to the units |
|
|
Term
Factors |
|
Definition
the explanatory variable(s) in an experiment |
|
|
Term
Levels |
|
Definition
values of the factors in an experimental treatment |
|
|
Term
Randomization |
|
Definition
use of chance to divide experimental units into groups in an experiment |
|
|
Term
Randomized Comparative Experiment |
|
Definition
An experiment that uses both comparison and randomization |
|
|
Term
Completely Randomized Design |
|
Definition
experimental design where all experimental units are allocated at random among all treatments |
|
|
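A sketch of random allocation in Python. The unit labels, treatment names and function name are hypothetical, and dealing the shuffled units out in turn (which keeps group sizes as equal as possible) is one simple way to do the allocation, not the only one.

```python
import random


def completely_randomized_design(units, treatments, seed=None):
    """Allocate all units at random among all treatments."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)                      # chance alone decides the order
    groups = {treatment: [] for treatment in treatments}
    for position, unit in enumerate(shuffled):
        # Deal the shuffled units out to the treatments in turn.
        groups[treatments[position % len(treatments)]].append(unit)
    return groups


units = list(range(1, 13))                     # hypothetical subjects 1..12
groups = completely_randomized_design(units, ["drug", "placebo"], seed=42)
for treatment, members in groups.items():
    print(treatment, sorted(members))
```

Comparing the response across the resulting groups is what makes this a randomized comparative experiment.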
Term
Statistically Significant |
|
Definition
An observed effect so large that it would rarely occur by chance |
|
|