Shared Flashcard Set

Details

Statistics 111; Lectures 6-7
Terms pertaining to lectures before second homework; Shane T. Jensen, STAT-111 Fall 2010; Introduction to the Practice of Statistics Ch.2 (2.1 and 2.2)
9
Mathematics
Undergraduate 1
10/25/2010

Cards

Term
Associations between Variables
Definition
•  Positively associated if increased values of
one variable tend to occur with increased
values of the other

•  Negatively associated if increased values of
one variable tend to occur with decreased
values of the other
Term
Response variable
Definition
A response variable (Y-axis) measures an
outcome of interest. Also called the dependent
variable.
Term
Explanatory variable
Definition
An explanatory variable (X-axis) explains
changes in the response. Also called the
independent variable.

•  Explanatory does not mean causal: there are
often several possible explanatory variables
•  Example: Study of heart disease & smoking

•  Response: death due to heart disease
•  Explanatory: number of cigarettes smoked per day

•  Example: City dataset

•  Response: mortality
•  Explanatory: education
Term
Linear Relationships
Definition
Some associations are not just positive or
negative, but also appear to be linear

A perfect linear relationship is Y = a + bX
Term
Contingency Table
Definition
A table of counts used to examine the relationship between two categorical variables.
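
As an illustration of how such a table is built, here is a minimal sketch in Python. The observations are made up for the example (they are not from the lecture), and the category labels are hypothetical, loosely following the smoking and heart disease example from the explanatory variable card.

```python
from collections import Counter

# Hypothetical observations: (smoking status, heart disease status).
# These six records are invented purely for illustration.
observations = [
    ("smoker", "disease"),
    ("smoker", "no disease"),
    ("smoker", "disease"),
    ("non-smoker", "no disease"),
    ("non-smoker", "no disease"),
    ("non-smoker", "disease"),
]

# Count how often each combination of categories occurs.
counts = Counter(observations)

# Row and column labels of the table, in sorted order.
rows = sorted({row for row, _ in observations})
cols = sorted({col for _, col in observations})

# table[row][col] is the number of observations in that cell.
table = {row: {col: counts[(row, col)] for col in cols} for row in rows}

for row in rows:
    print(row, table[row])
```

Each cell holds a count, so the table works for categorical variables where a scatterplot would not.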
Term
Describing/ Examining a Scatterplot
Definition
Look for the overall pattern and for striking deviations from that pattern.

You can describe the overall pattern of a scatterplot by the form, direction, and strength of the relationship.

An important kind of deviation is an outlier
Term
Correlation
Definition
Measures the direction and strength of the linear relationship between two quantitative variables. Correlation is usually written as r.

r is positive when there is a positive association and negative when there is a negative association.

Correlation makes no use of the distinction between explanatory and response variables. It makes no difference which variable you call x and which you call y when calculating the correlation.

Requires that both variables be quantitative. This is a key difference between correlation and association, because association can also apply to categorical variables.

Always a number between -1 and 1. Values near 0 indicate very weak linear relationships. The extreme values -1 and 1 occur only when the points in a scatterplot lie exactly along a straight line.

Correlation describes ONLY linear relationships (a curved relationship, no matter how strong, cannot be described by a correlation).

Not resistant to outliers.
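
Several of these properties can be checked numerically. The sketch below assumes the standard Pearson formula for r, which the cards do not spell out. The function name `correlation` and the small datasets are illustrative choices, not from the lecture.

```python
import math

def correlation(xs, ys):
    """Pearson correlation r between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5]
y_line = [3, 5, 7, 9, 11]   # points exactly on the line y = 1 + 2x
y_curve = [4, 1, 0, 1, 4]   # points exactly on the curve y = (x - 3)^2

r_line = correlation(x, y_line)      # points on a rising line
r_swapped = correlation(y_line, x)   # same data with x and y exchanged
r_curve = correlation(x, y_curve)    # strong but curved relationship

print(r_line, r_swapped, r_curve)
```

Here the points on a straight line give r = 1, swapping the roles of x and y leaves r unchanged, and the perfectly curved relationship gives r = 0 even though y is completely determined by x.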
Term
Linear Regression
Definition
If our X and Y variables do show a linear
relationship, we can calculate a best fit
line, Y = a + bX, in addition to the correlation.
The values a and b together are called
the regression coefficients

•  a = intercept
•  b = slope

How do we determine our “best” line (i.e., the best regression coefficients a and b)?

•  Must square “Y-residuals” and add them up:
total residuals = $\sum_{i} (y_i - (a + b \cdot x_i))^2$
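
The total can be computed directly for any candidate line. This is a minimal sketch in Python. The function name `total_squared_residuals`, the data points, and the two candidate lines are assumptions chosen for illustration.

```python
def total_squared_residuals(xs, ys, a, b):
    """Sum of squared Y-residuals for the line Y = a + b*X."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Illustrative data, not from the lecture.
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 8]

# Two candidate lines: Y = 0 + 2X and Y = 1 + 1X.
candidate_1 = total_squared_residuals(xs, ys, a=0.0, b=2.0)
candidate_2 = total_squared_residuals(xs, ys, a=1.0, b=1.0)

print(candidate_1, candidate_2)
```

The first candidate has the smaller total (1 versus 11), so by this criterion it is the better of the two lines for these points.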
Term
Least squares line
Definition
The line with the smallest total of squared residuals.

Its coefficients a and b are the best values for the intercept and slope.
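
The cards do not say how the best values are found. One common approach, assumed here, is the standard closed-form solution: the slope is the sum of cross-deviations divided by the sum of squared X-deviations, and the intercept makes the line pass through the point of means. The function names and data below are illustrative.

```python
def least_squares_line(xs, ys):
    """Return (a, b) minimizing the sum of squared Y-residuals for Y = a + b*X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx               # slope
    a = mean_y - b * mean_x     # intercept: line passes through (mean_x, mean_y)
    return a, b

def total_squared_residuals(xs, ys, a, b):
    """Sum of squared Y-residuals for the line Y = a + b*X."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Illustrative data, not from the lecture.
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 8]

a, b = least_squares_line(xs, ys)
best = total_squared_residuals(xs, ys, a, b)

# Nudging the intercept or slope in any direction should increase the total.
nudged = [
    total_squared_residuals(xs, ys, a + da, b + db)
    for da in (-0.1, 0.1)
    for db in (-0.1, 0.1)
]

print(a, b, best, min(nudged))
```

For these points the fitted slope is about 1.9 and the intercept about 0, and every nudged line has a larger total than the fitted one.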