Term
What does regression analysis actually do? |
|
Definition
Regression analysis studies the dependence of one (dependent) variable on one or more explanatory (independent) variables in order to estimate the conditional population mean of the dependent variable as accurately as possible with the information given |
|
|
Term
What is the stochastic error term (u)? |
Definition
The deviation of a specific value from the conditional mean is an unobserved random variable/error (can be positive or negative) |
|
|
Term
How do we interpret the intercept (ß0)? |
|
Definition
Intercept is the predicted value of the dependent variable if the independent variable takes a value of zero |
|
|
Term
How do we interpret the slope (ß1)? |
|
Definition
The slope informs us about the increase/decrease in (the unit of measurement of) the dependent variable if the independent variable is changed by one unit of measurement |
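For example (hypothetical numbers): if the fitted line is ŷ = 2.0 + 0.5x, then a one-unit increase in x changes the predicted y by +0.5 units of measurement, and the predicted y at x = 0 is the intercept, 2.0.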
|
|
Term
Assumptions of Linear Regression Model |
|
Definition
A1: the population regression function is linear in its parameters
A2: the values of the regressor x are fixed (not stochastic or random)
A3: the mean (conditional expectation) of the stochastic error term ui is always zero for any given value of x (errors cancel out)
A4: for any given value of x, the conditional variance of the error term ui is the same constant for every observation (homoscedasticity)
A5: for any xi and xj (i ≠ j), the correlation between the errors ui and uj is zero (no autocorrelation)
A6: regressor and error term are not correlated (no covariance between error term and regressor)
A7: the number of observations (n) must be larger than the number of parameters (k)
A8: the variance of x must be positive and finite
A9: the model must be correctly specified in all respects
A10: there can be no perfect multicollinearity between the regressors |
|
|
Term
Equation for the intercept |
|
Definition
ß0 = ȳ - ß1*x̄
(the OLS regression line always passes through the point (x̄, ȳ)) |
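A minimal sketch in Python of computing both OLS estimates by hand (made-up data; numpy assumed):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)  # slope ß1
b0 = y.mean() - b1 * x.mean()                                             # intercept ß0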
|
|
Term
In multivariable regression what does intercept represent? |
|
Definition
intercept is the predicted value of y if all the independent variables equal zero |
|
|
Term
How can we tell the accuracy of the estimator? |
|
Definition
Standard Error (simply the standard deviation of the estimator's sampling distribution) |
|
|
Term
What do we need before calculating the standard error? |
|
Definition
Must first calculate the variance of the error term:
Var(u|x) = σ² |
|
|
Term
What is the unbiased estimator of the error variance in multiple OLS regression? |
|
Definition
σ̂² = ∑ûi²/(n-k-1)
(the residual sum of squares divided by the residual degrees of freedom) |
|
Term
Standard Error of the Regression
(also called root mean squared error/RMSE) |
|
Definition
σ̂ = sqrt[∑ûi²/(n-k-1)]
- used as a "goodness of fit" measure
- want it to be as small as possible (the smaller it is, the better the fit of the data) |
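A minimal sketch in Python of computing σ̂ (made-up data, bivariate case with k = 1; numpy assumed):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
b0 = y.mean() - b1 * x.mean()
u = y - (b0 + b1 * x)                     # residuals
n, k = len(y), 1
sigma2_hat = np.sum(u**2) / (n - k - 1)   # unbiased estimate of the error variance
rmse = np.sqrt(sigma2_hat)                # standard error of the regression (RMSE)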
|
|
|
Term
What are the assumptions of the distribution of the error term? |
|
Definition
- ui's must have an expected value of zero
- must not be correlated to each other
- assume constant variance
- each error term must be normally distributed
|
|
|
Term
When do we use the t-test? |
|
Definition
- To test the null hypothesis H0: ßj = ßj* (some hypothesized value of the parameter)
- must calculate the t statistic to determine that:
t ≡ (ßj - ßj*)/se(ßj) ~ t(n-k-1)
By default, stat programs test whether the regressor has any influence on y at all. Thus, H0: ßj = 0 and the t stat = ßj/se(ßj) |
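A minimal sketch in Python of the default zero-restriction test (hypothetical numbers; scipy assumed):

from scipy import stats

b_j, se_j = 0.52, 0.21           # hypothetical coefficient and its standard error
df = 30                          # residual degrees of freedom, n - k - 1
t = b_j / se_j                   # t statistic for H0: ßj = 0
p = 2 * stats.t.sf(abs(t), df)   # two-sided p-value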
|
|
|
Term
How to test for differences in parameter values? |
|
Definition
Again, use a t-test:
t ≡ (ßj - ßl)/se(ßj - ßl) ~ t(n-k-1)
se(ßj - ßl) = sqrt[Var(ßj) + Var(ßl) - 2Cov(ßj, ßl)] |
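A small numeric sketch of this test in Python (made-up values; numpy assumed):

import numpy as np

b_j, b_l = 0.80, 0.50                      # hypothetical coefficients
var_j, var_l, cov_jl = 0.04, 0.09, 0.01    # hypothetical variances/covariance of the estimates
se_diff = np.sqrt(var_j + var_l - 2 * cov_jl)
t = (b_j - b_l) / se_diff                  # compare against t(n-k-1) critical value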
|
|
Term
What do y, yhat, ybar look like graphically? |
|
Definition
|
|
Term
When do we use the F-test? |
|
Definition
To test whether all the regressors taken together possess any explanatory power
H0: ß1 = ... = ßk = 0
The null is rejected if the calculated F > the critical F from the table
F = (ESS/k) / (RSS/(n-k-1)) |
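A minimal sketch in Python of carrying out this test (hypothetical sums of squares; scipy assumed):

from scipy import stats

ess, rss = 120.0, 80.0                     # hypothetical explained/residual sums of squares
k, n = 3, 50                               # regressors and observations
F = (ess / k) / (rss / (n - k - 1))
F_crit = stats.f.ppf(0.95, k, n - k - 1)   # 5% critical value from the F distribution
reject = F > F_crit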
|
|
Term
How do we test the "goodness-of-fit"? |
|
Definition
Coefficient of Determination (R²)
R² = ESS/TSS = 1 - RSS/TSS |
|
|
|
Term
What is the explained variance of the linear model? |
|
Definition
Explained Sum of Squares (ESS)
ESS = ∑(ŷi - ȳ)² |
|
|
Term
What is the unexplained variance of OLS model? |
|
Definition
Residual Sum of Squares (RSS)
RSS = ∑ûi² |
|
|
|
Term
How do we quantify the total variance of y? |
|
Definition
- If we had no regressors, the best estimate of y would be its mean
--> Therefore, the total variance of y is the
Total Sum of Squares (TSS)
TSS = ∑(yi - ȳ)²
TSS = ESS + RSS |
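A quick numeric check in Python that the decomposition holds (made-up data; numpy assumed):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
b0 = y.mean() - b1 * x.mean()
yhat = b0 + b1 * x
ess = np.sum((yhat - y.mean())**2)
rss = np.sum((y - yhat)**2)
tss = np.sum((y - y.mean())**2)
# tss equals ess + rss (up to floating-point error); R² = ess/tss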
|
|
Term
What does R² = 1 mean?
And R² = 0? |
|
Definition
R² = 1 indicates that all observations lie on the regression line (a perfect description/modelling of the data)
R² = 0 means that the regression model has no explanatory power with regard to y
Caution: R² will never decrease when the number of regressors increases, even if the newly added regressors have no real explanatory power |
|
|
|
Term
What is Adjusted R²? Why do we need it? |
|
Definition
Adj. R² corrects for degrees of freedom (it penalizes adding unnecessary regressors just to inflate R²) |
|
|
|
Term
Items asked to solve from STATA output (Assignment No.1) |
|
Definition
Std. Error for regressor
t-stats for regressor and constant
p-values for regressor and constant
Confidence intervals
F value for the model
|
|
|
Term
Items asked to solve from STATA output
(Final Exam) |
|
Definition
Model Mean Squared Error (MMS)
Number of df for residuals
R²
Root Mean Squared Error (RMSE)
Confidence Intervals
Coefficients |
|
|
Term
With STATA output: How to solve for R²? |
|
Definition
R² = ESS/TSS = 1 - RSS/TSS
(= Model SS divided by Total SS in the output) |
|
Term
With STATA output: SE of coefficient
(ß1 in the bivariate case) |
|
Definition
se(ß1) = RMSE/sqrt[∑(xi - x̄)²] <- denominator will be given (or enough info to derive it) |
|
|
Term
With STATA output: t-stat |
|
Definition
ß1/se(ß1)
*easy because it is the two values
listed before it in the row |
|
|
Term
With STATA output: p-value |
|
Definition
reverse cumulative t distribution
syntax:
1 - ttail(residual df, -t-stat) + ttail(residual df, +t-stat)
(for a two-sided test this equals 2*ttail(residual df, |t-stat|)) |
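An equivalent computation in Python (hypothetical t statistic; scipy assumed — the survival function sf plays the role of Stata's ttail):

from scipy import stats

t_stat, df = 2.48, 30                 # hypothetical t statistic and residual df
p = 2 * stats.t.sf(abs(t_stat), df)   # sf(x, df) = P(T > x), same quantity as ttail(df, x)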
|
|
Term
with STATA output: F-value |
|
Definition
F = (ESS/k) / (RSS/(n-k-1))
(equivalently, (R²/k) / ((1-R²)/(n-k-1))) |
|
Term
with STATA output: RMSE |
Definition
RMSE = sqrt[RSS/residual df] |
|
|
Term
with STATA output: Adjusted R² |
|
Definition
Adjusted R² = 1 - (RSS/res. df) / (TSS/(n-1)) |
|
|
Term
with STATA output: Confidence Intervals |
|
Definition
CI: [ß1 - t-value*se(ß1), ß1 + t-value*se(ß1)]
t-value retrieved from the t table
(≈1.96 for a 95% CI when the residual df is large) |
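A minimal sketch in Python of building the interval (hypothetical values; scipy assumed):

from scipy import stats

b1, se_b1, df = 0.52, 0.21, 30     # hypothetical coefficient, SE, residual df
t_crit = stats.t.ppf(0.975, df)    # two-sided 95% critical value (→1.96 as df grows)
ci = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)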
|
|
Term
What is meant by a "strong" effect?
What is meant by a "powerful" effect? |
|
Definition
An effect is strong if the coefficient, compared to other coefficients, takes on a large value (magnitude)
An effect is powerful/pronounced if the coefficient clearly exceeds its standard error
(that is the ratio of the coefficient to its SE, which equals the t-value) |
|
|