Term
What does regression analysis actually do? |
|
Definition
Regression analysis studies the dependence of one (dependent) variable on one or more explanatory (independent) variables in order to estimate the conditional population mean of the dependent variable as accurately as possible with the information given |
|
|
Term
What is the stochastic error term (u)? |
Definition
The deviation of a specific value from the conditional mean is an unobserved random variable/error (can be positive or negative) |
|
|
Term
How do we interpret the intercept (ß0)? |
|
Definition
Intercept is the predicted value of the dependent variable if the independent variable takes a value of zero |
|
|
Term
How do we interpret the slope (ß1)? |
|
Definition
The slope informs us about the increase/decrease in (the unit of measurement of) the dependent variable if the independent variable is changed by one unit of measurement |
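For example (hypothetical numbers): if the fitted line is ŷ = 2.0 + 0.5x, then a one-unit increase in x changes the predicted y by +0.5 units of measurement, and the predicted y at x = 0 is the intercept, 2.0.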
|
|
Term
Assumptions of Linear Regression Model |
|
Definition
A1: the population regression function is linear in its parameters
A2: the values of the regressor x are fixed (not stochastic or random)
A3: the mean (conditional expectation) of the stochastic error term ui is always zero for any given value of x (errors cancel out)
A4: for any given value of x, the conditional variance of the error term ui is the same constant for every observation (homoscedasticity)
A5: for any xi and xj (i ≠ j), the correlation between the errors ui and uj is zero (no autocorrelation)
A6: regressor and error term are not correlated (no covariance between error term and regressor)
A7: the number of observations (n) must be larger than the number of parameters (k)
A8: the variance of x must be positive and finite
A9: the model must be correctly specified in all respects
A10: there can be no perfect multicollinearity between the regressors |
|
|
Term
Equation for the intercept |
|
Definition
ß0 = ȳ - ß1*x̄
(the OLS regression line always passes through the point (x̄, ȳ)) |
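A minimal sketch in Python of computing both OLS estimates by hand (made-up data; numpy assumed):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)  # slope ß1
b0 = y.mean() - b1 * x.mean()                                             # intercept ß0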
|
|
Term
In multivariable regression what does intercept represent? |
|
Definition
intercept is the predicted value of y if all the independent variables equal zero |
|
|
Term
How can we tell the accuracy of the estimator? |
|
Definition
Standard Error (simply the standard deviation of the estimator's sampling distribution) |
|
|
Term
What do we need before calculating the standard error? |
|
Definition
Must first calculate the variance of the error term:
Var(u|x) = σ² |
|
|
Term
What is the unbiased estimator of the error variance in multiple OLS regression? |
|
Definition
σ̂² = ∑ûi²/(n-k-1)
(the residual sum of squares divided by the residual degrees of freedom) |
|
Term
Standard Error of the Regression
(also called root mean squared error/RMSE) |
|
Definition
σ̂ = sqrt[∑ûi²/(n-k-1)]
- used as a "goodness of fit" measure
- want it to be as small as possible (the smaller it is, the better the fit of the data) |
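A minimal sketch in Python of computing σ̂ (made-up data, bivariate case with k = 1; numpy assumed):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
b0 = y.mean() - b1 * x.mean()
u = y - (b0 + b1 * x)                     # residuals
n, k = len(y), 1
sigma2_hat = np.sum(u**2) / (n - k - 1)   # unbiased estimate of the error variance
rmse = np.sqrt(sigma2_hat)                # standard error of the regression (RMSE)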
|
|
|
Term
What are the assumptions of the distribution of the error term? |
|
Definition
- ui's must have an expected value of zero
- must not be correlated to each other
- assume constant variance
- each error term must be normally distributed
|
|
|
Term
When do we use the t-test? |
|
Definition
- To test the null hypothesis H0: ßj = ßj* (some hypothesized value of the parameter)
- must calculate the t statistic to determine that:
t ≡ (ßj - ßj*)/se(ßj) ~ t(n-k-1)
By default, stat programs test whether the regressor has any influence on y at all. Thus, H0: ßj = 0 and the t stat = ßj/se(ßj) |
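A minimal sketch in Python of the default zero-restriction test (hypothetical numbers; scipy assumed):

from scipy import stats

b_j, se_j = 0.52, 0.21           # hypothetical coefficient and its standard error
df = 30                          # residual degrees of freedom, n - k - 1
t = b_j / se_j                   # t statistic for H0: ßj = 0
p = 2 * stats.t.sf(abs(t), df)   # two-sided p-value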
|
|
|
Term
How to test for differences in parameter values? |
|
Definition
Again, use a t-test:
t ≡ (ßj - ßl)/se(ßj - ßl) ~ t(n-k-1)
se(ßj - ßl) = sqrt[Var(ßj) + Var(ßl) - 2Cov(ßj, ßl)] |
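A small numeric sketch of this test in Python (made-up values; numpy assumed):

import numpy as np

b_j, b_l = 0.80, 0.50                      # hypothetical coefficients
var_j, var_l, cov_jl = 0.04, 0.09, 0.01    # hypothetical variances/covariance of the estimates
se_diff = np.sqrt(var_j + var_l - 2 * cov_jl)
t = (b_j - b_l) / se_diff                  # compare against t(n-k-1) critical value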
|
|
Term
What do y, yhat, ybar look like graphically? |
|
Definition
|
|
Term
When do we use the F-test? |
|
Definition
To test whether all the regressors taken together possess any explanatory power
H0: ß1 = ... = ßk = 0
The null is rejected if the calculated F > the critical F from the table
F = (ESS/k) / (RSS/(n-k-1)) |
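A minimal sketch in Python of carrying out this test (hypothetical sums of squares; scipy assumed):

from scipy import stats

ess, rss = 120.0, 80.0                     # hypothetical explained/residual sums of squares
k, n = 3, 50                               # regressors and observations
F = (ess / k) / (rss / (n - k - 1))
F_crit = stats.f.ppf(0.95, k, n - k - 1)   # 5% critical value from the F distribution
reject = F > F_crit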
|
|
Term
How do we test the "goodness-of-fit"? |
|
Definition
Coefficient of Determination (R²)
R² = ESS/TSS = 1 - RSS/TSS |
|
|
|
Term
What is the explained variance of the linear model? |
|
Definition
Explained Sum of Squares (ESS)
ESS = ∑(ŷi - ȳ)² |
|
|
Term
What is the unexplained variance of OLS model? |
|
Definition
Residual Sum of Squares (RSS)
RSS = ∑ûi² |
|
|
|
Term
How do we quantify the total variance of y? |
|
Definition
- If we had no regressors, the best estimate of y would be its mean
--> Therefore, the total variance of y is the
Total Sum of Squares (TSS)
TSS = ∑(yi - ȳ)²
TSS = ESS + RSS |
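A quick numeric check in Python that the decomposition holds (made-up data; numpy assumed):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean())**2)
b0 = y.mean() - b1 * x.mean()
yhat = b0 + b1 * x
ess = np.sum((yhat - y.mean())**2)
rss = np.sum((y - yhat)**2)
tss = np.sum((y - y.mean())**2)
# tss equals ess + rss (up to floating-point error); R² = ess/tss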
|
|
Term
What does R² = 1 mean?
And R² = 0? |
|
Definition
R² = 1 indicates that all observations lie on the regression line (a perfect description/modelling of the data)
R² = 0 means that the regression model has no explanatory power with regard to y
Caution: R² will never decrease when the number of regressors increases, even if the newly added regressors have no real explanatory power |
|
|
|
Term
What is Adjusted R²? Why do we need it? |
|
Definition
Adj. R² corrects for degrees of freedom (it penalizes adding unnecessary regressors just to inflate R²) |
|
|
|
Term
Items asked to solve from STATA output (Assignment No.1) |
|
Definition
Std. Error for regressor
t-stats for regressor and constant
p-values for regressor and constant
Confidence intervals
F value for the model
|
|
|
Term
Items asked to solve from STATA output
(Final Exam) |
|
Definition
Model Mean Squared Error (MMS)
Number of df for residuals
R²
Root Mean Squared Error (RMSE)
Confidence Intervals
Coefficients |
|
|
Term
With STATA output: How to solve for R²? |
|
Definition
R² = ESS/TSS = 1 - RSS/TSS
(= Model SS divided by Total SS in the output) |
|
Term
With STATA output: SE of coefficient
(ß1 in the bivariate case) |
|
Definition
se(ß1) = RMSE/sqrt[∑(xi - x̄)²] <- denominator will be given (or enough info to derive it) |
|
|
Term
With STATA output: t-stat |
|
Definition
ß1/se(ß1)
*easy because it is the two values
listed before it in the row |
|
|
Term
With STATA output: p-value |
|
Definition
reverse cumulative t distribution
syntax:
1 - ttail(residual df, -t-stat) + ttail(residual df, +t-stat)
(for a two-sided test this equals 2*ttail(residual df, |t-stat|)) |
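An equivalent computation in Python (hypothetical t statistic; scipy assumed — the survival function sf plays the role of Stata's ttail):

from scipy import stats

t_stat, df = 2.48, 30                 # hypothetical t statistic and residual df
p = 2 * stats.t.sf(abs(t_stat), df)   # sf(x, df) = P(T > x), same quantity as ttail(df, x)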
|
|
Term
with STATA output: F-value |
|
Definition
F = (ESS/k) / (RSS/(n-k-1))
(equivalently, (R²/k) / ((1-R²)/(n-k-1))) |
|
Term
with STATA output: RMSE |
Definition
RMSE = sqrt[RSS/residual df] |
|
|
Term
with STATA output: Adjusted R² |
|
Definition
Adjusted R² = 1 - (RSS/res. df) / (TSS/(n-1)) |
|
|
Term
with STATA output: Confidence Intervals |
|
Definition
CI: [ß1 - t-value*se(ß1), ß1 + t-value*se(ß1)]
t-value retrieved from the t table
(≈1.96 for a 95% CI when the residual df is large) |
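A minimal sketch in Python of building the interval (hypothetical values; scipy assumed):

from scipy import stats

b1, se_b1, df = 0.52, 0.21, 30     # hypothetical coefficient, SE, residual df
t_crit = stats.t.ppf(0.975, df)    # two-sided 95% critical value (→1.96 as df grows)
ci = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)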
|
|
Term
What is meant by a "strong" effect?
What is meant by a "powerful" effect? |
|
Definition
An effect is strong if the coefficient, compared to other coefficients, takes on a large value (magnitude)
An effect is powerful/pronounced if the coefficient clearly exceeds its standard error
(that is the ratio of the coefficient to its SE, which equals the t-value) |
|
|