Shared Flashcard Set

Details

Statistics
Multivariate
72
Other
Graduate
11/24/2007

Cards

Term
Covariates
Definition
Variables that are measured before the DV and are correlated with the DV
Term
Covariates
Definition
Covariates should be measured on a continuous scale, though occasionally you see researchers using categorical variables that have been dummy coded (0, 1) as covariates. Adding a covariate to the analysis can increase the power (sensitivity) of the test by reducing the error term – if it is a well chosen covariate.
Term
Covariates
Definition

In other words, a well-chosen covariate reduces the ‘noise’ in the analysis and makes it easier to find significant differences between the levels of your IVs. The covariate can reduce the error sum of squares, which increases the F value and decreases the p value – this increases the likelihood of finding significance.

Term

PURPOSES OF ANCOVA (3 MAJOR)

Definition

Three major purposes of ancova

¨ To increase sensitivity (i.e., power) of the test of main effects and interactions by reducing the error term: This is the most common use of ancova. This is useful when you have an experimental design. This reduces noise in your analysis. The noise is unwanted variance (individual differences) in the dv that is estimated by scores on the cvs. Useful in a pretest/posttest design where you want to control for the pretest (use as a covariate) and then test for group differences on your dv (posttest)

¨ To adjust the means on the dv themselves to what they would be if all participants had scored equally on the cvs. This is useful if you have a non-experimental design where participants cannot be assigned to groups (quasi-experimental iv). Here differences between your groups on the cvs are removed so that, presumably, the only differences that remain are related to the effects of the ivs. Here the addition of the cvs only enhances the prediction of the dv – there is no implication of causality. Here you are ‘controlling for’ the covariate and then assessing group differences on your dv

¨ As a follow-up to manova: These are known as stepdown analyses – where you use the individual dvs as covariates for your follow-up analyses
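The first purpose (reducing the error term) can be illustrated with simulated pretest/posttest data. This is only a minimal sketch using NumPy: the data, the effect sizes, and the helper name `sse` are made up for illustration and do not come from any real study.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100                                   # participants per group (hypothetical)
group = np.repeat([0.0, 1.0], n)          # 0 = control, 1 = experimental
pretest = rng.normal(50, 10, 2 * n)       # covariate, measured before the dv
posttest = 0.8 * pretest + 3.0 * group + rng.normal(0, 5, 2 * n)

def sse(X, y):
    """Error (residual) sum of squares from an ordinary least-squares fit."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

ones = np.ones(2 * n)

# Anova: F test for group with no covariate in the model
sse_null = sse(np.column_stack([ones]), posttest)
sse_anova = sse(np.column_stack([ones, group]), posttest)
f_anova = (sse_null - sse_anova) / (sse_anova / (2 * n - 2))

# Ancova: F test for group after adjusting for the pretest
sse_cov_only = sse(np.column_stack([ones, pretest]), posttest)
sse_ancova = sse(np.column_stack([ones, pretest, group]), posttest)
f_ancova = (sse_cov_only - sse_ancova) / (sse_ancova / (2 * n - 3))

print(f"error SS without covariate: {sse_anova:.1f}, F = {f_anova:.2f}")
print(f"error SS with covariate:    {sse_ancova:.1f}, F = {f_ancova:.2f}")
```

Adding a covariate can never increase the error sum of squares; whether the F value for the iv also rises depends on how strongly the covariate is related to the dv.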

Term

ISSUES WITH ANCOVA (9)

Definition

Issues with Ancova

¨ Causality - You can’t infer causality if it is not an experimental design

¨ # of Covariates - You want to choose a small number of covariates, all of which are correlated with your dv and none are correlated with each other or the ivs

¨ Sample sizes and missing data - Unequal n can cause homogeneity of variance problems

¨ Outliers - Outliers can cause heterogeneity of regression

¨ Multicollinearity and singularity - Highly correlated cvs can cause multicollinearity or singularity making it impossible to compute your ancova or they can weaken the power of your test

Term
ISSUES WITH ANCOVA (9)
Definition
Issues with Ancova

¨ Normality of sampling distributions of the mean - Sampling distributions of means should be normal within each group. This is usually achieved with large sample sizes

¨ Homogeneity of variance - It is assumed that the variance of the dv scores is equal between the groups. If this assumption is violated you can either use a more stringent alpha level or drop the cv from the analysis. You should always test for homogeneity of variance (Levene’s test) prior to performing an ancova

¨ Linearity - In order to use ancova it is assumed that the relationship between each cv and the dv is linear in nature. If the relationship is not linear it reduces the power of the statistical test. If there is curvilinearity you can transform the non-linear cvs. If transformation does not work either drop the cv or use a higher order power (squared, cubed, etc.) of the cv in the analysis

¨ Reliability of covariates - It is assumed that cvs are measured without error; that they are perfectly reliable


Term

Ancova

TEST FOR HOMOGENEITY OF REGRESSION

Definition

Test for homogeneity of regression:

This assumption states that the slopes of the regression of the dv on the cv are the same for all cells of a design; that is, the slope (steepness) of the regression between the dv and the cv is equal within each group (level of the iv). If the interaction of the iv and the cv is significant then you have heterogeneity of regression, and you should not use ancova
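The test described here can be sketched as a comparison of two models: one with a common slope and one that adds the iv x cv interaction. This is a minimal NumPy sketch on simulated data; the variable names and the data-generating values are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 60                                     # cases per group (hypothetical)
group = np.repeat([0.0, 1.0], n)
cv = rng.normal(0, 1, 2 * n)
# Simulated dv built with the SAME slope (2.0) for the cv in both groups
dv = 1.0 + 2.0 * cv + 1.5 * group + rng.normal(0, 1, 2 * n)

def sse(X, y):
    """Error sum of squares from an ordinary least-squares fit."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

ones = np.ones(2 * n)
X_common = np.column_stack([ones, group, cv])               # one slope for all groups
X_separate = np.column_stack([ones, group, cv, group * cv]) # adds iv x cv interaction

sse_common = sse(X_common, dv)
sse_separate = sse(X_separate, dv)
df_error = 2 * n - X_separate.shape[1]

# F test for the interaction (1 numerator df because there are two groups)
f_interaction = (sse_common - sse_separate) / (sse_separate / df_error)
print(f"F(1, {df_error}) for iv x cv interaction = {f_interaction:.2f}")
```

A significant F for the interaction would indicate heterogeneity of regression; with 1 and 116 degrees of freedom the .05 critical value is roughly 3.9.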

Term

Ancova


SPECIFIC COMPARISONS & EFFECT SIZE

Definition

Ancova

Specific comparisons: If your iv has more than 2 levels you need to do post-hoc or planned comparisons (using syntax or a Bonferroni correction)

Effect size: this is a statistical measure of the strength of the relationship or the magnitude of the difference. It is a measure of how much variance in a dependent variable can be explained by an independent variable

Term

Ancova (more)

Definition
Ancova (more)

If a cv is significant it provides adjustment of the dv scores. Sometimes if the cv is non-significant it can reduce noise (error) but not adjust the dv scores. You should run your analysis with the cv and without (regular anova) to see if having the cv in the analysis improves it. If the cv is not significant or does not reduce the error terms then it shouldn’t be included in your analysis. Your cvs should not be correlated with each other or the ivs but all should be correlated with the dv. Sometimes even if the cv is correlated with the dv, it still isn’t a good covariate. If too many cvs are used or they are correlated with each other then power is reduced. Sometimes covariates can pull out important variance. You should always perform bivariate correlations with all of your variables prior to choosing your covariates

Term
Ancova (more)
Definition

Ancova (more)

Sometimes the assumptions and restrictions of ancova are too stringent and/or there is too much ambiguity in interpreting results -- therefore you want to choose an alternative. If your cv and dv are measured on the same scale (such as a pretest, posttest) then you perform a repeated measures design to find change in dv among your groups (repeated measures anova or dependent t-test if only two levels). Other options include randomized blocks and blocking. You can make your cv into another iv (make it a categorical variable)

Term
Ancova (more)
Definition

Ancova (more)

According to Miller & Chapman (2001), ancova is widely misused, especially in psychopathology research. Many researchers use ancova to ‘control’ for variables they do not want in their analyses. If your design is an experimental one with random assignment to groups then ancova is appropriate to reduce noise in your analysis, since your cv should not be related to your iv due to random assignment. If the cv is related to the iv then when you include the cv in the analysis you are removing variance that is an important distinction between your groups – it is not noise or error

Term
Ancova (write-up)
Definition

Write-up of Ancova:

Covariate (alcqty): F(1,486) = 119.97, p < .001, partial η² = .20, power = 1.00. This covariate is significant; it adjusts the dv scores

F(1,486) = 28.07, p < .001, partial η² = .05, power = 1.00. At posttest, students in the control group (M = 5.16, SD = .20) had significantly higher drinking amount perceptions than students in the experimental group (M = 4.45, SD = .34) after controlling for pretest drinking amount perception scores
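As a quick check on a write-up like this one, partial eta-squared can be recovered from the F value and its degrees of freedom, since partial η² = (F x df effect) / (F x df effect + df error). The function name below is made up for illustration; the numbers are the ones reported on this card.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta-squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

covariate_effect = partial_eta_squared(119.97, 1, 486)
group_effect = partial_eta_squared(28.07, 1, 486)

print(f"covariate: {covariate_effect:.2f}, group: {group_effect:.2f}")
```

Both values round to the effect sizes given in the write-up (.20 and .05).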

Term

(MANOVA)

Multivariate analysis of variance

Definition
Multivariate analysis of variance (manova):
It is statistically equivalent to discriminant function analysis (dfa). Manova emphasizes the mean differences and statistical significance of differences among groups. The DVs are combined into a new dv that is a linear combination of all the DVs
Term

(MANOVA)

advantages (4)
Definition

Advantages of manova (4 advantages)

¨ Including multiple dvs increases the chance of discovering what the iv affects

¨ There is protection against inflated Type I error due to multiple tests of (likely) correlated dvs

¨ In rare circumstances, it may show differences that are not detectable by multiple anovas

¨ May occasionally be more powerful than multiple anovas

Term

(MANOVA)

disadvantages (3)

Definition

Disadvantages of manova (3 disadvantages)

¨ It is a much more complicated procedure and there can be ambiguity in interpreting results

¨ Several assumptions must be met in order to be justified in performing manova

¨ Often is less powerful than multiple anovas due to redundant (correlated) dvs

Term

(MANCOVA)

asks the question?
Definition
Multivariate analysis of covariance: It asks the question, “are there statistically reliable mean differences among the groups after adjusting the newly created dv (combination of all the dvs) for differences on one or more covariates?”
Term

(With MANOVA)

Important correlational values!

Definition

With manova

You want dvs that are not too correlated with each other – you want them to be assessing different dimensions. Manova works best with highly negatively correlated dvs (r < -.6) and acceptably well with moderately correlated dvs in either direction (about |.6|). Manova does not work well with dvs that are highly positively correlated (r > .6) or dvs that have correlations near zero. If your dvs are correlated near zero then doing multiple univariate anovas is much more powerful than doing a manova

Term

(Issues with MANOVA- 8)

Definition

Issues with Manova/Mancova (8 issues)

¨ Causality - You can’t infer causality if it is not an experimental design

¨ Sample sizes - For manova and mancova, it is necessary to have more cases than dvs in every cell. If you don’t you can violate the homogeneity of variance-covariance assumption and it might be impossible to perform the analysis due to unstable matrices. You lower the power of your test if your sample size is too small

¨ Multivariate normality and outliers - It is assumed that the sampling distributions of means of the various dvs in each cell and all linear combinations of them are normally distributed. This is usually achieved if you have a sample size of more than 20 in each cell.

¨ Outliers - They can produce either a type I or a type II error

Term

(Issues with MANOVA- 8)

Definition

Issues with Manova/Mancova (8 issues)

¨ Homogeneity of variance-covariance matrices - If you have equal sample sizes this is usually achieved. If you do not have equal sample sizes you should perform Box’s M test to test for violation of this. Use Pillai’s criterion to evaluate the significance of your test if you violate this assumption

¨ Linearity - Manova and mancova assume linear relationships among all pairs of dvs, all pairs of covariates, and all dv-covariate pairs. If you don’t have linearity then you reduce the power of your test. Either transform or drop dvs/cvs that are curvilinear

¨ Homogeneity of regression - You need to test for this in manova if you use the stepdown analysis to evaluate individual dvs. If you are performing a mancova you always need to test for this. If you violate this assumption you should not be performing the manova with stepdown analysis or the mancova

¨ Reliability of covariates - Your covariates must be reliable or you can increase the type I or type II error rate

Term
Four criteria for assessing whether or not your overall manova or mancova is significant
Definition


¨ When your iv has only two levels the F values for these 4 tests will be the same

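¨ Wilks’ lambda (λ): this is the proportion of variance in the combined dv that is not explained by the iv; it is converted to an F test and is usually the criterion of choice. 1 – Wilks’ lambda = effect size (this equals partial eta-squared when the effect has a single dimension, e.g., a two-level iv). You want lambda to be small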

¨ Hotelling’s trace: this is best with uncorrelated dvs

¨ Pillai’s trace: this is the pooled effect variances. This is best when you have violated assumptions. When your iv has only two levels, 1 – Wilks’ lambda = Pillai’s trace. You want this number to be large

¨ Roy’s largest root: this is best with correlated dvs. Because it is based only on the largest root, it tends to be the most liberal (least conservative) criterion
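All four criteria are functions of the eigenvalues of E⁻¹H, where H is the hypothesis (between-groups) and E the error (within-groups) sums-of-squares-and-cross-products matrix. The sketch below computes them with NumPy on simulated two-group data; the function name and the data are assumptions for illustration, and Roy’s largest root is reported here as the largest eigenvalue (some programs report it as λ / (1 + λ) instead).

```python
import numpy as np

def multivariate_criteria(H, E):
    """Four manova test statistics from the hypothesis (H) and error (E) SSCP matrices."""
    eigvals = np.linalg.eigvals(np.linalg.solve(E, H)).real
    eigvals = np.clip(eigvals, 0.0, None)       # remove tiny negative rounding error
    return {
        "wilks_lambda": float(np.prod(1.0 / (1.0 + eigvals))),
        "pillai_trace": float(np.sum(eigvals / (1.0 + eigvals))),
        "hotelling_trace": float(np.sum(eigvals)),
        "roy_largest_root": float(np.max(eigvals)),
    }

# Simulated data: two groups (a two-level iv) measured on three dvs
rng = np.random.default_rng(2)
g1 = rng.normal(0.0, 1.0, size=(40, 3))
g2 = rng.normal(0.5, 1.0, size=(40, 3))
grand_mean = np.vstack([g1, g2]).mean(axis=0)

H = sum(len(g) * np.outer(g.mean(axis=0) - grand_mean, g.mean(axis=0) - grand_mean)
        for g in (g1, g2))
E = sum((g - g.mean(axis=0)).T @ (g - g.mean(axis=0)) for g in (g1, g2))

stats = multivariate_criteria(H, E)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

With a two-level iv there is only one non-zero eigenvalue, so 1 – Wilks’ lambda equals Pillai’s trace and all four criteria lead to the same F value.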

Term

Wilks’ lambda

criteria for assessing whether or not your overall manova or mancova is significant

Definition

Wilks’ lambda (λ): this is the proportion of variance in the combined dv that is not explained by the iv; it is converted to an F test and is usually the criterion of choice. 1 – Wilks’ lambda = effect size (this equals partial eta-squared when the effect has a single dimension, e.g., a two-level iv). You want lambda to be small

Term

Hotelling’s trace

criteria for assessing whether or not your overall manova or mancova is significant

Definition

Hotelling’s trace: this is best with uncorrelated dvs

Term

Pillai’s trace

criteria for assessing whether or not your overall manova or mancova is significant

Definition

Pillai’s trace: this is the pooled effect variances. This is best when you have violated assumptions. When your iv has only two levels, 1 – Wilks’ lambda = Pillai’s trace. You want this number to be large

Term

Roy’s largest root

criteria for assessing whether or not your overall manova or mancova is significant

Definition

Roy’s largest root: this is best with correlated dvs. Because it is based only on the largest root, it tends to be the most liberal (least conservative) criterion

Term
(MANOVA)- Follow-up Analyses
Definition

Three main follow-up analyses that you can use to assess individual dvs

¨ Univariate anovas - Here if your overall manova/mancova is significant you then perform univariate anovas or univariate ancovas (if you are doing mancova) for each dv. Since you have multiple tests some researchers suggest that you use a more stringent alpha level (.05 / # of tests = Bonferroni correction). If you have correlated dvs, using this as a follow-up could lead to an inflated Type I error rate. This is the most popular (and easiest to do) method of follow-up to manova

¨ Roy-Bargmann stepdown analysis - If you have highly correlated dvs you should use a stepdown analysis. This tests the importance of each dv individually by using ancova. This is a conservative follow-up to manova/mancova (less powerful). First the highest priority dv (which you decide) is tested in a univariate anova and then each successive dv is tested with the previous dvs as covariates. You should perform homogeneity of regression tests for each dv prior to performing the stepdown analysis

¨ Discriminant function analysis - You can perform a dfa as a follow-up to manova. Your dvs now become ivs and are used to predict the categorical dvs (formerly your ivs)

Term
Factorial MANOVA or Mancova - Follow-up Analyses
Definition

If you are performing factorial manovas or mancovas then you must perform simple effects analyses if your interaction(s) are significant. If your iv has more than 2 levels you need to do post-hoc comparisons. If you are doing manova you can use post-hoc tests (Tukey, Scheffé, etc.). If you are doing mancova you have to use the special syntax or a Bonferroni correction

Term
Write-up of Manova/Mancova:
Definition

Write-up of Manova/Mancova:

 

F(3,441) = 9.61, p < .001, λ = .939, partial η² = .06, power = 1.00. This states that there is a significant mean difference between men and women on the combination of the three social support variables. It does not show which dvs are significant

Univariate Anovas: Tcomm: F(1,443) = 13.83, p < .001, partial η² = .03, power = .96. This shows that females (M = 24.20, SD = 3.43) have significantly higher support from community members than males (M = 21.06, SD = 3.56)

Stepdown Analysis: Tpeer (this is a univariate anova): F(1,443) = 22.28, p < .001. This shows that females (M = 28.36, SD = 3.22) have significantly higher support from peers than males (M = 25.29, SD = 3.41). Tcomm (this is an ancova with tpeer as the cv): F(1,442) = 2.78, ns. This shows that females (M = 24.20, SD = 4.10) have similar support from community members compared to males (M = 21.06, SD = 4.09) after controlling for peer support
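The multivariate effect size in this write-up can be checked in two ways, because the iv (men vs. women) has only two levels: as 1 – λ, and from the F value and its degrees of freedom. This short sketch uses only the numbers reported on this card; the variable names are made up for illustration.

```python
wilks_lambda = 0.939
eta_from_lambda = 1 - wilks_lambda      # valid here because the iv has two levels

f_value, df_effect, df_error = 9.61, 3, 441
eta_from_f = (f_value * df_effect) / (f_value * df_effect + df_error)

print(f"1 - lambda = {eta_from_lambda:.3f}, from F = {eta_from_f:.3f}")
```

Both routes round to the reported partial η² of .06.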

 

Term
Multiple Regression (MR)
Definition
this is a statistical technique that allows one to assess the relationship between one dependent variable (DV) and several independent variables (IVs). The IVs can be continuous or dichotomous, or you can change a discrete IV into a set of dichotomous IVs. The DV must be continuous. The result of a MR is an equation that represents the best prediction of a DV from several IVs. MR is very flexible; it can be used with experimental, observational, and survey research
Term
Important terms for mr
Definition

¨ Multiple correlation (R) – this is the correlation between the obtained and the predicted Y (DV) values

¨ Squared multiple correlation (R²) – the proportion of variance in the DV that is predictable from the best linear combination of the IVs. This is the effect size

¨ Adjusted R² – here the R² is adjusted for sample size and the number of IVs. The unadjusted R² tends to be an overestimate of the relationship

¨ B weights – these are the unstandardized regression coefficients. This represents the change in the DV associated with a one unit change in the IV with all other IVs held constant. These weights are in the same metric as the original data

¨ β weights (Beta weights) – these are the standardized regression coefficients. The larger the absolute value of the weight, the stronger the relationship between the IV and the DV (how well does it predict)
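The terms above can be computed directly from an ordinary least-squares fit. This is a minimal NumPy sketch on simulated data; the two IVs, their scales, and the true coefficients are assumptions chosen only to show how B weights and β weights can rank predictors differently.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 200, 2                                         # cases, number of IVs
X = rng.normal(size=(n, k)) * np.array([1.0, 5.0])    # two IVs on different scales
y = 2.0 + 1.5 * X[:, 0] + 0.2 * X[:, 1] + rng.normal(size=n)

design = np.column_stack([np.ones(n), X])             # column of 1s gives the intercept
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
intercept, b_weights = coef[0], coef[1:]              # unstandardized B weights

y_hat = design @ coef
multiple_r = np.corrcoef(y, y_hat)[0, 1]              # R: obtained vs. predicted Y
r_squared = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# Beta weights: B rescaled by the ratio of the IV's SD to the DV's SD
beta_weights = b_weights * X.std(axis=0, ddof=1) / y.std(ddof=1)

print(f"R = {multiple_r:.3f}, R2 = {r_squared:.3f}, adjusted R2 = {adj_r_squared:.3f}")
print("B weights:   ", np.round(b_weights, 3))
print("beta weights:", np.round(beta_weights, 3))
```

Because the second IV has a much larger standard deviation, its small B weight corresponds to a comparatively large β weight, which is why β weights are the ones compared across predictors.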

Term

Important terms for mr

 

Multiple correlation (R)

Definition
this is the correlation between the obtained and the predicted Y (DV) values
Term

Important terms for mr

 

Squared multiple correlation (R²)

Definition

Squared multiple correlation (R²)

the proportion of variance in the DV that is predictable from the best linear combination of the IVs. This is the effect size

Term

Important terms for mr

 

Adjusted R²

Definition
here the R² is adjusted for sample size and the number of IVs. The unadjusted R² tends to be an overestimate of the relationship
Term

Important terms for mr

 

 B weights

Definition
these are the unstandardized regression coefficients. This represents the change in the DV associated with a one unit change in the IV with all other IVS held constant. These weights are in the same metric as the original data
Term

Important terms for mr

 

β weights (Beta weights)

Definition
these are the standardized regression coefficients. The larger the absolute value of the weight, the stronger the relationship between the IV and the DV (how well does it predict)
Term
Important terms for mr
Definition

¨ Standard error of the estimate: this is your prediction error. It is the standard deviation of the residuals (the differences between y-observed and y-predicted)

¨ Squared semi-partial correlations (sr_i²) – this is the unique contribution of the IV to the total variance of the DV. This represents the amount of variance accounted for uniquely by the individual IV

¨ Suppressor variables – when an IV suppresses variance that is unimportant to the regression equation. It enhances the prediction of the DV. You can tell if you have these if the sign of the beta weight for an IV is the opposite of the sign of the bivariate correlation between the IV and the DV
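The standard error of the estimate and the squared semi-partial correlations can be sketched as follows. The squared semi-partial correlation for an IV is computed here as the drop in R² when that IV is removed from the full model. The data and the helper name `fit` are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 150
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)          # the two IVs overlap with each other
y = 1.0 + 0.6 * x1 + 0.4 * x2 + rng.normal(size=n)
X = np.column_stack([x1, x2])

def fit(X, y):
    """Return (R squared, residuals) for an OLS fit that includes an intercept."""
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    r2 = 1 - resid @ resid / np.sum((y - y.mean()) ** 2)
    return float(r2), resid

r2_full, resid = fit(X, y)
k = X.shape[1]

# Standard error of the estimate: SD of the residuals with n - k - 1 df
see = float(np.sqrt(resid @ resid / (n - k - 1)))

# Squared semi-partial correlation: R2 lost when IV i is dropped
sr2 = []
for i in range(k):
    r2_without, _ = fit(np.delete(X, i, axis=1), y)
    sr2.append(r2_full - r2_without)

print(f"R2 = {r2_full:.3f}, standard error of the estimate = {see:.3f}")
print("squared semi-partial correlations:", np.round(sr2, 3))
```

When the IVs are correlated, the unique contributions usually add up to less than R²; the remainder is the overlapping variance that no single IV is credited with.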

Term

Important terms for mr

 

Standard error of the estimate

Definition
this is your prediction error. It is the standard deviation of the residuals (the differences between y-observed and y-predicted)
Term

Important terms for mr

Squared semi-partial correlations (sr_i²)

Definition
this is the unique contribution of the IV to the total variance of the DV.  This represents the amount of variance accounted for by the individual IV
Term

Important terms for mr

 

Suppressor variables

Definition
when an IV suppresses variance that is unimportant to the regression equation.  It enhances the prediction of the DV.  You can tell if you have these if the sign of the beta weight for an IV is the opposite of the sign of the bivariate correlation between the IV and the DV
Term
Similarity of MR to ANOVA
Definition

¨       Analysis of variance and multiple regression are both based on the general linear model. The advantage of using regression versus anova is that you can have continuous predictors (in anova you have to make them categorical – say by using a median split)

¨       Analysis of variance is just a restricted version of multiple regression. In analysis of variance you are trying to assess the amount of variance accounted for in your dv based on your independent variables (which in an experimental design are manipulated). In regression you are also trying to assess the amount of variance accounted for in your dv (criterion, y) based on your ivs (predictors, x)

¨ Any between subjects anova can be performed by using a regression instead. For example, suppose that in your anova your ivs are gender (m, f) and group (exp, cont) and your dv is performance. You would use your ivs as predictors (you can even include the interaction between the two) in a multiple regression

¨ Eta-squared (SS effect / SS total) in anova is the same as R-squared in multiple regression
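The last point can be verified numerically: a one-way anova and a regression on dummy-coded group membership give the same proportion of explained variance. This is a minimal sketch on simulated data with three hypothetical groups.

```python
import numpy as np

rng = np.random.default_rng(5)
groups = [rng.normal(mu, 1.0, 30) for mu in (0.0, 0.5, 1.0)]   # three simulated groups
y = np.concatenate(groups)

# Anova route: eta-squared = SS effect / SS total
ss_total = np.sum((y - y.mean()) ** 2)
ss_effect = sum(len(g) * (g.mean() - y.mean()) ** 2 for g in groups)
eta_squared = ss_effect / ss_total

# Regression route: dummy-code group membership (group 0 is the reference group)
labels = np.repeat([0, 1, 2], 30)
design = np.column_stack([np.ones(len(y)), labels == 1, labels == 2]).astype(float)
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
resid = y - design @ coef
r_squared = 1 - resid @ resid / ss_total

print(f"eta-squared = {eta_squared:.4f}, R-squared = {r_squared:.4f}")
```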

Term
Formulae for MR
Definition

SIMPLE LINEAR REGRESSION

 

Y = BX + A  + E

 

Where Y is the criterion (what you are trying to predict), A is the intercept (where the line crosses the y-axis), B is the slope (the steepness of the line), X is the predictor (the variable that you know the value of), and E is error

 

MULTIPLE REGRESSION

 

Y = B1X1 + B2X2 + ... + BKXK + A + E

 

Where Y is the criterion, the B’s are the regression coefficients, the X’s are the predictors, A is the intercept, and E is error

Term
Assumptions and Issues that must be addressed in MR (7 issues)
Definition

¨       Causation issues – regression analyses reveal relationships among variables but do not imply that the relationships are causal.  Many researchers do not like to use the term ‘predict’ if the design is not a true experiment.  Instead they use terms such as: ‘correlated’, ‘associated’, and ‘related’

¨ Types of variables – regression is best when each IV is strongly correlated with the DV but uncorrelated with other IVs.  You should identify the fewest IVs necessary to predict a DV.  Do not include IVs that are too highly correlated with each other (> |.6|) and do not include IVs that are only weakly correlated with the DV (< |.3|, which represents only about 10% overlap).  Also make sure all variables are reliable

¨       Ratio of cases to IVS – for testing R (N > 50+8m, where m is number of IVS) and for testing individual predictors (N > 104 + m).  If your variables are skewed, have error, or you expect a small effect size then you will need more participants.  If using statistical (stepwise) regression you should have a ratio of at least 40 cases per IV.  Make sure you deal with missing cases prior to performing a MR
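The sample-size rules of thumb in the last point are easy to tabulate. The function name below is made up for illustration; the formulas are the ones given on this card, and your N should exceed each returned value.

```python
def sample_size_thresholds(m):
    """Values that N should exceed for m IVs, using the rules of thumb on this card."""
    return {
        "testing_R": 50 + 8 * m,                      # N > 50 + 8m
        "testing_individual_predictors": 104 + m,     # N > 104 + m
        "stepwise_regression": 40 * m,                # at least 40 cases per IV
    }

thresholds = sample_size_thresholds(5)
print(thresholds)
```

With 5 IVs, for example, the rules call for more than 90 cases to test R, more than 109 to test individual predictors, and at least 200 for stepwise regression.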

Term
Assumptions and Issues that must be addressed in MR (7 issues)
Definition

¨       Absence of outliers – outliers can greatly impact the regression equation and affect the precision of the estimation of regression weights.  Outliers should be deleted, changed, or the variable transformed before performing a MR

¨ Absence of multicollinearity and singularity – delete or collapse variables that have correlations > |.9|

¨ Normality, linearity, and homoscedasticity of residuals – there is no assumption that the IVs must be normal but the prediction is enhanced if they are.  You should have linearity between your IVs and DV.  Transform variables if you have non-linear or heteroscedastic variables

¨       Independence of errors – errors of prediction should be independent of one another
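Screening for multicollinearity with the |.9| rule above can be sketched by scanning the correlation matrix of the IVs. The function name, variable names, and simulated data are assumptions for illustration.

```python
import numpy as np

def flag_collinear_pairs(X, names, threshold=0.9):
    """List pairs of IVs whose bivariate correlation exceeds the threshold in absolute value."""
    r = np.corrcoef(X, rowvar=False)
    flagged = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(r[i, j]) > threshold:
                flagged.append((names[i], names[j], round(float(r[i, j]), 3)))
    return flagged

rng = np.random.default_rng(6)
x1 = rng.normal(size=500)
x2 = rng.normal(size=500)                   # unrelated to x1
x3 = x1 + 0.1 * rng.normal(size=500)        # nearly a copy of x1
X = np.column_stack([x1, x2, x3])

flagged = flag_collinear_pairs(X, ["x1", "x2", "x3"])
print(flagged)
```

Only the x1/x3 pair should be flagged here; one of those two variables would be deleted, or the two would be collapsed, before running the MR.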

Term
Types of Multiple Regression
Definition

¨ Standard (simultaneous) MR – all IVs are entered into the regression equation at once.  Each IV is assigned only the area of its unique contribution to the relationship.  No IV is assigned the overlapping variance.  This is the most widely used type of MR.  You need the fewest participants compared to other types of MR.  You use this type of MR if you simply want to assess the relationships among variables.  This is an atheoretical approach – you have no prior theory about how the variables should be related

¨        Sequential (hierarchical) MR – IVS are entered into the equation in an order specified by the researcher.  This order is based on theory or prior research.  Each IV or set of IVS is assessed in terms of what it adds to the equation at its own point of entry.  Each IV is given the variance (unique and overlapping) that exists at its point of entry into the equation.  IVS can be entered one at a time or in sets.  You use this type of MR if you have specific hypotheses or theories about how your variables should be related.  This is a model-testing method.  You can also use this type of MR if you have covariates, these are put in step 1 so they are controlled for in the other IVS

¨ Statistical (stepwise) MR – the order of entry for the IVs is based on statistical criteria.  The IV with the largest bivariate correlation with the DV is entered first (forward method) or the IV with the smallest bivariate correlation is deleted first (backward method) or a combination of forward and backward is used (stepwise).  Each IV is given the variance (unique and overlapping) that exists at its point of entry into the equation.  This method is useful when you want to find a subset of IVs that is useful in predicting the DV.  You use this method if you want to build a model of prediction and you want to find the best predictors of your DV.  This is usually an exploratory technique and you need a large sample size to perform it, since it tends to capitalize on chance and overfit the data
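Sequential (hierarchical) MR is usually evaluated by the change in R² at each step. The sketch below enters a covariate in step 1 and an IV in step 2, then tests the R² change with an F test. The data, the variable names, and the helper `r_squared` are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200
covariate = rng.normal(size=n)
iv = 0.3 * covariate + rng.normal(size=n)             # IV overlaps with the covariate
dv = 0.5 * covariate + 0.4 * iv + rng.normal(size=n)

def r_squared(predictors, y):
    """R squared for an OLS fit of y on a list of predictor arrays plus an intercept."""
    design = np.column_stack([np.ones(len(y))] + predictors)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(1 - resid @ resid / np.sum((y - y.mean()) ** 2))

r2_step1 = r_squared([covariate], dv)          # step 1: covariate only
r2_step2 = r_squared([covariate, iv], dv)      # step 2: covariate plus the IV
r2_change = r2_step2 - r2_step1

k_added, k_total = 1, 2
f_change = (r2_change / k_added) / ((1 - r2_step2) / (n - k_total - 1))

print(f"step 1 R2 = {r2_step1:.3f}, step 2 R2 = {r2_step2:.3f}")
print(f"R2 change = {r2_change:.3f}, F change(1, {n - k_total - 1}) = {f_change:.2f}")
```

The R² change is the variance credited to the IV at its point of entry, after the covariate has already been given whatever variance it shares with the IV.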

Term
Logistic Regression (LR): 
Definition

this is a statistical technique that allows one to predict a discrete outcome (DV) from a set of IVS that can be continuous, discrete, dichotomous, or a combination of all three.  The discrete outcome in many logistic regressions in health sciences is disease/no disease.  This type of analysis is related to DFA, MFA, and MR but it is much more flexible.  Logistic regression has no assumptions about the distribution of the predictor variables (IVS); they don’t have to be normally distributed, linearly related, or of equal variance within each group.  Logistic regression can be used to fit and compare models.  Logistic regression is relatively free of restrictions.  If assumptions regarding distributions are met you should use DFA if you have a dichotomous DV or MR if you have a continuous DV

 

Binomial logistic regression:  two outcomes (levels) of the DV

Multinomial logistic regression:  more than two outcomes (levels) of the DV

Term

 

Goal of Logistic Regression: 
Definition
to find the best linear combination of predictors (IVS) to maximize the likelihood of obtaining the observed outcome frequencies.  You perform a logistic regression to correctly predict the category of the outcome for individual cases
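Maximizing the likelihood can be sketched with a few Newton-Raphson steps for a binomial logistic regression. This is a minimal NumPy illustration on simulated data, not the routine used by any particular statistics package; the function names and the true coefficients are assumptions.

```python
import numpy as np

def fit_logistic(x, y, n_iter=25):
    """Maximum-likelihood coefficients (intercept first) via Newton-Raphson."""
    design = np.column_stack([np.ones(len(y)), x])
    b = np.zeros(design.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-design @ b))              # predicted probabilities
        gradient = design.T @ (y - p)
        hessian = design.T @ (design * (p * (1 - p))[:, None])
        b = b + np.linalg.solve(hessian, gradient)
    return b

def log_likelihood(b, x, y):
    """Log-likelihood of the observed outcomes given coefficients b."""
    design = np.column_stack([np.ones(len(y)), x])
    p = 1.0 / (1.0 + np.exp(-design @ b))
    return float(np.sum(y * np.log(p) + (1 - y) * np.log(1 - p)))

rng = np.random.default_rng(8)
n = 500
x = rng.normal(size=n)
p_true = 1.0 / (1.0 + np.exp(-(-0.5 + 1.0 * x)))           # hypothetical true model
y = (rng.random(n) < p_true).astype(float)                 # 1 = disease, 0 = no disease

b_hat = fit_logistic(x, y)
ll_start = log_likelihood(np.zeros(2), x, y)
ll_fitted = log_likelihood(b_hat, x, y)
print("estimated coefficients:", np.round(b_hat, 3))
print(f"log-likelihood: {ll_start:.1f} at the start, {ll_fitted:.1f} at the solution")
```

Each case is then assigned to the outcome category with the higher predicted probability.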
Term
Strength of association (R2): 
Definition
this measures how strong the relationship is between the DV and the set of IVs in a particular model.  It is the proportion of variance in the DV that is associated with a set of IVs
Term
Assumptions and Issues that must be addressed in LR (6 issues)
Definition

¨        Ratio of cases to IVS – LR may produce extremely large parameter estimates and standard errors and it may fail to find a solution if your sample size is too small.  You should collapse levels of the DV or delete or collapse IVS to avoid this

¨ Adequacy of expected frequencies and power – you will have inadequate power if your expected frequencies are too low; they should all be greater than 1 and no more than 20% of them should be less than 5.  The power of your logistic regression increases with sample size

¨        Linearity in the logit – logistic regression assumes a linear relationship between continuous predictors and the logit transform of the DV but there doesn’t have to be linear relationships between your individual predictors

¨ Absence of multicollinearity – logistic regression is sensitive to extremely high correlations among predictor variables.  Delete or collapse IVs if they are too highly correlated

¨        Absence of outliers – outliers greatly impact LR.  Delete or transform them prior to performing LR

¨        Independence of errors – LR assumes that responses of different cases are independent of each other

Term
Types of Logistic Regression
Definition

¨      Standard (direct) LR – all predictors (IVS) are entered into the equation at the same time.  This is the method of choice when you have no specific hypotheses about the order or importance of predictor variables

¨      Sequential LR – here the researcher specifies the order of the predictors into the model.  You can also have covariates in this type of LR

¨      Stepwise (statistical) LR – the inclusion and removal of predictors is solely based on statistical criteria.  This type of LR is useful when you want to screen for important (best predictors) IVS or to generate hypotheses about the IV/DV relationships

Term
Assessing goodness of fit models for lr
Definition

¨       Constant only model – this includes no predictors

¨       Full model – this includes the constant plus all the predictors

¨       Comparing constant only to full model – this will show you if the predictors, as a group, contribute to the prediction of the outcome.  If this shows significance then it states that the combination of IVS are significant predictors of the DV.  You want a significant chi-square test here

¨       Testing individual predictors – to test to see which IVS are significant predictors you need to look at the Wald test.  You want this to be significant, indicating that the IV is a significant predictor
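The two tests above can be sketched for a model with a single predictor, where both chi-square statistics have 1 degree of freedom. This is a minimal illustration on simulated data; the variable names are assumptions, and the p values use the fact that a 1-df chi-square is a squared standard normal.

```python
import math
import numpy as np

rng = np.random.default_rng(9)
n = 400
x = rng.normal(size=n)
p_true = 1 / (1 + np.exp(-(-0.5 + 1.0 * x)))           # hypothetical true model
y = (rng.random(n) < p_true).astype(float)

# Full model: constant plus the predictor, fitted by Newton-Raphson
design = np.column_stack([np.ones(n), x])
b = np.zeros(2)
for _ in range(25):
    p = 1 / (1 + np.exp(-design @ b))
    hessian = design.T @ (design * (p * (1 - p))[:, None])
    b = b + np.linalg.solve(hessian, design.T @ (y - p))

p = 1 / (1 + np.exp(-design @ b))
ll_full = float(np.sum(y * np.log(p) + (1 - y) * np.log(1 - p)))

# Constant-only model: every case gets the overall proportion of 1s
p0 = float(y.mean())
ll_constant = n * (p0 * math.log(p0) + (1 - p0) * math.log(1 - p0))

# Comparing constant-only to full model (likelihood-ratio chi-square, df = 1)
model_chi_square = 2 * (ll_full - ll_constant)
p_model = math.erfc(math.sqrt(model_chi_square / 2))

# Wald test for the predictor: (B / SE) squared, df = 1
hessian = design.T @ (design * (p * (1 - p))[:, None])
se = np.sqrt(np.diag(np.linalg.inv(hessian)))
wald_chi_square = float((b[1] / se[1]) ** 2)
p_wald = math.erfc(math.sqrt(wald_chi_square / 2))

print(f"model chi-square(1) = {model_chi_square:.2f}, p = {p_model:.4g}")
print(f"Wald chi-square(1)  = {wald_chi_square:.2f}, p = {p_wald:.4g}")
```

With more than one predictor the model chi-square has as many degrees of freedom as there are predictors, so its p value would need a general chi-square distribution function.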

Term
Classification of cases for lr
Definition

¨      Classification is available only when you have two levels of the DV

¨      Type I error – classifying a non-diseased case (0) as a diseased case (1)

¨      Type II error – classifying a diseased case (1) as a non-diseased case (0)
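A classification table using the error definitions above can be sketched as follows. The function name, the .5 cutoff, and the six example cases are assumptions for illustration.

```python
import numpy as np

def classification_table(y_true, p_hat, cutoff=0.5):
    """Count correct classifications and both error types for a two-level DV."""
    y_true = np.asarray(y_true)
    y_pred = (np.asarray(p_hat) >= cutoff).astype(int)
    return {
        "correct_non_diseased": int(np.sum((y_true == 0) & (y_pred == 0))),
        "type_I_errors": int(np.sum((y_true == 0) & (y_pred == 1))),    # 0 classified as 1
        "type_II_errors": int(np.sum((y_true == 1) & (y_pred == 0))),   # 1 classified as 0
        "correct_diseased": int(np.sum((y_true == 1) & (y_pred == 1))),
    }

observed = [0, 0, 0, 1, 1, 1]                    # hypothetical cases
predicted_prob = [0.1, 0.6, 0.3, 0.8, 0.4, 0.9]  # hypothetical predicted probabilities

table = classification_table(observed, predicted_prob)
print(table)
```

In this made-up example four of the six cases are classified correctly, with one error of each type.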

Term
Odds Ratio (e^B)
Definition

the increase (or decrease if the ratio is less than 1) in odds of being in one outcome category when the value of the predictor (IV) increases by one unit.  B coefficients are the natural logs of the odds ratios (odds ratio = e^B).  Odds ratios greater than 1 show an increase in odds of an outcome of 1 with a one-unit increase in the predictor.  Odds ratios less than 1 show the decrease in odds of an outcome of 1 with a one-unit increase in the predictor.  In health sciences odds ratios are often loosely referred to as relative risk, although the two are close only when the outcome is rare.  You can look at the odds ratios to determine the best predictors.  The farther the odds ratio is from 1, the more influential the predictor is

 

No relationship between IV and DV:  odds ratio of 1

Negative relationship between IV and DV:  odds ratio of < 1

Positive relationship between IV and DV:  odds ratio of > 1
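The link between a B coefficient and its odds ratio can be shown in a few lines. The function name and the example coefficients are made up for illustration.

```python
import math

def odds_ratio(b):
    """Odds ratio for a one-unit increase in the predictor, given its B coefficient."""
    return math.exp(b)

no_relationship = odds_ratio(0.0)            # B = 0
positive = odds_ratio(math.log(2))           # B = ln 2
negative = odds_ratio(-math.log(2))          # B = -ln 2

# Effect on a case whose odds start at 1 (a probability of .5)
new_odds = 1.0 * positive
new_probability = new_odds / (1 + new_odds)

print(no_relationship, round(positive, 3), round(negative, 3), round(new_probability, 3))
```

A B of zero gives an odds ratio of 1, and positive and negative B values of the same size give odds ratios that are reciprocals of each other (here 2 and .5).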

Term
Factor Analysis (FA) and Principal Components Analysis (PCA)
Definition
statistical techniques applied to a single set of variables when the researcher is interested in discovering which variables in the set form coherent subsets that are relatively independent from each other.   These subsets, called factors or components, are thought to reflect underlying processes that have created correlations among the variables
Term
Goals of FA/PCA
Definition

¨      Summarize patterns of correlations among observed (marker) variables

¨      Reduce large number of observed variables to a smaller number of factors/components

¨      Provide an operational definition (regression equation) for an underlying process by using observed variables

¨      Test a theory about the nature of underlying processes

Term
Steps to FA/PCA (7 steps)
Definition

¨      Selecting and measuring a set of variables

¨      Preparing the correlation matrix

¨      Extracting a set of factors/components from the correlation matrix

¨      Determining the number of factors/components

¨      (Probably) rotating the factors/components to increase interpretability

¨      Interpreting the results

¨      Verifying the factor/component structure by establishing construct validity of factors
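As a rough illustration of the second step only (preparing the correlation matrix), here is a plain-Python sketch. The variable scores are invented, and the helper names are assumptions. Extraction and rotation would normally be done in a statistics package and are not shown.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_matrix(variables):
    """Build the full matrix of pairwise correlations (R)."""
    return [[pearson_r(a, b) for b in variables] for a in variables]


# Hypothetical scores for five cases on three observed variables
var_1 = [1, 2, 3, 4, 5]
var_2 = [2, 4, 6, 8, 10]
var_3 = [2, 1, 4, 3, 5]

R = correlation_matrix([var_1, var_2, var_3])
for row in R:
    print([round(r, 2) for r in row])
```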

Term
Types of rotation
Definition

¨        Orthogonal rotation - for example, varimax in SPSS.  This type of rotation ensures that all the factors are uncorrelated with each other.  This produces a loading matrix (a matrix of correlations between observed variables and factors).  This type of rotation tends to make the higher loadings higher and the lower loadings lower.  If a variable loads |.32| or greater it is usually interpreted.  This type of rotation offers ease of interpreting, describing, and reporting results.  You should not use this type of rotation if your factors are correlated with each other

¨        Oblique rotation - for example, direct oblimin in SPSS.  This type of rotation allows factors to be correlated with each other.  The loading matrix is split into the pattern matrix (matrix of unique relationships between variables and factors) and the structure matrix (matrix of correlations between variables and factors).  The pattern matrix should be interpreted.  If a variable loads |.32| or greater it is usually interpreted.  There are conceptual advantages to oblique rotation but it tends to be more difficult to interpret than orthogonal rotation.  If your factors are not correlated with each other you should not use this type of rotation

Term
Types of Factor Analysis
Definition

¨      Exploratory factor analysis (EFA) - here the researcher seeks to describe and summarize data by grouping together variables that are correlated.  This is performed at the early stages of research.  It helps to generate hypotheses about underlying processes

¨      Confirmatory factor analysis (CFA) - this is a much more sophisticated technique used in the advanced stages of research.  It tests a theory about latent processes.  It is usually performed using structural equation modeling

Term
Assumptions and issues that need to be addressed in FA/PCA (7 issues)
Definition

¨        Variability – your sample should have a lot of spread (variability) in its scores on the variables for the factors to emerge

¨        Combining samples – you should not combine different samples together to perform a FA/PCA.  This is especially true if the samples differ on some other variables (gender, SES, etc.)

¨        Sample size – sample size needs to be large enough so your correlations are reliably estimated.  A general rule is to have at least 300 cases for FA/PCA

¨        Normality – there is no assumption that individual variables have to be normal when FA/PCA is used descriptively, but the solution is enhanced if they are.  Multivariate normality is assumed when statistical inference is used to determine the number of factors.  Check individual variables for non-normality and transform if needed

¨        Linearity – relationships among pairs of variables should be linear; the analysis is degraded when linearity is not met.  Delete or transform variables if they are non-linear

¨        Absence of outliers among cases and variables – outliers tend to make factors unreliable.  Delete or modify outliers if found

¨        Factorability of R – the matrix of correlations should include several sizable correlations.  If you have no correlations greater than |.30| reconsider use of FA/PCA
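The factorability check in the last bullet can be sketched as a scan of the correlation matrix for sizable values. The matrix below is hypothetical and the function name is an assumption.

```python
def sizable_correlations(r_matrix, cutoff=0.30):
    """Return off-diagonal pairs (i, j, r) whose absolute value exceeds the cutoff."""
    pairs = []
    n = len(r_matrix)
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle only; R is symmetric
            if abs(r_matrix[i][j]) > cutoff:
                pairs.append((i, j, r_matrix[i][j]))
    return pairs


# Hypothetical correlation matrix for four observed variables
R = [
    [1.00, 0.45, 0.10, -0.38],
    [0.45, 1.00, 0.05, -0.20],
    [0.10, 0.05, 1.00, 0.02],
    [-0.38, -0.20, 0.02, 1.00],
]

found = sizable_correlations(R)
if not found:
    print("No correlations above the cutoff: reconsider FA/PCA")
else:
    print(f"{len(found)} sizable correlations: {found}")
```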

Term
To determine number of factors/components
Definition
Retain factors/components with eigenvalues greater than 1 (an eigenvalue is the amount of variance a factor accounts for; when a correlation matrix is analyzed, dividing it by the number of variables gives the % of variance explained), a total % of variance accounted for greater than 50%, a scree test that agrees with the previous two criteria, and at least 3 variables loading on each factor.  Also, Cronbach’s alpha (internal consistency) should be calculated for each factor; this should be .80 or greater (though .70 is acceptable for many psychological studies)
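A hedged sketch of two of these checks: the eigenvalue-greater-than-1 rule with the % of variance it implies, and Cronbach's alpha for the items on one factor. The eigenvalues and item scores are invented, and the sketch assumes one eigenvalue per variable from an analysis of a correlation matrix.

```python
from statistics import variance


def kaiser_retained(eigenvalues):
    """Return the eigenvalues > 1 and the total % of variance they explain.

    Assumes one eigenvalue per observed variable from a correlation matrix,
    so the eigenvalues sum to the number of variables.
    """
    n_vars = len(eigenvalues)
    kept = [ev for ev in eigenvalues if ev > 1]
    pct_explained = 100 * sum(kept) / n_vars
    return kept, pct_explained


def cronbach_alpha(items):
    """Cronbach's alpha for a list of item-score lists (one list per item)."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # each case's total score
    item_variance = sum(variance(item) for item in items)
    return (k / (k - 1)) * (1 - item_variance / variance(totals))


# Hypothetical eigenvalues for six observed variables
kept, pct = kaiser_retained([2.8, 1.5, 0.7, 0.5, 0.3, 0.2])
print(f"Retain {len(kept)} factors explaining {pct:.1f}% of variance")

# Hypothetical scores of five cases on three items loading on one factor
items = [[2, 3, 4, 5, 4], [3, 3, 5, 5, 4], [1, 3, 4, 6, 5]]
print(f"Cronbach's alpha = {cronbach_alpha(items):.2f}")
```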
Term
Moderators: x credit
Definition
These are qualitative or quantitative variables that affect the direction and/or strength of the relation between an IV (predictor) and a DV
Term
Moderation occurs: x credit
Definition
When the direction of the correlation changes based on the value of the moderating variables
Term
Moderation occurs: x credit
Definition
When the moderator interacts with the IVs to influence the DV
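One simple way to see moderation, sketched under made-up data: compute the IV-DV correlation separately at each level of a categorical moderator and compare the signs. A full analysis would usually test an interaction term in a regression instead, which is not shown here.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical IV and DV scores at two levels of a moderator
data = {
    "moderator_low": {"iv": [1, 2, 3, 4, 5], "dv": [2, 3, 5, 6, 8]},
    "moderator_high": {"iv": [1, 2, 3, 4, 5], "dv": [8, 7, 5, 4, 1]},
}

r_by_level = {level: pearson_r(d["iv"], d["dv"]) for level, d in data.items()}
for level, r in r_by_level.items():
    print(f"{level}: r = {r:+.2f}")

# Opposite signs across levels suggest the moderator changes the direction
signs_differ = r_by_level["moderator_low"] * r_by_level["moderator_high"] < 0
print("Direction changes with the moderator:", signs_differ)
```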
Term
Mediators: x credit
Definition
A mediator accounts for the relation between the predictor IV and the criterion DV. If you find mediation in your analysis you are saying that there is an indirect relationship between your IV and DV through the mediator instead of a direct relationship between the IV and DV
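A minimal sketch of the direct versus indirect relationship, assuming a single mediator and ordinary least-squares slopes. The path names (a, b, c, c_prime) follow a common convention that is not part of this card, and the data are invented.

```python
def cross_products(u, v):
    """Sum of cross-products of deviations from the means."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))


def mediation_paths(x, m, y):
    """Least-squares path estimates for a single-mediator model.

    a       : slope of the mediator M regressed on the IV X
    b       : slope of M when the DV Y is regressed on X and M together
    c_prime : direct effect of X on Y, controlling for M
    c       : total effect of X on Y (Y regressed on X alone)
    """
    sxx, smm = cross_products(x, x), cross_products(m, m)
    sxm, sxy, smy = cross_products(x, m), cross_products(x, y), cross_products(m, y)
    det = sxx * smm - sxm ** 2  # zero only if X and M are perfectly correlated
    a = sxm / sxx
    b = (sxx * smy - sxm * sxy) / det
    c_prime = (smm * sxy - sxm * smy) / det
    c = sxy / sxx
    return {"a": a, "b": b, "c_prime": c_prime, "c": c, "indirect": a * b}


# Hypothetical scores on the IV (x), mediator (m), and DV (y)
x = [1, 2, 3, 4, 5, 6]
m = [2, 1, 4, 3, 6, 5]
y = [3, 2, 6, 5, 9, 8]

paths = mediation_paths(x, m, y)
print({name: round(value, 3) for name, value in paths.items()})
```

With least-squares estimates the total effect equals the direct effect plus the indirect effect (c = c_prime + a * b), so a large indirect share points toward mediation.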
Term

Cluster analysis: x credit

Definition

A group of multivariate techniques whose primary purpose is to assemble objects based on the characteristics that they possess. Cluster analysis classifies objects so that each object is similar to others in the cluster with respect to a predetermined selection criterion (e.g., grouping students' scores on variables into homogeneous groups to form student profiles)
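As an illustration only, here is a very small k-means routine, which is one common clustering technique among many and is not named in this card. The student scores and the starting centers are hypothetical.

```python
def kmeans(points, centers, n_iter=10):
    """Tiny k-means: assign each point to its nearest center, then move
    each center to the mean of the points assigned to it."""
    for _ in range(n_iter):
        labels = [
            min(
                range(len(centers)),
                key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centers[k])),
            )
            for point in points
        ]
        new_centers = []
        for k in range(len(centers)):
            members = [pt for pt, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centers.append(centers[k])  # keep the center of an empty cluster
        centers = new_centers
    return labels, centers


# Hypothetical (math, reading) scores for six students
students = [(90, 85), (88, 92), (95, 89), (55, 60), (60, 52), (58, 65)]
start = [(90, 85), (55, 60)]  # assumed starting centers

labels, centers = kmeans(students, start)
print("Cluster labels:", labels)
print("Cluster centers:", centers)
```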

Term

Structural equation modeling (sem): x credit

Definition

This is a collection of statistical techniques that allow a set of relationships between one or more IVs, either continuous or discrete, and one or more DVs, either continuous or discrete, to be examined. Both IVs and DVs can be either factors or measured variables. SEM analyses covariance.

Term
Structural equation modeling (sem): x credit
Definition

SEM analyses path diagrams. The diagram is usually referred to as the model. It is a quantitative method that combines path analysis and factor analysis, and it can be seen as an extension of FA and MR combined. It gives the full picture!

Term
Meta-analysis: x credit
Definition

Combines a number of empirical studies and provides a summary of results. Researchers use a sample or population of research rather than a sample or population of people to examine the direction and magnitude of the effects across studies.

Term
Meta-analysis (analyses used): x credit
Definition

analyses used include:

Pearson’s r, Cohen’s d

Term
Meta-analysis (how quantify the data?): x credit
Definition

§        Researchers code descriptive features of each study using categorical or continuous coding schemes

§        An effect size is used to standardize findings across studies so they can be compared
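A rough sketch of standardizing findings with an effect size: Cohen's d per study, then a combined estimate. The inverse-variance weighting and the variance approximation are common conventions assumed here, not something stated in this card, and the study summaries are invented.

```python
from math import sqrt


def cohens_d(mean_1, mean_2, sd_1, sd_2, n_1, n_2):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / (n_1 + n_2 - 2)
    return (mean_1 - mean_2) / sqrt(pooled_var)


def d_variance(d, n_1, n_2):
    """Common large-sample approximation to the sampling variance of d."""
    return (n_1 + n_2) / (n_1 * n_2) + d ** 2 / (2 * (n_1 + n_2))


def weighted_mean_effect(effects):
    """Inverse-variance weighted mean of (d, variance) pairs."""
    weights = [1 / var for _, var in effects]
    return sum(w * d for w, (d, _) in zip(weights, effects)) / sum(weights)


# Hypothetical study summaries: (mean_1, mean_2, sd_1, sd_2, n_1, n_2)
studies = [
    (10.0, 8.0, 2.0, 2.0, 20, 20),
    (52.0, 50.0, 8.0, 8.0, 50, 50),
    (3.4, 3.1, 1.0, 1.0, 30, 30),
]

effects = []
for study in studies:
    d = cohens_d(*study)
    effects.append((d, d_variance(d, study[4], study[5])))

print("Per-study d:", [round(d, 2) for d, _ in effects])
print("Weighted mean d:", round(weighted_mean_effect(effects), 2))
```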

Term
Meta-analysis (answers the question?): x credit
Definition

What significant relationships exist between the IVs (features or characteristics of the individual studies) and the DVs (outcomes expressed as an effect size)?

Term
Multilevel-modeling (MLM): x credit
Definition

Also known as hierarchical linear modeling (HLM), random coefficients modeling (RC), or covariance components modeling. An analysis used when data are hierarchical or clustered in structure (e.g., elementary school students nested in classrooms). It is an alternative to other analyses such as ANOVA and MR.

Term
Multilevel-modeling (MLM), the goal?: x credit
Definition

To determine if group-level variables have an effect on individual-level DVs directly, or indirectly as moderators of lower level relationships. Use 3 time points.
