Research Design Exam I

Subject: Psychology
Level: Graduate
Created: 03/09/2013
Total cards: 318
Cards

Term
4 points of Meier: Why study measurement?
Definition
(1) Graduate students don't have a good understanding of measurement, (2) Lack of appreciation for its importance in the advancement of science, (3) Overemphasis on home-grown scales with unknown psychometric properties, and (4) Need for more instruction on how to evaluate/report psychometric properties
Term
Complete quote: "Anything that exists in an amount can be measured and..."
Definition
"...anything that can be measured can be measured inaccurately."
Term
In psychology, we typically study
Definition
Hypothetical constructs
Term
What are hypothetical constructs
Definition
Abstract concepts used to relate different behaviors according to their underlying features or causes.
Term
What is the key feature of hypothetical constructs?
Definition
Cannot be observed
Term
Constructs are assumed to exist based on
Definition
Observable behaviors
Term
Behaviors and constructs have a ______ relationship
Definition
Reciprocal
Term
Define scale
Definition
Consists of effect indicators.  In the example of depression, these would be weight fluctuation, crying spells, suicidality, etc.
Term
Define index
Definition
Consists of cause indicators.  These influence the level of a construct.  In the example of graduate school performance, an index would assess for parental income, organizational skills, motivation, value of higher education, etc.
Term
Define latent variables
Definition
Presumed to "cause" an item score in whole or in part.  These are the hypothetical constructs of interest.
Term
We are less interested in ______ and more interested in ______
Definition
Items/scores; constructs
Term
"A well-designed study cannot make up for..."
Definition
"...problematic measurement."
Term
The item value should be correlated with...
Definition
True score of the construct
Term
True or false: True scores are measured.
Definition
False: true scores cannot be measured directly because observed scores always contain error and the influence of extraneous variables/factors.
Term
Inter-item relationships
Definition
Scores on a measure should co-vary with the true score.
Term
Define error
Definition
The difference between the variable (true score) and how it is represented in measurement (observed score); variance not due to the operation of the latent variable.
Term
Observed score =
Definition
True score (of the latent variable) + measurement error
Term
Error variance =
Definition
Residual variance that may be random or systematic
Term
3 assumptions regarding error in classical measurement theory (CMT)
Definition
(1) Amount of error associated with individual items varies randomly, (2) Error terms are uncorrelated across items, and (3) Error terms are uncorrelated with the true score of the latent variable.
Term
Standardized path coefficients
Definition
Show the strength of the relationship between the latent construct and each observed variable of interest.  A good path diagram of a latent variable will have path coefficients that are high (close to 1).
Term
Item correlations allow for
Definition
Estimation of path coefficients
Term
The best index of true score is
Definition
Score on a measure
Term
The cross-products of path coefficients are
Definition
The correlations between items.
Term
Parallel and strictly parallel assumptions of CMT
Definition
(1) The amount of influence from the latent construct to each item is the same, and (2) Each item is assumed to have the same amount of error (under this assumption, the path coefficients from the construct to all items should be the same).
Term
Congeneric model (Joreskog, 1971)
Definition
All items share a common latent variable; the latent variable need not exert the same strength of relationship to all items (path coefficients may be different/unequal), and error variances need not be equal.  The latter two follow from relaxing the parallel assumptions, which also allows you to see the latent variable's influence on the indicators.
Term
Maximum likelihood
Definition

Most commonly used estimation technique: the likelihood that the observed correlations are drawn from a population that is the same as/similar to your sample.

  • Assumes large N (a few hundred)
  • Assumes a multivariate normal distribution
  • Individual variables are normally distributed
  • Assumes continuous variables
  • Chooses estimates that have the greatest chance of reproducing the obtained data (correlation matrix)
Term
Non-moderator variables =
Definition
Continuous
Term
Moderator variables can be
Definition
Categorical
Term
Relationship between item-total correlations and standardized path coefficients
Definition
Item total correlations (relationship between item and scale) allow for estimation of path coefficients.
Term
Problem with item-total correlations
Definition
The total/scale is influenced by the item to which it is being compared.
Term
If the item-total correlation for an item is less than alpha, then
Definition
It is eliminated.
Term
How can interrelationships among test items be empirically assessed?
Definition
Through reliability testing (coefficient alpha/internal consistency).  Covariance matrices are also used.
Term
What is the difference between a variance-covariance matrix and a correlation matrix?
Definition
A correlation matrix is standardized, whereas a variance-covariance matrix is not and maintains the original scaling of the items.
Term
What is a variance-covariance matrix?
Definition
A matrix that represents the score variations of variables (variances), pairs of variables (covariances), and the entire dataset.
Term
What do the diagonals and the off-diagonals in a variance-covariance matrix represent?
Definition
The diagonals are the item variances and the off-diagonals are the item covariances.
Term
What is coefficient alpha?
Definition
Coefficient alpha is concerned with the homogeneity of items within a scale.  It shows the proportion of a scale's total variance that is attributable to a common source of a latent variable underlying the items.  The higher the value of coefficient alpha, the more the items are related to each other.
Term
What is communal variation?
Definition
The off-diagonal elements (item covariances) in the covariance matrix.
Term
What is non-communal variation?
Definition
The diagonal elements (item variances) in the covariance matrix.
Term
What is the relationship between coefficient alpha, communal variation, and non-communal variation?
Definition
Coefficient alpha is the ratio of communal variation to total variation (communal plus non-communal), i.e., how related the items are to each other.
Term
Define reliability.
Definition
The consistency of measurement across time or items.  Devellis: "...proportion of variance attributable to the true score of the latent variable."
Term
What are some methods for determining reliability?
Definition
Internal consistency, temporal stability, split-half reliability, and other forms of reliability that require a parallel version of the instrument.
Term
What is internal consistency?
Definition
Coefficient alpha: the degree to which the items on a scale are interrelated.
Term
What is an example of temporal stability?
Definition
Test-retest correlation.
Term
What is split-half reliability?
Definition
The first half of the measure is correlated with the second half.  It is related to coefficient alpha in that alpha is (approximately) the average of all possible split-half reliabilities.
Term
Schmitt (1996): "Alpha is not a good measure of homogeneity, rather it is a good measure of _____."
Definition
Interrelatedness
Term
Variability in a set of scores is due to...
Definition
Actual variation across individuals (signal) and error (noise).
Term
What is a signal?
Definition
Actual variation across individuals.
Term
What is noise?
Definition
Error.
Term
1 - error =
Definition
Alpha or internal consistency reliability.
Term
σy² is equal to...
Definition
The variance for the measure as a whole.
Term
In order to compute alpha, we need...
Definition
Estimates of total variance and common variance.
Term
σy² is also equal to
Definition
C, which is the sum of all matrix components.
Term
Two assumptions when calculating alpha
Definition
(1) Items in a scale are correlated because they are all affected by the same latent variable (Y), and (2) Error terms are uncorrelated and reflect the unique variation each item possesses.
Term
As a measure of reliability and as a ratio, coefficient alpha is...
Definition
Common-source variation to total variation in scores.
Term
What influences item score and scale score?
Definition
(1) Source variance that is common to itself and other items, and (2) Unique unshared variance (error).
Term
The variance of a k-item scale (k being the number of items) equals
Definition
The sum of all the matrix items.
Term
The sum of the elements along the main diagonal equals
Definition
The sum of the variances of individual items.
Term
The sum of the matrix elements equals
Definition
The total variance for the scale.
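A minimal numpy sketch of these sums and of coefficient alpha, using a hypothetical 3-item variance-covariance matrix (the matrix values and variable names are illustrative assumptions, not course data):

```python
import numpy as np

# Hypothetical variance-covariance matrix for a 3-item scale:
# diagonals = item variances, off-diagonals = item covariances.
C = np.array([
    [1.00, 0.55, 0.60],
    [0.55, 1.20, 0.50],
    [0.60, 0.50, 0.90],
])

k = C.shape[0]                 # number of items
total_variance = C.sum()       # sum of ALL matrix elements = scale variance
item_variances = np.trace(C)   # sum of the main diagonal

# Coefficient alpha: the proportion of total variance that is communal
# (off-diagonal), rescaled by k/(k-1) so a perfect scale can reach 1.
alpha = (k / (k - 1)) * (1 - item_variances / total_variance)
print(round(alpha, 3))         # ~0.773
```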
Term
Communal error
Definition
Variability due to changes in the value of the common source.
Term
Non-communal error
Definition
Variability in items that is not attributable to the common source.
Term
As alpha goes up, consistency goes
Definition
Up.
Term
Why is use of the term homogeneous (when referring to alpha) controversial?
Definition
(1) You can have a high alpha coefficient with more than one source of variance, and (2) Coefficient alpha is sensitive to the length of the scale (with higher alphas for longer scales).
Term
The cutoff for high alpha is
Definition
.7
Term
The cutoff for temporal stability is
Definition
.9
Term
Alpha ranges from
Definition
0 to 1
Term
Temporal stability ranges from
Definition
0 to 1
Term
Assumptions of temporal stability
Definition
(1) Variable being measured is stable, and (2) Error is not associated with time of measurement (error is assumed to be constant across time).
Term
Concerns surrounding test-retest reliability
Definition
(1) Accurate measurement of variable behavior/trait will result in low test-retest reliability, (2) Insensitive measure of volatile trait may yield artificially high test-retest reliability, (3) Third variable may moderate expression of trait, (4) Maturation effects, (5) Systematic oscillation, and (6) Fatigue effects
Term
Interobserver agreement (IOA) is
Definition
Cohen's kappa.
Term
Kappa is a function of
Definition
A ratio of agreements to disagreements in relation to the expected frequencies.
Term
Define kappa.
Definition
Percent agreement per chance removed.
Term
The formula for kappa is
Definition
K = [Pr(a)-Pr(e)]/[1-Pr(e)], where Pr(a) is the observed percent agreement and Pr(e) is the percent agreement expected by chance.
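A small worked example of this formula; the 2X2 agreement table below is invented for illustration:

```python
# Two raters classify 100 cases as positive/negative:
#                 rater B: pos   rater B: neg
# rater A: pos         40             10
# rater A: neg          5             45
n = 100
p_observed = (40 + 45) / n                      # Pr(a): observed agreement
p_a_pos, p_b_pos = (40 + 10) / n, (40 + 5) / n  # marginal "positive" rates
p_chance = p_a_pos * p_b_pos + (1 - p_a_pos) * (1 - p_b_pos)  # Pr(e)

kappa = (p_observed - p_chance) / (1 - p_chance)
print(round(kappa, 2))                          # 0.7
```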
Term
Validity coefficients are constrained by
Definition
Reliability.
Term
Increased reliability means increased
Definition
Power.
Term
Define power.
Definition
Ability to detect mean differences between conditions.
Term
Define validity.
Definition
"Does the test measure what it purports to measure?" (Anatasi).  "Are the scores produced by an instrument or assessment procedure meaningful?" (Cronbach & Meehl, 1955).
Term
What is the relationship between reliability and validity?
Definition
Reliability and validity are co-dependent: for a test to measure what it purports to measure, the variance in scores must be attributable to the true score of the latent variable (and vice versa).
Term
When developing a test, you should review the literature and define the construct of interest by...
Definition
Comprehensively searching appropriate databases, consulting experts in the area, consulting compendiums designed to list measures of limited distribution, and including behavior excesses/deficits, skills/weaknesses, and traits.
Term
When developing a test, the test planning and layout should consist of...
Definition
Selection of a representative sample (one that has the same levels of the relevant characteristics as the population to which you want to generalize), behavior domain, behavior sample, and table of specifications (lists processes, content, and number of items per cell)
Term
What is a behavior domain?
Definition
The entire range of behaviors the test is supposed to measure.
Term
What is a behavior sample?
Definition
All the behaviors the items actually cover (should represent the behavior domain).
Term
When developing a test, designing the test should include which processes?
Definition
Write instructions, every section of the test should begin with a new set of instructions, consider including a demographic section (unless it would impair performance), develop a test manual, ensure standardization of administration, and choose item type.
Term
When developing/designing a test, what should your test manual include?
Definition
Brief directions for administration, scoring, and interpretation of the test; appropriate use of the test; psychometric properties; and appropriate populations.
Term
When developing/designing a test, what would be some good characteristics of your instructions?
Definition
Brief, simple, clear, free of modifying clauses, and not offensive to any particular group.
Term
In developing/designing a test, what are some things to consider when choosing your item type?
Definition
Free response versus fixed response, timed versus untimed, avoid multiple item types, consider item difficulty/attractiveness, have colleagues review items for clarity, items should avoid difficult words, items should be grammatically simple, items should be emotionally neutral, and you initially want to have 1.5-2 times the number of items you hope to ultimately end up with.
Term
What are some examples of free response and fixed response items?
Definition

Free response: How many times have you cried this week? _______________.

Fixed response: How many times have you cried this week?

(a) None

(b) Once

(c) 2-3 times

(d) 4 or more times

Term
What is a rule of thumb when using timed tests?
Definition
90% of people should complete the test within the limit.  You should also have separate limits for each section.
Term
When developing a test, why should you avoid multiple item types?
Definition
Having multiple item types complicates item analysis and interpretation.
Term
Define item attractiveness.
Definition
Asking a question in such a way that it increases the likelihood of eliciting a positive response.
Term
When developing a test, what are some key features you should look for when trying out your items on a sample?
Definition
Select a sample of sufficient size to reduce the undue influence of outliers and one that is similar to the target population in relevant characteristics.  You should have an N of 50-500, possibly oversampling, and be aware of issues like missing data.
Term
What should be included in your consent form?
Definition
Types of tests, reason(s) for testing, intended use of results, consequences of that use, who will receive the results.
Term
What is the key thing you'll be looking for in item analysis?
Definition
Discriminating power - being able to distinguish between persons high and low on the characteristic being measured.
Term
True-score theory
Definition
Looks at the ability of an individual item to determine high or low scores (you don't want everyone to get the same answer and you want a range of difficulties to tap all levels of the construct).
Term
Item difficulty
Definition
Proportion of test takers (p) who give the keyed response.  You want this to be ~.5 (unless dichotomous), but you should also have a range of item difficulties.
Term
Item discrimination index
Definition
Difference between the number of high and low scorers who get an item correct (or the number of persons with and without a disorder who endorse an item in the keyed direction).
Term
Formula for calculating item discrimination index
Definition

Di (item discrimination index) = Nhi/Nh - Nli/Nl

Nhi = number of persons in the high scoring group who passed item i

Nh = number of persons in the high scoring group

Nli = number of persons in the low scoring group who passed item i

Nl = number of persons in the low scoring group
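A hypothetical worked example of this formula (the group sizes and pass counts are invented):

```python
# 27 of 30 high scorers and 12 of 30 low scorers passed item i.
N_hi, N_h = 27, 30   # high-scoring group: passes on item i, group size
N_li, N_l = 12, 30   # low-scoring group: passes on item i, group size

D_i = N_hi / N_h - N_li / N_l
print(round(D_i, 2))  # 0.5 -- the item separates high from low scorers
```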

Term
When building a scale, you want the following characteristics
Definition
Items with acceptable statistics, items with marginal statistics, statistically weak items that will be eliminated, use of factor analysis to make decisions (exploratory).
Term
How do you standardize a test?
Definition
Identify a standardization sample, compute reliability indices (consistency over time, conditions, scorers, forms, and internal consistency), and compute validity indices (convergent and divergent)
Term
Key point of item response theory when developing a test
Definition
Items assess different levels of some construct.
Term
Define content validity
Definition
Degree to which elements of an assessment instrument are relevant to and representative of the targeted construct for the assessment purpose (Haynes et al, 1998).  Determines whether the test is made up of stimuli calling for construct-relevant responses.  The score on the measure is meant to describe the behavior in its own right, not as a sign of some abstraction or construct (Foster & Cone, 1995).
Term
Which type of validity concerns sampling adequacy?
Definition
Content validity
Term
Two features of content validity
Definition
(1) Easiest to evaluate when the domain of interest is well-defined, and (2) A scale will have high content validity when its items constitute a randomly selected subset of items drawn from the universe of items.
Term
You don't want your items to
Definition
Overlap too much.
Term
Content validity is compromised if...
Definition
(1) Items reflecting any important facets are omitted, (2) Items measuring facets outside the domain are included, and (3) Aggregate score is disproportionately influenced by any one facet.
Term
Questions to ask when generating items for each facet.
Definition
How many items?  Will/Should some facets be overrepresented (items must bear a relationship to facet demarcations)?  How do you know if the item truly taps into the facet component for which it was designed?  What if item content overlaps?  What is the criterion for deciding if items are too similar?  What if facets overlap too much?  How would you know?
Term
Content validation guidelines
Definition
Define domain/facets of construct and subject to content validation before developing other elements of measure, subject all elements of measure to content validation, use population/expert sampling for initial generation of items/elements, use multiple judges of content validity and quantify using formalized scaling procedures, examine proportional representation of items, report results of content validation when reporting development of new scale, and use subsequent analyses for instrument refinement.
Term
Dimensions of formalized scaling procedures
Definition
Relevance, specificity, clarity, representativeness.
Term
What is criterion validity?
Definition
An item or scale has an empirical association with some criterion or gold standard.  Also called predictive validity.
Term
What are some characteristics of criterion validity?
Definition
Not driven by theory, practical, and not essential that the criterion follows the test administration.
Term
What are some types of criterion validity?
Definition
Predictive criterion validity, postdictive criterion validity, and concurrent criterion validity.
Term
Distinction between criterion validity and accuracy.
Definition
Correlation coefficients may not be good indices of criterion validity because correlation cannot tell you how many did or did not succeed nor the hit rate.
Term
Define construct validity
Definition
Directly concerned with the theoretical relationship of a variable (test score) to other variables.  Put another way, it is the extent to which a measure accurately reflects the way that the construct it purports to measure should behave relative to measures of other related (and unrelated) constructs (Cronbach & Meehl, 1955).
Term
Methods for establishing construct validity
Definition
(1) Known-groups method (determines if your measure adequately distinguishes between different groups), and (2) Multimethod-multitrait matrix (fully crossed method-by-measure matrix that allows you to disentangle the variance attributable to effects of method of assessment and construct variance).
Term
Multimethod-multitrait matrix components
Definition
Heteromethod blocks, heterotrait-heteromethod triangles, validity diagonals, heterotrait-monomethod triangles, reliability diagonals, and monomethod blocks.
Term
Heteromethod blocks
Definition
Relationship between all 3 traits when manipulating method of assessment.
Term
Heterotrait-heteromethod triangles
Definition
Low associations.
Term
Validity diagonals
Definition
Relationship among similar traits using different methods - strongly related - low variability from measurement type.
Term
Heterotrait-monomethod triangles
Definition
Different trait, same method.
Term
Reliability diagonals
Definition
Relationship between measure and itself using same method (test-retest or internal consistency)
Term
Monomethod blocks
Definition
Relationship between similar and dissimilar traits holding the method constant.
Term
Hypothesized relational magnitudes of the multimethod-multitrait matrix
Definition
(In order of strongest to weakest) Same trait, same method; same trait, different method; and different trait, different method.
Term
Elements of construct validity
Definition
Convergent validity and discriminant validity
Term
Convergent validity
Definition
Measures that assess the same construct are highly correlated.
Term
Discriminant validity
Definition
Measures that assess different (or unrelated) constructs should be uncorrelated.
Term
How do you determine convergent validity?
Definition
Obtain scores on the new measure for a group of persons and scores on an independent measure of the same latent construct and correlate them.  High correlation supports convergence and it is recommended that convergence be shown between independent approaches that are maximally different.
Term
Signal detection theory
Definition
The alternative to the correlation coefficient when determining criterion validity.  This type of theory makes fine-grained distinctions between signal and noise.
Term
Typical signal detection procedure
Definition
Administer your questionnaire and the gold standard and use the gold standard to diagnose each respondent.  Use your measure to dichotomize the sample into test positives or test negatives.  Then, construct a 2X2 matrix to compare the questionnaire and gold standard in terms of results.
Term
Goal of 2X2 signal detection matrix
Definition
Maximize true positives and true negatives.
Term
True positive or hit
Definition
Positive test result (your test) and present diagnosis (gold standard)
Term
False positive or false alarm
Definition
Positive test result (your test) and absent diagnosis (gold standard).
Term
False negative or miss
Definition
Negative test result (your test) and present diagnosis (gold standard).
Term
True negative
Definition
Negative test result (your test) and absent diagnosis (gold standard).
Term
Define sensitivity
Definition

The probability of having a positive test result among those patients who have the disorder.

Ratio of hits to hits and misses (FN)

Term
Define specificity
Definition

The probability of having a negative test result among those patients who do not have the disorder.

Ratio of true negatives to true negatives and false alarms (FP).

Term
Define positive predictive power.
Definition

The probability of having the disorder among those with a positive test result.

Ratio of hits to hits and false alarms (FP).

Term
Negative predictive power
Definition

The probability of not having the disorder among those with a negative test result.

Ratio of true negatives to true negatives and misses (FN).

Term
The formula for kappa is
Definition
(% observed agreement - % chance agreement) / (100% - % chance agreement).
Term
What is the range of kappa?  What are good values?
Definition

Kappa ranges from -1 to 1 (values below 0 indicate worse-than-chance agreement); in practice, useful values fall between 0 and 1.

< .2 = poor agreement

.2-.4 = fair agreement

.4-.6 = moderate agreement

.6-.8 = good agreement

> .8 = excellent agreement

Term
What is the formula for prevalence?
Definition
P = (TP+FN)/N
Term
What is the formula for the level of a test?
Definition
Q = (TP+FP)/N
Term
What is the formula for sensitivity?
Definition
SE = TP/P
Term
What is the formula for specificity?
Definition
SP = TN/P'
Term
What is the formula for efficiency?
Definition
(TP+TN)/N
Term
What is the formula for positive predictive power?
Definition
TP/Q
Term
What is the formula for negative predictive power?
Definition
TN/Q'
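A sketch applying the formulas above to a hypothetical 2X2 matrix; the cell counts are invented, and the formulas above treat TP/FP/FN/TN as proportions of N, so the equivalent count ratios are used below:

```python
# Hypothetical 2x2 cell counts (test result x gold-standard diagnosis).
TP, FP, FN, TN = 30, 10, 5, 55
N = TP + FP + FN + TN

prevalence  = (TP + FN) / N    # P
level       = (TP + FP) / N    # Q
sensitivity = TP / (TP + FN)   # hits / (hits + misses)
specificity = TN / (TN + FP)   # true negatives / (true negatives + false alarms)
efficiency  = (TP + TN) / N
ppp         = TP / (TP + FP)   # positive predictive power = TP/Q as proportions
npp         = TN / (TN + FN)   # negative predictive power = TN/Q' as proportions
print(sensitivity, specificity, ppp, npp)
```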
Term
What is the formula for the standard error of prevalence?
Definition
SE(P̂) = √(PP'/N0)
Term
What is the formula for the standard error of true positives?
Definition
SE(TP̂) = √(TP·TP'/N0)
Term
What is the formula for the standard error of false negatives?
Definition
SE(FN̂) = √(FN·FN'/N0)
Term
What is the formula for the standard error of false positives?
Definition
SE(FP̂) = √(FP·FP'/N0)
Term
What is the formula for the standard error of true negatives?
Definition
SE(TN̂) = √(TN·TN'/N0)
Term
What is the formula for standard error of efficiency?
Definition
SE(EFF̂) = √(EFF·EFF'/N0)
Term
What is the formula for the standard error of the level of a test?
Definition
SE(Q̂) = √(QQ'/N0)
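A quick numeric check of the prevalence formula above, with hypothetical values for P and N0:

```python
from math import sqrt

# SE(P-hat) = sqrt(P * P' / N0), with P' = 1 - P.
P, N0 = 0.35, 200               # hypothetical prevalence and sample size
se_prevalence = sqrt(P * (1 - P) / N0)
print(round(se_prevalence, 4))  # ~0.0337
```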
Term
True or false: The estimated standard error (SE-hat) is a biased estimator of the standard error.
Definition
True: SE-hat underpredicts the true SE (a large N may counter this).
Term
Which of the following requires a larger sample size: low-risk population or high-risk population?
Definition
Low-risk populations, because the test must be able to pick up on the signal of a diagnosis; if a population does not have as many true positives/hits, it becomes more difficult for a test to pick up on the hits unless there is a larger N (more hits).
Term
Minimum of __ people in each marginal of the 2X2 table yields unbiased estimators.
Definition
10.
Term
What are the 3 types of sampling strategies?
Definition
Naturalistic sampling, retrospective sampling, and prospective sampling.
Term
What is naturalistic sampling?
Definition
A sampling strategy in which the evaluator decides how large a sample to gather (N) and takes a random (or representative) sample of that size from the population of interest - each patient receives a diagnosis and a test.
Term
What is retrospective sampling?
Definition
A sampling strategy in which a representative sample is drawn from the population and each patient is diagnosed.  This is called the screening sample.  Then, another random sample is drawn from among those in the screening sample with a positive diagnosis and another random sample from among those with a negative diagnosis.  Note: each N must have a minimum of 10 people.
Term
Prospective sampling
Definition
A sampling strategy in which a representative sample of patients (the "screening sample") is drawn from the population and each patient is tested.  Random samples of the test positives and the test negatives then receive a diagnosis.  The proportion of patients with a positive test provides an unbiased estimator of the level of the test, Q.
Term
True or false: retrospective sampling will generally yield a less powerful test than naturalistic sampling.
Definition
False: retrospective sampling will generally yield a MORE powerful test than naturalistic sampling.
Term
For a random test in which p = 0, sensitivity equals
Definition
The level of the test.
Term
For a random test in which p = 0, specificity equals
Definition
The complement of the level of the test.
Term
For a legitimate test in which p > 0, sensitivity ______ the level of the test
Definition
Exceeds.  SE > Q
Term
For a legitimate test in which p > 0, specificity ______ the complement of the level of the test.
Definition
Exceeds.  SP > Q'
Term
Ideal value for sensitivity and specificity is ___.  The range is ___ to ___ and ___ to ___, respectively.
Definition
1; Q; 1; Q'; 1
Term
Q' equals
Definition
1-Q
Term
Any report of a test that does not report its level gives no indication of
Definition
The quality of the test
Term
What is a ROC curve?
Definition
A graphical representation of the relationship between sensitivity and specificity across different test cutoff points.  Your goal is to select the test cutoff point that maximizes sensitivity and specificity.
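A minimal numpy sketch of generating ROC points by sweeping cutoffs; the scores and diagnoses below are invented for illustration:

```python
import numpy as np

scores    = np.array([2, 3, 5, 6, 7, 8, 9, 11, 12, 14])  # questionnaire scores
diagnosis = np.array([0, 0, 0, 1, 0, 1, 1, 0, 1, 1])     # 1 = disorder (gold standard)

for cutoff in np.unique(scores):
    test_pos = scores >= cutoff
    sens = np.mean(test_pos[diagnosis == 1])    # P(test+ | disorder present)
    spec = np.mean(~test_pos[diagnosis == 0])   # P(test- | disorder absent)
    print(f"cutoff={cutoff:>2}  sensitivity={sens:.2f}  specificity={spec:.2f}")
# Pick the cutoff that best balances sensitivity and specificity.
```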
Term
What is a ROC plot?
Definition
A graph showing the conditional probability of choosing alternative A when the alternative occurs (hit) plotted against the conditional probability of choosing alternative A when alternative B occurs (false alarm).
Term
If the point is at or under the ROC curve, it is a ____ test; if the point is above the ROC curve, it is a ____ test.
Definition
Poor; good.
Term
What is the relationship between sensitivity and specificity?
Definition
As sensitivity increases, specificity decreases and as specificity increases, sensitivity decreases; they are negatively correlated.
Term
Some things to consider in incremental validity:
Definition

Does the instrument:

Predict the phenomenon more validly/accurately?

Contribute meaningfully to predictive efficiency when added to an existing/readily obtainable measure?

Cost less than other measures?

Term
What is incremental validity?
Definition
The degree to which a measure explains or predicts some phenomenon of interest relative to other measures.  Note: it is defined relative to other measures, which distinguishes it from other forms of validity, which are typically defined in absolute terms.
Term
Predictive power and incremental validity are a function of
Definition
How accurate other measures are.
Term
Implications of the definition of incremental validity
Definition
Multiple dimensions, inferences are dependent on the comparison measure(s) used, dependent on mode of assessment, dependent on criterion, inferences vary across target populations and samples, and it is conditional.
Term
In order to select the most appropriate dimension for study through incremental validity, you must
Definition
Determine how the measure will be used, select criteria on which validity inferences will be made, select alternative/comparison measures, and identify the population.
Term
Cost-benefit analysis
Definition

What is the relative cost of acquiring new data compared with data from comparison measures?  Is it cost effective?  Does the ratio of costs to benefits of the new measure, relative to others, warrant its use?

Costs/benefits can be defined in terms of money, time, consequences of incorrect decisions, and each of these may vary as a function of the population.

Term
When to develop a new instrument:
Definition
Problems with item construction/content, predictive power, sensitivity to change, non-equivalent performance across samples.
Term
You want scores to be sensitive to change in the underlying construct because if they are not sensitive...
Definition
Changes in the underlying construct will occur (e.g., in one group and not the other) without the scores detecting them.
Term
Things to consider when refining/creating a new measure
Definition
Item response methods to identify bias or poorly performing items, internal consistency, item-level temporal stability, interrater agreement indices, item factor loadings (aka item-total correlations), proportion of items performing poorly to guide decision, published versus unpublished literature.
Term
Data analysis for examining incremental validity involves the following components:
Definition
Administration of your measure along with comparison measures and some criterion and creation of a correlation matrix between all variables (look for degree of collinearity/shared variance among predictors and strength of association between each predictor and criterion).
Term
Goals of data analyses for examining incremental validity
Definition
Estimate relative proportions of variance, estimate unique variance predicted by new instrument, and examine interaction (moderator) effects associated with sex, age, SES, etc.
Term
How do you perform a data analysis for examining incremental validity?
Definition
Compute a hierarchical linear regression and look for unique predictive power: using forced (hierarchical) entry, enter the comparison measures first, then enter your measure; the difference in R2 is the index of incremental validity.
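A numpy sketch of this delta-R2 procedure on simulated data; the sample size, data-generating weights, and variable names are illustrative assumptions:

```python
import numpy as np

def r_squared(X, y):
    """R^2 from an ordinary least-squares fit with an intercept."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return 1 - (y - X @ beta).var() / y.var()

rng = np.random.default_rng(0)
n = 200
comparison  = rng.normal(size=(n, 2))   # existing/comparison measures
new_measure = rng.normal(size=n)        # the new instrument
criterion   = comparison @ [0.5, 0.3] + 0.4 * new_measure + rng.normal(size=n)

r2_step1 = r_squared(comparison, criterion)   # comparison measures only
r2_step2 = r_squared(np.column_stack([comparison, new_measure]), criterion)
print(round(r2_step2 - r2_step1, 3))    # delta-R^2: the incremental validity index
```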
Term
The formula for tolerance is
Definition
1 - R2, where R2 is the coefficient of determination obtained by regressing that predictor on the other predictors.
Term
The formula for the variance inflation factor (VIF) is
Definition
1/tolerance.
Term
What is VIF?
Definition
The degree to which collinearity with the other predictors in a model inflates the variance of a coefficient estimate.  It should be less than 10.
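A pure-numpy sketch of tolerance and VIF; the deliberately collinear data are invented for illustration:

```python
import numpy as np

def vif(X, j):
    """VIF for predictor j: regress X[:, j] on the remaining predictors."""
    others = np.column_stack([np.ones(len(X)), np.delete(X, j, axis=1)])
    beta, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
    r2 = 1 - (X[:, j] - others @ beta).var() / X[:, j].var()
    tolerance = 1 - r2          # tolerance formula from the card above
    return 1 / tolerance        # VIF = 1 / tolerance

rng = np.random.default_rng(1)
x1 = rng.normal(size=300)
x2 = x1 + 0.3 * rng.normal(size=300)    # deliberately collinear with x1
x3 = rng.normal(size=300)
X = np.column_stack([x1, x2, x3])
print([round(vif(X, j), 2) for j in range(3)])  # x1 and x2 inflated; x3 near 1
```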
Term
If the magnitude of a relationship differs across groups, then
Definition
The relationship is moderated by another variable.
Term
In order to examine moderator effects, you need to first
Definition
Add main effects followed by interaction terms.
Term
Basic elements of a factor analysis are
Definition
A set of procedures designed to reduce a set of correlational data into a smaller set of factors (i.e., reduce the data in a correlation matrix to better explain variance among the items); tests whether a single factor explains the interrelationships between items (initial premise: a single factor accounts for the pattern of correlations among the items).
Term
Steps of factor analysis
Definition
Begin with the correlation matrix, look for patterns of covariation, create a null hypothesis (single factor), sum the items (estimate of the latent construct), compute item-total correlations (ITC), compute projected inter-item correlations (IIC) from the ITCs, and subtract the projected inter-item correlations from the observed inter-item correlations to obtain the residuals.
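A numpy sketch of this single-factor logic on a hypothetical correlation matrix; the first-factor loadings here come from the principal eigenvector, a stand-in for the ITC-based estimates described above:

```python
import numpy as np

# Hypothetical inter-item correlation matrix for a 3-item scale.
R = np.array([
    [1.00, 0.42, 0.48],
    [0.42, 1.00, 0.56],
    [0.48, 0.56, 1.00],
])

# First-factor loadings via the principal eigenvector (eigh sorts ascending).
eigvals, eigvecs = np.linalg.eigh(R)
loadings = eigvecs[:, -1] * np.sqrt(eigvals[-1])

# Projected inter-item correlations = cross-products of the loadings.
projected = np.outer(loadings, loadings)

# Residual matrix: the covariation the single factor fails to capture.
residual = R - projected
print(np.round(residual, 3))
```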
Term
What are item-total correlations?
Definition
Proxies for correlation between item and construct.
Term
In a really good model, the inter-item correlations equal
Definition
The item-total correlations.
Term
The difference between the inter-item correlations and the item-total correlations is
Definition
The residual variance, from which you can create a residual matrix.
Term
Once you have a residual matrix from the IIC and ITC difference, you
Definition
Extract the second factor, compute correlations between the items and the second latent variable, and generate a matrix of proposed correlations; if the second factor captured all of the leftover covariation, you are done; if not, continue until no more factors can be extracted.
Term
Residual matrices are treated like
Definition
Correlation matrices.
Term
Purpose of factor analysis
Definition
Identify latent structure of assessment instrument, item refinement/scale development, relationship to content validity and construct validity of instrument.
Term
Exploratory/Common factor analysis (EFA) procedures
Definition
Identify underlying dimensions of an instrument, factor analytically derived scales represent separate empirically-derived dimensions (subscales), use correlation/covariance matrix to identify subsets of like items, analysis produces "factor loadings," condense info from individual items.
Term
What is exploratory factor analysis?
Definition
A statistical method used to identify the underlying dimensions of an instrument.
Term
Goals of exploratory factor analysis
Definition
Identify factor loadings, refine instrument's content by using loadings to guide item retention, better understand latent construct.
Term
What is confirmatory factor analysis? (CFA)
Definition
A statistical method that is a more theoretically driven approach and requires an a priori hypothesis of some sort.
Term
Procedures of confirmatory factor analysis
Definition
Hypothesize item relationships in a model, fit data to model using SEM (structural equation modeling), use this information to test theoretical models.
Term
Recommended sample size for confirmatory factor analysis
Definition
10:1 ratio.  Monte Carlo studies - 5-10:1 ratio.  For up to 40 variables, N should be at least 125.
Term
In factor analysis, factors are first extracted and then
Definition
Rotated.
Term
What are the two types of factor rotation?
Definition
Orthogonal (uncorrelated) and oblique (correlated).
Term
Purpose of extraction (factor)
Definition
Percent variance explained by extracted factors.
Term
Purpose of factor analysis (with regards to extraction)
Definition
Extract as much variability as possible through as few factors as possible.
Term
Eigenvalues less than 1 indicate
Definition
No factor.
Term
Interpret the scree plot by taking the "elbow" and
Definition
Subtracting 1.
Term
What is the purpose of multiple regression?
Definition
To learn more about the relationship between several independent or predictor variables and a dependent or criterion variable.
Term
According to true score theory, if observed variables = true score plus error, then what is the measurement error equal to?
Definition
Observed score minus true score (or irrelevant sources of variance).
Term
If error consists of unmeaningful components of variance, what are meaningful components of variance?
Definition
True scores.
Term
Explain the effects of unreliability in causal models.
Definition
If a causal variable has measurement error, the estimate of its effect is biased, as well as the effects of other variables in the structural equation.  Measurement error in the effect variable does not bias its coefficient unless the variables are standardized.  In this case, the bias is that the true beta equals the measured beta divided by the square root of the endogenous variable's reliability.
Term
What are some advantages of SEM?
Definition
Can handle random and non-random measurement error, can reject models, enables advanced treatment of missing data (full-information maximum likelihood), and can disentangle different variance and error sources.
Term
Overarching idea of SEM?
Definition
The observed covariance matrix is a function of a set of parameters.  Relationships are being predicted, not scores.
Term
True or false: SEM procedures emphasize covariances and correlations rather than cases.
Definition
True.
Term
In regression, the procedure minimizes differences between observed and predicted values for individual cases.  How is this different from SEM?
Definition
In SEM, the procedure minimizes differences between observed variances/correlations and the ones predicted by the model.
Term
What are the data of SEM?
Definition
Covariance/correlation matrices.
Term
Assumptions about the variables on which the matrix coefficients are based:
Definition
(1) They are intervally scaled, and (2) They have a multivariate normal distribution.
Term
When using SEM, in which situations would you increase your sample size?
Definition
If you are using complex models, models with weak relationships, models with few observed variables per factor, and non-normal distributions.
Term
Compare the new and old methods for determining sample size in SEM.
Definition

Old: 5-10 cases per parameter estimate (usually a minimum of 100-150 total N).

New: 10-20 cases per parameter estimate using RMSEA.

Term
What is RMSEA?
Definition
Root Mean Squared Error of Approximation.  A method used for showing error and misfit of the measure or data.  Ideally, you want this value to be < .08.
Term
What was the start of SEM?
Definition
Ordinary least squares.
Term
Define measurement models and list the constituent parts.
Definition
Measurement models are the mapping of measures/items onto theoretical constructs.  The constituent parts are loadings of the measures onto theoretical constructs, error variances, and error covariances (correlated errors).
Term
Define loadings.
Definition
Effect of latent variable on the measure; if a measure loads onto only one factor, the standardized loading is the measure's correlation with the factor and can be interpreted as the square root of the measure's reliability.
Term
Define error variance.
Definition
The variance in a measure not explained by the latent variable; does not imply that the variance is random or not meaningful, rather that it is unexplained by the latent variable.
Term
Define structural models and list the constituent parts.
Definition
Structural models are the causal and correlational links between theoretical (latent) variables.  Constituent parts of this model are the paths, variances of the exogenous variables, variances of the disturbances of endogenous variables, covariances between disturbances, and covariances between disturbances and exogenous variables (usually set to zero).
Term
Define exogenous variables.
Definition
Variables not caused by another variable in the model.  Usually, this variable causes one or more variables in the model.  Think of this like an independent variable.
Term
Define endogenous variables.
Definition
Variables caused by one or more variables in the model.  Note that an endogenous variable may also cause another endogenous variable in the model.  Think of this like a dependent variable.
Term
Define standardized variable.
Definition
Variable whose mean is zero and variance is one.
Term
Define latent variable.
Definition
Variable in the model that is not measured.
Term
What are the components into which a correlation can be decomposed?
Definition
Direct effects, indirect effects, spurious effects (common causes), and unanalyzable components (correlated causes).
Term
What is the tracing rule?
Definition
The correlation between any pair of variables equals the sum of the products of the paths or correlations from each tracing.  A tracing between two variables is any route in which the same variable is not entered twice and no variable is entered and left through an arrowhead (this applies only to hierarchical models/models with no feedback; tracings do not go through covariances).
Term
What are ways to scale latent variables?
Definition
Using correlations and Wright's tracing rules to solve for coefficients (assuming a variance of 1.0).
Term
If a variance/loading is not set to 1.0...
Definition
It will be impossible to solve the equation because there will be more unknowns than knowns.
Term
What are the steps in SEM?
Definition
(1) Specification, (2) Identification, (3) Estimation, and (4) Model fit.
Term
What is specification in SEM?
Definition
Statement of the theoretical model either as a set of equations or as a diagram.  In this model, you cannot have more unknown variables than known variables.
Term
What is identification in SEM?
Definition
The model can, in theory and in practice, be estimated with observed data.
Term
What is estimation in SEM?
Definition
The model's parameters are statistically estimated from data.  Multiple regression is one such estimation method, but typically more complicated estimation methods are used.
Term
What is model fit in SEM?
Definition
The estimated model parameters are used to predict the correlations/covariances between measured variables, and the predicted correlations/covariances are compared to the observed correlations/covariances.
Term
True or false: Recursive models have feedback loops.
Definition
False: recursive models do not have feedback loops; models with feedback loops are non-recursive.
Term
What is the chi-square test?
Definition
A test of model fit used in SEM.
Term
What are some characteristics of chi-square tests?
Definition
Sensitive to sample size (good for N of 75-100, otherwise larger sample sizes will always result in significant values) and size of the correlations (larger the correlation, poorer the fit).
Term
Why do you want chi-square to be nonsignificant?
Definition
Nonsignificance in chi-square tests shows that there are no significant differences between your data and your model (which is good and shows good fit).
Term
As sample size increases, power
Definition
Increases.
Term
Joreskog measurement of model fit
Definition
Calculate the proportion of variance accounted for by the estimated population covariance.  This value ranges from 0 to 1, with a cutoff of .9; .95 is said to be good fit.
Term
Adjusted goodness of fit (GOF) index
Definition
A GOF method that penalizes lack of prediction and rewards parsimony.  Ranges from 0 to 1, with a cutoff of .9 and good fit being .95.
Term
Incremental fit/Comparative indices
Definition
Compare your model with a fully saturated model and an independence model.  Take the chi-square of your model and compute the comparative fit index.
Term
What is the comparative fit index?
Definition
A revised form of the normed fit index that takes sample size into account.
Term
What are some characteristics of RMSEA?
Definition
You want little difference/residuals.  < .08 is good, < .06 is optimal, however standard is < .07.
Term
Good model fit is not the same as
Definition
A good model.
Term
What are the different types of criteria to consider when evaluating a model?
Definition
Theoretical, technical, statistical.
Term
What are theoretical criteria of model evaluation?
Definition
Appropriateness of general causal structure, inclusion of the right variables, and reasonableness of results in light of previous knowledge.
Term
What are technical criteria of model evaluation?
Definition
Identification status, appropriate estimation method, and appropriateness of instrumental variables.
Term
What are statistical criteria of model evaluation?
Definition
Reasonable parameter values, substantial coefficients linking measured variables to factor, latent endogenous variables are well explained, and fit.
Term
What does an independence model of fit assume?
Definition
That nothing correlates: the observed variables are mutually uncorrelated.
Term
What does a saturated model of fit assume?
Definition
That everything intercorrelates: all observed variables are free to correlate with one another.
Term
What is the premise of parsimony of a model?
Definition
Simple models are better.
Term
What is the purpose of a nested model of fit?
Definition
To directly compare several models of fit.
Term
Define model identification
Definition
A unique solution for the model's parameters exists.  This two-step process involves testing the adequacy of fit of individual models and (if these tests pass) testing the structural model for fit, once the knowns are equal in quantity to the unknowns.
Term
What is the minimum condition of identifiability?
Definition
The number of known values must equal or exceed the number of free parameters in the model.  If this rule is not met, the model is not identified.
Term
What is a just-identified/saturated model?
Definition
An identified model in which the number of free parameters exactly equals the number of known values; a model with zero degrees of freedom.  This model will not give meaningful fit values.
Term
What is an underidentified model?
Definition
A model for which it is not possible to estimate all of the model's parameters because there are more unknown values than known values.
Term
What is an overidentified model?
Definition
A model for which all the parameters are identified and for which there are more knowns than free parameters.  It places constraints on the correlation/covariance matrix and more known values than unknown values exist in this model.
Term
What is empirical underidentification?
Definition
A model which is theoretically identified, but one or more of the parameter estimates has a denominator that equals a very small value, making estimates unstable.  This cannot be solved by hand.
Term
What is an example of empirical underidentification?
Definition
A path analysis model with high multicollinearity.
Term
What is model fit?
Definition
The ability of an overidentified model to reproduce the variables' correlation or covariance matrix.
Term
What are known values?
Definition

For standard specification, the number of variances and covariances, where n is the number of variables:

n(n+1)/2.

For path analytic specification (correlations only): n(n-1)/2.

Term
What are constraints?
Definition
Setting of a parameter equal to some function of other parameters (e.g., setting one parameter equal to another).
Term
How do you figure out the degrees of freedom of a model?
Definition
The number of knowns minus the number of free parameters.
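A tiny worked example of these counting rules, with a hypothetical number of variables and free parameters:

```python
n = 4                                # observed variables
knowns_standard = n * (n + 1) // 2   # 10 variances + covariances
knowns_path     = n * (n - 1) // 2   # 6 correlations (path analytic)

free_parameters = 8                  # hypothetical model
df = knowns_standard - free_parameters
print(knowns_standard, knowns_path, df)   # 10 6 2 -> overidentified
```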
Term
What is the covariance between two variables equal to?
Definition
The correlation times the product of the variables' standard deviations.  The covariance of a variable with itself is the variable's variance.
Term
What are the free parameters in a structural model when using standard specification paths?
Definition
Covariances between the exogenous variables, the disturbances, the exogenous variables and disturbances, the variances of the exogenous variables, and the disturbances of endogenous variables less the number of linear constraints.
Term
What are the free parameters in a structural model using path analytic specification?
Definition
Paths and correlations between exogenous variables, disturbances and the exogenous variables, and the disturbances less the number of linear constraints.
Term
Multigroup analysis is
Definition
Testing for measurement invariance across groups (multigroup modeling).
Term
What is the procedure of multigroup analysis?
Definition
Test for measurement invariance by fitting the unconstrained model for all groups combined, then a model where the parameters are constrained to be equal between the groups.  If chi-square does not yield a significant difference between the original and constrained-equal models, it is concluded that the model displays measurement invariance across groups.
Term
What are some parameters that can be constrained to define measurement invariance?
Definition
Invariance on number of factors, invariant factor loadings, invariant structural relations among the latent variables, and equality of error variances and covariances across groups.
Term
If lack of measurement invariance is found, this means that
Definition
The meaning of the latent construct is shifting across groups or over time.
Term
When does interpretational confounding occur when determining measurement invariance?
Definition
When there is a substantial lack of measurement invariance, because the factor loadings are used to induce the meaning of the latent variable.  If the factor loadings differ substantially across groups or time, then the induced meanings of the factors will differ substantially even if the same factor label is retained.
Term
Why are one-sample models tested separately first when testing for multigroup variance?
Definition
Separate testing provides an overview of how consistent the model results are, but it does not constitute testing for significant differences in the model's parameters between groups.
Term
If there is consistency in the multigroup invariance analysis, multigroup testing will proceed.  Explain this process.
Definition
(1) Calculate baseline chi-square by computing model fit for the pooled sample of all groups, (2) Add constraints that various model parameters must be equal across groups and fit the model, (3) Complete a chi-square difference test to determine if the difference is significant, and (4) A nonsignificant difference indicates that the constrained-equal model is the same as the unconstrained multigroup model, meaning that the model applies across groups and displays measurement invariance.
Term
The constrained model expects factor loadings to be equal for
Definition
Each class of the grouping variable.
Term
What is multiple group confirmatory factor analysis?
Definition
Using SEM and the measurement invariance test, the chi-square difference is calculated in order to assess whether a set of indicators reflects a latent variable equally well across groups in the sample.
Term
What is the partial measurement invariance test (Kline, 1998)?
Definition
If the model fails the measurement invariance test, some indicators may still be invariant, so we examine each indicator for group invariance.
Term
Because standard errors of factor loadings cannot be computed, there are _______ methods but no ______ method for comparing models across groups.
Definition
Indirect; direct.
Term
What are some characteristics of item response theory (IRT)?
Definition
It is mathematically complex and traces back to the 1940s.  It was originally used for testing cognitive ability; due to increased computing power and widespread technology, it has been developed much further in recent years.
Term
What are some implications for scale development when using CMT?
Definition
Emphasis on redundancy, factor analysis, identification of several items that all appear to measure the same thing and placing them on the same scale, and longer scales = more reliable measures.
Term
What are some characteristics of IRT?
Definition
Items arranged on a continuum measure one attribute, passing an item implies greater attribute possession, items differ in level of difficulty but still measure the same attribute, and attributes can vary in broadness and specificity.
Term
What are the three parameters of IRT?
Definition
Item difficulty, item discrimination, and false positives.
Term
What is item difficulty in IRT?
Definition
The more difficult an item, the more of the attribute required to pass it.  Items should therefore differ in terms of difficulty across the attribute continuum.
Term
What is item discrimination in IRT?
Definition
Items should do a good job of discriminating correct from incorrect answers (e.g., unambiguous classification of "pass" and "fail").
Term
What are false positives in IRT?
Definition
Response that suggests attribute exists/is present when it doesn't/isn't.  Good items minimize false positives.
Term
What is the item characteristic curve?
Definition
Relates the value of the latent trait (theta) to the probability of a positive/correct answer.  The scaling constant is 1.702.
Term
What are the types of common unidimensional IRT models?
Definition
Dichotomous and polytomous.
Term
What are some of the dichotomous models in common unidimensional IRT models?
Definition
1, 2, or 3 parameter logistic models.
Term
What are the polytomous models of the common unidimensional IRT model?
Definition
Samejima's Graded Response Model and Masters' Partial Credit Model.
Term
In a 1 parameter logistic model, the only parameter estimated is
Definition
Difficulty.
Term
A 1 parameter logistic model specifies that all items are
Definition
Equally discriminating.
Term
A 2 parameter logistic model consists of
Definition
Difficulty and discrimination.
Term
A 3 parameter logistic model consists of
Definition
Difficulty, discrimination, and c (a pseudo-guessing parameter).
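A sketch of the 3-parameter logistic ICC using the 1.702 scaling constant from the item characteristic curve card above; the item parameters are invented, and setting c = 0 recovers the 2PL, while additionally fixing a across items gives the 1PL:

```python
import numpy as np

def icc(theta, a, b, c=0.0):
    """Probability of a keyed response at trait level theta (3PL)."""
    return c + (1 - c) / (1 + np.exp(-1.702 * a * (theta - b)))

theta = np.linspace(-3, 3, 7)
print(np.round(icc(theta, a=1.5, b=0.0, c=0.2), 3))  # hypothetical item
```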
Term
How do IRT and CMT differ in terms of SEM as a function of trait level?
Definition

CMT assumption: The standard error of measurement applies to all scores in a population.

IRT alternative: The standard error of measurement differs across scores (or response patterns) but generalizes across populations.

Term
How do IRT and CMT differ in terms of test length and reliability?
Definition

CMT assumption: Longer tests are more reliable than shorter tests.

IRT alternative: Shorter tests can be more reliable than longer tests.

Term
How do IRT and CMT differ in terms of assumptions?
Definition

CMT assumption: Comparing test scores across multiple forms depends on test parallelism or adequate equating.

IRT alternative: Comparing scores from multiple forms is optimal when test difficulty levels vary across persons.

CMT assumption: Meaningful scale scores are obtained by comparisons of position in a score distribution.

IRT alternative: Meaningful scale scores are obtained by comparisons of distances from various items.

Term
True or false: the initial extraction (unrotated) indicates that there are exactly as many factors as items.
Definition
True.
Term
Initial communality of extraction of factors is always equal to
Definition
1.
Term
What is the extracted communality in factor extraction?
Definition
The percent of variance in the item explained by the factor.
Term
Factor extraction/analysis is driven by which procedure?
Definition
Internal consistency reliability.
Term
After looking at communalities, what is the approximate appropriate number of items per factor in factor analysis?
Definition
Approximately three items per factor.
Term
In the item characteristic curve (ICC), a =
Definition
Slope.
Term
In the item characteristic curve (ICC), b =
Definition
Difficulty.
Term
In the item characteristic curve (ICC), c =
Definition
Probability of (at random) selecting a keyed response (through the use of a pseudo guessing parameter).  This is not used often outside of cognitive assessment.
Term
Steeper ICC slopes indicate
Definition
Better items!
Term
Difficulty is related to ______ in CMT.
Definition
The proportion correct score.
Term
Difficulty in ICC is the inverse of
Definition
The probability of getting an item correct (i.e., as difficulty increases, the probability of a correct response decreases).
Term
Why are steeper ICC slopes useful?
Definition
Better able to distinguish those above/below the level of theta.
Term
As the slope of the ICC goes up, the curve
Definition
Becomes more vertical.
Term
As the difficulty in the ICC goes up, the curve
Definition
Shifts further to the right.
Term
SEM (standard error of measurement) =
Definition
SD√(1-r), with r being reliability.
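A quick worked example of this formula with hypothetical values (an IQ-style scale with SD = 15 and reliability r = .90):

```python
from math import sqrt

sd, r = 15, 0.90
sem = sd * sqrt(1 - r)    # SEM = SD * sqrt(1 - r)
print(round(sem, 2))      # ~4.74
```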
Term
What is the standard error of measurement negatively correlated to?
Definition
Accuracy.