Term
What does X=T+E stand for? |
|
Definition
observed score (X) = true score (T) + error (E), FOR INDIVIDUALS |
|
|
Term
What is face validity? |
Definition
Does the item "appear" to be measuring the construct ("face value")? Is it obvious that the test is measuring what it is supposed to be measuring? Can be manipulated. |
|
|
Term
What is content validity? |
Definition
Is there a proportionate sampling of the universe of possible items? Variety: ALL aspects are taken into account. |
|
|
Term
Criterion-related validity |
|
Definition
Is there a correlation between the test (x) and some criterion (y)? Predictive (future) vs. concurrent (current). Must have a direct measure (e.g., a rating by someone else), not an indirect measure such as self-report. |
|
|
Term
What is construct validity? |
Definition
Do the patterns of correlations with other measures make theoretical sense? "theoretical validity" -congruent, convergent, discriminant |
|
|
Term
Under construct validity, what is congruent validity? |
|
Definition
correlations with other measures of the same construct -high correlation |
|
|
Term
Under construct validity, what is convergent validity? |
|
Definition
correlations with measures of similar constructs -intermediate correlation |
|
|
Term
Under construct validity, what is discriminant validity? |
|
Definition
correlations with measures of unrelated constructs -low correlation |
|
|
Term
Given y=5x+12 and SEM=3, what is the 95% confidence interval when x=2? |
|
Definition
Interpretation/Translation= 22±6 OR 16 ≤ y ≤ 28
**Work Shown: y=5(2)+12, so y=22; for 95% confidence the margin is 2 times SEM, so 2(3)=6; so 22±6** |
|
|
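The interval arithmetic in the card above can be sketched in Python. This is only an illustration: the function name `sem_interval` and its `multiplier` parameter are hypothetical, and the multipliers 1 and 2 are the rounded values these cards use for 68% and 95% confidence.

```python
def sem_interval(score, sem, multiplier=2):
    """Return (low, high) for score ± multiplier * SEM.

    multiplier=1 gives the roughly 68% band, multiplier=2 the roughly 95% band.
    """
    margin = multiplier * sem
    return score - margin, score + margin


# Worked example from the card: y = 5x + 12 with x = 2 and SEM = 3.
predicted = 5 * 2 + 12                      # predicted score y
low, high = sem_interval(predicted, sem=3)  # 95% band: y ± 2 * SEM
print(f"{predicted} ± {high - predicted}: {low} <= y <= {high}")
```

Passing `multiplier=1` instead gives the narrower 68% band for the same score.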
Term
What are the sources of error? |
|
Definition
-true change -item sampling -statistical factors -random factors |
|
|
Term
Which source of error most affects test/retest and split half reliability assessments? |
|
Definition
test/retest: true change; split half: statistical factors |
|
|
Term
What is the goal of test/retest? |
|
Definition
stability: scores stay stable over time, with an acceptable reliability of r≥.80 |
|
|
Term
What are the types of reliability methods with the goal of internal consistency? |
|
Definition
-alternate forms (items measuring the same construct) -split half (correlating 1/2 of the test with the other 1/2, like odd and even items, with the Spearman-Brown prophecy formula to correct) -Cronbach's Alpha (way of estimating the avg. correlation among all possible pairs of items, for items like Likert scales) -KR20 & KR21 (same as Cronbach's but for dichotomous items) |
|
|
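As a rough sketch of how Cronbach's alpha is computed, the snippet below uses the common variance-based form, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The function name `cronbach_alpha` and the response data are hypothetical and chosen only for illustration.

```python
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (all the same length)."""
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # regroup the scores by item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Hypothetical 5-point Likert responses: 4 respondents x 3 items.
responses = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [1, 2, 1],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f}")
```

With dichotomous (0/1) items, the same calculation corresponds to KR20.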
Term
What is the Spearman-Brown prophecy formula? |
|
Definition
used to correct the split-half estimate of reliability, which is lowered because each half contains only half of the items |
|
|
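A minimal sketch of the Spearman-Brown correction, assuming the usual form r_full = n·r / (1 + (n − 1)·r), where n is how many times longer the full test is than the part that was correlated (n = 2 for split half). The function name and the example correlation are hypothetical.

```python
def spearman_brown(r_half, length_factor=2):
    """Estimate the reliability of a test `length_factor` times as long."""
    return (length_factor * r_half) / (1 + (length_factor - 1) * r_half)


# Hypothetical odd/even half-test correlation of .60.
r_full = spearman_brown(0.60)
print(f"corrected reliability = {r_full:.2f}")
```

The corrected value is always at least as high as the half-test correlation when that correlation is between 0 and 1.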
Term
What are the confidence intervals of SEM? |
|
Definition
68% confidence = score ± 1 SEM; 95% confidence = score ± 2 SEM |
|
|
Term
What is the purpose of item analysis in general? |
|
Definition
to detect and remove items that fail to discriminate, in order to enhance the reliability and validity of the test |
|
|
Term
Explain item analysis for maximal performance measures. |
|
Definition
Define a "top group" and a "bottom group", then compute the difficulty index (p) and the discrimination index (D) |
|
|
Term
What is the difficulty index? |
|
Definition
indicated by "p"; equation: p = (p top + p bottom)/2; optimum p = .50, higher = easier, lower = more difficult |
|
|
Term
What is the discrimination index? |
|
Definition
indicated by "D"; equation: D = p top - p bottom; acceptable discrimination is D≥.30, unacceptable is D<.30 |
|
|
Term
What is intrinsic ambiguity? |
|
Definition
bad kind of ambiguity; unacceptable discrimination index; top group does not have knowledge |
|
|
Term
What is extrinsic ambiguity? |
|
Definition
good kind of ambiguity; acceptable discrimination index; bottom group does not have knowledge |
|
|
Term
Item Analysis Maximal Performance Sample: Top group (T): 7 correct, 3 incorrect. Bottom group (B): 4 correct, 6 incorrect. Give p, give D, and give a 3-part interpretation. |
|
Definition
p=.55 (more than 1/2 got it right); D=.30. Interpretation: slightly easy (b/c p is above .50), acceptable discrimination (b/c D is .30), and extrinsic ambiguity (b/c D is acceptable) |
|
|
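The sample item above can be checked with a short Python sketch. The function name `item_analysis` is hypothetical; the formulas are the ones given on the difficulty index and discrimination index cards, and the values are rounded to two decimals so that the comparison against .30 is not thrown off by floating-point error.

```python
def item_analysis(top_correct, top_n, bottom_correct, bottom_n):
    """Return (p, D) for one item from top-group and bottom-group counts."""
    p_top = top_correct / top_n
    p_bottom = bottom_correct / bottom_n
    # Round to 2 decimals so that 0.7 - 0.4 compares cleanly against .30.
    p = round((p_top + p_bottom) / 2, 2)   # difficulty index
    d = round(p_top - p_bottom, 2)         # discrimination index
    return p, d


# Sample item: top group 7 of 10 correct, bottom group 4 of 10 correct.
p, d = item_analysis(7, 10, 4, 10)
acceptable = d >= 0.30
print(f"p = {p:.2f}, D = {d:.2f}, acceptable discrimination = {acceptable}")
```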
Term
Briefly explain item analysis of typical performance measures. |
|
Definition
Since there are no right and wrong answers, the item difficulty index (p) is not used (difficulty level is irrelevant) |
|
|
Term
What are the 2 measures of discrimination in typical performance item analysis? |
|
Definition
--D = discrimination index (% who said yes in the top group minus % who said yes in the bottom group); if D≥.30 then acceptable --Rpb = item-total correlation (correlation between item score and total score; closer to 1 is a better correlation) |
|
|
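One way to compute an item-total correlation is as a plain Pearson correlation between a yes/no item (coded 1/0) and the total score, which is what a point-biserial correlation amounts to. The sketch below assumes that approach; the helper name `pearson_r` and the data are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical data: one yes/no item (1 = yes) and each person's total score.
item = [1, 1, 0, 1, 0, 0]
total = [9, 8, 4, 7, 5, 3]
r_pb = pearson_r(item, total)
print(f"item-total correlation = {r_pb:.2f}")
```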
Term
What does item-total correlation rely on? |
|
Definition
significance as the criterion, which takes sample size into account (D does not). Item-total correlation is a more sensitive measure of discrimination |
|
|
Term
What type of validity is emphasized by achievement tests? |
|
Definition
content validity |
|
Term
What type of validity is emphasized by aptitude tests? |
|
Definition
criterion-related and construct validities |
|
|
Term
Explain 4 of the subtests of the Stanford-Binet. |
|
Definition
-verbal reasoning, crystallized -quantitative reasoning, crystallized -abstract/visual reasoning, fluid -short term memory, memory specific |
|
|
Term
How is content validity addressed on achievement tests in schools? |
|
Definition
a matter of finding the best fit between the school/district curriculum and the emphasis of a given test |
|
|
Term
How is basal age computed in the Stanford-Binet? |
|
Definition
highest level at which all items are passed |
|
|
Term
How is ceiling age computed on the Stanford-Binet? |
|
Definition
lowest level at which all items are answered incorrectly |
|
|
Term
How is the Stanford-Binet scored? |
|
Definition
basal age + months of credit for items gotten right |
|
|
Term
What is the current deviation IQ equation? Why is deviation IQ preferred to the old ratio IQ? |
|
Definition
IQ = z(16)+100; deviation IQ means the same across all age levels |
|
|
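A small sketch of the deviation IQ conversion, using the SD of 16 and mean of 100 from the card above. The function name, parameter names, and the raw-score norms are hypothetical; the point is that the same z-score yields the same IQ whatever the age group's raw-score mean and SD are.

```python
def deviation_iq(raw_score, age_group_mean, age_group_sd, iq_sd=16, iq_mean=100):
    """Convert a raw score to a deviation IQ via the age group's z-score."""
    z = (raw_score - age_group_mean) / age_group_sd
    return iq_sd * z + iq_mean


# Two hypothetical age groups with different raw-score norms.
# Both examinees sit 1.5 SD above their own group's mean.
iq_younger = deviation_iq(raw_score=32, age_group_mean=20, age_group_sd=8)
iq_older = deviation_iq(raw_score=65, age_group_mean=50, age_group_sd=10)
print(iq_younger, iq_older)
```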
Term
What is g-factor & what is s-factor? |
|
Definition
G-Factor: general factor, core of intelligence; S-Factor: specific factors, unique understanding |
|
|
Term
What are the verbal subtests of the WAIS-R (Wechsler Adult Intelligence Scale)? |
|
Definition
1 information 2 comprehension 3 arithmetic 4 similarities 5 vocabulary 6 digit span 7 letter-number sequencing |
|
|
Term
What are the performance subtests for the WAIS-R? |
|
Definition
8 picture completion 9 block design 10 picture arrangement 11 object assembly 12 digit symbol 13 matrix reasoning 14 symbol search |
|
|
Term
Explain Cattell's fluid and crystallized IQ concept. |
|
Definition
Fluid IQ is innate and is shown and tested through matrix tests.
Crystallized IQ is gained through experience with language and is shown and tested through vocabulary and verbal testing. |
|
|
Term
What is the most common IQ test? |
|
Definition
WAIS-R (Wechsler Adult Intelligence Scale) |
|
|
Term
What does the public law IDEA (1990) provide for? |
|
Definition
For special needs students: 1 least restrictive environment 2 individual education plans (IEPs) 3 clearly defined disorders |
|
|
Term
What does the public law PL95-561 provide for? |
|
Definition
for gifted and talented students -incentives to schools offering gifted programs (gifted is 1 1/2 SD above avg) |
|
|
Term
What is mental retardation? |
|
Definition
IQ at least 2 SD below mean, some states also require assessment of psychosocial maturity |
|
|
Term
What are learning disorders? |
|
Definition
LD - not explained by any other disorder; discrepancy of 1 1/2 SD in aptitude vs achievement |
|
|
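The 1 1/2 SD discrepancy rule from the card above can be expressed as a simple comparison of z-scores. This is a rough sketch of the arithmetic only; the function name, the `threshold` parameter, and the example scores are hypothetical, and it assumes both scores are already expressed in SD units.

```python
def has_discrepancy(aptitude_z, achievement_z, threshold=1.5):
    """True when achievement falls at least `threshold` SDs below aptitude."""
    return (aptitude_z - achievement_z) >= threshold


# Hypothetical examinees, scores given as z-scores (SD units).
print(has_discrepancy(aptitude_z=0.5, achievement_z=-1.25))  # gap of 1.75 SD
print(has_discrepancy(aptitude_z=0.5, achievement_z=-0.5))   # gap of 1.0 SD
```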
Term
What are behavioral disorders? |
|
Definition
presence or absence of key behaviors; requires behavioral assessment. Examples are ADD and ADHD |
|
|
Term
What are the 3 issues of test bias? |
|
Definition
measurement bias, prediction bias, content bias |
|
|
Term
What is measurement bias? |
|
Definition
differential difficulty across groups; example: item difficulty by group |
|
|
Term
What is prediction bias? |
Definition
differential prediction of scholastic success across groups (predicting better for one group than another); example: IQ predicting school GPA |
|
|
Term
What is content bias? |
Definition
content of items biased towards the dominant group example: regional, cultural |
|
|
Term
What are the 3 approaches to cross-cultural testing? |
|
Definition
-mainstream approach -pluralistic approach -"culture-fair" tests |
|
|
Term
What is the mainstream approach to cross-cultural testing? |
|
Definition
one set of norms that is not differentiated by group (race/culture). Tests should reflect the dominant culture since schools are embedded in the dominant culture |
|
|
Term
What is the pluralistic approach to cross-cultural testing? |
|
Definition
compare "like with like": same test with different norms for different groups |
|
|
Term
What is the "Culture Fair" test approach to cross-cultural testing? |
|
Definition
use of non-verbal items; change assessments by removing all words so that anyone can take the test |
|
|
Term
What are the problems with the "culture fair" test approach to cross-cultural testing? |
|
Definition
-assumptions of universal understanding (there is no such thing) -restricted range of abilities tested -lack of comparability with mainstream IQ tests |
|
|
Term
If you were constructing a depression scale, how would you deal with face validity and content validity? |
|
Definition
face validity: manipulate (reduce) face validity so that it is not apparent that the scale is measuring depression, so that test takers do not skew their answers one way or the other
content validity: make sure that there are questions covering all possible aspects, types, and situations of depression |
|
|
Term
If you were constructing a depression scale, how would you deal with criterion-related validity and construct validity? |
|
Definition
criterion-related validity: look for correlations with scores on tests such as the social desirability scale, body attitude, and the stress quiz; see whether another test predicts depression in the client; see whether the depression scale matches counselor observations
construct validity: make sure the scale correlates highly with other depression scales, like Beck's depression scale, and correlates low with unrelated scales, like the masculine/feminine scale |
|
|