Week 7 - Test Development
Test Development
Norm-Referenced Test
Test Conceptualization
“There ought to be a test designed to measure [fill in
the blank] in [such and such] way.”
➢ Criterion-Referenced Test
TYPES OF SCALES:
Age-Based Scale
Grade-Based Scale
Stanine Scale
Unidimensional Scale
PILOT WORK
Multidimensional Scale
Pilot Work, Pilot Study, and Pilot Research
SCALING METHODS:
➢ Rating Scale
Test Construction
SCALING
→ Summative Rating Scale
Scaling
→ Likert Scale
scale values
L.L. Thurstone
➢ absolute scaling
➢ Objective
WRITING ITEMS
➢ Advantage
Sorting Task
Guttman Scale
→ True-False Item
→ Other Variety of Binary-Choice Format
→ Disadvantage
Constructed-Response Format
Item Format
Types of Constructed-Response Item Format:
Completion Item
Selected-Response Format
→ Disadvantage
Short-Answer Item
Essay Item
Matching Item
→ Drawback
➢ Advantages
→ E.g., if a respondent answers an item in a way that
suggests he or she is depressed, the computer
might automatically probe for depression-related
symptoms and behavior.
Item Branching
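The depression example can be pictured as a simple branching rule. The sketch below is only an illustration: the item identifiers, the 1-5 response scale, and the cutoff of 3 are all hypothetical and do not come from any real instrument.

```python
# Hypothetical item bank: ids, response scale (1-5), and cutoff are illustrative only.
SCREENING_ITEM = "screen_low_mood"
FOLLOW_UP_ITEMS = ["sleep_change", "loss_of_interest", "fatigue"]
NEXT_GENERAL_ITEM = "next_general_item"


def next_items(item_id, response, cutoff=3):
    """Return the ids of the items to administer after `item_id`.

    If the screening item is answered at or above the cutoff, branch to
    the depression-related follow-up items; otherwise continue with the
    regular sequence.
    """
    if item_id == SCREENING_ITEM and response >= cutoff:
        return list(FOLLOW_UP_ITEMS)
    return [NEXT_GENERAL_ITEM]


# A high response on the screening item triggers the follow-up probe.
branch = next_items("screen_low_mood", 4)
# A low response continues with the regular sequence.
no_branch = next_items("screen_low_mood", 1)
```

A real computerized test would drive this from its item bank and scoring rules; the point here is only that the next item depends on the previous response.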
➢ SCORING ITEMS
Cumulative Model
→ Advantages:
Ipsative Scoring
Floor Effect
Test Tryout
Ceiling Effect
Item-Endorsement Index
phantom factors
p
p1
Item Analysis
$$\text{average } p = \frac{\sum p}{n}$$
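As a small worked illustration of the average-p formula, the sketch below takes p for each item as the proportion of testtakers who answered it correctly (the usual item-difficulty index) and then averages those values across the n items. The scored responses are made up for the example.

```python
def item_difficulty(responses):
    """Proportion of testtakers who answered the item correctly.

    `responses` is a list of 0/1 scores (1 = correct, 0 = incorrect).
    """
    return sum(responses) / len(responses)


def average_difficulty(p_values):
    """Average p: the sum of the item p values divided by the number of items."""
    return sum(p_values) / len(p_values)


# Hypothetical scored responses for three items and four testtakers.
scores = {
    "item_1": [1, 1, 0, 1],
    "item_2": [0, 1, 0, 0],
    "item_3": [1, 1, 1, 1],
}

p_values = {name: item_difficulty(r) for name, r in scores.items()}
avg_p = average_difficulty(list(p_values.values()))
```

Here the item p values are 0.75, 0.25, and 1.00, so the average p is 2.00 / 3, roughly 0.67.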
Item Analysis
➢ quantitatively
qualitative
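item’s difficulty
item’s reliability
item’s validity
item discrimination
$$\text{ideal } p = \frac{\text{chance-of-success proportion} + 1.00}{2}$$
ideal p = 0.75 (chance of success = 0.50)
ideal p = 0.6 (chance of success = 0.20)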
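The two ideal-p values follow directly from the formula: the midpoint between the chance-of-success proportion and 1.00. A minimal sketch, assuming chance of success is 1 divided by the number of response options (0.50 for a two-option item, 0.20 for a five-option item):

```python
def ideal_p(chance_of_success):
    """Midpoint between the chance-of-success proportion and 1.00."""
    return (chance_of_success + 1.00) / 2


# Two-option item (e.g., true-false): chance of success is 1/2.
ideal_two_option = ideal_p(1 / 2)

# Five-option multiple-choice item: chance of success is 1/5.
ideal_five_option = ideal_p(1 / 5)
```

These evaluate to 0.75 and 0.60 (up to floating-point rounding), matching the values listed above.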
item discrimination
ITEM-DIFFICULTY INDEX
Item’s Difficulty
ITEM-RELIABILITY INDEX
Item-Reliability Index
(s)
(r)
Factor Analysis
$p_1$
$$s_1 = \sqrt{p_1 (1 - p_1)}$$
The correlation (r) between the item score and the criterion score
($r_{1C}$)
($s_1$)
$$\text{item-validity index} = s_1 r_{1C}$$
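The item-validity index can be computed in two steps: the item-score standard deviation from p1, then its product with the item-criterion correlation. The sketch below is one way to do that in plain Python; the function names and the sample scores are hypothetical, and the Pearson correlation is written out by hand rather than taken from a library.

```python
import math


def item_sd(p1):
    """Item-score standard deviation for a 0/1 item: sqrt(p1 * (1 - p1))."""
    return math.sqrt(p1 * (1 - p1))


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when either variable is constant")
    return cov / math.sqrt(var_x * var_y)


def item_validity_index(item_scores, criterion_scores):
    """Product of the item-score SD (s1) and the item-criterion correlation (r1C)."""
    p1 = sum(item_scores) / len(item_scores)
    return item_sd(p1) * pearson_r(item_scores, criterion_scores)


# Hypothetical data: 0/1 scores on one item and scores on an external criterion.
item_scores = [1, 1, 0, 0]
criterion_scores = [4, 3, 2, 1]
index = item_validity_index(item_scores, criterion_scores)
```

Passing total test scores in place of the criterion scores gives the analogous product of s and r that defines the item-reliability index.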
ITEM-DISCRIMINATION INDEX
Item-Discrimination Index (d)
(U)
(L)
$$d = \frac{U - L}{n}$$
d = +1.00 → U = n, L = 0
d = 0 → U = L
d = –1.00 → U = 0, L = n
ANALYSIS OF ITEM ALTERNATIVES.

Item 1    Alternatives
          ∙A     B     C     D     E
     U    24     3     2     0     3
     L    10     5     6     6     5

Item 2    Alternatives
           A     B     C     D    ∙E
     U     2    13     3     2    12
     L     6     7     5     7     7

Item 3    Alternatives
           A     B    ∙C     D     E
     U     0     0    32     0     0
     L     3     2    22     2     3

Item 4    Alternatives
           A    ∙B     C     D     E
     U     5    15     0     5     7
     L     4     5     4     4    14

Item 5    Alternatives
           A     B     C    ∙D     E
     U    14     0     0     5    13
     L     7     0     0    16     9
ITEM-CHARACTERISTIC CURVES
OTHER CONSIDERATIONS IN ITEM ANALYSIS
ITEM FAIRNESS.
SPEED TESTS.
Expert Panels
→ Sensitivity Review
Test Revisions
“Think Aloud” Test Administration
TEST REVISION IN THE LIFE CYCLE OF AN EXISTING TEST
Cross-Validation
validity shrinkage
Co-Validation
co-norming
THE USE OF IRT IN BUILDING AND REVISING TESTS
[2] Determining measurement equivalence across test
taker populations.
→ Differential Item Functioning (DIF)
→ DIF Analysis
→ DIF Items