5 Reliability
Example:
A weight scale provides reliable scores if it shows the same weight every time the same object is weighed.
How similar would the students’ scores have been had she assessed them yesterday, or tomorrow, or next week?
How much would the scores have differed had a different teacher scored them?
How much would the scores have differed had the teacher used a different sample of tasks?
Sources of inconsistency between two administrations of an assessment:
▪ Assessments in close succession: attention, fatigue, guessing, memory, effort
▪ Assessments separated by a long period: learning experience, health, forgetting
Characteristics of Reliability
▪ Reliability refers to the results obtained with an assessment instrument and not
the instrument itself
▪ An estimate of reliability always refers to a particular type of consistency
▪ If you want to measure what individuals will be like at some future time, consistency
of scores over time is important
▪ If you want to measure individuals’ current understanding of certain scientific principles, consistency of performance across different tasks is important
Validity
▪ If an assessment has high validity, it also has high reliability; however, high reliability by itself does not guarantee high validity.
▪ Correlation coefficient: A statistic that indicates the degree of relationship between any two sets of scores obtained from the same group of individuals.
Test-retest/stability method
Construct X is measured with the same instrument A (Form 1) administered to the same sample n at Time 1 and again at Time 2, and SCORE 1 is correlated with SCORE 2.
This correlation coefficient indicates how stable the assessment results are over a period of time.
A coefficient close to 1 indicates high reliability.
▪ If the time interval between the two tests is too short, the consistency of the results will be distorted because students will remember the tasks and their responses from the first test.
▪ If the time interval between the two tests is too long, the consistency of the results will be distorted because actual changes in the students will have occurred.
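In practice, the test-retest coefficient is simply the Pearson correlation between the two sets of scores. A minimal sketch in Python, assuming hypothetical scores from two administrations:

import numpy as np

# Hypothetical scores from the same students on the same form,
# obtained at two points in time.
scores_time1 = np.array([52, 47, 60, 55, 49, 63, 58, 51])
scores_time2 = np.array([54, 45, 61, 53, 50, 62, 57, 52])

# np.corrcoef returns the 2x2 correlation matrix; the off-diagonal
# entry is the test-retest (stability) coefficient.
r_stability = np.corrcoef(scores_time1, scores_time2)[0, 1]
print(f"Test-retest reliability: {r_stability:.2f}")  # close to 1 = high stability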
Equivalent/alternative/parallel forms method (both forms at the same time)
Construct X is measured with instrument A; Form 1 and Form 2 are administered to the same sample n at the same time (Time 1), and SCORE 1 is correlated with SCORE 2.
This correlation coefficient indicates the degree to which the two forms are measuring the same aspects of behaviour.
A coefficient close to 1 indicates high reliability; a coefficient close to 0 indicates lower reliability.
Equivalent/alternative/parallel forms method (with a time interval)
Construct X is measured with instrument A; Form 1 is administered at Time 1 and Form 2 at Time 2 to the same sample n, and SCORE 1 is correlated with SCORE 2.
This correlation coefficient indicates the degree to which the two forms are measuring the same aspects of behaviour, combined with the stability of the results over time.
A coefficient close to 1 indicates high reliability; a coefficient close to 0 indicates lower reliability.
Internal-Consistency Methods: Split-half reliability
▪ There are several internal-consistency methods that require only one administration of an instrument.
▪ Split-half procedure: the two halves of a test are scored separately for each subject, and the correlation coefficient between the two scores is calculated. It indicates the degree to which consistent results are obtained from the two halves of the test.
▪ Methods of splitting: the first half versus the second half; odd- versus even-numbered items; a random selection of items.
Internal-Consistency Methods: Split-half reliability (continued)
Spearman-Brown Formula:
Reliability of full assessment = (2 × correlation between half assessments) / (1 + correlation between half assessments)
Example: Reliability of full assessment = (2 × 0.60) / (1 + 0.60) = 0.75
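A minimal sketch of the split-half procedure with the Spearman-Brown correction, assuming a hypothetical matrix of dichotomously scored items (rows = students, columns = items):

import numpy as np

# Hypothetical item scores: 5 students x 8 items, scored 0/1.
item_scores = np.array([
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1, 0, 0],
    [0, 1, 1, 0, 1, 0, 0, 1],
    [0, 0, 1, 0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0],
])

# Split the test into odd- and even-numbered items and score each half.
odd_half = item_scores[:, 0::2].sum(axis=1)   # items 1, 3, 5, 7
even_half = item_scores[:, 1::2].sum(axis=1)  # items 2, 4, 6, 8

r_half = np.corrcoef(odd_half, even_half)[0, 1]  # correlation between halves
r_full = (2 * r_half) / (1 + r_half)             # Spearman-Brown correction
print(f"Half-test r = {r_half:.2f}, full-test reliability = {r_full:.2f}")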
Internal-Consistency Methods: KR-20, KR-21, Alpha coefficient
Alpha Coefficient:
• It is a generalization of the KR-20 for assessments that have more than dichotomous scores (e.g., each task is scored on a 5-point scale).
• Both coefficients provide information about the degree to which the items or tasks in the assessment measure similar characteristics.
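A minimal sketch of the alpha coefficient, using its standard formula α = k/(k − 1) × (1 − Σ item variances / total-score variance) on hypothetical 5-point task scores:

import numpy as np

# Hypothetical task scores: 5 students x 4 tasks, each scored 1-5.
task_scores = np.array([
    [4, 5, 4, 3],
    [3, 3, 2, 3],
    [5, 5, 4, 5],
    [2, 3, 2, 2],
    [4, 4, 5, 4],
])

k = task_scores.shape[1]                          # number of tasks
item_vars = task_scores.var(axis=0, ddof=1)       # variance of each task
total_var = task_scores.sum(axis=1).var(ddof=1)   # variance of total scores
alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(f"Alpha coefficient: {alpha:.2f}")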
Limitations of Internal-Consistency Methods
▪ They are not appropriate for speeded assessments, that is, assessments with time limits that prevent students from attempting every task.
▪ They do not indicate the constancy of student responses from day to day
because there is only one administration.
INTER-RATER RELIABILITY/scorer agreement method
Construct X is measured with instrument A (Form 1) administered to sample n; the same responses are scored independently by Rater 1 (SCORE 1) and Rater 2 (SCORE 2), and the two sets of scores are correlated.
This correlation coefficient indicates the degree to which the relative ordering of responses is consistent from one rater to another.
A coefficient close to 1 indicates high reliability; a coefficient close to 0 indicates lower reliability.
Percentage of agreement
Another simple index of scorer agreement is the percentage of responses to which the two raters assign the same score.
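A minimal sketch of that percentage, assuming hypothetical rubric scores from two raters on the same eight responses:

# Hypothetical rubric scores assigned by two raters to the same responses.
rater1 = [3, 4, 2, 5, 3, 4, 1, 3]
rater2 = [3, 4, 3, 5, 3, 4, 2, 3]

# Count the responses on which the raters agree exactly.
matches = sum(a == b for a, b in zip(rater1, rater2))
pct_agreement = 100 * matches / len(rater1)
print(f"Percentage of agreement: {pct_agreement:.0f}%")  # 75% here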
Standard error of measurement (SEM)
Suppose that we are assessing a student over and over again with the same assessment procedure. We will obviously get different scores each time.
• True score: the score that would be obtained if the test were perfectly reliable.
If a student were tested repeatedly under identical conditions, with no memory, learning, practice, or fatigue effects:
- We could be 68% sure that his/her true score would fall within one SEM of the obtained score.
- We could be 95% sure that his/her true score would fall within two SEMs of the obtained score.
- We could be 99% sure that his/her true score would fall within three SEMs of the obtained score.
• Each obtained score has a confidence band/interval.
For example:
• Tuğrul has a score of 52, and the standard error of measurement is 4.
• What does this mean?
• Tuğrul’s true score is between (52 − 4) and (52 + 4) with 68% confidence. In other words, we are 68% confident that his true score is between 48 and 56.
• Tuğrul’s true score is between (52 − 4×2) and (52 + 4×2) with 95% confidence. In other words, we are 95% confident that his true score is between 44 and 60.
• Tuğrul’s true score is between (52 − 4×3) and (52 + 4×3) with 99% confidence. In other words, we are 99% confident that his true score is between 40 and 64.
Relationship between SEM and Reliability
SEM = SD × √(1 − r)
SD = standard deviation, r = reliability coefficient
As the reliability coefficient increases for any given standard deviation, the standard error of measurement decreases. Conversely, small reliability coefficients are associated with large measurement errors.
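A minimal sketch tying the formula to the Tuğrul example above; the SD and reliability values are hypothetical ones chosen so that SEM = 4:

import math

sd, r = 10.0, 0.84                 # hypothetical SD and reliability
sem = sd * math.sqrt(1 - r)        # SEM = 10 x sqrt(0.16) = 4.0
print(f"SEM = {sem:.1f}")

# Confidence bands around an obtained score of 52.
obtained = 52
for n_sem, confidence in [(1, 68), (2, 95), (3, 99)]:
    low, high = obtained - n_sem * sem, obtained + n_sem * sem
    print(f"{confidence}% band: {low:.0f} to {high:.0f}")  # 48-56, 44-60, 40-64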
Factors Influencing Reliability
▪ Number of assessment tasks: the larger the number of tasks, the higher the reliability (see the sketch after these points).
▪ A longer assessment provides a more adequate sample of the behaviour being measured.
▪ Scores are less affected by chance factors such as familiarity with a given task.
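The doubling formula shown earlier is a special case of the general Spearman-Brown formula, a standard result that predicts reliability when the number of tasks is multiplied by a factor k; a minimal sketch:

# General Spearman-Brown formula: predicted reliability when the
# assessment is lengthened by a factor k (k = 2 reproduces the
# split-half correction shown earlier).
def spearman_brown(r: float, k: float) -> float:
    return (k * r) / (1 + (k - 1) * r)

r_current = 0.60
for k in (1, 2, 3):  # same length, doubled, tripled
    print(f"{k}x tasks -> reliability {spearman_brown(r_current, k):.2f}")
# 1x -> 0.60, 2x -> 0.75, 3x -> 0.82: more tasks, higher reliability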
Objectivity: It refers to the degree to which equally competent scorers obtain the same results.
▪ Raters are important; they should be trained in how to use the rubrics.
▪ Rubrics should be clearly established.
USABILITY
▪ Ease of administration
▪ Directions should be simple and clear
▪ Time needed for the administration should not be too great
▪ Ease of interpretation and application of results
▪ If assessment results are interpreted correctly and applied effectively, they contribute to more intelligent educational decisions.