Chapter 5

Chapter 5 discusses measurement, reliability, and validity in research, outlining different levels of measurement such as nominal, ordinal, interval, and ratio. It emphasizes the importance of reliability, which ensures consistency in measurements, and validity, which assesses whether a test measures what it intends to measure. Various types of validity, including content, criterion, and construct validity, are explained, along with methods to enhance reliability and validity in testing.

Uploaded by

Vi Phạm

CHAPTER 5

Measurement, Reliability, and Validity
Levels of measurement
Measurement: the assignment of values to outcomes
Levels of measurement reflect the way in which outcomes are measured or assessed:
• Nominal variables are categorical in nature.
Gender (male or female), preferences (like or dislike), voting record (for or against),…
• Ordinal variables reflect rankings.
Rank in college, order of finishing a race
• Interval variables have equal intervals between them.
Temperature, intelligence test scores, …
• Ratio variables have equal intervals between them and have an absolute zero.
Age, weight, time

A continuous variable is one that can assume any value along some underlying
continuum (e.g., height, age, …).
A discrete or categorical variable is one with values that can be placed only into
categories that have definite boundaries (e.g., gender, marital status).
Practice
Identify the levels of measurement of the following variables
1. Amount of money in savings account
2. Letter grades (A, B, C,…) on an English essay
3. Time spent commuting to work
4. Classification of exercises (beginning, intermediate, advanced)
5. Levels of agreement (Strongly disagree, Disagree, Neither agree
nor disagree, Agree, Strongly agree)
6. Flavors of ice-cream
7. Types of living accommodation (house, apartment, trailer, other)
8. Weight
9. Time of day (dawn, morning, noon, afternoon, evening, night)
10. IELTS band scores
Reliability: consistency of the measurement
Validity: accuracy & truthfulness → the test measures what it should measure

Is it reliable?
Is it valid?

Time 1    Time 2    Time 3    Time 4    Time 5
45 kg     40 kg     50 kg     50 kg     43 kg
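
One rough way to look at the "Is it reliable?" question is to summarize how much the five readings disagree. The sketch below uses only the values from the slide. The choice of mean, range, and sample standard deviation as summaries is an assumption, not something the slides prescribe.

```python
from statistics import mean, stdev

# The five readings from the slide, in kilograms.
readings_kg = [45, 40, 50, 50, 43]

average = mean(readings_kg)                          # central value of the readings
spread = stdev(readings_kg)                          # sample standard deviation
value_range = max(readings_kg) - min(readings_kg)    # largest minus smallest reading

print(f"mean = {average:.1f} kg, range = {value_range} kg, stdev = {spread:.1f} kg")
```

The readings differ by 10 kg around a mean of 45.6 kg, which suggests the measurement is not very consistent. Note that consistency alone would not show validity: a scale could give the same wrong weight every time.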
Reliability
• Reliability occurs when a test measures the same thing
more than once and results in the same outcomes.
• Reliability is described in terms of an observed score and a true
score: observed score = true score + error.

Observed score = 7.3


True score = 8.5
Error = Observed – True = -1.2
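
The same bookkeeping as a minimal Python sketch, using the two scores from the example. The variable names are illustrative only.

```python
# Scores taken from the example above.
observed_score = 7.3
true_score = 8.5

# Error is the part of the observed score that is not the true score.
error = observed_score - true_score

print(f"Error = {error:.1f}")  # prints "Error = -1.2"
```

A negative error means the observed score underestimates the true score. In practice the true score is never known directly, so this is a way of thinking about error rather than a calculation one can run on real data.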
Identify whether the following factors would contribute to
TRAIT or METHOD sources of errors.

1. Level of ability
2. Bias in grading
3. Test-taking skills
4. Interaction between examiner and test taker
5. Health
6. Fatigue
7. Motivation
8. Emotional strain
9. Testing environment
10. Ability to understand instructions
Increasing Reliability
• Increase the number of items or observations (a larger sample is
more representative and more reliable)
• Eliminate items that are unclear
• Standardize the conditions under which the test is taken
• Moderate the degree of difficulty of the tests
• Minimize the effects of external events
• Standardize instructions
• Maintain consistent scoring procedures
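
The first point in the list (more items, higher reliability) can be illustrated with the Spearman-Brown formula. The slides do not name this formula, so treat it as an added assumption. It also assumes that the new items are comparable in quality to the existing ones. The starting reliability of 0.60 is a hypothetical value.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability after making a test `length_factor` times as long.

    Assumes the added items behave like the existing ones.
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


current_reliability = 0.60  # hypothetical reliability of the current test
doubled = spearman_brown(current_reliability, 2)   # twice as many items
tripled = spearman_brown(current_reliability, 3)   # three times as many items

print(f"doubled: {doubled:.2f}, tripled: {tripled:.2f}")
```

Under these assumptions, doubling the test raises the predicted reliability from 0.60 to 0.75, and each further increase in length gives a smaller gain.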
Measuring Reliability
• Reliability coefficients (r) range in value from -1.00 to +1.00.
A value of +1.00 would be perfect reliability.
• r >= 0.8 → the test is reliable
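
A minimal sketch of how such a coefficient could be computed, assuming r is the Pearson correlation between two sets of scores from the same people. The scores are hypothetical, and `pearson_r` is an illustrative helper, not part of the slides.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    variation_x = sum((a - mean_x) ** 2 for a in x)
    variation_y = sum((b - mean_y) ** 2 for b in y)
    if variation_x == 0 or variation_y == 0:
        raise ValueError("correlation is undefined when all scores in a list are equal")
    return covariation / sqrt(variation_x * variation_y)


# Hypothetical scores of five test takers on two administrations.
time_1 = [12, 15, 9, 20, 17]
time_2 = [13, 14, 10, 19, 18]

r = pearson_r(time_1, time_2)
verdict = "reliable" if r >= 0.8 else "not reliable by the 0.8 rule"
print(f"r = {r:.2f} ({verdict})")
```

The same helper applies to test-retest, parallel-forms, and inter-rater designs. Only the meaning of the two lists changes (two times, two forms, or two raters).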

Test–retest reliability examines consistency over time (Time 1 vs. Time 2)
Parallel-forms reliability examines consistency between forms (Form 1 vs. Form 2)
Inter-rater reliability examines consistency across raters (Rater 1 vs. Rater 2)
Internal consistency examines the unidimensional nature of a set of items (individual items vs. the entire test)
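
Internal consistency is commonly summarized with Cronbach's alpha. The slides do not name a specific statistic, so its use here is an assumption. The sketch compares the variance of individual items with the variance of total scores. The responses are hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of rows, one row of item scores per respondent."""
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    item_columns = list(zip(*responses))                   # one tuple of scores per item
    item_variances = [variance(col) for col in item_columns]
    total_variance = variance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers of five respondents to three items on a 1-5 scale.
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 1],
    [3, 3, 3],
]

alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f}")
```

Values close to 1 indicate that the items rise and fall together across respondents, which is what one expects if they measure a single underlying dimension.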
Practice
Identify the correct type of reliability
1. Two trained teachers observe young learners' behavior in a classroom.
Each teacher rates the observed behaviors using the same form, and the
correlation between the two teachers' ratings is calculated.

2. An IQ test is given to 70 participants on October 1st and then the same
IQ test is administered to the same group of 70 participants one month
later. The correlation of scores between the two tests is finally calculated.

3. Two versions of an ICT knowledge test with the same level of difficulty
and contents are administered to the same group of participants. The
reliability is then determined by computing the correlation between the
results of the two test versions.
VALIDITY
The test/instrument you are using actually measures what you need
to have measured.
• Validity refers to the results/outcomes of a test, not to the test itself.
• Validity is a matter of degree, ranging from low validity to high validity.
• The validity of the results of a test must be interpreted within the
context in which the test occurs.

Example:
Which one is a valid question for an English vocabulary test?
• Give two synonyms of the word “enormous”
• How many bones are in the human body?
TYPES OF VALIDITY
Content Validity
• Content validity indicates the extent to which a test represents the universe
of items from which it is drawn.
• Expert opinion is often used to establish the content validity of a test.

What students learned        What should appear in the test
Chapter 1: History           5 questions about History
Chapter 2: Geography         5 questions about Geography
Chapter 3: Culture           5 questions about Culture
Criterion Validity
• Criterion validity is a measure of the extent to which a test is related to some
criterion.
Concurrent validity: how well a test estimates present performance
Do Section 1 scores correlate with the test scores?
Do IELTS scores correlate with GPA of English-major students?

Predictive validity: how well a test predicts future performance
Do academic achievements (GPA) correlate with (future) high-paying jobs?
Construct Validity
• Construct validity: the extent to which the results of a test are
related to an underlying set of related variables.
Example: An English listening comprehension test should test:
  • ability to understand details (bottom-up processing)
  • ability to comprehend major points or gist (top-down processing)
  • ability to make inferences
  • ability to guess the meaning of unknown words from the context
  • ability to write responses in paragraphs
(based on the Interactive model of listening comprehension)

TO ESTABLISH CONSTRUCT VALIDITY:


• Correlate your test with some already established tests
Correlate your listening test score with IELTS listening test, FCE listening
test,…
• Compare the results of the test between different groups of people
with/without certain characteristics
Administer your listening test to low English proficiency students and to
high proficiency students
• Check whether the test items/components are consistent with the
underlying theory
Does your test include items for checking students’ ability to understand
details?
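
The second step (comparing groups) can be sketched as follows. If the listening test really measures listening ability, the high-proficiency group should clearly outscore the low-proficiency group. The scores are hypothetical, and using Cohen's d as the effect-size measure is an assumption, since the slides do not name a statistic.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(group_a, group_b):
    """Standardized mean difference between two groups, using a pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * stdev(group_a) ** 2 + (n_b - 1) * stdev(group_b) ** 2
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)


# Hypothetical listening test scores (out of 40).
high_proficiency = [32, 35, 30, 34, 33]
low_proficiency = [18, 22, 20, 17, 21]

d = cohens_d(high_proficiency, low_proficiency)
print(f"mean high = {mean(high_proficiency):.1f}, "
      f"mean low = {mean(low_proficiency):.1f}, d = {d:.1f}")
```

A large positive d supports construct validity. A value near zero would suggest the test does not separate the groups it should separate.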
Reliability and Validity
• A test can be reliable but not valid, but a test cannot be valid without
first being reliable.
Reliability is a necessary, but not sufficient, condition of validity.

Which one is a valid question for an English vocabulary test?
• Give two synonyms of the word “enormous”
• How many bones are in the human body?
Consider the situation

A mathematics test for Vietnamese 12th graders:

Nine thousand people came to the conference venue on Friday. Half of them
left the conference on Saturday. Then on this Sunday morning, two thirds of
the rest returned home.

Question: How many people were still at the conference at 7:00 PM on this Sunday?
Suggested answer
The math test has some problems in establishing its validity:
• The language of the test is English, not the native language of the test
takers (construct validity)
• The level of difficulty of the test is not appropriate for 12th graders: it
tests math knowledge that is too basic (content validity)
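
To see how basic the required math is, the word problem can be worked through in a few lines. This sketch assumes that nobody else arrives or leaves between Sunday morning and 7:00 PM, which the problem does not state.

```python
from fractions import Fraction

arrived_friday = 9000

# Half of the attendees left on Saturday.
after_saturday = arrived_friday - arrived_friday * Fraction(1, 2)

# Two thirds of those remaining went home on Sunday morning.
after_sunday_morning = after_saturday - after_saturday * Fraction(2, 3)

print(int(after_saturday), int(after_sunday_morning))  # prints "4500 1500"
```

Only two fraction operations are needed, which supports the point about content validity for this grade level.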
