Chapter 5
Measurement, Reliability,
and Validity
Measurement
The assignment of values to outcomes.
Levels of measurement
Levels of measurement reflect the way in which outcomes are measured or assessed.
• Nominal variables are categorical in nature.
Gender (male or female), preferences (like or dislike), voting record (for or against),…
• Ordinal variables reflect rankings.
Rank in college, order of finishing a race
• Interval variables have equal intervals between them.
Temperature, intelligence test scores, …
• Ratio variables have equal intervals between them and have an absolute zero.
Age, weight, time
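One practical consequence of the four levels is that each one supports a different set of meaningful summaries. The sketch below illustrates this with Python's standard `statistics` module. All data values and variable names are hypothetical and chosen only for illustration; the pairing of level and statistic follows the common textbook convention rather than anything stated on these slides.

```python
from statistics import mean, median, mode

# Hypothetical data, invented for illustration only.
votes = ["for", "against", "for", "for", "against"]  # nominal
race_places = [1, 2, 3, 4, 5]                          # ordinal (ranks)
temps_c = [18.0, 21.5, 19.0, 22.5]                     # interval
weights_kg = [50.0, 75.0, 100.0]                       # ratio

# Nominal: categories can only be counted, so the mode is the usual summary.
most_common_vote = mode(votes)

# Ordinal: order is meaningful, so the median is a sensible summary.
middle_place = median(race_places)

# Interval: equal intervals make differences and means meaningful.
mean_temp = mean(temps_c)

# Ratio: an absolute zero makes ratios meaningful ("twice as heavy").
weight_ratio = weights_kg[2] / weights_kg[0]

print(most_common_vote, middle_place, mean_temp, weight_ratio)
```

Note that a ratio such as "twice as heavy" is meaningful for weight but not for Celsius temperature, because 0 °C is not an absolute zero.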
A continuous variable is one that can assume any value along some underlying
continuum (e.g., height, age, …).
A discrete or categorical variable is one with values that can be placed only into
categories that have definite boundaries (e.g., gender, marital status).
Practice
Identify the levels of measurement of the following variables
1. Amount of money in savings account
2. Letter grades (A, B, C,…) on an English essay
3. Time spent commuting to work
4. Classification of exercises (beginning, intermediate, advanced)
5. Levels of agreement (Strongly disagree, Disagree, Neither agree
nor disagree, Agree, Strongly agree)
6. Flavors of ice-cream
7. Types of living accommodation (house, apartment, trailer, other)
8. Weight
9. Time of day (dawn, morning, noon, afternoon, evening, night)
10. IELTS band scores
Reliability: consistency of the measurement
Validity: accuracy and truthfulness; the test measures what it should measure
Is it reliable?
Is it valid?
45 kg 40 kg 50 kg 50 kg 43 kg
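A rough way to think about the first question is to look at how much the readings vary. The sketch below assumes, as an interpretation not stated on the slide, that the five values are repeated weighings of the same object; the variable names are hypothetical.

```python
from statistics import mean, pstdev

# The five readings from the slide; treating them as repeated
# weighings of one object is an assumption made for this sketch.
readings_kg = [45, 40, 50, 50, 43]

spread_kg = max(readings_kg) - min(readings_kg)  # largest minus smallest reading
average_kg = mean(readings_kg)                   # average reading
sd_kg = pstdev(readings_kg)                      # population standard deviation

print(spread_kg, average_kg, round(sd_kg, 2))
```

Under that assumption, a wide spread between readings points to low consistency, that is, low reliability. Validity cannot be judged from these numbers alone, because it depends on how close the readings are to the true weight, which the slide does not give.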
Reliability
• Reliability occurs when a test measures the same thing more than once and
results in the same outcomes.
• Reliability consists of both an observed score and a true score component.
1. Level of ability
2. Bias in grading
3. Test-taking skills
4. Interaction between examiner and test taker
5. Health
6. Fatigue
7. Motivation
8. Emotional strain
9. Testing environment
10. Ability to understand instructions
Increasing Reliability
• Increase the number of items or observations (a larger sample is more
representative and gives more reliable results)
• Eliminate items that are unclear
• Standardize the conditions under which the test is taken
• Moderate the degree of difficulty of the tests
• Minimize the effects of external events
• Standardize instructions
• Maintain consistent scoring procedures
Measuring Reliability
• Reliability coefficients (r) range in value from -1.00 to +1.00. A value of
1.00 would indicate perfect reliability.
• If r >= 0.8, the test is considered reliable.
3. Two versions of an ICT knowledge test with the same level of difficulty
and contents are administered to the same group of participants. The
reliability is then determined by computing the correlation between the
results of the two test versions.
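The correlation described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the helper `pearson_r` and the score lists `form_a` and `form_b` are hypothetical, the scores are invented, and the 0.8 cut-off is the rule of thumb given on these slides.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores.

    Assumes both lists have some variation; identical scores in either
    list would make the denominator zero.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores of the same six participants on the two test versions.
form_a = [12, 15, 9, 18, 14, 11]
form_b = [13, 14, 10, 17, 15, 10]

r = pearson_r(form_a, form_b)
is_reliable = r >= 0.8  # rule of thumb from the slides

print(round(r, 2), is_reliable)
```

Each position in the two lists must refer to the same participant, since the correlation is computed over paired scores.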
VALIDITY
Validity means that the test/instrument you are using actually measures what
you need to have measured.
• Validity refers to the results/outcomes of a test, not to the test itself.
• Validity occurs in degrees, progressing from low validity to high validity.
• The validity of the results of a test must be interpreted within the
context in which the test occurs.
Example:
Which one is a valid question for an English vocabulary test?
• Give two synonyms of the word “enormous”
• How many bones are in the human body?
TYPES OF VALIDITY
Content Validity
• Content validity indicates the extent to which a test represents the universe
of items from which it is drawn.
• Expert opinion is often used to establish the content validity of a test.