Reliability & Pilot Testing
Quarter 2 – Notes 2
1. Test-retest Reliability
- give the same test to the same group of people on two separate occasions
- tests whether the same people get similar scores when they take the same test more than once
Steps:
Have a group of people take your test. This is your first round of testing.
After the first test, wait for a set period (a few days, weeks, or months). The right gap depends on
what you're measuring, but it should be long enough that people are not simply repeating their earlier
answers from memory.
Ask the same group of people to take the test again after the waiting period.
Compare the scores from the first test and the second test. If each person's scores are similar across
the two rounds, the test is reliable over time.
Calculate the Pearson correlation coefficient between the first-round and second-round scores to see
how strongly they are related. A high correlation (near 1) means the test has good test-retest
reliability.
If the correlation is high, the test produces consistent results over time. If the correlation is low
and the scores are very different, the test may need to be adjusted for better reliability.
Record the steps you took, the time gap between tests, and the results to show that you’ve tested for
reliability.
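The correlation step above can be sketched in Python. This is only an illustration and not part of the original notes: the scores are made up, and pearson_r is a helper name chosen for this example.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to exist")
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for the same five people, listed in the same order
# for both rounds (person 1 first, person 2 second, and so on).
first_round = [12, 15, 9, 18, 14]
second_round = [13, 14, 10, 17, 15]

r = pearson_r(first_round, second_round)
print(f"test-retest correlation r = {r:.2f}")
```

The scores must be paired person by person. If the two lists are in a different order, the correlation no longer says anything about reliability.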
2. Split-half Method
- dividing the questionnaire into two equal parts and checking if both halves give similar results
Steps:
Give the full test to a group of people.
After the test is complete, divide it into two equal parts. There are different ways to do this, like:
o Odd-numbered questions in one half and even-numbered questions in the other.
o Randomly splitting the questions into two groups of equal length.
Calculate the scores for both halves of the test. For example, if the test has 20 questions, first calculate
the score for the odd-numbered questions and then for the even-numbered questions.
Compare the scores from the two halves. If the scores are similar, it means the test is reliable and
measures consistently.
Calculate the Pearson correlation coefficient between the scores on the two halves to see how closely
they are related. A high correlation (near 1) indicates good split-half reliability.
If the halves don't match well, you might need to revise the test, maybe by ensuring both halves have
similar difficulty levels or balancing the types of questions.
Record how you split the test, calculated the scores, and interpreted the results to show you’ve tested for
reliability.
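The odd/even split and the correlation step can be sketched in Python. This is an illustration only: the responses are made up, and the names pearson_r and split_half_scores are chosen for this example. The last step, the Spearman-Brown formula, is an addition that the notes above do not cover. It is often applied because each half is only half as long as the full test.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def split_half_scores(responses):
    """Return each person's total on odd-numbered and even-numbered items."""
    odd = [sum(person[0::2]) for person in responses]   # items 1, 3, 5, ...
    even = [sum(person[1::2]) for person in responses]  # items 2, 4, 6, ...
    return odd, even

# Hypothetical data: 5 people (rows) answering 6 items scored 1 to 5.
responses = [
    [4, 5, 4, 4, 5, 4],
    [2, 1, 2, 2, 1, 2],
    [3, 3, 4, 3, 3, 3],
    [5, 4, 5, 5, 4, 5],
    [1, 2, 1, 2, 2, 1],
]

odd_totals, even_totals = split_half_scores(responses)
r_half = pearson_r(odd_totals, even_totals)

# Spearman-Brown formula (not in the notes): estimates the reliability
# of the full-length test from the correlation between its two halves.
r_full = 2 * r_half / (1 + r_half)

print(f"correlation between halves = {r_half:.2f}")
print(f"Spearman-Brown estimate    = {r_full:.2f}")
```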
3. Internal Consistency
- checks whether the different parts of a test that are supposed to measure the same concept are doing so
consistently
Steps:
Give your test to a group of people.
Look at the questions in the test and see if they all measure the same thing. For example, if you're
testing "happiness," all questions should relate to different aspects of happiness, like "How happy do
you feel?" and "Do you enjoy your daily activities?"
Use a statistical method called Cronbach's Alpha to check internal consistency. This value tells you
how well the items in the test are related.
o A Cronbach's Alpha of 0.7 or higher is commonly treated as acceptable; the closer the score is
to 1, the better the internal consistency.
o A lower score means the items might not be closely related, and you might need to revise or
remove some questions.
Keep track of how you calculated internal consistency, the items you reviewed, and what changes
you made.
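Cronbach's Alpha can be computed by hand from the item scores. The sketch below uses the standard formula, alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The data are made up, and cronbach_alpha is a function name chosen for this example.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach's Alpha for a list of per-person item-score lists."""
    k = len(responses[0])  # number of items
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Regroup the data by item (columns) instead of by person (rows).
    items = list(zip(*responses))
    item_variance_sum = sum(variance(column) for column in items)
    total_variance = variance([sum(person) for person in responses])
    if total_variance == 0:
        raise ValueError("total scores must vary between respondents")
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)

# Hypothetical data: 5 people (rows) answering 4 items scored 1 to 5.
responses = [
    [4, 5, 4, 5],
    [2, 2, 1, 2],
    [3, 3, 3, 4],
    [5, 4, 5, 5],
    [1, 2, 2, 1],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's Alpha = {alpha:.2f}")
```

In these made-up data, people who score high on one item tend to score high on the others, so the value comes out well above the 0.7 mark mentioned above.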
Pilot Testing
- also called a pre-test
- small-scale trial run of a survey, questionnaire, or experiment before the full version is launched
1. Find 10-15 people from your target group to pre-test the questionnaire.
2. Design or provide spaces where the testers can freely indicate their remarks. Such remarks may be any of the
following:
a. "Delete this statement. I don't understand the question/statement."
b. "Revise the question/statement. Indicate the specific variables to be measured."
c. "Retain the question/statement. This is good."
d. "There are missing options in the list of choices."
e. "The question is so long. It's getting boring."
Quiz
Instruction: Identify each statement as true or false.
1. Reliability is about how dependable and consistent test results are across different times or situations.
2. Test-retest reliability involves giving the same test to a different group of people each time.
3. In the test-retest method, the time gap between the two tests should be short to ensure results aren't affected
by memory.
4. A high correlation between first and second test scores suggests good test-retest reliability.
5. Split-half reliability checks if two different tests measure the same concept equally.
6. In the split-half method, the test can be divided into odd and even-numbered questions to compare consistency.
7. A high Pearson correlation coefficient indicates a lack of reliability in the test.
8. Internal consistency is measured by examining how well each part of the test measures a different concept.
9. Cronbach’s Alpha is a statistic used to measure internal consistency.
10. A Cronbach’s Alpha score closer to 1 means the test items are closely related and have good internal
consistency.