Types of Reliability

Definition of Reliability
The probability that a system, component, or process will perform
its intended function without failure for a specified period under
stated conditions.

Any significant results must be repeatable: other researchers must
be able to perform exactly the same experiment, under the same
conditions, and generate the same results.
Definition of Reliability of Tests
According to Anastasi (1957), the reliability of a test refers to
the consistency of scores obtained by the individual on different
occasions or with different sets of equivalent items.

According to Stodola and Stordahl (1972), the reliability of a
test can be defined as the correlation between two or more sets of
scores on equivalent tests from the same group of individuals.

According to Guilford (1954), reliability is the proportion of the
true variance in obtained scores.
Types of Reliability
● Inter-Rater or Inter-Observer Reliability
Used to assess the degree to which different raters/observers give
consistent estimates of the same phenomenon.
● Test-Retest Reliability
Used to assess the consistency of a measure from one time to another.
● Parallel-Forms Reliability
Used to assess the consistency of the results of two tests constructed
in the same way from the same content domain.
● Internal Consistency Reliability
Used to assess the consistency of results across items within a test.
Inter-Rater or Inter-Observer Reliability
The degree of agreement or consistency between two or more
raters or observers when they are assessing, evaluating, or
rating the same phenomenon independently.

How it works:
1. Define what to measure
2. Raters assess independently
3. Compare the ratings
4. Analyze the results
Inter-Rater or Inter-Observer Reliability
Inter-rater reliability works by ensuring that different
raters, when using the same criteria, consistently produce
similar results. This consistency is crucial for the
reliability and validity of subjective assessments in
various fields.
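As a concrete, hypothetical illustration of steps 3 and 4, the sketch below compares two raters' category judgments using simple percent agreement and Cohen's kappa, computed with plain NumPy; the ratings are invented example data, not taken from the slides.

```python
# Hypothetical sketch: percent agreement and Cohen's kappa for two raters.
# The ratings below are invented example data.
import numpy as np

def percent_agreement(r1, r2):
    """Proportion of cases on which the two raters gave the same rating."""
    r1, r2 = np.asarray(r1), np.asarray(r2)
    return np.mean(r1 == r2)

def cohens_kappa(r1, r2):
    """Agreement corrected for the agreement expected by chance."""
    r1, r2 = np.asarray(r1), np.asarray(r2)
    categories = np.union1d(r1, r2)
    observed = np.mean(r1 == r2)
    # Chance agreement: product of each rater's marginal proportions per category.
    expected = sum(np.mean(r1 == c) * np.mean(r2 == c) for c in categories)
    return (observed - expected) / (1 - expected)

rater_1 = ["yes", "yes", "no", "yes", "no", "no",  "yes", "no"]
rater_2 = ["yes", "no",  "no", "yes", "no", "yes", "yes", "no"]
print(percent_agreement(rater_1, rater_2))  # observed agreement
print(cohens_kappa(rater_1, rater_2))       # chance-corrected agreement
```

Kappa is lower than raw percent agreement because it discounts the matches two raters would produce just by chance.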
Inter-Rater or Inter-Observer Reliability
[Diagram: Observer 1 and Observer 2 independently rate the same
object or phenomenon; the question is whether their ratings agree.]
Inter-Rater or Inter-Observer Reliability
Pilot Study - A pilot study can be defined as a 'small study
to test research protocols, data collection instruments,
sample recruitment strategies, and other research techniques
in preparation for a larger study'.

Incorporating inter-rater reliability into a pilot study is a
crucial step in ensuring that the study's measurement tools are
robust, the raters are well-calibrated, and the data collected
will be reliable and valid.
Inter-Rater or Inter-Observer Reliability
Limitations of this approach:
1. Subjectivity
Even with clear guidelines, different raters may interpret
criteria differently, leading to variability in their
judgments.
2. Training and Experience
The reliability can be influenced by the training and
experience of the raters. Inexperienced or poorly trained
raters may be less consistent, reducing inter-rater
reliability.
Test Re-test Reliability
The most frequently used method to find the reliability of a
test is to repeat the same test on the same sample at two
different points in time.

It involves administering the same test at two different times.

It estimates the error associated with administering a test at
two different times.
Test Re-test Reliability
It involves three steps:
1. Administering a test to a group of individuals
2. Re-administering the same test on the same sample
3. Correlating the first set of scores with the second (see the sketch below)
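A minimal sketch of step 3: the test-retest coefficient is simply the Pearson correlation between the two sets of scores. The scores below are invented example data.

```python
# Minimal sketch: test-retest reliability as the Pearson correlation
# between scores from two administrations (invented example data).
import numpy as np

time_1 = np.array([12, 18, 25, 30, 22, 15, 27, 20])  # first administration
time_2 = np.array([14, 17, 27, 29, 21, 16, 26, 22])  # same people, later date

# np.corrcoef returns the 2x2 correlation matrix; the off-diagonal
# entry is the test-retest reliability coefficient.
r_tt = np.corrcoef(time_1, time_2)[0, 1]
print(f"test-retest reliability r = {r_tt:.2f}")
```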
Test Re-test Reliability
[Diagram: the same test is administered at Time 1 and again at
Time 2; agreement between the two sets of scores indicates
stability over time.]
Test Re-test Reliability
Limitations of this approach:
1. Memory Effect / Carry Over Effect
Particularly holds true when the two administrations take
place within a short span of time.
2. Practice Effect
Occurs when repeated test-taking improves test scores, as is
typically seen with classical IQ tests, where scores improve
on re-administration.
3. Absence
Some participants may be absent for the re-test.
Parallel Forms Reliability
Also known by various names such as Alternate Forms
Reliability, Equivalent Forms Reliability, and Comparable
Forms Reliability.

It compares two equivalent forms of a test that measure the
same attribute. The two forms use different items; however,
the rules used to select items of a particular difficulty
level are the same.
Parallel Forms Reliability
How it works:
1. Creating Two Test Versions
2. Administering the Tests
3. Comparing the Scores

This is essential for maintaining the consistency, fairness,


and validity of tests, especially when multiple versions are
necessary
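Computationally, step 3 works like test-retest reliability, except the two score sets come from the two forms rather than two occasions. A hypothetical sketch with made-up scores:

```python
# Hypothetical sketch: parallel-forms reliability as the correlation
# between scores on Form A and Form B for the same group (invented data).
import numpy as np

form_a = np.array([34, 28, 41, 25, 37, 30, 45, 29])  # scores on Form A
form_b = np.array([32, 30, 40, 27, 35, 31, 44, 28])  # same people on Form B

r_ab = np.corrcoef(form_a, form_b)[0, 1]
print(f"parallel-forms reliability r = {r_ab:.2f}")
```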
Parallel Forms Reliability
[Diagram: Form A is administered at Time 1 and Form B at Time 2;
agreement between the two forms indicates stability across forms.]
Parallel Forms Reliability
Limitations of this approach:
1. Difficulty in Creating Truly Equivalent Forms
Designing two or more tests that are truly equivalent in
content, difficulty, and format is challenging. Even small
differences between forms can affect the results, leading to
lower reliability.
2. Practice Effect
If the two forms are administered close in time, test-takers
might remember content from the first form, which could
influence their performance on the second form, potentially
inflating the reliability.
Internal Consistency Reliability
A measure of how consistently different items on a test or
survey assess the same construct or concept.

It checks if the items within a test or questionnaire are all
aligned and work together to measure the same underlying trait
or ability.

Key Concepts
1. Homogeneity of Items
2. Correlations Among Items
Internal Consistency Reliability
Measured by:
1. Average Inter-item Correlation
2. Average Item-Total Correlation
3. Split-Half Reliability
4. Cronbach’s Alpha
Internal Consistency Reliability
Average Inter-item Correlation
Average inter-item correlation assesses the degree of
correlation between each pair of items within a test or
survey that is designed to measure the same construct.
Average Inter-item Correlation
[Example: a test with six items (Item 1 through Item 6); the
correlation between every pair of items is shown below.]

      I1    I2    I3    I4    I5    I6
I1   1.00
I2    .89  1.00
I3    .91   .92  1.00
I4    .88   .93   .95  1.00
I5    .84   .86   .92   .85  1.00
I6    .88   .91   .95   .87   .85  1.00

Average inter-item correlation = .90
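A small sketch of the calculation, using the correlation matrix from the example above: the average inter-item correlation is just the mean of the entries below the diagonal (for this matrix it comes out around .89, close to the .90 reported on the slide).

```python
# Sketch: average inter-item correlation from the example matrix above.
import numpy as np

# The 6 x 6 inter-item correlation matrix shown above.
corr = np.array([
    [1.00, 0.89, 0.91, 0.88, 0.84, 0.88],
    [0.89, 1.00, 0.92, 0.93, 0.86, 0.91],
    [0.91, 0.92, 1.00, 0.95, 0.92, 0.95],
    [0.88, 0.93, 0.95, 1.00, 0.85, 0.87],
    [0.84, 0.86, 0.92, 0.85, 1.00, 0.85],
    [0.88, 0.91, 0.95, 0.87, 0.85, 1.00],
])

# Take only the unique item pairs (entries below the diagonal) and average.
pairs = corr[np.tril_indices_from(corr, k=-1)]
print(f"average inter-item correlation = {pairs.mean():.2f}")
```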
Internal Consistency Reliability
Average Item-total Correlation
Measures the correlation of each individual item with the
total test score (excluding the item itself). It helps
determine whether each item is contributing effectively to
the overall measurement.
Average Item-total Correlation
[Example: the same six-item test, now also showing the correlation
of each item with the total test score.]

        I1    I2    I3    I4    I5    I6
I1     1.00
I2      .89  1.00
I3      .91   .92  1.00
I4      .88   .93   .95  1.00
I5      .84   .86   .92   .85  1.00
I6      .88   .91   .95   .87   .85  1.00
Total   .84   .88   .86   .87   .83   .82

Average item-total correlation = .85
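Item-total correlations are easiest to compute from raw item responses. The sketch below uses an invented 8-respondent by 6-item score matrix (not the data behind the table above) and correlates each item with the total score excluding that item, as the definition describes.

```python
# Sketch: corrected item-total correlations from raw item scores.
# The 8 respondents x 6 items below are invented for illustration.
import numpy as np

scores = np.array([
    [4, 5, 4, 5, 4, 4],
    [2, 2, 3, 2, 2, 3],
    [5, 5, 5, 4, 5, 5],
    [3, 3, 2, 3, 3, 2],
    [4, 4, 4, 4, 5, 4],
    [1, 2, 1, 2, 1, 1],
    [5, 4, 5, 5, 4, 5],
    [3, 3, 3, 2, 3, 3],
])

item_total = []
for i in range(scores.shape[1]):
    item = scores[:, i]
    rest = scores.sum(axis=1) - item     # total score excluding the item itself
    item_total.append(np.corrcoef(item, rest)[0, 1])

print(np.round(item_total, 2))           # one correlation per item
print(f"average item-total correlation = {np.mean(item_total):.2f}")
```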
Internal Consistency Reliability
Split-Half Reliability
Tests internal consistency by dividing the test into two
halves and checking how well the scores from each half
correlate with each other. It provides an estimate of
reliability for the full test.
Split-half Reliability
[Example: the six-item test is split into two halves, Items 1, 3,
and 4 versus Items 2, 5, and 6; the correlation between scores on
the two halves is .87.]
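A sketch of the calculation, using the slide's split (Items 1, 3, 4 versus Items 2, 5, 6) on invented raw scores. The half-test correlation is commonly stepped up with the Spearman-Brown formula (an addition not shown on the slides) to estimate the reliability of the full-length test.

```python
# Sketch: split-half reliability, with the Spearman-Brown correction
# applied to estimate full-test reliability (invented example data).
import numpy as np

scores = np.array([          # 8 respondents x 6 items, invented data
    [4, 5, 4, 5, 4, 4],
    [2, 2, 3, 2, 2, 3],
    [5, 5, 5, 4, 5, 5],
    [3, 3, 2, 3, 3, 2],
    [4, 4, 4, 4, 5, 4],
    [1, 2, 1, 2, 1, 1],
    [5, 4, 5, 5, 4, 5],
    [3, 3, 3, 2, 3, 3],
])

half_1 = scores[:, [0, 2, 3]].sum(axis=1)   # Items 1, 3, 4
half_2 = scores[:, [1, 4, 5]].sum(axis=1)   # Items 2, 5, 6

r_half = np.corrcoef(half_1, half_2)[0, 1]
# Spearman-Brown: estimate the reliability of the full-length test
# from the correlation between the two half-length tests.
r_full = 2 * r_half / (1 + r_half)
print(f"half-test correlation      = {r_half:.2f}")
print(f"full-test (Spearman-Brown) = {r_full:.2f}")
```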
Internal Consistency Reliability
Cronbach’s Alpha
Provides an overall measure of internal consistency across
all items in a test. It combines item variance and total
test variance into a single coefficient to assess
reliability.

Cronbach's Alpha is a valuable tool for assessing the internal
consistency of a test or questionnaire.
Internal Consistency Reliability
Cronbach’s Alpha
Cronbach’s Alpha provides a single number that summarizes
how consistently your test items measure the same underlying
concept. If you get a low value, it indicates that some
items might not fit well with the rest, and you might need
to refine your test items.
Cronbach's alpha (α)
[Example: for the six-item test, the split-half correlations from
all possible splits (SH1 = .87, SH2 = .85, SH3 = .91, SH4 = .83,
SH5 = .86, ..., SHn = .85) average out to α = .85. Cronbach's alpha
behaves like the average of all possible split-half correlations.]
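In practice alpha is computed directly from item and total-score variances using the standard formula α = k/(k-1) × (1 − Σ item variances / variance of total score). A short sketch on invented raw data:

```python
# Sketch: Cronbach's alpha from raw item scores (invented example data).
# alpha = k/(k-1) * (1 - sum of item variances / variance of total score)
import numpy as np

scores = np.array([          # 8 respondents x 6 items, invented data
    [4, 5, 4, 5, 4, 4],
    [2, 2, 3, 2, 2, 3],
    [5, 5, 5, 4, 5, 5],
    [3, 3, 2, 3, 3, 2],
    [4, 4, 4, 4, 5, 4],
    [1, 2, 1, 2, 1, 1],
    [5, 4, 5, 5, 4, 5],
    [3, 3, 3, 2, 3, 3],
])

k = scores.shape[1]                          # number of items
item_vars = scores.var(axis=0, ddof=1)       # variance of each item
total_var = scores.sum(axis=1).var(ddof=1)   # variance of the total score

alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")
```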
Summary
Inter-Rater Reliability: Consistency between different raters.
Test-Retest Reliability: Stability of scores over time.
Parallel-Forms Reliability: Consistency between different versions of a
test.
Internal Consistency Reliability: Consistency of items within a test.
Average Inter-Item Correlation: Average correlation between items.
Average Item-Total Correlation: Average correlation between items and the
total score.
Split-Half Reliability: Consistency between two halves of a test.
Cronbach's Alpha: Overall measure of internal consistency for all items
on a test.
