Reliability and Validity Analysis: Dr. Jeevan Jyoti Dept. of Commerce University of Jammu
ANALYSIS
2. Test-Retest Reliability
3. Parallel-Forms Reliability
• Example
Various questions for a personality test are tried out with a
class of students over several years. This helps the
researcher determine those questions and combinations that
have better reliability.
Parallel-Forms Reliability
It is a measure of reliability obtained by administering different
versions of an assessment tool (both versions must contain
items that probe the same construct, skill, knowledge base,
etc.) to the same group of individuals. It evaluates different
questions and question sets that seek to assess the same
construct.
The creation of parallel forms begins with the generation of a
large pool of items representing a single content domain or
universe. At minimum, the size of this item pool should be more
than twice the desired or planned size of a single test form.
• Examples
An experimenter develops a large set of questions, splits them
into two forms, and administers each form to a randomly
selected half of a target sample.
In development of national tests, two different tests are
simultaneously used in trials. The test that gives the most
consistent results is used, whilst the other (provided it is
sufficiently consistent) is used as a backup.
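The procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`pearson`, `split_item_pool`), the 20-item pool, and the score lists are all hypothetical, and parallel-forms reliability is taken here to be the Pearson correlation between total scores on the two forms.

```python
import random
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def split_item_pool(item_ids, seed=0):
    """Randomly divide an item pool into two equal-sized forms."""
    rng = random.Random(seed)
    shuffled = list(item_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:2 * half]

# Hypothetical pool of 20 items, split into Form A and Form B.
form_a, form_b = split_item_pool(range(1, 21))

# Hypothetical total scores of five people on each form.
form_a_totals = [12, 15, 9, 20, 17]
form_b_totals = [11, 16, 10, 19, 18]

parallel_forms_r = pearson(form_a_totals, form_b_totals)
print(sorted(form_a), sorted(form_b), round(parallel_forms_r, 3))
```

A correlation close to 1 would suggest that the two forms rank people in much the same way.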
Internal Consistency Reliability
Average inter-item correlation compares
correlations between all pairs of questions
that test the same construct by calculating the mean of
all paired correlations.
Average item-total correlation computes a total score across all
items for each respondent, correlates each item with that total,
then averages these item-total correlations.
Split-half correlation divides items that measure the same
construct into two tests, which are applied to the same group of
people, then calculates the correlation between the two total
scores.
Cronbach’s alpha is a measure of internal consistency, that is,
how closely related a set of items are as a group. It is considered
to be a measure of scale reliability. A "high" value for alpha does
not imply that the measure is unidimensional.
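As a minimal sketch, Cronbach's alpha can be computed from raw item scores with the usual variance-based formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The function name `cronbach_alpha` and the response data are hypothetical.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha; scores has one row per respondent, one column per item."""
    k = len(scores[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in scores]) for j in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical responses: five respondents, three items.
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 1],
    [3, 3, 3],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Because the formula only uses variances, a high alpha says the items covary strongly; it does not by itself show that they measure a single dimension.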
Internal Consistency Reliability
Average inter-item correlation
        I1     I2     I3     I4     I5     I6
I1     1.00
I2      .89   1.00
I3      .91    .92   1.00
I4      .88    .93    .95   1.00
I5      .84    .86    .92    .85   1.00
I6      .88    .91    .95    .87    .85   1.00
Average inter-item correlation = .90
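As a rough check on the figure above, the sketch below averages the 15 below-diagonal correlations from the matrix. The variable names are illustrative. The plain mean comes out at about .894, so the .90 shown is presumably a rounded figure.

```python
# Below-diagonal entries of the 6 x 6 inter-item correlation matrix
# (rows I2..I6), copied from the matrix above.
lower_triangle = [
    [0.89],
    [0.91, 0.92],
    [0.88, 0.93, 0.95],
    [0.84, 0.86, 0.92, 0.85],
    [0.88, 0.91, 0.95, 0.87, 0.85],
]

# Flatten to the 15 unique item pairs and take the mean.
pairs = [r for row in lower_triangle for r in row]
average_inter_item = sum(pairs) / len(pairs)
print(len(pairs), round(average_inter_item, 3))
```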
Internal Consistency Reliability
Average item-total correlation
        I1     I2     I3     I4     I5     I6    Total
I1     1.00
I2      .89   1.00
I3      .91    .92   1.00
I4      .88    .93    .95   1.00
I5      .84    .86    .92    .85   1.00
I6      .88    .91    .95    .87    .85   1.00
Total   .84    .88    .86    .87    .83    .82   1.00
Average item-total correlation = .85
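The .85 can be reproduced by averaging the six item-total correlations in the Total row. The sketch below does that, and also shows one way the item-total correlations themselves might be obtained from raw scores. The helper names and the raw responses are hypothetical, and each total here includes the item being correlated (an uncorrected item-total correlation).

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def item_total_correlations(scores):
    """Correlate each item column with the respondents' total score."""
    totals = [sum(row) for row in scores]
    k = len(scores[0])
    return [pearson([row[j] for row in scores], totals) for j in range(k)]

# Item-total correlations from the Total row of the matrix above.
total_row = [0.84, 0.88, 0.86, 0.87, 0.83, 0.82]
average_item_total = sum(total_row) / len(total_row)
print(round(average_item_total, 2))  # 0.85

# Hypothetical raw responses: five respondents, three items.
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 1],
    [3, 3, 3],
]
r_values = item_total_correlations(responses)
print([round(r, 3) for r in r_values])
```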
Internal Consistency Reliability
Split-half correlations
[Figure: the items of the test are divided into two halves, one of them containing Items 1, 3 and 4; the correlation between the two half-test scores is .87.]
[Figure: split-half correlations obtained from repeated splits of the items]
SH1   .87
SH2   .85
SH3   .91
SH4   .83
SH5   .86
...
SHn   .85
Cronbach's alpha = .85, like the average of all possible split-half correlations.
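To make the idea of "all possible split-half correlations" concrete, the sketch below enumerates every way of dividing an even number of items into two equal halves, correlates the two half scores, and averages the results. The function names and the response data are hypothetical. This plain average only approximates the idea on the slide and should not be read as a computation of alpha itself.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def split_half_correlations(scores):
    """Correlation between half scores for every equal split of the items.

    Assumes an even number of items. Item 0 is always kept in the first
    half so that each split is counted once, not twice as a mirror image.
    """
    k = len(scores[0])
    results = []
    for half in combinations(range(k), k // 2):
        if 0 not in half:
            continue
        other = [j for j in range(k) if j not in half]
        first = [sum(row[j] for j in half) for row in scores]
        second = [sum(row[j] for j in other) for row in scores]
        results.append(pearson(first, second))
    return results

# Hypothetical responses: six respondents, four items.
responses = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [5, 5, 4, 4],
    [1, 2, 1, 2],
    [3, 3, 3, 4],
    [4, 4, 5, 4],
]
sh = split_half_correlations(responses)
mean_split_half = sum(sh) / len(sh)
print(len(sh), round(mean_split_half, 3))
```

With four items there are three distinct splits; with the six items in the slides there would be ten.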
Composite Reliability
• Discriminant validity
Construct Validity
• Convergent validity tests whether variables that are expected to be
related, because they measure the same construct, are in fact related.
It is established through:
• Factor loadings > .50
• AVE (average variance extracted) > .50
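• AVE is calculated as:
\[
\mathrm{AVE} = \frac{\sum_i \lambda_i^2}{\sum_i \lambda_i^2 + \sum_i \varepsilon_i}
\]
where $\lambda_i$ is the standardised regression weight (factor loading) of item $i$ and $\varepsilon_i$ is its standard error (measurement error) term.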
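A small sketch of the AVE calculation and the two convergent-validity thresholds follows. The function name and the loadings are hypothetical. One assumption is made explicit in the code: when no error terms are supplied, each item's error term is taken as 1 minus its squared standardised loading, which holds only for a fully standardised solution.

```python
def average_variance_extracted(loadings, error_terms=None):
    """AVE = sum(loading^2) / (sum(loading^2) + sum(error terms))."""
    squared = [l ** 2 for l in loadings]
    if error_terms is None:
        # Assumption: standardised solution, so error = 1 - loading^2.
        error_terms = [1 - s for s in squared]
    return sum(squared) / (sum(squared) + sum(error_terms))

# Hypothetical standardised regression weights for one construct.
loadings = [0.72, 0.68, 0.81, 0.75]
ave = average_variance_extracted(loadings)

# Convergent validity criteria from the text: loadings > .50 and AVE > .50.
meets_criteria = all(l > 0.50 for l in loadings) and ave > 0.50
print(round(ave, 3), meets_criteria)
```

Under the standardised assumption the denominator equals the number of items, so AVE reduces to the mean squared loading.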
Constructs             Ability   Motivation   Opportunity   Emotional Exhaustion   Cynicism   Inefficacy
Ability                0.529
Motivation             (.233)    0.544
                       .483**
Opportunity            (.099)    (.145)       0.523
                       .314**    .382**
Emotional Exhaustion   (.041)    (.056)       (.073)        0.489
                       -.203**   -.236**      -.270**
Cynicism               (.099)    (.083)       (.094)        (.389)                 0.529