Breakdown of Validity
An example can be made by looking at psychological measures as hammers. You might be told that a
hammer is a useful tool, but the usefulness of a hammer actually depends on the job to be done. If
you need to drive a nail into a surface or remove a nail from a surface, then a hammer is useful. If
you need to hold down a piece of paper while you are working or break through a piece of sheetrock
in a wall, then a hammer might indeed be useful, even if it is not the ideal tool. However, if you need
to tighten a screw, saw a piece of wood, change a light bulb, or call a contractor to fix the hole in
your wall, then a hammer is completely useless. So it is somewhat simplistic, and even inaccurate, to
say that a hammer is a useful tool without regard to the way in which it will be used. In short, validity
concerns whether a test (the tool) is valid as a measure: does it measure what it is supposed to measure?
Definitions of validity:
"The degree to which evidence and theory support the interpretations of test scores entailed
by the proposed uses of a test" (AERA, APA, & NCME, 1999, p. 9).
The validity of a test interpretation should be understood in terms of strength (strong versus
weak evidence) rather than as simply valid or invalid.
Construct Validity:
Refers to the degree to which test scores can be interpreted as reflecting a particular psychological
construct.
Relates to the match between the actual content of a test and the content that should be included in
the test. If a test is to be interpreted as a measure of a particular construct, then the content of the
test should reflect the important facets of the construct. The supposed psychological nature of the
construct should dictate the appropriate content of the test. Validity evidence of this type is some-
times referred to as content validity, but there are two ways that content validity might be compro-
mised.
Construct-irrelevant content: a test should include no content (e.g., items or questions) that
is irrelevant to the construct for which the test is to be interpreted.
Construct underrepresentation: a test should include the full range of content that is relevant
to the construct, as much as possible; omitting important facets leaves the construct under-
represented.
Notes on Validity:
A test’s internal structure is the way that the parts of a test are related to each other.
Imagine that you are asked to develop a midterm test for a class in personality psychology, and the
test is intended to measure "knowledge of Freud" as covered in the class lectures, discussions, and
readings. Biographical questions about Freud's life should not be included on the test, because they
were not covered in class.
Face validity
Face validity is the degree to which a measure appears to be related to a specific construct,
in the judgment of non-experts such as test takers and representatives of the legal system.
A test has face validity if its content simply looks relevant to the person taking the test.
Face validity is not usually considered an important psychometric facet of validity – non-
experts’ opinions have no direct bearing on the empirical and theoretical quality of a test.
The difference between content validity and face validity is an important one.
Content validity:
Is the degree to which the content of a measure truly reflects the full domain of the con-
struct for which it is being used, no more and no less. In a sense, content validity can be
evaluated only by those who have a deep understanding of the construct in question.
Face validity:
Is the degree to which non-experts perceive a test to be relevant for whatever they believe it
is being used to measure.
Although test-takers’ beliefs about a test might affect their motivation and honesty in responding to
a test, test takers are not often experts on the theoretical and empirical meaning of the psychological
constructs being assessed by the tests.
Therefore, content validity, but not face validity, is an important form of evidence in the overall
evaluation of construct validity.
In sum, the internal structure of a test is an important issue in construct validity. A test's internal
structure should correspond with the structure of the construct that the test is intended to measure.
Typically, the internal structure is examined through the correlations among the items in a test and
among the subscales in a test (if there are any), and researchers often use factor analysis in this pro-
cess.
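As a concrete sketch of examining internal structure, the snippet below (all item names and response data are invented for illustration) computes the inter-item correlation matrix for a small set of Likert-type responses. If the test really has the intended two-facet structure, items within a facet should correlate strongly with each other and more weakly across facets:

```python
import numpy as np

# Hypothetical responses: 6 people answering 4 items on a 1-5 Likert scale.
# Items 1-2 are written to tap one facet of the construct, items 3-4 another.
responses = np.array([
    [5, 4, 2, 1],
    [4, 5, 1, 2],
    [2, 2, 5, 4],
    [1, 2, 4, 5],
    [4, 4, 2, 2],
    [2, 1, 5, 5],
])

# Inter-item correlation matrix: entry (i, j) shows how strongly
# items i and j covary across respondents.
corr = np.corrcoef(responses, rowvar=False)
print(np.round(corr, 2))
```

In this made-up data, items 1-2 and items 3-4 correlate highly within their pairs while the cross-pair correlations are negative, which is the kind of pattern a factor analysis would summarize as two factors.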
Concurrent validity:
Refers to the degree to which test scores are correlated with other relevant variables that are meas-
ured at the same time as the primary test of interest.
Criterion validity:
Refers to the degree to which test scores can predict specific criterion variables. From this perspec-
tive, the key to validity is the empirical association between test scores and scores on the relevant
criterion variable, such as “job performance.”
Concurrent validity and predictive validity (the degree to which test scores predict criterion variables
measured at a later time) have traditionally been viewed as two types of criterion validity, because
both refer to the association between test scores and specific criterion variables.
According to the traditional perspective on criterion validity, the psychological meaning of test scores
is relatively unimportant; all that matters is the test's ability to differentiate groups or predict specif-
ic outcomes.
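In the simplest case, a criterion validity coefficient is just the correlation between test scores and the criterion variable. The snippet below uses invented selection-test scores and supervisor ratings of job performance to show the computation:

```python
import numpy as np

# Hypothetical data: selection-test scores and later supervisor
# ratings of job performance for 8 employees (all values invented).
test_scores = np.array([62, 75, 48, 90, 55, 81, 70, 66])
job_performance = np.array([3.1, 3.8, 2.5, 4.6, 2.9, 4.0, 3.5, 3.2])

# The validity coefficient is the Pearson correlation between
# test scores and the criterion variable.
validity_coefficient = np.corrcoef(test_scores, job_performance)[0, 1]
print(round(validity_coefficient, 2))
```

A coefficient near 0 would mean the test tells us nothing about the criterion; the closer it is to 1 (or -1), the better the test predicts job performance, regardless of what the scores "mean" psychologically.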
Reflections
Activity 1:
A test is either valid or not. Is this statement true or false? How should validity be interpreted?
Activity 2:
Please share some of the challenges you have faced with this section of your course material.
Activity 3:
All activities can be found on the discussion page under Step 5, "Validity Practice Activity."
Resource:
http://www1.appstate.edu/~bacharachvr/chapter8-validity.pdf