Validity and Reliability of Measurements


Reliability, Validity

&
Threats to them

By
Dr. Abebe Megerso
(BSc. MPHE. PhD. Asst. Prof. Epidemiology)

1
Outline
• Introductory concepts & definitions of terms,
• Reliability & Threats to it,
• Reliability assessment methods,
• Validity & its common classifications,
• Relation b/n sample size & Validity,
• Threats to validity & methods to mitigate them,
• Demonstration using software applications,

2
Introduction
• How do you define Epidemiology?
• An informal definition of Epidemiology is that ‘it
is a Science of Sciences’.
• What is Science?
‘Science is the pursuit & application of
knowledge & understanding of the natural &
social world following a systematic
methodology based on evidence.’

(Science Council, 2009)


3
Introduction …
• Public Health is not a simple, reactive ‘take the pill
3X a day’ solution;
– it is a systematic approach of generating evidence &
guiding interventions,
• Epidemiology is one of the pillars of PH where
Epidemiologists study distribution & determinants
of PH problems,
– Therefore, measuring, understanding & documenting
evidence is a key role of Epidemiologists,
• Ensuring Reliability/precision & Validity/accuracy
of measurements is crucial for Epidemiologists,
4
Reliability/ Precision
• Reliability refers to whether measurement
(data collection) techniques & analytic
procedures would reproduce consistent
findings:
– If they were repeated on another occasion, or,
– If they were replicated by another researcher,
• It is about how the measure provides stable,
dependable & consistent results,
– It is unrelated to usefulness of the result, but a
precondition for validity,
5
Precision …
[Figure: two panels, ‘Precise but not valid estimation’ and ‘Less precise but valid estimation’]

6
Threats to Research Reliability
• Participant Error – any factor which adversely
alters the way in which a participant performs,
• Participant Bias – any factor which produces a
false response,
– E.g. social desirability bias, volunteer bias,
• Researcher Error – any factor which alters the
researcher’s interpretation,
• Researcher Bias – any factor which introduces
systematic error in the researchers’ recording of
responses,
– E.g. Bias at measurement, analysis or interpretation
7
Reliability Coefficient
• In measurement, there are two possible
variances:
– Variance of the true score (Vt) &
– Variance in the observed score (Vo),
• Reliability coefficient = Vt/Vo, a value b/n 0 & 1,
• It is a measure of variability explained by the
true score differences among the participants,
– E.g. if it is 0.95, 95% of the score variability is
explained by true score differences while 5% is due
to measurement error,
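As a hedged illustration of the ratio above, the following Python sketch simulates observed scores as a true score plus random measurement error and computes Vt/Vo. The simulated data, the seed and the function name `reliability_coefficient` are assumptions made for illustration only. In real studies the true scores are not observable, so reliability has to be estimated indirectly.

```python
import random
import statistics


def reliability_coefficient(true_scores, observed_scores):
    """Return Vt / Vo: true-score variance over observed-score variance."""
    vt = statistics.pvariance(true_scores)
    vo = statistics.pvariance(observed_scores)
    return vt / vo


# Hypothetical data: 5,000 participants, true scores with SD 10 and
# measurement error with SD 2.3 (all values are illustrative assumptions).
rng = random.Random(42)
true_scores = [rng.gauss(50, 10) for _ in range(5000)]
observed_scores = [t + rng.gauss(0, 2.3) for t in true_scores]

r = reliability_coefficient(true_scores, observed_scores)
print(f"Reliability coefficient: {r:.2f}")
```

With these assumed standard deviations the expected ratio is 100 / (100 + 2.3²), i.e. roughly 0.95, so about 5% of the observed variability would be due to measurement error.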
8
Reliability Assessment Techniques
• There are a number of techniques which can
be selected based on the character to be
measured,
• Some examples are:
i. Test-retest
ii. Alternate forms Reliability,
iii. Internal Consistency,
iv. Inter-rater consistency, etc.

9
i. Test-Retest Technique
• This is the use of the same test on the same group at
different times, followed by a test of agreement
between the two sets of results using appropriate
statistical methods,
– This technique is good if the instrument measures
stable characteristics (e.g. intelligence),
• It is not appropriate when scores can be
affected by repeated measurement,
• It is also inappropriate if clients change b/n the
two measurement administrations,
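A minimal sketch of a test-retest check, assuming the Pearson correlation is chosen as the agreement statistic (the slides do not prescribe a specific one). The scores and the helper name `pearson_r` are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores of the same 8 participants at two time points.
time_1 = [98, 105, 110, 92, 120, 101, 99, 115]
time_2 = [100, 103, 112, 90, 118, 104, 97, 117]

r = pearson_r(time_1, time_2)
print(f"Test-retest correlation: {r:.3f}")
```

A correlation close to 1 suggests that the instrument ranks participants consistently across the two administrations. It does not detect a uniform shift in scores, which is one reason other agreement statistics are sometimes preferred.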
10
ii. Alternate forms Reliability
• This is the application of two equivalent forms
of an instrument to the same group,
• It can be applied at different times, but this
can contribute to error,
• This approach is appropriate to measure
stable characteristics & when scores are not
affected by repeated measurement,

11
iii. Internal Consistency
• This is the most commonly used technique to
assess whether the items in the instrument
measure the same attribute of the characteristic,
• It includes different methods:
– Split-half method – split the instrument into two
halves so that each participant has two scores,
• A lower number of items results in decreased reliability,
– Cronbach’s alpha – formula used to calculate
inter-item consistency,
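The sketch below shows one way Cronbach’s alpha could be computed by hand, using the standard formula alpha = k/(k-1) × (1 - sum of item variances / variance of total scores). The response matrix and the function name `cronbach_alpha` are hypothetical; statistical packages report the same quantity.

```python
import statistics


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])            # number of items in the instrument
    items = list(zip(*rows))    # transpose: one tuple of scores per item
    item_variances = sum(statistics.variance(item) for item in items)
    total_variance = statistics.variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - item_variances / total_variance)


# Hypothetical responses: 6 respondents x 4 Likert items (scored 1-5).
responses = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 3, 4],
    [1, 2, 1, 2],
    [4, 4, 5, 4],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha: {alpha:.2f}")
```

Because the hypothetical items rise and fall together across respondents, alpha comes out high. Items that vary independently of one another would pull it toward 0.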

12
Internal Consistency ..
• Cronbach’s alpha tests whether
multiple-question Likert scale
surveys are reliable.
• Likert scale questions
measure latent variables, i.e.
hidden or unobservable
variables like:
– a person’s conscientiousness,
neuroticism or openness.
• Cronbach’s alpha tells us
how closely related a set of
test items are as a group when
measuring these difficult-
to-measure latent variables.
13
Rule of Thumb for Cronbach’s alpha

14
iv. Inter-rater consistency
• This method is used when there are two or
more raters:
– Calculation of correlation coefficient or,
– Calculation of percentage agreement b/n raters,
• Used for behavioral observations,

• Providing training & having the raters work
independently reduces error,
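A hedged sketch of the percentage-agreement calculation mentioned above. Cohen’s kappa is added as a chance-corrected companion; it is not named in the slides and is included here as an assumption about what a reader may want next. The ratings are hypothetical.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of observations on which the two raters gave the same code."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Agreement corrected for chance (Cohen's kappa); not named in the slides."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the raters' marginal proportions per code.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical codes assigned by two raters to the same 10 observations.
rater_a = ["yes", "yes", "no", "yes", "no", "no", "yes", "yes", "no", "yes"]
rater_b = ["yes", "no", "no", "yes", "no", "yes", "yes", "yes", "no", "yes"]

agreement = percent_agreement(rater_a, rater_b)
kappa = cohens_kappa(rater_a, rater_b)
print(f"Percentage agreement: {agreement:.0%}, kappa: {kappa:.2f}")
```

Kappa is lower than the raw agreement because part of the agreement would be expected by chance alone given how often each rater uses each code.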

15
Validity
• An instrument is valid when it measures what it
was intended to measure,
• A valid instrument is also accurate & reliable; but
not all reliable instruments are valid,
• Generally, there are four major categories of
Validity:
– Internal validity
– External validity
– Conclusion validity
– Construct validity

16
I. Internal Validity
• Internal validity is established when our
research demonstrates a causal relationship
between the study variables,
– e.g. Hyperlipidemia → Statin presc … why?
• It can also be defined as how well the results of the
study explain the actual population studied,
– Sample → study population → source/target
population,

17
II. External Validity
• External validity is concerned with whether a
study’s result can be generalized to other
relevant settings or groups,
– e.g. a study done on whites whose results are applied to blacks,

18
III. Conclusion Validity
• Conclusion validity is concerned with
whether the conclusions drawn are based
upon the results of the study,

19
IV. Construct Validity
• A construct is a group of inter-related variables
which cannot be observed directly, i.e. a latent
variable,
– e.g. intelligence, depression, aggression,
• Construct validity is the extent to which a
test/scale measures the construct it claims to
measure,
– E.g. Does the scale correlate with actual outcomes?
• It also covers the adequacy of the operational
definitions of study variables,

20
Construct Validity …
• Construct validity has different
sub-categories:
i. Face validity
ii. Content validity
iii. Criterion validity (Concurrent & predictive)
iv. Convergent
v. Discriminant

21
i. Face Validity
• This is not a true form of validity, but the first
requirement in designing instruments,
– Assessed through judgment or non-expert rating,
• It is achieved when an instrument looks as if
it measures what it was designed to
measure,
• It is useful because participants may not
respond accurately without face validity,

22
ii. Content Validity
• An instrument has content validity when its items
are a representative sample of the larger content
domain,
– Does the measure sufficiently cover the area it intends
to cover?
• Content validity is used when we want to know
whether a sample of items truly reflects an entire
universe of possible items,
• It is determined by subject matter experts &
associated with achievement tests,
• See example in the following slide:
23
Which of these has Content Validity?
[Figure: two panels, ‘Lacks content Validity’ and ‘Has content Validity’]

24
iii. Criterion Validity
• This is achieved when an instrument’s scores are
substantially related to a criterion/standard in the
real world,
• This validity has two types:
– Concurrent validity – the instrument is highly correlated
with a well-established instrument & the two measurements
occur at the same time,
– Predictive validity – the same as the concurrent, but
one measure is administered first & intended to predict
a criterion in the future,
• Scores on the measure predict behavior on a criterion
measured at a time in the future,
25
Example for Criterion Validity

26
iv. Convergent & Discriminant Validity
• Convergent – how well the instrument agrees
with well-established instruments for the same construct.
– Scores on the measure are related to other
measures of the same construct (high
correlation),
• Discriminant – how well the instrument differs
from established instruments for other constructs.
– Scores on the measure are not related to other
measures that are theoretically different (low
correlation),
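The contrast between the two can be sketched with plain correlations: a new scale should correlate highly with an established measure of the same construct and weakly with a measure of a different one. All three score lists and the helper `pearson_r` below are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores for the same 8 participants.
new_scale = [12, 18, 9, 22, 15, 7, 20, 14]                # instrument under study
same_construct = [14, 19, 10, 24, 15, 9, 21, 13]          # established instrument
other_construct = [165, 170, 168, 166, 172, 167, 169, 171]  # unrelated measure

convergent_r = pearson_r(new_scale, same_construct)
discriminant_r = pearson_r(new_scale, other_construct)
print(f"Convergent r = {convergent_r:.2f}, discriminant r = {discriminant_r:.2f}")
```

A high first value together with a low second value is the pattern that supports both convergent and discriminant validity.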

27
Relation b/n sample size &
Accuracy/Validity

28
Lesson from accuracy-sample plot
• Accuracy is 100% in the case of a census & the
pattern of accuracy growth is not linear,
• The accuracy of a sample equal to half of the
population size is not 50%; but very near to 100%,
• Good accuracy levels can be achieved at relatively
small sample sizes, if the samples are representative,
• The result of this relationship is that, beyond a certain
sample size, the gains in accuracy are negligible while
sampling costs increase significantly,
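The diminishing returns described above can be sketched numerically. The slides do not define how accuracy is measured, so this example assumes simple random sampling and uses the 95% margin of error for a proportion, with the finite population correction, as a stand-in. The population size and the function name `margin_of_error` are illustrative assumptions.

```python
import math


def margin_of_error(n, population, p=0.5, z=1.96):
    """95% margin of error for a proportion, with finite population correction."""
    standard_error = math.sqrt(p * (1 - p) / n)
    fpc = math.sqrt((population - n) / (population - 1))
    return z * standard_error * fpc


population = 10_000  # hypothetical population size
for n in (100, 400, 1_000, 5_000, 10_000):
    moe = margin_of_error(n, population)
    print(f"n = {n:>6}: margin of error = {moe * 100:.2f} percentage points")
```

Under these assumptions the margin of error shrinks quickly at small sample sizes, is already below one percentage point when half the population is sampled, and reaches zero for a census.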

29
Threats to Validity
• Systematic Errors:
– Bias
– Confounding
– Interaction
– Effect modification,
• Random Error:
– Chance,

NB: We need to remain vigilant about these threats as
we generate evidence,

30
Bias
• Broad classification of Bias:
1. Selection or sampling bias,
– Is simple random sampling always possible?
• How to compensate for the lost power?
– How to sample hard-to-reach population?
• Snow-ball, Respondent driven sampling etc.
– Design related selection biases
• E.g. survivor bias in survey etc.
2. Information or Measurement bias,
– Erroneous instrument, skill gap, respondent related etc.
– Desirability bias, researcher bias, etc.
• Once bias is introduced, little or nothing can be done to
control it afterwards,
31
Confounding
• Confusion of an association caused by a 3rd variable which:
– Affects both the exposure & outcome,
– Lies away from the causal pathway (Acyclic),
• Control at design stage:
– Restriction
– Randomization
– Matching
• Control at Analysis stage:
– Stratified analysis
– Adjusting (multivariable Analysis)
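As a hedged sketch of stratified analysis, the example below compares a crude odds ratio with a Mantel-Haenszel pooled odds ratio across strata of a third variable. The Mantel-Haenszel estimator is one standard choice and is not named in the slides; the counts are hypothetical and constructed so that the third variable is related to both exposure and outcome.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio of a 2x2 table.

    a, b = exposed cases / exposed non-cases
    c, d = unexposed cases / unexposed non-cases
    """
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio over a list of (a, b, c, d) strata."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Hypothetical counts, stratified by a third variable (e.g. age group).
strata = [
    (10, 90, 50, 450),   # stratum 1: stratum-specific OR = 1.0
    (160, 240, 40, 60),  # stratum 2: stratum-specific OR = 1.0
]

# Crude table: collapse the strata by summing each cell.
crude_or = odds_ratio(*[sum(cell) for cell in zip(*strata)])
adjusted_or = mantel_haenszel_or(strata)
print(f"Crude OR = {crude_or:.2f}, Mantel-Haenszel OR = {adjusted_or:.2f}")
```

In this constructed example the crude odds ratio suggests a strong association, while the stratum-specific and pooled estimates show none, which is the signature of confounding by the stratifying variable.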
32
Demo: How to assess reliability & Validity

• SPSS
• JAMOVI

• (Using COVID 19 Vaccine data)

33
Reliability Result in JAMOVI

34
Assignment
• Please, search for & download a published
paper of any analytical study design,
• Assess the article for threats to reliability &
validity,
• Prepare a short summary of your findings to
present in class for discussion,
• It has to be ready for next week’s
class,

35
Thank you !

36
