reliability

The document discusses the assumptions and concepts related to psychological testing and assessment, emphasizing the importance of reliability and validity in measuring psychological traits and states. It outlines various methods for estimating reliability, including test-retest, parallel forms, and internal consistency measures, along with sources of measurement error. Additionally, it highlights the significance of norms and the process of norming in interpreting individual test scores.
TESTS AND TESTING

I. SOME ASSUMPTIONS ABOUT PSYCHOLOGICAL TESTING AND ASSESSMENT
ASSUMPTION 1: PSYCHOLOGICAL TRAITS AND STATES EXIST
• TRAIT – defined as “any distinguishable, relatively enduring way in which one individual varies from another”
• STATES – also distinguish one person from another but are relatively less enduring
• CONSTRUCT – an informed, scientific concept developed or constructed to describe or explain behavior
ASSUMPTION 2: PSYCHOLOGICAL TRAITS AND STATES CAN BE QUANTIFIED AND MEASURED
ASSUMPTION 3: TEST-RELATED BEHAVIOR PREDICTS NON-TEST-RELATED BEHAVIOR
ASSUMPTION 4: TESTS AND OTHER MEASUREMENT TECHNIQUES HAVE STRENGTHS AND WEAKNESSES
ASSUMPTION 5: VARIOUS SOURCES OF ERROR ARE PART OF THE ASSESSMENT PROCESS
• ERROR VARIANCE – the component of a test score attributable to sources other than the trait or ability measured
ASSUMPTION 6: TESTING AND ASSESSMENT CAN BE CONDUCTED IN A FAIR AND UNBIASED MANNER
ASSUMPTION 7: TESTING AND ASSESSMENT BENEFIT SOCIETY
II. WHAT’S A “GOOD TEST”?
a. RELIABILITY
• involves the consistency of the measuring tool
b. VALIDITY
• the test measures what it purports to measure
III. NORMS
NORMS
• are the test performance data of a particular group of test takers that are designed for use as a reference when evaluating or interpreting individual test scores
NORMATIVE SAMPLE
• that group of people whose performance on a particular test is analyzed for reference in evaluating the performance of individual test takers
NORMING
• refers to the process of deriving norms

RELIABILITY
• a synonym for dependability or consistency
• refers to consistency in measurement
RELIABILITY COEFFICIENT
• an index of reliability, a proportion that indicates the ratio between the true score variance on a test and the total variance
I. THE CONCEPT OF RELIABILITY
VARIANCE (σ²)
• a statistic useful in describing sources of test score variability
o TRUE VARIANCE – variance from true differences
o ERROR VARIANCE – variance from irrelevant, random sources
MEASUREMENT ERROR
• refers to, collectively, all of the factors associated with the process of measuring some variable, other than the variable being measured
o RANDOM ERROR – a source of error in measuring a targeted variable caused by unpredictable fluctuations and inconsistencies of other variables in the measurement process
o SYSTEMATIC ERROR – refers to a source of error in measuring a variable that is typically constant or proportionate to what is presumed to be the true value of the variable being measured
SOURCES OF ERROR VARIANCE
1. TEST CONSTRUCTION
2. TEST ADMINISTRATION
3. TEST SCORING AND INTERPRETATION
II. RELIABILITY ESTIMATES
TEST-RETEST RELIABILITY
• to evaluate the stability of a measure
• an estimate of reliability obtained by correlating pairs of scores from the same people on two different administrations of the same test
PARALLEL-FORMS AND ALTERNATE-FORMS RELIABILITY ESTIMATES
a. PARALLEL FORMS
• exist when, for each form of the test, the means and the variances of observed test scores are equal
b. ALTERNATE FORMS
• to evaluate the relationship between different forms of a measure
• are simply different versions of a test that have been constructed to be parallel
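
Test-retest and alternate-forms estimates both rest on correlating pairs of scores from the same people. The sketch below shows one way this could be computed with a Pearson correlation. The function name `pearson_r` and the score lists are hypothetical illustrations and are not part of the original outline.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from each mean
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each list
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("scores in each list must vary")
    return cross / sqrt(ss_x * ss_y)


# Hypothetical scores for the same six people on two administrations
first_administration = [12, 15, 9, 20, 17, 11]
second_administration = [13, 14, 10, 19, 18, 10]

test_retest_r = pearson_r(first_administration, second_administration)
print(round(test_retest_r, 3))
```

For an alternate-forms estimate, the same function could be applied to scores from Form A and Form B in place of the two administrations.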
SPLIT-HALF RELIABILITY ESTIMATES
• obtained by correlating two pairs of scores obtained from equivalent halves of a single test administered once
OTHER METHODS OF ESTIMATING INTERNAL CONSISTENCY
a. INTER-ITEM CONSISTENCY
• to evaluate the extent to which items on a scale relate to one another
• refers to the degree of correlation among all the items on a scale
o HOMOGENEITY – the degree to which a test measures a single factor
o HETEROGENEITY – the degree to which a test measures different factors
b. KUDER-RICHARDSON FORMULA 20 (KR-20)
• the statistic of choice for determining the inter-item consistency of dichotomous items, primarily those items that can be scored right or wrong (such as multiple-choice items)
c. COEFFICIENT ALPHA
• appropriate for use on tests containing non-dichotomous items
• typically ranges in value from 0 to 1
d. AVERAGE PROPORTIONAL DISTANCE (APD)
• a measure that focuses on the degree of difference that exists between item scores
MEASURES OF INTER-SCORER RELIABILITY
INTER-SCORER RELIABILITY
• to evaluate the level of agreement between raters on a measure
• the degree of agreement or consistency between two or more scorers (or judges or raters) regarding a particular measure
III. USING AND INTERPRETING A COEFFICIENT OF RELIABILITY
THREE (3) APPROACHES TO THE ESTIMATION OF RELIABILITY
1. TEST-RETEST
2. ALTERNATE OR PARALLEL FORMS
3. INTERNAL OR INTER-ITEM CONSISTENCY
THE NATURE OF THE TEST
1. HOMOGENEITY VERSUS HETEROGENEITY OF TEST ITEMS
2. DYNAMIC VERSUS STATIC CHARACTERISTICS
3. RESTRICTION OR INFLATION OF RANGE
4. SPEED TESTS VERSUS POWER TESTS
5. CRITERION-REFERENCED TESTS
THE TRUE SCORE MODEL OF MEASUREMENT AND ALTERNATIVES TO IT
TRUE SCORE
• a value that genuinely reflects an individual’s ability (or trait) level as measured by a particular test
DICHOTOMOUS TEST ITEMS
• test items or questions that can be answered with only one of two alternative responses, such as true–false, yes–no, or correct–incorrect questions
POLYTOMOUS TEST ITEMS
• test items or questions with three or more alternative responses
IV. RELIABILITY AND INDIVIDUAL SCORES
STANDARD ERROR OF MEASUREMENT
• often abbreviated as SEM or SE_M
• the tool used to estimate or infer the extent to which an observed score deviates from a true score
STANDARD ERROR OF THE DIFFERENCE
• a statistical measure that can aid a test user in determining how large a difference should be before it is considered statistically significant
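
As a rough illustration of how a reliability coefficient feeds into the interpretation of individual scores, the sketch below computes coefficient alpha from item scores, then a standard error of measurement and a standard error of the difference. The formulas used are the common textbook forms, which this outline does not state: alpha = k/(k − 1) × (1 − Σ item variances / total variance), SEM = SD × √(1 − r), and SED = √(SEM₁² + SEM₂²). All function names and data are hypothetical, and population variances are an assumption made for simplicity.

```python
from math import sqrt
from statistics import pvariance, pstdev


def coefficient_alpha(item_scores):
    """Coefficient alpha for rows of item scores (one row per test taker)."""
    k = len(item_scores[0])  # number of items
    if k < 2:
        raise ValueError("alpha needs at least two items")
    columns = list(zip(*item_scores))  # one tuple of scores per item
    sum_item_variances = sum(pvariance(col) for col in columns)
    total_scores = [sum(row) for row in item_scores]
    total_variance = pvariance(total_scores)
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - r), with r the reliability coefficient."""
    return sd * sqrt(1 - reliability)


def standard_error_of_difference(sem_1, sem_2):
    """SED = sqrt(SEM_1^2 + SEM_2^2) for two scores on the same scale."""
    return sqrt(sem_1 ** 2 + sem_2 ** 2)


# Hypothetical responses: 5 test takers, 4 items rated 1 to 5
responses = [
    [4, 5, 4, 5],
    [2, 3, 2, 2],
    [3, 3, 4, 3],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]

alpha = coefficient_alpha(responses)
total_sd = pstdev([sum(row) for row in responses])
sem = standard_error_of_measurement(total_sd, alpha)
# Comparing two scores from the same test, so both SEMs are equal
sed = standard_error_of_difference(sem, sem)
print(round(alpha, 3), round(sem, 3), round(sed, 3))
```

When every item is scored 0 or 1 and population variances are used, the value returned by `coefficient_alpha` coincides with KR-20, since the variance of a dichotomous item is the proportion passing times the proportion failing.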
