Validity and Reliability

The document discusses validity and reliability in testing. Validity refers to how well a test measures what it claims to measure and there are different types of validity evidence including content validity, criterion-related validity, and construct validity. Reliability refers to the consistency of test scores and is important for a test to be trusted. Tests should be validated before use and specifications written to improve validity.

Uploaded by

buitunglam97
Copyright
© All Rights Reserved

Validity and reliability

I. Validity
1. Definitions
- Valid (adj): based on truth or reason; able to be accepted.
- Validity (n): the extent to which a test accurately measures what it is intended to measure.
- Terminology: validity vs. construct validity
o In recent years, ‘construct validity’ has increasingly been used to refer to the general notion of validity.
o Construct validity (n): the meaningfulness and appropriateness of the interpretations we make on the basis of test scores.
- Face validity (n): subjective validity; the test seems reasonable to lay people.
o Still important because it meets the stakeholders’ social purposes (the social consequences of the test).
2. Evidence
a. Content validity
- The test content must be a representative sample of the language skills, structures, etc. that the test is meant to cover.
- A specification of these skills or structures must be drawn up at an early stage of test construction.
- Example: an intermediate-level grammar achievement test should be made up of items that test knowledge or control of intermediate-level grammar, so it should not contain items aimed at advanced learners.
b. Criterion-related validity (comparison with an external criterion)
- Test results must agree with the results of an independent and highly dependable assessment of the candidate’s ability.
- Concurrent validity: established when the test and the criterion are administered at about the same time.
- Predictive validity: the degree to which a test can predict a candidate’s future performance.
c. How is the level of agreement measured?
- Correlation coefficient (validity coefficient): a statistical measure of how closely two sets of scores agree.
o Perfect agreement: 1 (perfectly valid with respect to the criterion)
o No agreement: 0 (invalid)
- Scoring validity
o A reading test with short-answer questions is meant to measure reading ability. If the scoring also takes spelling and grammar into account, the scores may not be valid.
o In a writing test, if the scoring puts too much emphasis on mechanical features (e.g. spelling and punctuation), the scoring may be invalid, and so the test may be invalid.
- Face validity
o The test looks as if it measures what it is supposed to measure.
o Indirect testing should be introduced slowly, carefully, and with good reason, since indirect tests may not look valid to candidates.
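As a rough illustration of the validity coefficient described under "How is level of agreement measured?", the sketch below computes a Pearson correlation between scores on a new test and scores on an independent criterion measure. The function name pearson_r and both score lists are invented for this example and do not come from the notes; Pearson's r is assumed here as the correlation measure, although other coefficients are sometimes used.

```python
import math


def pearson_r(test_scores, criterion_scores):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(test_scores)
    if n != len(criterion_scores) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(test_scores) / n
    mean_y = sum(criterion_scores) / n
    # Deviations of each score from its own mean.
    dx = [x - mean_x for x in test_scores]
    dy = [y - mean_y for y in criterion_scores]
    covariation = sum(a * b for a, b in zip(dx, dy))
    spread_x = sum(a * a for a in dx)
    spread_y = sum(b * b for b in dy)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return covariation / math.sqrt(spread_x * spread_y)


# Hypothetical scores for five candidates (illustrative data only).
new_test = [52, 61, 70, 75, 88]    # scores on the test being validated
criterion = [50, 65, 68, 80, 90]   # scores from the independent assessment

validity_coefficient = pearson_r(new_test, criterion)
print(f"validity coefficient: {validity_coefficient:.2f}")
```

A value close to 1 would suggest that the new test ranks candidates much as the criterion does. With so few candidates the figure is only indicative, and a correlation can in principle also be negative.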
3. How to make tests more valid
- Validate the test before it is put into operation.
- For teacher-made tests, full validation is usually not feasible, but the following steps still help.
- Write test specifications.
- Use direct testing where possible.
- Make sure the scoring relates directly to what is being tested.
- Ensure reliability.
II. Reliability
- Definition: the consistency with which a test measures individuals, usually expressed as a reliability coefficient. A reliable test gives consistent results that can be trusted over time.
o Perfectly reliable: 1
o Unreliable: 0
- Without reliability, there is no way to trust your results.
o Validity: measuring what you intend to measure
o Reliability: measuring it consistently
- The standard error of measurement (SEM) estimates how repeated measurements of a person on the same instrument tend to be distributed around his or her true score. The true score is always unknown.
- The SEM is based on the reliability coefficient and a measure of the spread of all the scores on the test (the standard deviation).
- => The SEM gives a way to estimate the range within which a person’s true score is likely to lie.
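
The relationship between the SEM, the reliability coefficient, and the spread of scores is commonly written as SEM = SD x sqrt(1 - reliability). The sketch below applies that formula. The function names, the score list, and the reliability value of 0.91 are all invented for illustration, and the population standard deviation (statistics.pstdev) is an assumption; some textbooks use the sample standard deviation instead.

```python
import math
import statistics


def standard_error_of_measurement(scores, reliability):
    """SEM = SD * sqrt(1 - reliability), using the population SD of the scores."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability coefficient must lie between 0 and 1")
    spread = statistics.pstdev(scores)  # assumption: population SD
    return spread * math.sqrt(1 - reliability)


def true_score_band(observed_score, sem, width=1):
    """Range of +/- width SEMs around an observed score."""
    return (observed_score - width * sem, observed_score + width * sem)


# Hypothetical class scores and reliability coefficient (illustrative only).
class_scores = [45, 52, 58, 60, 63, 67, 70, 74, 81, 90]
reliability = 0.91

sem = standard_error_of_measurement(class_scores, reliability)
low, high = true_score_band(observed_score=67, sem=sem)
print(f"SEM: {sem:.1f}; true score likely between {low:.1f} and {high:.1f}")
```

Under the usual assumption of normally distributed measurement error, a band of one SEM either side of the observed score is expected to contain the true score roughly two times out of three. A perfectly reliable test (coefficient of 1) would give an SEM of 0.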
