EDUC3 Module 3

This document compares different types of tests used to measure student knowledge and understanding. It outlines the main purposes, scopes of content, interpretations, and other key aspects of psychological, educational, survey, mastery, norm-referenced, and criterion-referenced tests. It also examines the construction, administration, scoring, and effects of biases for standardized, informal, individual, and group tests. Finally, it categorizes test formats as selective tests like multiple choice, or supply tests like short answer, and essays like restricted response or extended response.

Uploaded by

Dahlia Galimba

MODULE 3 – DEVELOPMENT OF CLASSROOM TOOLS FOR MEASURING KNOWLEDGE AND UNDERSTANDING

DIFFERENT TYPES OF TESTS

MAIN POINTS FOR COMPARISON AND TYPES OF TEST

Purpose: Psychological vs. Educational

Psychological
- Aims to measure students’ intelligence or mental ability in a large degree without reference to what the student has learned
- Measures the intangible characteristics of an individual (e.g. Aptitude Tests, Personality Tests, Intelligence Tests)
Educational
- Aims to measure the result of instruction and learning (e.g. Performance Tests)

Scope of Content: Survey vs. Mastery

Survey
- Covers a broad range of objectives
- Measures general achievement in certain subjects
- Constructed by a trained professional
Mastery
- Covers a specific objective
- Measures fundamental skills and abilities
- Typically constructed by the teacher

Interpretation: Norm-Referenced vs. Criterion-Referenced

Norm-Referenced
- Result is interpreted by comparing one student’s performance with other students’ performance
- Some will really pass
- There is competition for a limited percentage of high scores
- Describes pupil’s performance compared to others
Criterion-Referenced
- Result is interpreted by comparing student’s performance with a predefined standard
- All or none may pass
- There is no competition for a limited percentage of high scores
- Describes pupil’s mastery of course objectives

Language Mode: Verbal vs. Non-verbal

Verbal
- Words are used by students in attaching meaning to or responding to test items
Non-verbal
- Students do not use words in attaching meaning to or in responding to test items (e.g. graphs, numbers, 3-D subjects)

Construction: Standardized vs. Informal

Standardized
- Constructed by a professional item writer
- Covers a broad range of content covered in a subject area
- Uses mainly multiple choice
- Items written are screened and the best items are chosen for the final instrument
- Can be scored by a machine
- Interpretation of results is usually norm-referenced
Informal
- Constructed by a classroom teacher
- Covers a narrow range of content
- Various types of items are used
- Teacher picks or writes items as needed for the test
- Scored manually by the teacher
- Interpretation is usually criterion-referenced

Manner of Administration: Individual vs. Group

Individual
- Mostly given orally or requires actual demonstration of skill
- One-on-one situations, thus many opportunities for clinical observation
- Chance to follow up the examinee’s response in order to clarify or comprehend it more clearly
Group
- This is a paper-and-pen test
- Loss of rapport, insight and knowledge about each examinee
- Same amount of time needed to gather information from one student

Effect of Biases: Objective vs. Subjective

Objective
- Scorer’s personal judgement does not affect the scoring
- Worded so that only one answer is acceptable
- Little or no disagreement on what is the correct answer
Subjective
- Affected by scorer’s personal opinions, biases and judgement
- Several answers are possible
- Possible disagreement on what is the correct answer

Time Limit and Level of Difficulty: Power vs. Speed

Power
- Consists of a series of items arranged in ascending order of difficulty
- Measures student’s ability to answer more and more difficult items
Speed
- Consists of items approximately equal in difficulty
- Measures student’s speed or rate and accuracy in responding

Format: Selective vs. Supply

Selective
- There are choices for the answer
- Multiple Choice, True or False, Matching Type
- Can be answered quickly
- Prone to guessing
- Time consuming to construct
Supply
- There are no choices for the answer
- Short Answer, Completion, Restricted or Extended Essay
- May require a longer time to answer
- Less chance of guessing but prone to bluffing
- Time consuming to answer and score
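
The difference between norm-referenced and criterion-referenced interpretation can be illustrated with a short Python sketch. The class scores, the cutoff of 24, and the function names are hypothetical and chosen only for illustration; the percentile-rank definition used here (percentage of the group scoring below the student) is one common convention, not the only one.

```python
from bisect import bisect_left


def percentile_rank(score, group_scores):
    """Norm-referenced view: percentage of the group scoring below `score`."""
    ordered = sorted(group_scores)
    below = bisect_left(ordered, score)  # number of scores strictly lower
    return 100.0 * below / len(ordered)


def meets_standard(score, cutoff):
    """Criterion-referenced view: compare the score with a predefined standard."""
    return score >= cutoff


class_scores = [12, 15, 18, 18, 20, 22, 25, 27, 28, 30]  # hypothetical raw scores
CUTOFF = 24  # hypothetical mastery standard

student_score = 22
print(percentile_rank(student_score, class_scores))  # 50.0
print(meets_standard(student_score, CUTOFF))         # False
```

The same raw score of 22 sits in the middle of this group yet falls short of the standard. Under the criterion-referenced view every student could pass (or none), because each score is judged against the cutoff rather than against classmates.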

TYPES OF TESTS ACCORDING TO FORMAT

1. Selective Type – provides choices for the answer
a. Multiple Choice – consists of a stem, which describes the problem, and 3 or more alternatives, which give the suggested solutions. The incorrect alternatives are the distractors.
b. True-False or Alternative Response – consists of a declarative statement that one has to mark true or false, right or wrong, correct or incorrect, yes or no, fact or opinion, and the like.
c. Matching Type – consists of two parallel columns: Column A, the column of premises for which a match is sought; Column B, the column of responses from which the selection is made.
2. Supply Test
a. Short Answer – uses a direct question that can be answered by a word, a phrase, a number, or a symbol
b. Completion Test – consists of an incomplete statement
3. Essay Test
a. Restricted Response – limits the content of the response by restricting the scope of the topic
b. Extended Response – allows the students to select any factual information that they think is pertinent and to organize their answers in accordance with their best judgement
Projective Test
- A psychological test that uses images in order to evoke responses from a subject and reveal hidden aspects of the subject’s mental life.
- Projective tests were developed in an attempt to eliminate some of the major problems inherent in the use of self-report measures, such as the tendency of some respondents to give “socially desirable” responses.

Important Projective Techniques

1. Word Association Test – An individual is given a clue or hint and asked to respond with the first thing that comes to mind.
2. Completion Test – The respondents are asked to complete an incomplete sentence or story. The completion will reflect their attitude and state of mind.
3. Construction Techniques (Thematic Apperception Test) – This is more or less like the completion test. The respondents are given a picture and asked to write a story about it. The initial structure is limited and not detailed like the completion test. For example, two cartoons are given and a dialogue is to be written.
4. Expression Techniques – The respondents are asked to express the feelings or attitudes of other people.
GUIDELINES FOR CONSTRUCTING TEST ITEMS

When to Use Essay Tests
Essays are appropriate when:
1. the group to be tested is SMALL and the test is NOT TO BE USED again;
2. you wish to encourage and reward the development of students’ SKILL IN WRITING;
3. you are more interested in exploring the students’ ATTITUDES than in measuring their academic achievement;
4. you are more confident of your ability as a critical and fair reader than as an imaginative writer of good objective test items.

When to Use Objective Test Items

Objective test items are especially appropriate when:


1. The group to be tested is LARGE and the test may be REUSED;
2. HIGHLY RELIABLE TEST SCORES must be obtained as efficiently as possible;
3. IMPARTIALITY of evaluation, ABSOLUTE FAIRNESS, and FREEDOM from possible test SCORING INFLUENCES (fatigue, lack of anonymity) are essential;
4. You are more confident of your ability to express objective test items clearly than of your ability to judge essay test answers correctly;
5. There is more PRESSURE FOR SPEEDY REPORTING OF SCORES than for speedy test preparation.

Multiple Choice Items

A multiple choice item consists of:
1. Stem – which identifies the question or problem
2. Response alternatives or options
3. Correct answer
Example:
Which of the following is a chemical change? (STEM)
a. Evaporation of alcohol
b. Freezing of water
c. Burning of oil
d. Melting of wax
(ALTERNATIVES)
Advantage of Using Multiple Choice Items
Multiple choice items can provide:
1. Versatility in measuring all levels of cognitive ability
2. Highly reliable test scores
3. Scoring efficiency and accuracy
4. Objective measurement of student achievement or ability
5. A wide sampling of content or objectives
6. A reduced guessing factor when compared to true-false items
7. Different response alternatives which can provide diagnostic feedback.

Limitations of Multiple Choice Items


Multiple choice items:
1. Are difficult and time consuming to construct
2. Can lead a teacher to favour simple recall of facts
3. Place a high degree of dependence on the student’s reading ability and the teacher’s writing ability
SUGGESTIONS FOR WRITING MULTIPLE CHOICE ITEMS
1. When possible, state the stem as a direct question rather than as an incomplete statement.
Poor: Alloys are ordinarily produced by…
Better: How are alloys ordinarily produced?
2. Present a definite, explicit and singular question or problem in the stem.
Poor: Psychology…
Better: The science of mind and behaviour is called…
3. Eliminate excessive verbiage or irrelevant information from the stem.
Poor: While ironing her formal polo shirt, June burned her hand accidentally on the hot iron. This was due to a heat transfer because…
Better: Which of the following ways of heat transfer explains why June’s hand was burned after she touched a hot iron?
4. Include in the stem any word(s) that might otherwise be repeated in each alternative.
Poor:
In national elections in the US, the President is officially
a. chosen by the people
b. chosen by the electoral college
c. chosen by members of the Congress
d. chosen by the House of Representatives
Better:
In national elections in the US, the President is officially chosen by
a. the people
b. the electoral college
c. members of the Congress
d. the House of Representatives
5. Use negatively stated questions sparingly. When used, underline and/or capitalize the negative word.
Poor: Which of the following is not cited as an accomplishment of the Arroyo administration?
Better: Which of the following is NOT cited as an accomplishment of the Arroyo administration?

6. Make all alternatives plausible and attractive to the less knowledgeable or skillful student.
What process is most nearly opposite of photosynthesis?
Poor:
a. Digestion
b. Relaxation
c. Respiration
d. Exertion
Better:
a. Digestion
b. Assimilation
c. Respiration
d. Catabolism
7. Make the alternatives grammatically parallel with each other and consistent with the stem.
Poor: What would advance the application of atomic discoveries to medicine?
a. Standardized techniques for treatment of patients
b. Train the average doctor to apply the radioactive treatments
c. Remove the restriction on the use of radioactive substances
d. Establishing hospitals staffed by highly trained radioactive therapy specialists
Better: What would advance the application of atomic discoveries to medicine?
a. Development of standardized techniques for treatment of patients
b. Removal of restrictions on the use of radioactive substances
c. Addition of trained radioactive therapy specialists to hospital staffs
d. Training of the average doctor in the application of radioactive treatments
8. Make the alternatives mutually exclusive.
Poor: The daily minimum required amount of milk that a 10-year-old child should drink is
a. 1-2 glasses
b. 2-3 glasses*
c. 3-4 glasses*
d. At least 4 glasses
Better: What is the daily minimum required amount of milk a 10-year-old child should drink?
a. 1 glass
b. 2 glasses
c. 3 glasses
d. 4 glasses
9. When possible, present alternatives in some logical order (chronological, most to least, alphabetical).
At 7 a.m. two trucks leave a diner and travel north. One truck averages 42 miles per hour and the other truck averages 38 miles per hour. At what time will they be 24 miles apart?
Undesirable:
a. 6 p.m.
b. 9 a.m.
c. 1 a.m.
d. 1 p.m.
e. 6 a.m.
Desirable:
a. 1 a.m.
b. 6 a.m.
c. 9 a.m.
d. 1 p.m.
e. 6 p.m.
10. Be sure there is only one correct or best response to the item.
Poor: The two most desired characteristics in a classroom test are validity and
a. Precision
b. Reliability*
c. Objectivity
d. Consistency*
Better: The two most desired characteristics in a classroom test are validity and
a. Precision
b. Reliability*
c. Objectivity
d. Standardization
11. Make the alternatives approximately equal in length.
Poor: The most general cause of low individual incomes in the US is
a. Lack of valuable productive services to sell*
b. Unwillingness to work
c. Automation
d. Inflation
Better: What is the most general cause of low individual incomes in the US?
a. A lack of valuable productive services to sell*
b. The population’s overall unwillingness to work
c. The nation’s increased reliance on automation
d. An increasing national level of inflation
12. Avoid irrelevant clues, such as grammatical structure, well-known verbal associations or connections between stem and answer.
Poor: (grammatical clue) A chain of islands is called an
a. Archipelago
b. Peninsula
c. Continent
d. Isthmus
Poor: (verbal association) The reliability of a test can be estimated by a coefficient of
a. Measurement
b. Correlation*
c. Testing
d. Error
Poor: (connection between stem and answer) The height to which a water dam is built depends on
a. The length of the reservoir behind the dam.
b. The volume of the water behind the dam.
c. The height of water behind the dam.*
d. The strength of the reinforcing wall.
13. Use at least four alternatives for each item to lower the probability of getting the item correct by guessing.
14. Randomly distribute the correct responses among the alternative positions throughout the test, so that alternatives a, b, c, d and e each appear as the correct response in approximately the same proportion.
15. Use the alternatives NONE OF THE ABOVE and ALL OF THE ABOVE sparingly. When used, such alternatives should occasionally be the correct response.
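
Suggestion 14 can be sketched in Python as follows. This is a minimal illustration, not a prescribed procedure: the function name `balanced_answer_key` and its parameters are hypothetical. It builds a pool in which each option letter appears about equally often and then shuffles the pool, so the keyed positions are both random and balanced.

```python
import random
from collections import Counter


def balanced_answer_key(num_items, options="abcd", seed=None):
    """Return a random list of keyed positions with near-equal use of each option."""
    rng = random.Random(seed)
    # Repeat the option letters enough times to cover all items, then trim.
    repeats = -(-num_items // len(options))  # ceiling division
    positions = (list(options) * repeats)[:num_items]
    rng.shuffle(positions)  # randomize which item gets which keyed position
    return positions


key = balanced_answer_key(20, options="abcd", seed=1)
print(key)
print(Counter(key))  # each of a, b, c and d is the keyed position 5 times
```

When the alternatives of an item must follow a logical order (suggestion 9), the keyed position of that item is fixed by the ordering, so a plan like this applies only to items whose alternatives can be rearranged freely.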
True-False Test Items
True-false test items are typically used to measure the ability to identify whether or not statements of fact are correct. The basic format is simply a declarative statement that the student must judge as true or false. There are modifications of the basic form in which the student must respond “yes” or “no”, “agree” or “disagree.”

Three Forms:
1. Simple – consists of only two choices
2. Complex – consists of more than two choices
3. Compound – two choices plus a conditional completion response
Examples:
Simple: The acquisition of morality is a developmental process. True / False
Complex: The acquisition of morality is a developmental process. True / False / Opinion
Compound: The acquisition of morality is a developmental process. True / False
If the statement is false, what makes it false?
Advantages of True-False Items
True-false items can provide:
1. The widest sampling of content or objectives per unit of testing time
2. Scoring efficiency and accuracy
3. Versatility in measuring all levels of cognitive ability
4. Highly reliable test scores; and
5. An objective measurement of student achievement or ability.
Limitations of True-False Items
True-false items:
1. Incorporate an extremely high guessing factor.
2. Can often lead the teacher to write ambiguous statements due to the difficulty of writing statements which are unequivocally true or false.
3. Do not discriminate between students of varying ability as well as other item types do.
4. Can often include more irrelevant clues than do other item types.
5. Can often lead a teacher to favour testing of trivial knowledge.
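
The size of the guessing factor can be made concrete with a short Python sketch. It assumes blind guessing, where every option is equally likely to be chosen and items are independent; the function names, the 40-item and 10-item tests, and the score of 7 are hypothetical.

```python
from math import comb


def chance_score(num_items, num_options):
    """Expected number of correct answers from blind guessing."""
    return num_items / num_options


def prob_at_least(num_items, min_correct, p_correct):
    """Binomial probability of at least `min_correct` right answers by guessing."""
    return sum(
        comb(num_items, k) * p_correct**k * (1 - p_correct) ** (num_items - k)
        for k in range(min_correct, num_items + 1)
    )


print(chance_score(40, 2))  # 20.0 expected correct on 40 true-false items
print(chance_score(40, 4))  # 10.0 expected correct on 40 four-option items

# Chance of scoring 7 or more out of 10 by guessing alone
print(prob_at_least(10, 7, 1 / 2))  # true-false: 0.171875
print(prob_at_least(10, 7, 1 / 4))  # four options: much smaller
```

Under these assumptions a student who guesses on every true-false item is expected to get half of them right, which is why true-false tests usually need more items than multiple choice tests to yield comparably dependable scores.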
Suggestions for Writing True-False Items (Payne, 1984)
1. Base true-false items upon statements that are absolutely true or false, without qualifications or exceptions.
Poor: Nearsightedness is hereditary in origin.
Better: Geneticists and eye specialists believe that the predisposition to nearsightedness is hereditary.

2. Express the item statement as simply and as clearly as possible.
Poor: When you see a highway with a marker that reads “Interstate 80,” you know that the construction and upkeep of that road is built and maintained by the local and national government.
Better: The construction and maintenance of the interstate highways are provided by both local and national government.

3. Express a single idea in each test item.
Poor: Water will boil at a higher temperature if the atmospheric pressure on its surface is increased and more heat is applied to the container.
Better: Water will boil at a higher temperature if the atmospheric pressure on its surface is increased; or water will boil at a higher temperature if more heat is applied to the container.
4. Include enough background information and qualifications so that the ability to respond correctly to the item does not depend on some special, uncommon knowledge.
Poor: The second principle of education is that the individual gathers knowledge.
Better: According to John Dewey, the second principle of education is that the individual gathers knowledge.
5. Avoid lifting statements directly from the text, lecture or other materials so that memory alone will not permit a correct answer.
Poor: For every action there is an equal and opposite reaction.
Better: If you were to stand in a canoe and throw a life jacket forward to another canoe, chances are your canoe would jerk backward.

6. Avoid using negatively stated item statements.
Poor: The Supreme Court is not composed of nine justices.
Better: The Supreme Court is composed of nine justices.
7. Avoid the use of unfamiliar vocabulary.
Poor: According to some politicians, the raison d’etre for capital punishment is retribution.
Better: According to some politicians, justification for the existence of capital punishment is retribution.
8. Avoid the use of specific determiners, which would permit a test-wise but unprepared examinee to respond correctly. Specific determiners refer to sweeping terms like always, all, none, never, impossible, inevitable. Statements including such terms are likely to be false. On the other hand, statements using qualifying determiners such as usually, sometimes, often, are likely to be true. When statements require specific determiners, make sure they appear in both true and false items.
Poor: All sessions of Congress are called by the President. (F)
The Supreme Court is frequently required to rule on the constitutionality of the law. (T)
The objective test is generally easier to score than an essay test. (T)
Better: When specific determiners are used, reverse the expected outcomes.
The sum of angles of a triangle is always 180 degrees. (T)
Each molecule of a given compound is chemically the same as every other molecule of that compound. (T)
The galvanometer is the instrument usually used for the metering of electrical energy use in a home. (F)

9. False items tend to discriminate more highly than true items. Therefore, use more false items than true items (but not more than 15% additional false items).
Matching Test Items
In general, matching items consist of a column of stimuli presented on the left side of the exam page and a column of responses placed on the right side of the page. Students are required to match the response associated with a given stimulus.
Advantages of Using Matching Test Items
Matching test items:
1. Require short periods of reading and response time, allowing the teacher to cover more content.
2. Provide objective measurement of student achievement or ability.
3. Provide highly reliable test scores.
4. Provide scoring efficiency and accuracy.
Disadvantages of Using Matching Test Items
Matching test items:
1. Have difficulty measuring learning objectives requiring more than simple recall of information.
2. Are difficult to construct due to the problem of selecting a common set of stimuli and responses.
Suggestions for Writing Matching Test Items
1. Include directions which clearly state the basis for matching the stimuli with the responses. Explain whether or not a response can be used more than once and indicate where to write the answer.
Poor: Directions: Match the following.
Better: Directions: On the line to the left of each identifying location and characteristic in Column I, write the letter of the country in Column II that is best defined. Each country in Column II may be used more than once.

2. Use only homogeneous material in matching items.
Poor: Directions: Match the following.
1. Water                                          A. NaCl
2. Discovered Radium                              B. Fermi
3. Salt                                           C. NH3
4. Year of the First Nuclear Fission by man       D. 1942
5. Ammonia                                        E. Curie
Better: Directions: On the line to the left of each compound in Column I, write the letter of the compound’s formula presented in Column II. Use each formula once.
Column I              Column II
1. Water              A. H2SO4
2. Salt               B. HCl
3. Ammonia            C. NaCl
4. Sulfuric Acid      D. H2O
                      E. H2HCl
3. Arrange the list of responses in some systematic order if possible – chronological, alphabetical.
Directions: On the line to the left of each definition in Column I, write the letter of the defense mechanism in Column II that is described. Use each defense mechanism only once.
Column I
1. Hunting for reasons to support one’s beliefs
2. Accepting the values and norms of others as one’s own even if they are contrary to previously held values
3. Attributing to others one’s own unacceptable impulses, thoughts and desires
4. Ignoring disagreeable situations, thoughts and desires
Column II (Undesirable order)
A. Rationalization
B. Identification
C. Projection
D. Introjection
E. Denial of Reality
Column II (Desirable order)
A. Denial of Reality
B. Identification
C. Introjection
D. Projection
E. Rationalization

4. Avoid grammatical or other clues to the correct response.
Poor: Directions: Match the following in order to complete the sentences on the left.
1. Igneous rocks are formed            A. a hardness of 7
2. The formation of coal requires      B. with crystalline rock
3. A geode is filled                   C. a metamorphic rock
4. Feldspar is classified as           D. through the solid formation of molten
Better: Avoid sentence completion due to grammatical clues.

Note:
1. Keep matching items brief, limiting the list of stimuli to under 10.
2. Include more responses than stimuli to help prevent answering through the process of elimination.
3. When possible, reduce the amount of reading time by including only short phrases or single words in the response list.
Completion Test Items
Completion items require the student to answer a question or to finish an incomplete statement by filling in a blank with the correct word or phrase.

Example:
According to Freud, personality is made up of three major systems, the ____, the ____, and the ____.
Advantages of Using Completion Items
Completion items can:
1. Provide a wide sampling of content;
2. Efficiently measure lower levels of cognitive ability;
3. Minimize guessing as compared to multiple choice or true-false items; and
4. Usually provide an objective measure of student achievement or ability.
Limitations of Using Completion Items
Completion items:
1. Are difficult to construct so that the desired response is clearly indicated;
2. Have difficulty in measuring learning objectives requiring more than simple recall of information;
3. Can often include more irrelevant clues than do other item types;
4. Are more time consuming to score when compared to multiple choice or true-false items; and
5. Are more difficult to score since more than one answer may have to be considered correct if the item was not properly prepared.
Suggestions for Writing Completion Test Items
1. Omit only significant words from the statement.
Poor: Every atom has a central ____ (core) called a nucleus.
Better: Every atom has a central core called a(n) ____ (nucleus).
2. Do not omit so many words from the statement that the intended meaning is lost.
Poor: The ____ were to Egypt as the ____ were to Persia as ____ were to the early tribes of Israel.
Better: The Pharaohs were to Egypt as the ____ were to Persia as ____ were to the early tribes of Israel.

3. Avoid grammatical or other clues to the correct response.
Poor: Most of the United States’ libraries are organized according to the ____ (Dewey) decimal system.
Better: Which organizational system is used by most of the United States’ libraries? ____ (Dewey Decimal)
4. Be sure there is only one correct response.
Poor: Trees which shed their leaves annually are ____ (seed-bearing, common).
Better: Trees which shed their leaves annually are called ____ (deciduous).
5. Make the blanks of equal length.
Poor: In Roman mythology, Vulcan was the son of ____ and ________ (Jupiter and Juno).
Better: In Roman mythology, Vulcan was the son of ______ and ______.
6. When possible, delete words at the end of the statement after the student has been presented a clearly defined problem.
Poor: ____ (122.5) is the molecular weight of KClO3.
Better: The molecular weight of KClO3 is ____.
7. Avoid lifting statements directly from the text, lecture or other sources.
8. Limit the required response to a single word or phrase.

Essay Test Items

A classroom essay test consists of a small number of questions to which the student is expected to respond, demonstrating his/her ability to:
a. Recall factual knowledge;
b. Organize this knowledge; and
c. Present the knowledge in a logical, integrated answer to the question.
Classification of Essay Test:
1. Extended-response essay item
2. Limited-response or short-answer essay item
Example of Extended-Response Essay Item:
Explain the difference between the S-R (Stimulus-Response) and the S-O-R (Stimulus-Organism-Response) theories of personality. Include in your answer the following:
a. Brief description of both theories
b. Supporters of both theories
c. Research methods used to study each of the two theories (20 pts)
Example of Short-Answer Essay Item:
Identify research methods used to study the S-R (Stimulus-Response) and the S-O-R (Stimulus-Organism-Response) theories of personality. (10 pts)

Advantages of Using Essay Items

Essay items:
1. Are easier and less time consuming to construct than most item types;
2. Provide a means for testing students’ ability to compose an answer and present it in a logical manner; and
3. Can efficiently measure higher order cognitive objectives – analysis, synthesis, evaluation.
Limitations of Using Essay Items
Essay items:
1. Cannot measure a large amount of content or objectives;
2. Generally provide low test and test-scorer reliability;
3. Require an extensive amount of the instructor’s time to read and grade; and
4. Generally do not provide an objective measure of student achievement or ability (subject to bias on the part of the grader).
Suggestions for Writing the Essay Test Items
1. Prepare essay items that elicit the type of behaviour you want to measure.
Learning Objective: The student will be able to explain how the normal curve serves as a statistical model.
Poor: Describe a normal curve in terms of symmetry, modality, kurtosis and skewness.
Better: Briefly explain how the normal curve serves as a statistical model for estimation and hypothesis testing.
2. Phrase each item so that the student’s task is clearly indicated.
Poor: Discuss the economic factors which led to the stock market crash of 2008.
Better: Identify the three economic conditions which led to the stock market crash of 2008. Discuss briefly each condition in correct chronological sequence and in one paragraph indicate how the three factors were interrelated.

3. Indicate for each item a point value or weight and an estimated time limit for answering.
Poor: Compare the writings of Bret Harte and Mark Twain in terms of setting, depth of characterization, and dialogue styles of their main characters.
Better: Compare the writings of Bret Harte and Mark Twain in terms of setting, depth of characterization, and dialogue styles of their main characters. (10 points, 20 minutes)

4. Ask questions that will elicit responses on which experts could agree that one answer is better than another.
5. Avoid giving a student a choice among optional items as this greatly reduces the reliability of the test.
6. It is generally recommended for classroom examinations to administer several short-answer items rather than only one or two extended-response items.

Guidelines for Grading Essay Items

1. When writing each essay item, simultaneously develop a scoring rubric.
2. To maintain a consistent scoring system and ensure the same criteria are applied to all assessments, score one essay item across all tests prior to scoring the next essay item.
3. To reduce the influence of the halo effect, bias and other subconscious factors, all essay questions should be graded blind to the identity of the student.
4. Due to the subjective nature of graded essays, the score on one essay may be influenced by the quality of previous essays. To prevent this type of bias, reshuffle the order of assessments after reading through each item.
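
Guidelines 2 to 4 describe a grading workflow that can be sketched in Python. The function `grading_plan`, the code format `S001`, and the sample names are hypothetical; this is only one way to organize the order of work, under the assumption that papers can be relabelled with codes before grading.

```python
import random


def grading_plan(student_ids, item_ids, seed=None):
    """Return (code_of, plan): anonymous codes and an item-by-item grading order."""
    rng = random.Random(seed)
    # Blind grading: hide each identity behind an arbitrary code.
    codes = [f"S{n:03d}" for n in range(1, len(student_ids) + 1)]
    rng.shuffle(codes)
    code_of = dict(zip(student_ids, codes))

    plan = []
    for item in item_ids:  # score one essay item across all papers first
        order = list(code_of.values())
        rng.shuffle(order)  # reshuffle the papers before each item
        plan.extend((item, code) for code in order)
    return code_of, plan


code_of, plan = grading_plan(["Ana", "Ben", "Cara"], ["Q1", "Q2"], seed=3)
for item, code in plan:
    print(item, code)  # the grader sees only the item and the anonymous code
```

The mapping `code_of` would be kept away from the grader until all scores are recorded, so that the scores can be matched back to students only afterwards.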
Principle 3: Balanced
- A balanced assessment sets targets in all domains of learning (cognitive, affective, and psychomotor) or domains of intelligence (verbal-linguistic, logical-mathematical, bodily-kinesthetic, visual-spatial, musical-rhythmic, interpersonal-social, intrapersonal-introspection, physical world-natural-existential-spiritual).
- A balanced assessment makes use of both traditional and alternative assessment.
Principle 4: Validity
Validity – is the degree to which the assessment instrument measures what it intends to measure.
- It also refers to the usefulness of the instrument for a given purpose.
- It is the most important criterion of a good assessment instrument.
Ways in Establishing Validity
1. Face Validity – is done by examining the physical appearance of the instrument.
2. Content Validity – is done through a careful and critical examination of the objectives of assessment so that it reflects the curricular objectives.
3. Criterion-related Validity – is established statistically such that a set of scores revealed by the measuring instrument IS CORRELATED with the scores obtained in another EXTERNAL PREDICTOR OR MEASURE. It has two purposes:
a. Concurrent Validity – describes the present status of the individual by correlating the sets of scores obtained FROM TWO MEASURES GIVEN CONCURRENTLY.
Example: Relate the reading test result with pupils’ average grades in reading given by the teacher.
b. Predictive Validity – describes the future performance of an individual by correlating the sets of scores obtained from TWO MEASURES GIVEN AT A LONGER TIME INTERVAL.
Example: The entrance examination scores in a test administered to a freshman class at the beginning of the school year are correlated with the average grades at the end of the school year.
4. Construct Validity – is established by analysing the activities and processes that correspond to a particular concept; it is established statistically by comparing psychological traits or factors that theoretically influence scores in a test.
a. Convergent validity helps to establish construct validity when you use two different measurement procedures and research methods (e.g., participant observation and a survey) in your study to collect data about a construct (e.g., anger, depression, motivation, task performance).
b. Divergent validity helps to establish construct validity by demonstrating that the construct you are interested in (e.g., anger) is different from other constructs that might be present in your study (e.g., depression).

Factors Influencing the Validity of an Assessment Instrument

1. Unclear Directions- directions that do not clearly indicate to the students
how to respond to the task and how to record the responses tend to reduce validity.
2. Reading vocabulary and sentence structure too difficult-
Vocabulary and sentence structure that are too complicated for the students result in
the assessment measuring reading comprehension instead, thus altering the meaning
of the assessment results.
3. Ambiguity- Ambiguous statements in assessment tasks contribute to
misinterpretations and confusion. Ambiguity sometimes confuses the better
students more than it does the poor students.
4. Inadequate time limits- time limits that do not provide students with
enough time to consider the tasks and provide thoughtful responses can reduce the
validity of interpretations of results.
5. Overemphasis of easy-to-assess aspects of the domain at the expense
of important but hard-to-assess aspects (construct underrepresentation). It is easy
to develop test questions that assess factual recall and generally harder to develop ones
that tap conceptual understanding or higher-order thinking processes such as the
evaluation of competing positions or arguments. Hence it is important to guard
against underrepresentation of tasks getting at the important but more difficult to assess
aspects of achievement.
6. Test items inappropriate for the outcomes being measured-
attempting to measure understanding, thinking skills and other complex types of
achievement with test forms that are appropriate for only measuring factual
knowledge will invalidate the results.
7. Poorly constructed test items- test items that unintentionally provide
clues to the answer tend to measure the students’ alertness in detecting clues as
well as mastery of skills or knowledge the test is intended to measure.
8. Test too short- if a test is too short to provide a representative sample
of the performance we are interested in, its validity will suffer accordingly.
9. Improper arrangement of items- test items are typically arranged in
order of difficulty, with the easiest items first. Placing difficult items first in the test
may cause students to spend too much time on these and prevent them from reaching
items they could easily answer. Improper arrangement may also influence validity by
having a detrimental effect on student motivation.
10. Identifiable pattern of answers- Placing correct answers in some
systematic pattern (e.g., T,T,F,F, or B,B,B,C,C,C,D,D,D) enables students to guess the
answers to some items more easily, and this lowers validity.
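One simple way to screen an answer key for the kind of pattern described in item 10 is to look for long streaks of the same answer. The sketch below is an assumption-based illustration, not a procedure prescribed by the module; the helper name `longest_run` and both sample keys are hypothetical.

```python
from itertools import groupby

def longest_run(answer_key):
    """Length of the longest streak of identical consecutive answers."""
    return max((len(list(group)) for _, group in groupby(answer_key)), default=0)

# Hypothetical answer keys for a nine-item multiple-choice test
patterned_key = list("BBBCCCDDD")   # the systematic pattern from the example
mixed_key = list("BDCABDCAD")       # no answer repeated back to back

print(longest_run(patterned_key))
print(longest_run(mixed_key))
```

A long streak is only one warning sign. A key such as A,B,C,D,A,B,C,D has no streaks at all but is still an identifiable pattern, so a check like this does not replace a visual review of the key.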
TABLE OF SPECIFICATIONS – TOS

A table of specifications is a device for describing test items in terms of the content and
the process dimensions, that is, what a student is expected to know and what he or she
is expected to do with that knowledge. Each test item is described by a combination of
content and process in the table of specifications.

Sample of a one-way table of specifications in Linear Function

Content                               Number of Class Sessions   Number of Items   Test Item Distribution
1. Definition of linear function      2                          4                 1-4
2. Slope of a line                    2                          4                 5-8
3. Graph of linear function           2                          4                 9-12
4. Equation of linear function        2                          4                 13-16
5. Standard Forms of a line           3                          6                 17-22
6. Parallel and perpendicular lines   4                          8                 23-30
7. Application of linear functions    5                          10                31-40
TOTAL                                 20                         40                40

Number of items = (Number of class sessions x Desired total number of items) / Total number of class sessions

Example:

Number of items for the topic "Definition of linear function"

Number of class sessions = 2
Desired total number of items = 40
Total number of class sessions = 20

Number of items = (Number of class sessions x Desired total number of items) / Total number of class sessions
= (2 x 40) / 20

Number of items = 4
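The same computation can be repeated for every topic in the one-way table. The sketch below is a minimal illustration under stated assumptions: the function name `items_for_topic` and the variable names are hypothetical, and the session counts are copied from the sample table above.

```python
def items_for_topic(class_sessions, total_class_sessions, desired_total_items):
    """Number of items = class sessions x desired total items / total class sessions."""
    return class_sessions * desired_total_items / total_class_sessions

# (topic, number of class sessions) taken from the sample one-way table
topics = [
    ("Definition of linear function", 2),
    ("Slope of a line", 2),
    ("Graph of linear function", 2),
    ("Equation of linear function", 2),
    ("Standard Forms of a line", 3),
    ("Parallel and perpendicular lines", 4),
    ("Application of linear functions", 5),
]
desired_total_items = 40
total_class_sessions = sum(sessions for _, sessions in topics)

rows = []
first_item = 1
for name, sessions in topics:
    # Round to a whole number of items for this topic
    count = round(items_for_topic(sessions, total_class_sessions, desired_total_items))
    last_item = first_item + count - 1
    rows.append((name, sessions, count, f"{first_item}-{last_item}"))
    first_item = last_item + 1

for name, sessions, count, item_range in rows:
    print(f"{name}: {sessions} sessions, {count} items, items {item_range}")
```

In this sample every quotient is a whole number. When it is not, rounding each topic separately can make the item counts add up to slightly more or less than the desired total, so the teacher may need to adjust one or two topics by hand.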

Sample of a two-way table of specifications in Linear Function

Content   Class hours   Knowledge   Comprehension   Application   Analysis   Synthesis   Evaluation   Total
1. Definition of linear function 2 1 1 1 1 4
2. Slope of a line 2 1 1 1 1 4
3. Graph of linear function 2 1 1 1 1 4
4. Equation of linear function 2 1 1 1 1 4
5. Standard Forms of a line 3 1 1 1 1 1 1 6
6. Parallel and perpendicular lines 4 1 2 1 2 8
7. Application of linear functions 5 1 1 3 1 3 10
TOTAL 20 4 6 8 8 7 7 40
