
ASSESSMENT OF LEARNING

Assessment refers to the process of gathering, describing, or quantifying information about student
performance. It includes paper-and-pencil tests, extended responses (for example, essays), and
performance assessments, which are usually referred to as "authentic assessment" tasks (for example,
presentation of research work).

Measurement is the process of obtaining a numerical description of the degree to which an individual
possesses a particular characteristic. Measurement answers the question "How much?"

Evaluation refers to the process of examining the performance of the student. It also determines
whether or not the student has met the lesson's instructional objectives.

A test is an instrument or systematic procedure designed to measure the quality, ability, skill, or
knowledge of students by posing a set of questions in a uniform manner. Since a test is a uniform
form of assessment, it also answers the question "How well does the individual student perform?"

Testing is a method used to measure the level of achievement or performance of learners. It also
refers to the administration, scoring, and interpretation of an instrument (procedure) designed to elicit
information about performance in a sample of a particular area of behavior.

Types of Measurement

There are two ways of interpreting student performance in relation to classroom instruction: the
norm-referenced test and the criterion-referenced test.

A norm-referenced test is a test designed to measure the performance of a student compared with other
students. Each individual is compared with the other examinees and assigned a score, usually expressed
as a percentile, a grade-equivalent score, or a stanine. The achievement of the student is reported for
broad skill areas, although some norm-referenced tests do report student achievement for individual skills.

The purpose is to rank each student with respect to the achievement of others in broad areas of
knowledge and to discriminate between high and low achievers.

A criterion-referenced test is a test designed to measure the performance of a student with respect to
some particular criterion or standard. Each individual is compared with a predetermined standard of
acceptable achievement; the performance of the other examinees is irrelevant. A student's score is
usually expressed as a percentage, and student achievement is reported for individual skills.

The purpose is to determine whether each student has achieved specific skills or concepts, and to find
out how much students know before instruction begins and after it has finished. Other terms often used
for criterion-referenced tests are objective-referenced, domain-referenced, content-referenced, and
universe-referenced.

Robert L. Linn and Norman E. Gronlund (1995) pointed out the common characteristics and
differences of norm-referenced tests and criterion-referenced tests.
Common Characteristics of Norm-Referenced Tests and Criterion-Referenced Tests

1. Both require specification of the achievement domain to be measured.
2. Both require a relevant and representative sample of test items.
3. Both use the same types of test items.
4. Both use the same rules for item writing (except for item difficulty).
5. Both are judged by the same qualities of goodness (validity and reliability).
6. Both are useful in educational assessment.

Differences between Norm-Referenced Tests and Criterion-Referenced Tests

Norm-Referenced Tests
1. Typically cover a large domain of learning tasks, with just a few items measuring each specific task.
2. Emphasize discrimination among individuals in terms of relative level of learning.
3. Favor items of average difficulty and typically omit very easy and very hard items.
4. Interpretation requires a clearly defined group.

Criterion-Referenced Tests
1. Typically focus on a delimited domain of learning tasks, with a relatively large number of items measuring each specific task.
2. Emphasize description of what learning tasks individuals can and cannot perform.
3. Match item difficulty to the learning tasks, without altering item difficulty or omitting easy or hard items.
4. Interpretation requires a clearly defined and delimited achievement domain.

TYPES OF ASSESSMENT

There are four types of assessment in terms of their functional role in relation to classroom
instruction: placement assessment, diagnostic assessment, formative assessment, and summative
assessment.

A. Placement Assessment is concerned with the entry performance of the student. The purpose of
placement assessment is to determine the prerequisite skills, the degree of mastery of the course
objectives, and the best mode of learning.
B. Diagnostic Assessment is a type of assessment given before instruction. It aims to identify the
strengths and weaknesses of the students regarding the topics to be discussed. The purposes of
diagnostic assessment are:
1. To determine the level of competence of the students;
2. To identify the students who already have knowledge about the lesson;
3. To determine the causes of learning problems and formulate a plan for remedial action.
C. Formative Assessment is a type of assessment used to monitor the learning progress of the
students during or after instruction. The purposes of formative assessment are:
1. To provide immediate feedback to both student and teacher regarding the success and
failure of learning;
2. To identify the learning errors that are in need of correction;
3. To provide information to the teacher for modifying instruction and improving learning.
D. Summative Assessment is a type of assessment usually given at the end of a course or unit.
The purposes of summative assessment are:
1. To determine the extent to which the instructional objectives have been met;
2. To certify student mastery of the intended outcomes, used for assigning grades;
3. To provide information for judging the appropriateness of the instructional objectives;
4. To determine the effectiveness of instruction.

MODE OF ASSESSMENT

A. Traditional Assessment
1. Assessment in which students typically select an answer or recall information to complete
the assessment. Tests may be standardized or teacher-made, and may use multiple-choice,
fill-in-the-blanks, true-or-false, or matching-type items.
2. An indirect measure of assessment, since the test items are designed to represent competence
by extracting knowledge and skills from their real-life context.
3. Items on standardized instruments tend to test only the domain of knowledge and skill, to
avoid ambiguity for the test takers.
4. One-time measures that rely on a single correct answer to each item. There is limited
potential for traditional tests to measure higher-order thinking skills.
B. Performance Assessment
1. Assessment in which students are asked to perform real-world tasks that demonstrate
meaningful application of essential knowledge and skills (Jon Mueller).
2. A direct measure of student performance, because tasks are designed to incorporate the
context, problems, and solution strategies that students would use in real life.
3. Designed around ill-structured challenges, since the goal is to help students prepare for
the complex ambiguities of life.
4. Focuses on processes and rationales. There is no single correct answer; instead, students
are led to craft polished, thorough, and justifiable responses, performances, and products.
5. Involves long-range projects, exhibits, and performances that are linked to the curriculum.
6. The teacher is an important collaborator in creating tasks, as well as in developing
guidelines for scoring and interpretation.
C. Portfolio Assessment
1. A portfolio is a collection of a student's work specifically selected to tell a particular
story about the student.
2. A portfolio is not a pile of student work that accumulates over a semester or a year.
3. A portfolio contains a purposefully selected subset of student work.
4. It measures the growth and development of students.

The Key to Effective Testing


Objectives:

The specific statements of the aims of the instruction; they should express what the students should
be able to do or know as a result of taking the course; the objectives should indicate the cognitive,
affective, and psychomotor levels of performance.

Instruction:

It consists of all the elements of the curriculum designed to teach the subject, including the lesson
plan, study guides, and reading and homework assignments; the instruction should correspond directly
to the objectives.

Assessment:

The process of gathering, describing, or quantifying information about the performance of the learner
on the testing components of the subject; the weight given to different subject matter areas on the
test should match the objectives, as well as the emphasis given to each subject area during
instruction.

Evaluation:

Examining the performance of the students and comparing and judging its quality; determining
whether or not the learner has met the objectives of the lesson and the extent of understanding.

INSTRUCTIONAL OBJECTIVES

Instructional objectives play a very important role in the instructional process and the evaluation
process. They serve as guides for teaching and learning, communicate the intent of the instruction to
others, and provide a guideline for assessing student learning. Instructional objectives, also known
as behavioral objectives or learning objectives, are statements which clearly describe an anticipated
learning outcome.

Characteristics of well written and useful instructional objectives

1. Describe a learning outcome.
2. Be student-oriented: focus on the learner, not on the teacher.
3. Be observable, or describe an observable product.
4. Be sequentially appropriate.
5. Be attainable within a reasonable amount of time.
6. Be developmentally appropriate.

Factors to consider when Constructing Good Test Items


A. VALIDITY is the degree to which the test measures what it is intended to measure. It is the
usefulness of the test for a given purpose. A valid test is always reliable.
B. RELIABILITY refers to the consistency of scores obtained by the same person when retested
using the same instrument or one parallel to it.
C. ADMINISTRABILITY: the test should be administered uniformly to all students so that the scores
obtained will not vary due to factors other than differences in the students' knowledge and skills.
There should be clear provisions for instructions to the students, the proctors, and even those who
will check the test or the scores.
D. SCORABILITY: the test should be easy to score; directions for scoring should be clear; provide
the answer sheet and the answer key.
E. APPROPRIATENESS: the test items the teacher constructs must assess the exact performance
called for in the learning objectives. The test items should require the same performance of the
student as specified in the learning objectives.
F. ADEQUACY: the test should contain a wide sampling of items to determine the educational
outcomes or abilities, so that the resulting scores are representative of the total performance in
the areas measured.
G. FAIRNESS: the test should not be biased against the examinees. It should not be offensive to
any examinee subgroup. A test can only be good if it is also fair to all test takers.
H. OBJECTIVITY represents the agreement of two or more raters or test administrators concerning
the score of a student. If raters assessing the same student on the same test cannot agree on the
score, the test lacks objectivity, and the score of neither judge is valid; thus, lack of objectivity
reduces test validity in the same way that lack of reliability influences validity.

TABLE OF SPECIFICATION

A table of specification is a device for describing test items in terms of the content and process
dimensions, that is, what a student is expected to know and what he or she is expected to do with
that knowledge. Each item is described by a combination of content and process in the table of
specification.

SAMPLE OF A ONE-WAY TABLE OF SPECIFICATION IN LINEAR FUNCTION

Content                               Class Sessions   Number of Items   Item Distribution
1. Definition of linear function            2                 4                1-4
2. Slope of a line                          2                 4                5-8
3. Graph of a linear function               2                 4                9-12
4. Equation of a linear function            2                 4                13-16
5. Standard forms of a line                 3                 6                17-22
6. Parallel and perpendicular lines         4                 8                23-30
7. Applications of linear functions         5                10                31-40
TOTAL                                      20                40                1-40

Number of items = (Number of class sessions x Desired total number of items) / Total number of class sessions

Example: Number of items for the topic "Definition of linear function".

Number of class sessions = 2
Desired total number of items = 40
Total number of class sessions = 20

Number of items = (2 x 40) / 20 = 4
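The allocation rule above can be sketched in code. This is an illustrative sketch only; the function name `allocate_items` and the dictionary layout are my own, not from the source.

```python
# Sketch of the table-of-specification allocation rule:
# items per topic = class sessions x desired total items / total sessions.

def allocate_items(class_sessions, total_items):
    # class_sessions maps each topic to its number of class sessions.
    total_sessions = sum(class_sessions.values())
    return {
        topic: round(sessions * total_items / total_sessions)
        for topic, sessions in class_sessions.items()
    }

sessions = {
    "Definition of linear function": 2,
    "Slope of a line": 2,
    "Graph of linear function": 2,
    "Equation of linear function": 2,
    "Standard forms of a line": 3,
    "Parallel and perpendicular": 4,
    "Applications of linear function": 5,
}

allocation = allocate_items(sessions, 40)
print(allocation["Definition of linear function"])  # 4, as in the worked example
```

Note that rounding can make the allocated items miss the desired total when sessions do not divide evenly; here 20 sessions divide 40 items exactly, so the counts match the table.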

SAMPLE OF A TWO-WAY TABLE OF SPECIFICATION IN LINEAR FUNCTION

Content                               Class Hours   Total Items
1. Definition of linear function           2             4
2. Slope of a line                         2             4
3. Graph of a linear function              2             4
4. Equation of a linear function           2             4
5. Standard forms of a line                3             6
6. Parallel and perpendicular lines        4             8
7. Applications of linear functions        5            10
TOTAL                                     20            40

Items per cognitive level (column totals): Knowledge 4, Comprehension 6, Application 8,
Analysis 8, Synthesis 7, Evaluation 7.

ITEM ANALYSIS

Item analysis refers to the process of examining the students' responses to each item of the
test. According to Abubakar S. Asaad and Willam M. Hailaya (Measurement and Evaluation: Concepts
and Principles, Rex Bookstore, 2004 edition), there are two characteristics of an item: desirable
and undesirable. An item that has desirable characteristics can be retained for subsequent use,
while one with undesirable characteristics is either revised or rejected.

There are three criteria for determining the desirability or undesirability of an item:

a. Difficulty of an item
b. Discriminating power of an item
c. Measures of attractiveness

The difficulty index (Df) refers to the proportion of the number of students in the upper and lower groups
who answered an item correctly. In a classroom achievement test, the desired indices of difficulty should be
no lower than 0.20 and no higher than 0.80, with an average index of difficulty from 0.30 or 0.40 to a
maximum of 0.60.

Index Range      Difficulty Level
0.00 – 0.20      Very difficult
0.21 – 0.40      Difficult
0.41 – 0.60      Moderately difficult
0.61 – 0.80      Easy
0.81 – 1.00      Very easy
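The difficulty index and its level interpretation can be expressed in a short sketch; the function names here are assumptions for illustration, not from the source.

```python
# Difficulty index: proportion of the combined upper and lower groups
# who answered the item correctly.

def difficulty_index(correct_upper, correct_lower, group_size):
    return (correct_upper + correct_lower) / (2 * group_size)

def difficulty_level(df):
    # Thresholds follow the index-range table above.
    if df <= 0.20:
        return "Very difficult"
    elif df <= 0.40:
        return "Difficult"
    elif df <= 0.60:
        return "Moderately difficult"
    elif df <= 0.80:
        return "Easy"
    return "Very easy"

# Example: 6 of 22 upper-group and 4 of 22 lower-group students answered correctly.
df = difficulty_index(6, 4, 22)
print(round(df, 2), difficulty_level(df))  # 0.23 Difficult
```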
The discrimination index is the difference between the proportion of high-performing students who got
an item right and the proportion of low-performing students who got it right; the groups are usually
defined as the upper 27% and the lower 27% of the students based on total examination scores. The
discrimination index is the degree to which an item discriminates between the high- and low-performing
groups in relation to scores on the total test. The index of discrimination is classified into positive
discrimination, negative discrimination, and zero discrimination. Positive discrimination: the
proportion of students who got an item right in the upper performing group is greater than the
proportion in the low performing group. Negative discrimination: the proportion of students who got an
item right in the low performing group is greater than in the upper performing group. Zero
discrimination: the proportions of students who got an item right in the upper and low performing
groups are equal.
Discrimination Index   Item Evaluation
0.40 and up            Very good item
0.30 – 0.39            Reasonably good item, but possibly subject to improvement
0.20 – 0.29            Marginal item, usually needing and being subject to improvement
Below 0.19             Poor item, to be rejected or improved by revision

Maximum discrimination is the sum of the proportions of the upper and lower groups who answered
the item correctly. Possible maximum discrimination will occur if half or less of the sum of the
upper and lower groups answered an item correctly.

Discriminating efficiency is the discrimination index divided by the maximum discrimination.

Notations:

PUG = proportion of the upper group who got an item right

PLG = proportion of the lower group who got an item right

Di = discrimination index

DM = maximum discrimination

DE = discriminating efficiency

Formulas:

Di = PUG - PLG
DM = PUG + PLG
DE = Di / DM
Example: Eighty students took an examination in algebra. For item number 6, six students in the upper
group and four students in the lower group got the correct answer. Find the discriminating efficiency.

Given:

Number of students who took the exam = 80

27% of 80 = 21.6 or 22, which means that there are 22 students in the upper performing group
and 22 students in the lower performing group.

PUG = 6/22 = 0.27
PLG = 4/22 = 0.18
Di = PUG - PLG = 0.27 - 0.18 = 0.09
DM = PUG + PLG = 0.27 + 0.18 = 0.45
DE = Di / DM = 0.09 / 0.45 = 0.20

This can be interpreted as: on the average, the item is discriminating at 20% of the potential of its
difficulty.
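The worked example can be checked with a short script. A minimal sketch; the function name is invented for illustration:

```python
# Discrimination index, maximum discrimination, and discriminating efficiency
# for one item, given the counts of correct answers in each 27% group.

def discrimination_stats(correct_upper, correct_lower, group_size):
    p_ug = correct_upper / group_size    # PUG
    p_lg = correct_lower / group_size    # PLG
    di = p_ug - p_lg                     # discrimination index
    dm = p_ug + p_lg                     # maximum discrimination
    de = di / dm                         # discriminating efficiency
    return di, dm, de

di, dm, de = discrimination_stats(6, 4, 22)
print(round(de, 2))  # 0.2, i.e. the item discriminates at 20% of its potential
```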

Measures of Attractiveness

To measure the attractiveness of the incorrect options (distracters) in multiple-choice tests, we
count the number of students who selected each incorrect option in both the upper and lower
groups. An incorrect option is said to be an effective distracter if more students in the lower
group than in the upper group chose that incorrect option.

Steps of Item Analysis

1. Rank the scores of the students from highest to lowest.

2. Select the 27% of the papers within the upper performing group and the 27% of the papers within
the lower performing group.

3. Set aside the remaining 46% of the papers because they will not be used for item analysis.

4. Compute the difficulty index of each item.

5. Compute the discrimination index of each item.

Validity Test

Validity refers to the appropriateness of score-based inferences, or of decisions made based on the
students' test results; it is the extent to which a test measures what it is supposed to measure.

Important Things to Remember about Validity

1. Validity refers to the decisions we make, and not to the test itself or to the
measurement.
2. Like reliability, validity is not an all-or-nothing concept; it is never totally absent or
absolutely perfect.
3. A validity estimate, called a validity coefficient, refers to a specific type of validity. It
ranges between 0 and 1.
4. Validity can never be finally determined; it is specific to each administration of the test.

TYPES OF VALIDITY

1. Content Validity – a type of validation that refers to the relationship between a test and the
instructional objectives; it establishes content so that the test measures what it is supposed to
measure. Things to remember about content validity:
a. The evidence of the content validity of your test is found in the table of specification.
b. This is the most important type of validity for you as a classroom teacher.
c. There is no coefficient for content validity. It is determined judgmentally, not empirically.
2. Criterion-related Validity – a type of validation that refers to the extent to which scores from a
test relate to theoretically similar measures. It is a measure of how accurately a student's current
test score can be used to estimate a score on a criterion measure, such as performance in courses,
classes, or another measurement instrument. For example, classroom reading grades should indicate a
level of performance similar to standardized reading test scores.

A. Construct Validity – a type of validation that refers to a measure of the extent to which a
test measures a hypothetical and unobservable variable or quality, such as intelligence,
math achievement, or performance anxiety. It is established through intensive study of
the test or instrument.
B. Predictive Validity – a type of validation that refers to a measure of the extent to which a
person's current test results can be used to estimate accurately what that person's
performance on some other criterion, such as later test scores, will be at a later time.
3. Concurrent Validity – a type of validation that requires the correlation of the predictor or
concurrent measure with the criterion measure. Using this, we can determine whether a test is
useful to us as a predictor or as a substitute (concurrent) measure. The higher the validity
coefficient, the better the validity evidence of the test. In establishing concurrent validity
evidence, no time interval is involved between the administration of the new test and the criterion
or established test.
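Criterion-related validity evidence of this kind is typically a correlation coefficient between the new test and the established criterion measure. A sketch follows, using the standard Pearson product-moment formula; the score lists are invented for illustration:

```python
# Pearson product-moment correlation between a new test and a criterion measure.

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Invented data: scores on a new reading test vs. an established standardized test.
new_test  = [45, 35, 48, 60, 44, 39, 47, 55, 58, 54]
criterion = [50, 38, 52, 64, 47, 41, 50, 58, 60, 57]

r = pearson_r(new_test, criterion)
print(round(r, 2))  # a high coefficient is good validity evidence
```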

Factors Affecting the Validity of a Test Item

1. The test itself.
2. The administration and scoring of the test.
3. Personal factors influencing how students respond to the test.
4. Validity is always specific to a particular group.

Ways to Reduce the Validity of the Test Items

1. Poorly constructed items
2. Unclear directions
3. Ambiguous items
4. Reading vocabulary that is too difficult
5. Complicated syntax
6. Inadequate time limit
7. Inappropriate level of difficulty
8. Unintended clues
9. Improper arrangement of items

Test Design to Improve Validity

1. What is the purpose of the test?
2. How well do the instructional objectives selected for the test represent the instructional
goals?
3. Which test item format will best measure achievement of each objective?
4. How many test items will be required to measure performance adequately on each
objective?
5. When and how will the test be administered?

Reliability of the Test

Reliability refers to the consistency of measurement; that is, how consistent test results
or other assessment results are from one measurement to another. We can say that a
test is reliable when it can be used to predict practically the same scores when the
test is administered twice to the same group of students, and when it has a reliability
index of 0.50 or above. The reliability of a test can be determined by means of the
Pearson Product-Moment Correlation Coefficient, the Spearman-Brown formula, and the
Kuder-Richardson formulas.

Factors Affecting the Reliability of a Test

1. Length of a Test
2. Moderate item difficulty
3. Objective scoring
4. Heterogeneity of the student group
5. Limited time

Four Methods of Establishing Reliability


1. Test-retest Method. A type of reliability determined by administering the same
test twice to the same group of students, with a time interval between tests. The
test scores are correlated using the Pearson Product-Moment Correlation
Coefficient (r), and this correlation coefficient provides a measure of stability.
It indicates how stable the test results are over a period of time.
2. Equivalent-Form Method. A type of reliability determined by administering two
different but equivalent forms of the test (also called parallel or alternate forms) to the
same group of students in close succession. The equivalent forms are
constructed to the same set of specifications, that is, similar in content, type of
items, and difficulty. The test scores are correlated using the Pearson
Product-Moment Correlation Coefficient (r), and this correlation coefficient provides a
measure of the degree to which generalization about the performance of the
students from one assessment to another is justified. It measures the
equivalence of the tests.

3. Split-Half Method. Administer the test once and score two equivalent halves of the test. To
split the test into halves that are equivalent, the usual procedure is to score the
even-numbered and odd-numbered items separately. This provides two scores for each student.
The scores are correlated using the Spearman-Brown formula, and
this correlation coefficient provides a measure of internal consistency. It indicates the
degree to which consistent results are obtained from the two halves of the test.

4. Kuder-Richardson Formula. Administer the test once, score the total test, and apply the Kuder-
Richardson formula. The Kuder-Richardson formula is applicable only in situations where students'
responses are scored dichotomously, and is therefore most useful with traditional test items that
are scored as right or wrong. KR-20 is an estimate of reliability that provides information on the
degree to which the items in the test measure the same characteristic, under the assumption that
all items are of equal difficulty. (It is a statistical procedure that estimates coefficient
alpha, a correlation coefficient.)
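The split-half (with the Spearman-Brown correction) and KR-20 estimates can be sketched for dichotomously scored items. The 0/1 response matrix below is invented for illustration, and the function names are my own, not from the source:

```python
# Two internal-consistency estimates for right/wrong (0/1) item scores.

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def split_half_reliability(scores):
    # Correlate odd- vs even-numbered item halves, then apply Spearman-Brown.
    odd  = [sum(row[0::2]) for row in scores]
    even = [sum(row[1::2]) for row in scores]
    r_half = pearson_r(odd, even)
    return 2 * r_half / (1 + r_half)     # Spearman-Brown correction

def kr20(scores):
    # Kuder-Richardson Formula 20 for dichotomous item scores.
    k = len(scores[0])                   # number of items
    n = len(scores)                      # number of students
    totals = [sum(row) for row in scores]
    mean = sum(totals) / n
    var = sum((t - mean) ** 2 for t in totals) / n   # total-score variance
    pq = sum((p := sum(row[i] for row in scores) / n) * (1 - p) for i in range(k))
    return (k / (k - 1)) * (1 - pq / var)

# Invented response matrix: 6 students x 4 items, scored right (1) or wrong (0).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
print(round(kr20(responses), 2), round(split_half_reliability(responses), 2))
```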

Descriptive Statistics of the Test

Statistics play a very important role in describing the test results of students. Teachers should have
a background in statistical techniques in order to analyze and describe the results of measurements
obtained in their own classrooms, understand the statistics used in test and research reports, and
interpret the types of scores used in testing.

Descriptive statistics is concerned with collecting, describing, and analyzing a set of data without
drawing conclusions or inferences about a larger group, presenting the data in terms of tables, graphs,
or single numbers (for example, the average score of the class in a particular test).

Inferential statistics is concerned with the analysis of a subset of data leading to predictions or
inferences about the entire set of data or population.

We shall discuss the different statistical techniques used in describing and analyzing test results:

1. Measures of central tendency (averages)
2. Measures of variability (spread of scores)
3. Measures of relationship (correlation)
4. Skewness

Measures of Central Tendency. A measure of central tendency is a single value used to identify the
center of the data; it is thought of as the typical value in a set of scores. It tends to lie within
the center of the scores when they are arranged from lowest to highest or vice versa. There are three
commonly used measures of central tendency: the mean, the median, and the mode.
The Mean
The mean is the most common measure of center and is also known as the arithmetic average.

Population mean: µ = (∑ Xi) / N = (X1 + X2 + … + XN) / N

Sample mean: x̄ = ∑x / n

where
∑x = sum of the scores
x = individual score
n = number of scores

Steps in solving for the mean value using raw scores:

1. Get the sum of all the scores in the distribution.
2. Identify the number of scores (n).
3. Substitute into the given formula and solve for the mean value.

Example: Find the mean of the scores of students in an algebra quiz.

(x) Scores of students in algebra

45
35
48
60
44
39
47
55
58
54

∑x = 485
n = 10

Mean = ∑x / n = 485 / 10 = 48.5

Properties of the Mean
1. It is easy to compute.
2. It may not be an actual observation in the data set.
3. It can be subjected to numerous mathematical computations.
4. It is the most widely used measure.
5. Each data value contributes to the mean value.
6. It is easily affected by extreme values.
7. It is applied to interval-level data.

The median is the point that divides the scores in a distribution into two equal parts when the
scores are arranged according to magnitude, that is, from lowest to highest or from highest to
lowest. If the number of scores is odd, the median is the middle score. When the number of scores
is even, the median is the average of the two middlemost scores.

Example 1: Find the median of the scores of ten students in an algebra quiz.
(x) Scores of students in algebra
45
35
48
60
44
39
47
55
58
54

First, arrange the scores from lowest to highest. Since the number of cases is even, find the
average of the two middlemost scores.
35
39
44
45
47
48
54
55
58
60
Median = (47 + 48) / 2 = 47.5

47.5 is the median score; 50% of the scores in the distribution fall below 47.5.

Example 2: Find the median of the scores of nine students in an algebra quiz.


(x)
35
39
44
45
47
48
54
55
58
The median value is the 5th score, which is 47. This means that 50% of the scores fall below 47.

Properties of the Median
1. It is not affected by extreme values.
2. It is applied to ordinal-level data.
3. It is the middlemost score in the distribution.
4. It is most appropriate when there are extreme scores.

The Mode
The mode refers to the score or scores that occur most frequently in the distribution. There are
three classifications of mode: a) unimodal, a distribution that has only one mode; b) bimodal, a
distribution that has two modes; c) multimodal, a distribution that has more than two modes.

Properties of the Mode
1. It is the score or scores that occur most frequently.
2. It is a nominal average.
3. It can be used for qualitative and quantitative data.
4. It is not affected by extreme values.
5. It may not exist.

Example 1: Find the mode of the scores of students in an algebra quiz: 34, 36, 45, 65, 34, 45, 55, 61, 34,
46.

Mode = 34, because it appears three times. The distribution is unimodal.

Example 2: Find the mode of the scores of students in an algebra quiz: 34, 36, 45, 61, 34, 45, 55, 61, 34, 45.
Mode = 34 and 45, because both appear three times. The distribution is bimodal.
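The three averages above can be computed directly in code; a minimal sketch using the quiz scores from the examples (the helper names are my own):

```python
def mean(scores):
    return sum(scores) / len(scores)

def median(scores):
    s = sorted(scores)                   # arrange from lowest to highest
    n, mid = len(s), len(s) // 2
    if n % 2 == 1:
        return s[mid]                    # odd count: the middle score
    return (s[mid - 1] + s[mid]) / 2     # even count: average of the two middle scores

def modes(scores):
    # Return every score tied for the highest frequency.
    counts = {x: scores.count(x) for x in set(scores)}
    top = max(counts.values())
    return sorted(x for x, c in counts.items() if c == top)

quiz = [45, 35, 48, 60, 44, 39, 47, 55, 58, 54]
print(mean(quiz), median(quiz))                          # 48.5 47.5
print(modes([34, 36, 45, 65, 34, 45, 55, 61, 34, 46]))   # [34] -> unimodal
```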

Measures of Variability

A measure of variability is a single value used to describe the spread of the scores in a
distribution, that is, how far they lie above or below the measure of central tendency. There are
three commonly used measures of variability: the range, the quartile deviation, and the standard
deviation.

The Range

Range is the difference between highest score and lowest score in the data set.

R = HS –LS

Properties of the Range

1. It is the simplest and crudest measure.
2. It is a rough measure of variation.
3. The smaller the value, the closer the scores are to each other; the larger the value, the more
scattered the scores are.
4. Its value fluctuates easily: if either the highest score or the lowest score changes, the value
of the range easily changes.

Example: Below are the scores of 10 students in Mathematics and Science. Find the range for each
subject. Which subject has the greater variability?

Mathematics Science
35 35
33 40
45 25
55 47
62 55
34 35
54 45
36 57
47 39
40 52

Mathematics Science
HS = 62 HS = 57
LS = 33 LS = 25
R = HS –LS R = HS -LS
R = 62- 33 = 57-25
R = 29 R = 32

Based on the computed values of the range, the scores in Science have greater variability; that is,
the scores in Science are more scattered than the scores in Mathematics.
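The range comparison above can be written as a one-line function; a minimal sketch reusing the two score sets:

```python
def score_range(scores):
    return max(scores) - min(scores)     # R = HS - LS

math_scores    = [35, 33, 45, 55, 62, 34, 54, 36, 47, 40]
science_scores = [35, 40, 25, 47, 55, 35, 45, 57, 39, 52]

# 29 vs 32: the Science scores are more scattered.
print(score_range(math_scores), score_range(science_scores))  # 29 32
```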

The Quartile Deviation

The quartile deviation is half of the difference between the third quartile (Q3) and the first quartile (Q1).
It is based on the middle 50% of the distribution, rather than the range of the entire set of scores. In
symbols,
QD = (Q3 – Q1) / 2, where
QD = quartile deviation
Q3 = third quartile value
Q1 = first quartile value

Example: In the scores of 50 students, Q3 = 50.25 and Q1 = 25.45. Find the QD.

QD = (Q3 – Q1) / 2
   = (50.25 – 25.45) / 2
QD = 12.4

The value QD = 12.4 indicates the distance we need to go above or below the median to include
approximately the middle 50% of the scores.
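The quartile deviation formula can be checked with a minimal Python sketch, assuming Q1 and Q3 are already known (as in the example above):

```python
def quartile_deviation(q1, q3):
    """QD = (Q3 - Q1) / 2: half of the interquartile range."""
    return (q3 - q1) / 2

print(round(quartile_deviation(25.45, 50.25), 2))  # 12.4
```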

Standard Deviation

The standard deviation is the most important and useful measure of variation; it is the square root of
the variance. It is an average of the degree to which each score in the distribution deviates from the
mean value. It is a more stable measure of variation than the range or the quartile deviation because it
involves all the scores in the distribution.
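As a quick illustration, Python's standard library can compute the mean and standard deviation. This sketch uses the population standard deviation on the Mathematics scores from the range example; the population (rather than sample) formula is an assumption, since the text does not distinguish the two:

```python
from statistics import mean, pstdev

scores = [35, 33, 45, 55, 62, 34, 54, 36, 47, 40]  # Mathematics scores above

avg = mean(scores)   # arithmetic mean
sd = pstdev(scores)  # population SD: square root of the mean squared deviation
print(round(avg, 2), round(sd, 2))  # 44.1 9.68
```
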
Rubrics

A rubric is a scoring scale and instructional tool used to assess the performance of students with a task-specific
set of criteria. It contains two essential parts: the criteria for the task and the levels of performance for each
criterion. It provides teachers an effective means of student-centered feedback and evaluation of the
work of the students. It also enables teachers to provide detailed and informative evaluations of their
performance.
A rubric is especially important when you are measuring the performance of students against a set
standard or a pre-determined set of criteria. Through scoring rubrics, teachers can
determine the strengths and weaknesses of the students, which in turn enables the students to develop their
skills.

Steps in Developing a Rubric

1. Identify the standards, objectives, and goals for your students. A standard is a statement of what
the students should know or be able to perform; it indicates what your students
should meet. Know also the goals for instruction: what are the intended learning
outcomes?
2. Identify the characteristics of a good performance on that task: the criteria. When the students
perform or present their work, these criteria indicate whether they performed well on the task given to
them and hence met that particular standard.
3. Identify the levels of performance for each criterion. There are no fixed guidelines on the
number of levels of performance; it varies according to the task and the needs. A rubric can have as few as
two levels of performance or as many as the teacher can develop, provided the rater can sufficiently
discriminate the performance of the students on each criterion. Through these levels of performance,
the teacher or the rater can provide more detailed feedback about the performance of the
students, and it is easier for both the teacher and the students to identify the areas needing
improvement.

Types of Rubrics

1) Holistic Rubrics
A holistic rubric does not list separate levels of performance for each criterion. Rather, it
assigns a level of performance by assessing the multiple criteria as a whole; in other words,
all the components are considered together.
Advantage: quick scoring; provides an overview of students’ achievement.

Disadvantage: does not provide detailed information about the students’ performance in specific
areas of content and skills. It may be difficult to assign one overall score.

2) Analytic Rubrics
In an analytic rubric, the teacher or rater identifies and assesses the components of a finished product.
It breaks down the final product into its component parts, and each part is scored independently. The
total score is the sum of the ratings for all the parts that are assessed or evaluated.
In analytic scoring, it is very important for the rater to treat each part separately to avoid bias
toward the whole product.

Advantage: more detailed feedback; scoring is more consistent across students and graders.

Disadvantage: scoring is time-consuming.

Example of Holistic Rubric

3 – Excellent Researcher
 Included 10-12 sources
 No apparent historical inaccuracies
 Can easily tell which source information was drawn from
 All relevant information is included

2- Good Researcher
 Included 5-9 sources
 Few historical inaccuracies
 Can tell with difficulty where information came from
 Bibliography contains most relevant information

1 – Poor Researcher
 Included 1 – 4 sources
 Lots of historical inaccuracies
 Cannot tell from which source information came
 Bibliography contains very little information

Source: jonathan.mueller.faculty.noctrl.edu/toolbox/howstep4.htm

Example of Analytic rubrics

Criteria                  Limited (1)               Acceptable (2)             Proficient (3)

Made good observations    Observations are absent   Most observations are      All observations are
                          or vague                  clear and detailed         clear and detailed
Made good predictions     Predictions are absent    Most predictions are       All predictions are
                          or irrelevant             reasonable                 reasonable
Made appropriate          Conclusion is absent or   Conclusion is consistent   Conclusion is consistent
conclusions               inconsistent with         with most observations     with observations
                          observations

Source: jonathan.mueller.faculty.noctrl.edu/toolbox/howstep4.htm

Advantages of Using Rubrics

When assessing the performance of the students using performance-based assessment, it is
very important to use scoring rubrics. The advantages of using rubrics in assessing students’
performance are:

1. Rubrics allow assessment to become more objective and consistent;


2. Rubrics clarify the criteria in specific terms;
3. Rubrics clearly show the students how work will be evaluated and what is expected;
4. Rubrics promote student awareness of the criteria to use in assessing peer
performance;
5. Rubrics provide useful feedback regarding the effectiveness of the instruction; and
6. Rubrics provide benchmarks against which to measure and document progress
(Gabuyo, 2011).

PERFORMANCE-BASED ASSESSMENT

Performance-based assessment is a direct and systematic observation of the actual performances of the
students based on pre-determined performance criteria (Zimmaro, 2003, as cited by Gabuyo,
2011). It is an alternative form of assessing students that “represents a set of strategies for the
application of knowledge, skills, and work habits through the performance of tasks that are meaningful
and engaging to students” (Hibbard, 1996; Brualdi, 1998, in her article “Implementing Performance
Assessment in the Classroom”).

Framework of Assessment Approaches

Selection Type     Supply Type        Product                    Performance

True-False         Completion         Essay, story, or poem      Oral presentation of a report
Multiple Choice    Label a diagram    Writing portfolio          Musical, dance, or dramatic
                                                                 performance
Matching Type      Short answer       Research report            Typing test
                   Concept map        Portfolio exhibit,         Diving
                                      art exhibit
                                      Writing journal            Laboratory demonstration
                                                                 Cooperation in group work

Forms of Performance-Based Assessment (Gabuyo, 2011)

1. Extended-response tasks
a. Activities for a single assessment may be multiple and varied.
b. Activities may be extended over a period of time.
c. Products from different students may differ in focus.

2. Restricted-response tasks
a. The intended performances are more narrowly defined than in extended-response tasks.
b. Questions may begin with a multiple-choice or short-answer stem, but then ask for an
explanation or justification.
c. They may have introductory material like an interpretive exercise, but then ask for an
explanation of the answer, not just the answer itself.

3. Portfolio: a purposeful collection of student work that exhibits a student’s efforts, progress,
and achievements in one or more areas.

Uses of Performance-Based Assessment (Gabuyo, 2011)

1. Assessing complex cognitive outcomes such as analysis, synthesis, and evaluation.
2. Assessing non-writing performances and products.
3. Carefully specifying the learning outcomes and constructing activities or tasks that actually call
them forth.

Focus of Performance-Based Assessment (Gabuyo, 2011)

Performance-based assessment can assess the process, the product, or both,
depending on the learning outcomes. It involves doing rather than merely knowing about the activity or
task. The teacher assesses the effectiveness of the process or procedures and of the product used in
carrying out the instruction. The question is: when should the process be assessed, and when the product?

Use the process when:


1. There is no product;
2. The process is orderly and directly observable;
3. Correct procedures/steps are crucial to later success;
4. Analysis of procedural steps can help in improving the product;
5. Learning is at an early stage.

Use the product when:


1. Different procedures result in an equally good product;
2. The procedures are not available for observation;
3. The procedures have already been mastered;
4. The products have qualities that can be identified and judged.
Assessing the Performance
The final step in performance assessment is to assess and score the student’s performance. To
assess the performance of the students, the evaluator can use the checklist approach, the narrative or
anecdotal approach, the rating scale approach, or the memory approach. The evaluator can give feedback on the
student’s performance in the form of a narrative report or a grade. There are different ways to record the
results of performance-based assessments (Airasian, 1991; Stiggins, 1994, as cited by Gabuyo, 2011):

1. Checklist Approach. Checklists are observation instruments that break a performance into discrete
elements. The teacher indicates only whether or not certain elements are present in the
performance.
2. Narrative/Anecdotal Approach. This is a continuous description of student behavior as it occurs,
recorded without judgment or interpretation. The teacher writes narrative reports of what
was done during each performance. From these reports, teachers can determine how
well their students met the standards.
3. Rating Scale Approach. A rating scale is a checklist that allows the evaluator to record information on a scale,
noting finer distinctions than mere presence or absence of a behavior. The teacher then
indicates to what degree the standards were met. Usually, the teacher uses a numerical scale. For
instance, a teacher may rate each criterion on a scale of one to five, with one meaning “skill
barely present” and five meaning “skill extremely well executed.”
4. Memory Approach. The teacher observes the students performing the tasks without taking
any notes and relies on memory to determine whether or not the
students were successful. This approach is not recommended for assessing the
performance of the students.

PORTFOLIO ASSESSMENT

Portfolio assessment is the systematic, longitudinal collection of student work created in
response to specific, known instructional objectives and evaluated in relation to the same criteria
(Ferenz, 2001). A student portfolio is a purposeful collection of the student’s work that exhibits the
student’s efforts, progress, and achievements in one or more areas. The collection must include student
participation in selecting the contents, the criteria for selection, the criteria for judging merit, and evidence of
student reflection (Paulson, Paulson, & Meyer, 1991, as cited by Ferenz, 2001, in her article “Using Student
Portfolios for Outcomes Assessment”).

Comparison of Portfolio and Traditional Forms of Assessment (Ferenz, 2001)


Traditional Assessment                               Portfolio Assessment
Measures a student’s ability at one point in time    Measures a student’s ability over time
Done by the teacher alone; students are not          Done by the teacher and the students; the
aware of the criteria                                students are aware of the criteria
Conducted outside instruction                        Embedded in instruction
Assigns students a grade                             Involves students in their own assessment
Does not capture the student’s language              Captures many facets of language learning
ability                                              performance
Does not include the teacher’s knowledge of          Allows expression of the teacher’s knowledge
the student as a learner                             of the student as a learner
Does not give students responsibility                Students learn how to take responsibility

Three Types of Portfolio


There are three basic types of portfolios to consider for classroom use: the working
portfolio, the showcase portfolio, and the progress portfolio.
1. Working Portfolio
The first type is the working portfolio, also known as the “teacher-student
portfolio.” As the name implies, it is a project “in the works”: it contains work in progress
as well as finished samples of work used by students and teachers to reflect on the process. It
documents the stages of learning and provides a progressive record of student growth. It is
an interactive teacher-student portfolio that aids communication between teacher and
student.
The working portfolio can be used to diagnose student needs. With it, both students and
teachers have evidence of student strengths and weaknesses in achieving learning objectives,
information that is extremely useful in designing future instruction.
2. Showcase Portfolio
The showcase portfolio, the second type, is also known as the best-works portfolio or
display portfolio. It focuses on the student’s best and most representative
work; it exhibits the best performance of the student. A best-works portfolio may document student
efforts with respect to the curriculum objectives, and it may also include evidence of student
activities beyond school, for example a story written at home.
It is like an artist’s portfolio, where a variety of work is selected to reflect breadth of
talent: painters can exhibit their best paintings. Hence, in this portfolio the student selects what he
or she thinks is representative work. This folder is most often seen at open houses and parent
visitations (Columba & Dolgos, 1995).
The most rewarding use of student portfolios is the display of the students’ best work,
the work that makes them proud. It encourages self-assessment and builds the students’ self-esteem.
The pride and sense of accomplishment that students feel make the
effort well worthwhile and contribute to a culture of learning in the classroom.
3. Progress Portfolio
The third type is the progress portfolio, also known as the Teacher
Alternative Assessment Portfolio. It contains examples of the same types of student work
done over a period of time, which are used to assess progress. All the works of the
students in this type of portfolio are scored, rated, ranked, or evaluated.
Teachers can keep individual student portfolios that are solely for the teacher’s use as an
assessment tool. This is a focused type of portfolio and is a model of a holistic approach to
assessment (Columba & Dolgos, 1995).
Assessment portfolios are used to document student learning on specific curriculum
outcomes and to demonstrate the extent of mastery in any curricular area.

Uses of Portfolios
1. They can provide formative and summative opportunities for monitoring progress toward reaching
identified outcomes.
2. Portfolios can communicate concrete information about what is expected of students in terms of
the content and quality of performance in specific curriculum areas.
3. A portfolio allows students to document aspects of their learning that do not show
up well in traditional assessments.
4. Portfolios are useful for showcasing periodic or end-of-year accomplishments of students, such
as poetry, reflections on growth, and samples of best works.
5. Portfolios may also be used to facilitate communication between teachers and parents regarding
their child’s achievements and progress over a certain period of time.
6. Administrators may use portfolios for national competency testing, to grant high school
credit, and to evaluate educational programs.
7. Portfolios may be assembled for a combination of purposes, such as instructional enhancement and
progress documentation. A teacher may review students’ portfolios periodically and make notes
for revising instruction for the next year.

According to Mueller (2010), there are seven steps in developing students’ portfolios. Below is
a discussion of each step.
1. Purpose: What is the purpose(s) of the portfolio?
2. Audience: For what audience(s) will the portfolio be created?
3. Content: What samples of students work will be included?
4. Process: What processes (e.g., selection of work to be included, reflection on work, conferencing)
will be engaged in during the development of the portfolio?
5. Management: How will time and materials be managed in the development of the portfolio?
6. Communication: How and when will the portfolio be shared with pertinent audiences?
7. Evaluation: If the portfolio is to be used for evaluation, when and how should it be evaluated?

Guidelines for Assessing Portfolios

1. Include enough documents (items) on which to base judgment.
2. Structure the contents to provide scorable information.
3. Develop judging criteria and a scoring scheme for raters to use in assessing the portfolios.
4. Use observation instruments such as checklists and rating scales when possible to facilitate scoring.
5. Use trained evaluators or assessors.

Direction: Encircle the letter of the best answer that makes each statement true.

1. Teacher Adrian will construct an achievement test. Which of the following should he
accomplish first?
A. Construct relevant test items.
B. Prepare table of specification.
C. Determine the number of items to be constructed.
D. Identify the intended learning outcomes.

Rationalization: D- The first step in constructing test items is to identify the intended learning
outcomes by going back to the instructional objectives.

Situation A (For 2-3)

Direction: Column A describes events associated with U.S. presidents, inventors, and civil rights leaders.
Indicate which name in Column B matches each event by placing the appropriate letter to the left
of the number in Column A. Each name may be used only once.

Column A Column B
1. President of the 20th Century A. Lincoln
2. Invented the telephone B. Nixon
3. Delivered the Emancipation Proclamation C. Whitney
4. Recent president to resign from office D. Ford
5. Civil rights leader E. Bell
6. Invented the cotton gin F. King
7. Our first president G. Washington
8. Only president elected for more than two terms H. Roosevelt

2. Which guideline for writing matching-type items was NOT FOLLOWED?

A. The test items are very difficult.        C. The test items are NOT valid.
B. It consists of fewer than ten items.      D. The test items are NOT homogeneous.

Rationalization: D- The descriptions and options are not homogeneous; they consist of three
groups: names of presidents, names of inventors, and names of civil rights leaders. In constructing
a matching type of test, the descriptions and options must be homogeneous.

3. Using the data in situation A, how would you improve the options to avoid ambiguity?
A. Arrange the options in alphabetical order.
B. Add two more options to avoid guessing.
C. Write the complete names in the options.
D. Remove two options to have valid options.

Rationalization: C- To avoid ambiguity, write the complete names of the persons in the options.

4. Which of the following objectives is at the highest level of Bloom’s taxonomy?


A. Identifies the meaning of item.
B. Identifies the order of the given events.
C. Interprets the meaning of an idea.
D. Improves defective test items.

Rationalization: D- Improving defective test items is an example of the application level, while
options A and B are at the knowledge level and option C is an example of the comprehension level.

5. Which statement/s is/are true in constructing a matching type of test?

I. The options and descriptions are not necessarily homogeneous.
II. The options must be greater in number than the descriptions.
III. The directions must state the basis of matching.
IV. Descriptions belong in Column A and options in Column B.
A. I, II and III            C. I, II and IV
B. II, III and IV           D. I, II, III and IV

Rationalization: B- Statements II, III and IV are guidelines in constructing a matching type of
test, while statement I violates the guidelines of matching-type test construction.

6. Which of the following statements are characteristics of an imperfect type of matching test?
I. The minimum number of items is three.
II. The item has no possible answer.
III. There are more options than descriptions.
IV. Items are not necessarily homogeneous.
A. I, II and IV
B. I, II and III
C. II, III and IV
D. II and IV only

Rationalization: D- Statements II and IV are violations of the guidelines for constructing a matching
type of test, while statements I and III are among the guidelines for constructing matching test items.

7. Which statement best describes the limitation of true or false type of test?
A. Useful for outcomes with two possible alternatives.
B. Scoring is easy, objective and reliable.
C. Can measure complex outcomes.
D. Scores are more influenced by guessing.

Rationalization: D- Scores are more influenced by guessing because there is a 50% probability of
guessing the correct answer, since there are only two alternatives.

8. Which of the following should be AVOIDED in constructing true or false test?


I. Verbal clues and specific determiner.
II. Terms denoting definite degree or amount.
III. Taking statements directly from the book.
IV. Keep true and false statement the same in length.
A. I and III only
B. I, II and III
C. I, II and IV
D. II and IV only

Rationalization: A- In constructing true or false test items, you should avoid verbal clues and
specific determiners, and avoid taking statements directly from the book. Also avoid terms denoting
indefinite degree or amount, placing items in a systematic order, the use of “always” or “never,”
and the use of negatives in stating the item.

9. Which of the following test items can most effectively measure higher-order cognitive
learning objectives?
A. Objective test
B. Achievement test
C. Completion test
D. Extended essay test

Rationalization: D- An extended essay test is the best way to measure higher-order cognitive
learning objectives. It requires students to plan their own answers and express them in their own
words. Students have the freedom to express their individuality in the answers given and to present
more realistic answers.

10. Which statements best describe a short-answer test item?


I. It is easy to write test items.
II. Broad range of knowledge outcomes can be measured.
III. Adaptable in measuring complex learning outcomes.
IV. Scoring is NOT tedious and time consuming.
A. I, II and III
B. I and II only
C. II and IV only
D. II, III and IV

Rationalization: B- A short-answer item is easy to write and can measure a broad range of
learning outcomes. It is not adaptable to measuring complex learning outcomes, and it is tedious and
time-consuming to score.

SITUATION B. The data in the table below are the results of tests administered in four
subjects taken by Ritz Glenn. Using the data, answer questions 11-15.

Subject Ritz’s Score Mean Standard Deviation


English 88 85 3.5
Mathematics 95 97 5
Music 90 98 6.5
PE 94 91 4

11. In which subject did Ritz Glenn perform best in relation to the performance of the
group?
A. English
B. Music
C. Mathematics
D. PE

Rationalization: A- Compute the z-score for each subject: z-score in English = 0.86, z-score in
Mathematics = -0.40, z-score in Music = -1.23, z-score in PE = 0.75. The highest z-score is in
English; hence Ritz performed best in English.
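The z-scores in the rationalization above can be reproduced with a short sketch, using z = (score − mean) / SD:

```python
def z_score(score, mean, sd):
    """Standing of a score relative to the group, in standard deviation units."""
    return (score - mean) / sd

# (Ritz's score, group mean, standard deviation) from Situation B
subjects = {
    "English":     (88, 85, 3.5),
    "Mathematics": (95, 97, 5),
    "Music":       (90, 98, 6.5),
    "PE":          (94, 91, 4),
}

z = {name: round(z_score(*v), 2) for name, v in subjects.items()}
print(z)                  # English 0.86, Mathematics -0.4, Music -1.23, PE 0.75
print(max(z, key=z.get))  # English
```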

12. What type of learner is Ritz?


A. Bodily Kinesthetic
B. Logical
C. Musical
D. Linguistic

Rationalization: D- Ritz Glenn performed best in English, hence he is a linguistic type of learner.

13. In which subject did Ritz Glenn perform most poorly in relation to the group performance?
A. English C. Mathematics
B. Music D.PE

Rationalization: B- Ritz Glenn performed most poorly in Music, since the z-score = -1.23, which is
the lowest among the four subjects.
14. In which subject are the scores most dispersed?
A. English C. Mathematics
B. Music D. PE

Rationalization: B- The scores in Music are the most dispersed. Compute the CV (coefficient of
variation) of each subject: the larger the CV, the more dispersed the scores; the smaller the CV,
the less dispersed the scores. CV of English = 4.12%, CV of Music = 6.63%, CV of Mathematics =
5.15%, CV of PE = 4.40%.
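The coefficients of variation cited above can be reproduced with this sketch, using CV = SD / mean × 100%:

```python
def coefficient_of_variation(sd, mean):
    """Relative dispersion as a percentage, comparable across subjects."""
    return sd / mean * 100

# (standard deviation, group mean) from Situation B
data = {"English": (3.5, 85), "Mathematics": (5, 97), "Music": (6.5, 98), "PE": (4, 91)}

cv = {name: round(coefficient_of_variation(*v), 2) for name, v in data.items()}
# English 4.12, Mathematics 5.15, Music 6.63, PE 4.40 (in percent)
print(max(cv, key=cv.get))  # Music
```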

15. In which subject are the scores least dispersed?


A. English C. Mathematics
B. Music D. PE

Rationalization: A- The scores in English are the least dispersed; the CV = 4.12%.

16. Which statement best describes a normal distribution?


A. Only few got average scores.
B. The mean and the median are equal.
C. Negatively skewed distribution.
D. Most of the scores lies at one end.

Rationalization: B- The mean and median are equal when the scores are normally distributed.

17. Standard deviation is to measures of variation as _____ is to measures of central tendency.


A. Quartile deviation C. Range
B. Mean deviation D. Mode

Rationalization: D- The mode is a measure of central tendency; the other two measures are the
mean and the median.

18. What type of measure of variation is easily affected by extreme scores?
A. Range C. Inter- quartile range
B. Mean D. Standard deviation

Rationalization: A- The range is the measure of variation most easily affected by extreme scores,
because a change in either the highest score or the lowest score changes the value of the range.

19. Which measure/s of central tendency is/are easily affected by extreme scores?
A. Median C. Mode
B. Mean D. Mean and Median

Rationalization: B- The mean is easily affected by extreme scores. When the lowest score becomes
lower, the mean value is pulled down, and when the highest score becomes higher, the mean
value is pulled up. Hence, a change in either the lowest or the highest score causes a change in the
mean value.

20. Adrian’s scores in Statistics quizzes are as follows: 96, 90, 85, 89, 65, 99, 84, 82. What is the
mean value?
A. 83.25 C. 85.25
B. 84.25 D. 86.25
Rationalization: D- The sum of all the scores is 690; divided by 8, this equals 86.25.
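The arithmetic in this rationalization can be verified directly:

```python
from statistics import mean

quizzes = [96, 90, 85, 89, 65, 99, 84, 82]
print(sum(quizzes))   # 690
print(mean(quizzes))  # 86.25
```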

21. Given the following scores: 88, 83, 89, 78, 89, 85, 85, 89, 75, 90, 95, and 95. Which
characteristic best describes the distribution?
A. Normally distributed C. Bimodal
B. Unimodal D. Multi-modal

Rationalization: B- The distribution is unimodal because there is only one mode, 89, which
appears three times.

22. A type of error committed in grading the performance of the students by the rater who
avoids both extremes of the scale and tends to rate everyone as average.
A. generosity error C. logical error
B. severity error D. central tendency error

Rationalization: D- A central tendency error is committed when the rater tends to give
average scores, avoiding scores at the lower end or upper end of the scale.

23. What error is committed by the rater if he overrates the performance of the student/s?
A. generosity error C. logical error
B. severity error D. central tendency error

Rationalization: A- A generosity error is committed when the rater tends to use only the high end of
the scale, or tends to give high grades to the students’ performance.

24. What error is committed by the rater if the lower end of the scale is favored?
A. generosity error C. logical error
B. severity error D. central tendency error

Rationalization: B- A severity error is committed when the rater favors the lower end of the scale
and tends to give low ratings to the students’ performance.

25. Which of the following assessment techniques best assesses the objective “plans and designs
an experiment to be performed”?
A. Paper and pencil test C. Checklist
B. Rating scale D. Essay

Rationalization: C- A checklist is the best assessment technique for the objective “plans
and designs an experiment to be performed.” A checklist is useful in assessing performances
where the activities are best assessed by observation rather than by testing.

26. Which measure of variation is the most stable?


A. Range C. Quartile deviation
B. Inter- quartile range D. Standard deviation

Rationalization: D- The standard deviation is the most stable measure of variation because all the
scores are utilized in computing its value. The range utilizes only the highest
score and the lowest score, while the inter-quartile range and the quartile deviation utilize only the middle
50% of the distribution.
27. Teacher Leila conducted a short quiz to get feedback on the learning progress of the
learners after discussing the lesson on “multiplication of rational expressions”. This type
of assessment is classified as a:
A. Placement Assessment C. Diagnostic assessment
B. Formative Assessment D. Summative Assessment

Rationalization: B- A formative assessment is a type of assessment given during or after
discussing a certain lesson and is not used for grading purposes. The main purpose of this
test is to get feedback regarding the learning progress of the learners.

28. Which of the following is NOT included in constructing a table of specifications?
A. Decide on the content areas to be included.
B. Decide on the number of test item per content.
C. Decide the skills to measure in each content.
D. Decide on the number of answer sheets needed.

Rationalization: D- Deciding on the number of answer sheets needed for the test is not part of
constructing a table of specifications. Options A, B, and C are all needed in constructing a table of
specifications.

29. Teacher Gina is talking about “grading on the curve” in a teachers’ assembly. What type
of grading system is she referring to?
A. Cumulative method of grading
B. Norm-referenced grading
C. Criterion-referenced grading
D. Combination of B and C

Rationalization: B- Norm-referenced grading means grading on the curve.

30. The computed value of r = 0.95 between Mathematics and English scores. What does this imply?
A. The Mathematics score is not related to the English score.
B. The English score is moderately related to the Mathematics score.
C. The Mathematics score is highly positively related to the English score.
D. The English score is not in any way related to the Mathematics score.

Rationalization: C- A correlation coefficient of r = 0.95 indicates a very high positive correlation,
which means that the Mathematics score is highly positively related to the English score.
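The verbal label for r = 0.95 can be sketched as a simple lookup; note that the cutoffs below are one common textbook convention, not a universal standard (an assumption here):

```python
def interpret_r(r):
    """Map a correlation coefficient to a verbal label (assumed cutoffs)."""
    a = abs(r)
    if a >= 0.90:
        label = "very high"
    elif a >= 0.70:
        label = "high"
    elif a >= 0.40:
        label = "moderate"
    elif a >= 0.20:
        label = "low"
    else:
        label = "negligible"
    sign = "positive" if r >= 0 else "negative"
    return f"{label} {sign} correlation"

print(interpret_r(0.95))  # very high positive correlation
```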

31. Teacher Jean will conduct a test “to measure her students’ ability to organize thoughts
and present original ideas.” Which type of test is most appropriate?
A. Modified true-false test item C. Short answer test
B. Completion type of test D. Essay test

Rationalization: D- The most appropriate way to measure students’ ability to organize thoughts
and ideas is through the use of an essay test.

32. Teacher Hyacinth conducted a 25-item test in Algebra. The students’ scores were as follows:
20, 12, 13, 14, 15, 14, 20, 22, 20, 22, 23, 23, 24, 25, 25. Which measure/s of central tendency
does the score 20 represent?
A. Mode only C. Median and Mode
B. Mean D. Mean, Median and Mode
Rationalization: C- The score 20 represents the median (the middle score when the 15 scores are
ordered) and the mode (it appears most often); the mean is 292/15 ≈ 19.47, not exactly 20.

33. What type of multiple-choice test is this?

5 : 15 as 4 : _____.    A. 12    B. 15    C. 16    D. 18

A. Completion type
B. Analogy
C. Solving problem
D. Short answer test

Rationalization: B- This type of multiple-choice test is an example of analogy.

34. Teacher Jay constructed a matching type of test in which the descriptions in Column A are a
combination of current issues, government agencies, dates of events, and government
officials. Which guideline for constructing a matching type of test was NOT FOLLOWED?
A. Arrange the descriptions in alphabetical order.
B. Make the descriptions equal in length.
C. Make the descriptions homogeneous.
D. Make the descriptions heterogeneous.

Rationalization: C- The descriptions consist of current issues, government agencies, dates of
events, and government officials; hence the guideline regarding homogeneity was not followed.

35. Which statement is true about the normal curve?


A. The scores are concentrated at the left side of the distribution.
B. The scores are concentrated at the right side of the distribution.
C. There are more high scores than low scores.
D. The value of the mean is equal to the value of the median.

Rationalization: D- in a normal curve, the value of the mean is equal to the value of the median
because the scores are normally distributed.

36. Which characteristic best describes the given score distribution? The scores are: 22, 23,
24, 24, 24, 25, 26, 26, 35, 36, 37, 38, 39, 39, 39, 40, 40, 45.
A. Multi-modal C. Normally distributed
B. Bimodal D. skewed to the left

Rationalization: B- the score distribution is bimodal because there are two modes. The modes
are 24 and 39.

37. Which is true when the standard deviation is small?


A. The scores are concentrated near the mean value.
B. The scores are spread apart within the mean value.
C. The scores concentrated at the right end of the distribution.
D. The scores are concentrated at the left end of the distribution.

Rationalization: A- when the value of the standard deviation is small, on the average the scores
are closer to the mean.

38. All of the given statements are best practices of preparing multiple-choice test items
EXCEPT:
A. Stem should be stated in positive form.
B. Use the stem that could serve as a short-answer item.
C. Underline words or phrases in the stem to give emphasis.
D. Shorten the stem so that options can be written longer.

Rationalization: D- shortening the stem and lengthening the options violates the guidelines for
constructing multiple-choice tests. Options A, B, and C are accepted guidelines in constructing
multiple-choice test items.

39. Which of the following statements refers to criterion-referenced interpretation?


A. Ritz got the highest score in Mathematics.
B. Luis computed the problem solving faster than his classmates.
C. Vinci set up his laboratory equipment in 2 minutes.
D. Lovely’s test score is higher than 95% of the class.

Rationalization: C- "Vinci set up his laboratory equipment in 2 minutes" is an example of
criterion-referenced interpretation, which means that the performance of the learner is
compared with a certain standard.

40. Which of the following is an example of norm-referenced interpretation?


A. Lord’s test score is higher than 89% of the class.
B. Vinci set up his laboratory equipment in 2 minutes.
C. Harold must spell 25 words correctly out of 30 words.
D. Mark solves 5 problems correctly in 30 minutes.

Rationalization: A- "Lord's test score is higher than 89% of the class" is an example of norm-
referenced interpretation, which means that the performance of the learner is compared with
the performance of others in the class.

41. Which of the following learning outcomes is the most difficult to assess objectively?

1. A concept 3. An appreciation
2. An application 4. None of the above

How can you improve the test item?


A. Rewrite the stem to statement form
B. Remove the indefinite articles “a” and “an” from the options.
C. Change the option “none of the above” with an interpretation.
D. Change the numbers in the options to letters.

Rationalization: C- replace the option "none of the above" with "an interpretation". Avoid the
option "none of the above" when asking for the best answer.

42. What is the main advantage of using a table of specification (TOS) when constructing a
periodic test?
A. It increases the reliability of the test result.
B. It reduces the scoring time.
C. It makes test construction easier.
D. It improves the sampling of content areas.
Rationalization: D- using a table of specification in constructing test items improves the
sampling of the content of the entire test. A test built from a table of specification can also be
assured of content validity.

43. The main objective of testing in teaching is:


A. To assess students' learning and the effectiveness of instruction.
B. To assess the effectiveness of teaching methods used.
C. To evaluate the instructional materials used.
D. To evaluate the performance of the teacher in that particular lesson.

Rationalization: A- the main purpose of testing in teaching is to evaluate the learning progress
of the students and assess whether the instruction was effective or not.

44. Instructional objectives are most useful in test construction when they are stated in
terms of:
A. Teacher activities. C. General terms.
B. Learning activities. D. Student performance.

Rationalization: D- instructional objectives must be stated in terms of student performance.

45. Which of the following is the main reason why negative words should be avoided in
constructing multiple-choice tests?
A. Increase the difficulty of the test item.
B. More difficult to construct options.
C. Might be overlooked.
D. Stems tend to be longer.

Rationalization: C- avoid negative words in constructing multiple-choice tests because they
might be overlooked by the test takers. If a negative word cannot be avoided, set it in bold or
capital letters for emphasis.

46. Obtaining a dependable ranking of students is a major concern when using:


A. Teacher-made diagnostic test.
B. Norm-reference summative test.
C. Criterion-reference formative test.
D. Mastery achievement test.

Rationalization: B- norm-referenced testing compares the performance of a student with that
of other students. Ranking students is an example of comparing the performance of students
against one another.

47. Which of the following statements is an advantage of multiple-choice test items over
essay questions?
A. Provide assessment of more complex learning outcomes.
B. It emphasizes the low level of learning outcomes.
C. Provide more extensive sampling of the content area.
D. Requires more time in preparing the test items.

Rationalization: C- a multiple-choice test item measures a broader sampling of content areas. It
is widely used for measuring knowledge outcomes and various types of complex learning
outcomes.
48. When a test is lengthened, the reliability is likely to___________________?
A. Increase C. not determined
B. Decrease D. both A and B

Rationalization: A- to increase the reliability of a test, the teacher should give a sufficient
number of test items.

49. All of the following are used in interpreting norm-referenced scores EXCEPT:
A. Percentile rank C. Grade Equivalent scores
B. Standard scores D. raw scores

Rationalization: D- a raw score is not an example of a norm-referenced score. Examples of
norm-referenced scores are percentile ranks, standard scores, and grade-equivalent scores.

50. Which of the following statements describe performance based assessment?


I. Evaluate complex learning outcomes and skills.
II. Encourages the application of learning to “real life” situation.
III. Measure broad range of contents.
A. I only C. III only
B. I and II D. I, II and III

Rationalization: B- performance-based assessments evaluate complex learning outcomes and
encourage the application of learning to real-life situations.

51. Teacher Adrian conducted item analysis and found out that more students from the
lower group got test item number 6 correctly. This means that the test item_________.
A. Has a low reliability
B. Has a high validity
C. Has a positive discriminating power
D. Has a negative discriminating power

Rationalization: D- negative discrimination means that more students from the lower group got
an item correctly than students from the upper group.

52. Which is implied by a positively skewed score distribution?


A. Most of the scores are below the mean value.
B. Most of the scores are above the mean value.
C. The mean is less than the median.
D. The mean, the median and the mode are equal.

Rationalization: A- positively skewed distribution means most of the scores are low and below
the mean value.

53. Most of the students who took the examination got scores above the mean. What is the
graphical representation of the score distribution?
A. Skewed to the left C. Scores are normally distributed
B. Skewed to the right D. positively skewed

Rationalization: A- skewed to the left means that most of the scores are above the mean. Hence
the graphical representation of the scores is skewed to the left.
54. Which statement best describes a negatively skewed score distribution?
A. The value of mean and median are equal.
B. Most examinees got scores above the mean.
C. The value of mode corresponds to a low score.
D. The value of median is higher than the value of mode.

Rationalization: B- a negatively skewed distribution describes a good performance of the
students. The students performed well because most of the examinees got scores above the
mean.

55. In a normal distribution, a T-score of 80 is_____________.


A. Two SD’s below the mean C. three SD’s below the mean
B. Two SD’s above the mean D. three SD’s above the mean

Rationalization: D- using the formula T-score = 10z + 50, solve for z by substitution. The value
of z = 3, which means the score lies three standard deviations above the mean.
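The substitution in the rationalization can be checked with a short sketch (the function name is illustrative, not from the original item):

```python
# T-score formula used above: T = 10z + 50, where z is the number of
# standard deviations a score lies from the mean.
def z_from_t(t_score):
    """Solve T = 10z + 50 for z."""
    return (t_score - 50) / 10

print(z_from_t(80))   # 3.0 -> three SDs above the mean
```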

56. The distribution of a class with academically poor students is more likely______.
A. Normally distributed C. skewed to the right
B. Skewed to the left D. leptokurtic

Rationalization: C- skewed to the right means most of the students got scores below the mean,
which means that the examinees performed poorly or most of the scores are low.

57. Teacher Paul conducted item analysis and he found out that significantly greater number
from the upper group of the class got test item number 10 correctly. This means that the
test item_____________.
A. Has a negative discriminating power
B. Has a positive discriminating power
C. Has low reliability
D. Has high validity

Rationalization: B- positive discriminating power means that more students in the upper group
got the item correctly.

58. Mary Anne obtained a NAT percentile rank of 93. This implies that________.
A. She surpassed in performance 7% of the group.
B. She got a score of 93.
C. She answered 93 items correctly.
D. She surpassed in performance 93% of her fellow examinees.

Rationalization: D- percentile rank of 93 means that 93% of the examinees got a score below an
indicated score. Thus, Mary Anne surpassed in performance 93% of those who took the
examination.

59. Which instructional objective below is the highest level of Bloom’s Taxonomy?
A. Define fraction
B. Explain the different rules of addition of fractions
C. Add fractions correctly
D. Determine the steps in solving fractions
Rationalization: C- "Add fractions correctly" is an instructional objective under application.
Hence, among the choices, it is at the highest level of Bloom's Taxonomy.

60. Under which assumption is portfolio assessment based?


A. Assessment should stress the reproduction of knowledge.
B. Portfolio assessment is dynamic assessment.
C. An individual learner is inadequately characterized by a test score.
D. An individual learner is adequately characterized by a test score.

Rationalization: B- portfolio assessment is a process of gathering multiple indicators of student
progress to support course objectives in a dynamic, ongoing, and collaborative process.

61. Which of the following statements best describes the analysis of the incorrect options in item analysis?
A. Determining the percentage equivalent of the cut off score.
B. Determining the highest score
C. Determining the effectiveness of distracters
D. Determining the cut-off score

Rationalization: C- determining the effectiveness of distracters. A teacher who conducts item
analysis should determine the difficulty of each item, the discriminating power of each item,
and the effectiveness of the distracters.

62. When points in a scatter diagram are spread evenly in all directions this means that:
A. The correlation between two variables is positive.
B. The correlation between two variables is low.
C. The correlation between variables is high.
D. There is no correlation between two variables.

Rationalization: D- when two variables have no correlation, the points of their scatter diagram
are spread in all directions in the Cartesian coordinate system.

63. Roel’s score in Science test is 89 which is equal to 95th percentile. What does this mean?
A. 95% of Roel’s classmates got scores lower than 89.
B. 95% of Roel’s classmates got scores higher than 89.
C. Roel’s score is less than 89% of his classmates.
D. Roel’s score is higher than 95% of his classmates.

Rationalization: A- the statement can be transformed to P95 = 89. This means that 95% of those
who took the exam in Science got scores lower than Roel's score of 89.

64. Which applies when there are extreme scores?


A. The median is very reliable measure of central tendency.
B. The mean will be very reliable measure of central tendency.
C. There is no reliable measure for central tendency.
D. The mode will be the most reliable measure of central tendency.

Rationalization: A- the median is the appropriate measure of central tendency in the presence of
extreme scores. The value of the mean is highly affected by extreme scores, while the median,
being a positional measure, is not.
65. In a normal distribution, about how many percent of the cases fall between -1SD to
+1SD?
A. 15.73% C. 49.86%
B. 34.13% D. 68.26%

Rationalization: D- the area of -1SD to +1SD of a normal distribution is 68.26%. That is, from
-1SD to mean is 34.13% and from mean to +1SD is also 34.13%. Thus, the sum is 68.26%.
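The strip-area figures above can be checked against the standard normal CDF, written here via the error function. Note the table values 68.26% (this item) and 95.44% (item 93) come from summing the rounded strips 34.13% and 13.59%; computing the CDF directly gives 68.27% and 95.45%:

```python
import math

def phi(x):
    """Standard normal cumulative distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

area_1sd = phi(1) - phi(-1)       # area between -1SD and +1SD (item 65)
area_2sd = phi(2) - phi(-2)       # area between -2SD and +2SD (item 93)
print(round(area_1sd * 100, 2))   # 68.27
print(round(area_2sd * 100, 2))   # 95.45
```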

66. Teacher Kristy gave a chapter test. In which item did her students find the greatest
difficulty? In the item with a difficulty index of _____________.
A. 0.25 C.0.75
B. 0.15 D. 1.00

Rationalization: B- a difficulty index of 0.15 means that only 15% of the students answered the item correctly, so the item is very difficult.

67. Fifty students took a 40-item test in English; their scores are shown below. (Items 67 to 68)
Scores Number of Students
11-15 7
16-20 10
21-25 7
26-30 20
31-35 6

In which interval does the median value lie?

A. In the interval 16-20

B. In the interval 26-30

C. Between the interval 16-20 and 21-25

D. In the interval 21-25

Rationalization: B- the median (the point below which 50% of the scores lie) falls in the interval 26-30, since the cumulative frequency reaches only 24 at the interval 21-25 and 44 at the interval 26-30.

68. Based on the data in item number 67, how many percent of the scores are lower than 21?
A. 14% C. 20%
B. 17% D. 34%

Rationalization: D- 34% of the scores are lower than 21: (17/50) × 100% = 34%.
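The cumulative-frequency reasoning behind items 67 and 68 can be sketched as follows (a minimal check, assuming the frequency table given in item 67):

```python
# Frequency table from item 67: (lower limit, upper limit, number of students)
table = [(11, 15, 7), (16, 20, 10), (21, 25, 7), (26, 30, 20), (31, 35, 6)]

n = sum(f for _, _, f in table)        # 50 students in all

# Item 67: locate the class interval that contains the median (the 25th/26th score).
cum = 0
median_interval = None
for lo, hi, f in table:
    cum += f
    if cum >= n / 2:
        median_interval = (lo, hi)
        break
print(median_interval)                 # (26, 30)

# Item 68: percent of scores lower than 21, i.e. the first two intervals.
below_21 = sum(f for lo, _, f in table if lo < 21)
percent_below_21 = below_21 * 100 / n
print(percent_below_21)                # 34.0
```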
69. Teacher Lawrence gave a test in Mathematics. The facility of item No. 10 is 75%. The best
way to describe item No. 10 is _______.
A. very easy C. average item
B. easy item D. difficult item

Rationalization: B- 75% of the students got item number 10 correctly. The level of difficulty is easy.

70. At the end of the school year, all third-year students presented their portfolios in English.
Students, teachers, and other stakeholders were asked to view them and give their
comments regarding what was viewed. Which authentic assessment was organized?
A. Exhibits C. Conference
B. Program D. Seminar

Rationalization: C- a conference is a form of meeting or symposium for the public discussion of
some topic, especially one in which the participants form an audience and make presentations
and assessments regarding what was viewed.

71. The point of departure of an inter-quartile range, which indicates the spread of the scores,
is_________.
A. Upper limit C. mean
B. Median D. range

Rationalization: B- median. The value of the inter-quartile range represents the dispersion of the
middle 50% of the scores from the median.

72. The admissions office of a certain university conducted a qualifying test for five batches of
examinees. The number of qualifiers and their mean scores are presented below.

Batch number Number of qualifiers Mean score

Batch I 20 94

Batch II 10 85

Batch III 15 92

Batch IV 25 87

Batch V 10 95

What is the mean score of the entire group of qualifiers?

A. 90.44
B. 90.60
C. 5.66
D. 92.00

Rationalization: A- the mean score of the entire qualifiers is 90.44. (7235 divided by 80)
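The combined mean in item 72 is a weighted average of the batch means; a quick check using the batch data copied from the table above:

```python
# (number of qualifiers, mean score) for each batch, from item 72
batches = [(20, 94), (10, 85), (15, 92), (25, 87), (10, 95)]

total_points = sum(n * mean for n, mean in batches)   # 7235
total_qualifiers = sum(n for n, _ in batches)         # 80
overall_mean = total_points / total_qualifiers
print(round(overall_mean, 2))                         # 90.44
```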

73. Joseph's score in Science is 1.5 standard deviations above the mean of his group, and his
score in Mathematics is 2 standard deviations above the mean. What does this mean?
A. He excels both in Science and in Mathematics.
B. He is better in Mathematics than in Science.
C. He is better in Science than in Mathematics.
D. He does not excel in both subjects.
Rationalization: B- the number of standard deviations indicates how far a score lies above or
below the mean. Joseph's score in Mathematics is more standard deviations above the mean
than his score in Science. Hence, Joseph performed better in Mathematics than in Science.

74. The criterion of success in Teacher Ofel's objective is that "the students must be able to get
80% of the test items correctly". Luis and 24 other students in the class answered 20 out
of 25 items correctly. This means that Teacher Ofel________.
A. Attained her lesson objective because of her effective problem solving drills.
B. Did not attain her lesson objective because her students lack of attention.
C. Attained her lesson objective.
D. Did not attain her lesson objective as far as the 25 students are concerned.

Rationalization: C- each of the 25 students answered exactly 80% of the items (20 out of 25)
correctly. Hence, Teacher Ofel attained her objective.

75. The grading system of the Department of Education is averaging. What is the average final
grade of Andie in English for the four grading periods?

English: First grading = 90, Second grading = 88, Third grading = 93, Fourth grading = 95, Final rating = ?

A. 91.75 C. 94.00
B. 92.25 D. 95.00

Rationalization: A- 91.75 (90 + 88 + 93 + 95) /4

76. The grading method which gives weight to the present grade and the previous grade of
the student, such as (1/3)(Third grading grade) + (2/3)(Fourth grading grade) = Final Grade,
is called__________.
A. Averaging
B. Criterion reference
C. Norm reference
D. Cumulative

Rationalization: D- the method of grading that involves both the present grade and the previous
grade in computing the final grade is the cumulative method.
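The weighting in item 76 can be sketched as a one-line function (the sample grades below are hypothetical, chosen only to illustrate the formula):

```python
def cumulative_final_grade(third_grading, fourth_grading):
    """Cumulative method: (1/3)(third grading grade) + (2/3)(fourth grading grade)."""
    return third_grading / 3 + 2 * fourth_grading / 3

# Hypothetical grades, for illustration only.
print(cumulative_final_grade(90, 93))   # 92.0
```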

77. To increase the difficulty of a multiple-choice test item, which of the following should be
done?
A. Make the stem short and clear
B. Make the options homogeneous
C. Make it grammatically correct
D. Make the options equal in length

Rationalization: B- to increase the difficulty of a multiple-choice test item, make the options
belong to the same class; that is, make them homogeneous.

Situation C. (Item Number 78 to 83)


Given below is the table of an item analysis for the non-attractiveness and non-plausibility of
distracters based on the results of a try-out test in English. The option marked with an asterisk is the correct answer.

Item No. 10 A* B C D
Upper 27% 16 3 10 1
Lower 27% 14 6 8 2

78. Based on the table, which group got more correct answers?
A. Lower group C. can’t be determined
B. Upper group D. either lower group or upper group

Rationalization: B- the upper group got more correct answers (16 versus 14).

79. The table shows that the item analyzed has _____________.
A. Positive discriminating power
B. Negative discriminating power
C. High validity index
D. High reliability index

Rationalization: A- more students from the upper group got the item correctly than from the
lower group. Hence the item has a positive discriminating power.

80. Based on the table in Situation C, which is the most effective distracter?
A. Option A. C. Option C
B. Option B D. option D

Rationalization: B- option B is the most effective distracter since more students from the lower
group chose this incorrect option.

81. Based on the table in Situation C, which distracter should be revised?


A. Option A. C. Option C

B. Option B D. option D

Rationalization: C- option C should be revised because more students from the upper group
chose this incorrect option. Hence it is not effective.

82. What is the level of difficulty of item number 10 in Situation C?


A. Very easy C. moderately difficult
B. Easy D. difficult

Rationalization: C- (30/60) x 100% = 50%. The difficulty index is 50%. Therefore the level of
difficulty is moderately difficult.

83. What is the discriminating index of item number 10 in Situation C?


A. 3% C. 7%
B. 6% D. 50%

Rationalization: C- the discriminating index is approximately 7%: [(16 − 14)/30] × 100% ≈ 6.67%, or about 7%.
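The computations in items 82 and 83 can be sketched from the Situation C table (variable names are illustrative):

```python
# Situation C, Item No. 10: 16 of the upper group and 14 of the lower group
# answered correctly; each 27% group has 30 students (16+3+10+1 = 14+6+8+2 = 30).
upper_correct, lower_correct, group_size = 16, 14, 30

# Difficulty index: proportion of all examinees who answered correctly (item 82).
difficulty = (upper_correct + lower_correct) / (2 * group_size)
print(difficulty * 100)                # 50.0 -> moderately difficult

# Discrimination index: difference in proportions between the groups (item 83).
discrimination = (upper_correct - lower_correct) / group_size
print(round(discrimination * 100))     # 7 (exactly 6.67%, rounded)
```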

84. Which statement about performance-based assessment is FALSE?


A. They emphasize process as well as product.
B. They also stress doing, not only knowing.
C. Essay tests are examples of performance-based assessment.
D. They emphasize only the process.

Rationalization: D- performance-based assessment emphasizes both process and product.
Hence the statement that it emphasizes only the process is false.

85. Teacher Ritz wrote of Michael: "When Michael came to class this morning, he seemed
very tired and slouched into his seat. He took no part in the class discussion and seemed
to have no interest in what was being discussed. This was very unusual, for he has been
eager to participate and often monopolizes the class discussion." What Teacher Ritz wrote
is an example of a/an_______.
A. Anecdotal report C. personality report
B. Observation report D. incidence report

Rationalization: A- anecdotal reports are notes written by the teacher regarding incidents in the
classroom that might need special attention in the future.

86. Assessment is said to be authentic when the teacher__________.


A. Considers students’ suggestions in testing
B. Gives valid and reliable paper-and-pencil test
C. Gives students real-life tasks to accomplish
D. Includes parents in the determination of assessment procedures.

Rationalization: C- authentic assessment is assessment that applies real-life situations.

87. If Teacher Jerick Ivan wants to test his students' synthesizing skills, which of the following
has the highest diagnostic value?
A. Completion test
B. Performance test
C. Essay test
D. Multiple-choice test

Rationalization: C- essay test is the most appropriate tool to measure the synthesizing skills of
the students.

88. Which of the following statements is true about marking on a normative basis?


A. The normal distribution curve should be followed
B. Most of the students got low scores.
C. Most of the students got high scores.
D. The grading should be based on the given criteria.

Rationalization: A- marking on a normative basis follows the normal distribution curve.

89. The discriminating index of item number 15 is 0.44. This means that__________.
A. More students from the upper group got the item correctly.
B. More students from the lower group got the item correctly.
C. Equal number of students got the correct answer from the upper and lower
group.
D. The test item is very easy.
Rationalization: A- a discriminating index of 0.44 means that the item is very discriminating and
that more students from the upper group got the item correctly.

90. The difficulty index of item 20 is 0.55 and the discrimination index is 0.33. What should
the teacher do with this item?
A. Reject the item C. revise the item
B. Retain the item D. make the item bonus

Rationalization: B- the difficulty level is moderately difficult and the discriminating level is
discriminating. Therefore, the item should be retained.

91. The discriminating index of item number 1 is -0.15. This means that_______.
A. More students from the upper group got the item correctly.
B. More students from the lower group got the item correctly.
C. Equal number of students got the correct answer from the upper and lower
group.
D. The test item is very difficult.

Rationalization: B- the discrimination index of -0.15 is negative; hence more students from
the lower group got the item correctly.

92. The score distributions of Set A and Set B have equal means but different SDs. Set A
has an SD of 2.75 while Set B has an SD of 3.25. Which statement is TRUE of the score
distributions?
A. Majority of the scores in set B are clustered around the mean.
B. Majority of the scores in set A are clustered around the mean than in set B.
C. Scores in set A are more widely scattered.
D. The scores of set B has less variability than the scores in set A.

Rationalization: B- the SD of Set A is 2.75, the SD of Set B is 3.25, and the means are equal.
Therefore, the scores in Set A are more clustered around the mean than in Set B. The smaller
the value of the SD, the closer, on the average, the scores are to the mean.

93. About how many percent of the cases fall between -2SD and +2SD in the normal curve?
A. 99.72 C. 68.26
B. 95.44 D. 34.13

Rationalization: B- the area of the normal curve from -2SD to +2SD is approximately 95.44%.
Illustration, from mean to 1SD is 34.13%; from mean to -1SD is also 34.13%. From 1SD to 2SD is
13.59%; from -1SD to -2SD is 13.59%. Thus, the area under the normal curve is 34.13% + 34.13% +
13.59% + 13.59% = 95.44%.

94. In research, analysis of variance utilizing the F-test is the appropriate significance test
for comparing:
A. Frequency
B. Median
C. Two means only
D. Three or more means

Rationalization: D- ANOVA using the F-test is a statistical tool that is used to test the significant
difference among three or more means.
95. A skewed score distribution means:
A. The scores are normally distributed.
B. The mean and the median are equal.
C. The mode, the mean, and the median are equal.
D. The scores are concentrated more at one end or the other end and the
distribution.

Rationalization: D- when the scores are concentrated at the left part of the distribution, it is
skewed to the right; when the scores are concentrated at the right part of the distribution, it is
skewed to the left. Options A, B, and C deal with the normal distribution.

96. Which of the following describes a norm-referenced statement?


A. Gabby performed better in spelling than 88% of his classmates.
B. Gabby was able to spell 94% of the words correctly.
C. Gabby was able to spell 94% of the words correctly and spelled 38% words out
of 50 correctly.
D. Gabby spelled 38 words out of 50 correctly.

Rationalization: A- "Gabby performed better in spelling than 88% of his classmates" means that
Gabby's performance was compared with the performance of his classmates. Norm-referenced
interpretation compares the performance of a student with the performance of other students.

97. What type of validity is needed when you test course objectives and scopes?
A. Construct C. concurrent
B. Criterion D. content

Rationalization: D- content validity is a type of validation which refers to the relationship
between a test and the instructional objectives. It establishes that the test measures what it is
supposed to measure.

98. Teacher Anne gives an achievement test to her 30 students. The test consists of 25 items.
She wants to compare her students' performance based on the test results. What is the
appropriate measure of position?
A. Percentage C. Z-score
B. Percentile rank D. standard nine

Rationalization: C- to compare the individual performance of the students, the Z-score is
utilized. It tells the number of standard deviations a score lies below or above the mean; the
higher the value of the Z-score, the better the performance of the student.
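The Z-score comparison described above, and the logic behind item 73, can be sketched as follows (the class means and SDs below are hypothetical, used only to illustrate the formula):

```python
def z_score(raw, mean, sd):
    """Number of SDs a raw score lies above (+) or below (-) the group mean."""
    return (raw - mean) / sd

# Hypothetical class statistics, for illustration only.
math_z = z_score(raw=22, mean=18, sd=2)      # 2.0 SDs above the mean
science_z = z_score(raw=20, mean=17, sd=2)   # 1.5 SDs above the mean
print(math_z > science_z)                    # the higher z marks the better relative standing
```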

99. Teacher V gives a 100-item multiple-choice test. Three students make scores of 94, 89, and
75, respectively, while the other 27 students in the class make scores ranging from 33 to
67. The measure of central tendency which best describes this group of 30 students
is:
A. Mean and median C. Mode
B. Mean D. Median

Rationalization: B- the mean is the most appropriate measure to describe the performance of the
entire group because it utilizes all the scores of the students. Thus, using the mean, you can
best describe the group's performance.
100. If a teacher gets the difference between the highest score and the lowest score, he
obtains the_____________.
A. Range C. standard deviation
B. Standard deviation D. Index of difficulty

Rationalization: A- the range is the difference between the highest score and the lowest score:
Range = Highest Score − Lowest Score.
