Assessment in Education Notes
IUM 2017
Ongwediva Campus
Define the term assessment
Assessment is the process of gathering and discussing information from multiple and
diverse sources in order to develop a deep understanding of what students know,
understand, and can do with their knowledge as a result of their educational
experiences; the process culminates when assessment results are used to improve
subsequent learning. (Learner-Centered Assessment on College Campuses: Shifting the Focus from Teaching to Learning by Huba and Freed 2000)
Assessment is the systematic basis for making inferences about the learning and
development of students. It is the process of defining, selecting, designing, collecting,
analyzing, interpreting, and using information to increase students’ learning and
development. (Assessing Student Learning and Development: A Guide to the
Principles, Goals, and Methods of Determining College Outcomes by Erwin 1991)
Assessment is the systematic collection, review, and use of information about
educational programs undertaken for the purpose of improving student learning and
development. (Assessment Essentials: Planning, Implementing, and Improving Assessment in Higher Education by Palomba and Banta 1999)
Areas of assessment
1.1.1 Cognitive domain
This domain refers to intellectual capability and skills. The six categories of the cognitive domain are knowledge, comprehension, application, analysis, synthesis, and evaluation; they describe the increasing difficulty of the thinking skills expected from learners as the knowledge and content become more demanding. The cognitive domain is the primary learning domain, because thinking skills are required in the other two domains as well.
1.5.1 Informal assessment
An informal assessment is a method of measuring an individual's performance by casually
watching their behaviour or using other informal techniques. Often the learner is unaware that
he/she is being assessed. Informal assessments are not concerned with facts, figures or numbers but with content and performance. This type of assessment seeks to find out what learners know or how well they can perform a certain task, such as reading. Informal assessments are used to inform instruction.
Examples of informal assessment tools: observation, classwork, homework
Examples of formal assessment tools: standardised tests, examinations and diagnostic tests
Types of assessment
1.7.1 Formative assessment
Formative continuous assessment is any assessment made during the school year in order to
improve learning and to help shape and direct the teaching-learning process. Teachers make
frequent, interactive assessments of learner understanding. Formative assessments are an
integral part of the learning process and include informal assessments. Formative assessments are used primarily to determine what learners have learned in order to plan further instruction. This enables teachers to adjust their teaching to meet learner needs and guides them in helping learners reach the expected standards. Findings of formative assessments can be used to identify what changes need to be made to the teaching and learning process.
b. Formative assessment approaches are suitable for obtaining valid, reliable and sufficient
evidence of learner progress.
c. Information is informative and useful to the teacher, parent and learner with respect to
progress and needs.
d. Formative assessments should inform teaching practice by identifying trends and
weaknesses to be addressed with whole groups and individual learners.
e. Identification of trends and weaknesses is consistently accurate and promotes continuous
improvement in assessment.
f. It is used to motivate learners to extend their knowledge and skills, to establish sound values, and to promote healthy habits of study.
g. Assessment tasks help learners to solve problems intelligently by using what they have learned.
Benefits of formative assessment
a. Flexible
Formative assessments do not have a designated time at which to be implemented. This
flexibility allows teachers to tailor their lessons and assessments to the needs of their learners.
b. Easy to Implement
Because of their flexibility, formative assessments are easy to implement. They can be as large or small, and as in-depth or general, as needed.
c. Checks for Understanding
Formative assessment can take many shapes. However, in any form, it is an assessment of
understanding. Implementing many formative assessments as the class moves through the material allows a teacher to catch and address any misconceptions the class or individual learners may have.
d. Informs Curriculum
Teachers can use the results of formative assessments to inform the curriculum and the
delivery of content. A teacher may choose to spend more time on a specific area in which
many learners struggle, or spend less time on an area with which most students are
comfortable.
e. Assesses the teacher
Formative assessments provide opportunities for teachers to evaluate their own performance.
The results of the assessments can reveal weaknesses or strengths in the delivery of
instruction.
1.7.2 Summative assessment
h. Summative assessment decisions are consistent with decisions made about similar
evidence from other learners. Decisions are justified by valid, authentic and sufficient
evidence presented by and about learners
i. Summative assessment results are interpreted fairly and accurately and in line with national
assessment and promotion policies. Interpretations help to assess and promote learning and to
modify instruction in order to encourage the continuous development of learners
j. Results are interpreted in the light of previous results and experience. Interpretations
provide useful insight into learning and foster continuous improvement of practice.
k. Records of the assessment meet the quality requirements of the school.
Benefits of summative assessment
a. Development of a standardized (consistent) set of information about each learner’s
achievement
b. Help in the determination of key learning goals and teaching responsibilities
c. Combine test scores and make educational decisions based on this information
d. Create rationale for large-scale educational decision-making
e. Acknowledgement of a job well done
1.7.3 Self-assessment
Self-assessment refers to the assessment of activities within and outside the classroom that
enable learners to reflect on what they have learnt and to evaluate their learning against a set
of assessment criteria. It describes the process of a learner gaining an understanding of how
he/she learns as opposed to what he/she is learning. It guides the learner to a greater understanding of him-/herself as a learner.
e. It matches pupils' perceptions of understanding with those of teachers – pupils explain their strategies, and in this way the teacher identifies their thinking processes
f. More efficient lessons will allow greater challenge
Disadvantages of self-assessment
a. It increases the teacher's workload, because it takes time for learners to become skilled in self-assessment; while they are learning how self-assessment works, the teacher has to guide them.
b. There is a risk of grades being inflated or unreliable
c. Learners feel ill-equipped to do the assessment or do not have enough confidence in assessing themselves.
1.7.4. Peer assessment
Peer-assessment is nearly the same as self-assessment, except that learners are explicitly
involved in helping each other to identify the standards and criteria, and making judgements
about each other's work in relation to those criteria.
b. Learners have to be involved in the process of making judgements about the extent to which their own work and the work of fellow learners has or has not met the identified standards and/or criteria.
Types of Diagnostic Assessments
Pre-tests (on content and abilities)
Self-assessments (identifying skills and competencies)
Discussion board responses (on content-specific prompts)
Interviews (brief, private, 10-minute interview of each student)
Continuous assessment
• Continuous assessment can take place within various types of contact moments, e.g. practicals, workshops, lectures, placements, projects, cases, etc.
• Continuous assessment is the ongoing assessment of a learner's performance on a course module. The assessment tasks show which developmental process the learner is going through, and the continuous assessment (partially) counts towards the final mark for the course module.
• Continuous assessment often goes hand in hand with information about the assessment criteria, how the learner performed, what went smoothly, what went less smoothly, and the things the learner still has to work on.
Types of test questions
1. Objective questions, which require students to select the correct response from several alternatives or to supply a word or short phrase to answer a question or complete a statement.
2. Subjective or essay questions, which permit the student to organize and present an original answer.
Subjective questions are appropriate when:
Validity: Defined
The term validity has varied meanings depending on the context in which it is being used.
Validity generally refers to how accurately a conclusion, measurement, or concept
corresponds to what is being tested. For this lesson, we will focus on validity in assessments.
Validity is defined as the extent to which an assessment accurately measures what it is
intended to measure. The same can be said for assessments used in the classroom. If an
assessment intends to measure achievement and ability in a particular subject area but then
measures concepts that are completely unrelated, the assessment is not valid.
Types of Validity
There are three types of validity that we should consider: content, predictive, and construct
validity. Content validity refers to the extent to which an assessment represents all facets of
tasks within the domain being assessed. Content validity answers the question: Does the
assessment cover a representative sample of the content that should be assessed?
For example, if you gave your students an end-of-the-year cumulative exam but the test only
covered material presented in the last three weeks of class, the exam would have low content
validity. The entire semester's worth of material would not be represented on the exam.
Educators should strive for high content validity, especially for summative assessment
purposes. Summative assessments are used to determine the knowledge students have gained
during a specific time period.
Content validity is increased when assessments require students to make use of as much of
their classroom learning as possible.
The next type of validity is predictive validity, which refers to the extent to which a score on
an assessment predicts future performance.
Construct validity is used to determine how well a test measures what it is supposed to
measure. In other words, is the test constructed in a way that it successfully tests what it
claims to test?
Construct validity is usually verified by comparing the test to other tests that measure similar
qualities to see how highly correlated the two measures are. For example, one way to
demonstrate the construct validity of a cognitive aptitude test is by correlating the outcomes
on the test to those found on other widely accepted measures of cognitive aptitude.
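To make the correlation idea concrete, the sketch below computes a Pearson correlation between scores on a new test and scores on an established measure; the same calculation underlies predictive validity checks, where a score is correlated with later performance. The data and variable names are invented for illustration and are not taken from these notes.

new_test = [62, 75, 58, 90, 71, 84, 66, 79]      # hypothetical scores on the new test
established = [60, 78, 55, 92, 70, 80, 63, 81]   # hypothetical scores on an accepted measure

def pearson_r(x, y):
    # Pearson correlation coefficient between two equal-length lists of scores
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

print(round(pearson_r(new_test, established), 2))  # a value close to 1 supports construct validity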
Reliability
Reliability is the degree to which an assessment tool produces stable and consistent results.
Types of Reliability
Example: If you wanted to evaluate the reliability of a critical thinking assessment, you
might create a large set of items that all pertain to critical thinking and then randomly
split the questions up into two sets, which would represent the parallel forms.
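As a rough illustration of the split described above, the sketch below scores two halves of a test for each learner, correlates the halves, and applies the Spearman-Brown formula to estimate the reliability of the full-length test. The item responses are invented, and an odd/even split stands in for the random split.

from statistics import correlation  # available in Python 3.10+

# Hypothetical 0/1 responses of five learners to ten critical thinking items
responses = [
    [1, 1, 0, 1, 1, 0, 1, 1, 1, 0],
    [0, 1, 0, 0, 1, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 1, 0, 0, 1, 0, 1, 0],
    [1, 1, 1, 0, 1, 1, 0, 1, 1, 1],
]

half_a = [sum(row[0::2]) for row in responses]   # each learner's score on one half
half_b = [sum(row[1::2]) for row in responses]   # each learner's score on the other half

r_half = correlation(half_a, half_b)             # correlation between the two halves
r_full = (2 * r_half) / (1 + r_half)             # Spearman-Brown estimate for the full test
print(round(r_full, 2))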
Factors that affect reliability
1. Test length: Generally, the longer a test is, the more reliable it is.
2. Speed: When a test is a speed test, reliability can be problematic. It is inappropriate to
estimate reliability using internal consistency, test-retest, or alternate form methods. This is
because not every student is able to complete all of the items in a speed test. In contrast, a
power test is a test in which every student is able to complete all the items.
3. Group homogeneity: In general, the more heterogeneous the group of students who
take the test, the more reliable the measure will be.
4. Item difficulty: When there is little variability among test scores, the reliability will be low. Thus, reliability will be low if a test is so easy that every student gets most or all of the items correct, or so difficult that every student gets most or all of the items wrong (a brief sketch of how item difficulty and score variability can be checked follows this list).
5. Objectivity: Objectively scored tests, rather than subjectively scored tests, show a
higher reliability.
6. Test-retest interval: The shorter the time interval between two administrations of a
test, the less likely that changes will occur and the higher the reliability will be.
7. Variation within the testing situation: Errors in the testing situation (e.g., students misunderstanding or misreading test directions, noise level, distractions, and sickness) can cause test scores to vary.
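The sketch referred to under point 4 is given below. Item difficulty is commonly expressed as the proportion of learners who answer an item correctly; values near 0 or 1 mean the item adds little variability to the total scores, which in turn depresses reliability. The 0/1 scores are invented for illustration.

from statistics import pvariance

# Hypothetical 0/1 scores of six learners on three items
scores = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 0],
]

n_learners = len(scores)
for i in range(len(scores[0])):
    p = sum(row[i] for row in scores) / n_learners   # item difficulty (proportion correct)
    print(f"item {i + 1}: difficulty = {p:.2f}")     # item 1 comes out at 1.00: too easy

print("variance of total scores:", pvariance([sum(row) for row in scores]))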
Methods used to determine the validity and reliability in assessment
1. Validity
Validity is arguably the most important criterion for the quality of a test. The term validity refers to whether or not the test measures what it claims to measure. On a test with high validity, the items will be closely linked to the test's intended focus. For many certification
validity the items will be closely linked to the test's intended focus. For many certification
and licensure tests this means that the items will be highly related to a specific job or
occupation. If a test has poor validity then it does not measure the job-related content and
competencies it ought to. When this is the case, there is no justification for using the test
results for their intended purpose.
There are several ways to estimate the validity of a test including content validity, concurrent
validity, and predictive validity. The face validity of a test is sometimes also mentioned.
2. Reliability
Reliability is one of the most important elements of test quality. It has to do with the consistency, or reproducibility, of an examinee's performance on the test.
For example, if you were to administer a test with high reliability to an examinee on two
occasions, you would be very likely to reach the same conclusions about the examinee's
performance both times. A test with poor reliability, on the other hand, might result in very
different scores for the examinee across the two test administrations. If a test yields
inconsistent scores, it may be unethical to take any substantive actions on the basis of the test.
There are several methods for computing test reliability including test-retest reliability,
parallel forms reliability, decision consistency, internal consistency, and interrater reliability.
For many criterion-referenced tests decision consistency is often an appropriate choice.
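As one example of an internal consistency method, Cronbach's alpha can be computed from the item-score variances and the variance of the total scores. The sketch below uses invented 0/1 item scores and is only meant to show the calculation, not a recommended implementation.

from statistics import pvariance

# Hypothetical 0/1 scores of five learners on a four-item test
items = [
    [1, 1, 0, 1],
    [0, 1, 0, 0],
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 1],
]

k = len(items[0])                                    # number of items
item_vars = [pvariance([row[i] for row in items]) for i in range(k)]
total_var = pvariance([sum(row) for row in items])   # variance of learners' total scores
alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)
print(round(alpha, 2))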
Bloom’s Taxonomy
Benjamin Bloom introduced Bloom's Taxonomy in 1956. The initial focus was primarily on academia, but the taxonomy now also finds a comfortable place in training. Bloom and associates identified three domains of learning: cognitive, affective and psychomotor.
In these notes, the focus is on the cognitive domain and the application of the six levels of Bloom's Taxonomy. These levels represent a hierarchy of learning that goes from the simple (level 1) to the complex (level 6). The levels are as follows: knowledge, comprehension, application, analysis, synthesis and evaluation.
Now that we have defined the six levels, let's look at how they can be applied to instructional design. Lynne's blog explained how Bloom's Taxonomy can be used in structuring questions; these notes add how it applies to the testing process.
1. Knowledge – recall information
This is usually assessed using a non-performance test that checks for knowledge of the information the learner has been taught. This is accomplished through quizzes using assorted multiple-choice, matching, or true/false questions. You want the learner to define, repeat, recall from memory, list, etc. the information he/she has learned. (e.g. List the six steps of Langevin's learning strategy.)
2. Comprehension – explain in your own words
This next level is also a non-performance check for knowledge, but now you want the learner to "put it in their own words" by describing, explaining, discussing, etc. the information he/she has been taught. (e.g. Describe the six steps of the learning strategy.)
3. Application – apply the information
Here, the focus is on performance-based assessment. You have the learner apply, interpret, practice, etc. the information he/she has been taught. (e.g. Create a brief lesson using the learning strategy that you will present to the group. You must use all six steps.)
4. Analysis – interpret elements; break the information into smaller parts
For this level, you ask the learner to compare, investigate, solve, examine, tell why, etc.
(e.g. This is an outline for a course, which was not received well by the
learners. Compare this to the learning strategy; identify which part(s) of the learning strategy
were omitted, and how this omission contributed to the course not being successful.)
5. Synthesis – create or improve by combining elements
Here, you have the learner suppose, create, construct, improve, etc. (e.g. This is a handout for a course that is structured according to the learning strategy. It follows the six steps, but is not as dynamic as it could be. What would you add to each step to create a more dynamic course that gets the learner involved?)
6. Evaluation – judge based on knowledge and criteria
In this final level of Bloom's Taxonomy, you ask the learner to offer opinions, criticize, judge, recommend, justify, evaluate, or explain which option is better, based on a set of knowledge and criteria. (e.g. You have examples of two courses that use the learning strategy. First, compare the examples against the learning strategy, then compare one example against the other. Determine which one best exemplifies the learning strategy. Be prepared to present your decision to the table group.)