2-Testing Listening
It may seem rather odd to test listening separately from speaking, since
the two skills are typically exercised together in oral interaction. However,
there are occasions, such as listening to the radio or podcasts, to lectures,
online talks and tutorials, or to railway station announcements, when no
speaking is called for. Also, as far as testing
is concerned, there may be situations where the testing of oral ability is
considered, for one reason or another, impractical, but where a test of
listening is included for its backwash effect on the development of oral
skills. Listening may also be tested for diagnostic purposes.
Because it is a receptive skill, the testing of listening parallels in most
ways the testing of reading. This chapter will therefore spend little time
on issues common to the testing of the two skills and will concentrate
more on matters that are particular to listening. The reader who plans
to construct a listening test is advised to read both this and the previous
chapter.
The special problems in constructing listening tests arise out of the
transient nature of the spoken language. Listeners cannot usually move
backwards and forwards over what is being said in the way that they can a
written text. The one apparent exception to this, when an audio-recording
is put at the listener’s disposal, does not represent a typical listening task
for most people. Ways of dealing with these problems are discussed later
in the chapter.
Content
Operations
Some operations may be classified as global, inasmuch as they depend on
an overall grasp of what is listened to. They include the ability to:
• obtain the gist;
• follow an argument;
• recognise the attitude of the speaker.
https://doi.org/10.1017/9781009024723.012 Published online by Cambridge University Press
Other operations may be classified in the same way as were speaking skills
12 Testing listening
Interactional:
• understand greetings and introductions
• understand expressions of agreement
• understand expressions of disagreement
• recognise speaker’s purpose
• recognise indications of uncertainty
• understand requests for clarification
• recognise requests for clarification
• recognise requests for opinion
• recognise indications of understanding
• recognise indications of failure to understand
• recognise and understand corrections by speaker (of self and others)
Texts
For reasons of content validity and backwash, texts should be specified as
fully as possible.
Text type might be first specified as monologue, dialogue, or multi-
participant, and further specified: conversation, announcement, talk or
lecture, instructions, directions, etc.
Text forms include: description, exposition, argumentation, instruction,
narration.
Length may be expressed in seconds or minutes. The extent of short utterances
or exchanges may be specified in terms of the number of turns taken.
Speed of speech may be expressed as words per minute (wpm) or syllables
per second (sps). Reported average speeds for samples of British English are:
                    WPM    SPS
Radio monologues    160    4.17
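To make the two measures concrete, the following is a minimal sketch in Python of how they might be computed for a recording. The function name and the word, syllable and duration figures are hypothetical, chosen only so that the result matches the radio monologue row above; in practice the counts would come from a transcript of the chosen text.

```python
def speech_rate(words, syllables, seconds):
    """Return (words per minute, syllables per second) for a recording."""
    if seconds <= 0:
        raise ValueError("duration must be positive")
    wpm = words / (seconds / 60)   # words per minute
    sps = syllables / seconds      # syllables per second
    return wpm, sps


# Hypothetical three-minute extract: 480 words, 750 syllables.
wpm, sps = speech_rate(words=480, syllables=750, seconds=180)
print(f"{wpm:.0f} wpm, {sps:.2f} sps")  # prints "160 wpm, 4.17 sps"
```

Recording both figures in the specifications is a design choice: words per minute is easier to count, while syllables per second is less affected by differences in word length between texts.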
Accents may be regional or non-regional.
If authenticity is called for, the speech should contain such natural features
as assimilation and elision (which tend to increase with speed of delivery)
and hesitation phenomena (pauses, fillers, etc.).
Intended audience, style, topics, range of grammar and vocabulary may be
indicated.
Increasingly, test developers are incorporating video and other visual
information into listening tests. In terms of authenticity this has benefits.
Although there are situations, such as listening to the radio, or to airport
announcements, where we rely purely on verbal information, these
are not the most common. Even traditional ‘voice only’ phone calls are
increasingly being replaced with video calls. In most real-life situations
we not only listen, but receive other, non-verbal, information, such as
mouth movements, facial expressions, body language or even visual
aids. Therefore, tests which contain visual as well as audio information
are arguably a better representation of authentic listening. Where visual
information is to be included in items, it should of course be included in
the test specifications, as in the operations listed above.
the performance of individuals may be affected by the recording faults in
Writing items
For extended listening, such as a lecture, a useful first step is to listen to
the passage and note down what it is that candidates should be able to get
from the passage. We can then attempt to write items that check whether
or not they have got what they should be able to get. This note-making
procedure will not normally be necessary for shorter passages, which will
have been chosen (or constructed) to test particular abilities.
In testing extended listening, it is essential to keep items sufficiently far
apart in the passage. If two items are close to each other, candidates may
miss the second of them through no fault of their own, and the effect of
this on subsequent items can be disastrous, with candidates listening for
‘answers’ that have already passed. Since a single faulty item can have
such an effect, it is particularly important to trial extended listening tests,
even if only on colleagues aware of the potential problems.
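As a rough aid before trialling, item spacing can be checked mechanically. The sketch below, in Python, assumes that the test writer has noted the time at which the answer to each item is heard; the function name, the timings and the 15-second threshold are all hypothetical, and no particular minimum gap is being recommended. A check of this kind does not replace trialling.

```python
def items_too_close(answer_times, min_gap=15.0):
    """Return consecutive item pairs whose answers fall within min_gap seconds.

    answer_times maps an item label to the time (seconds from the start of
    the recording) at which the information it asks for is heard.
    min_gap is an assumed threshold, not a recommended value.
    """
    ordered = sorted(answer_times.items(), key=lambda pair: pair[1])
    flagged = []
    for (first, t1), (second, t2) in zip(ordered, ordered[1:]):
        if t2 - t1 < min_gap:
            flagged.append((first, second, t2 - t1))
    return flagged


# Hypothetical timings for four items on a lecture extract.
timings = {"Q1": 20.0, "Q2": 55.0, "Q3": 62.0, "Q4": 110.0}
for first, second, gap in items_too_close(timings):
    print(f"{first} and {second} are only {gap:.0f} seconds apart")
```

With these invented timings, only the second and third items are flagged, since their answers are heard seven seconds apart.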
Candidates should be warned by key words that appear both in the item
and in the passage that the information called for is about to be heard.
For example, an item may ask about ‘the second point that the speaker
makes’ and candidates will hear ‘My second point is … ’. The wording
does not have to be identical, but candidates should be given fair warning
in the passage. It would be wrong, for instance, to ask about ‘what the
speaker regards as her most important point’ when the speaker makes the
point and only afterwards refers to it as the most important. Less obvious
examples should be revealed through trialling.
Other than in exceptional circumstances (such as when the candidates are
required to take notes on a lecture without knowing what the items will
be, see below), candidates should be given sufficient time at the outset to
familiarise themselves with the items. As was suggested for reading in the
previous chapter, there seems no sound reason not to write items and accept
responses in the native language of the candidates. This will in fact often be
what would happen in the real world, when a fellow native speaker asks for
information that we have to listen for in the foreign language.
Possible techniques
Multiple choice
The advantages and disadvantages of using multiple choice in extended
listening tests are similar to those identified for reading tests in the
previous chapter. In addition, however, there is the problem of the
candidates having to hold in their heads four or more alternatives while
listening to the passage and, after responding to one item, of taking in and
retaining the alternatives for the next item. If multiple choice is to be used,
then the alternatives must be kept short and simple. The alternatives in the
following invented example item are too complex.
Before beginning a journey by car, what is the motorist advised to do?
a. He should increase the pressure in his tyres to the required level.
b. He should connect his sat nav and enter his intended destination.
c. He should make sure that the vehicle is fully roadworthy.
d. He should ensure that all doors are properly closed, with child locks
activated.
Multiple choice can work well for testing lower-level skills, such as
phoneme discrimination.
The candidate hears:    bat
and chooses between:    pat    mat    fat    bat
Short answer
This technique can work well, provided that the question is short and
straightforward, and the correct, preferably unique, response is obvious.
Below is an example from the IELTS test. The candidates hear an extract
from a talk given to a group who are going to stay in the UK. Note that the
candidates need only give two examples of community groups, with theatre
provided as an example.
Listening sample task – Short-answer questions (to be used with IELTS Listening Recording 3)
SECTION 2
Questions 11 – 16
Write NO MORE THAN THREE WORDS AND/OR A NUMBER for each answer.
What TWO factors can make social contact in a foreign country difficult?
• 11 ...............................
• 12 ...............................
Which types of community group does the speaker give examples of?
• theatre
• 13 ..................................
• 14 ..................................
• 15 ..................................
• 16 ..................................
Gap filling
This technique can work well where a short answer question with a
unique answer is not possible.
Woman: Do you think you can give me a hand with this?
Man: I’d love to help but I’ve got to go round to my mother’s in a minute.
The woman asks the man if he can ________ her but he has to
visit his ________.
Information transfer
This technique is as useful in testing listening as it is in testing reading,
since it makes minimal demands on productive skills. It can involve
such activities as the labelling of diagrams or pictures, completing forms,
making diary entries, or showing routes on a map. In the following
example, which is taken from the IELTS exam, candidates label a map
while listening to someone describing the layout of a library.
Listening sample task – Plan/map/diagram labelling
SECTION 2
Questions 11-15
Choose FIVE answers from the box and write the correct letters A-I next to questions
11-15.
[Plan of the Town Library, showing the Entrance, Librarian’s desk, Library office,
Library area (with Fiction and Non-fiction shelves) and Seminar room, with five
locations numbered 11 to 15 to be labelled.]
A Art collection
B Children's books
C Computers
D Local history collection
E Meeting room
F Multimedia
G Periodicals
H Reference books
I Tourist information
Tapescript
You will hear the librarian of a new town library talking to a group of people who are
visiting the library.
OK everyone. So here we are at the entrance to the town library. My name is Ann,
and I'm the chief librarian here, and you'll usually find me at the desk just by the main
entrance here. So I'd like to tell you a bit about the way the library is organised, and
what you'll find where … and you should all have a plan in front of you. Well, as you
see my desk is just on your right as you go in, and opposite this the first room on
your left has an excellent collection of reference books and is also a place where
people can read or study peacefully. Just beyond the librarian's desk on the right is a
room where we have up to date periodicals such as newspapers and magazines and
this room also has a photocopier in case you want to copy any of the articles. If you
carry straight on you'll come into a large room and this is the main library area. There
is fiction in the shelves on the left, and non-fiction materials on your right, and on the
shelves on the far wall there is an excellent collection of books relating to local
history. We're hoping to add a section on local tourist attractions too, later in the year.
Through the far door in the library just past the fiction shelves is a seminar room, and
that can be booked for meetings or talks, and next door to that is the children's
library, which has a good collection of stories and picture books for the under
elevens. Then there's a large room to the right of the library area – that's the
multimedia collection, where you can borrow videos and DVDs and so on, and we
also have CD-Roms you can borrow to use on your computer at home. It was
originally the art collection but that's been moved to another building. And that's
about it – oh, there's also the Library Office, on the left of the librarian's desk. OK,
now does anyone have any questions?
Note taking
Where the ability to take notes while listening to, say, a lecture is in
question, this activity can be quite realistically replicated in the testing
situation. Candidates take notes during the talk, and only after the talk
is finished do they see the items to which they have to respond. When
constructing such a test, it is essential to use a passage from which notes
can be taken successfully. This will only become clear when the task is
first attempted by test writers. We believe it is better to have items (which
can be scored easily) rather than attempt to score the notes, which is not a
task that is likely to be performed reliably. Items should be written that are
perfectly straightforward for someone who has taken appropriate notes. In
order to aid authenticity in academic contexts, candidates may be supplied
with a copy of the slides used in the lecture. This allows them to make
notes on the slides, as they commonly would in their future studies.
It is essential when including note taking as part of a listening test that
careful moderation and, if possible, trialling should take place. Otherwise,
items are likely to be included that even highly competent speakers of the
language do not respond to correctly. It should go without saying that,
since this is a testing task which might otherwise be unfamiliar, potential
candidates should be made aware of its existence and, if possible, be
provided with practice materials. If this is not done, then the performance
of many candidates will lead us to underestimate their ability.
Partial dictation
While dictation may not be a particularly authentic listening activity
(although in lectures at university, for instance, there is often a certain
amount of dictation), it can be useful as a testing technique. As well as
providing a ‘rough and ready’ measure of listening ability, it can also
be used diagnostically to test students’ ability to cope with particular
difficulties (such as weak forms in English).
Because a traditional dictation is so difficult to score reliably, it is
recommended that partial dictation is used, where part of what the
candidates hear is already written down for them. It takes the following
form:
The candidate sees:
When I ______ someone for the first time,
I ______ them my name and I always shake their
hand. I think ______ the polite thing to do. I often
______ nervous when I meet new people so
I ______ play with my hair. I wish I didn’t do that.
What do I usually ______ about? The weather and
______. But I don’t talk about ______.
That’s ______ rude!
The tester reads:
When I meet someone for the first time, I tell them my name and I
always shake their hand. I think that’s the polite thing to do. I often feel
nervous when I meet new people so I sometimes play with my hair. I
wish I didn’t do that. What do I usually talk about? The weather and
jobs. But I don’t talk about money. That’s just rude!
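For testers who prepare the candidate's sheet from a transcript, the gapping step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and the blank marker are hypothetical, the words to be gapped are supplied by the tester in the order in which they occur, and the choice of which words to gap remains a matter of judgement.

```python
import re


def make_partial_dictation(passage, gapped_words, blank="______"):
    """Replace each listed word, in order of occurrence, with a blank.

    gapped_words must be given in the order in which they appear in the
    passage; each one is matched as a whole word after the previous gap.
    """
    position = 0
    pieces = []
    for word in gapped_words:
        pattern = re.compile(r"\b" + re.escape(word) + r"\b")
        match = pattern.search(passage, position)
        if match is None:
            raise ValueError(f"{word!r} not found after position {position}")
        pieces.append(passage[position:match.start()])
        pieces.append(blank)
        position = match.end()
    pieces.append(passage[position:])
    return "".join(pieces)


spoken = "I often feel nervous when I meet new people so I sometimes play with my hair."
print(make_partial_dictation(spoken, ["feel", "sometimes"]))
# prints "I often ______ nervous when I meet new people so I ______ play with my hair."
```

Keeping the full spoken text as the single source and deriving the gapped version from it reduces the risk that the two versions drift apart during moderation.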
Testers can either write their own passages or they can use authentic
transcripts, either from online resources or from student coursebooks,
as with the example above. There are advantages to using coursebooks.
In addition to the practical benefit of having an audio recording to use,
the excerpts from coursebooks will have been written for specific levels
of language ability. The possible disadvantage is that some candidates
may already be aware of the coursebook. Therefore, we recommend
coursebook excerpts only be used in classroom tests. For higher-stakes
tests, we suggest it is preferable to use one of the many online resources
of authentic listening samples, some of which are listed at the end of
this chapter.
Since it is listening that is meant to be tested, correct spelling should
probably not be required for a response to be scored as correct. However,
it is not enough for candidates simply to attempt a representation of the
Transcription
Candidates may be asked to transcribe numbers or words which are
spelled letter by letter. The numbers may make up a telephone number.
The letters should make up a name or a word which the candidates should
not already be able to spell. The skill that items of this kind test belongs
directly to the ‘real world’. In the trialling of a test we were involved with
recently, it was surprising how many teachers of English were unable to
perform such tasks satisfactorily. A reliable and, we believe, valid way of
scoring transcription is to require the response to an item to be entirely
correct for a point to be awarded.
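The all-or-nothing scoring rule can be sketched in Python as follows. The function name, the keys and the responses are hypothetical, and the decision to ignore spaces and letter case is an assumption of this sketch; the rule itself requires only that the response be entirely correct.

```python
def score_transcription(responses, keys):
    """Award one point per item only when the response is entirely correct.

    Spaces and letter case are ignored; that normalisation is an assumption
    of this sketch, not something prescribed by the scoring rule.
    """
    if len(responses) != len(keys):
        raise ValueError("each item needs exactly one response")

    def normalise(text):
        return "".join(text.split()).upper()

    return sum(
        1
        for response, key in zip(responses, keys)
        if normalise(response) == normalise(key)
    )


# Hypothetical items: a telephone number and a name spelled letter by letter.
keys = ["01632 960147", "Wojciechowski"]
responses = ["01632960147", "Wojchiechowski"]
print(score_transcription(responses, keys))  # prints 1
```

Here the telephone number earns its point despite the missing space, while the name, with one extra letter, earns nothing; no partial credit is given.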
with consistency. Needless to say, speakers should have a good command
of the language of the test and be generally highly reliable, responsible and
trustworthy individuals.
READER ACTIVITIES
1. a. Choose an online video lecture that would be appropriate for a group of
students with whom you are familiar (see end of this chapter for possible
resources). Play a five-minute stretch to yourself and take notes. On the
basis of the notes, construct eight short-answer items. Ask colleagues
to take the test and comment on it. Amend the test as necessary, and
administer it without video (audio only) to half of the group of students
you had in mind. Analyse the results.
b. Administer the same test to the other half of the group, showing them the
video as well as the audio. What differences do you notice between the
performance of the two groups of students? Go through the test item by
item with the students and ask for their comments. How far, and how well,
is each item testing what you thought it would test?
2. Design short items that attempt to discover whether candidates can
recognise: sarcasm, surprise, boredom, elation. Try these on colleagues and
students.
3. Design a test that requires candidates to draw (or complete) simple
pictures. Decide exactly what the test is measuring. Think what other things
could be measured using this or similar techniques. Administer the test and
see if the students agree with you about what is being measured.
FURTHER READING
General
Buck (2001) is a thorough study of the assessment of listening. Field (2019)
evaluates many of the conventions behind listening tests and provides
practical ideas for how they might be rethought.
Test methods
Sherman (1997) examines the effects of candidates previewing listening
test items. Buck and Tatsuoka (1998) analyse performance on short-answer
items. Hale and Courtney (1994) look at the effects of note taking on
performance on TOEFL® listening items. Note taking is suggested to be
a good indicator of listening ability in Song (2012). Shohamy and Inbar
(1991) look at the effects of texts and question type. Cai (2013) examines
the validity of partial dictation as a test of ‘higher order’ listening abilities.
The effects of visual information in listening tests are investigated in Ginther
(2002), Ockey (2007), Wagner (2010) and Batty (2015).
Test validation
Texts
Freedle and Kostin (1999) investigate the importance of the text in TOEFL®
minitalk items. Examples of recordings in English that might be used as the
basis of listening tests are Crystal and Davy (1975) and, if regional British
accents are relevant, Hughes et al. (2012). Harding (2012) investigates the
possibility of bias where accents of speakers in recordings are similar to
those of the test-takers’ L1. Ockey and Wagner (2018) is a collection of
articles on authenticity in the assessment of listening ability.
Online resources
There are countless online resources of authentic spoken English, which
testers can use to create tests. What follows is a brief selection of resources
that can easily be found using a search engine. The Self-access centre
for Language Learning at the University of Reading provides dozens of
authentic academic lectures. TED has thousands of talks and lectures on
every subject imaginable. Transcripts can be accessed through the TED
website. Podcasts are another good source of authentic listening samples
for tests. The BBC website contains hundreds of podcasts in different genres.