A Critical Review of The IELTS Writing PDF
Introduction

Large-scale ESL tests such as the Cambridge certificate exams, IELTS, and TOEFL (the Test of English as a Foreign Language) are widely used around the world, and they play a critical role in many people's lives, as they are often used to make high-stakes decisions about test takers, such as admission to universities. Therefore, it is necessary to address the
assessment procedures of such large-scale tests on a regular basis to make
sure that they meet professional standards and to contribute to their further
development. However, although there have been several publications
evaluating these tests in general, these publications often do not offer
detailed information specifically about the writing component of these tests.
Scholars, on the other hand, acknowledge that writing is a complex and
difficult skill both to learn and to assess, and that it is central to academic
success, especially at university level. For this reason, the present article
focuses only on the assessment of writing, particularly in the IELTS test,
because, as well as being one of the most popular ESL tests throughout the
world, it is unique among such tests in its claim to assess English
as an international language, indicating a recognition of the expanding
status of English. After a brief summary of the IELTS test in terms of its
purpose, content, and scoring procedures, the article discusses several
reliability and validity issues in the IELTS writing test to be considered
both by language testing researchers and test users around the world.
Reliability issues

Hamp-Lyons (1990) defines the sources of error that reduce reliability in
a writing assessment as the writer, task, and raters, as well as the scoring
procedure. IELTS has initiated research efforts to minimize such
errors, including those arising from the scoring procedure, and to demonstrate that acceptable
reliability rates are achieved.
In terms of raters, IELTS states that reliability is assured through training
and certification of raters every two years. Writing is single marked locally,
and rater reliability is estimated by subjecting a selected sample of returned
scripts to a second marking by a team of IELTS senior examiners. Shaw
(2004: 5) reported that the inter-rater correlation was approximately 0.77 for
the revised scale and that g-coefficients were 0.84–0.93 for the operational
single-rater condition. Blackhurst (2004) also found that the paired
examiner–senior examiner rating from the sample IELTS writing test data
produced an average correlation of 0.91. However, despite the reported high
reliability measures, single marking is not adequate in such a high-stakes
international test. It is widely accepted in writing assessment that multiple
judgements lead to a final score that is closer to a true score than any single
judgement (Hamp-Lyons 1990).
judgement (Hamp-Lyons 1990). Therefore, multiple raters should rate the
IELTS writing tests independently, and inter- and intra-rater reliability
measures should be calculated regularly.
Conclusion

To sum up, IELTS is committed to improving the test further and has been
carrying out continuous research into its reliability and validity. However,
some issues, such as the fairness of applying a single prescriptive criterion to
international test takers coming from various rhetorical and argumentative
traditions, and the necessity of defining the writing construct with respect to
the claim of IELTS to be an international test of English, have not been
adequately included in these research efforts. In addition, some areas of
research on the reliability of test scores highlight serious issues that need
further consideration. Therefore, the future research agenda for IELTS
should include the following issues.
In terms of reliability:
- the comparability and appropriateness of prompts and tasks for all test takers should be continuously investigated
- multiple raters should be included in the rating process, and inter- and intra-rater reliability measures should be constantly calculated
- more research is needed regarding scales and how scores are rounded to a final score
- rater behaviour while using the scales should be investigated.
IELTS has rich data sources such as ESM in hand; however, so far this
source has not been fully exploited to understand interactions among the
above-mentioned factors in relation to test taker and rater profiles.
In terms of improving the validation efforts with regard to the IELTS writing
module:
- future research should explore whether the characteristics of the IELTS test tasks and the TLU tasks match, not only in the domain of the UK and Australia, but also in other domains