Assessment 2 Rationale
Introduction
Language testing poses a moral dilemma, in that its intended purpose is to discriminate between test takers. Nevertheless, tests can be designed to discriminate without unfair discrimination (Hamp-Lyons 1989, cited in Lynch 1997, p. 315) by considering a test's validity,
reliability, washback, authenticity and practicality. This paper discusses two progress tests,
and a writing portfolio. The tests are designed for the Academic Reading and Writing (ARW)
unit, which is part of the English for Tertiary Studies (ETS) programme at UOW College. It will
be argued that this combination of objective and subjective measurements builds the ideal
relationship and thus discriminates discriminately. Firstly, the context and student cohort will
be discussed. Secondly, the ARW curriculum will be outlined. Thirdly, language constructs and
test specifications will be described. Fourthly, both tests will be evaluated in relation to the
aforementioned criteria. The assessment plan, the tests, and the scoring and grading scales can be found in the appendices.
Context
UOW College, which is an integral part of University of Wollongong (UOW), provides direct
entry programmes for domestic and international students. ETS is a ten-week non-credit
carrying EGAP programme for international students planning to undertake tertiary courses
in business, medicine, law, humanities, creative arts, engineering, science, IT, or social sciences.

Bianca van de Water 5037712 EDGT983 Assessing & Evaluating Rationale Page 1 of 52

The unit thus serves a diverse cohort of students who not only intend to specialise in disparate disciplines upon conclusion of dissimilar pathways,
pathways, but also speak in a babel of native tongues. Students from more than thirty L1s
have participated in the programme (UOW College 2015). Future students require English language proficiency, evidenced by appropriate results from an accepted test: either a minimum IELTS overall band score of 5.5, with modest user competence in reading and writing, or a score of 525 in the paper-based TOEFL.
ARW Curriculum
ETS is an integrated skills-based programme, comprising three core units: Critical Literacy;
Academic Listening and Speaking; and, Academic Reading and Writing. The ARW component
carries the majority weighting of sixty percent. Learning objectives include computer literacy;
time management; and, essay writing. The unit focusses on writing an argumentative essay,
whereby teaching and learning includes several microskills, such as rhetoric and process writing.
The present paper is based on the 2003 ETS draft syllabus and proposes two alternative
formative assessments for the ARW unit. As the argumentative essay appears to be the
principal learning objective, an assessment plan that promotes beneficial washback for
writing in this genre is proposed (Appendix A, p. 18). It entails an academic vocabulary test
and a process portfolio that includes compositions, essay plans, and writing journals. Their
purpose is to identify what students have learned and still need to learn (Brown 1998, p. 15)
with regard to mastering control of the target task, i.e. an academic argumentative essay. The
assessments fulfil a triadic function, namely to offer feedback; provide scaffolded learning;
and, assess the curriculum's appropriateness (Ibid.). Both tests are criterion-referenced,
whereby criteria are based on theoretical constructs and ARW learning objectives. All
objectives are addressed, except for note-taking, time management, and writing abstracts
(Appendix A, p. 18). Before test specifications can be described, the underlying constructs need to be defined.
Language Constructs
The language model to be tested involves two distinct constructs, namely, argumentative
essay conventions, and the skills required to write in this genre. Álvarez (2001, cited in Bejarano and Chapetón 2013, p. 129) defines it as an interactive text through which the author aims to convince the reader of a position. Wingate (2012, p. 146) emphasizes the genre's cohesion, in that it consists of a connected series of
statements intended to establish a position. According to Toulmin, Reike and Janik (1984,
cited in Wingate 2012, p. 146) these statements contain claims and reasons. Furthermore, a
carefully constructed argumentation includes qualifiers, i.e. a form of hedging, and rebuttals
of counterarguments (Toulmin 1958, cited in Liu & Stapleton 2014, p. 118). In sum, the
academic argumentative essay construct is defined as a text type that aims to convince or
persuade, which is achieved through interrelated moves, including claims, reasons, qualifiers
and rebuttals. This construct formed the premise for test development and marking rubric
design.
Additionally, the design process was informed by a pragmatic amalgam of writing constructs.
Firstly, the test is informed by genre theory. Practicality considerations preclude a dedicated
genre approach, in that the context comprises a student cohort who plan to specialise in disparate disciplines (Wingate 2012, p. 147). However, Swales's move structure concept has been incorporated. This is
evidenced in the writing journals, which stipulate that students elaborate on moves and language choices. Secondly, the tests incorporate process writing concepts, which emphasize intrapersonal cognitive
skills. For instance, the first test includes an editing task (Appendix B, pp. 34 - 43), whereas
the second test requires students to develop essay plans (Appendix C, pp. 45 - 47). Thirdly,
writing from sources, which is a fundamental aspect of real-life scholarly writing (Gebril 2009,
p. 508), has been included as a criterion in the writing assignments, which is demonstrated in
the marking rubric (Appendix D, p. 49). Finally, the tests are informed by a product-oriented
approach, which emphasizes textual conventions such as paragraph structure, syntax and
lexis. This construct is evidenced in the first test, which measures academic vocabulary
(Appendix B, pp. 19 - 43), and elaborated in the second test, whereby compositions are scored
(Appendix D, pp. 48 - 51). Thus, the academic writing construct is defined as an amalgamation of these approaches, reflecting the writing demands of the TL domain.
Test Specifications
The academic vocabulary test comprises three parts, whereby each component measures distinct vocabulary knowledge and abilities (Appendix B,
pp. 19 - 44). The tasks were sequenced in order of increasing complexity,
progressing from recognition to production skills. The test measures high-frequency academic
vocabulary, which was selected since many EFL students perceive academic lexis as
particularly challenging (Li & Pemberton 1994, cited in Hyland & Tse 2007, p. 326). Test items
originate from Coxhead's (2000) Academic Word List (AWL), which consists of 570 word
families, including 3,112 individual items (Hyland & Tse 2007, p. 327). The selection
predominantly consists of the most frequently occurring members of AWL word families,
although occasionally less frequently occurring words had to be selected for practical reasons.
Furthermore, the test's third task, which comprises authentic texts, required occasional adaptations.
The first part of the vocabulary test comprises a multiple-choice (MC) instrument, which
measures the ability to recognise denotations of Latinate content words (Appendix B, pp. 20 - 29). All test items include four options to reduce the effect of guessing (Hughes 2003, p. 77),
which are furthermore homogenous in length and content, and contain plausible distractors.
For example, Item 8 tests recognition, or recall, of the word 'interpretation', whereby the distractors include 'translation', 'inducement', and 'solution'. The item was designed as follows:
8. This paper offers an alternative interpretation of Manchesters football history, arguing that it was a
a. to offer an explanation
b. to offer a translation
c. to offer an inducement
d. to offer a solution
For each item, the prompt is framed as a question, preceded by an authentic reading passage containing the
token in question. This format encourages students to recall principles, rules or facts in a real-
life context thus emphasizing higher-level thinking skills (Ibid.). All reading passages were
selected from authentic articles, sourced from Google Scholar and UOW library databases. A
few sentences were adapted from Swales and Feak's (2012) Academic Writing for Graduate
Students.
The second part comprises a single-word gap-filling test, which focusses on accuracy and collocational knowledge. Research has found that even advanced NNSs have great difficulty with native-like collocations
(Ellis, Simpson-Vlach & Maynard 2008, p. 378); however, fluent and accurate usage of
formulaic language signals competence to a given discourse community (Hyland 2008, p. 42).
The test contains discrete sentences, whereby students need to fill in the correct preposition
following a verb or noun provided. All prompts were selected from authentic texts according to the aforementioned selection criteria.
The third part entails a MC editing test, which focusses on pragmatic meanings and measures
both receptive and productive abilities (Appendix B, pp. 34 - 43). This task type was selected
in order to promote proofreading skills (Brown & Abeywickrama 2010, p. 247) in preparation for the skills required for the second assessment, i.e. the portfolio. The task contains two
separate texts, which require students to proofread each sentence; identify inappropriate
vocabulary; and, provide an accurate and appropriate alternative. It measures knowledge of
lexical phrases, discourse markers and content words. The two texts were adapted from
authentic academic articles on general knowledge topics, i.e. the 'resource curse' and modern slavery. As such, they do not require specialist knowledge, thus cultivating fairness.
The vocabulary test will be scored according to an absolute grading scale, whereby one point
is awarded for each question answered correctly. The MC test is scored dichotomously and
the gap-filling task follows an exact word-scoring procedure, whereby a point is awarded for
the correct preposition only. The MC editing task is scored according to appropriate word-
scoring and partial credit-scoring procedures: credit is awarded for grammatically correct and contextually appropriate alternatives, which need not originate from the AWL. Half a point is awarded for selecting the key and another for providing a suitable alternative. The entire test contains one hundred items, and aggregate scores map directly onto the grading scale provided in the ETS syllabus (Appendix E, p. 52).
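The aggregation described above can be sketched in a short script. The response data and grade bands below are hypothetical illustrations, as the actual cut-offs are defined in the ETS syllabus scale (Appendix E); only the scoring procedures themselves follow the description above.

```python
def score_mc(responses, keys):
    """Dichotomous scoring: one point per correctly answered MC item."""
    return sum(1 for r, k in zip(responses, keys) if r == k)

def score_gap_fill(responses, keys):
    """Exact-word scoring: a point for the correct preposition only."""
    return sum(1 for r, k in zip(responses, keys) if r.strip().lower() == k)

def score_editing(items):
    """Partial-credit scoring: 0.5 for identifying the inappropriate word
    (the key) and 0.5 for supplying an acceptable alternative."""
    return sum(0.5 * identified + 0.5 * alternative
               for identified, alternative in items)

def to_grade(aggregate, bands=((85, "HD"), (75, "D"), (65, "C"), (50, "P"))):
    """Map a 0-100 aggregate score onto a grading scale.

    The bands here are hypothetical stand-ins for the syllabus scale."""
    for cutoff, grade in bands:
        if aggregate >= cutoff:
            return grade
    return "F"
```

For instance, a candidate with 38 MC points, 18 gap-fill points and 14 editing points would receive an aggregate of 70 and, under the hypothetical bands above, a "C".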
The second assessment is a process portfolio, which includes compositions, essay plans and writing journals (Appendix C, pp. 45 - 47). According to Brown and Abeywickrama (2010, p. 131), portfolio development has various pedagogical benefits.
The portfolio is based on three separate themes, or topics, whereby each requires a
composition, an accompanying writing journal, and an essay plan. Students do not have a
choice of topics and need to undertake all tasks. Each composition focusses on one aspect of
the essay macrostructure, for instance, the first composition task requires students to write
an introduction; the second requires body paragraphs; and, the third requires a conclusion.
Additionally, three writing journals are required (Appendix C, pp. 45 - 47), whereby content is
restricted to thinking and writing processes related to each specific composition, with the emphasis on moves and language choices. The essay plans are not discussed in this paper, as they are intended to serve as planning tools.
The test contains three separate prompts, which were designed according to Kroll and Reid's
(1994) guidelines and include contemporary social issues, namely genetically-modified food;
constant public surveillance; and, human cloning. For example, Task Three (Appendix C, pp. 45 - 47) reads as follows:
Topic: Twenty years ago, the first mammal, Dolly the Sheep, was cloned successfully.
To date, no human clone has been born. Nonetheless, the topic is a rich source for
media and fiction narratives alike. What opportunities, or threats, do you think
human cloning presents? Relate the response to your intended field of study.
The framed prompt above requires neither specific cultural schemata nor discipline-specific expertise; instead, it draws on a body of knowledge that is equally accessible to all students (Horowitz 1991, cited in Kroll & Reid 1994).
The compositions will be scored according to an analytical scale (Appendix D, pp. 48 - 51), based on five criteria, namely argumentation; textual coherence and cohesion; language choices; source writing; and,
presentation. The performance levels include four gradations, ranging from 'excellent', 'good' and 'pass' to 'unsatisfactory'. The rubric is scaled, with points assigned for each level of
performance, whereby the maximum score amounts to one hundred points, so that
aggregate scores translate directly into a grade as per the ETS syllabus scale (Appendix E, p.
52). The scale's purpose is to provide individualised feedback and pinpoint the microskills not mastered yet (Bloom et al. 1971, cited in Perkins 1983, p. 656). The writing journal will not be scored; nonetheless, it supports individualised feedback in that it provides insights into students' language awareness and
thinking processes. Process portfolios and writing journals will be discussed during one-on-
one conferences. When compositions have been revised, they become part of the final
presentation portfolio.
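As a minimal sketch, the analytic scale's aggregation might be modelled as follows. The per-criterion weightings and level multipliers are hypothetical stand-ins, as the actual point allocations are assigned in the rubric (Appendix D); only the overall mechanism of criteria, levels and a 100-point maximum follows the description above.

```python
# Hypothetical weighting of the five rubric criteria, summing to 100 points.
WEIGHTS = {
    "argumentation": 30,
    "coherence_and_cohesion": 25,
    "language_choices": 20,
    "source_writing": 15,
    "presentation": 10,
}

# Hypothetical multipliers for the four performance levels.
LEVELS = {"excellent": 1.0, "good": 0.75, "pass": 0.5, "unsatisfactory": 0.25}

def composition_score(ratings):
    """Aggregate analytic ratings (criterion -> level) into a 0-100 score,
    which then translates directly into a grade on the ETS syllabus scale."""
    return sum(WEIGHTS[criterion] * LEVELS[level]
               for criterion, level in ratings.items())
```

A composition rated "excellent" on every criterion would thus receive the maximum of 100 points.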
Reliability
The vocabulary test represents a relatively reliable instrument. Firstly, it includes a substantial
number of items, based on the premise that the more items there are on a test, the more
reliable it will be (Hughes 2003, p. 44). In addition, the first sixty items each provide a fresh
start thus further improving test reliability. Secondly, it entails a comparatively reliable
scoring procedure. The first two tasks represent objective tests, in that there is only one
correct answer for each question. The MC editing task is more subjective, in that multiple
alternatives might be possible, and scoring requires a judgement call on behalf of the raters.
Rater reliability could be improved by introducing a second rater for this specific task (Hughes
2003, p. 50).
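Hughes's premise that longer tests are more reliable can be illustrated with the Spearman-Brown prophecy formula, a standard psychometric result that predicts the reliability of a test lengthened by a factor k; the reliability figures below are hypothetical.

```python
def spearman_brown(reliability, k):
    """Predicted reliability of a test lengthened by factor k
    (Spearman-Brown prophecy formula): r' = k*r / (1 + (k - 1)*r)."""
    return k * reliability / (1 + (k - 1) * reliability)
```

For example, doubling a test with a hypothetical reliability of 0.60 would raise the predicted reliability to 0.75, and tripling it would raise it further still, assuming the added items are comparable to the existing ones.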
Several safeguards were implemented to increase the reliability of the writing portfolio.
Firstly, it comprises three independent, untimed compositions thus preventing the snap-shot
approach, which creates unreliable and unrepresentative impressions (Hamp-Lyons & Kroll
1996, p. 53). Secondly, students do not have a choice of topics, in that too much freedom in
topics is likely to have a depressing effect on test reliability (Hughes 2003, p. 45). Thirdly,
scoring should be conducted according to a detailed marking rubric, which may increase intra-
rater reliability, prevent accidental construct under-representation (Hughes 2003, p. 102), and
improve trustworthiness and auditability (Brown & Hudson 1998, p. 655). In sum, several safeguards promote the reliability of both instruments.

Nonetheless, both tests should be trialled to assess other plausible threats to test reliability.
Firstly, the vocabulary test is relatively lengthy. Although it is not timed, it needs to be finished
within one period. Consequently, test takers could become fatigued, which may cause
inaccurate test results (Brown & Abeywickrama 2010, p. 29). Perhaps it needs to be
administered across several periods to address this possible threat. Secondly, MC items need
to be tested to determine distractor efficiency, item facility and item discrimination. Thirdly, the writing prompts need to be trialled to ensure they are unbiased and appropriate for this specific context (Kroll & Reid 1994, p. 241).
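The item statistics mentioned above could be computed from trial data along these lines; the response data in the examples are hypothetical.

```python
from collections import Counter

def item_facility(correct_flags):
    """Item facility: the proportion of test takers answering correctly."""
    return sum(correct_flags) / len(correct_flags)

def item_discrimination(upper_flags, lower_flags):
    """Item discrimination: the difference in facility between the
    upper-scoring and lower-scoring groups of test takers."""
    return item_facility(upper_flags) - item_facility(lower_flags)

def distractor_counts(choices):
    """Distractor efficiency: how often each option was chosen. Distractors
    selected by almost nobody do no work and should be revised."""
    return Counter(choices)
```

Items with very high or very low facility, or with low discrimination, would be candidates for revision before the test is administered operationally.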
Validity
The portfolio demonstrates several types of validity. Firstly, it demonstrates face validity, in
that it entails a direct test of writing skills. Secondly, it demonstrates content validity, in that
it addresses the majority of ARW learning objectives and requires performance of the target task.
Conversely, the vocabulary test may have construct validity issues, in that the existence of a
common core of academic vocabulary is contested. Hyland and Tse (2007, p. 248) argue that
the AWL has variable usefulness for various disciplines since many items are
underrepresented in some fields. The AWL has most utility for IT students and least for those
studying biology (Ibid.). This suggests that caution may need to be exercised in
implementation. Notwithstanding, the AWL could be used as a threshold concept (Clarke & Hernandez 2011), which students extend through discipline-specific readings and reflective activities. Furthermore, aside from EAP courses, students have limited
opportunities for common core vocabulary development, in that sub-technical lexis is unlikely
to be taught by content teachers (Flowerdew 1993, cited in Hyland & Tse 2007, p. 236). Thus,
a vocabulary test may have a significant role to play by motivating students to study and use common core academic lexis.
Authenticity
The process portfolio could be considered an authentic assessment, in that there is a strong
correspondence between the test and the skills required in real-life academic contexts.
Arguing a case is one of the most frequent and important types of assessment tasks set at
university (Lee 2008, p. 240), whereby logical argumentation is considered a key writing skill
(Lea & Street 1998, cited in Wingate 2012, p. 145). Academic essays must both 'carry appropriate authority and engage readers in ways they are likely to find credible and persuasive' (Hyland 2002, p. 215). However, many students demonstrate language lacks in
discussing, arguing and evaluating competently, logically and persuasively (Lee 2008, p. 240;
Wingate 2012, p. 145). The process portfolio provides step-by-step support in constructing
an argumentative essay, from planning and revising, to explaining and rebutting. Thus, this
assessment addresses both potential language lacks and authentic language demands.
However, the authenticity of the vocabulary test is more tenuous. Brown and Hudson (1998,
p. 659) state that this test type lacks authenticity, in that real-life language is not multiple-
choice. Bothell (2001, p. 4) emphasizes that MC questions should be avoided when other
item types are more appropriate. Nonetheless, selected-response items are the only test items that can address avoidance strategies, whereby NNSs evade using challenging lexis (Brown 2000, p. 149). Although other selected-response tests could likewise address the
undesirable strategy, the MC test was selected in that it is more reliable than a true-false test
(Brown & Hudson 1998, p. 659) and may have more construct validity than a matching test,
in that the latter can become more of a puzzle-solving process than a genuine test of vocabulary knowledge.
Washback
Both tests could potentially promote beneficial washback effects. Although the vocabulary
assessment may seem a 'bolt-on' test to discriminate between those who have memorised
sufficient lexis and those who have not, its inclusion is based on sound theoretical principles.
Mitchell, Myles and Marsden (2013, p. 143) reiterate research that found that explicit
vocabulary learning may lead to proceduralisation of this knowledge. Thus, it is hoped that
vocabulary learning required for the first test leads to fluent use in future writing tasks.
Beneficial washback is promoted by turning the traditional test into a feedback opportunity,
which aims to identify students' strengths and weaknesses and provide suggestions for further study.
The portfolio assessment could promote beneficial washback by integrating similar tasks into
daily teaching and learning. Ideally, test content will be taught according to a genre-based approach, whereby students analyse model texts in class - perhaps with the use of the AWL - and sequentially apply and transform new
understandings during classroom writing activities. As students are required to relate their compositions to their own disciplines (Appendix C, pp. 45 - 47), it would be most effective if they source their own discipline-specific readings. Additionally, students could work on the portfolio in
class, whereby learning to write is scaffolded by peer reviews and teacher feedback.
Practicality
The two tests balance each other in regard to practicality. The vocabulary test could be re-used and is economical to mark, in that most of the test is scored objectively. Practicality could be further increased by
computerising the test, whereby current items form the basis for an item bank. By contrast, the portfolio is potentially costly to administer, in that colleagues may require training, rating sessions may
need to be conducted (Brown & Hudson 1998, p. 662), and, sufficient time needs to be
allocated for student conferences (Brown & Abeywickrama 2010, p. 134). Consequently, the portfolio demands considerably more time and resources than the vocabulary test.
Conclusion
The two tests were carefully constructed by considering reliability, validity, authenticity,
washback and practicality. Reliability was improved by adding objectively-scored tasks and developing a detailed analytical scoring rubric. Validity was addressed by designing the tests according to clearly defined constructs and the ARW learning objectives. Authenticity criteria were met by aligning the tasks with real-life language demands in university settings. Washback was addressed with the suggestion of integrating assessment tasks into daily learning activities. Finally, the time and effort required for portfolio assessment were balanced against the practicality of the vocabulary test. In sum, favourable conditions were created to foster the ideal relationship between tests and test takers, whereby the assessments discriminate discriminately.
References
Bachman, LF 1990, 'Measurement', in Fundamental Considerations in Language Testing, Oxford University Press, Oxford, UK.
Bachman, LF & Palmer, A 1996, 'Describing, identifying, and defining: test purposes, tasks in the TLU domain', in Designing and Developing Useful Language Tests, Oxford University Press, Oxford, UK, pp. 95 - 132.
Bejarano, PAC & Chapetón, CM 2013, 'The Role of Genre-Based Activities in the Writing of Argumentative Essays'.
Bothell, TW 2001, '14 Rules for Writing Multiple-Choice Questions', 2001 Annual University Conference, Brigham Young University.
Brown, HD 2000, Principles of Language Learning and Teaching, 5th edn, Pearson Longman, White Plains, NY.
Brown, HD & Abeywickrama, P 2010, Language Assessment: Principles and Classroom Practices, 2nd edn, Pearson Education, White Plains, NY.
Brown, J 1998, 'Language testing: purposes, effects, options, and constraints', TESOLANZ Journal, vol. 6, pp. 13 - 30.
Brown, JD & Hudson, T 1998, 'The alternatives in language assessment', TESOL Quarterly, vol. 32, no. 4, pp. 653 - 675.
Clarke, IL & Hernandez, A 2011, 'Genre Awareness, Academic Argument, and Transferability', WAC Journal, vol. 22.
Coxhead, A 2000, 'A New Academic Word List', TESOL Quarterly, vol. 34, no. 2, pp. 213 - 238.
Ellis, NC, Simpson-Vlach, R & Maynard, C 2008, 'Formulaic Language in Native and Second Language Speakers: Psycholinguistics, Corpus Linguistics, and TESOL', TESOL Quarterly, vol. 42, no. 3, pp. 375 - 396.
Gebril, A 2009, 'Score generalizability of academic writing tasks: Does one test method fit it all?', Language Testing, vol. 26, no. 4, pp. 507 - 531.
Hamp-Lyons, L & Kroll, B 1996, 'Issues in ESL Writing Assessment: An Overview', College ESL, vol. 6, no. 1, pp. 52 - 72.
Hughes, A 2003, Testing for Language Teachers, 2nd edn, Cambridge University Press, Cambridge.
Hyland, K 2002, 'Directives: Argument and Engagement in Academic Writing', Applied Linguistics, vol. 23, no. 2, pp. 215 - 239.
Hyland, K 2008, 'Academic clusters: text patterning in published and postgraduate writing', International Journal of Applied Linguistics, vol. 18, no. 1, pp. 41 - 62.
Hyland, K & Tse, P 2007, 'Is There an Academic Vocabulary?', TESOL Quarterly, vol. 41, no. 2, pp. 235 - 253.
Kroll, B & Reid, J 1994, 'Guidelines for Designing Writing Prompts: Clarifications, Caveats, and Cautions', Journal of Second Language Writing, vol. 3, no. 3, pp. 231 - 255.
Lee, SH 2008, 'An integrative framework for the analyses of argumentative/persuasive essays from an interpersonal perspective', Text and Talk, vol. 28, no. 2, pp. 239 - 270.
Liu, F & Stapleton, P 2014, 'Counterargumentation and cultivation of critical thinking in argumentative writing: Investigating wash-back from a high-stakes test', System, vol. 45, pp. 117 - 128.
Lynch, B 1997, 'In search of the ethical test', Language Testing, vol. 14, no. 3, pp. 315 - 327.
Mitchell, R, Myles, F & Marsden, E 2013, Second Language Learning Theories, 3rd edn, Routledge, London.
Perkins, K 1983, 'On the Use of Composition Scoring Techniques, Objective Measures, and Objective Tests to Evaluate ESL Writing Ability', TESOL Quarterly, vol. 17, no. 4, pp. 651 - 671.
Swales, JM & Feak, CB 2012, Academic Writing for Graduate Students, 3rd edn, University of Michigan Press, Ann Arbor, MI.
UOW College 2015, about/diversity-equity/index.html.
Wingate, U 2012, ''Argument!' helping students understand what essay writing is about', Journal of English for Academic Purposes, vol. 11, no. 2, pp. 145 - 154.