Teachers of English To Speakers of Other Languages, Inc. (TESOL) TESOL Quarterly

Effects of Dynamic Corrective Feedback on ESL Writing Accuracy

Author(s): K. JAMES HARTSHORN, NORMAN W. EVANS, PAUL F. MERRILL, RICHARD R.
SUDWEEKS, DIANE STRONG-KRAUSE and NEIL J. ANDERSON
Source: TESOL Quarterly, Vol. 44, No. 1 (March 2010), pp. 84-109
Published by: Teachers of English to Speakers of Other Languages, Inc. (TESOL)
Stable URL: https://www.jstor.org/stable/27785071
Accessed: 06-09-2018 05:05 UTC
Effects of Dynamic Corrective Feedback
on ESL Writing Accuracy
K. JAMES HARTSHORN, NORMAN W. EVANS, PAUL F. MERRILL,
RICHARD R. SUDWEEKS, DIANE STRONG-KRAUSE, AND NEIL J.
ANDERSON
Brigham Young University
Provo, Utah, United States
Though recent research has shown that written corrective feedback
(WCF) may improve aspects of writing accuracy in some English as a
second language (ESL) contexts, many teachers continue to be confused
about the practical steps they should utilize to help their students
improve their writing. Moreover, some have raised concerns as to
whether commonly used approaches to ESL writing pedagogy and
grammar instruction are effective in helping students improve their
linguistic accuracy. This article describes an instructional strategy we
developed for improving students' accuracy based on insights gleaned
from practice, research, and theory. We refer to this instructional
methodology as dynamic WCF. The article also discusses a test of the
methodology's efficacy that compared the performance of two groups of
students, one using a conventional process approach to writing
instruction and the other using the dynamic WCF approach. Test results
demonstrated that although rhetorical competence, writing fluency, and
writing complexity were largely unaffected by the dynamic WCF
pedagogy, significant improvement was observed for writing accuracy.
doi: 10.5054/tq.2010.213781
Though writing ability is one of the most salient outcomes of higher
education, many learners of English as a second language (ESL)
continue to struggle to produce writing that is linguistically accurate
(e.g., Hinkel, 2002, 2004; Silva, 1993). Not only is this challenge
common for students enrolled in intensive English programs, but it is
evident for many matriculated university students as well. In an attempt
to provide practitioners with guidance for the best ways to teach second
language (L2) writing, many studies over the past few decades have
examined the effects of error correction or written corrective feedback
(WCF). Although some studies have claimed that WCF is ineffective or
harmful (e.g., Kepner, 1991; Truscott, 1996, 1999, 2004, 2007; Truscott
& Hsu, 2008), others have shown that, in certain contexts, it can improve
aspects of L2 writing accuracy (e.g., Bitchener, 2008; Bitchener & Knoch,
2008; Bitchener, Young, & Cameron, 2005; Ellis, Sheen, Murakami, &
Takashima, 2008; Ferris, 2006; Russell Valezy & Spada, 2006; Sheen, 2007).1
Despite growing evidence of the potential benefits of WCF in certain
contexts, many practitioners continue to feel perplexed about how to
interpret recent research and the practical steps they should take to
apply its findings in their classrooms. For many practitioners who have
continued to utilize WCF, the most important question was never whether
it was beneficial, but rather how to use it effectively to help their students
write more accurately. Despite the ongoing research, the answer to this
essential question has remained elusive. Therefore, this study is an
attempt to move us closer toward an understanding of how to use WCF
to maximize ESL student opportunities to learn to improve the linguistic
accuracy of their writing.
At the outset, we acknowledge that the accuracy of L2 writing may be
dramatically influenced by a number of variables such as the learning
environment, learner differences, and instructional methodologies
(Evans, Hartshorn, McCollum, & Wolfersberger, in press). Although
each of these deserves greater attention in our research and practice, the
focus of this study deals specifically with our growing concern that
common approaches to L2 writing pedagogy (largely based on models
for teaching first language [L1] writing) may be inadequate for helping
ESL learners to maximize the accuracy of their writing (see Hinkel,
2002, 2004; Grabe, 2001; Silva, 1993). Therefore, an instructional
strategy was developed based on compelling insights from practice,
theory, and research, with the specific intent of improving L2 writing
accuracy. Thus the purpose of this article is (a) to provide a brief
rationale for this instructional methodology and (b) to test its efficacy in
one specific ESL learning context.
THE NEED FOR A BETTER INSTRUCTIONAL METHODOLOGY
We begin with a brief discussion of why it may be useful to rethink the
instructional methodologies used for teaching L2 writing. First, we point
to a number of meta-analyses examining the benefits of formative
feedback in a variety of disciplines that have consistently demonstrated a
moderate to strong positive effect for feedback recipients when
compared to those in contrast groups (e.g., Azevedo & Bernard, 1995;
Guzzo, Jette, & Katzell, 1985; Kluger & DeNisi, 1996). We also note the
1 Because it is unfeasible for this article to present a broad survey of all of the relevant WCF
literature, these sources may be useful, particularly the extensive review provided by
Bitchener (2008).
growing evidence suggesting that negative feedback that draws learner
attention to linguistic form may play a meaningful role in facilitating L2
language development (e.g., Ayoun, 2001; Gu & Wang, 2008; Hino,
2006; Iwashita, 2003; Long, Inagaki, & Ortega, 1998; McDonough, 2005;
Pawlak, 2005).
Though such findings should give us confidence in the general
benefits of such feedback, similar gains have not always been apparent in
some ESL contexts where WCF has been utilized (e.g., Truscott, 1996,
1999, 2004, 2007). Although some researchers have appropriately
suggested that this lack of observable improvement in ESL writing
accuracy may be due to flaws in research methods (e.g., Ferris, 1999,
2004, 2006; Truscott, 1996, 2004), including neglecting to account for
individual learner differences (e.g., Ferris, 2006; Guenette, 2007), we
believe that it is at least as important to recognize that weaknesses in
instructional methodologies may also play a significant role in
preventing ESL learners from maximizing their ability to write more
accurately (Evans et al., in press).
In order to understand the nature of the improvements that may be
needed to increase ESL writing accuracy, consider two related problems
observed extensively in practice. First, utilizing WCF in many ESL writing
contexts is overwhelming for both the teacher and the student.
Providing quality feedback can be time-consuming for the teacher,
and the tasks of processing and implementing large amounts of
feedback can be unrealistic for the student. Second, the learning cycle
is seldom completed, in that instruction and feedback often fail to result
in observable improvements in the linguistic accuracy of the writing that
ESL learners produce. Whether students attend a traditional grammar
class or a class that focuses on process writing along with WCF, many
continue to make the same errors in writing tasks, despite explicit
classroom instruction and feedback from their teachers.
AN ALTERNATIVE INSTRUCTIONAL METHODOLOGY

In an effort to overcome such problems commonly observed with
traditional approaches to ESL grammar and writing pedagogy, an
alternative instructional methodology was devised based on a number of
insights gleaned from practice, research, and relevant theory. The goal
was to bring a fresh perspective to the problem of how to maximize the
opportunities of individual students to learn to write more accurately. In
doing so, we acknowledge that we have focused more on the immediate
needs of students and practitioners than we have on what might make a
tidy contribution to the current WCF research. Though recent research
literature has played an essential role in shaping our thinking, many of
our views have also grown out of numerous decades of experience
teaching, observing, and assessing L1 and L2 writing and its inherent
challenges. Although some of these views may represent a departure
from current thinking in the literature, we believe that they contribute
substantially to the dialogue surrounding WCF research.
Skill Acquisition Theory

First, we briefly discuss skill acquisition theory in terms of its relevance to
second language acquisition as described by DeKeyser (2001, 2007). He
asserts that declarative knowledge (what one knows) is required for the
development of procedural knowledge (what one can do) and that it must
be based on explicit rules and numerous examples. He also claims that
proceduralization requires extensive and deliberate practice, which then
leads the learner toward greater automatization. Although such notions
appear to be highly relevant for informing how WCF might be utilized, it
seems that, generally, they have not been applied effectively in pedagogy.
Two additional concepts from skill acquisition theory are important to
this study. First, the theory predicts that accuracy is a function of practice.
Second, the theory predicts that procedural knowledge does not transfer
well. Thus, if students are to learn to produce accurate writing, practice
tasks and activities must be authentic. With such a premium on writing
practice that is both frequent and authentic, we recognized the need to
effectively balance explicit instruction and extensive practice, along with
the other insights gleaned from observation and theory.
Making WCF Dynamic

Based on the need for practice that is both frequent and authentic, we
developed what we term dynamic WCF, the core component of our
instructional methodology. For our purposes, dynamic WCF is narrowly
defined as having two essential elements that we have hypothesized
many students may need in order to maximize their opportunity to learn
to write more accurately. It includes (a) feedback that reflects what the
individual learner needs most, as demonstrated by what the learner
produces,2 and (b) a principled approach to pedagogy that ensures that
writing tasks and feedback are meaningful, timely, constant, and manageable
for both student and teacher. Each of these aspects of dynamic WCF is
addressed below.
2 This is opposed to focused feedback based on a form that may be targeted as an
instructional objective or as part of a research study. For additional discussion of focused
and unfocused feedback, see Ellis et al. (2008).
Meaningful
To ensure that feedback is meaningful to the learner at a cognitive
level, indirect feedback is provided in the form of coded symbols that
identify the error type and its location. However, the student is
responsible for correcting errors on subsequent drafts (see Appendix
A for a list of errors and their corresponding symbols). Students are
taught how to interpret the symbols, and they record each of their errors
on an Error List, a comprehensive inventory of the errors they produce
along with the written context in which they are produced. Use of these
symbols also helps facilitate the students' ability to keep track of errors
on a Tally Sheet (see Appendix B), a cumulative list of errors that shows
frequencies for each error type. In addition to raising student awareness,
these tools are used to identify high-frequency errors, which form the
basis for the explicit instruction essential to skill-acquisition theory.
Finally, student writing is given a holistic score that reflects both
linguistic accuracy and the overall quality of the writing.3
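As an illustration only, the bookkeeping behind a Tally Sheet might be sketched in Python as follows. The error codes used here (VF, SP, D) are hypothetical placeholders, not the symbols from Appendix A, and the function names are assumptions introduced for this sketch.

```python
from collections import Counter


def update_tally(tally: Counter, marked_codes: list[str]) -> Counter:
    """Add the error codes marked on one paragraph to the cumulative tally."""
    tally.update(marked_codes)
    return tally


def most_frequent(tally: Counter, n: int = 3) -> list[tuple[str, int]]:
    """Return the n most frequent error types (candidates for explicit instruction)."""
    return tally.most_common(n)


# Hypothetical codes marked by a teacher on three 10-min paragraphs.
paragraphs = [["VF", "SP", "D"], ["D", "VF", "D"], ["SP", "D"]]

tally = Counter()
for codes in paragraphs:
    update_tally(tally, codes)

# The single most frequent error type across all paragraphs so far.
print(most_frequent(tally, 1))
```

A cumulative count of this kind is what allows both student and teacher to see which error types recur most often across daily paragraphs.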
Timely and Constant
Skill acquisition theory suggests that, in order for the feedback to be
meaningful enough to process effectively, it also needs to be timely and
constant. In dynamic WCF, feedback is timely, in that student writing is
consistently marked with the coded symbols and returned the following
class period. It is constant, in that students produce new pieces of writing
and receive feedback nearly every class period of the course.
Manageable
Another vital aspect of the tasks and feedback is that, in order for
them to be meaningful, timely, and constant, they also must be
manageable. Feedback is manageable for teachers when they have
enough time to attend to the quality and completeness of what they
communicate to their students. Feedback is manageable for the students
when they have the time and ability to process, learn from, and apply the
needed feedback from their teachers. Without manageable tasks and
feedback, students may be unable to process feedback effectively and
may experience something akin to the learning breakdown predicted by
cognitive load theory (Kirschner, 2002; Paas, Renkl, & Sweller, 2004).
3 Though a number of different rubrics might be used depending on specific features the
teacher may wish to emphasize, the purpose is to provide students with an overall sense of
the quality of their writing that may help them to contextualize improvement.
Moreover, without manageability, the frequency and the meaningfulness
of the needed practice would be impossible to maintain.
Though references to the need for manageability in ESL writing
practice have been largely absent from the literature, recently Bitchener
(2008, p. 109) has also noted the importance of providing manageable
feedback to prevent "information overload." To avoid this problem,
Bitchener (2008, p. 108) suggests that teachers and learners focus on
"one or only a few error categories" at a time. This view has been widely
advocated by other WCF researchers as well (e.g., Ellis et al., 2008; Ferris,
2006; Sheen, 2007). However, whereas such an approach may be useful
or necessary for certain types of research or theory building, it may be
much less practical for a classroom of students who are anxious to
improve the overall accuracy of their writing.
Such an approach would be especially problematic if the error
categories targeted for feedback did not represent the most frequent
error types produced by the individual students. In addition, focusing on
such a limited number of error categories seems at odds with notions of
effective practice suggested by researchers such as Ranta and Lyster
(2007, p. 151), who advocate practice that is "inherently repetitive and
psychologically authentic." It seems that, in order for writing tasks to be
truly authentic, students would need to focus on the accurate
production of all aspects of writing, simultaneously.
Therefore, rather than limit the focus of the feedback, the alternative
approach we use to ensure that tasks and feedback are manageable in
dynamic WCF is simply to limit the length of the student writing. With a
shorter piece of writing, teachers can identify all linguistic errors
produced by their students, without overwhelming themselves or their
students. Thus the essential element of our instructional methodology is
a 10-min paragraph written daily. Ten minutes was chosen because it
seemed long enough to provide a meaningful sample of writing, while
still being manageable enough for the teacher to mark and for the
student to process. This cycle, at the heart of providing dynamic WCF,
involves six steps as summarized in Figure 1.
FIGURE 1. Feedback cycle for dynamic written corrective feedback. [Figure not reproduced; step labels recoverable from the original: student records errors on tally sheet, types errors in error log, and resubmits typed copy of edited composition; teacher marks edited composition and returns it to student; student edits paragraph for remaining errors if necessary and resubmits to teacher.]
RESEARCH QUESTIONS
Though the main focus of this study is ESL writing accuracy, analyzing
accuracy without regard for other important dimensions of writing
would be meaningless. For example, though a piece of writing may be
completely free from linguistic errors, its ultimate quality must be
evaluated by its overall communicative effect. Thus it was determined
that linguistic accuracy would need to be examined within the context of
the rhetorical competence reflected in the writing, as well as its writing
fluency and writing complexity. This is because we concluded that, even
if the methodology had a positive effect on linguistic accuracy, such
improvements would need to be viewed in terms of any potential
trade-off effects that might be observed among complexity, fluency, and
accuracy, as described by Skehan (1998).
Measures of L2 Writing Production

In order to operationalize our research questions, the notions of
writing accuracy, rhetorical competence, writing fluency, and writing
complexity were carefully defined.
Writing Accuracy
Though many measures of accuracy might have been used, we
defined writing accuracy in terms of the error-free T-unit ratio (EFT/T)
as described and recommended by Wolfe-Quintero, Inagaki, and Kim
(1998). The EFT/T is calculated as the total number of error-free
T-units4 (EFT) in a given piece of writing divided by the total number of
T-units. The EFT/T was chosen as the measure in this study, because it
has been shown to be one of the most effective measures of accuracy
(Wolfe-Quintero et al., 1998) and because we felt it would
give us the most reliable results.
4 The T-unit was originally developed by Hunt (1965) as a way of measuring writing maturity
to overcome problems associated with the sentence as a unit of production. Hunt (1965, p.
49) defined the T-unit as "one main clause plus the subordinate clauses attached to or
embedded within it."
Rhetorical Competence
In addition to defining writing accuracy, it was necessary to establish
an appropriate measure of rhetorical competence. It was intended that
this measure capture the substance, organization, and flow of ideas in
student writing. To do this, a rubric was adapted from the TOEFL iBT
(Test of English as a Foreign Language Internet-based test). To suit the
purpose of this study, the rubric was modified to be limited to aspects of
rhetorical competence. Though nearly 80% of the original rubric stayed
intact, the modified rubric isolated rhetorical features of writing
common to process-writing instruction. These included criteria such as
addressing the writing task successfully; demonstrating effective
organization and development; providing appropriate examples, details, or
support; and conveying a sense of unity and coherence (see rubric in
Appendix C).
Writing Fluency and Complexity
In order to further contextualize writing accuracy and rhetorical
competence, we also sought to examine writing fluency and writing
complexity. Fluency was simply defined as the "number of words ... a
writer is able to include in their writing within a particular period of
time" (Wolfe-Quintero et al., 1998, p. 14). Though writing complexity
could have been defined many ways that target different aspects of
writing, the measure that seemed best suited for our purposes in this
study was the mean length of T-unit (the average number of words per
T-unit) as described by Ortega (2003) and Wolfe-Quintero et al. (1998).
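The three quantitative measures defined above (EFT/T for accuracy, word count for fluency, and mean length of T-unit for complexity) can be sketched in a few lines of Python. This is a minimal illustration, not the scoring procedure used in the study: it assumes the essay has already been segmented into T-units and that a human scorer has judged each one as error-free or not, and it counts words by splitting on whitespace. The `TUnit` class and the sample sentences are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TUnit:
    text: str
    error_free: bool  # judged by a human scorer; not computed here


def accuracy_eft_t(t_units: list[TUnit]) -> float:
    """Error-free T-unit ratio: error-free T-units divided by all T-units."""
    return sum(u.error_free for u in t_units) / len(t_units)


def fluency(t_units: list[TUnit]) -> int:
    """Number of words produced in the timed task (whitespace-delimited)."""
    return sum(len(u.text.split()) for u in t_units)


def complexity_mltu(t_units: list[TUnit]) -> float:
    """Mean length of T-unit: average number of words per T-unit."""
    return fluency(t_units) / len(t_units)


# Hypothetical essay fragment, already segmented and scored.
essay = [
    TUnit("Money is useful", True),
    TUnit("but it do not make people happy", False),
    TUnit("Success means reaching the goals that you set", True),
    TUnit("Many rich people is lonely", False),
]

print(accuracy_eft_t(essay), fluency(essay), complexity_mltu(essay))
```

Because all three measures share the same T-unit segmentation, a single pass of human scoring supplies the inputs for accuracy, fluency, and complexity alike.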
Operationalized Research Questions
Though classroom practice was based on 10-min compositions so tasks
and feedback could be manageable, meaningful, timely, and constant,
the intent behind our instructional methodology was to improve the
accuracy of our students' writing. We concluded that this would need to
be demonstrated by new pieces of writing in a pretest-posttest design,
rather than merely examining text revisions (e.g., Bitchener, 2008;
Truscott, 2007). Therefore, we chose a 30-min essay as the elicitation
task, with the hypothesis that the dynamic WCF provided for the 10-min
compositions would transfer to the larger 30-min essays. Thus our
research questions were operationalized as
1. Based on 30-min pretest and posttest essays, will mean accuracy scores
from the treatment group posttest essays be significantly greater than
those from the contrast group?
2. Based on 30-min pretest and posttest essays, will rhetorical competence
scores, fluency scores, or complexity scores from the treatment group
posttest essays be significantly lower than those from the contrast group?
METHOD
Participants
The Students
This study included 47 advanced-low to advanced-mid ESL students
who were studying at Brigham Young University's English Language
Center (ELC) in the United States. The treatment group consisted of 28
students ranging from ages 18 to 45 years (with a mean of 24 years), and
the contrast group included 19 students ranging from ages 18 to 33 years
(with a mean of 25 years). Table 1 summarizes the composition of the
treatment and contrast groups in terms of native language and gender.
This breakdown of student L1s is useful for examining the potential
effect of language distance or the notion that differences between
various L1s and English may account for some of the relative difficulty or
speed with which a learner may acquire English (Odlin, 1989). Corder
(1981) claimed that native speakers of western European languages such
as Spanish would likely experience less difficulty learning English when
compared with native speakers of Asian languages such as Chinese,
Japanese, or Korean. With this in mind, we note that the percentage of
native speakers of western European languages in the contrast group was
just under 53%, and the percentage in the treatment group was just over
71%. In addition, the native speakers of Chinese, Japanese, and Korean
made up just over 31% of the contrast group and 29% of the treatment
group.
TABLE 1
Experimental Groups by Native Language and Gender

                         Treatment                 Contrast
Native language     Male  Female  Total      Male  Female  Total
Spanish               10       9     19         2       4      6
Korean                 4       2      6         0       3      3
Mandarin               0       0      0         1       2      3
Portuguese             0       0      0         1       2      3
Japanese               1       1      2         0       0      0
French                 1       0      1         0       1      1
Mongolian              0       0      0         0       1      1
Romanian               0       0      0         0       1      1
Russian                0       0      0         0       1      1
Totals                16      12     28         4      15     19
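The percentages quoted above can be re-derived from the per-language totals in Table 1, as in the following sketch. Two points are assumptions rather than statements from the article: the French totals (one student in each group) are inferred from the column totals, and the grouping of Spanish, Portuguese, and French as the western European languages is simply the grouping that reproduces the reported figures.

```python
# Total students per native language: (treatment, contrast), read from Table 1.
table1 = {
    "Spanish": (19, 6), "Korean": (6, 3), "Mandarin": (0, 3),
    "Portuguese": (0, 3), "Japanese": (2, 0), "French": (1, 1),
    "Mongolian": (0, 1), "Romanian": (0, 1), "Russian": (0, 1),
}

# Language groupings assumed for this sketch.
WESTERN_EUROPEAN = {"Spanish", "Portuguese", "French"}
EAST_ASIAN = {"Korean", "Mandarin", "Japanese"}

TREATMENT, CONTRAST = 0, 1


def share(languages: set[str], group: int) -> float:
    """Percentage of a group's students whose L1 is in the given set."""
    group_total = sum(counts[group] for counts in table1.values())
    subset_total = sum(table1[lang][group] for lang in languages)
    return 100 * subset_total / group_total


for label, group in (("treatment", TREATMENT), ("contrast", CONTRAST)):
    print(label,
          round(share(WESTERN_EUROPEAN, group), 1),
          round(share(EAST_ASIAN, group), 1))
```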
Although these data may imply a slight advantage for the treatment
group, additional insights from Ringbom (1987) suggest that any
potential advantage would likely be minimal. First, he noted that L1
influence is stronger for younger learners than for older learners.
Second, he observed that L1 influence is greatest for those with lower
proficiency and less significant for those with higher proficiency. Third,
he concluded that L1 influence is greater in highly communicative tasks
and less significant when more monitoring takes place. Unlike those
learners who most likely would be affected by language distance, the
students in this study were advanced-level adult learners who were
engaged in writing tasks, which allowed for substantial monitoring.
Therefore, it was assumed that the influence of language distance on
student performance would be minimal, if not negligible.
The Teachers
Three different teachers taught students in the treatment group.
Teacher A taught 10 students, Teacher B taught 10 students, and
Teacher C taught 8 students. For the contrast group, Teacher D taught 6
students and Teacher E taught 5 students. In addition to teaching
students in the treatment group, Teacher C also taught 8 students in the
contrast group. All of the teachers who taught the students in the
contrast group were well experienced in teaching traditional process
writing, and all of the instructors in both groups were highly regarded as
effective teachers by their students.
The Scorers and Raters
In an effort to estimate the reliability of the measures investigated in
this study, three experienced teachers scored or rated essays or essay
components after undergoing a brief period of rigorous training and
practice. Additional 30-min essays that were not part of this study were
used for training for both scorers and raters over a two-week period.
During this time, several meetings were held to discuss the specific
criteria to be used to evaluate the essays. Participants also practiced by
alternating scoring or rating as a group, using think-aloud protocols, and
then as individuals. The goal was to achieve a sustained pattern of
consistency among participants (i.e., r > 0.90), before moving on to the
essays included in this study. Although this goal was easily exceeded by
those scoring EFT/Ts, it took much more effort to reach this goal for
those who rated essays for rhetorical competence. Though two of these
raters also served as teachers in the experiment, they were blind to
student and testing occasion in their rating.
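The training criterion (r > 0.90) amounts to a correlation check between two scorers' results on the same practice essays. A minimal sketch follows; the EFT/T values are invented for illustration and are not data from the study, and Pearson's r is written out by hand so that the block needs no third-party library.

```python
from math import sqrt


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical EFT/T scores assigned by two scorers to five practice essays.
scorer_1 = [0.20, 0.35, 0.50, 0.62, 0.80]
scorer_2 = [0.22, 0.33, 0.55, 0.60, 0.78]

r = pearson_r(scorer_1, scorer_2)
ready_for_study_essays = r > 0.90  # the consistency goal described above
print(round(r, 2), ready_for_study_essays)
```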
Treatment and Contrast Groups

Although students were not randomly assigned to experimental
groups, program administrators followed their usual practice of
"balancing" classes of the same proficiency level in terms of factors
such as L1, nationality, gender, and age. Efforts were made to ensure
that student experiences between the treatment and contrast groups
were as similar as possible. For example, using a series of in-house
language tests, all of the participants in both groups were placed into
Level 5, the highest proficiency level at the ELC. Students in both groups
studied English intensively for 15 weeks. During that time, course work
for both groups included four 65-min class periods per day, Monday
through Thursday.
Table 2 provides a breakdown of how time was allocated for the
treatment and contrast groups and helps illustrate similarities between
the groups. Though the nonexperimental study for both groups
centered on classroom and homework activities designed to strengthen
reading, listening, and speaking skills, the experimental study was
unique to each group. While students in the treatment group were
exposed to the instructional methodology utilizing dynamic WCF, those
in the contrast group participated in a traditional process writing course.
Careful reviews of the curriculum and limited classroom visits led us to
assume that there was nothing in the nonexperimental classes that was
likely to affect the two groups differently in terms of writing accuracy.
Students in the treatment group wrote 10-min compositions nearly
every day of the course. Writing covered diverse topics, including
opinions, analyses of social issues, science, history, and popular
culture. They used the Tally Sheets and Error Lists described previously.
They also received indirect feedback in the form of coded symbols and
continued to rewrite their compositions until all of their errors were
corrected. Classroom discussions and activities were centered on the
most frequent types of errors being produced by the students in their
daily writing.
TABLE 2
Weekly Time Allocations for Classroom and Homework Activities

                           Contrast                    Treatment
Study emphasis      Class time     Homework     Class time     Homework
Experimental        4 hr 20 min    2 hr         4 hr 20 min    2 hr
Nonexperimental     13 hr          6 hr         13 hr          6 hr
Totals              17 hr 20 min   8 hr         17 hr 20 min   8 hr
On the other hand, students in the contrast group were taught skills
common to process writing. During the experimental period, these
students wrote four multidraft papers and received detailed feedback on
each draft. However, this class not only emphasized a variety of
rhetorical writing skills, but it also focused on linguistic accuracy. It
may be helpful to emphasize that, unlike some contrast groups in recent
studies, who were only given rhetorical feedback (e.g., Bitchener, 2008;
Bitchener & Knoch, 2008; Bitchener et al., 2005), these students were
also given a wide variety of feedback on the linguistic accuracy of what
they produced. Students in both groups wrote three or four
30-min essays like those that were administered as the pretest and posttest.
Design
A pretest-posttest nonequivalent control group design was used for
this study, as described by Shadish, Cook, and Campbell (2002). A
mixed-model, repeated-measures analysis of variance (ANOVA) was
computed using the Statistical Package for the Social Sciences (SPSS). Because
multiple tests would be analyzed, a pseudo-Bonferroni adjustment of
0.01 for the significance level was used as described by Huck (2008).
Though this adjustment was chosen in an effort to balance the risk of
making either Type I or Type II errors, it was anticipated that effect sizes
would need to be analyzed to help contextualize the test results. In
addition, Facets software (Linacre, 2006) was used to analyze rating data
based on the many-facets Rasch model (MFRM).
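To make the significance threshold concrete, the sketch below compares a strict Bonferroni correction with the 0.01 level reported above. Two values are assumptions and are not stated in the article: a conventional familywise alpha of 0.05, and four tests (one per dependent variable). Under those assumptions the strict threshold is 0.0125, so the adopted 0.01 is slightly more conservative.

```python
def bonferroni_alpha(family_alpha: float, n_tests: int) -> float:
    """Strict Bonferroni per-test significance level."""
    return family_alpha / n_tests


FAMILY_ALPHA = 0.05   # assumption: conventional familywise level
N_TESTS = 4           # assumption: one test per dependent variable
ADOPTED_ALPHA = 0.01  # the adjusted level reported in the article

strict_alpha = bonferroni_alpha(FAMILY_ALPHA, N_TESTS)


def is_significant(p_value: float, alpha: float = ADOPTED_ALPHA) -> bool:
    """Decide significance for one test at the adopted per-test level."""
    return p_value < alpha


print(strict_alpha, ADOPTED_ALPHA <= strict_alpha)
```

A stricter per-test level lowers the risk of Type I errors across the set of tests at some cost in power, which is why effect sizes remain useful for contextualizing results.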
Reliability
For the findings of this study to be meaningful, it was necessary to
provide appropriate estimates of the reliability for the included
measures. Of the four dependent variables examined in this study,
rhetorical competence was based on the rubric ratings from three
judges, and the measures of accuracy, fluency, and complexity were
based on scores provided by two judges.
Scoring
A criterion of absolute agreement for the number of T-units for each
essay was established between the first scorer (S1) and the second scorer
(S2). When discrepancies emerged, S1 and S2 reexamined the essay and
determined the number of T-units jointly. While S1 scored all 94 essays
on the number of EFTs for each essay, S2 scored 48 of the essays, based
on a stratified random sample that drew proportionally from six possible
groups of essays, including pretest and posttest essays from students who
were rated by their teachers to be at a low, middle, or high proficiency
level. This is illustrated in Table 3.
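A stratified random draw of this kind might be sketched as follows. The per-cell quotas (3 contrast and 5 treatment essays for each testing occasion and proficiency level) follow Table 3; everything else is hypothetical, including the essay identifiers, the assignment of students to proficiency levels, and the fixed random seed.

```python
import random


def stratified_sample(essays: list[dict], quota: dict[str, int], seed: int = 0) -> list[dict]:
    """Draw quota[group] essays at random from every (occasion, level, group) cell."""
    rng = random.Random(seed)
    strata: dict[tuple, list[dict]] = {}
    for essay in essays:
        key = (essay["occasion"], essay["level"], essay["group"])
        strata.setdefault(key, []).append(essay)
    sample = []
    for key in sorted(strata):
        group = key[2]
        sample.extend(rng.sample(strata[key], quota[group]))
    return sample


# Build a hypothetical pool of 94 essays: 47 students x 2 testing occasions.
essays = []
for group, n_students in (("contrast", 19), ("treatment", 28)):
    for student in range(n_students):
        level = ("low", "middle", "high")[student % 3]  # invented teacher ratings
        for occasion in ("pretest", "posttest"):
            essays.append({"id": f"{group}-{student}-{occasion}",
                           "occasion": occasion, "level": level, "group": group})

sample = stratified_sample(essays, {"contrast": 3, "treatment": 5})
print(len(essays), len(sample))
```

Sampling within each cell keeps both testing occasions and all three proficiency levels represented in the double-scored subset.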
Rating
Three raters were used in this study: R1, R2, and R3. R1 used the
rhetorical competence rubric to rate all students on both the pretest and
posttest (94 essays). R2 rated all of the pretests (47 essays), and R3 rated
all of the posttests (47 essays). This rating design, described by
Schumacker (1999), allowed us to use the MFRM to account for and
adjust the ratings based on differences in essay difficulty as well as
interrater or intrarater inconsistencies (see Bond & Fox, 2007; Linacre,
1994). In addition to the MFRM, an intraclass correlation for the three
raters was also estimated. Because this required a fully crossed design (all
raters providing a score for each essay), this estimate was based on an
additional 23 posttest essays rated by R2 and an additional 23 pretest
essays rated by R3.
Elicitation Procedures
The pretest task for both the treatment group and the contrast group
was simply to write for 30 min in response to the prompt
Do you agree or disagree with the following statement? Only people who earn
a lot of money are successful. Use specific reasons and examples to support
your answer.
Similarly, the posttest task for both experimental groups was to write for
30 min in response to the prompt
In your opinion, what is the most important characteristic (for example,
honesty, intelligence, a sense of humor) that a person can have to be
successful in life? Use specific reasons and examples from your experience to
explain your answer.
TABLE 3
Stratification for the Second Scorer's Random Sampling

Testing occasion    Proficiency level    Contrast group    Treatment group
Pretest             High                        3                  5
                    Middle                      3                  5
                    Low                         3                  5
Posttest            High                        3                  5
                    Middle                      3                  5
                    Low                         3                  5
Total                                          18                 30
In both instances the elicitations occurred in a computer lab, where

students typed their responses during the regular final exam period in a
secure testing environment. Although the software allowed the students
to cut, copy, or paste text, no other word processing tools were available.
Once the time ran out, the software prevented the students from being
able to continue to type additional text.
RESULTS
Reliability Estimates
Scoring and Rating Reliability

The Pearson correlation coefficient between S1 and S2 for the EFT/T
was 0.97. Ratings from R1, R2, and R3 produced an intraclass correlation
coefficient of 0.87 for the 48 essays that were triple rated based on the
rhetorical competence rubric. In addition, data analysis from Facets
software showed good separation among essays (i.e., the essays were
fairly reliably separated from one another in terms of the level of
rhetorical competence demonstrated by each).
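As a point of reference for the scorer agreement figure, the sketch below computes a Pearson correlation between two scorers' values. The function name and the scores are hypothetical; they are not the study's data and will not reproduce the reported coefficient.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical EFT/T values from two scorers for the same five essays.
scorer_1 = [12.5, 20.0, 8.3, 30.0, 15.4]
scorer_2 = [12.5, 18.2, 8.3, 33.3, 15.4]

agreement = pearson_r(scorer_1, scorer_2)
print(round(agreement, 2))
```

Because the Pearson coefficient reflects how consistently the two scorers order the essays rather than whether their values are identical, it is usually read alongside the joint resolution of discrepancies described in the method.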
Although such separation among essays is desirable, we note that
separation is undesirable among raters. Though separation was higher
than expected⁵ among the three raters, the in-fit statistics appear to be at
acceptable levels (i.e., these data adequately fit the predicted model).
According to Wright and Linacre (1994), the in-fit statistics would need
to fall within the range of 0.5 to 1.7 in order to be acceptable for clinical
observation. However, based on the procedure⁶ recommended and used
by others, such as McNamara (1996) and Pollitt and Hutchinson (1987),
the in-fit statistics would need to fall within the range of 0.3 to 1.38.
Because the most extreme in-fit statistic of 0.51 from the most lenient
rater was within both of these acceptable ranges, raters could be
considered consistent enough to allow the model to produce a fair
average for each essay, to adjust for the observed rater inconsistencies.
These fair average scores were used for our statistical analyses.
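The two acceptance ranges above can be checked with a few lines of arithmetic. The sketch uses only values reported in the text and in footnote 6; the function and variable names are hypothetical.

```python
def infit_range(mean_square_mean, sd, width=2.0):
    """Return (low, high): the mean of the mean-square values
    plus or minus `width` standard deviations."""
    return (mean_square_mean - width * sd, mean_square_mean + width * sd)


def within(value, bounds):
    """True when value lies inside the closed interval bounds."""
    low, high = bounds
    return low <= value <= high


# Values reported in the text and in footnote 6.
sample_based = infit_range(0.84, 0.27)  # about (0.30, 1.38)
clinical = (0.5, 1.7)                   # range attributed to Wright and Linacre (1994)
most_extreme_infit = 0.51               # most lenient rater

acceptable = within(most_extreme_infit, sample_based) and within(
    most_extreme_infit, clinical
)
print(acceptable)  # True
```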
5 This is based on a minimum benchmark of at least 0.80 for a reliability separation index
and at least 2.0 for separation (J. M. Linacre, personal communication, May 15, 2008).
Essays: reliability separation index = 0.86, separation = 2.43. Raters: reliability separation
index = 0.91, separation = 3.14.
6 The in-fit statistic should fall within the mean of the mean-square values plus or minus
twice the standard deviation (i.e., 0.84 ± 2(0.27) = 0.30 to 1.38).
Gender and Teacher Differences
Before examining our results, we briefly address two concerns we had
regarding our data. The first was the disproportionately large number of
female students in the contrast group, and the second was the issue of
teacher differences. To test whether the effect of gender should be a
concern in our data analysis, a mixed-model ANOVA was performed for
gender by time, with accuracy scores as the dependent variable. Results
of this test suggest that gender was not a significant influence on mean
student performance, F(1, 45) = 0.002, p = 0.96.
The second concern with our data dealt with teacher differences.
Although one teacher taught students in both the treatment group and
the contrast group, there was no overlap for the other teachers. Though
an effort was made to use comparable teachers for both groups, we
realized that such variability would make it difficult to rule out teacher
effect. Therefore, to test for teacher effect, we performed a two-way
ANOVA for experimental group by teacher, where students of the
instructor who taught both experimental groups functioned as their own
group and were contrasted with the students who were taught by the
other two teachers. No significant difference was observed between
mean accuracy scores of students grouped by teacher, F(1, 45) = 1.06,
p = 0.31. This suggests that teacher differences probably had little effect
on mean performance levels.
ANOVA Test Results
Before discussing our ANOVA test results, we should briefly comment
on how well our data met the requisite ANOVA assumptions. Though a
strict process for random assignment was not possible in this study, we
attempted to make student experiences unrelated to the treatment as
similar as possible for both groups. In addition, the Kolmogorov-Smirnov
test was used (p ≥ 0.10), suggesting that distributions were
normal. We also used Levene's test, pretest: F(1, 45) = 2.48, p = 0.12;
posttest: F(1, 45) = 3.2, p = 0.08, indicating that the equality of error
variance across groups was at acceptable levels.
The first research question dealt with whether the mean accuracy
scores from the treatment group posttest essays would be significantly
greater than those from the contrast group. Table 4 provides the means
and standard deviations for accuracy scores for the treatment and
contrast group. Table 5 shows a significant interaction effect (p = 0.001)
illustrated in Figure 2, demonstrating that significantly higher accuracy
scores were produced by those who received the treatment than those
who had been instructed with the traditional approach.
TABLE 4
Descriptive Statistics for Accuracy Scores

Group                        Pretest   Posttest   Mean
Control (n = 19)     Mean    16.30     13.78      15.04
                     SD      10.70     11.81      11.26
Treatment (n = 28)   Mean    14.02     24.16      19.09
                     SD      15.00     19.46      17.23
Total (N = 47)       Mean    14.94     19.97      17.46
                     SD      13.35     17.42      15.39

Note. SD = standard deviation.
TABLE 5
Mixed ANOVA Summary Table for Accuracy Scores

Source              SS          df   MS       F       p       ηp²
Between Subjects                46
  Group             371.05      1    371.05   0.95    0.33    0.02
  Error             17,536.12   45   389.69
Within Subject                  47
  Time              329.01      1    329.01   4.44    0.04    0.09
  Time x Group      908.19      1    908.19   12.26   0.001   0.21
  Error             3,333.22    45   74.07
Total               22,477.59   93
FIGURE 2. Illustration of effect in Table 5 (contrast and treatment group accuracy scores at pretest and posttest).

Two additional observations are worth noting. First, an analysis of
simple main effects shows that, whereas posttest differences between the
experimental groups were significant, pretest differences were not,
F(1, 46) = 2.3, p = 0.14, showing relatively equal performance levels for
both experimental groups on accuracy prior to the treatment. Second,
based on the guidelines proposed by Cohen⁷ (1988), the partial eta
squared (ηp²) of 0.21 suggests a fairly large effect size that could be
attributed to the effect of the instructional methodology.
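The F ratios and effect sizes in Table 5 follow directly from its sums of squares and degrees of freedom, as the sketch below shows. The function names are hypothetical, and the "negligible" label for values under 0.01 is an assumption added for completeness; the other thresholds are the ones quoted from Cohen (1988) in the footnote.

```python
def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """F = mean square of the effect divided by mean square of its error term."""
    return (ss_effect / df_effect) / (ss_error / df_error)


def partial_eta_squared(ss_effect, ss_error):
    """Partial eta squared = SS_effect / (SS_effect + SS_error)."""
    return ss_effect / (ss_effect + ss_error)


def cohen_label(eta_p2):
    """Label an effect size using the thresholds quoted from Cohen (1988).
    The 'negligible' label below 0.01 is an assumption, not from the source."""
    if eta_p2 >= 0.14:
        return "large"
    if eta_p2 >= 0.06:
        return "moderate"
    if eta_p2 >= 0.01:
        return "small"
    return "negligible"


# (SS effect, df effect, SS error, df error) as reported in Table 5.
effects = {
    "Group": (371.05, 1, 17536.12, 45),
    "Time": (329.01, 1, 3333.22, 45),
    "Time x Group": (908.19, 1, 3333.22, 45),
}

for name, (ss_eff, df_eff, ss_err, df_err) in effects.items():
    f_value = f_ratio(ss_eff, df_eff, ss_err, df_err)
    eta = partial_eta_squared(ss_eff, ss_err)
    print(f"{name}: F = {f_value:.2f}, partial eta squared = {eta:.2f} ({cohen_label(eta)})")
```

Note that the between-subjects effect (Group) is tested against the between-subjects error term, while Time and the Time x Group interaction are tested against the within-subject error term.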
The second research question dealt with whether means from the
rhetorical competence scores, writing fluency scores, or writing
complexity scores from the treatment group posttest essays would be
significantly lower than those from the contrast group. None of these
measures were significant, based on our pseudo-Bonferroni adjustment
of 0.01 determined previously: rhetorical competence: F(1, 45) = 0.09,
p = 0.77; writing fluency: F(1, 45) = 1.8, p = 0.19; writing complexity:
F(1, 45) = 3.2, p = 0.08. Nevertheless, there was a small effect for writing
fluency (ηp² = 0.07), suggesting that the instructional methodology may
have had a slight negative effect on writing fluency and complexity.
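When only an F statistic and its degrees of freedom are available, partial eta squared can be approximated with the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The sketch below applies it to the three F values reported above, assuming each tests the effect of interest with 1 and 45 degrees of freedom. The names are hypothetical, and because the reported F values are rounded, the recovered effect sizes are approximate.

```python
def partial_eta_squared_from_f(f_value, df_effect, df_error):
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


ALPHA = 0.01  # adjusted significance level stated in the text

# (F, p) pairs as reported in the text, each with df = (1, 45).
reported = {
    "rhetorical competence": (0.09, 0.77),
    "writing fluency": (1.8, 0.19),
    "writing complexity": (3.2, 0.08),
}

summary = {}
for measure, (f_value, p_value) in reported.items():
    eta = partial_eta_squared_from_f(f_value, 1, 45)
    summary[measure] = (eta, p_value < ALPHA)
    print(f"{measure}: partial eta squared ~ {eta:.2f}, significant at {ALPHA}: {p_value < ALPHA}")
```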
DISCUSSION
The purpose of this study was to test the effects of our instructional
methodology on ESL writing accuracy within the context of its overall
impact on rhetorical competence, writing fluency, and writing complexity.
In order to do this, one group of students was taught utilizing
dynamic WCF. The writing performance of the treatment group was
contrasted with the performance of another group of students who were
taught using traditional approaches to process writing. Results revealed
that the treatment had a relatively large effect on improving the mean
accuracy scores of those students in the treatment group compared with
those in the contrast group.
Although there were no statistical differences between the two groups
over time, in terms of their mean rhetorical competence ratings, writing
fluency scores, or writing complexity scores, the traditional approach to
process writing slightly favored the contrast group in terms of writing
fluency and writing complexity, according to analyses of effect size.
Although it is not clear exactly why the writing from students in the
treatment group may have produced slightly less fluency and complexity
compared with the writing from the contrast group, it is conceivable
that, as students strive to write more accurately, the fluency and
complexity of their writing may be inhibited slightly as they monitor
their production more carefully.⁸
7 Cohen's guidelines for interpreting effect sizes (ηp²): 0.01 = small, 0.06 = moderate, 0.14
= large.
8 These results seem consistent with the trade-offs among accuracy, fluency, and complexity
described by Skehan (1998).
However, two important points should be kept in mind regarding
these findings. First, the effect of the experiment on accuracy scores,
which favored the treatment group, was relatively large, whereas the
effects of the treatment on writing fluency and complexity, which
appear to have favored the contrast group, were much smaller. Second,
it should be noted that, in terms of statistical significance based on
simple main effects, the treatment group's performance levels for
fluency and complexity did not decline. Rather, they remained
unchanged over time or improved, though not to the same extent as
the contrast group.
With these findings in mind, one might ask whether the observed
increase in accuracy is worth the small but apparent negative effect the
treatment may have had on the development of fluency and complexity.
One way to attempt to answer this question is to convert mean scores on
these measures into units that can be discussed in more practical terms.
For example, consider writing fluency. Since a test of simple main effects
revealed no significant differences between the contrast group and the
treatment group on the pretest, F(1, 46) = 0.16, p = 0.69, then the
posttest scores might serve as a practical estimate of the effect of the
treatment on fluency.
An examination of posttest means shows that, on average, the
treatment group wrote approximately 36 fewer words (roughly one and
one-half to two sentences) than the contrast group, out of
an average of approximately 388 words written during the 30-min time
limit. Although both groups significantly increased their fluency over the
experimental period, these data suggest that students in the treatment
group produced approximately 9% less text compared with the contrast
group for the allotted time.
Similarly, we should also examine the treatment's practical effect on
writing complexity. Although this is less straightforward because of an
interaction effect with different pretest and posttest scores, it may be
instructive to note that, whereas a test of simple main effects showed that
pretest and posttest means for the treatment group were not significantly
different, F(1, 46) = 0.37, p = 0.55, the contrast group advanced from a
mean length of T-unit of 12.56 to 14.13, a gain of approximately 1.5
words per T-unit.
Finally, we should contextualize these findings by examining the
practical effect of the treatment on writing accuracy. Although
differences in the pretest accuracy scores between the treatment and
contrast groups were not statistically significant, F(1, 46) = 2.3, p = 0.14,
mean posttest scores suggest that, on average, the writing of the students
in the treatment group (M = 24.16) was just over 75% more accurate
than the writing of the students in the contrast group (M = 13.78), based
on the error-free T-unit ratio. This difference is perhaps most
meaningful when we keep in mind that this study did not compare the
effects of treatment with the effects of a methodology that did not utilize
WCF; rather, both experimental groups in this study received WCF that
targeted linguistic accuracy. This seems to underscore the notion that
how one uses WCF may make a great difference in the outcome.
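The practical conversions in this part of the discussion are simple ratios and differences, sketched below with the figures reported in the text. The variable and function names are hypothetical.

```python
def percent_difference(value, reference):
    """Difference between value and reference as a percentage of reference."""
    return 100.0 * (value - reference) / reference


# Figures reported in the discussion.
fluency_gap_words = 36           # treatment group wrote about 36 fewer words
fluency_reference_words = 388    # approximate average the gap is compared against
accuracy_treatment = 24.16       # posttest mean, error-free T-unit ratio
accuracy_contrast = 13.78        # posttest mean, error-free T-unit ratio
complexity_pre, complexity_post = 12.56, 14.13  # contrast group, mean length of T-unit

fluency_shortfall_pct = 100.0 * fluency_gap_words / fluency_reference_words
accuracy_advantage_pct = percent_difference(accuracy_treatment, accuracy_contrast)
complexity_gain = complexity_post - complexity_pre

print(f"Fluency shortfall: about {fluency_shortfall_pct:.0f}%")
print(f"Accuracy advantage: about {accuracy_advantage_pct:.1f}%")
print(f"Complexity gain (contrast group): {complexity_gain:.2f} words per T-unit")
```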
With these findings in mind, two observations are in order. First, the
treatment appears to have had a large beneficial effect on the accuracy
of the students' ESL writing. Second, though rhetorical competence
seemed unaffected, the treatment appears to have had a slight
unfavorable effect on writing fluency and writing complexity.
Nevertheless, we believe that most teachers who strive to improve the
writing accuracy of their ESL students would welcome such progress,
even if it meant sacrificing a sentence or two from an essay or a slight
reduction in its complexity. One might well ask, "What is the true value
of small gains in writing fluency or complexity when the substance of
those gains is laden with linguistic errors that undermine communicative
efficacy?" It seems that, for improvements in writing fluency or
complexity to become truly meaningful, it would be necessary to observe
equal or greater improvements in accuracy.
Pedagogical Implications
Perhaps the most salient outcome of this study is that it has shown
that a systematic approach to WCF can have a positive effect on the
accuracy of ESL writing. Although the skills developed through process
writing and the activities that strengthen rhetorical competence, writing
fluency, and writing complexity are important pursuits that have an
appropriate place in an ESL writing curriculum, traditional approaches
to process writing may be inadequate for helping students maximize
their linguistic accuracy. Perhaps this is because traditional approaches
lack the frequency and volume of practice and feedback needed for
improvement. Thus efforts to improve accuracy may be more successful
if separated from attempts to develop other aspects of ESL writing. For
example, though the instructional methodology presented here may not
be a good substitute for a general writing class designed to improve
rhetorical aspects of writing, it may be more effective at producing
linguistic accuracy than traditional methodologies that lack frequent
opportunities for productive practice and feedback.
Though we recognize that linguistic accuracy may not be a priority in
every L2 learning context and that dynamic WCF may not be well suited
for all ESL writers, these findings seem promising for practitioners who
are striving to help motivated students improve the accuracy of their L2
writing. Although care should be exercised in generalizing beyond the
context of this study, if these findings represent an appropriate
description of what might be observed in similar settings, then ESL
writing teachers and administrators may want to weigh the possible
benefits and trade-offs of such an approach to improving writing
accuracy within their specific teaching and learning contexts.
Limitations and Further Research
Despite the potential benefits of these findings, there are a number of
limitations in this study that should be considered. Though these intact
classes had been balanced by administrators in terms of proficiency, L1,
nationality, and gender prior to the experiment, they were not subjected
to a strict process of randomization. This should be corrected in future
research. Moreover, although dynamic WCF includes several different
components that may strengthen pedagogy, this can be problematic for
research, because it may be unclear which elements of the methodology
are the most helpful. Additional research might clarify this by isolating
the various components of dynamic WCF to identify those elements that
have the greatest effect on improved accuracy. A related question deals
with whether dynamic WCF could be equally useful for students at other
proficiency levels such as intermediate-high or intermediate-low.
In addition, though the experimental period in this study was only
one semester, another important question for further research would be
to determine how the results might differ if the study were continued
over two or three semesters. For example, would student accuracy over a
longer period continue to improve, plateau, or decline? In addition,
would a longitudinal study result in different effects for rhetorical
competence, fluency, or complexity? These and other longitudinal
questions could be pursued to increase our understanding of how to
help ESL students improve the accuracy of their writing over time.
CONCLUSION
Though some have questioned the effectiveness of traditional
approaches to WCF, this study has shown that dynamic WCF, based on
insights from practice and theory, has helped ESL students improve the
accuracy of their writing. Though additional research is needed to test
the benefits of dynamic WCF in other learning contexts and to answer
additional questions about the best ways to use WCF, these findings
should be valuable to program administrators and practitioners who
strive to help their ESL students write more accurately.
Perhaps the time has come to reframe the WCF debate to focus less
on whether WCF is effective and more on how to use WCF to help students
learn to write more accurately. In doing so, related research should focus
on providing practitioners with information that can inform pedagogy in
ways that are both meaningful and practical. Although the path toward
accurate ESL writing may be steep and strewn with challenges, the
findings of this study suggest that substantial progress may be possible.
Explicit instruction, coupled with ongoing practice and dynamic WCF,
may hasten many L2 learners along this important path in their
language development.
THE AUTHORS
K. James Hartshorn has been involved in second language education in the United
States and Asia for more than two decades. He currently serves as the curriculum
coordinator at Brigham Young University's English Language Center, Provo, Utah,
United States. Research interests include second language writing, pronunciation,
curriculum development, and teacher training.
Norman W. Evans is a faculty member in the Linguistics and English Language
Department and director of curriculum development at the English Language
Center at Brigham Young University, Provo, Utah, United States. His research
interests include writing in a second language, language teaching methods, and
curriculum development.
Paul F. Merrill is a professor of instructional psychology and technology at Brigham
Young University, Provo, Utah, United States. He received his doctorate from the
University of Texas at Austin. He is the principal author of the text Computers in
Education, published by Allyn and Bacon.
Richard R. Sudweeks is a professor in the Instructional Psychology and Technology
Department at Brigham Young University, Provo, Utah, United States. His research
interests focus on problems related to assessing the cognitive, behavioral, and
affective outcomes of instruction.
Diane Strong-Krause is an associate teaching professor of linguistics and English
language at Brigham Young University, Provo, Utah, United States. Her research
focuses on language assessment and its relationship to curricula.
Neil J. Anderson is a humanities professor of linguistics and English language at
Brigham Young University, Provo, Utah, United States. His research interests include
second language reading, language learner strategies, and English language
teaching leadership development. Professor Anderson served as president of the
global association Teachers of English to Speakers of Other Languages from 2001 to
2002.
REFERENCES
Ayoun, D. (2001). The role of negative and positive feedback in the second language
acquisition of passé composé and imparfait. Modern Language Journal, 85, 226-243.
Azevedo, R., & Bernard, R. M. (1995). A meta-analysis of the effects of feedback in
computer-based instruction. Journal of Educational Computing Research, 13, 109-125.
Bitchener, J. (2008). Evidence in support of written corrective feedback. Journal of
Second Language Writing, 17, 102-118.
Bitchener, J., & Knoch, U. (2008). The value of a focused approach to written
corrective feedback. Language Teaching Research, 12, 409-431.
Bitchener, J., Young, S., & Cameron, D. (2005). The effect of different types of
corrective feedback on ESL student writing. Journal of Second Language Writing, 14,
191-205.
Bond, T. G., & Fox, C. M. (2007). Applying the Rasch model: Fundamental measurement in
the human sciences (2nd ed.). Mahwah, NJ: Lawrence Erlbaum.
Cohen, J. (1988). Statistical power analysis for the behavioral sciences (2nd ed.). Hillsdale,
NJ: Lawrence Erlbaum.
Corder, S. P. (1981). Error analysis and interlanguage. London, England: Oxford
University Press.
DeKeyser, R. (2001). Automaticity and automatization. In P. Robinson (Ed.),
Cognition and second language instruction (pp. 125-151). Cambridge, England:
Cambridge University Press.
DeKeyser, R. (2007). Skill acquisition theory. In B. VanPatten & J. Williams (Eds.),
Theories in second language acquisition (pp. 97-113). Mahwah, NJ: Lawrence
Erlbaum.
Ellis, R., Sheen, Y., Murakami, M., & Takashima, H. (2008). The effects of focused
and unfocused written corrective feedback in an English as a foreign language
context. System, 36, 353-371.
Evans, N. W., Hartshorn, K. J., McCollum, R. M., & Wolfersberger, M. (In press).
Contextualizing corrective feedback in L2 writing pedagogy. Language Teaching
Research.
Ferris, D. R. (1999). The case for grammar correction in L2 writing classes: A
response to Truscott (1996). Journal of Second Language Writing, 8, 1-11.
Ferris, D. R. (2004). The "Grammar Correction" debate in L2 writing: Where are we,
and where do we go from here? (and what do we do in the meantime....?).
Journal of Second Language Writing, 13, 49-62.
Ferris, D. R. (2006). Does error feedback help student writers? New evidence on the
short- and long-term effects of written error correction. In K. Hyland & F. Hyland
(Eds.), Feedback in second language writing: Contexts and issues (pp. 81-104).
Cambridge, England: Cambridge University Press.
Grabe, W. (2001). Notes toward a theory of second language writing. In T. Silva &
P. K. Matsuda (Eds.), On second language writing (pp. 39-57). Mahwah, NJ:
Lawrence Erlbaum.
Gu, S., & Wang, T. (2008). The impact of negative feedback, noticing, and modified
output on EFL question development. Foreign Language Teaching and Research, 40,
270-278.
Guénette, D. (2007). Is feedback pedagogically correct? Research design issues in
studies of feedback on writing. Journal of Second Language Writing, 16, 40-53.
Guzzo, R. A., Jette, R. D., & Katzell, R. A. (1985). The effects of psychologically based
intervention programs on productivity: A meta-analysis. Personnel Psychology, 38,
275-291.
Hinkel, E. (2002). Second language writers' text: Linguistic and rhetorical features.
Mahwah, NJ: Lawrence Erlbaum.
Hinkel, E. (2004). Teaching academic ESL writing: Practical techniques in vocabulary and
grammar. Mahwah, NJ: Lawrence Erlbaum.
Hino, J. (2006). Linguistic information supplied by negative feedback: A study of its
contribution to the process of second language acquisition (Unpublished doctoral
dissertation). University of Pennsylvania, Philadelphia.
Huck, S. W. (2008). Reading statistics and research (5th ed.). Boston, MA: Pearson
Education.
Hunt, K. W. (1965). Grammatical structures written at three grade levels. Urbana, IL: The
National Council of Teachers of English.
Iwashita, N. (2003). Negative feedback and positive evidence in task-based
interaction: Differential effects on L2 development. Studies in Second Language
Acquisition, 25, 1-36.
Kepner, C. G. (1991). An experiment in the relationship of types of written feedback
to the development of second-language writing skills. Modern Language Journal,
75, 305-313.
Kirschner, P. (2002). Cognitive load theory: Implications of cognitive load theory on
the design of learning. Learning and Instruction, 12, 1-10.
Kluger, A. N., & DeNisi, A. (1996). The effects of feedback interventions on
performance: Historical review, a meta-analysis and a preliminary feedback
intervention theory. Psychological Bulletin, 119, 254-284.
Linacre, J. M. (1994). Many-faceted Rasch measurement. Chicago, IL: MESA Press.
Linacre, J. M. (2006). Facets Rasch measurement computer program (Version 3.6).
Chicago, IL: Winsteps.com.
Long, M., Inagaki, S., & Ortega, L. (1998). The role of implicit negative feedback in
SLA: Models and recasts in Japanese and Spanish. Modern Language Journal, 82,
357-371.
McDonough, K. (2005). Identifying the impact of negative feedback and learners'
responses on ESL question development. Studies in Second Language Acquisition,
27, 79-103.
McNamara, T. F. (1996). Measuring second language performance. London, England:
Longman.
Odlin, T. (1989). Language transfer: Cross-linguistic influence in language learning. New
York, NY: Cambridge University Press.
Ortega, L. (2003). Syntactic complexity measures and their relationship to L2
proficiency: A research synthesis of college-level L2 writing. Applied Linguistics, 24,
492-518.
Paas, F. G. W. C., Renkl, A., & Sweller, J. (2004). Cognitive load theory: Instructional
implications of the interaction between information structures and cognitive
architecture. Instructional Science, 32, 1-8.
Pawlak, M. (2005). The feasibility of integrating form and meaning in the language
classroom: A qualitative study of classroom discourse. Glottodidactica, 30-31, 283-294.
Pollitt, A., & Hutchinson, C. (1987). Calibrated graded assessment: Rasch partial
credit analysis of performance in writing. Language Testing, 4, 72-92.
Ranta, L., & Lyster, R. (2007). A cognitive approach to improving immersion
students' oral language abilities: The awareness-practice-feedback sequence. In
R. DeKeyser (Ed.), Practice in a second language: Perspectives from applied linguistics
and cognitive psychology (pp. 141-160). Cambridge, England: Cambridge
University Press.
Ringbom, H. (1987). The role of the first language in foreign language learning. Clevedon,
England: Multilingual Matters.
Russell Valezy, J., & Spada, N. (2006). The effectiveness of corrective feedback for
second language acquisition: A meta-analysis of the research. In J. Norris & L.
Ortega (Eds.), Synthesizing research on language learning and teaching (pp. 133-164).
Amsterdam, The Netherlands: John Benjamins.
Schumacker, R. E. (1999). Many-faceted Rasch analysis with crossed, nested, and
mixed designs. Journal of Outcome Measurement, 3, 323-338.
Shadish, W., Cook, T., & Campbell, D. (2002). Experimental and quasi-experimental
designs for generalized causal inferences. New York, NY: Houghton Mifflin.
Sheen, Y. (2007). The effect of focused written corrective feedback and language
aptitude on ESL learners' acquisition of articles. TESOL Quarterly, 41, 255-283.
Silva, T. (1993). Toward an understanding of the distinct nature of L2 writing: The
ESL research and its implications. TESOL Quarterly, 27, 657-675.
Skehan, P. (1998). A cognitive approach to language learning. Oxford, England: Oxford
University Press.
Truscott, J. (1996). The case against grammar correction in L2 writing classes.
Language Learning, 46, 327-369.
Truscott, J. (1999). The case for "the case for grammar correction in L2 writing
classes": A response to Ferris. Journal of Second Language Writing, 8, 111-122.
Truscott, J. (2004). Dialogue: Evidence and conjecture on the effects of correction: A
response to Chandler. Journal of Second Language Writing, 13, 337-343.
Truscott, J. (2007). The effect of error correction on learners' ability to write
accurately. Journal of Second Language Writing, 16, 255-272.
Truscott, J., & Hsu, A. Y. (2008). Error correction, revision, and learning. Journal of
Second Language Writing, 17, 292-305.
Wolfe-Quintero, K., Inagaki, S., & Kim, H. (1998). Second language development in
writing: Measures of fluency, accuracy, and complexity. Manoa, HI: University of
Hawaii Press.
Wright, B. D., & Linacre, J. M. (1994). Reasonable mean-square fit values. Rasch
Measurement Transactions, 8, 370.
APPENDIX A
Indirect Coding Symbols
D = Determiner                    S/PL = Singular/Plural
SV = Subject Verb Agreement       C/NC = Count/Noncount
VF = Verb Form                    ? = Meaning is not clear
ro = Run-on Sentence              AWK = Awkward Wording
inc = Incomplete Sentence         [hand-drawn transposition mark] = Word Order
VT = Verb Tense                   C = Capitalization
pp = Preposition                  p = Punctuation
SPG = Spelling                    [hand-drawn deletion mark] = Omit
WF = Word Form                    ^ = Something is missing
WC = Word Choice                  ¶ = New Paragraph
APPENDIX B
Sample Tally Sheet
(Tally sheet grid not recoverable from the extracted text.)
APPENDIX C
Rhetorical Writing Competence Rubric
Writing Rubric Adapted from the TOEFL iBT
Level    Description
The essay accomplishes the following:
    effectively addresses the topic and task
    is well organized and well developed, using clearly appropriate explanations,
    examples, support or details
    displays unity, progression, and coherence
The essay is marked by one or more of the following:
    addresses the topic and task using somewhat developed explanations, examples, or
    details
    displays unity, progression, and coherence, though connection of ideas may be
    obscured
The essay is seriously flawed by one or more of the following:
    serious disorganization or underdevelopment
    irrelevant specifics or questionable responsiveness to the task
    little or no detail
Directions to Raters: The purpose of this rubric is to measure the rhetorical competence of the writers
whose essays you will analyze. While it is understood that problems with linguistic accuracy may affect
your ability to understand an essay and follow its organization and development, strive to focus on those
features of rhetorical competence included in the rubric without concern for linguistic accuracy. Use the
benchmark essays carefully to guide your rating.
