6-Sato
doi:10.1017/S0272263119000159
Research Article
PRACTICE IS IMPORTANT BUT HOW ABOUT ITS QUALITY?
CONTEXTUALIZED PRACTICE IN THE CLASSROOM
Masatoshi Sato*
Universidad Andres Bello
Kim McDonough
Concordia University
Abstract
This study explored the impact of contextualized practice on second language (L2) learners’
production of wh-questions in the L2 classroom. It examined the quality of practice (correct vs.
incorrect production) and the contribution of declarative knowledge to proceduralization.
Thirty-four university-level English as a foreign language learners first completed a declarative knowledge
test. Then, they engaged in various communicative activities over five weeks. Their production of
wh-questions was coded for accuracy (absence of errors) and fluency (speech rate, mean length of
pauses, and repair phenomena). Improvement was measured as the difference between the first and
last practice sessions. The results showed that accuracy, speech rate, and pauses improved but with
distinct patterns. Regression models showed that declarative knowledge did not predict accuracy or
fluency; however, declarative knowledge helped the learners engage in targetlike behaviors at
the initial stage of proceduralization. Furthermore, whereas production of accurate wh-questions
predicted accuracy improvement, it had no impact on fluency.
This work was partially supported by grants awarded to the first author by the Fondo Nacional de Desarrollo
Científico y Tecnológico from the Ministry of Education of Chile (FONDECYT: 1181533) and PIA
(CIE160009) from the Chilean National Commission of Science and Technology (CONICYT) as well as
funding awarded to the second author from the Canada Research Chairs program (950-231218).
The experiment in this article earned an Open Materials badge for transparent practices. The materials
are available at https://www.iris-database.org/iris/app/home/detail?id=york%3a936167.
We would like to thank the teachers who generously supported our project and the research assistants who
helped with data collection and coding: Mayuri Kewlani, Estefanía Valencia, Camila Valenzuela, Mélanie
Vergara, and Paula Viveros.
*Correspondence concerning this article should be addressed to Masatoshi Sato, Department of English, Universidad
Andres Bello, Fernández Concha 700, Las Condes, Santiago, 7550000, Chile. Email: masatoshi.sato@unab.cl
Practice has been shown to positively impact second language (L2) speakers’ accuracy
during memory retrieval and their processing speed. Drawing on skill-learning theories
(Anderson, 1985; Schneider & Fisk, 1983), research in cognitive psychology has
typically examined error rate and reaction time during paired-associate tasks. Neuroimaging
evidence has shown that during skill development, learners increasingly become
reliant on procedural memory over declarative memory, resulting in more skilled
performance, that is, faster and more accurate performance (Eichenbaum, 2012; Ullman,
2001). What drives the change in memory retrieval is practice: through repeated
practice, learners increasingly develop memory routines based on the procedural system,
a process known as proceduralization. Ultimately, L2 learners may automatize L2
knowledge by restructuring the components of procedural memory. To that end, the vast
majority of L2 practice studies have examined the spacing effect, that is, the length of
intervals between practice sessions (e.g., Nakata, 2015; Rodgers, 2011; Schuetze, 2015;
Serrano, Stengers, & Housen, 2015; Suzuki & DeKeyser, 2017a; Toppino & Gerbier,
2014), with the findings revealing positive effects of practice on L2 learning overall.
However, previous L2 research faces several methodological issues. First, the practice
tasks in empirical studies have been largely mechanical and decontextualized. Given the
evidence related to skill specificity (Healy & Bourne, 1995) and transfer-appropriate
processing (Morris, Bransford, & Franks, 1977), practice conditions should approximate
the context where L2 learners will use the learned skills in the real world. Specifically, if
the instructional goal is to help learners develop L2 communicative skills, repeated
practice should elicit memory retrievals of those skills. Therefore, the current study
implemented contextualized practice activities as part of classroom instruction and
examined their effect on English as a foreign language (EFL) learners’ production of
wh-questions. Second, previous research has examined learners’ correct responses to
experimental stimuli only. However, during the learning processes, L2 learners may draw
upon their interlanguage systems and produce incorrect utterances. It is unclear whether
practicing inaccurate forms has any longer-term effects. Third, while skill-learning
theories claim that declarative knowledge supports the development of procedural
knowledge (DeKeyser, 2017), evidence for their association has been scarce. Therefore,
the current study considered the potential role of declarative knowledge, along with both
correct and incorrect production, as potential predictors of increased accuracy and
fluency.
Drawing on theories of skill learning (Anderson, 1985) and automaticity (Schneider &
Fisk, 1983) from cognitive psychology, skill acquisition theory states that L2 skill
development involves two types of memory systems. On the one hand, the declarative
memory system (knowledge that) is characterized by controlled processing that is slow,
serial, and effortful. Though often thought to hold rules of grammatical systems,
declarative representations may contain a broader range of factual knowledge, such as the
meaning of words, their phonological forms, and grammatical specifications
(Morgan-Short & Ullman, 2012). On the other hand, the procedural memory system subsumes
fast, parallel, and fairly effortless processes. The advantage of procedural processing is
that it frees up space in short-term memory so that an individual can handle other tasks
Downloaded from https://www.cambridge.org/core. Stockholm University Library, on 26 Apr 2019 at 08:21:43, subject to the Cambridge Core
terms of use, available at https://www.cambridge.org/core/terms. https://doi.org/10.1017/S0272263119000159
Similar findings have been reported in real languages as well. For instance, Bird
(2010) investigated the acquisition of an English morphosyntactic structure (i.e., tense/
aspect of verb forms). The learners engaged in practice sessions involving error correction
of ungrammatical forms on worksheets. Suzuki and DeKeyser (2017a) targeted the
grammatical accuracy of a Japanese morphological form (the te-form). Regardless of
the practice conditions (spaced or massed), learners showed improvement over four
practice sessions, with changes conforming to the power law of practice. With a
cross-sectional design involving three different proficiency groups, Rodgers (2011) compared
oral production during picture description tasks. In these tasks, learners completed
sentences by filling in a gap related to the target structure (the regular present tense of the
indicative in Italian). The analysis of error rates showed that the three groups differed
significantly, and Rodgers concluded that “learners become more automatic … as they
become more proficient in the language” (pp. 311–312).
With regard to the speed of cognitive processing, skill-learning research has
primarily examined reaction time when responding to stimuli or performing a task,
based on a premise that reaction time provides insight into “how the mind works, and
infer about the cognitive processes or mechanisms involved in language processing”
(Jiang, 2012, p. 2). Various studies reported practice effects as evidenced by a decline
in reaction time (see DeKeyser, 2017). However, the speed of cognitive processing
can be operationalized as oral fluency as well, as Anderson (2005) illustrated in
relation to foreign language learning: “the procedural, not the declarative, knowledge
governs the skilled [fluent] performance” (p. 282). Indeed, oral fluency is a more
relevant construct when considering classroom L2 learning, which is the focus of the
current study.
L2 fluency was conceptualized into three types by Segalowitz (2010): cognitive,
utterance, and perceived. Utterance fluency involves temporal aspects of L2 production
and cognitive fluency is arguably manifested in some of those indices. Perceived fluency
is based on listeners’ impressions of speeches. The development of utterance fluency can
be characterized by temporal measures, such as speed (e.g., speech rate), pauses (filled or
unfilled), and repair (e.g., repetitions) phenomena (see Derwing, 2017). Research
suggests that certain aspects of utterance fluency represent cognitive fluency, although
findings have been mixed. For instance, in Segalowitz and Freed’s (2004) study,
cognitive processing was investigated through lexical access (animacy judgment tests)
and the results were compared with the learners’ utterance fluency during oral proficiency
interviews. Among different oral fluency indices, filled pauses were correlated
with the reaction time scores, meaning that cognitive fluency was observed in pause
phenomena. However, in an investigation of cognitive versus utterance fluency, de Jong,
Steinel, Florijn, Schoonen, and Hulstijn (2013) found that speech rate (mean syllable
duration) was the best predictor of cognitive fluency (β = .50), while pause phenomena
(silent pause duration) explained only 5% of the total variance of cognitive fluency.
Several studies examined the effect of practice on utterance fluency. De Jong and
Perfetti (2011), for example, examined whether a fluency-focused oral activity (the 4/3/2
task) impacted the development of temporal aspects of spontaneous production. The
learners who repeated the same speech multiple times, as opposed to giving different
speeches, improved their fluency by reducing the length of pauses, increasing syllables
per minute, and lengthening stretches of speech. In Suzuki and DeKeyser (2017a),
reaction time included “from the onset of the prompt word to the end of the utterance” (p.
174), suggesting that the scores indicated how fast the learners spoke—a similar index to
speech rate. Having found improvement in the fluency measure, Suzuki and DeKeyser
claimed that “[a] speed measure is a more sensitive index of procedural knowledge
because it not only tests the accuracy of conjugation, but also how fast the morphological
knowledge can be deployed” (p. 182). In sum, research has shown that repeated practice
supports proceduralization (or automatization) both in accuracy and fluency (see also
Lim & Godfroid, 2015). However, the practice and testing materials in the studies
reviewed have been rather decontextualized (e.g., fill in the gap), leaving a question as to
whether practice effects extend to a more communicative skill.
QUALITY OF PRACTICE
learners with the target words in a story. The results showed that learners who were
given isolated training scored higher on the test in which individual words were
presented on a computer screen, while learners in the context-training group, who were
taught words in passages, outperformed the isolated-training group in the reading
passage test. Hence, evidence suggests that practice effects are specific in regard to
skills, tasks, and context, and if the goal of L2 instruction is to help learners develop
communicative skills, practice should be contextualized. (We will detail our operationalization
of contextualization in the “Methods” section.)
RQ1: Does contextualized practice facilitate L2 speakers’ production of accurate and fluent
wh-questions?
RQ2: What learner and practice factors predict changes in the accuracy or fluency of the L2
speakers’ wh-questions?
METHODS
PARTICIPANTS
TARGET STRUCTURE
The target structure of the intervention was wh-questions, which have been widely
studied in classroom-based SLA research to explore the developmental outcomes associated
with corrective feedback (Adams, 2007; Lightbown & Spada, 1990), task
complexity (Kim, 2012), and structural priming (McDonough & Chaikitmongkol, 2010),
to test the effectiveness of form-focused instruction (Spada & Lightbown, 1999; White,
Spada, Lightbown, & Ranta, 1991), and to explore learners’ awareness of L1-L2 differences
(Ammar, Lightbown, & Spada, 2010; Lightbown & Spada, 2000). In addition to
allowing for comparisons with previous research, the focus on wh-questions was also
motivated by pedagogical reasons, as questions complemented the objectives and
materials of the participants’ EFL integrated skills course.
The wh-question form under investigation consisted of a question word followed by an
obligatory auxiliary verb, subject, and lexical verb, which has been referred to as a
wh-aux 2nd question (Pienemann & Johnston, 1987; Pienemann, Johnston, & Brindley,
1988). This type of wh-question was selected because L2-English learners often alternate
between several variants: the targetlike form in which the obligatory auxiliary verb is
supplied (when was the baby born?), an interlanguage form without an auxiliary verb
(*when the baby born?), or an interlanguage form with a misplaced auxiliary verb
(*when the baby was born?). Even when learners produce an auxiliary verb in the correct
location, they may still make errors involving subject/auxiliary verb agreement (e.g.,
*where are the family going?) or auxiliary/lexical verb agreement (e.g., *what goal does
he has?). In the current study, only wh-questions that require auxiliary verbs were
considered because a missing auxiliary verb is not ungrammatical for some question
types, such as wh-questions with the copula (what color is the dog?) or questions in
which the wh-word functions as a subject (which boy likes chocolate?).
MATERIALS
across five practice sessions (three tasks per session). A complete list of the task types,
versions, and the distribution across the practice activities is provided in Table 1.
Although there was variation in the communicative goal of the tasks (e.g., deciding if the
instructor was telling the truth or lying vs. guessing the identity of a Chilean celebrity),
they were all one-way tasks in which the instructor held the information and the students’
role was to ask questions to obtain that information. All activities were led by the teacher
in the classroom where the whole class interacted with the teacher concurrently, so as to
approximate the experimental setting to real classroom interaction.
All tasks were presented through PowerPoint slides that showed images along with
prompts in the form of wh-words (what, when, where, how, why, how much/many, what
+ noun, and which + noun). The number of wh-prompts varied from four to eight across
the six task types, but each practice session presented a total of 20 wh-prompts with the
same distribution of specific wh-words. The wh-prompts were timed to appear on the
images one at a time, giving students 10 seconds to produce a question using the
wh-prompt. An audio signal (i.e., a beep) was used to indicate that time had expired before
the next wh-prompt appeared (see Supplementary Materials for sample tasks). The
instructor provided answers after each wh-prompt or at the end of each task. Depending
on the task, the teachers were provided with the answers. For instance, in the Biography
task, where the learners’ goal was to guess a Chilean celebrity, the teacher was given
relevant information about the celebrity. However, in the Interview task, where the
learners asked questions to know more about the teacher’s personal life, the teacher
answered the learners’ questions spontaneously.
DESIGN
PROCEDURE
The activities were carried out in the participants’ regularly scheduled classes over a five-
week period (see Table 2). In the first week, the participants completed the consent and
background information forms (15 min), along with the declarative knowledge test (20
min). To familiarize the students with the activity format and audio-recording equipment,
a task with communicative activities similar to the intervention activities was implemented
in week two (20 min). However, the familiarization task targeted yes/no
questions instead of wh-questions. In weeks three to seven (i.e., Time 1 to Time 5),
learners engaged in the contextualized practice activities. Each session took approximately
15 minutes to complete, and was administered at the beginning or end of a class
period based on its fit with the lesson’s other instructional activities. All students held
a digital audio-recorder to individually record their language production during the
practice tasks. It is important to stress that the teachers did not provide any feedback on
the learners’ erroneous utterances. While skill acquisition theory underscores the importance
of feedback to avoid entrenching inaccurate knowledge structures, feedback
was controlled in the current experiment so as to tease apart the effect of contextualized
practice.
Declarative knowledge
The tests were scored by the researchers by giving one point to each correct item, with no
half points for partially correct answers. For the fill-in-the-blank items, correct items
were those for which a participant had supplied a question word, auxiliary verb, or lexical
verb that resulted in a grammatically accurate question. For the error correction items,
correct items had a change (insertion or deletion) that repaired the error and resulted in
a grammatically accurate question. Finally, the scrambled question items were considered
correct if the question contained all the words provided in a grammatically
accurate word order without any additions or deletions. The participants’ scores for the
24 wh-aux 2nd question items were summed and used in the subsequent analysis, but
their scores for the nine distracter questions (wh-subject and wh-copula) were not analyzed
further.
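As a rough illustration of the scoring rule described above, the following Python sketch gives one point per fully correct target item and ignores the distracter items. The item labels (T01, D01, and so on) and the function name are hypothetical and are not taken from the study's materials.

```python
from typing import Dict, Set


def score_declarative_test(responses: Dict[str, bool], target_items: Set[str]) -> int:
    """Sum one point per correct target item; no partial credit, distracters ignored."""
    return sum(
        1 for item, correct in responses.items() if item in target_items and correct
    )


# Hypothetical item ids: 24 wh-aux 2nd targets (T01..T24) and 9 distracters (D01..D09).
targets = {f"T{i:02d}" for i in range(1, 25)}

# Made-up responses: every even-numbered target item is correct (12 items).
responses = {f"T{i:02d}": (i % 2 == 0) for i in range(1, 25)}
# Distracter items are answered correctly but must not count toward the score.
responses.update({f"D{i:02d}": True for i in range(1, 10)})

total = score_declarative_test(responses, targets)
print(total)
```

Because only the 24 target items are summed, the maximum possible score in this sketch is 24.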
Accuracy
The audio-recordings of the practice sessions were transcribed and verified by research
assistants. The transcripts were analyzed by research assistants for the occurrence of
wh-questions, which were classified as fragments, wh-subject, wh-copula, and wh-aux 2nd.
Definitions and examples of each type are provided in Table 3.
After classifying all the participants’ questions, their wh-aux 2nd questions were
further analyzed for accuracy. To be classified as accurate, a wh-aux 2nd question
required the correct (a) question word, (b) word order (i.e., position of auxiliary verb), (c)
subject/auxiliary verb number agreement, and (d) auxiliary verb/main verb agreement in
tense and aspect. Errors involving forms unrelated to question formation, such as article
use, plurals, or lexical choice (e.g., blood banker as opposed to blood bank), were not
considered in the accuracy coding. Proportional scores were obtained by calculating the
percentages of incorrect utterances out of total utterances in each session. A subset of the
data (20%) was coded by a research assistant for interrater reliability, which was
calculated as Cohen’s kappa for the classification of wh-questions (.85) and intraclass
correlation coefficients for the accurate wh-aux 2nd questions (.94).
TABLE 3. Question types: definitions and examples
Fragments: Wh-question word with one or two lexical items
    What subject?
    What kind?
Wh-subject: Wh-questions in which the wh-word acts as the subject, and auxiliary verbs are not required
    How many people went with pets?
    What happened on your last birthday?
    *What make Alex feel hungry?
    *What inspire him to study English?
Wh-copula: Wh-questions with copular be
    What is your goal in life?
    What is his motivation?
    *Why does she your favorite actress?
    *What is inspirational people for you?
Wh-aux 2nd: Wh-questions that require an auxiliary verb located between the wh-word(s) and the subject
    Which sport does he like?
    How often does she practice this sport?
    Where did she donate blood?
    *Where did you took the cane?
    *When she go to the university?
    *Where the plane is going?
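The proportional error rate and the Cohen's kappa check described in the accuracy coding can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names and the toy ratings are hypothetical, and the sketch assumes that the error rate is computed over wh-aux 2nd questions only.

```python
from collections import Counter
from typing import Sequence


def error_rate(n_incorrect: int, n_total: int) -> float:
    """Percentage of incorrect questions out of all questions in one session."""
    if n_total == 0:
        raise ValueError("no questions produced in this session")
    return 100.0 * n_incorrect / n_total


def cohens_kappa(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Chance-corrected agreement between two raters on categorical codes."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Agreement expected by chance, from each rater's marginal proportions.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1.0 - expected)


# Made-up codes for six questions (question types as in Table 3).
rater_a = ["aux", "aux", "copula", "subject", "fragment", "aux"]
rater_b = ["aux", "aux", "copula", "subject", "aux", "aux"]

kappa = cohens_kappa(rater_a, rater_b)
rate = error_rate(n_incorrect=6, n_total=15)
print(round(kappa, 3), rate)
```

In the toy data the raters disagree on one of six items, so raw agreement is 5/6, while kappa is lower because part of that agreement is expected by chance.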
Fluency
Following previous practice research, we focused on the fluency of correct productions of
wh-aux 2nd questions. The resulting number of questions analyzed for fluency was 1,133
out of 1,973 (57.43%) (details are presented in the “Results” section). First, fluency of
individual questions was analyzed using Audacity. Then, the average scores of the
fluency indices, based on the number of questions that each learner produced, were
calculated for each learner at each practice session. Hence, the scores presented in the
“Results” section indicate fluency per question.
Three fluency indices were used: speech rate (syllables per second), mean length of
pauses (average length of pauses), and repair phenomena (summed frequency of false
starts, repetitions, and self-corrections). In terms of the cutoff length of pauses, de Jong and
Bosker (2013) compared different cutoff lengths (between 100 ms and 1,000 ms) in relation
to vocabulary knowledge and perceived fluency. The results showed that a 250–300 ms
cutoff was the “optimal threshold for measuring the number of pauses (per total or per
speaking time) with respect to studies that aim to investigate L2 proficiency” (p. 20).
Following this result, silences shorter than 250 ms, so-called micropauses (Riggenbach,
1991), were not considered pauses in the current study. Interrater reliability was
calculated on a subset of the data (20%), which yielded acceptable agreement rates for all
indices (speech rate: r = .78; mean length of pauses: r = .75; repairs: r = .91).
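The three fluency indices and the 250 ms pause cutoff can be sketched in Python as below. The data structure and function names are hypothetical. The sketch also makes two assumptions that the text does not settle: speech rate is computed over the total duration of the question (including pauses), and a question without any qualifying pause receives a mean pause length of zero.

```python
from dataclasses import dataclass
from typing import List

PAUSE_CUTOFF_S = 0.250  # silences shorter than 250 ms are treated as micropauses


@dataclass
class Question:
    syllables: int
    duration_s: float          # total duration of the question in seconds (assumed)
    silences_s: List[float]    # duration of each silence in seconds
    false_starts: int = 0
    repetitions: int = 0
    self_corrections: int = 0


def speech_rate(q: Question) -> float:
    """Syllables per second."""
    return q.syllables / q.duration_s


def mean_length_of_pauses(q: Question) -> float:
    """Average length of silences at or above the cutoff; 0.0 if none qualify."""
    pauses = [s for s in q.silences_s if s >= PAUSE_CUTOFF_S]
    return sum(pauses) / len(pauses) if pauses else 0.0


def repairs(q: Question) -> int:
    """Summed frequency of false starts, repetitions, and self-corrections."""
    return q.false_starts + q.repetitions + q.self_corrections


def session_mean(values: List[float]) -> float:
    """Average an index over all questions a learner produced in one session."""
    return sum(values) / len(values)


# Made-up question: 8 syllables in 4 seconds, with one micropause and two pauses.
q = Question(syllables=8, duration_s=4.0, silences_s=[0.1, 0.3, 0.5], repetitions=1)
print(speech_rate(q), mean_length_of_pauses(q), repairs(q))
```

Averaging each index over a learner's questions in a session, as `session_mean` does, corresponds to the per-question scores described in this section.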
Statistical analyses
In the dataset, only 19 learners participated in all five sessions. As a result, 25 cases were
missing, among a potential of 170 cases (34 participants tested five times). Accordingly,
we conducted missing values analyses on the dataset, using the expectation-
maximization algorithm (see Mbogning, Bleakley, & Lavielle, 2015).2
To understand the overall patterns of accuracy and fluency changes over the five
sessions, descriptive statistics were first analyzed and visually inspected. Then, the data
were submitted to two types of inferential statistics. First, repeated measures ANOVAs
(RM ANOVAs) were chosen for examining the impact of contextualized practice on the
participants’ accuracy and fluency. The assumption of normality was met for error rate at
five sessions (all above p = .20 according to Kolmogorov-Smirnov) as well as all fluency
indices at five sessions (all above p = .06 according to Kolmogorov-Smirnov). However,
Mauchly’s tests showed that the dataset violated the assumption of sphericity (error rate:
p = .005; speech rate: p < .000; mean length of pauses: p = .004; repair phenomena: p < .000).
In the “Results” section, therefore, we report the results with Huynh-Feldt correction
(sphericity not assumed). For the post-hoc pairwise comparisons, LSD was
chosen given that some of the comparisons were irrelevant to the objective of the study
(e.g., Time 2 vs. Time 5). The alpha level for all tests of significance was set at .05.
Second, to answer the second research question pertaining to the predictors of improved
accuracy or fluency, hierarchical multiple regressions were chosen. In total, four
models were tested for the four indices of accuracy and fluency. In all models, the two
predictor variables were the frequencies of correct and incorrect productions. The
expectation-maximization algorithm was not applied to the raw frequencies data,
meaning that the scores reflected actual production frequencies. The frequencies from
Times 2, 3, and 4 were summed; Times 1 and 5 were not included so as to avoid violating
the independence of observations for the inferential statistics. To explore the contribution
of declarative knowledge to proceduralization, the scores of the declarative knowledge
test were entered in the first block (Step 1: see Petrocelli, 2003). In the second block (Step
2), the frequencies of correct and incorrect productions were entered as the predictors. In
so doing, we (a) tested the predictive value of declarative knowledge for proceduralization
in the first models, and (b) controlled for the learners’ preexisting declarative
knowledge and teased apart the effect of the quality of practice on the changes in
accuracy and fluency over time in the second models.
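The two-step logic described above can be sketched in plain Python: fit the outcome on declarative knowledge alone (Step 1), add the summed frequencies of correct and incorrect productions (Step 2), and compare the two R² values. This is only an illustrative sketch with made-up numbers for six hypothetical learners; the function names are assumptions, and the study itself would have used a statistics package.

```python
from typing import List, Sequence


def sum_times_2_to_4(freqs: Sequence[int]) -> int:
    """Sum production frequencies from Times 2, 3, and 4 (Times 1 and 5 excluded)."""
    return sum(freqs[1:4])


def _solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def r_squared(predictors: Sequence[Sequence[float]], y: Sequence[float]) -> float:
    """R^2 of an ordinary least squares fit of y on the predictors plus an intercept."""
    n = len(y)
    cols = [[1.0] * n] + [list(p) for p in predictors]
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    beta = _solve(xtx, xty)
    fitted = [sum(b * c[i] for b, c in zip(beta, cols)) for i in range(n)]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Made-up data for six hypothetical learners.
declarative = [10, 14, 18, 12, 20, 16]   # declarative knowledge test scores
correct = [5, 12, 30, 8, 25, 20]         # correct productions, Times 2-4 summed
incorrect = [30, 22, 10, 28, 15, 12]     # incorrect productions, Times 2-4 summed
time1 = [70, 65, 50, 72, 55, 60]         # error rate (%) at Time 1
time5 = [66, 55, 25, 68, 35, 42]         # error rate (%) at Time 5
change = [t5 - t1 for t1, t5 in zip(time1, time5)]  # Time 5 minus Time 1

r2_step1 = r_squared([declarative], change)
r2_step2 = r_squared([declarative, correct, incorrect], change)
delta_r2 = r2_step2 - r2_step1
print(round(r2_step1, 3), round(r2_step2, 3), round(delta_r2, 3))
```

The change in R² between the two steps is the share of variance attributable to the practice predictors once declarative knowledge has been accounted for, which is the quantity the hierarchical design is meant to isolate.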
The assumptions of hierarchical multiple regressions were met (see Osborne &
Waters, 2002). First, the sample size requirement based on two predictors was met
according to Pedhazur (1982). Second, all the outcome variables were normally distributed:
error rate (Kolmogorov-Smirnov, p = .08), speech rate (Kolmogorov-Smirnov,
p = .20), mean length of pauses (Kolmogorov-Smirnov, p = .09), and repair phenomena
(Kolmogorov-Smirnov, p = .14). Third, multicollinearity was not observed in the
dataset: the correlation coefficient between the two predictors was .36 (see Supplementary
Materials for the correlation matrix). Also, the variance inflation factors in all
models (VIF range = .67–1.48) supported the use of regression analyses (Heiberger &
Holland, 2004). Finally, no heteroscedasticity was observed in the scatterplots.
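To illustrate why a correlation of .36 between two predictors suggests little multicollinearity, the sketch below computes the variance inflation factor for the simplified case of exactly two predictors, where VIF = 1 / (1 − r²). The function name is hypothetical, and the value is not expected to reproduce the VIFs reported above, since the study's second-step models also included the declarative knowledge scores.

```python
def vif_two_predictors(r: float) -> float:
    """VIF for either of two predictors whose Pearson correlation is r."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return 1.0 / (1.0 - r * r)


# Correlation between correct and incorrect production frequencies, as reported.
vif = vif_two_predictors(0.36)
print(round(vif, 2))
```

A value close to 1 means that the standard errors of the coefficients are barely inflated by the overlap between the two predictors.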
RESULTS
DATA OVERVIEW
Over the five practice sessions, the participants produced 2,851 questions in total. As can
be seen in Table 4, learners produced comparable numbers of questions in each session,
as shown by the individual averages ranging from 18.90 to 23.21. Also, judging from the
standard deviations, ranging from 6.24 to 11.28, it can be said that all learners produced
questions in every session to a certain degree. The minimal frequencies in each session
were: Time 1 = 18; Time 2 = 17; Time 3 = 15; Time 4 = 16; and Time 5 = 15.
Across the data collection points, the learners produced more wh-aux-2nd questions,
the target structure of the current study, than wh-subject and wh-copula
questions. As the intervention activities intended, the production of wh-subject
questions was the lowest, amounting to 3% (84/2,851) of the total questions (details can
be found in Supplementary Materials). Notably, the learners produced wh-aux-2nd
questions consistently over the five sessions (range = 12.73–16.26), with much less
variation than wh-subject and wh-copula questions.
In the regression analyses, the cumulative frequencies of wh-aux-2nd questions from
Times 2, 3, and 4 were entered as the predictor variables. In total, the learners produced
1,216 wh-aux-2nd questions. On average, they produced 21.62 (SD = 14.58) correct
questions and 23.82 (SD = 16.77) incorrect questions. Both scores were normally
distributed according to Kolmogorov-Smirnov (correct: p = .20; incorrect: p = .19).
TABLE 4. Descriptive statistics of question types
Time 1 Time 2 Time 3 Time 4 Time 5 Total
Note. f = the frequency of produced questions; wh-S = wh-subject questions; wh-C = wh-copula questions; wh-A = wh-auxiliary 2nd questions. The means represent the number
of questions produced by each learner.
Accuracy
The RM ANOVA of error rate showed a significant main effect: F(3.59, 118.40) =
4.863, p = .002, ηp² = .128. Table 5 reports the descriptive statistics of each session.3 The
pairwise comparisons for each interval revealed that the error rates were significantly
reduced from Time 1 to Time 2 (p = .002, d = .32, 95% CI [.54, 15.26]) and from Time 4
to Time 5 (p = .025, d = .32, 95% CI [.86, 12.07]). In addition, the comparison between
Time 1 and Time 5 reached significance (p < .000, d = .72, 95% CI [7.33, 22.68]). The
changes in the other intervals were not significant: Time 2–Time 3 (p = .994, d < .00,
95% CI [–7.16, 7.22]); Time 3–Time 4 (p = .832, d = .02, 95% CI [–5.25, 6.49]). Figure
1 depicts the changes of error rate over time.
Fluency
First, the RM ANOVA of speech rate detected a significant main effect: F(1.93, 63.71) =
18.108, p < .000, ηp² = .354. The pairwise comparisons between each interval showed
that the speech rates significantly increased from Time 1 to Time 4 at each interval: Time
1–Time 2 (p = .048, d = .26, 95% CI [–.316, –.001]); Time 2–Time 3 (p = .025, d = .30,
95% CI [–.394, –.028]); and Time 3–Time 4 (p < .000, d = .33, 95% CI [–.311, –.132]).
The difference between Time 4 and Time 5 did not reach significance (p = .067, d = .40,
95% CI [–.019, .523]). Furthermore, the comparison between Times 1 and 5 showed
a significant increase (p < .000, d = 1.53, 95% CI [–1.117, –.570]). Figure 2 depicts the
changes in speech rate over time.
The RM ANOVA of mean length of pauses also yielded a significant main effect:
F(3.34, 110.27) = 18.108, p = .005, ηp² = .114. The analyses of the intervals showed that
the mean length of pauses significantly decreased from Time 4 to Time 5 (p = .001, d =
.59, 95% CI [.047, .160]). The differences in the other intervals were not significant:
Time 1–Time 2 (p = .932, d < .00, 95% CI [–.098, .106]); Time 2–Time 3 (p = .204, d =
.16, 95% CI [–.023, .103]); or Time 3–Time 4 (p = .750, d < .00, 95% CI [–.064, .046]).
TABLE 5. Changes in error rates and fluency indices over five sessions
Time 1 Time 2 Time 3 Time 4 Time 5
M M M M M
(SD) (SD) (SD) (SD) (SD)
Note. MLP = mean length of pauses. The fluency indices (speech rate, MLP, and repairs) show scores per
question.
However, the comparison between Time 1 and Time 5 reached significance: p = .005,
d = .66, 95% CI [.045, .233].⁴ Figure 3 visually describes the changes in mean length of
pauses over time.
FIGURE 3. Mean lengths of pauses and their standard deviations at five sessions.
Finally, the RM ANOVA of repair phenomena did not show a significant effect: F
(3.20, 105.72) = .351, p = .801, ηp² = .011. Figure 4 depicts the changes in the number
of repair phenomena over time.
Accuracy
The first regression model for accuracy, where the preexisting declarative knowledge
was the predictor, yielded a nonsignificant result: F(1, 32) = .232, p = .634, R² = .007.
However, the second model, which tested the effect of contextualized practice over and
above learners’ preexisting declarative knowledge, was significant: F(2, 30) = 4.558,
p = .019, accounting for 24% (R²) of the total variance in the change over time (ΔR² =
.231). As shown in Table 6, the grammar test scores did not predict the practice effect in
the second model either (p = .299; β = –.195). However, both correct (p = .007) and
incorrect (p = .045) productions were significant predictors. The negative coefficient
for correct production indicates that the more correct questions a learner produced, the more accurate
his or her production became over time. More precisely, the standardized beta coefficient
(β) indicates that for every one-SD increase in correct production, a decrease of .56 SD units
can be expected in the change of error rate. In contrast, production of incorrect
questions showed positive predictive power (β = .39), meaning that the more incorrect
questions a learner produced, the less accurate she or he became by the end of the
intervention.
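To make the logic of the two-step (hierarchical) regression concrete, the following Python sketch computes the F statistic for the variance added by a second block of predictors. It assumes that the reported F(2, 30) is the F-change test for the two added predictors (correct and incorrect productions), which the text does not state explicitly. The function name is illustrative, and the inputs are the rounded R² values reported above, so the result can only approximate the published statistic.

```python
def f_change(r2_reduced: float, r2_full: float, n_added: int, df_residual: int) -> float:
    """F statistic for the increase in R^2 when n_added predictors enter the model.

    df_residual is the residual degrees of freedom of the full model.
    """
    explained_per_predictor = (r2_full - r2_reduced) / n_added
    unexplained_per_df = (1.0 - r2_full) / df_residual
    return explained_per_predictor / unexplained_per_df


# Rounded values from the text: R^2 = .007 for Model 1, Delta R^2 = .231 for Model 2.
r2_model1 = 0.007
r2_model2 = r2_model1 + 0.231

f_value = f_change(r2_model1, r2_model2, n_added=2, df_residual=30)
print(round(f_value, 2))  # close to the reported 4.558; inputs are rounded
```

Under this assumption, the test asks whether the practice variables explain variance in accuracy change beyond what the grammar test scores already explain.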
Fluency
The difference scores (i.e., Time 5 minus Time 1) of the following fluency indices were
entered as the outcome variables in each model: (a) speech rate, (b) mean length of
pauses, and (c) repair phenomena (see Supplementary Materials for the correlation
matrix).
The first model for speech rate, where the preexisting declarative knowledge was the
predictor, yielded a nonsignificant result: F(1, 32) = .472, p = .497, R² = .015. The
second model, after entering correct and incorrect productions, was not significant either:
F(2, 30) = 1.264, p = .304, accounting for 11% (R²) of the total variance in the change
over time (ΔR² = .098). Table 7 shows the details of the second model.
Neither of the two models for mean length of pauses was significant; the first model: F
(1, 32) = .730, p = .399, R² = .022; and the second model: F(2, 30) = 0.293, p = .830,
with a minimal change from the first model (ΔR² = .006).
Similar to the results of the other fluency indices, neither of the two models for repair
phenomena was significant; the first model: F(1, 32) = 0.317, p = .577, R² = .010; and
the second model: F(2, 30) = 0.276, p = .842, accounting for 3% (R²) of the total
variance in the change over time (ΔR² = .017).
The results of the grammar test showed that the learners had not fully developed the
declarative knowledge of the target structure (M% = 59). With this amount of declarative
knowledge, the learners engaged in five sessions of contextualized practice. The RM
ANOVAs showed that error rates of oral production incrementally decreased over time,
especially between Time 1 and Time 2 and between Time 4 and Time 5. The difference
between Time 1 and Time 5 was also significant. Similar patterns were observed in the
changes of speech rate: incremental increases over time and a significant difference
between Time 1 and Time 5. Although the decrease of mean length of pauses was
gradual, the comparison between Time 1 and Time 5 reached significance. However,
repair phenomena were not affected by the contextualized practice. In terms of predictors
of the practice effect, preexisting grammar knowledge did not predict changes in either
accuracy or fluency. The second model for accuracy showed that correct productions
predicted the decrease of error rate. In contrast, the production of incorrect questions
negatively affected the change in accuracy. Neither correct nor incorrect productions
predicted changes in any indices of fluency.
DISCUSSION
CONTEXTUALIZED PRACTICE AND PROCEDURALIZATION
The first research question asked about the impact of contextualized practice oper-
ationalized as L2 production during communicative activities in the classroom. The
results of error rate showed an initial significant decline (Time 1–Time 2) followed by
gradual changes (Time 2–Time 4). This pattern of changes conforms to the expected
learning curves resembling the power function of practice. By engaging in contextu-
alized practice, the learners were able to become more accurate in their spontaneous
production over time; in other words, the practice may have helped develop the pro-
cedural memory of English wh-questions. In the current study, feedback was not
provided during the practice sessions (cf. Suzuki & DeKeyser, 2017a). It may be the case
that during the proceduralization, learners became more reliant on procedural memory
that, in turn, freed up their short-term memory for accessing their declarative knowledge.
Differently put, the practice did not aid the learners to learn a novel structure; rather, it
helped them develop procedural representations of a target structure for which the
learners already possessed declarative knowledge. In this situation, practice alone was
sufficiently powerful to facilitate proceduralization.
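The shape of the power function of practice referred to above can be illustrated with a short Python sketch. The starting value and the learning rate below are hypothetical and were not fitted to the data of the current study; the sketch only shows why a power function implies a large early gain followed by progressively smaller gains.

```python
def power_law_of_practice(session: int, start: float, rate: float) -> float:
    """Predicted error rate at a given session under a power function.

    start is the error rate at session 1; rate controls how quickly it declines.
    Both parameters are hypothetical, chosen only for illustration.
    """
    return start * session ** (-rate)


# Hypothetical learner: 50% errors at the first session, learning rate 0.5.
predicted = [power_law_of_practice(s, start=50.0, rate=0.5) for s in range(1, 6)]

# Gain between each pair of adjacent sessions (positive = fewer errors).
gains = [predicted[i] - predicted[i + 1] for i in range(len(predicted) - 1)]

for session, value in enumerate(predicted, start=1):
    print(f"Session {session}: {value:.1f}% errors")
```

In such a curve every gain is smaller than the one before it, which is the pattern against which the interval-by-interval changes in error rate can be compared.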
One may argue that the initial decline was due to the testing effect. However, we note
that not only did the learners participate in a training session with the identical activity
but also the testing effect does not account for the significant difference between Time 4
and Time 5. We speculate that the significant change between Time 4 and Time 5 is an
indication of partial proceduralization (DeKeyser, 2010). As can be seen in the error rate
in Time 5 (34.33%), there was much more developmental space left at the end of the
intervention. Accordingly, we believe that the learners were not yet at the automatization
stage at Time 5 (see Hulstijn, Van Gelderen, & Schoonen, 2009; Segalowitz &
Segalowitz, 1993). This interpretation suggests a possibility that the changes observed in
the current study may represent faster access to the declarative system rather than
deployment of the relevant knowledge in the procedural system.
The current study focused on three fluency indices: speech rate, mean length of pauses,
and repair phenomena. The result of speech rate differed from that of accuracy in that,
first, there was no sharp initial change from Time 1 to Time 2 (d = .26), and second, the learners
consistently became faster over the five sessions. The mean length of pauses also decreased,
but significantly only from Time 4 to Time 5 and when Time 1 and Time 5 were compared. While the power function was
not observed in those indices, the overall changes support previous fluency studies. In de
Jong and Perfetti (2011), for example, the group who repeated the same speech involving
the same lexical items exhibited more signs of proceduralization than the group that
recited different speeches. Those learners increased speech rate (phonation/time ratio)
and decreased mean length of pauses. The current study showed that the learners who
engaged in repeated practice of the same grammatical structure (wh-questions) but with
different lexical items became faster with shorter pauses during communicative
interaction.
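For readers who wish to compute comparable fluency indices, the Python sketch below derives speech rate, articulation rate, and mean length of pauses from a timed utterance. It assumes that speech rate is syllables per second of total time (pauses included) and follows the definition of articulation rate given in the notes (syllables per second after removing pauses). The data class, the function names, and the example utterance are hypothetical, and the pause threshold used to identify silent pauses is left to the analyst.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Utterance:
    syllables: int               # number of syllables produced
    total_seconds: float         # duration from onset to offset, pauses included
    pause_seconds: List[float]   # duration of each silent pause


def speech_rate(u: Utterance) -> float:
    """Syllables per second of total time (assumed definition)."""
    return u.syllables / u.total_seconds


def articulation_rate(u: Utterance) -> float:
    """Syllables per second after removing pause time."""
    return u.syllables / (u.total_seconds - sum(u.pause_seconds))


def mean_length_of_pauses(u: Utterance) -> float:
    """Average pause duration; 0.0 when the utterance contains no pauses."""
    if not u.pause_seconds:
        return 0.0
    return sum(u.pause_seconds) / len(u.pause_seconds)


# Hypothetical wh-question: 12 syllables in 4 seconds with two half-second pauses.
example = Utterance(syllables=12, total_seconds=4.0, pause_seconds=[0.5, 0.5])
print(speech_rate(example), articulation_rate(example), mean_length_of_pauses(example))
```

Because speech rate includes pause time and articulation rate does not, the two indices can diverge when pausing behavior changes while articulation speed stays constant.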
The differential findings of the three fluency indices add to the discussion pertaining to
cognitive versus utterance fluency. If we consider that the learners were in the process of
proceduralization of English wh-questions, only speech rate and pause phenomena were
susceptible to the effect of contextualized practice. Also, given the consistent increase in
speech rate, it can be argued that speech rate best represents the change in the underlying
cognitive mechanism (de Jong et al., 2013). However, repair phenomena were largely
unrelated to the other measurements. Freed’s (1995) study, which compared the fluency
of L2 learners who studied at home and those who studied abroad, showed that the study
abroad learners produced faster speech; however, those learners produced more repair
markers, such as false starts, repetitions, and self-corrections. Freed (1995) concluded
that the study-abroad learners “monitor their speech and self-correct along the way” (p.
141) while expressing complex structures. The results of the current study seem to
support this claim in that although the learners became more accurate and faster, their
repair attempts were not related to those changes.
In sum, contextualized practice yielded positive changes in both accuracy and fluency,
except for repair phenomena. The effect sizes of the comparisons between Time 1 and
Time 5 of error rate (d = .72), speech rate (d = 1.53), and mean length of pauses (d =
.66) were all medium to large (Plonsky & Oswald, 2014). Regardless of the exact
knowledge structure that the learners developed, it is encouraging that their commu-
nicative production became more accurate and faster, and their access to relevant
memory systems became more efficient and effortless. From a practical perspective, such
changes may be an L2 learning goal for both learners and teachers (DeKeyser, 2017).
The second research question was related to learner (preexisting declarative knowledge)
and practice (correct vs. incorrect productions) factors as potential predictors of the effect
of contextualized practice. The results support skill-learning theories in that (a) de-
clarative knowledge assists proceduralization at least at the initial stage of the process
and (b) repeated practice has a robust effect in allowing learners to increasingly rely on
procedural memory over time. Skill-learning theories, including those in cognitive
psychology (Schneider & Fisk, 1983), neuroimaging (Ullman, 2001), and psycholin-
guistic (DeKeyser, 1997) research, explain that declarative and procedural systems are
developed concurrently while the shifts in reliance on the two systems occur as a person
becomes more accurate and faster during a given task. It was predicted that the de-
velopment of skilled performance may be supported by declarative knowledge to de-
velop and/or restructure the representations in the procedural system.
Interestingly, the scores of the declarative knowledge test, administered prior to
engaging in contextualized practice, did not predict the extent of the practice effect on
accuracy or fluency changes. This result indicates that having declarative knowledge of
a grammatical structure may not be related to the development of the procedural system
of that structure when practice is considered as the cause of the changes. Accordingly, it
could be said that contextualized practice alone facilitates a positive change in accuracy,
on the one hand. On the other hand, the result seems to challenge skill acquisition theory
in that learners may not need an explicit understanding of a grammatical structure to
benefit from contextualized practice. However, in the current study, all learners pos-
sessed some declarative knowledge of the target structure. Hence, it is premature to argue
that practice alone is sufficient to develop procedural memory of a grammatical structure.
What the results suggest is, instead, that the amount of declarative knowledge was not
related to the extent to which each learner benefited from practice.
To further examine the contribution of declarative knowledge to proceduralization, we
ran additional correlational analyses (Pearson, two-tailed) between the grammar test
scores and error rate in Time 1, which yielded a significant negative correlation (r =
–.36, p = .036). At Time 5, the relationship became nonsignificant (r = –.16, p = .348).
This means that before the intervention, the more declarative knowledge a learner
possessed, the more accurate his or her oral production was. However, as the learners
engaged in extensive practice, the initial declarative knowledge became less relevant to
their oral production. It may be the case that knowing grammatical rules is helpful for
engaging in the targetlike behavior at an earlier stage of proceduralization, yet practice
effects weaken such a relationship. Similar patterns were observed in speech rate (Time 1:
r = .46, p = .035; Time 5: r = –.15, p = .403) and mean length of pauses (Time 1: r =
–.44, p = .010; Time 5: r = –.15, p = .398). Those results suggest that at the beginning of
the intervention, the learners who possessed more declarative knowledge produced faster
utterances with shorter pauses. Such patterns disappeared after the intervention, which
indicates the robust effects of contextualized practice on the development of procedural
memory. Also, the results imply that had the learners had no declarative knowledge of the
target structure, the proceduralization might not have been observed as they would not
have any knowledge source to monitor their production while engaging in contextualized
practice.
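The significance of a Pearson correlation can be checked by converting r to a t statistic with n − 2 degrees of freedom. The Python sketch below does this for the error rate correlations at Time 1 and Time 5. It assumes n = 34 (the full sample after imputation) and uses 2.037 as the two-tailed critical value of t at α = .05 for 32 degrees of freedom; the function and constant names are illustrative.

```python
import math


def correlation_to_t(r: float, n: int) -> float:
    """Convert a Pearson correlation to a t statistic with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Assumption: n = 34 learners; critical t for df = 32, two-tailed, alpha = .05.
N_LEARNERS = 34
T_CRITICAL_DF32 = 2.037

t_time1 = correlation_to_t(-0.36, N_LEARNERS)  # grammar test vs. error rate, Time 1
t_time5 = correlation_to_t(-0.16, N_LEARNERS)  # grammar test vs. error rate, Time 5

print(abs(t_time1) > T_CRITICAL_DF32)  # True: significant at Time 1
print(abs(t_time5) > T_CRITICAL_DF32)  # False: nonsignificant at Time 5
```

With a sample of this size, a correlation needs to exceed roughly .34 in absolute value to reach significance, which is why the weaker Time 5 relationship falls short.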
From a pedagogical perspective, it can be said that teaching grammatical rules ex-
plicitly may help learners engage in targetlike production at the beginning. However, the
amount of declarative knowledge that learners possess prior to engaging in practice may
be unrelated to the individuals’ rates of proceduralization. Therefore, promoting
contextualized practice after the initial instruction would be recommended, without
continuing explicit instruction. In other words, repeated
contextualized practice may be sufficient to develop accurate procedural knowledge of
a particular grammatical structure when learners have some declarative knowledge of
that structure. However, the quality of practice seems to matter in the rate of proce-
duralization, which we discuss in the following section. In addition, we caution that we
did not test the learners’ declarative knowledge after the intervention; it is possible that
declarative knowledge also developed due to the contextualized practice, which may
have supported the development of procedural memory.
The second regression model included the frequencies of correct and incorrect pro-
ductions as the predictors. The results showed that correct productions predicted the
change in accuracy. This may not be a surprising finding; the more a learner produces
correct utterances, the more accurate she or he becomes, confirming the findings in the
studies in which only correct responses were analyzed to investigate proceduralization
(e.g., Rodgers, 2011; Suzuki & DeKeyser, 2017a). The current study extended the
research agenda by investigating the impact of incorrect production. This was because L2
learners constantly produce inaccurate utterances and research has not discovered how
those utterances impact L2 learning. Somewhat unexpectedly, the regression analysis
showed that incorrect production negatively predicted the change of the accuracy scores.
At a glance, the results suggest that learners are better off not producing incorrect
utterances. However, a careful interpretation is in order because the current study in-
vestigated practice of a complex linguistic structure in a classroom context. Despite its
negative predictive power, we encourage teachers to promote contextualized L2 pro-
duction as much as possible for the following reasons. First, the frequency of correct
questions increased along with that of incorrect questions (r = .361; p = .036), meaning
that students were producing more of both forms. Besides, in the classroom, the teacher
cannot force students to produce correct utterances only, especially during communi-
cative activities. Second, as the output hypothesis states, L2 learners test their linguistic
hypotheses based on their interlanguage systems by producing the L2, which helps them
notice what they can or cannot produce (see Loewen & Sato, 2018). This cognitive
process may have contributed to the proceduralization found in the current study, even
though the incorrect productions were negatively related to the gain scores. Finally, the
predictive power of correct production (|β| = .56) was higher than that of incorrect
production (β = .39); hence, statistically speaking, promoting communicative usage of
language would ultimately facilitate proceduralization of grammatical knowledge.
The findings related to incorrect production pinpoint the importance of feed-
back—another pillar of skill acquisition theory along with practice. Anderson and
Schunn (2000) cautioned that without feedback during practice, “the student might wind
up entrenching the wrong knowledge structures” (p. 16) and feedback at propitious
moments during repeated practice promotes meaningful learning. To this end, L2 re-
search of corrective feedback has abundantly documented that when a learner is given
feedback, she or he may produce the correct version of the error, in the forms of
successful uptake or modified output (Lyster, Saito, & Sato, 2013). Such evidence has
been reported in the development of question formation as well (Lightbown & Spada,
1990). Had the teachers provided corrective feedback in the current study, the frequency
of correct productions might have increased and that of incorrect productions decreased,
which would have increased the overall effectiveness of the contextualized practice.
Indeed, in Sato and Lyster (2012), EFL learners who engaged in extensive paired
communicative activities improved their utterance fluency, while only those who re-
ceived feedback improved accuracy. Hence, we recommend that the teacher increase
opportunities whereby students engage in contextualized practice while providing
corrective feedback at appropriate moments.
In terms of the changes in fluency, none of the predictors were significant in any
model. Given the findings that speech rate and mean length of pauses improved over
time, we can conclude that the correctness of production does not matter as far as fluency
development is concerned. Simply speaking, by engaging in repeated practice, the
learners became faster in executing a particular performance—asking wh-questions.
Pedagogically, if the instructional goal is to improve fluency, the teacher does not have to
be concerned about the learners’ grammatical knowledge or correctness of their pro-
duction. Clearly, however, an increase in fluency should not be the sole goal of L2
instruction.
The current study focused on three issues related to practice effects. First, while skill
learning researchers have long argued for the importance of meaningful practice, often
the experimental tasks have been mechanical and decontextualized. The current study
operationalized “contextualized practice” and delivered it as classroom activities.
Second, to investigate the practice effect on real language use, we examined changes in
accuracy and fluency during communicative L2 production. Third, drawing on de-
clarative/procedural models, the contribution of learners’ declarative knowledge to
proceduralization was tested. While the exact nature of knowledge structures that
contextualized practice (re-)structured is unknown, we conclude that contextualized
practice supported the development of procedural knowledge of wh-questions, at least
partially. At the same time, those learners may have become faster in accessing de-
clarative knowledge of the structure.
The current study comes with ample ecological validity. The intervention activities
were collaboratively developed with the teachers. The resulting activities align with L2
teachers’ understanding of contextualized practice in general (see Korkmaz & Korkmaz,
2013). In addition, the intervention was delivered by the classroom teachers as part of
regular classes. The drawbacks of increasing the study’s ecological validity included the
large attrition rate and the wide standard deviations (see Sato & Loewen, 2019, for pros
and cons of classroom research). Nonetheless, given the positive results, we recommend
that teachers implement activities that elicit repeated contextualized practice of a specific
linguistic form, be it morphosyntactic, phonological, or pragmatic.
The limitations of the current study include, first, the lack of a control group. Without
a comparison, we remain cautious to claim that the observed changes were solely due to
the intervention. Second, although the frequencies of correct/incorrect productions were
entered in the regression models, each learner engaged in different amounts of practice,
which might have mediated the overall impact of repeated practice. Third, while the
practice and outcome measures addressed the desired goal of L2 instruction (i.e.,
communicative production), the current study did not measure changes in compre-
hension. Given the evidence related to skill specificity (DeKeyser, 1997), it is possible
that the learners developed productive skills only. Similarly, the grammar test might not
have been an ideal measure of declarative knowledge. Future research may benefit by
administering more sophisticated tests of declarative and procedural knowledge prior to
and after practice. To answer a theoretical question as to whether particular L2
knowledge has been automatized—as opposed to a practical question as to whether L2
production has become faster and more accurate—a measurement that taps into the
change in the underlying cognitive structure (e.g., brain scanning) would be useful.
Automatization has been operationalized with a statistical technique (Segalowitz &
Segalowitz, 1993), but there is a criticism of such an approach (Hulstijn et al., 2009).
Fourth, we did not consider learners’ individual aptitude found to moderate practice
effects (see Suzuki & DeKeyser, 2017b). Future research can consider those weaknesses
while considering the goal of contributing to our understanding of the role of con-
textualized practice in L2 learning and teaching.
NOTES
1
The idea of proceduralization/automatization relates to the interface positions in SLA research—whether
explicit knowledge can become implicit knowledge. This theoretical debate is beyond the scope of this article,
but in general we concur with Ullman and Lovelett (2018) arguing that “the explicit/implicit dichotomy is not
isomorphic to the declarative/procedural distinction…. Thus, it is difficult to directly compare the SLA and DP
model positions” (p. 45).
2
We considered that the absentees, resulting in the missing values in the dataset, were a random factor
(missing completely at random) (see Myers, 2011). A total of 14.7% of missing values was considered ac-
ceptable for the values to be imputed (Roth, 1994).
3
The application of missing data analyses did not alter the outlook of the dataset. The analyses with 19
participants who participated in all five sessions yielded significant differences in the same pairwise com-
parisons both of accuracy and fluency.
4
We examined the articulation rate (syllables per second after removing pauses) as well. The results showed
the identical statistical patterns to those of speech rate.
REFERENCES
Adams, R. (2007). Do second language learners benefit from interacting with each other? In A. Mackey (Ed.),
Conversational interaction in second language acquisition: A collection of empirical studies (pp. 29–51).
Oxford, UK: Oxford University Press.
Ammar, A., Lightbown, P., & Spada, N. (2010). Awareness of L1/L2 differences: Does it matter? Language
Awareness, 19, 129–146.
Anderson, J. (1985). Cognitive psychology and its implications (2nd ed.). New York, NY: Freeman.
Anderson, J. (2005). Cognitive psychology and its implications (6th ed.). New York, NY: Worth Publishers.
Anderson, J., & Schunn, C. (2000). Implications of the ACT-R learning theory: No magic bullets. In R. Glaser
(Ed.), Advances in instructional psychology (Vol. 5, pp. 1–33). Mahwah, NJ: Lawrence Erlbaum.
Bird, S. (2010). Effects of distributed practice on the acquisition of second language English syntax. Applied
Psycholinguistics, 31, 635–650.
Chein, J. M., & Schneider, W. (2005). Neuroimaging studies of practice-related change: fMRI and meta-
analytic evidence of a domain-general control network for learning. Cognitive Brain Research, 25, 607–623.
Council of Europe. (2001). Common European framework of reference for languages: Learning, teaching,
assessment. Cambridge, UK: Cambridge University Press.
de Jong, N., & Bosker, H. R. (2013). Choosing a threshold for silent pauses to measure second language
fluency. Paper presented at the 6th Workshop on Disfluency in Spontaneous Speech.
de Jong, N., & Perfetti, C. (2011). Fluency training in the ESL classroom: An experimental study of fluency
development and proceduralization. Language Learning, 61, 533–568.
de Jong, N., Steinel, M. P., Florijn, A., Schoonen, R., & Hulstijn, J. H. (2013). Linguistic skills and speaking
fluency in a second language. Applied Psycholinguistics, 34, 893–916.
DeKeyser, R. (1997). Beyond explicit rule learning: Automatizing second language morphosyntax. Studies in
Second Language Acquisition, 19, 195–221.
DeKeyser, R. (2007a). Skill acquisition theory. In B. VanPatten & J. Williams (Eds.), Theories in second
language acquisition: An introduction (pp. 97–113). Mahwah, NJ: Lawrence Erlbaum.
DeKeyser, R. (2010). Practice for second language learning: Don’t throw out the baby with the bathwater.
International Journal of English Studies, 10, 155–165.
DeKeyser, R. (2017). Knowledge and skill in ISLA. In S. Loewen & M. Sato (Eds.), The Routledge handbook of
instructed second language acquisition (pp. 15–32). New York, NY: Routledge.
DeKeyser, R. (Ed.) (2007b). Practice in a second language: Perspectives from applied linguistics and cognitive
psychology. Cambridge, UK: Cambridge University Press.
Derwing, T. (2017). L2 fluency development. In S. Loewen & M. Sato (Eds.), The Routledge handbook of
instructed second language acquisition (pp. 246–259). New York, NY: Routledge.
Eichenbaum, H. (2012). The cognitive neuroscience of memory: An introduction (2nd ed.). Oxford, UK: Oxford
University Press.
Ellis, R., & Shintani, N. (2014). Exploring language pedagogy through second language acquisition. London,
UK: Routledge.
Ferman, S., Olshtain, E., Schechtman, E., & Karni, A. (2009). The acquisition of a linguistic skill by adults:
Procedural and declarative memory interact in the learning of an artificial morphological rule. Journal of
Neurolinguistics, 22, 384–412.
Freed, B. (1995). What makes us think that students who study abroad become fluent? In B. Freed (Ed.), Second
language acquisition in a study abroad context (pp. 123–148). Amsterdam, The Netherlands: John
Benjamins.
Gade, M., Druey, M. D., Souza, A. S., & Oberauer, K. (2014). Interference within and between declarative and
procedural representations in working memory. Journal of Memory and Language, 76, 174–194.
Healy, A., & Bourne, L. (Eds.). (1995). Learning and memory of knowledge and skills: Durability and
specificity. Thousand Oaks, CA: Sage.
Heiberger, R. M., & Holland, B. (2004). Statistical analysis and data display: An intermediate course with
examples in S-PLUS, R, and SAS. New York, NY: Springer.
Hulstijn, J., Van Gelderen, A., & Schoonen, R. (2009). Automatization in second language acquisition: What
does the coefficient of variation tell us? Applied Psycholinguistics, 30, 555–582.
Jiang, N. (2012). Conducting reaction time research in second language studies. New York, NY: Routledge.
Kim, Y. (2012). Task complexity, learning opportunities, and Korean EFL learners’ question development.
Studies in Second Language Acquisition, 34, 627–658.
Korkmaz, S., & Korkmaz, Ş. Ç. (2013). Contextualization or de-contextualization: Student teachers’ per-
ceptions about teaching a language in context. Procedia-Social and Behavioral Sciences, 93, 895–899.
Langley, P., Laird, J., & Rogers, S. (2009). Cognitive architectures: Research issues and challenges. Cognitive
Systems Research, 10, 141–160.
Leont’ev, A. N. (1981). The problem of activity in psychology. In J. V. Wertsch (Ed.), The concept of activity in
Soviet psychology (pp. 37–71). Armonk, NY: M. E. Sharpe.
Lightbown, P. (2000). Classroom SLA research and second language teaching. Applied Linguistics, 21,
431–462.
Lightbown, P., & Spada, N. (1990). Focus-on-form and corrective feedback in communicative language
teaching: Effects on second language learning. Studies in Second Language Acquisition, 12, 429–448.
Lightbown, P., & Spada, N. (2000). Do they know what they’re doing? L2 learners’ awareness of L1 influence.
Language Awareness, 9, 198–217.
Lim, H., & Godfroid, A. (2015). Automatization in second language sentence processing: A partial, conceptual
replication of Hulstijn, Van Gelderen, and Schoonen’s 2009 study. Applied Psycholinguistics, 36,
1247–1282.
Loewen, S., & Sato, M. (2017). Instructed second language acquisition (ISLA): An overview. In S. Loewen &
M. Sato (Eds.), The handbook of instructed second language acquisition (pp. 1–12). New York, NY:
Routledge.
Loewen, S., & Sato, M. (2018). State-of-the-arts article: Interaction and instructed second language acquisition.
Language Teaching, 51, 285–329.
Lyster, R., & Sato, M. (2013). Skill acquisition theory and the role of practice in L2 development. In M.
P. García Mayo, J. Gutierrez-Mangado, & M. Martínez Adrián (Eds.), Contemporary approaches to second
language acquisition (pp. 71–92). Amsterdam, The Netherlands: John Benjamins.
Lyster, R., Saito, K., & Sato, M. (2013). Oral corrective feedback in second language classrooms. Language
Teaching, 46, 1–40.
Martin-Chang, S., & Levy, B. (2005). Fluency transfer: Differential gains in reading speed and accuracy
following isolated word and context training. Reading and Writing, 18, 343–376.
Martin-Chang, S., & Levy, B. (2006). Word reading fluency: A transfer appropriate processing account of
fluency transfer. Reading and Writing, 19, 517–542.
Mbogning, C., Bleakley, K., & Lavielle, M. (2015). Joint modelling of longitudinal and repeated time-to-event
data using nonlinear mixed-effects models and the stochastic approximation expectation–maximization
algorithm. Journal of Statistical Computation and Simulation, 85, 1512–1528.
McDonough, K., & Chaikitmongkol, W. (2010). Collaborative syntactic priming activities and EFL learners’
production of wh-questions. The Canadian Modern Language Review, 66, 817–841.
Morgan-Short, K., Steinhauer, K., Sanz, C., & Ullman, M. T. (2012). Explicit and implicit second language
training differentially affect the achievement of native-like brain activation patterns. Journal of Cognitive
Neuroscience, 24, 933–947.
Morgan-Short, K., & Ullman, M. (2012). The neurocognition of second language. In S. Gass & A. Mackey
(Eds.), The Routledge handbook of second language acquisition (pp. 282–300). New York, NY: Routledge.
Morris, C., Bransford, J., & Franks, J. (1977). Levels of processing versus transfer appropriate processing.
Journal of Verbal Learning and Verbal Behavior, 16, 519–533.
Myers, T. A. (2011). Goodbye, listwise deletion: Presenting hot deck imputation as an easy and effective tool
for handling missing data. Communication Methods and Measures, 5, 297–310.
Nakata, T. (2015). Effects of expanding and equal spacing on second language vocabulary learning: Does
gradually increasing spacing increase vocabulary learning? Studies in Second Language Acquisition, 37,
677–711.
Newell, A., & Rosenbloom, P. (1981). Mechanisms of skill acquisition and the law of practice. In J. Anderson
(Ed.), Cognitive skills and their acquisition (pp. 1–55). Hillsdale, NJ: Erlbaum.
Ortega, L. (2014). Understanding second language acquisition (2nd ed.). New York, NY: Routledge.
Osborne, J. W., & Waters, E. (2002). Four assumptions of multiple regression that researchers should always
test. Practical Assessment, Research, and Evaluation, 8, 1–5.
Pavlik, P., & Anderson, J. (2005). Practice and forgetting effects on vocabulary memory: An activation-based
model of the spacing effect. Cognitive Science, 29, 559–586.
Pedhazur, E. (1982). Multiple regression in behavioral research. Fort Worth, TX: Holt, Rinehart and Winston.
Petrocelli, J. V. (2003). Hierarchical multiple regression in counseling research: Common problems and
possible remedies. Measurement and Evaluation in Counseling and Development, 36, 9–22.
Pienemann, M., & Johnston, M. (1987). Factors influencing the development of language proficiency. In
D. Nunan (Ed.), Applying second language acquisition research (pp. 45–141). Adelaide, Australia: National
Curriculum Resource Center, Adult Migrant Education Program.
Pienemann, M., Johnston, M., & Brindley, G. (1988). Constructing an acquisition-based procedure for second
language assessment. Studies in Second Language Acquisition, 10, 217–243.
Plonsky, L., & Oswald, F. L. (2014). How big is “big”? Interpreting effect sizes in L2 research. Language
Learning, 64, 878–912.
Riggenbach, H. (1991). Toward an understanding of fluency: A microanalysis of nonnative speaker
conversations. Discourse Processes, 14, 423–441.
Rodgers, D. M. (2011). The automatization of verbal morphology in instructed second language acquisition.
International Review of Applied Linguistics in Language Teaching, 49, 295–319.
Roth, P. L. (1994). Missing data: A conceptual review for applied psychologists. Personnel Psychology, 47,
537–560.
Sato, M., & Loewen, S. (2019). Methodological strengths, challenges, and joys of classroom-based quasi-
experimental research: Metacognitive instruction and corrective feedback. In R. DeKeyser & G. Prieto
Botana (Eds.), Doing SLA research with implications for the classroom: Reconciling methodological
demands and pedagogical applicability (pp. 31–54). Amsterdam, The Netherlands: John Benjamins.
Sato, M., & Lyster, R. (2012). Peer interaction and corrective feedback for accuracy and fluency development:
Monitoring, practice, and proceduralization. Studies in Second Language Acquisition, 34, 591–626.
Schneider, W., & Fisk, A. (1983). Attentional theory and mechanisms for skilled performance. In R. Magill
(Ed.), Memory and control of action (pp. 119–143). New York, NY: North-Holland Publishing Company.
Schneider, W., Dumais, S., & Shiffrin, R. (1984). Automatic and controlled processing and attention. In
R. Parasuraman & D. Davies (Eds.), Varieties of attention (pp. 1–27). London, UK: Academic Press.
Schuetze, U. (2015). Spacing techniques in second language vocabulary acquisition: Short-term gains vs. long-
term memory. Language Teaching Research, 19, 28–42.
Segalowitz, N. (2010). Cognitive bases of second language fluency. London, UK: Routledge.
Segalowitz, N., & Freed, B. (2004). Context, contact, and cognition in oral fluency acquisition: Learning
Spanish in at home and study abroad contexts. Studies in Second Language Acquisition, 26, 173–199.
Segalowitz, N., & Segalowitz, S. (1993). Skilled performance, practice, and the differentiation of speed-up from
automatization effects: Evidence from second language word recognition. Applied Psycholinguistics, 14,
369–385.
Serrano, R., Stengers, H., & Housen, A. (2015). Acquisition of formulaic sequences in intensive and regular
EFL programmes. Language Teaching Research, 19, 89–106.
Sobel, H. S., Cepeda, N. J., & Kapler, I. V. (2011). Spacing effects in real-world classroom vocabulary learning.
Applied Cognitive Psychology, 25, 763–767.
Spada, N., & Lightbown, P. M. (1999). Instruction, first language influence, and developmental readiness in
second language acquisition. The Modern Language Journal, 83, 1–22.
Suzuki, Y., & DeKeyser, R. (2017a). Effects of distributed practice on the proceduralization of morphology.
Language Teaching Research, 21, 166–188.
Suzuki, Y., & DeKeyser, R. (2017b). Exploratory research on second language practice distribution: An
Aptitude × Treatment interaction. Applied Psycholinguistics, 38, 27–56.
Toppino, T. C., & Gerbier, E. (2014). About practice: Repetition, spacing, and abstraction. Psychology of
Learning and Motivation, 60, 113–189.
Ullman, M. (2001). The neural basis of lexicon and grammar in first and second language: The declarative/
procedural model. Bilingualism: Language and Cognition, 4, 105–122.
Ullman, M., & Lovelett, J. T. (2018). Implications of the declarative/procedural model for improving second
language learning: The role of memory enhancement techniques. Second Language Research, 34, 39–65.
Van Oers, B. (1998). From context to contextualizing. Learning and Instruction, 8, 473–488.
Walz, J. (1989). Context and contextualized language practice in foreign language teaching. The Modern
Language Journal, 73, 160–168.
White, L., Spada, N., Lightbown, P., & Ranta, L. (1991). Input enhancement and L2 question formation.
Applied Linguistics, 12, 416–432.