Phonetics: Precursors to Modern Approaches

The treatise shows astonishing linguistic insight, anticipating principles more closely associated with the twentieth century. Unfortunately it was not published until 1818, and remained almost unknown outside Scandinavia for many years.

Spelling Reform
During the Middle Ages Latin had dominated the linguistic scene, but gradually the European vernaculars began to be thought worthy of attention. Dante, in his short work De vulgari eloquentia 'On the Eloquence of the Vernaculars', gave some impetus to this in Italy as early as the fourteenth century. The sounds of the vernaculars were inadequately conveyed by the Latin alphabet. Nebrija in Spain (1492), Trissino in Italy (1524), and Meigret (see Meigret, Louis (?1500–1558)) in France (1542) all suggested ways of improving the spelling systems of their languages. Most early grammarians took the written language, not the spoken, as a basis for their description of the sounds.
Some of the earliest phonetic observations on English are to be found in Sir Thomas Smith's De recta et emendata linguae Anglicanae scriptione dialogus 'Dialogue on the True and Corrected Writing of the English Language' (1568). He tried to introduce more rationality into English spelling by providing some new symbols to make up for the deficiency of the Latin alphabet. These are dealt with elsewhere (see Phonetic Transcription: History). Smith was one of the first to comment on the vowel-like nature (i.e., syllabicity) of the ⟨l⟩ in able, stable and the final ⟨n⟩ in ridden, London.
John Hart (d. 1574) (see Hart, John (?1501–1574)), in his works on the orthography of English (1551, 1569, 1570), aimed to find an improved method of spelling which would convey the pronunciation while retaining the values of the Latin letters. Five vowels are identified, distinguished by three decreasing degrees of mouth aperture (⟨a, e, i⟩) and two degrees of lip rounding (⟨o, u⟩). He believed that these five simple sounds, in long and short varieties, were 'as many as ever any man could sound, of what tongue or nation soever he were'. His analysis of the consonants groups 14 of them in pairs, each pair shaped in the mouth 'in one selfe manner and fashion', but differing in that the first has an 'inward sound' which the second lacks; elsewhere he describes them as 'softer' and 'harder', but there is no understanding yet of the nature of the voicing mechanism. Hart's observations of features of connected speech are particularly noteworthy; for example, a weak pronunciation of the pronouns me, he, she, we with a shorter vowel, and the regressive devoicing of word-final voiced consonants when followed by initial voiceless consonants, as in his seeing, his shirt, have taken, find fault. Like Smith, he recognized the possibility of syllabic ⟨l⟩, ⟨n⟩, and ⟨r⟩ in English (otherwise hardly commented on before the 19th century). His important contribution to phonetic transcription is dealt with elsewhere (see Phonetic Transcription: History). As a phonetic observer he was of a high rank.

The Beginnings of General Phonetics: Jacob Madsen
An important phonetic work appeared in 1586, written by the Dane Jacob Madsen of Aarhus (1538–86) (see Madsen Aarhus, Jacob (1538–1586)), and entitled De Literis 'On Letters'. Madsen was appointed professor at Copenhagen in 1574, after some years studying in Germany. He was familiar with a number of modern European languages as well as Greek, Latin, and Hebrew, and in spite of the title of his work he was not simply a spelling reformer. The term litera is used in a broad sense, including the sound as well as the symbol. Moreover, he intended his work to cover the sounds of all languages; in this respect he can be described as the first to deal with general phonetics and not just the sounds of a particular language. He placed considerable emphasis on direct observation, as opposed to evidence from earlier authorities, but was strongly influenced by Petrus Ramus (see Ramus, Petrus (1515–1572)), the French philosopher/grammarian, and borrows extensively from his Scholae grammaticae (1569). Aristotelian influence is apparent in his two 'causes' of sounds: the remote cause (guttur 'throat') provides the matter of sounds (breath and voice), and this is converted into specific sounds by the proximate cause, the mouth and the nose. The mouth has a movable part (the active organ) and a fixed part (the passive or assisting organ). The former comprises lower jaw, tongue, and lips, and the latter the upper jaw, palate, and teeth.
In common with earlier accounts Madsen divides the vowels into lingual and labial, remarking that their sound is determined by the varying dimensions of the mouth. He identifies three lingual vowels, ⟨a, e, i⟩, differing in mouth aperture (large, medium, and small respectively), and three labial, ⟨o, u, y⟩, having progressively smaller lip opening; ⟨u⟩ and ⟨y⟩ also have more protrusion than ⟨o⟩. Madsen does not commit himself on the tongue position of the labials (one would need a glass covering of the mouth, he says, to observe it), and thinks it unnecessary to describe it because nature spontaneously adapts it to the sound. Few early descriptions are able to improve on this. He distinguished two varieties of ⟨e⟩ and three of ⟨o⟩ (including Danish ⟨ø⟩).

The consonants are also divided into lingual and labial. Linguals are subdivided into linguopalatine and linguodental. Linguopalatines, for which the tongue is in a concave shape with its tip toward the palate, are further divided into movable (tip not touching the palate: ⟨s, r⟩) and fixed (tip touching the palate and interrupting the airstream: ⟨l, n⟩). For the upper linguodentals the tip is against the upper teeth (⟨t, d⟩), and for the lower the tongue has a convex shape, with the tip against the lower teeth, and some involvement of the inner part of the tongue. The lower interior group includes ⟨h⟩ and ⟨c⟩ (= [k]), which he believed was articulated further back than ⟨g⟩ (= [g]). Madsen refutes the common assertion that ⟨h⟩ is not a letter but just a breath, pointing out that it can distinguish different words, and he quotes its use in the postaspiration of vowels in Danish words such as dah 'then'. The lower exterior group comprises ⟨j⟩ (= [j]) and ⟨g⟩. Labials are divided into labiodentals ⟨f, v⟩ and labiolabials ⟨p, b, m⟩. He believed that ⟨f⟩ had a more forceful articulation than ⟨v⟩, and was articulated further forward on the lower lip.
Madsen fails to distinguish oral and nasal sounds correctly, and leaves a number of gaps in his consonant description; for instance, he does not mention the glottal stop, [ʔ], or [ɣ], although they occur in Danish. However, in spite of these deficiencies, his work constitutes a major landmark in the development of articulatory phonetic descriptions in the Western tradition.

The 17th Century


The upsurge of scientific investigation and the general
spirit of inquiry which pervaded the 17th century led
to further advances in the understanding of language
and speech. There was a continuing interest in
spelling reform, stimulated by the spread of literacy.
Alexander Gill (Logonomia anglica, 1619), Charles Butler (The English Grammar, 1634), Simon Daines (Orthoepia Anglicana, 1640), and Richard Hodges (The English Primrose, 1644) were among the English spelling reformers, often described as orthoepists. Gill is unusual in devoting a substantial section to the variations found in different dialects of English, but neither he nor the others contributed substantially to the establishment of an improved phonetic framework, though they provided valuable evidence regarding contemporary pronunciations of English.
Robert Robinson

Robert Robinson's primary aim was to help foreign learners of English. The Art of Pronuntiation [sic] is divided into two parts: Vox audienda, the description of sounds, and Vox videnda, ways of transcribing sounds. In spite of his youth and lack of learning (for which he apologizes), he assures the reader that his book is based on his own experience. It contains some perceptive and novel ideas.
Part 1 gives an account of the vocal organs, contrasting the motion of air from the lungs with the restraint imposed by those organs through which the air passes (cf. Madsen's remote and proximate causes). The diversity of sounds arises from three aspects of this restraint: its instrument (the active articulator), its place, and its manner. He divides sounds into three categories: vowels, consonants, and the 'vital sound'. This is described as framed 'in the passage of the throat'; without it, he says, all the other parts of the voice 'would be but as a soft whispering'. We can assume he is referring to voicing, but he is vague and does not understand its origin. He distinguishes ten different vowel qualities, five short and five long, but only on one dimension, place of articulation; he does not mention aperture. The consonant classification is based on three places of articulation in the mouth (outward, middle, inward) and four manners: mutes (= stops), semimutes (breath through the nose), greater obstricts (stricture midway between mutes and lesser obstricts), and lesser obstricts (stricture midway between greater obstricts and vowels). The 'peculiar' (the lateral) is well described, and the 'breast consonant', though described as an aspirate, may refer to the glottal stop: he says it involves 'a sudden stay of motion' of the breath.
In part 2 Robinson provides a novel set of symbols, intended to convey the sounds unambiguously. There are two other interesting innovations in this part: (a) his use of a diagram depicting the palate, with indications of the tongue positions along it corresponding to his vowels; and (b) his provision of a diacritic to mark the beginning of a syllable, and another to indicate the occurrence of the aspirate. The aspirate is thus treated as a prosodic feature: its diacritic is placed above the first letter of a syllable, but according to the direction in which it points it signifies either syllable-initial or syllable-final ⟨h⟩. This diacritic also functions to distinguish the voiceless member of pairs such as ⟨p/b⟩ and ⟨t/d⟩, which otherwise share a common symbol. Robinson mentions tonal differences, but gives no explanation of how they are produced.
John Wallis

John Wallis (1616–1703) was one of the outstanding mathematicians of his day, and occupied the Savilian Chair of Geometry in Oxford for over 50 years (see Wallis, John (1616–1703)). He was something of a polymath, and his accomplishments included skills in deciphering documents in the Civil War (1642–49) and teaching speech and language to the deaf. It was partly because of these interests that he decided to attempt a more thorough examination of speech. His general phonetic treatise, entitled Tractatus de Loquela 'Treatise on Speech', was first published in 1653 in the same volume as his influential Grammatica linguae Anglicanae 'Grammar of the English Language'. He claimed to have been the first to set out the whole structure of speech systematically and in one place. This was something of an overstatement, but certainly his high international standing made his book particularly influential.
Wallis lists the vocal organs and what he believes are some of their functions: the lungs are responsible for producing greater or less strength or sonority; the trachea determines differences of pitch between different voices; while the larynx is responsible for the normal pitch variation in speech and song, through a narrowing or widening of the rimula 'slit' within it, and also for the difference between whisper and full speech (aperta loquela), through a 'tremulous vibration'. The nature and function of the vocal cords were still not known at this time, and Wallis's account is as accurate as one could expect, though less perceptive than Holder's.
For the vowels he proposes three places of articulation: labial, palatal, and guttural (a division taken over from Arabic and Hebrew grammarians). This threefold place classification may at first sight resemble the front/central/back categories of Melville Bell some centuries later, but whereas Bell's three categories referred to tongue positions, Wallis's 'labial' refers only to lip posture. Like Madsen, he does not specify tongue positions for labials, and attributes the difference between the labial vowels [u] and [y] to a difference of mouth aperture rather than to back and front tongue positions. He also omits the French and German front rounded vowels [ø] and [œ] from his inventory. The mouth aperture, too, is divided into three degrees (wide, medium, and narrow), giving a total of nine vowels, which Wallis believed could account for all the vowel sounds to be heard in languages. However, he concedes that since the aperture is a continuum it can be infinitely divided.
For the consonants also he is content to limit the places of articulation to three, and he has two strictures: closed (oral and nasal stops) and open (continuants). To provide for the further distinctions necessary he introduces a category based on mouth aperture: thinner for [s, z, x] and rounder for [θ, ð, h]. In distinguishing between the members of pairs such as ⟨p/b⟩, ⟨t/d⟩, ⟨f/v⟩, he is unwilling to accept that the difference originates in the larynx, and attributes it instead to different directions of the airstream. For ⟨f⟩, he says, it passes entirely through the mouth, for ⟨v⟩ it is equally split between nose and mouth, and for ⟨m⟩ it passes entirely through the nose. He had observed that whispered ⟨v⟩ and ⟨d⟩ are not identical with ⟨f⟩ and ⟨t⟩, but found the wrong explanation. In fact, for a whispered sound (in a technical phonetic sense) the vocal cords are brought closer together than for a voiceless sound, and this accounts for the difference perceived between the sounds.
It is rare to find these early writers on phonetics discussing the broad characteristics of different languages (what would now be called articulatory settings), but Wallis has some interesting comments: 'The English push forward the whole of their pronunciation into the front part of the mouth . . . the Germans retract their pronunciation to the back of the mouth . . . the French articulate all their sounds nearer the [hard] palate and the mouth cavity is not so wide.'
In short, though not free of defects, as his contemporary Amman pointed out, Wallis presents a clear,
articulatorily based description of speech which was
an advance on previous classifications. The Tractatus,
together with his account of English sounds in the first
chapter of his Grammar, formed a valuable source,
referred to in the 19th century by A. J. Ellis and Henry
Sweet in their works on English pronunciation.
William Holder

Holder (1616–98), like Wallis, was a member of the Royal Society, and became Canon of Ely Cathedral and later of St Paul's (see Holder, William (1616–1698)). He shared Wallis's interest in teaching the deaf, and quarreled violently with him over the question of which of them had taught a deaf person to speak, both claiming credit for it. But it was this interest which led him to investigate speech, and to produce what is in many ways an outstanding work for its time, Elements of Speech (1669). He starts with a good description of the vocal organs, divided into a material group (lungs, trachea, larynx, uvula, nose, arch of the palate), which provides and transmits breath and voice, and a formal group (tongue, palate, gums, jaw, teeth, lips), which forms the material into specific sounds. His description of the voicing mechanism is the best of his time: the larynx both gives passage to the breath and also, by the force of muscles to bear the sides of the larynx stiff and near together as the breath passes through the rimula [slit], makes a vibration of those cartilaginous bodies, which forms that breath into a vocal sound or voice. His material organs provide the basis for four groups of sounds: breathed (i.e., voiceless) oral, voiced oral, breathed nasal, and voiced nasal.
He rejects the traditional criterion for distinguishing vowels from consonants, namely that only vowels can stand alone: consonants, unlike vowels, involve an 'appulse' or approach of the vocal organs to each other. They are divided into 'plenary and occluse' (with complete closure) and 'partial and pervious' (without closure), the occluse consonants having three places of articulation, labial (p, b, m), gingival (t, d, n), and palatick (k, g, ng), and the pervious ones four, labiodental (f, v), linguadental (th, dh [θ, ð]), gingival (s, z), and palatick (sh, zh [ʃ, ʒ]), a total of 17. ⟨l⟩ is described as having either a unilateral or bilateral airstream, and Holder gives an accurate account of the balance between muscle tension in the tongue and strength of airstream to produce the trilled ⟨r⟩. The logic of his system requires him to recognize the possibility of breathed as well as voiced nasals (though he calls them 'harsh and troublesome'), and also of breathed ⟨l⟩ and ⟨r⟩. He adds to these a stop formed in the larynx, but believes that, like h ('only a guttural aspiration'), it is not an articulation. If it is relaxed, he says, one gets a 'shaking of the larynx' (meaning possibly creaky voice, or possibly a uvular trill). He notices some language-specific features, such as the devoicing of final consonants in German, and the greater aspiration in the Bocca Romana 'Roman dialect'.
Holder's system contains eight vowels, classified by place (guttural, palatick, labial-guttural (= [w]), and labial-palatal (= [y])) and by different degrees of aperture. Again, the way his analysis is structured means that each vowel can be short or long, nasal or oral, whispered or voiced, and (exceptionally for writers of this period) he points out that lip rounding can be added to any vowel. In two of the vowels he describes larynx lowering.
Although Wallis's international reputation made his work better known, Holder presents a more accurate overall description of the formation of speech sounds.
George Dalgarno and John Wilkins

Dalgarno (1626–87) and Wilkins (1614–72) are best known for their pioneering works on universal language. However, they had some interesting contributions to make to phonetic description. Dalgarno was born in Aberdeen, but spent most of his life as a private teacher in Oxfordshire (see Dalgarno, George (ca. 1619–1687)). Descriptions of sounds are found both in the first chapter of his Ars Signorum (1661) and in his Discourse on the nature and number of double consonants (1680). He limits the sounds described to those needed for his own system, but there are some perceptive comments on structural combinations of sounds, such as the frequent occurrence of initial ⟨s⟩ in consonant clusters, and of ⟨l⟩ and ⟨r⟩ after stops in releasing clusters and before consonants in final clusters.

Wilkins was the son of an Oxford goldsmith, who became a distinguished academic and was appointed Warden of Wadham College, Oxford, and later Master of Trinity College, Cambridge, and Bishop of Chester (see Wilkins, John (1614–1672)). He was one of a group of highly talented men who were founder members of the Royal Society. In his Essay towards a Real Character and a Philosophical Language (1668) he presents a well-organized framework for the pronunciation of his universal language, acknowledging his debt to Wallis, among others. The sounds are represented in a table, which shows the articulators involved and the various manner categories, mostly accurately distinguished. Wilkins keeps the traditional terms 'sonorous' and 'mute' for the pairs p/b, v/f, etc., instead of following Wallis's analysis; however, he believed, wrongly, that the sonority was caused by the epiglottis. Like Holder, and unlike most other descriptions prior to the 19th century, he accepts the possibility of voiceless nasals. There are three different methods of transcribing sounds (see Phonetic Transcription: History), and some comments on the characteristics of different languages, for example: the Italians pronounce 'more slowly and majestically', the French 'more volubly and hastily', the English 'in a middle way betwixt both' (classic English compromise!).
Petrus Montanus

Petrus Montanus (Pieter Berg) is a curiously isolated figure, and appears to have been little known, perhaps because he wrote in Dutch (see Montanus, Petrus (ca. 1594–1638)). He produced a highly complex analysis of sounds entitled De Spreeckonst 'The Art of Speech', published in Holland in 1635. The main influence he acknowledges was from Madsen, but his use of a series of dichotomies to classify the sounds, and his notion of speech as an imposition of form on matter, are clearly Aristotelian. The organs of speech are divided into two types: (a) breathing (lungs, trachea, and head cavities); and (b) forming (throat, mouth, and nose), each type having its own muscle groups. The throat organs (or 'vessels', as he calls them) include larynx, epiglottis, and uvula. The mouth is divided into 'inner' and 'outer': the inner includes the palate, gums, teeth, and tongue; the outer consists of the front of the teeth and the lips. Palate and tongue are each subdivided from back to front, and the lips are divided into edge and blade. The place of articulation is known as the 'dividing door'; the cavity behind this is called the 'pipe', and that in front of it the 'box'. The dividing door may be closed or open, and if open may be wide or narrow. A closed position gives oral and nasal stops (the nasals listed include the palatal), wide open gives vowels and fricatives [h, s, z, f, v, x, ɣ], and narrow gives trilled ⟨r⟩ and ⟨l⟩, the 'split letter'. Voicing is produced by the 'sounding hole' in the throat, which gives a smooth passage to the air, while the 'rustling hole' causes the air to be impeded at some point (presumably giving fricative noise). The sounding hole may be combined with a rustling hole. Montanus claims to distinguish 48 vowel qualities found in languages, but his terminology is complicated and sometimes hard to interpret. This remarkably detailed and sophisticated analysis deserved to win wider recognition, but remained virtually unknown for over 300 years.

Doctors and Elocutionists


By the end of the 17th century, the anatomy and
physiology of the vocal organs had been reasonably
well described, though some areas were still obscure,
notably the mechanism of voice. This deficiency was
to be rectified in the 18th century.
Conrad Amman (1669–1730)

Amman was a Swiss-born doctor working in Amsterdam (see Amman, Johann Conrad (1669–1730)). Like Holder and Wallis he had practiced teaching the deaf, and had attained some fame through his earlier work entitled Surdus Loquens 'The Dumb Man Speaks' (1692), setting out his method. In the preface to his Dissertatio de Loquela 'Treatise on Speech' (1700) he includes a letter of appreciation directed to Wallis, who had written to Amman with comments on the earlier book. Amman was apparently not acquainted with Wallis's Tractatus or with any other work on the subject when he wrote his own treatise, but comments in his letter on what he (correctly) saw as deficiencies in Wallis's account. These included the restriction of the vowels to nine by his oversymmetrical 3 × 3 classification, and his failure to recognize the nature of what Amman calls 'mixed' vowels, namely those represented in German by ⟨ü⟩, ⟨ö⟩, and ⟨ä⟩, which Amman interpreted as mixtures of ⟨e⟩ with ⟨u⟩, ⟨o⟩, and ⟨a⟩ respectively. He makes the perceptive comment that each vowel has a certain variability (Latin latitudo); it may be more open or more closed, but this does not change its nature. Wallis had separated [j] and [w] (as consonants) from [i] and [u], because in words such as ye and woo the initial segments had a closer articulation than the vowels following them. Amman, however, includes this variation within the vowel's latitudo,
which perhaps hints at an early notion of the phoneme. He makes sensible criticisms of three aspects of Wallis's treatment of consonants: (a) the failure to mention the explosion phase of stops; (b) his explanation of the difference between pairs such as ⟨v/f⟩ and ⟨z/s⟩ as due to different directions of the airstream rather than to the presence or absence of a sonus vocalis 'vocal sound'; and (c) his analysis of the sounds [ʃ, ʒ, tʃ, dʒ] as ⟨sy, zy, tsy, dzy⟩.
Amman's own framework of description was in other respects quite similar to Wallis's. The speech organs are divided into those that produce voice and breath, and those that form articulations, a similar division to Holder's. He criticizes the traditional idea that voice is produced simply by a narrowing of the chink (fistula) in the larynx; there must be a 'tremulous and quick undulation'. He had witnessed an anatomist who claimed to produce voice from a dead man's larynx, but did not find the demonstration convincing. The larynx is described in some detail: it consists of five solid cartilages, smooth and of great elasticity, connected by ligaments and small muscles, with nerves running from it to various parts of the body, including ribs and ears. When voice is required we breathe into the larynx muscles in such a way that they act on the cartilages and are balanced by them, resulting in an oscillatory and vibratory movement, which is conveyed to the air and to the bones of the head and makes it sonorous. He compares it to a glass, set vibrating by rubbing with a wet finger, or to the vibration of the tongue in ⟨r⟩ or in a labial trill. Pitch changes are attributed to different length and thickness of the cartilages, and also to the actions of muscles of the larynx in raising it and narrowing the chink, or lowering it and widening the chink, to give higher or lower pitch respectively. Amman noted the close link between emotion and voice quality, which he believed to come from an original natural universal language, now virtually lost.
His account of the articulators is also more anatomical than most, including reference to the tongue
muscles, and the functions of the hyoid bone and the
uvula. The various articulations of vowels and consonants are fully and accurately described, with comments on problems specific to the deaf. In all it is an
impressive work.
The Mechanism of Voicing: Dodart and Ferrein

Although there had been reasonably accurate descriptions of the anatomy of the larynx, notably by Casserius (1552–1616), no one had so far understood the mechanism of phonation. Some, like Holder, had come close to it, but it was not until the early 18th century that experiments were made with excised larynxes. Denys Dodart (1634–1707) and Antoine Ferrein (1693–1769) were both medical doctors. Dodart published a series of articles in the Mémoires de l'Académie de Paris between 1700 and 1707, comparing the larynx to a stringed instrument. He talks about the vibration of the two 'lips' of the glottis, the change of length or of tension resulting in faster vibration and so higher pitch, and dismisses the idea that the trachea plays a significant part in pitch differences. However, it appears that when Dodart and Ferrein talk of the glottis they are referring not to the true vocal folds but to the ventricular folds. In his 1707 article, Dodart drew a false analogy between the action of the vocal 'lips' and the action of the lips in whistling (apparently he was an expert whistler), claiming that narrowing of the aperture necessarily led to a higher pitch.
In 1741, Ferrein published an article which anticipates the myoelastic theory of larynx vibration. He stated quite clearly that vibration is the essential factor, and that for this to start there must be an adduction of what he called the cordes vocales (the first use of this term). The larynx is defined as an instrument 'à cordes et à vent', that is, not to be classed purely either as a stringed instrument or a wind instrument. The cordes vocales or rubans 'ribbons' are set in motion by the air (which acts like a bow on violin strings), and the vibration produces tones according to the laws of a stringed instrument. Ferrein dismissed Dodart's analogy with whistling, and maintained that changes of tension, not of aperture, cause the variety of tones. He established this through experiments using larynxes from cadavers, blowing air through them and varying their tension.
Further experimentation by Magendie (1824), Mayer (1826), and Johannes Müller (1837) confirmed much of Ferrein's vibratory theory, and correctly identified the function of the true vocal folds. Subsequent to the development of the laryngoscope by Garcia (1856) and Czermak (1858), and the use of photography (e.g., by Hermann, 1889), it became possible to produce conclusive evidence as to their mode of operation.
The Elocutionists

During the second half of the 18th century major contributions to the knowledge of human speech came from elocutionists. The demand for lessons in elocution at this time sprang from a growing awareness of the deplorable standard of public speaking and reading in churches, law courts, and politics, and also of the advantages to be gained by speaking with a prestigious type of accent.
Two of the leading elocutionists were Thomas Sheridan (1719–88; see Sheridan, Thomas (1719–1788)) and John Walker (1732–1807; see Walker, John (1732–1807)). Sheridan had been interested in oratory and in the promotion of the study of the English language from an early age. Starting his career as an actor, he later turned to lecturing and writing on elocution. Both he and Walker produced pronouncing dictionaries of English and published descriptions of their systems of elocution, including, in particular, extended accounts of the nonsegmental aspects of speech, such as intonation, accent, emphasis, pause, and rhythm. Walker gave very detailed rules as to the intonation patterns of English, relating the various inflections to pauses and to grammatical constructions.
Joshua Steele

The most precise and detailed description of these nonsegmental aspects is to be found in Essay towards Establishing the Melody and Measure of Speech, published in 1775 (2nd edn. 1779) by Joshua Steele (1700–91; see Steele, Joshua (1700–1791)). Steele was asked by the President of the Royal Society to comment on certain remarks made by Lord Monboddo (see Burnett, James, Monboddo, Lord (1714–1799)) in a recently published book, to the effect that tone, or pitch, had no part to play in English; accent, Monboddo had asserted, was conveyed simply by greater loudness. Having been interested for some years in the melody and measure of speech, namely intonation and rhythm, Steele had done experiments in which he imitated intonation patterns using a bass viol. In his book he conclusively refuted Monboddo's statement, and set out his own system, far in advance of anything previously published, together with comments on it from Monboddo, who admitted his mistake.
Steele distinguished (to use his terms) 'accent', 'quantity', 'pause', 'emphasis' (also called 'poise' or 'cadence'), and 'force'; that is, in modern terminology, pitch, duration, pause, salience (occurrence of the strong beat), and loudness respectively. He also provided a precise notation to indicate each of these characteristics. The duration of each syllable, and of each pause, is shown by using musical notes (semiquaver, quaver, crotchet, etc.), and the pitch movement by rising or falling oblique lines attached to the stalk of each note. These show the 'slides' of the voice, which he contrasted with the separate 'steps' of a musical scale. Sometimes the notes are placed on a stave, but more often they simply indicate the relative direction and extent of the pitch movement. Steele noted some of the functions of pitch change, for example, to indicate completion by a final fall, noncompletion by a rise, and degrees of emotional involvement by the extent of the movement. In some utterances it can serve simply as an embellishment, or as an indicator of regional origin. Emphasis is more complex. All speech, he says, 'prose as well as poetry, falls naturally under emphatical divisions, which I will call cadences; let the thesis or pulsation, which points out those divisions, be marked by bars, as in ordinary music'. In each bar the pulsation, or thesis, is followed by an arsis, or remission. Sometimes the pulsation may fall on a pause rather than a syllable, and be followed by one or more syllables in the remission, as in the first and fifth cadences of the following:
Of | man's | first diso | bedience, | and the | fruit
∆ [ | ∆ [ | ∆ . . [ | ∆ . . [ | ∆ [ | ∆

where the symbol ∆ indicates the point of pulsation.


Each bar, he says, is composed of combinations of syllables, or syllables and pauses, making up the same total quantity. This springs from his belief that everyone has an instinctive sense of pulsation, which is independent of the actual sounds. The terms 'heavy' and 'light' are used to refer to this mental sensation of what is emphatic or unemphatic. The 'rhythmus' or measure of speech is the number of cadences in a line or sentence. Steele dismisses the popular link between force and emphasis: the term 'force' refers to loudness, but he points out that emphasis in his sense frequently falls on a period of silence.
Although there are some inconsistencies in Steele's notion of emphasis, his book represents a major advance in the description of nonsegmental aspects of speech. It enabled him to make a detailed transcription of the way in which the actor David Garrick delivered the well-known Hamlet soliloquy. John Walker's theory of inflections was almost certainly influenced by his knowledge of Steele's work, and in the 19th century there were lively arguments between traditional prosodists and those who adhered to Steele's temporal approach to rhythm.
Christoph Friedrich Hellwag

C. F. Hellwag (1754–1835) published his Dissertatio physiologico-medica de Formatione Loquelae 'Physiological-Medical Dissertation on the Formation of Speech' in Tübingen in 1781. He based his account of the anatomy and physiology of the vocal apparatus on Albrecht von Haller's Elementa physiologiae corporis humani 'Elements of the Physiology of the Human Body' (1757–66) and on his own observations. The dissertation contains what was apparently the first example of the vowel triangle as a way of representing relationships between the vowel sounds:
u    ü    i
 o   ö   e
  å     ä
    a
The vowel ⟨a⟩ is described as the principal vowel, the basis of the rest when they are placed on what he calls 'the ladder', while ⟨u⟩ and ⟨i⟩ form the upper extremities, and the other vowels are equally spaced between these three. The transition from ⟨u⟩ to ⟨i⟩ goes through ⟨ü⟩, and that from ⟨o⟩ to ⟨e⟩ through ⟨ö⟩, just as there is a theoretical point between ⟨å⟩ and ⟨ä⟩. Between these designated points an infinite number of others can be interpolated. Hellwag asks rhetorically: 'Will it not be possible to specify all the vowels and diphthongs ever uttered by the human tongue by reference to their position [on the ladder], as it were mathematically?' He says he has determined their position not only auditorily, but also by observation of their articulatory formation, and gives a very detailed and mostly accurate description of tongue and lip positions. The way coarticulation occurs between vowel and adjacent consonant is noticed, and exemplified by the variation between [ç] and [x] according to whether the vowel is front or back. The relative positions of the perceived pitches of whispered vowels (i.e., the formant resonances) are specified as (from lowest to highest) ⟨u, o, å, a, ä, e, i⟩, and Hellwag remarks on the change induced in these resonances by nasality. Diphthongs can be indicated on his ladder by lines joining the starting and ending points, with the implication that a series of intermediate points must be traversed.
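Hellwag's lowest-to-highest ordering of the whispered-vowel pitches matches the ordering of the second formant in modern measurements. A minimal check, using rough, typical F2 values supplied here for illustration (they are approximations, not Hellwag's data):

```python
# Approximate second-formant (F2) values in Hz for the vowels on Hellwag's
# ladder; sorting by F2 reproduces his lowest-to-highest whispered-pitch order.
APPROX_F2 = {"u": 750, "o": 850, "å": 1000, "a": 1300,
             "ä": 1900, "e": 2200, "i": 2400}

ordered = sorted(APPROX_F2, key=APPROX_F2.get)
print(" < ".join(ordered))   # u < o < å < a < ä < e < i
```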
The description of consonants is accurate but
less notable than that of the vowels. It includes
descriptions of labial, alveolar, and uvular trills.

Early Developments in Experimental Phonetics
Experiments were carried out, as has been outlined,
during the 18th century to try to illuminate the mechanism of the larynx. There was an increasing interest
in trying to imitate the action of the vocal apparatus
by the use of some sort of mechanical device. Hellwag
mentions a number of early attempts to produce
automata, none of which had any well-substantiated
success, but there were two important enterprises of
this kind in the late 18th century.
Christian Gottlieb Kratzenstein

In 1779, the Imperial Academy of St Petersburg specified the following questions as topics for its annual prize: (a) What is the nature and character of the sounds of the vowels A, E, I, O, U? (b) Could one construct a device of a kind like the organ stop called vox humana which would produce exactly the sound of these vowels? The prize was won by Christian Gottlieb Kratzenstein (1723–95), Professor of Experimental Physics and of Medicine at the University of Copenhagen, a distinguished scholar, whose father was German and mother Danish (see Kratzenstein, Christian Gottlieb (1723–1795)). His prize essay, written in Latin, was published in 1781. He starts by giving an account of the mechanism of voice production. Like some earlier writers (e.g., John Wilkins), Kratzenstein believed that the epiglottis played a major part in phonation. He gives an interesting description of the physiological formation of the vowels, with precise measurements of the relative positions of the tongue in relation to the palate, the distances between upper and lower teeth, and the lip apertures for the different vowels, together with the positions of the larynx, and correctly notes both tongue and lip positions for the rounded vowels ⟨o⟩ and ⟨u⟩. Having described the normal vowel positions, he points out that it is possible to vary the distance between the teeth without changing the vowel quality. He proceeded to construct special cavities, or tubes, for each of the five vowels to be reproduced. The equivalent of the breath came from a bellows, and the glottal (or epiglottal) tone was provided by a metal reed, except in the case of ⟨i⟩, which operated more like a transverse flute, so that the air passed over a hole. Each vowel derived its quality from the air traveling through the variously shaped tubes. It appears that the sounds produced were recognizable, but Robert Willis showed later that it was not necessary to have tubes of such complex shapes to produce the desired result.
Wolfgang von Kempelen

Kratzenstein's tubes were much less versatile, however, than the machine constructed by Wolfgang von Kempelen (1734–1804), which he describes in his book Mechanismus der menschlichen Sprache nebst der Beschreibung seiner sprechenden Maschine 'Mechanism of Human Speech together with the Description of his Speaking Machine' (1791). Von Kempelen was a man of many parts (lawyer, physicist, engineer, phonetician), and held a high position in the government of the Habsburg monarchy (see Kempelen, Wolfgang von (1734–1804)). He first started on his project of constructing a speaking machine in 1769, so it took him over 20 years to bring it to completion. One of the things which led him to attempt it was his interest in teaching the deaf. His book is in five parts, starting with more general questions about speech and its origins, and going on to describe the speech mechanism in detail, the sounds of European alphabets, and finally his machine. The account of the speech mechanism is excellent, with numerous diagrams, and amusing interludes, such as his description of three kinds of kiss. The vowels are characterized by reference to two factors: (a) the lip aperture; and (b) the aperture of the tongue-channel; he specifically rules out any participation in vowel production by the nose (i.e., nasal vowels) or the teeth. Each of the two apertures is divided into five degrees; going from narrowest to widest, the lip aperture gives U, O, I, E, A, and the tongue-channel aperture I, E, A, O, U. These relationships were later seen to correspond with variations of the first and second formants respectively. It is clear from what von Kempelen says that he was able to perceive pitch differences (i.e., formants) in different vowels spoken on the same note.
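Von Kempelen's scheme assigns each vowel a pair of aperture degrees, and the pair is distinct for every vowel, which is what allows the later reading of the two scales as rough correlates of the two lowest formants. A small sketch of that bookkeeping (the rank pairs come straight from the two orderings above; the formant reading is the later interpretation just mentioned):

```python
# Degree 1 = narrowest aperture, 5 = widest, for each of the two factors.
LIP_ORDER    = ["U", "O", "I", "E", "A"]  # lip aperture, narrow to wide
TONGUE_ORDER = ["I", "E", "A", "O", "U"]  # tongue-channel aperture, narrow to wide

for vowel in "AEIOU":
    lip = LIP_ORDER.index(vowel) + 1
    tongue = TONGUE_ORDER.index(vowel) + 1
    print(f"{vowel}: lip degree {lip}, tongue-channel degree {tongue}")
```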
The machine which he eventually devised and built was the culmination of several earlier unsuccessful attempts. He describes it in great detail, with accompanying diagrams (see Experimental and Instrumental Phonetics: History), so precisely that attempts were made later, more or less successfully, to build it again on the basis of these directions. The sounds produced (which included 19 consonants) undoubtedly left something to be desired in their intelligibility; von Kempelen himself says that if the hearer knows what word to expect it is more likely to seem correct; otherwise one might treat it like a young child who makes mistakes from time to time but is intelligible. Nevertheless, his book represents a major contribution to phonetic description and experimental research. The theory of speech acoustics was not yet understood, but the attempt to produce a satisfactory auditory output from a machine based on an articulatory model made several steps in the right direction.
The Development of an Acoustic Theory of
Speech Production

Although there had been suggestions in earlier works that the differences between vowel sounds derived from changes in the shape of the vocal tract, brought about by movements of the tongue and lips, there was no theory to account for the differences prior to the nineteenth century. Among those who played a part in laying the foundations of such a theory were the German physicist E. F. F. Chladni (1756–1827) and the British physicists Robert Willis (1800–75) and Charles Wheatstone (1802–75).
Chladni published his important book Die Akustik 'Acoustics' in 1802 (later translated into French as Traité d'Acoustique), and Über die Hervorbringung der menschlichen Sprache 'On the Production of Human Language' in 1824. His vowel diagram is very similar to Hellwag's, except that it is reversed vertically, and has one extra central vowel:
      a
  ò   eù   è
  o   eu   é
  ou   u   i


He described the three columns in articulatory terms: the vowels from ⟨a⟩ to ⟨ou⟩ (= [u]) are said to have progressively decreasing lip aperture; those from ⟨a⟩ to ⟨i⟩ progressively decreasing mouth cavity size; and those from ⟨a⟩ to ⟨u⟩ (= [y]) show a decrease in both.
Willis was familiar with the work of both Kratzenstein and von Kempelen. After attempts to verify von Kempelen's methods of producing vowels using similar apparatus, he decided that the way ahead did not lie in this direction. Instead, he commenced experiments using cylindrical tubes of differing lengths, with reeds fitted into one end. He discovered that by increasing the length he could produce the vowel series ⟨i, e, a, o, u⟩, without any change to the diameter or shape of the tube in any other way. Thus, he showed that Kratzenstein's complex tubes were unnecessary. He also concluded that the vowel quality was independent of the note produced by the reed, and was due to the damped vibrations set up by reflections of the original wave at the extremity of the tube. This theory came to be known as the 'inharmonic' or 'transient' theory. Wheatstone repeated Willis's experiments, and found that there were multiple resonances: a column of air would enter into vibration not only when it was capable of producing the same sound as the vibrating body causing the resonance, but also when the number of vibrations which it was capable of making was an integral multiple of those produced by the original sounding body, i.e., a harmonic. Thus, a quality was added to the original sound, depending on the relationship of the frequency of vibration of that sound to the resonating frequency of the column of air.
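Willis's central observation, that lengthening a reed-excited tube alone shifts its vowel-like quality, follows from simple resonator acoustics. A minimal sketch, assuming an idealized quarter-wave tube (closed at the reed end, open at the far end), whose resonances fall at f_n = (2n - 1)c/4L; the lengths chosen are illustrative only:

```python
C = 343.0  # approximate speed of sound in air, m/s

def resonances(length_m, n=3):
    """First n resonance frequencies (Hz) of an ideal quarter-wave tube."""
    return [(2 * k - 1) * C / (4.0 * length_m) for k in range(1, n + 1)]

# Lengthening the tube lowers every resonance, changing the perceived
# vowel quality without altering the reed's note or the tube's shape.
for length in (0.05, 0.10, 0.17, 0.25):
    freqs = ", ".join(f"{f:6.0f}" for f in resonances(length))
    print(f"L = {length:4.2f} m -> resonances (Hz): {freqs}")
```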
The work of Willis and Wheatstone formed the basis for the vowel theory proposed by Hermann Helmholtz (1821–94) in Die Lehre von den Tonempfindungen (1863). He did experiments using resonators of various sizes to simulate the effect of the coupled resonances of the vocal cavity, and concluded that vowel quality was determined solely by the resonance frequencies of the vocal cavity, which reinforced those partial tones of the sound source that corresponded to its own resonances. His explanation came to be known as the 'harmonic' theory of vowel production. He divides the vowels into two types: (a) ⟨u, o, a⟩; and (b) ⟨ü, ö, ä, i, e⟩, type (a) being characterized by one resonance, and type (b) by two resonances. The positions of the resonances correspond very closely to modern definitions of the first two formants. These would be very close together for the vowels included in type (a), and Helmholtz presumably could not separate them from one another. In the case of type (b) he assigns one resonance to the pharynx and the other to the mouth cavity.
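The core of the harmonic theory, that the cavity reinforces whichever partials of the source fall near its resonances, can be sketched numerically. The resonance curve below is a generic Lorentzian peak supplied for illustration; the center frequencies and bandwidths are assumptions, not Helmholtz's measurements:

```python
# A voice source contributes harmonics at k * f0; each cavity resonance
# boosts the partials lying near its center frequency.
def gain(f, center, bandwidth):
    """Simple Lorentzian resonance curve: 1 at the peak, 0.5 at center +/- bandwidth/2."""
    return 1.0 / (1.0 + ((f - center) / (bandwidth / 2.0)) ** 2)

F0 = 110.0                                      # source fundamental, Hz
RESONANCES = [(500.0, 120.0), (1500.0, 180.0)]  # (center, bandwidth) pairs

for k in range(1, 16):
    f = k * F0
    boost = sum(gain(f, c, bw) for c, bw in RESONANCES)
    bar = "#" * int(20 * boost)
    print(f"harmonic {k:2d} ({f:5.0f} Hz) {bar}")
```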
Helmholtz's harmonic theory was contested by Hermann (Phonophotographische Untersuchungen, 1889), who preferred Willis's inharmonic theory. However, as Rayleigh (The Theory of Sound, 2nd edn. 1896) pointed out, the difference between the two theories is relatively unimportant. Both held that the resonance frequencies of the vocal cavity were essential for specifying vowel quality; supporters of the inharmonic theory required in addition information about the amplitude and phase of all the spectral components, but in fact, for vowels, this can be predicted from the resonance frequencies. Hermann was responsible for introducing the term 'formant'.
Relatively little work was done on the acoustics of consonants before the last quarter of the nineteenth century, when, through the work of Rosapelly and others, it became clear that there was a characteristic pitch associated with the transition from consonant to vowel.

Physiological Approaches to Speech in the 19th Century
Karl Moritz Rapp (1803–83)

Rapp's book (Versuch einer Physiologie der Sprache 'Essay on the Physiology of Speech', 4 vols, 1836–41), though taking physiology as its starting point, was concerned particularly with tracing the historical development of languages. He was well acquainted with the pioneering work of Rasmus Rask (see Rask, Rasmus Kristian (1787–1832)) and Jacob Grimm (see Grimm, Jacob Ludwig Carl (1785–1863)) in this field, and as well as his deep historical knowledge he brought to his work an exceptionally keen phonetic ear and a wide knowledge of modern languages, notably French, Italian, English, and Spanish. He was a particularly harsh critic of the tendency of scholars of the time, notably Grimm, to rely unduly on the orthography in their study of speech, and this did not endear him to his contemporaries in Germany. In common with other 19th-century German scholars, including Grimm, Lepsius, and Thausing (though their interpretations varied considerably), Rapp drew an analogy between vowel sounds and colors, referring to Goethe's theory of color, which was based on the mixture of pigments. He held that just as the color gray represents an undifferentiated mixture composed of all the other colors, and can be placed in the middle of a triangle surrounded by them, so the vowel in the center of the vowel triangle is the Urvokal, or basic vowel, representing unentwickelte Indifferenz 'an undifferentiated neutral form' with respect to the distinctness of the vowels around it. In spite of the impressive scholarship which it shows, his book failed to win the recognition it deserved.



Ernst Wilhelm Brücke (1819–92)

Brücke was a distinguished physiologist and doctor who practiced in Vienna (see Brücke, Ernst (1819–1892); Experimental and Instrumental Phonetics: History). He had relatively little background in modern spoken languages, so his approach to the description of speech is not a linguistic one. He was perhaps inspired to do work in phonetics by his famous teacher Jan Purkinje (1787–1869). In his main work (Grundzüge der Physiologie und Systematik der Sprachlaute 'Foundations of the Physiology and Taxonomy of Speech Sounds', 1856) he formulated a classificatory scheme which was intended to take in any possible class of sounds, but, unlike his contemporary Merkel, he kept the anatomical and acoustic material to a minimum. His description of the consonants is wholly articulatory in its basis, as one would expect, and is based on place and manner of articulation. He emphasizes the importance, for anyone investigating language, of being aware of the processes of production and not just the sounds which result from them. In the case of the vowels, however, he abandoned this approach, and attempted to class them by an acoustic criterion, based on resonances. The vowels are set out in a familiar triangular form.

The status of the two center vowels in each of the two bottom lines of the triangle is not at all clear. Even less clear is the basis of the distinction made between what Brücke calls the vollkommen gebildete 'completely formed' vowels and the incompletely formed ones, which form a further complete set of 14. Most of the vowels of English are said to fall into this category. In addition he talks of an unbestimmte 'indeterminate' vowel, but, like Lepsius (see Lepsius, Carl Richard (1810–1884)), finds no place for it in his triangle. Sievers (1881) (see Sievers, Eduard (1850–1932)) and others have criticized his failure to deal with anything but sounds in isolation. However, in spite of these shortcomings, Brücke was thorough and accurate in his general treatment of speech, and his exposition has an admirable clarity. He put forward two systems of notation, one roman-based and the other nonroman (see Phonetic Transcription: History).
Carl Ludwig Merkel (1812–76)

Merkel was Professor of Medicine in Leipzig. His Physiologie der menschlichen Sprache 'Physiology of Human Speech' (1866) was an improved and enlarged version of an earlier work. While he shared with Brücke the same background and approach to speech description, he lacked the discrimination which enabled Brücke to present the essentials in such a relatively precise and clear form. Merkel's book contains a somewhat forbidding mass of anatomical and physiological detail in its 1000 or so pages, from which it is not easy to extract the more valuable phonetic contributions. His knowledge of foreign languages was extremely limited, and many of his observations spring from his own Saxon dialect. One of his more serious failings was his misunderstanding of the mechanism of voicing, which led him to say that no obstruent consonant has vibration of the vocal cords; hence [b, d, g] and [p, t, k] are distinguished as having respectively a closed and an open glottis. On the other hand his treatment of the nonsegmental aspects of speech, including phonotactics, rhythm, and pitch variation, showed a keen observer's ear, and filled the gap which Brücke had left. He also, like Brücke, devised a nonroman transcription system, which he claims is more complete, in that Brücke omitted from his notation vowel sounds which he could not illustrate from languages, and sounds whose physiological formation he did not understand, such as the clicks of southern African languages.
Moritz Thausing (1838–84)

Thausing relied heavily on Brücke's Grundzüge as a basis for his own system, Das natürliche Lautsystem der menschlichen Sprache (1863). It is based on the postulate that sounds make up a unified system, which has its starting point in what he calls the Naturlaut 'natural sound', that is, the vowel ⟨a⟩. This, he believes, is the sound with the least articulatory interference in its mode of production, and the greatest intensity: the sound which is the earliest and easiest for humans to produce. All other sounds are departures from this, involving a greater degree of Verdumpfung 'dampening'. He arranges the 22 basic sounds in a hierarchical order, departing from ⟨a⟩ in three different directions, each branch containing seven grades of sounds, decreasing in intensity (Tonstärke, Schallstärke) as they get further from the Naturlaut; the nasal consonants are at the extreme, being most 'colorless'. The three series are as follows (Thausing actually arranges them on the model of a pyramid, with ⟨a⟩ at the top):
1. ⟨a, o, u, w, f, p, b, m⟩
2. ⟨a, e, i, j, ch (= [x]), k, g, ng (= [ŋ])⟩
3. ⟨a, l, r, s (= [z]), ß (= [s]), t, d, n⟩

The different grades of sounds are subject to minor variations in quality to account for national and individual varieties. At their boundaries they merge imperceptibly with their neighbors. The sounds can only be defined in relation to their place in this system.
Thausing acknowledged his debt to Hellwag in giving ⟨a⟩ this position of prominence. The ancient Indian phoneticians had also regarded it as basic to the other vowels. However, perhaps the most significant of Thausing's insights was not the system itself, but the notion of a hierarchy of sounds of steadily decreasing intensity or resonance. He emphasized that just as only one syllable has prominence (den Hauptton) in the word, so only one sound has prominence in the syllable. Each sound has a certain potentiality to take this prominent role in the syllable. Whether it does so or not depends not on its inherent nature, but on its position in the hierarchy relative to other adjacent sounds in the syllable. It is not only vowels in the traditional sense that can play this role, but frequently also liquids. He therefore coined the term 'sonant' to refer to this prominent sound, and 'consonant' for the nonprominent sounds in the syllable. This functional approach to the syllable, in terms of the relative sonority of its components, influenced Sievers's syllable theory, as he acknowledged. Thausing also put forward a new notation system (see Phonetic Transcription: History).
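Thausing's functional definition, that the sonant is whatever segment outranks its neighbors in the hierarchy, amounts to finding local maxima of sonority. A toy sketch of that idea (the numeric ranks are an illustrative assumption, not Thausing's 22-grade system):

```python
# Illustrative sonority ranks: vowels highest, then liquids, nasals,
# fricatives, stops. The "sonant" of a stretch is any local maximum.
RANK = {**{v: 5 for v in "aeiou"},
        **{c: 4 for c in "lr"}, **{c: 3 for c in "mn"},
        **{c: 2 for c in "szfv"}, **{c: 1 for c in "pbtdkg"}}

def sonants(word):
    ranks = [RANK.get(ch, 0) for ch in word]
    peaks = []
    for i, r in enumerate(ranks):
        left = ranks[i - 1] if i > 0 else -1
        right = ranks[i + 1] if i + 1 < len(ranks) else -1
        if r > left and r > right:
            peaks.append(word[i])
    return peaks

# In 'ridn' (cf. syllabic n in 'ridden'), the final n outranks the stop
# before it and so takes the sonant role, although it is not a vowel.
for w in ("planta", "btl", "ridn"):
    print(w, "->", sonants(w))
```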
Félix Du Bois-Reymond (1782–1865)

Du Bois-Reymond's Kadmus only appeared in his 80th year, but it was the culmination of a lifetime's interest in language. He had published articles on the vowels (Neue Berlinische Monatschrift, 1811) and consonants (Musen, 1812), intended to be part of the larger work, but circumstances prevented its publication before 1862. Only encouragement, from Brücke in particular, stimulated him to complete this work. It is a comprehensive account of the acoustics, articulatory basis, and graphic representation of speech. For much of it he was dependent on earlier writers, whose influence he acknowledges: Hellwag, Chladni, Merkel, Brücke, Helmholtz, Lepsius, and others. Perhaps the most interesting aspect of his work is his vowel diagram. Instead of the triangular shape favored by Hellwag and Chladni, with ⟨a⟩ at the top or the bottom and regarded in some sense as the most basic vowel, Du Bois-Reymond makes ⟨u⟩ and ⟨i⟩ the two extremes in the vowel sequence, with ⟨a⟩ as a somewhat movable mean between them. His diagram is semicircular, with a vertical diameter, because he believed that this reflected the respective high and low resonances of ⟨i⟩ and ⟨u⟩, whereas the more usual diagram put them both at the top or bottom of a triangle (see Figure 1).

Figure 1 Vowel diagram by F. Du Bois-Reymond.

Phonetics in Britain in the 19th Century


The study and teaching of elocution continued to
stimulate advances in phonetics, but another powerful stimulus was provided by concern about systems
of transcription.
Alexander John Ellis (1814–90)

A. J. Ellis (see Ellis, Alexander John (ne Sharpe)


(18141890)) was responsible for laying the foundations of phonetic studies in Britain. A legacy
bestowed on him early in his life made it possible
for him to devote a large part of his time to study
and research. He obtained a first class degree in
mathematics at Cambridge, but he also had strong
interests in music and in philology, and during his
travels in Europe he became interested in the Italian
dialects. This led him to try to devise a system of
transcription to record the different sounds. In
1843, he heard about the work of Isaac Pitman (see
Pitman, Isaac, Sir (18131897)), who had by then
developed the system of shorthand which was to
become world famous, and in spite of their very different backgrounds (Pitman came from comparatively humble origins, and had left school at an early age)
they struck up a successful partnership, aimed at
reforming English spelling. Ellis determined to make
himself thoroughly familiar with the existing phonetic literature, and in his early work acknowledges a
particular debt to K. M. Rapp. He acquired a good
knowledge of languages, and following these early
studies published The Alphabet of Nature (1845),
and, what was effectively a substantial revision of it,
The Essentials of Phonetics, containing the theory of
a universal alphabet (1848) (printed in the Phonotypic transcription devised by Pitman and Ellis). Both of
these books were intended to provide a basis for a
new transcription system (see Phonetic Transcription:
History). However, his most substantial work was On Early English Pronunciation (EEP) (5 vols, 1869–89). Bell's Visible Speech (1867) (see below) led him
to change some of his earlier ideas, though he did not
adopt Bells system as a whole. EEP was intended
primarily as a treatise on the history of the sounds
of English, but it contains in addition an amazingly
rich collection of information on the physiology of
speech and on contemporary English pronunciation,
summing up the results of earlier work, both of the
German physiologists, and of Melville Bell. Ellis's
observations and analyses inevitably contain errors,
but one can have nothing but admiration for the
enormous amount of sheer hard work that went into
these five volumes, the extent of his scholarship, and
the fineness of the analysis of the dialectal sounds,
which he records in his Palaeotype transcription.
A sixth volume was planned, to provide a much-needed summary and index, but regrettably the author's death prevented its completion. In an abridged version of volume five, entitled English dialects, their sounds and homes, Ellis used Glossic, a
simpler transcription, based on English orthography,
instead of the more complex Palaeotype. A very
readable exposition of his ideas is contained in
Pronunciation for Singers (1877).
Alexander Melville Bell (1819–1905)

Bell was a Scot, the son of an elocution teacher, and in due course became principal assistant to his father (see Bell, Alexander Melville (1819–1905)). Between
1843 and 1870, he lectured in elocution at London
and Edinburgh universities, running courses for native speakers, foreigners, and for those suffering from
speech problems. This gave him a detailed knowledge
of a wide variety of different sounds. After visiting the
USA to lecture in 1868, he emigrated to Canada in
1870, and carried on his teaching there and in the
USA, becoming an American citizen in 1897.
His most influential work was Visible Speech, the
Science of Universal Alphabetics, or self-interpreting
physiological letters, for the writing of all languages
in one alphabet (1867). His idea was to provide a
framework which would accommodate all the various sounds that can be produced by the human vocal
organs, including even interjectional and inarticulate
sounds, and a notation designed to suggest to the user
the way in which the sounds concerned were formed
(that is, an iconic notation; see Phonetic Transcription: History). Bell first advertised his system in 1864,
with public demonstrations, and attempted to persuade the British government to give its support to it
as a way of simplifying the bewildering inconsistencies of the English orthography, and providing a
sound bridge from language to language, but he
was unsuccessful in this. However, his analysis of speech sounds, and in particular of the vowels, had a major impact on the subsequent development
of phonetics, even though it was not widely known
outside his own circle.
The new system was intended to identify clearly the
respective parts played in determining vowel quality
by the movement of the tongue, and the lips, and any
other factors. Too many earlier descriptions had
failed to separate out the contributions made by
these various components. Bell's attention was, therefore, inevitably focused on the articulatory, and not
the acoustic aspect of vowel production. He divides
the vowel space into three horizontal (back, front,
mixed) and three vertical (high, mid, low) categories. The horizontal division is defined according to
the position of the raised part of the tongue in relation
to the palate; the 'mixed' category involves raising both
the back and front of the tongue. The vertical division
relates to the height of the tongue. Each of these
tongue positions can be accompanied by rounded or
unrounded lips. This gives a total of 18 cardinal
vowel types, to which he added a further division into 'primary' and 'wide', making a total of 36 vowels. Bell explained 'primary' and 'wide' as follows:
Primary vowels are those which are most allied to consonants, the voice channel being expanded only so far as to remove all fricative quality. The same organic adjustments form wide vowels when the resonance cavity is enlarged behind the configurative aperture; the physical cause of wide quality being retraction of the soft palate, and expansion of the pharynx.
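The arithmetic behind Bell's 36 vowels can be made explicit: three horizontal positions, three heights, and two lip states give 18 cardinal types, doubled by the primary/wide division to 36. A minimal sketch of the enumeration (the category labels follow the text above; the code itself is merely illustrative):

    # Enumerate Bell's vowel categories: 3 x 3 x 2 = 18 cardinal types,
    # doubled by the primary/wide division to give 36 in all.
    from itertools import product

    horizontal = ['back', 'mixed', 'front']
    vertical = ['high', 'mid', 'low']
    lips = ['unrounded', 'rounded']
    width = ['primary', 'wide']

    cardinal_types = list(product(horizontal, vertical, lips))
    assert len(cardinal_types) == 18
    vowels = list(product(cardinal_types, width))
    assert len(vowels) == 36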

The basis for this distinction was subsequently changed by Sweet, who also changed the terms to 'wide' and 'narrow', but it did not win general acceptance. Sweet said of Bell's classification: 'Bell's analysis of the vowels is so perfect that after ten years' incessant testing and application to a variety of languages, I see no reason for modifying its general framework.' Bell's own diagrams of the vowel articulations suggest that his distinctions may well be based
on auditory impressions rather than articulatory evidence, or perhaps on proprioceptive sensations of
muscular activity.
His analysis of the consonants and prosodic features broke less new ground, though there are many
interesting comments to be found in his works. The
consonants are divided into four places of articulation: (a) lip (labial); (b) point (formed by the point
of the tongue and, generally, the upper gums or teeth);
(c) front (formed by the front (middle) of the tongue
and the roof of the mouth); and (d) back (formed by
the root of the tongue and the soft palate). They are
further divided into four different 'forms': (a) primary (narrowing without contact: fricatives); (b) divided (middle of the passage stopped, sides open: laterals); (c) shut (complete closure of the mouth passage: stops); and (d) nasal (complete closure of the mouth, the nose passage being open: nasals).
Further subdivisions of both place and form were
possible, partly by using the category 'mixed', which
allowed two places of articulation (e.g., point and
front), and partly by modifiers (front/back, close,
open). In addition the sounds could be voiced or
voiceless. In spite of the undoubted success which
Bell and his pupils clearly had in using the system,
his exposition of it in Visible Speech was far from easy
to follow for those nonspecialists who had to rely
purely on his somewhat sparse descriptions of the
sounds.
Henry Sweet (1845–1912)

Sweet was perhaps the greatest 19th-century phonetician, and had an equal, if not greater, reputation as an Anglicist (see Sweet, Henry (1845–1912)). He was one of Bell's pupils, and made himself fully familiar with the works of other British and foreign writers on phonetics such as Ellis, Merkel, Brücke, Czermak,
Sievers, and Johan Storm. He was always guided by
his belief that it was vital to have a firsthand knowledge of the sounds of languages, and this is reflected
in his detailed studies of the pronunciation of Danish,
Russian, Portuguese, North Welsh, and Swedish in
the Transactions of the Philological Society from
1873 onward, and of German, French, Dutch, Icelandic, and English in his Handbook of Phonetics, published in 1877. This was his first major work on
general phonetics. He emphasizes in the preface that
phonetics provides an indispensable foundation
of all study of language, and deplores the current
poor state of language teaching and the lack of
practical study of English pronunciation in
schools. Phonetics, he says, can no more be acquired by mere reading than music can. It must be
learnt from a competent teacher, and must start with
the study of the student's own speech. Bell, he believed, has done more for phonetics than all his
predecessors put together, and as a result of his
work and Ellis's, England may now boast a flourishing phonetic school of its own. He criticizes the
German approach to vowel description for focusing
entirely on the sounds, without regard to their formation, and for the resulting triangular arrangement of
the vowels, which has done so much to perpetuate
error and prevent progress. Several examples of the
triangular diagram being used to indicate the relationships between vowels have been given, beginning
with Hellwag. Du Bois-Reymond departed from this,
and Bell's system led to the vowel quadrilateral as an alternative. However, as Viëtor points out, Bell uses a vowel triangle elsewhere, so his choice of the quadrilateral in Visible Speech relates more to the character of his notation than to a basic difference in interpretation of the vowel area. For a discussion of various 19th-century vowel schemes see Viëtor (1898: 41–65) and Ungeheuer (1962: 14–30).
The Handbook was a response to a request from
Johan Storm, the outstanding Norwegian phonetician
(see Storm, Johan (18361920)), for an exposition of
the main results of Bells investigations, with such
additions and alterations as would be required to
bring the book up to the present state of knowledge.
Sweet's Primer of Phonetics (first published in 1890)
was intended in part as a new edition of the Handbook and in part as a concise introduction to phonetics, rigorously excluding all details that are not
directly useful to the beginner. The changes from
the Handbook are relatively few, but he made some
further changes in later editions of the Primer. The
following brief account refers to his final system.
Sweet has a short section on the organs of speech,
and then goes on to what he calls Analysis. This
starts with the throat sounds, and the categories 'narrow' and 'wide' (modified from Bell's 'primary' and 'wide'). Sweet attributes this distinction not to
differences in the pharynx, but to the shape of the
tongue, and, unlike Bell, he applies it to consonants as
well as vowels. For narrow sounds he maintains that
the tongue is comparatively tensed and bunched,
giving it a more convex shape than for the wide
sounds. He adopts Bells classification for the vowels,
but introduces a further distinction, between shifted
and normal vowels, giving 72 categories in all. The
shifted vowels are said to involve a movement of
the tongue in or out, away from its normal position (front, mixed, or back), while retaining the
basic slope associated with this position, but few,
if any, phoneticians have found this a satisfactory
category.
The consonant classification is also based on Bell,
but with some modifications. 'Lip', 'point', 'front', and 'back' are retained, but a further category 'blade' is added and also 'fan' (spread), to characterize the
Arabic emphatic consonants. Sweet preferred to use
the active articulator to define the place, rather than
the passive, as in modern terminology. Bell had somewhat curiously classed [f] [v] as 'lip-divided', and [θ] [ð] as 'point-divided' (i.e., lateral) consonants: for Sweet the first pair are 'lip-teeth', and the second pair 'point-teeth'. He also added an extra 'form' category, namely trills. Each place of articulation could
be further modified forward (outer) or backward
(inner), and there are two special tongue positions:
inversion (in later terminology called retroflexion),
and protrusion (tongue tip extended to the lips).

His next section is entitled 'Synthesis', and follows the example of Sievers in looking at the characteristics of sounds in a stream of incessant change. These
characteristics include variations in the relative force
(stress), quantity, tone, and voice quality of parts of
the utterance, but notably the 'glides' (a term which he borrowed from Ellis): the transition from one sound to another. In the case of vowels, this included the aspirate on-glide ⟨h⟩ and the glottal stop, as in [hat] and [ʔat]. Consonant glides accounted for the variations of voicing and aspiration in stops, e.g., between [aka], [akha], [agha], and [aga]. The final
section of the Handbook is on sound notation
and introduces his Romic transcription. This was
subsequently revised in his article Sound Notation
(1880–81) (see Phonetic Transcription: History).
Sweet's work had a widespread impact on the development of phonetics in Britain and abroad. The essentially pragmatic approach which he shared with Bell
and Ellis contrasted with the more theoretical approach of continental scholars. In due course the crucial importance of practical work with languages and
of careful observation and recording of sounds came
to be recognized by phoneticians throughout Europe.
Sweet combined this practical approach with an ability to extract from the multiplicity of his phonetic
observations a clear general phonetic system, and to
express this system clearly and concisely. Jespersen,
indeed, found him too concise, and looked for more
explanation or justification for some of his statements,
while admiring his clarity, sharp observation, and
ability to grasp the central and most crucial elements,
and to leave the rest on one side. He attributes
Sweet's avoidance of lengthy discussions or explanations to his concern for the nonspecialist reader, who
would only be confused by discursive accounts of the
background and other competing views, and sees this
as a reaction to the German phoneticians, who had
often gone to the opposite extreme.

Phonetics in Germany in the 19th Century


Eduard Sievers (1850–1932)

Sievers (see Sievers, Eduard (1850–1932)) published his Grundzüge der Lautphysiologie in 1876. He saw it as a continuation of the work of Merkel and Helmholtz, that is, as a scientific (physiological and physical) development of the study of speech. It was an
advance on the previous German work, giving more
attention to details of individual languages, but it was
still more theoretical than practical in its approach.
The second edition appeared in 1881, entitled Grundzüge der Phonetik. In this the influence of what was being called the English–Scandinavian School
(notably Bell, Sweet, and Storm) was apparent.

Sievers abandoned his previous vowel system and adopted Bell's articulatory scheme, praising the practical approach that Bell had pioneered, though he still
believed that the starting point should be an analysis
of individual languages, rather than a scheme encompassing all possible sounds. He put more emphasis on
the phonetics of the sentence, including the glides
between sounds, and the identification of the syllable.
There is also an increasing emphasis on the importance of looking at the way sounds operate within a
system.
By the third edition (1885) Sievers had decided to
concentrate on the linguistic aspects of phonetics,
because he thought it was impossible to combine
this in one book with the physical and physiological aspects. The full title, Grundzüge der Phonetik zur Einführung in das Studium der Lautlehre der
indogermanischen Sprachen, underlines his objective
to provide the basis for a detailed investigation of
the earlier stages of the Indo-European languages,
rather than to provide a treatise on the theory and
practice of phonetics. Nevertheless, he still promoted
the practical articulatory approach of Bell and Sweet,
and the need to study living languages (see Experimental and Instrumental Phonetics: History).
Moritz Trautmann (1842–1920)

Trautmann was not an admirer of the English–Scandinavian School. In his book, Die Sprachlaute (The Sounds of Speech) (1884–86), he rejected Bell's
vowel scheme because of its articulatory approach
and attempted to replace it with an acoustically
based analysis. However, although this is now a fully
acceptable method of analyzing vowels, at that time
the difficulties of identifying vowel resonances precisely were still considerable. In addition, Trautmann
frequently gives very little, if any, indication of the
precise vowels to which he is referring, and is content
to construct a theoretical scheme with little, if any,
exemplification from languages. This makes his system very difficult to follow. One of his objectives was
to found a German school of phonetics, and he
attempted to replace existing terminology with a new
German-based one, but only a few of his terms (such
as stimmhaft ('voiced') and stimmlos ('voiceless'))
have become established.
Wilhelm Viëtor (1850–1918)

Viëtor was less of a theorist than a practical phonetician (see Viëtor, Wilhelm (1850–1918)). In this capacity he made a most important contribution to the
teaching of phonetics in Germany. His Elemente der
Phonetik (Elements of Phonetics) (1st edn. 1884) was
highly successful and was frequently republished (the
7th edn. in 1923, after his death). He was a pioneer in the movement to improve the standards of language teaching in schools, and saw the crucial contribution
that phonetics could make to this. He was also, with
Passy and Sweet, one of the leading figures in the
Phonetic Teachers' Association, which later became the International Phonetic Association (IPA). Viëtor
for the most part accepted the practical tradition of
Bell and Sweet, and provides a valuable survey of the
work of his predecessors in the early part of his book.
Friedrich Techmer (1843–91)

Techmer started his academic life as a scientist, and his aim was to link together the scientific and linguistic aspects of the study of speech. He had a much
wider knowledge of languages than the earlier physiologists like Brücke, and with the increasing sophistication of experimental phonetic techniques at the
end of the nineteenth century the time was ripe for
an overall survey, linking auditory and articulatory
phonetics with the experimental approach. Techmer
did a great deal to stimulate interest and research in
phonetics by founding the Internationale Zeitschrift
für allgemeine Sprachwissenschaft (IZ), which began
in 1884 but ended with his death in 1891. In the first
issue he gave a valuable survey of the experimental
techniques being used to investigate speech, with
precise definitions of German technical phonetic
terms (see Experimental and Instrumental Phonetics:
History), which he had introduced (more successfully
than Trautmann). However, the contribution of his
own work to phonetics was regrettably not as great as
it promised to be. Contemporary phoneticians, such
as Sweet and Storm, were particularly critical of the
fact that he did not support his attempt to establish a
scientific basis for phonetics with actual data from
languages, and of his general lack of clarity and explanation. Techmer was highly critical of the mistakes
he claimed to have detected in the works of Bell and
Passy, but seems to have been unaware of the
improvements made to Bells system by Sweet. His
attacks were hard to respond to because they were
often phrased in general terms, without specific
examples from languages to support them, though
Techmer claimed that all his statements were based
on personal observation of the pronunciation of individual speakers, not on theoretically possible sounds.
His later works on specific languages and dialects are better supported (for his system of notation, see
Phonetic Transcription: History).

Paul Passy (1859–1940) and Otto Jespersen (1860–1943)

The excuse for linking together Passy (see Passy, Paul Edouard (1859–1940)) and Jespersen (see Jespersen, Otto (1860–1943)) in one section is that while neither of them was responsible for a major advance in phonetic theory, they both contributed very considerably
to the progress of phonetics through their conviction
of its importance and their practical teaching.
Both were closely involved in the founding of the
IPA, and in the movement to bring phonetics into language teaching. From 1890 to 1927 Passy was Secretary of the IPA, following his brother Jean, who had
been the first Secretary. Passy had a high reputation
for his lectures and for the clarity of his exposition.
Among his publications Les Sons du français (1887) is the most significant. In 1892, he won the Volney Prize for an essay entitled Études de changements phonétiques.
Jespersens contributions to linguistics range much
more widely, but phonetics was always one of
his strongest interests. He met Viëtor, Ellis, Sweet,
Sievers, and Passy soon after leaving university, and
acknowledges a great debt in particular to Sweet, but
also to the Scandinavians Thomsen (see Thomsen, Vilhelm Ludvig Peter (1842–1927)), Verner, Møller (see Møller, Hermann (1850–1923)), and Storm (see Storm, Johan (1836–1920)). He was a good mathematician, and this may have led him to propose his
analphabetic system of notation in The Articulations
of Speech Sounds (1889), which also attempted to
give a more precise form and reference to phonetic
terminology (see Phonetic Transcription: History). It
was generally welcomed as providing objective reference points, which could be used to illuminate the
competing descriptive systems of Bell, Trautmann,
Sweet, and others, even though its very precision
made heavy demands on the skill of the observer.
His main phonetic work was published first in Danish, Fonetik, en systematisk fremstilling af læren om sproglyd (1897–99), and later republished in German
in shorter form, Phonetische Grundfragen and Lehrbuch der Phonetik (both 1904). He was scrupulous in
familiarizing himself with all the previous literature,
and had searching observations and criticisms to
make of them. He also produced a dialect notation
for Danish, and was foremost in promoting the
Copenhagen Conference on phonetic transcription
in 1925, though he never adopted the IPA alphabet
for his own use (see Phonetic Transcription: History).

Phonetics at the End of the 19th Century


The discipline of phonetics was firmly established by
this time, in spite of inevitable differences in classificatory schemes, terminology, and systems of transcription. The major source of disagreement related
to the increasing use of experimental techniques of
analysis (see also Phonetics: Instrumental, Early Modern). Laryngoscopy, kymography, palatography, and sound recording were all being employed, notably by Abbé Rousselot (see Rousselot, Pierre Jean, Abbé (1846–1924)) in Paris (Principes de phonétique expérimentale, 1901–08), but many phoneticians, among them Sievers, Sweet, and Jespersen,
were critical of the attention being given to these
techniques. They were not willing to accept results
produced by machines as convincing evidence in themselves. This is understandable, in view of the comparatively crude nature of some of the devices used. Sweet's comment in his article for the Encyclopædia Britannica (11th edn. 1911) sums it up:
It cannot be too often repeated that instrumental phonetics is, strictly speaking, not phonetics at all. It is only
a help: it supplies materials which are useless till they
have been tested and accepted from the linguistic phonetician's point of view. The final arbiter in all phonetic
questions is the trained ear of a practical phonetician.

The arguments went on for many years after that, and even in the late 20th century have not been fully
resolved. As Kohler (1981) points out in his article on
19th-century German phonetics, there is still in many
cases a division between the linguistic and the more
technical or scientific aspects of phonetics, which
urgently needs to be bridged in order to confirm it
as a unified and independent discipline. Jespersen's hopeful words, written at the turn of the century (1905–06: 80), are worth quoting in conclusion.
After summarizing the various things that have led
people to the study of speech, he goes on:
While previously all these different men worked for
themselves, without knowing very much about the
others who, in a different way, were interested in the
same objectives, in recent times they seem more and
more to be converging, and to be making common
cause with one another, so that each one in his own
area is aware of the activity of the rest, and has the
strong feeling that for the building which is to shelter
the science of human speech, stones must be hauled
together from many different directions.

See also: Amman, Johann Conrad (1669–1730); Bell, Alexander Melville (1819–1905); Bhartrhari; Brücke, Ernst (1819–1891); Burnett, James, Monboddo, Lord (1714–1799); Dalgarno, George (ca. 1619–1687); Dionysius Thrax and Hellenistic Language Scholarship; Ellis, Alexander John (né Sharpe) (1814–1890); Experimental and Instrumental Phonetics: History; First Grammatical Treatise; Grimm, Jacob Ludwig Carl (1785–1863); Hart, John (?1501–1574); Holder, William (1616–1698); Jespersen, Otto (1860–1943); Kempelen, Wolfgang von (1734–1804); Kratzenstein, Christian Gottlieb (1723–1795); Lepsius, Carl Richard (1810–1884); Madsen Aarhus, Jacob (1538–1586); Meigret, Louis (?1500–1558); Møller, Hermann (1850–1923); Montanus, Petrus (ca. 1594–1638); Passy, Paul Edouard (1859–1940); Phonetic Transcription: History; Pitman, Isaac, Sir (1813–1897); Ramus, Petrus (1515–1572); Rask, Rasmus Kristian (1787–1832); Rousselot, Pierre Jean, Abbé (1846–1924); Sandhi; Sheridan, Thomas (1719–1788); Sievers, Eduard (1850–1932); Steele, Joshua (1700–1791); Storm, Johan (1836–1920); Sweet, Henry (1845–1912); Thomsen, Vilhelm Ludvig Peter (1842–1927); Viëtor, Wilhelm (1850–1918); Walker, John (1732–1807); Wallis, John (1616–1703); Wilkins, John (1614–1672).

Bibliography
Abercrombie D (1965). Studies in phonetics and linguistics.
Oxford: Oxford University Press.
Allen W S (1953). Phonetics in ancient India. Oxford:
Oxford University Press.
Asher R E & Henderson E J A (eds.) (1981). Towards a
history of phonetics. Edinburgh: Edinburgh University
Press.
Austerlitz R (1975). Historiography of phonetics:
A bibliography. In Sebeok T A (ed.) Current trends in
linguistics, vol. 13. The Hague: Mouton.
Firth J R (1946). The English school of phonetics. TPhS, 92–132.
Fischer-Jørgensen E (1979). A sketch of the history of phonetics in Denmark until the beginning of the 20th century. ARIPUC 13, 135–169.
Halle M (1959). A critical survey of the
acoustical investigation of speech sounds. In The sound
pattern of Russian. The Hague: Mouton.
Jespersen O (1897, 1905–06). Zur Geschichte der älteren Phonetik. In Jespersen O (ed.) (1933) Linguistica: Selected papers in English, French, and German. Copenhagen: Levin & Munksgaard.
Kemp J A (1972). John Wallis. Grammar of the English
language with an introductory grammatico-physical treatise on Speech. London: Longman.
Kohler K (1981). Three trends in phonetics: The development of phonetics as a discipline in Germany since the
nineteenth century. In Asher R E & Henderson E J A
(eds.) Towards a history of phonetics. Edinburgh: Edinburgh University Press.
Laver J (1978). The concept of articulatory settings: A historical survey. HL 5, 1–14.
Malmberg B (1971). Réflexions sur l'histoire de la phonétique. In Les domaines de la phonétique. Paris: Presses Universitaires de France.
Møller C (1931). Jacob Madsen als Phonetiker. In Møller C & Skautrup P (eds.) Jacobi M. Arhusiensis de literis libri duo. Aarhus: Stiftsbogtrykkeriet.
Robins R H (1979). A Short history of linguistics. London:
Longman.
Sebeok T A (ed.) (1966). Portraits of linguists (2 vols). Bloomington, IN: Indiana University Press.

Storm J (1892). Englische Philologie: die lebende Sprache. 1. Abteilung: Phonetik und Aussprache (2nd edn.) (vol. 1). Leipzig: Reisland.
Techmer F (1890). Beitrag zur Geschichte der französischen und englischen Phonetik und Phonographie. IZ für allgemeine Sprachwissenschaft 5, 145–295.

Ungeheuer G (1962). Elemente einer akustischen Theorie der Vokalartikulation. Berlin: Springer.
Viëtor W (1898). Elemente der Phonetik (4th edn.). Leipzig: O. R. Reisland.

Phonological Awareness and Literacy


U Goswami, University of Cambridge, Cambridge, UK
© 2006 Elsevier Ltd. All rights reserved.

There is growing empirical evidence from a variety of languages for a causal connection between phonological awareness and literacy development. Phonological awareness is usually defined as the child's ability
to detect and manipulate component sounds in words.
Component sounds can be defined at a number of
different linguistic levels, for example syllables versus
rhymes. As children acquire language, they become
aware of the sound patterning characteristic of their
language, and use similarities and differences in
this sound patterning as one means of organizing the
mental lexicon (see Ziegler and Goswami, 2005). In
describing how phonological awareness is related to
literacy, I will discuss three kinds of empirical data: (a)
developmental studies measuring children's phonological skills in different languages; (b) developmental
studies measuring longitudinal connections between
phonology and reading in different languages, and (c)
studies seeking to test whether the connection between phonological awareness and literacy is causal
(via training phonological skills). I will argue that the
development of reading is founded in phonological
processing across languages. However, as languages
vary in their phonological structure and also in the
consistency with which phonology is represented in
orthography, cross-language differences in the development of certain aspects of phonological awareness
and in the development of phonological recoding
strategies should be expected across orthographies.
After discussing the empirical evidence, I will conclude by showing that data from different languages
can be described theoretically by a Psycholinguistic
Grain Size theory of reading, phonology, and development (Goswami et al., 2001, 2003; Ziegler and
Goswami, 2005). According to this theory, while
the sequence of phonological development may
be language universal, the ways in which sounds are
mapped to letters (or other orthographic symbols)
may be language-specific. In particular, solutions to

the mapping problem of how sounds are related to


symbols appear to differ with orthographic consistency. When orthographies allow 1:1 mappings between
symbols and sounds (e.g., Spanish, a transparent or
consistent orthography), children learn to read relatively quickly. When orthographies have a many:1
mapping between sound and symbol (feedback inconsistency, which is very characteristic of French, as in
pain/fin/hein) or between symbol and sound (feedforward inconsistency, very characteristic of English, e.g.,
cough/rough/bough), children learn to read more
slowly. French and English are examples of nontransparent or inconsistent orthographies. My basic argument throughout this article will be that the linguistic
relativity of phonological and orthographic structures is central to understanding the development of
phonological awareness and reading.

Phonological Awareness in Different Languages
According to hierarchical theories of syllable structure (see Treiman, 1988), there are at least three linguistic levels at which phonological awareness can be
measured. Children can become aware that (a) words
can be broken down into syllables (e.g., two syllables
in wigwam, three syllables in butterfly); (b) syllables
can be broken down into onset/rime units: to divide a
syllable into onset and rime, divide at the vowel, as in
t-eam, dr-eam, str-eam (The term rime is used because words with more than one syllable have more
than one rime, for example, in captain and chaplain,
the rimes are -ap and -ain, respectively. The rimes are
identical, but these words would not conventionally
be considered to rhyme, because they do not share
identical phonology after the first onset, as do rabbit
and habit, for example; this shared portion is sometimes called the superrime.); and (c) onsets and rimes
can be broken down into sequences of phonemes.
Phonemes are the smallest speech sounds making up
words, and in the reading literature are usually defined in terms of alphabetic letters. Linguistically,
phonemes are a relatively abstract concept defined in terms of sound substitutions that change meaning.
For example, pill and pit differ in terms of their final
phoneme, and pill and pal differ in terms of their
medial phoneme. The mechanism for learning about
phonemes seems to be learning about letters. Letters
are used to symbolize phonemes, even though the physical sounds corresponding (for example) to the P
in pit, lap, and spoon are rather different. In all
languages studied to date, phonemic awareness
appears to emerge as a consequence of being taught
to read and write. In general, prereading children and
illiterate adults perform poorly in tasks requiring
them to manipulate or to detect single phonemes
(e.g., Goswami and Bryant, 1990; Morais et al.,
1979).
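The 'divide at the vowel' rule for onsets and rimes given above is simple enough to state as a procedure. The sketch below is a toy version operating on orthographic strings with a fixed vowel-letter set; real phonological segmentation would of course operate on phonemes, not letters:

    # Toy onset/rime split: everything before the first vowel letter is
    # the onset, the rest is the rime (cf. t-eam, dr-eam, str-eam).
    VOWELS = set('aeiou')

    def onset_rime(syllable):
        for i, ch in enumerate(syllable):
            if ch in VOWELS:
                return syllable[:i], syllable[i:]
        return syllable, ''   # no vowel letter found

    for word in ['team', 'dream', 'stream']:
        print(onset_rime(word))   # ('t', 'eam'), ('dr', 'eam'), ('str', 'eam')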
Studies of the development of phonological awareness across languages suggest, perhaps surprisingly,
that the early emergence of phonological awareness
at the level of syllables and onset-rimes is fairly universal. Children in all languages so far studied show
above-chance performance in phonological awareness tasks at the syllable and onset/rime level long
before they go to school. Cross-language studies
do not show uniform patterns of development for
phoneme awareness, however. Phoneme awareness
develops rapidly in some languages once schooling
commences, but not in others. Children learning
transparent orthographies such as Greek, Finnish,
German, and Italian acquire phonemic awareness relatively quickly. Children learning nontransparent
orthographies such as English, Danish, and French
are much slower to acquire phonemic awareness.
The development of phonological awareness in
children can be measured using a variety of tasks.
For example, children may be asked to monitor and
correct speech errors (e.g., sie to pie), to select the odd
word out in terms of sound (e.g., which word does
not rhyme: pin, win, sit), to make a judgment about
sound similarity (e.g., do these two words share a
syllable? hammer, hammock), to segment words by
tapping with a stick (e.g., tap out the component
sounds in soap: three taps), and to blend sounds
into words (e.g., d-ish, or d-i-sh to make dish; see
for example, Bradley and Bryant, 1983; Chaney,
1992; Liberman et al., 1974; Metsala, 1999; Treiman
and Zukowski, 1991). However, as well as measuring
different levels of phonological awareness, these tasks
also make differing cognitive demands on young
children. In order to investigate the sequence of
the development of phonological awareness, ideally
the cognitive demands of a particular task should
be equated across linguistic level. This is even more
important to achieve when comparing the development of phonological awareness across different
languages.

Figure 1 Psycholinguistic units in words according to hierarchical theories of syllable structure.

Unfortunately, it is rare to find research papers in which the same task has been used to study the emergence of phonological awareness at the different linguistic levels of syllable, rhyme, and phoneme. The
most comprehensive studies in English are those recently conducted by Anthony and his colleagues
(Anthony et al., 2002, 2003; Anthony and Lonigan,
2004). For example, Anthony et al. (2003) used
blending and deletion tasks at the word, syllable,
onset/rime, and phoneme level. They also studied a
large group of children (more than 1,000 children)
and included a much wider age range than many
studies (2–6 years). Using sophisticated statistical
techniques including hierarchical loglinear analyses,
and a factorial design that allowed them to investigate the order of acquisition of phonological skills,
they showed that childrens progressive awareness of
linguistic units followed the hierarchical model of
word structure shown in Figure 1. Children generally
mastered word-level skills before they mastered syllable-level skills, they mastered syllable-level skills before onset/rime skills, and they mastered onset/rime-level skills before phoneme skills.
These findings with respect to sequence are mirrored by many other studies conducted in English
using a variety of tasks to measure awareness of
syllables, onsets, rimes, and phonemes. For example,
counting tasks (in which children tap with a stick or
put out counters to represent the number of syllables
or phonemes in a word) and oddity tasks (in which
children select the odd word out in terms of either
onset/rimes or phonemes) have yielded a similar developmental picture (e.g., Goswami and East, 2000;
Liberman et al., 1974; Perfetti et al., 1987; Tunmer
and Nesdale, 1985). Counting and oddity tasks are
useful for comparisons across languages, as both
tasks have been used by researchers in other languages to measure syllable, onset/rime, and phoneme
skills. Relevant data for syllable and onset/rime
awareness are shown in Table 1. It appears that,
where comparisons are possible, preschoolers in all
languages so far studied have good phonological
awareness at the large-unit level of syllables, onsets,
and rimes. For large units, phonological awareness
appears to emerge as a natural consequence of

Table 1  Data (% correct) from studies comparing syllable (counting task) and rhyme awareness (oddity task) in different languages

Language         Syllable    Rhyme
Greek (a)        98          –
Turkish (b)      94          90
Norwegian (c)    83          91 (h)
German (d)       81          73 (i)
French (e)       73          –
English (f)      90          71 (j)
Chinese (g)      –           68

a. Porpodas, 1999. b. Durgunoglu and Oney, 1999. c. Høien et al., 1995. d. Wimmer et al., 1991. e. Demont and Gombert, 1996. f. Liberman et al., 1974. g. Ho and Bryant, 1997. h. Rhyme matching task. i. Wimmer et al., 1994. j. Bradley and Bryant, 1983.

Table 2  Data (% correct) from studies comparing phoneme counting in different languages in kindergarten or early Grade 1

Language         % Phonemes counted correctly
Greek (a)        98
Turkish (b)      94
Italian (c)      97
Norwegian (d)    83
German (e)       81
French (f)       73
English (g)      70
English (h)      71
English (i)      65

a. Harris and Giannoulis, 1999. b. Durgunoglu and Oney, 1999. c. Cossu et al., 1988. d. Høien et al., 1995. e. Wimmer et al., 1991. f. Demont and Gombert, 1996. g. Liberman et al., 1974. h. Tunmer and Nesdale, 1985. i. Perfetti et al., 1987 (Grade 2 children).

language acquisition. This is presumably because of speech-perceptual factors that are common across all
languages using the syllable as the primary unit of
phonology (see Richardson et al., 2004; Ziegler and
Goswami, 2005, for relevant discussion).
As noted above, phoneme awareness is heavily dependent on letter learning. Awareness of phonemes
usually emerges fairly rapidly in languages with consistent orthographies, and in languages with simple
syllable structure (languages based on consonant–vowel (CV) syllables). In these languages, such as
Italian and Spanish, onset/rime segmentation (available prior to literacy) is equivalent to phonemic
segmentation (theoretically learned via literacy) for
many words (e.g., casa, mama). In Spanish and
Italian, one letter consistently maps to one phoneme.
Many of those phonemes are already represented in
the spoken lexicon of word forms, because they are
onsets and rimes (e.g., for a word like casa, the onset/rimes are /k/ /a/ /s/ /a/ and so are the phonemes).
Children learning to read consistent alphabetic orthographies with a simple syllable structure are best
placed to solve the mapping problem of mapping
units of print (letters) to units of sound (phonemes).
Children learning languages with more complex
syllable structures, such as German, face a more difficult mapping problem. In such languages, onset/rime
segmentation is not usually equivalent to phonemic
segmentation for most words. This is because most
words either have codas (consonant phonemes) after
the vowel (e.g., Hand) or complex (consonant cluster)
onsets (e.g., Pflaum [plum]). However, for languages
like German, the orthography is consistent: one letter

does map to one and only one phoneme. Hence letters are a consistent clue to phonemes. The German child
is still at an advantage in terms of developing phoneme awareness. The child faced with the most difficult mapping problem in initial reading is the child
learning to read an orthographically inconsistent language that also has a complex syllable structure.
Examples include English, French, Danish, and
Portuguese. For English, onset/rime segmentation is
rarely equivalent to phonemic segmentation. English
has a relatively large number of monosyllables
(around 4000), and of these only about 4.5% have a
CV structure (see De Cara and Goswami, 2002). One
letter does not consistently map to one phoneme for
reading; instead one letter may map to as many as five
or more phonemes (e.g., the letter A). Accordingly,
phonemic awareness develops relatively slowly in English-speaking children. This is illustrated by the phoneme counting studies carried out in different
languages summarized in Table 2.
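The two directions of inconsistency described above can be made concrete with toy mappings. The fragments below reuse the article's own examples (cough/rough/bough; pain/fin/hein); the 'one counterpart per unit' test is an illustrative simplification, not a full model of either orthography:

    # Feedforward inconsistency (symbol -> sound): English 'ough' has
    # several pronunciations (cough, rough, bough).
    feedforward = {'ough': {'/ɒf/', '/ʌf/', '/aʊ/'}}

    # Feedback inconsistency (sound -> symbol): French /ɛ̃/ has several
    # spellings (pain, fin, hein).
    feedback = {'/ɛ̃/': {'ain', 'in', 'ein'}}

    # A transparent orthography approximates 1:1 in both directions.
    transparent = {'m': {'/m/'}, 'a': {'/a/'}, 's': {'/s/'}}

    def consistent(mapping):
        """Toy test: every unit maps to exactly one counterpart."""
        return all(len(targets) == 1 for targets in mapping.values())

    print(consistent(feedforward), consistent(feedback))  # False False
    print(consistent(transparent))                        # True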
The development of phoneme awareness in different languages is mirrored by the ease of acquiring
grapheme–phoneme recoding skills, termed the 'sine qua non' of reading acquisition by Share (1995). As
pointed out by Share and many others, phonological
recoding (recoding letters into sounds) functions as a
self-teaching device, allowing the child successfully to
recode words that they have heard but never seen
before (see also Ehri, 1992). As may be expected
given the preceding analysis, grapheme–phoneme
recoding skills develop relatively rapidly in consistent
orthographies, and relatively slowly in inconsistent

Table 3  Data (% correct) from the COST A8 study of grapheme–phoneme recoding skills for monosyllables in 14 European languages (adapted from Seymour et al., 2003)

Language            Familiar real words    Nonwords
Greek               98                     97
Finnish             98                     98
German              98                     98
Austrian German     97                     97
Italian             95                     92
Spanish             95                     93
Swedish             95                     91
Dutch               95                     90
Icelandic           94                     91
Norwegian           92                     93
French              79                     88
Portuguese          73                     76
Danish              71                     63
Scottish English    34                     41

orthographies. This is shown most clearly by a recent study comparing grapheme–phoneme recoding skills
during the first year of schooling in the countries
making up the European Community (EC) at the
time that the data were gathered (Seymour et al.,
2003). The children in the study received simple
words and nonwords to recode, matched for familiarity as far as possible across orthography. Although
the age of school entry varies across the EC, the
success rates achieved by children in the different
countries appear very closely tied to the transparency
and phonological structure of the different languages.
Children learning to read consistent languages with
simple syllable structures (e.g., Finnish, Italian) were
close to ceiling in grapheme–phoneme recoding ability. Children learning to read inconsistent languages
with complex syllable structures (e.g., Danish,
French, English) were not. The data from this study
are reproduced in Table 3.
Developmental studies measuring children's phonological awareness in different languages allow
some simple conclusions. The awareness of syllables,
onsets, and rimes appears to emerge as a natural
consequence of language acquisition in typically developing children across languages (note: this is not
so for dyslexic children, see Developmental Dyslexia
and Dysgraphia). Awareness of these large units of
phonology is present by the age of around 3–4 years,
long before children go to school and begin being
taught to read. The awareness of phonemes does not
appear to emerge as a natural consequence of language acquisition. Rather, it is an effortful consequence of reading acquisition. The rate of
development of phonemic awareness varies systematically with the phonological structure of the language
being learned and its orthographic consistency.

Longitudinal Associations between Phonological Awareness and Reading Across Languages
The existence of a longitudinal connection between
individual differences in children's phonological
awareness measured prior to schooling and their
later progress in reading and spelling has been
known for at least 20 years. In a seminal study in
English, Bradley and Bryant (1983) demonstrated the
importance of onset/rime awareness for subsequent
reading development using the oddity task (e.g.,
which word does not rhyme: pin, win, sit). Bradley
and Bryant gave oddity tasks to 400 preschoolers
and found that onset/rime awareness was a significant predictor of their progress in reading and
spelling measured at 8 and 9 years. This longitudinal
correlation remained significant even when other factors such as IQ and memory were controlled in multiple regression equations. Subsequently, Maclean
et al. (1987) reported a significant connection between rhyming skills at age 3 measured via nursery
rhyme knowledge and single word reading at 4 years
and 6 months. Following up Maclean et al.'s sample
2 years later, Bryant et al. (1990) reported a significant relationship between nursery rhyme knowledge
at age 3 and success in reading and spelling at ages
5 and 6, even after factors such as social background
and IQ were controlled.
These findings for English have been replicated by
a number of other research groups. For example,
Burgess and Lonigan (1998) found that phonological
sensitivity measured in a large sample of 115 4- and
5-year-old children (measured by the oddity task and
tasks requiring children to blend and segment compound words into words or syllables) predicted performance in both letter-name and letter-sound
knowledge tasks 1 year later (called 'rudimentary reading skills' by the authors). Cronin and Carver (1998)
used an onset oddity task and a rhyme matching task
to measure phonological sensitivity in a group of 57
5-year-olds and found that phonological sensitivity
significantly discriminated the three different achievement levels used to group the children in terms of
reading ability at the end of first grade, even when
vocabulary levels were controlled. Baker et al. (1998)
showed that kindergarten nursery rhyme knowledge
was the strongest predictor of word attack and word
identification skills measured in grade 2, accounting
for 36% and 48% of the variance, respectively. The
second strongest predictor was letter knowledge,
which accounted for an additional 11% and 18% of
the variance, respectively. Note that these studies of
preschoolers do not typically use phonemic measures
of phonological awareness. This is because phonemic awareness is so difficult to measure in prereaders, unless it is awareness of onsets.
Studies in other languages support the research
findings typical of English, with some variation.
A German study using the oddity task tested children
(n = 183) in their first month of schooling, when they
were aged on average 6 years 11 months (Wimmer
et al., 1994). Follow-up measures of reading and
spelling were taken both 1 year later and 3 years
later. Wimmer et al. (1994) found that performance
in the oddity task was only minimally related to
reading and spelling progress in German children
when they were 7–8 years old (the same age as the children in Bradley and Bryant's study). However,
significant predictive relationships were found in
the 3-year follow-up, when the children were aged
on average 9 years 9 months. At this point, rime
awareness (although not onset awareness) was significantly related to both reading and spelling development. A large-scale longitudinal study carried out by
Lundberg et al. (1980) with 143 Swedish kindergartners found an earlier connection. They gave a number
of phonological awareness tests, including rhyme
production, phoneme segmentation, and phoneme
reversal to the children in kindergarten, and examined the predictive relationships between these tests
and reading attainment in second grade. Both the
rhyme test and the phoneme tests were significant
predictors of reading almost 2 years later. A study in
Norwegian found a similar pattern of results. Høien
et al. (1995) reported that syllable, rhyme, and phoneme awareness all made independent contributions
to variance in reading in a large group of 15,000
children. Finally, recent studies of Chinese children
also report longitudinal relationships between phonological awareness and reading, even though Chinese
children are learning a character-based rather than an
alphabetic orthography. For example, Ho and Bryant
(1997) gave a rime oddity task to 100 Chinese preschoolers aged on average 3 years 11 months. Performance was impressive, at 68% correct, and
significantly predicted progress in reading and spelling 2 years later, even after factors such as age, IQ,
and mother's educational level had been controlled.
This study demonstrates that the predictive relationships between large units and reading are found for
nonalphabetic orthographies, too.

Are Longitudinal Associations Evidence for Causal Connections?
Although the predictive relations found between
early phonological awareness and later reading and
spelling development are impressive, they do not necessarily mean that the connection is a causal one.

Even though most of the studies described above controlled for other variables such as IQ when computing longitudinal relationships, in order to demonstrate a causal connection it is necessary to intervene
directly. For example, if early phonological awareness
has a direct effect on how well a child learns to read
and spell, then guiding children to discover and attend to the phonological structure of language should
have a measurable impact on their reading progress.
A number of studies have used research designs
that included direct intervention. For example, as
part of the longitudinal study discussed earlier,
Bradley and Bryant (1983) took the 60 children in
their cohort of 400 who had performed most poorly
in the oddity task at 4 and 5 years of age, and gave
some of them 2 years of training in grouping words
on the basis of sounds using a picture-sorting task.
The children were taught to group words by onset,
rhyme, vowel, and coda phonemes (for example, placing pictures of a hat, a rat, a mat, and a bat together
for grouping by rhyme). A control group learned to
sort the same pictures by semantic category (e.g.,
farmyard animals). Half of the experimental group
also spent the 2nd year of the study matching plastic
letters to the shared phonological segments in words
like hat, rat, and mat. Following the intervention, the
children in the experimental group who had had plastic letter training were 8 months further on in reading
than the children in the semantic control group, and
12 months further on in spelling, even after adjusting
post-test scores for age and IQ. Compared to the
children who had spent the intervening period in an
unseen control group, they were an astonishing
24 months further on in spelling and 12 months in
reading.
Similar results were found in a large study of 235
Danish preschool children conducted by Lundberg
et al. (1988). Their training was much more intensive
than in the English study, and involved daily metalinguistic games and exercises, such as clapping out the
syllables in words and attending to the first sounds in
the childrens names. Training was for a period of
8 months and was aimed at 'guiding the children to discover and attend to the phonological structure of language' (Lundberg et al., 1988: 268). The effectiveness of the program was measured by comparing the
childrens performance in various metalinguistic tasks
after training to that of 155 children in an unseen
control group. The trained children were found to
be significantly ahead of the control children in a
variety of metalinguistic skills including rhyming, syllable manipulation, and phoneme segmentation. The
long-term effect of the training on the childrens
reading and spelling progress in grades 1 and 2 was
also assessed. The impact of the training was found to be significant at both grades, for both reading and spelling, although effects were stronger for spelling.
Two training studies conducted in German found
a similar pattern of results to those reported by
Lundberg et al. (1988). Schneider et al. (1997) developed a 6-month metalinguistic training program covering syllables, rhymes, and phonemes and gave this
to a sample of 180 kindergarten children. Reading
and spelling progress was measured in grades 1 and 2.
Schneider et al. found significant effects of the metalinguistic training program on metalinguistic skills in
comparison to an unseen control group, as would be
expected from Lundberg et al.'s (1988) results. They
also found significant long-term effects of metalinguistic training on reading and spelling progress,
with stronger effects for spelling.
In a second study, the same research group found
significant effects of the same training program on the
reading and spelling progress of 138 German children
assessed as being at risk for dyslexia in kindergarten
(Schneider et al., 2000). Prior to training, the at-risk
children were significantly poorer at rhyme production, rhyme matching, and syllable segmentation than
German control kindergartners, who were not
thought to be at risk. This study used an elegant research
design in which all children designated at risk received
training (either metalinguistic training alone, letter-sound training alone, or both together). Their progress was then compared to that of children from the
same kindergartens who had never been at risk. The
researchers found that the at-risk children who had
received the combined training program were not
significantly different in literacy attainment a year
into first grade when compared to those children
who had never been at risk and who had received no
kindergarten training. Interestingly, the at-risk group
who received letter-sound training alone, without
metalinguistic training, either performed at comparable levels in later reading and spelling progress to the
metalinguistic training alone group, or performed at
lower levels than this group. Both groups were
still significantly impaired in literacy attainment in
comparison to those children who had never been at
risk. This suggests that training either phonological
awareness alone or training letter-sound recoding
alone is insufficient. Both skills are important for
early literacy acquisition, at least for children thought
to be at risk of reading failure. Training one set of skills
without the other will not prevent literacy difficulties.

A Psycholinguistic Grain Size Model of Phonological Awareness and Literacy
As shown by the studies reviewed above, the development of reading and spelling depends on phonological

awareness across all languages so far studied. Although apparently universal, specific characteristics
of this developmental relationship appear to vary
with language. Of course, languages vary in the nature of their phonological structure, and also in the
consistency with which phonology is represented by
orthography. This variation means that there are predictable developmental differences in the ease with
which phonological awareness at different grain
sizes emerges across orthography (and also in the
grain size of lexical representations and developmental reading strategies across orthographies; see
Ziegler and Goswami, 2005).
According to psycholinguistic grain size theory,
beginning readers are faced with three problems:
availability, consistency, and granularity of symbol-to-sound mappings. The availability problem reflects
the fact that not all phonological units are accessible
prior to reading. Most of the research discussed in
this article has been related to the availability problem. Prior to reading, the main phonological units of
which children are aware are large units: syllables,
onsets, and rimes. In alphabetic orthographies, the
units of print available are single letters, which correspond to phonemes, units not yet represented in an
accessible way by the child. Thus, connecting orthographic units to phonological units that are not
yet readily available requires further cognitive development. The rapidity with which phonemic awareness is acquired seems to vary systematically with
orthographic consistency (see Table 2).
The consistency problem reflects the fact that some
orthographic units have multiple pronunciations and
that some phonological units have multiple spellings
(as discussed above, both feedforward and feedback
consistency are important; see Ziegler et al., 1997).
Psycholinguistic grain size theory assumes that both
types of inconsistency slow reading development.
Importantly, the degree of inconsistency varies both
between languages and for different types of orthographic units. For example, English has an unusually
high degree of feedforward inconsistency at the rime
level (from spelling to sound), whereas French has a
high degree of feedback inconsistency (from sound to
spelling; most languages have some degree of feedback inconsistency). This cross-language variation
makes it likely a priori that there will be differences
in reading development across languages, and indeed
there are (see Table 3, and Ziegler and Goswami,
2005, for a more comprehensive discussion). Finally,
the granularity problem reflects the fact that there are
many more orthographic units to learn when access
to the phonological system is based on bigger grain
sizes as opposed to smaller grain sizes. That is, there
are more words than there are syllables, more syllables than there are rimes, more rimes than there are graphemes, and more graphemes than there are
letters. Psycholinguistic grain size theory argues that
reading proficiency depends on the resolution of these
three problems, which will of necessity vary by orthography. For example, children learning to read
English must develop multiple strategies in parallel
in order to become successful readers. They need
to develop whole word recognition strategies and
rhyme analogy strategies (beak-peak, Goswami,
1986) in addition to grapheme-phoneme recoding
strategies in order to become efficient at decoding
print. The English orthography is characterized by
both feedforward and feedback inconsistency.

Conclusion
There is now overwhelming evidence for a causal link
between children's phonological awareness skills and
their progress in reading and spelling across languages. Indeed, the demonstration of the importance
of phonological awareness for literacy has been
hailed as a success story of developmental psychology
(see Adams, 1990; Lundberg, 1991; Stanovich,
1992). Nevertheless, perhaps surprisingly, there are
still those who dispute that the link exists. For example, Castles and Coltheart (2004) recently argued that
"no single study has provided unequivocal evidence that there is a causal link from competence in phonological awareness to success in reading and spelling acquisition" (Castles and Coltheart, 2004: 77). In a
critical review, they considered and dismissed studies regarded by developmental psychologists as
very influential (for example, the large-scale studies
by Bradley and Bryant, 1983; Bryant et al., 1990;
Lundberg et al., 1988; and Schneider et al., 1997,
2000 described in this article). This is surprising,
because these studies used strong research designs
whereby (a) they were longitudinal in nature; (b) they
began studying the participants when they were
prereaders; and (c) they tested the longitudinal correlations found between phonological awareness and
literacy via intervention and training, thereby demonstrating a specific link that did not extend (for
example) to mathematics.
However, the apparent conundrum is easily solved.
Castles and Coltheart based their critique on two
a priori assumptions concerning phonological development and reading acquisition that are misguided. One was that "the most basic speech units of a language are phonemes" (Castles and Coltheart, 2004: 78), and the second was that it is impossible
to derive a pure measure of phonological awareness if
a child knows any alphabetic letters (Castles and
Coltheart, 2004: 84). In fact, many psycholinguists

argue that the most basic speech units of a language


are syllables, not phonemes. Phonemes are not basic
speech units prior to literacy; indeed, letter learning is
required in order for awareness of phonemes to develop. Measures of phonological awareness in preschoolers are syllable, onset, and rime measures,
which can be administered as early as age 2 or 3.
Phonological awareness of these units does seem
to develop in the absence of letter knowledge (recall,
for example, the good onset/rime skills of 3-year-old
Chinese children). Taken to the extreme, however, the
second assumption (that phonological awareness
measures are impure once letter knowledge commences) is difficult to tackle. In alphabetic languages,
it is very difficult to find preschoolers who know no
letters at all. Even 2-year-olds in literate societies
tend to know the letters in their names, and thereby
probably know at least 4-5 letters.
Nevertheless, the balance of the evidence supports
a fundamental relationship between a child's phonological sensitivity and their acquisition of reading and
spelling skills. While the specific tasks and levels of
phonological awareness that will best predict literacy
are likely to depend on an individual's level of development, there does seem to be an apparently universal sequence of development from awareness of large
units (syllables, onsets, rimes) to awareness of small
units (phonemes). Within this apparently universal
sequence of development, variations in phonological
structure, and variations in the consistency with
which phonology is represented in orthography, generate cross-language differences. The nature of these
cross-language differences can be predicted a priori
by considering the availability and consistency of
phonological and orthographic units at different
grain sizes.
See also: Developmental Dyslexia and Dysgraphia;
Reading Processes in Adults; Reading Processes in Children.

Bibliography
Adams M J (1990). Beginning to read: thinking and learning about print. Cambridge, MA: MIT Press.
Anthony J L & Lonigan C J (2004). The nature of phonological awareness: converging evidence from four studies of preschool and early grade school children. Journal of Educational Psychology 96, 43-55.
Anthony J L, Lonigan C J, Burgess S R, Driscoll K, Phillips B M & Cantor B G (2002). Structure of preschool phonological sensitivity: overlapping sensitivity to rhyme, words, syllables, and phonemes. Journal of Experimental Child Psychology 82, 65-92.
Anthony J L, Lonigan C J, Driscoll K, Phillips B M & Burgess S R (2003). Phonological sensitivity: a quasi-parallel progression of word structure units and cognitive operations. Reading Research Quarterly 38(4), 470-487.
Baker L, Fernandez-Fein S, Scher D & Williams H (1998). Home experiences related to the development of word recognition. In Metsala J L & Ehri L C (eds.) Word recognition in beginning literacy. Hillsdale, NJ: Lawrence Erlbaum Associates. 263-287.
Bradley L & Bryant P E (1983). Categorising sounds and learning to read: a causal connection. Nature 301, 419-421.
Bryant P E, Maclean M, Bradley L & Crossland J (1990). Rhyme, alliteration, phoneme detection, and learning to read. Developmental Psychology 26, 429-438.
Burgess S R & Lonigan C J (1998). Bidirectional relations of phonological sensitivity and prereading abilities: evidence from a preschool sample. Journal of Experimental Child Psychology 70, 117-142.
Castles A & Coltheart M (2004). Is there a causal link from phonological awareness to success in learning to read? Cognition 91, 77-111.
Chaney C (1992). Language development, metalinguistic skills and print awareness in 3-year-old children. Applied Psycholinguistics 13, 485-514.
Cossu G, Shankweiler D, Liberman I Y, Katz L & Tola G (1988). Awareness of phonological segments and reading ability in Italian children. Applied Psycholinguistics 9, 1-16.
Cronin V & Carver P (1998). Phonological sensitivity, rapid naming and beginning reading. Applied Psycholinguistics 19, 447-461.
De Cara B & Goswami U (2002). Statistical analysis of similarity relations among spoken words: evidence for the special status of rimes in English. Behavioural Research Methods and Instrumentation 34(3), 416-423.
Demont E & Gombert J E (1996). Phonological awareness as a predictor of recoding skills and syntactic awareness as a predictor of comprehension skills. British Journal of Educational Psychology 66, 315-332.
Durgunoglu A Y & Oney B (1999). A cross-linguistic comparison of phonological awareness and word recognition. Reading & Writing 11, 281-299.
Ehri L C (1992). Reconceptualizing the development of sight word reading and its relationship to recoding. In Gough P B, Ehri L C & Treiman R (eds.) Reading acquisition. Hillsdale, NJ: Lawrence Erlbaum Associates. 105-143.
Goswami U (1986). Children's use of analogy in learning to read: a developmental study. Journal of Experimental Child Psychology 42, 73-83.
Goswami U & Bryant P E (1990). Phonological skills and learning to read. Hillsdale, NJ: Lawrence Erlbaum.
Goswami U & East M (2000). Rhyme and analogy in beginning reading: conceptual and methodological issues. Applied Psycholinguistics 21, 63-93.
Goswami U, Ziegler J, Dalton L & Schneider W (2001). Pseudohomophone effects and phonological recoding procedures in reading development in English and German. Journal of Memory & Language 45, 648-664.
Goswami U, Ziegler J, Dalton L & Schneider W (2003). Nonword reading across orthographies: how flexible is the choice of reading units? Applied Psycholinguistics 24, 235-247.
Harris M & Giannouli V (1999). Learning to read and spell in Greek: the importance of letter knowledge and morphological awareness. In Harris M & Hatano G (eds.) Learning to read and write: a cross-linguistic perspective. Cambridge: Cambridge University Press. 51-70.
Ho C S-H & Bryant P (1997). Phonological skills are important in learning to read Chinese. Developmental Psychology 33, 946-951.
Høien T, Lundberg I, Stanovich K E & Bjaalid I K (1995). Components of phonological awareness. Reading & Writing 7, 171-188.
Liberman I Y, Shankweiler D, Fischer F W & Carter B (1974). Explicit syllable and phoneme segmentation in the young child. Journal of Experimental Child Psychology 18, 201-212.
Lundberg I (1991). Phonemic awareness can be developed without reading instruction. In Brady S A & Shankweiler D P (eds.) Phonological processes in literacy: a tribute to Isabelle Liberman. Hillsdale, NJ: Erlbaum.
Lundberg I, Olofsson A & Wall S (1980). Reading and spelling skills in the first school years predicted from phonemic awareness skills in kindergarten. Scandinavian Journal of Psychology 21, 159-173.
Lundberg I, Frost J & Petersen O (1988). Effects of an extensive programme for stimulating phonological awareness in pre-school children. Reading Research Quarterly 23, 263-284.
MacLean M, Bryant P E & Bradley L (1987). Rhymes, nursery rhymes and reading in early childhood. Merrill-Palmer Quarterly 33, 255-282.
Metsala J L (1999). Young children's phonological awareness and nonword repetition as a function of vocabulary development. Journal of Educational Psychology 91, 3-19.
Morais J, Cary L, Alegria J & Bertelson P (1979). Does awareness of speech as a sequence of phones arise spontaneously? Cognition 7, 323-331.
Perfetti C A, Beck I, Bell L & Hughes C (1987). Phonemic knowledge and learning to read are reciprocal: a longitudinal study of first grade children. Merrill-Palmer Quarterly 33, 283-319.
Porpodas C D (1999). Patterns of phonological and memory processing in beginning readers and spellers of Greek. Journal of Learning Disabilities 32, 406-416.
Richardson U, Thomson J, Scott S K & Goswami U (2004). Auditory processing skills and phonological representation in dyslexic children. Dyslexia 10, 215-233.
Schneider W, Kuespert P, Roth E, Vise M & Marx H (1997). Short- and long-term effects of training phonological awareness in kindergarten: evidence from two German studies. Journal of Experimental Child Psychology 66, 311-340.
Schneider W, Roth E & Ennemoser M (2000). Training phonological skills and letter knowledge in children at-risk for dyslexia: a comparison of three kindergarten intervention programs. Journal of Educational Psychology 92, 284-295.
Seymour P H K, Aro M & Erskine J M (2003). Foundation literacy acquisition in European orthographies. British Journal of Psychology 94(2), 143-172.
Share D L (1995). Phonological recoding and self-teaching: sine qua non of reading acquisition. Cognition 55, 151-218.
Siok W T & Fletcher P (2001). The role of phonological awareness and visual-orthographic skills in Chinese reading acquisition. Developmental Psychology 37, 886-899.
Stanovich K E (1992). Speculations on the causes and consequences of individual differences in early reading acquisition. In Gough P B, Ehri L C & Treiman R (eds.) Reading acquisition. Hillsdale, NJ: Lawrence Erlbaum Associates. 307-342.
Treiman R (1988). The internal structure of the syllable. In Carlson G & Tanenhaus M (eds.) Linguistic structure in language processing. Dordrecht, The Netherlands: Kluwer. 27-52.
Treiman R & Zukowski A (1991). Levels of phonological awareness. In Brady S & Shankweiler D (eds.) Phonological processes in literacy. Hillsdale, NJ: Erlbaum.
Tunmer W E & Nesdale A R (1985). Phonemic segmentation skill and beginning reading. Journal of Educational Psychology 77, 417-427.
Wimmer H, Landerl K, Linortner R & Hummer P (1991). The relationship of phonemic awareness to reading acquisition: more consequence than precondition but still important. Cognition 40, 219-249.
Wimmer H, Landerl K & Schneider W (1994). The role of rhyme awareness in learning to read a regular orthography. British Journal of Developmental Psychology 12, 469-484.
Ziegler J C, Stone G O & Jacobs A M (1997). What's the pronunciation for -OUGH and the spelling for /u/? A database for computing feedforward and feedback inconsistency in English. Behavior Research Methods, Instruments, & Computers 29, 600-618.
Ziegler J C & Goswami U C (2005). Reading acquisition, developmental dyslexia and skilled reading across languages: a psycholinguistic grain size theory. Psychological Bulletin 131(1), 3-29.

Phonological Change in Optimality Theory


R Bermúdez-Otero, University of Manchester, Manchester, UK

© 2006 Elsevier Ltd. All rights reserved.

As has normally been the case for all major phonological frameworks, the relationship between Optimality Theory (OT) and historical phonology works
both ways: OT provides new angles on long-standing
diachronic questions, whereas historical data and
models of change bear directly on the assessment of
OT. For our purposes, it is convenient to classify
phonological changes under two headings, roughly
corresponding to the neogrammarian categories of
sound change and analogy:

1. In phonologization, extragrammatical phonetic effects give rise to new phonological patterns.
2. In reanalysis, a conservative grammar is replaced by an innovative grammar that generates some of the old phonological output in a new way.

In this light, one can see that phonological change raises two main questions for OT:

1. Is markedness a mere epiphenomenon of recurrent processes of phonologization, or does markedness on the contrary constrain both phonologization and reanalysis?
2. What optimality-theoretic resources best explain reanalysis: input optimization, innate biases in the ranking of output-output correspondence constraints, both, or neither?

The answers to these questions may require OT to depart significantly from the form in which it was first proposed (Prince and Smolensky, 1993). OT may need to acknowledge that markedness constraints are not innate but are rather constructed by the child during acquisition, and it may need to adopt a stratal-cyclic approach to morphology-phonology and syntax-phonology interactions.

The Role of Markedness in Phonological Change

OT asserts that speakers of natural languages know implicitly that certain phonological structures are dispreferred or suboptimal. This knowledge is represented in their grammars by means of violable markedness constraints, such as the following:

(1a) VOICEDOBSTRUENTPROHIBITION
     Assign one violation mark for every segment bearing the features [-sonorant, +voice].
(1b) CODACOND-[voice]
     Assign one violation mark for every token of the feature [voice] that is exhaustively dominated by rhymal segments.

Particular languages impose relationships of strict dominance on a universal set of constraints CON. According to factorial typology, the class of possible natural languages is defined by all the ranking permutations of CON.
In OT, the hypothesis that CON includes constraints
against voiced obstruents (1a) but not against voiceless obstruents, and against voice oppositions in the
syllable coda (1b) but not in the onset, explains the
statements in (2). (2a) is formulated as an absolute
negative universal and (2b) as an implicational universal. Both are representative of the class of typological generalizations known as markedness laws
(Hayes and Steriade, 2004: 3).
(2a) No language requires obstruents to be voiced in
the coda.
(2b) If a language licenses voice contrasts in the
coda, then it also licenses voice contrasts in
the onset.
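To make the mechanics concrete, the toy sketch below simulates strict dominance and factorial typology. It is an illustration only: the input /tad/, the candidate set, the violation counts, and the faithfulness constraint IDENT-voice standing alongside (1a) and (1b) are all invented for the example, not taken from the literature.

```python
# Illustrative OT evaluator (a sketch; all names and counts invented).
# A grammar is a ranking of CON; the winning candidate is the one with
# the lexicographically smallest violation profile under that ranking.
from itertools import permutations

# Hypothetical candidates for an input /tad/, scored against the
# markedness constraints in (1) plus a faithfulness constraint.
CANDIDATES = {
    "tad": {"VOICEDOBSPROHIB": 1, "CODACOND-voice": 1, "IDENT-voice": 0},
    "tat": {"VOICEDOBSPROHIB": 0, "CODACOND-voice": 0, "IDENT-voice": 1},
}

def winner(ranking):
    # min() compares the violation vectors constraint by constraint,
    # which is exactly evaluation under strict dominance.
    return min(CANDIDATES, key=lambda c: [CANDIDATES[c][k] for k in ranking])

# Factorial typology: every ranking permutation of CON is a possible
# grammar. Faithful [tad] survives only where IDENT-voice is top-ranked.
for ranking in permutations(["VOICEDOBSPROHIB", "CODACOND-voice", "IDENT-voice"]):
    print(" >> ".join(ranking), "->", winner(ranking))
```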
Phonological Change and the Problem of Grounding

A major question arises regarding the fact that most of the markedness constraints posited by optimality-theoretic phonologists have proved to be grounded
in phonetics. Consider, for example, the phonetic
motivation of (1b). A key phonetic cue to obstruent
voice specifications is voice onset time (VOT), the
duration (positive or negative) of the interval between
the offset of an obstruent and the first glottal pulse for
a following sonorant. By definition, VOT is available
only in presonorant contexts. In consequence, VOT
cues are frequently absent from the coda, and so voice
contrasts are less perceptible syllable-finally than syllable-initially. Thus, (1b) bans voice oppositions in
an environment in which they are difficult to realize
phonetically. The problem for OT is to account
for this relationship between markedness constraints
qua internal grammatical entities and the external
phonetic phenomena on which they are grounded.
There have been two main types of response to the
problem of grounding: diachronic reductionism and
nonreductionism. Diachronic reductionists argue that
markedness laws such as (2) are epiphenomena of
phonetically driven changes, and that postulating
markedness constraints such as (1) in the theory of
grammar is therefore unnecessary. The proponents of
diachronic reductionism typically adopt Ohala's
(1992) model of phonologization by misparsing, in
which phonological structures that pose articulatory,
acoustic, or auditory difficulties suffer from higher
rates of misperception and are therefore more likely
to be inadvertently altered or lost in historical change.

In this view, no language licenses voice contrasts in the coda while neutralizing them in the onset simply because if listeners have historically managed
to parse the signal correctly in phonetically unfavorable environments (i.e., in the coda, where VOT cues
are often absent), they will a fortiori have succeeded
in phonetically favorable environments (i.e., in the
onset).
Two sharply different groups of phonologists have
espoused diachronic reductionism with respect to
markedness. One group comprises formalist (Hyman,
2001) and radically formalist (Hale and Reiss, 2000)
linguists who insist that phonology is autonomous and
phonetically arbitrary, and it must accordingly
be strictly separated from phonetics. The other group
consists of radically functionalist phonologists and
phoneticians who deny the existence of autonomous
principles of phonological organization and for whom
phonology emerges from phonetics in the process of
language use (Blevins, 2004; Bybee, 2001). Despite
their irreconcilable differences, both groups agree that
the problem of grounding is fatal to OT.
Nonreductionists, in contrast, maintain that markedness constraints, even if grounded on phonetic
phenomena, are nonetheless indispensable components of phonological grammar. A sizable subgroup
of nonreductionists account for grounding by suggesting that markedness constraints are neither innate
nor acquired by induction over the primary linguistic
data but are rather constructed by the child on the
basis of his or her experience of phonetic difficulty
in performance (Bermúdez-Otero and Börjars, 2006;
Boersma, 1998; Hayes, 1999; Hayes and Steriade,
2004). In the historical arena, some adherents of OT
have opposed diachronic reductionism by advancing
the argument that markedness constraints impose key
restrictions on phonological changes, whether driven
by phonologization or by reanalysis (Bermúdez-Otero and Börjars, 2006; Kiparsky, 2004).
Diachronic Arguments against Markedness Constraints

The reductionist critique of OT relies heavily on Ockham's razor. Reductionists contend that phonetic
factors suffice to account for the dispreferred status of
phonological entities such as voiced obstruents or
voice features licensed by coda consonants; they see
no reason for positing a cognitive representation of
these factors in the shape of optimality-theoretic
markedness constraints, which are therefore deemed
superfluous. As noted by Bermúdez-Otero and Börjars (2006: §6.1), implicit in this argument is a crucial
claim: The fact that learners acquire phonological
grammars containing apparently markedness-driven

processes and complying with markedness laws is held not to raise Plato's Problem (Chomsky, 1986).
Thus, diachronic reductionists follow Ohala in
emphasizing the role of the parser in phonologization
and downplaying the contribution of higher principles of grammatical organization. For Ohala, the
conditions for phonologization arise when the parser,
as it filters out noise from the phonetic signal, errs
either by excess (hypercorrection) or by defect (hypocorrection). Diachronic reductionists assume that, at
this point, the innovative patterns present in the distorted data delivered by the parser are incorporated
into the grammar of the learner (or, for functionalist
reductionists, of the adult listener) by mechanisms of
induction (or cognitive entrenchment, pattern association, Hebbian learning, etc.); see Hale (2000:
252) and Bybee (2001).
Diachronic reductionists reinforce their application
of Ockham's razor with the argument that OT's factorial typology technique is empirically inadequate in
that it is simultaneously too permissive (predicting
unattested language types) and too restrictive (failing
to allow for exceptions to markedness laws). The
overpermissiveness problem concerns gaps in factorial typology. Consider, for example, *NC̥, the markedness constraint that penalizes sequences of a nasal followed by a voiceless obstruent. Permuting the ranking of *NC̥ relative to independently motivated faithfulness constraints predicts a wide range of repair strategies for NC̥ clusters, of which some are attested (e.g., postnasal voicing, nasal deletion, and denasalization) and some are not (e.g., metathesis and vowel epenthesis). The gaps, it is argued, cannot be eliminated by revising the theory of phonological representations or the composition of CON; rather, the unattested repairs are held to be impossible because they cannot arise from a phonetically driven change or series of changes (Myers, 2002).
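The overprediction charge can be replayed in the same computational style. In the sketch below (again purely illustrative: the candidate forms and the faithfulness constraint names are invented stand-ins, not a worked analysis from the text), each repair of a nasal + voiceless obstruent cluster violates exactly one constraint, so every repair, attested or not, wins under some ranking permutation:

```python
# Sketch of the *NC̥ gap argument: factorial typology predicts every
# repair that some ranking favors. Candidates and constraint names
# are illustrative stand-ins only.
from itertools import permutations

# Candidates for input /anta/: faithful form, postnasal voicing,
# nasal deletion, denasalization, metathesis, vowel epenthesis.
VIOLATES = {
    "anta": "*NC",          # faithful: keeps the offending cluster
    "anda": "IDENT-voice",  # postnasal voicing
    "ata": "MAX",           # nasal deletion
    "atta": "IDENT-nasal",  # denasalization
    "atna": "LINEARITY",    # metathesis (unattested as a repair)
    "anita": "DEP",         # vowel epenthesis (unattested as a repair)
}
CON = ["*NC", "IDENT-voice", "MAX", "IDENT-nasal", "LINEARITY", "DEP"]

def winner(ranking):
    # Each candidate violates one constraint once; the winner is the
    # candidate whose violated constraint is ranked lowest.
    return min(VIOLATES, key=lambda c: [int(VIOLATES[c] == k) for k in ranking])

# Every candidate wins under some ranking: six predicted "languages",
# although only some of the corresponding repairs are attested.
print({winner(r) for r in permutations(CON)})
```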
Factorial typology is also charged with excessive
restrictiveness because it fails to allow for so-called
"crazy rules" (Bach and Harms, 1972), which are
claimed to violate markedness laws. The following
examples have figured prominently in the debate:
1. In some dialects of English, an intrusive [r] or [l] is
inserted in certain hiatus contexts. This is alleged
to refute OTs prediction that epenthetic segments
should be unmarked (Blevins, 1997; Hale and
Reiss, 2000; McMahon, 2000).
2. Lezgi exhibits monosyllabic nouns in which a long
voiced plosive in the coda alternates with a short
plain voiceless plosive in the onset (3). This alternation is claimed to violate the markedness law in
(2a) (Yu, 2004).

(3) Lezgi's crazy alternation

        SING    PL/OBL
        pab:    papa      'wife'
        gad:    gatu      'summer'
        meg:    meker     'hair'

Such phenomena are significant for two reasons. First, OT's critics contend that the existence of crazy rules
shows markedness laws to be mere typological tendencies rather than strict universals. If so, explanations for markedness laws should be sought in
diachronic emergence rather than in universal phonological principles such as CON: notably, universals
derived by factorial typology should be exceptionless.
Second, crazy rules are created by processes of reanalysis, such as rule telescoping and rule inversion.
In English, for example, r- and l-intrusion are commonly assumed to have arisen through the inversion of
older natural rules that deleted /r/ and /l/ in nonprevocalic environments (Figure 1). Diachronic reductionism predicts this state of affairs: It is argued that
markedness laws are typological tendencies that
emerge from recurrent processes of phonetically
driven change; reanalysis is expected not to obey
markedness laws because it is not driven by phonetics
but by cognitive principles governing the relationship
between phonology, morphology, and the lexicon.

Figure 1 Linking r is replaced by intrusive r through rule inversion (based on Vennemann, 1972: §2). Phonological environments
are stated segmentally in order to avoid controversial analytical
commitments on syllabic affiliation. The distributional statements
are merely descriptive. In stratal versions of OT, intrusive r can be
analyzed as a two-step process, with FINALC driving insertion of [r]
in the environment Vi__o] at the (stem and) word level, followed by
deletion of nonprevocalic [r] at the phrase level.



Diachronic Arguments for Markedness Constraints

OT supporters challenge many of the basic assumptions of diachronic reductionism. Hayes and Steriade
(2004: 26-27), for example, rejected Ohala's model
of phonologization by misparsing, asserting instead
that phonological innovations typically originate in
child errors retained into adulthood and propagated
to the speech community. Child errors normally reflect endogenous strategies for adapting the adult
phonological repertory to the child's restricted production capabilities. It is assumed that these strategies capitalize on knowledge acquired through the child's
experience of phonetic difficulty and represented by
means of markedness constraints (Bermúdez-Otero and Börjars, 2006; Boersma, 1998; Hayes, 1999).
Compared with Ohala's, this approach to phonologization assigns a very different, although equally crucial, role to factors related to perception: highly
salient errors are less likely to be retained into adulthood or to be adopted by other speakers. In this light,
consider Amahl's famous puzzle-puddle shift (Smith,
1973):
(4)            Adult target    Amahl's adaptation
    puzzle     [pʌzəl]         [pʌdəl]
    puddle     [pʌdəl]         [pʌgəl]

This shift was endogenous; it did not originate in misparsing, because Smith showed that Amahl's phonological discrimination was adult-like. Anecdotally, a strategy such as Amahl's may well have given rise to the English word tickle, which is related to Old Norse kitla and/or Latin titillāre but is not now perceived as a childish mispronunciation.
On a different front, Kiparsky (2004) proposed a
set of criteria for distinguishing between mere typological tendencies and strictly universal markedness
laws, which admit no exceptions. Kiparsky regards
(2a) as a strict universal: he rejected Yu's claim that the
Lezgi data in (3) provide a counterexample to (2a),
arguing that the alternating plosives in (3) derive from
underlying voiced geminates. Kiparsky nonetheless
accepted that explanations based on diachronic emergence are appropriate for typological tendencies. If,
accordingly, CON is held accountable only for nonemergent strict universals, then gaps in factorial typology need not be fatal to OT. Along these lines, the
refutation of diachronic reductionism involves two
tasks:
1. To show that maintaining compliance with strictly
universal markedness laws in the course of phonological change raises Plato's Problem and therefore
requires the postulation of markedness constraints.
2. To show that there are no radically crazy phonological rules, understood as genuine phonological processes that violate the strict universals implicit in CON.
According to Kiparsky (2004) and Bermúdez-Otero and Börjars (2006), the fact that languages continue to comply with universal markedness laws despite constant change raises Plato's Problem. The claim rests on what Bermúdez-Otero and Börjars call the Jakobson-Kiparsky argument (Jakobson, 1929), neatly summarized in Kiparsky's (2004) dictum: "Whatever arises through language change can be lost through language change." Note that, like the neogrammarians', Ohala's theory of hypocorrection predicts that phonologization is blind: it is driven by local phonetic properties and operates without regard
for its global effects on the phonological system. However, a sequence of blind changes could easily lead to
the violation of a universal markedness law. To explain why this does not happen, one must postulate
grammatical principles (such as optimality-theoretic
markedness constraints) that block phonologization
or force a reanalysis in the relevant situations.
Consider, for example, a hypothetical language
with the following properties:
1. A two-way laryngeal contrast between plain and
voiced obstruents
2. No closed syllables
3. Syllabic trochees and a ban on degenerate feet.
Suppose now that, historically, this language becomes
subject to two phonological changes, applying in the
following order: (1) a lenition process voicing plain
obstruents in foot-internal position and (2) a process
of apocope creating closed syllables in word-final
position. The outcome of this scenario is given in
Table 1. Once lenition and apocope have taken
place, children are exposed to adult data in which
all coda obstruents are voiced. In this situation, diachronic reductionism predicts that learners will inductively acquire the following phonotactic
generalization:
(5) If an obstruent is in the rhyme, it must be voiced.

The knowledge that (5) is incorrect and, in fact, universally impossible is clearly beyond the reach of
inductive cognitive mechanisms.
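The scenario can even be simulated mechanically. The Python sketch below is my illustration only: the string encoding (with ' marking stress and . marking syllable boundaries) and the crude rule functions are invented approximations. It applies lenition and then apocope to the four hypothetical forms of Table 1, yielding surface data in which every coda obstruent is voiced:

```python
# Toy simulation of the Table 1 scenario (illustrative only).
# Forms are encoded as a'CV.CV; t stands for a plain obstruent,
# d for a voiced one.

def lenition(form):
    # Voice a plain obstruent foot-internally: here, crudely, in the
    # onset of the unstressed syllable that follows the stressed one.
    head, _, tail = form.partition("'")
    syllables = tail.split(".")
    if len(syllables) == 2:
        syllables[1] = syllables[1].replace("t", "d", 1)
    return head + "'" + ".".join(syllables)

def apocope(form):
    # Delete the word-final vowel; the stranded consonant joins the
    # preceding syllable, so the boundary mark disappears with it.
    return form[:-1].replace(".", "")

for form in ["a'ta.ta", "a'ta.da", "a'da.ta", "a'da.da"]:
    lenited = lenition(form)
    print(form, "->", lenited, "->", apocope(lenited))
# Every output ends in a voiced coda obstruent ('tad or 'dad),
# inviting the spurious induction in (5).
```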
Table 1 A diachronic scenario potentially conflicting with markedness law (2a)

    1. Initial state    2. Lenition    3. Apocope
    aˈta.ta             aˈta.da        aˈtad
    aˈta.da             aˈta.da        aˈtad
    aˈda.ta             aˈda.da        aˈdad
    aˈda.da             aˈda.da        aˈdad

In OT, in contrast, the emergence of (5) is blocked because (5) is not a member of CON, nor is there a ranking of CON capable of replicating its effects. Even


if markedness constraints are constructed rather
than innate, the child will not add (5) to his or her
constraint set simply because plain obstruents are
easier to produce than voiced ones. In our scenario,
therefore, OT predicts that learners will interpret the
absence of voiceless obstruents in the coda either as
a lexical fact or as a result of morphological processes
that are not phonotactically driven. Notably, (5) will
fail to display the properties of productive phonological generalizations, such as application to neologisms
and nativized loans. In support of this conclusion,
note that, as expected, Lezgi has not extended the
crazy alternation in (3) to loans from Turkic, Arabic,
or other Lezgian languages (Yu, 2004: §5.7).
Bermúdez-Otero and Börjars (2006: §6.5) deploy similar arguments in their discussion of l-intrusion in American English dialects (Gick, 1999: §3). Historically, intrusive l arose through the inversion mechanism shown in Figure 1. This involved the reanalysis of linking l patterns such as those in (6), where prevocalic [l] alternates with zero nonprevocalically after /ɔː/ and /ə/.

(6a) [drɔː] drawl          [drɔːlɪŋ] drawling
(6b) [kruːwə] cruel        [kruːwəl ækt] cruel act

Nonetheless, Gick reports that most dialects exhibit l-intrusion only after /ɔː/:

(7a) l-intrusion:       after /ɔː/    the law[l] is . . .
(7b) no l-intrusion:    after /ɑː/    the bra is . . .
                        after /ə/     the idea is . . .

Since the alternations in (6a) and (6b) are entirely parallel, this restriction is unexpected: Why should linking l have undergone inversion after /ɔː/ but not after /ɑː/ or /ə/? The key lies in the fact that, in these dialects, the V-place features of /l/ are identical with those of /ɔː/ but different from those of /ɑː/ and /ə/.

Accordingly, [l] is inserted only when it can acquire its V-place through spreading from the preceding vowel (Figure 2). Bermúdez-Otero and Börjars argue that
this reanalysis transcends mere inductive generalization and therefore lies beyond the reach of the impoverished learner assumed by diachronic reductionists:
Using their knowledge of markedness, children rejected
highly marked [l] as an epenthetic hiatus breaker,
except where its V-place features were already available in the local context. This shows that the rule of
l-intrusion is not radically crazy and, more generally,
that phonological processes (as opposed to morphological or lexical patterns) created by reanalysis are
constrained by markedness.

The Role of Input Optimization in Reanalysis
The diachronic scenario outlined in Table 1 and the
evidence of l-intrusion analyzed in Figure 2 indicate
that markedness laws play a crucial role in controlling reanalysis: CON forces learners to analyze
certain patterns in the primary linguistic data as
being partly or wholly lexical or morphological, rather than phonological. We now consider how other
components of OT contribute to our understanding
of phonological reanalysis. This section focuses on
OT's principles for the selection of input representations. These principles are referred to using the term input optimization rather than Prince and Smolensky's (1993: §9.3) lexicon optimization because the
latter begs the question whether or not phonology is
stratified; in a stratal model, inputs to noninitial
levels need not be stored in the lexicon, although
they can be (Bermúdez-Otero, 2006).
According to a long tradition of research in diachronic generative phonology, one of the main mechanisms of analogical change is input restructuring. The history of Yiddish provides a well-known example (Bermúdez-Otero and Hogg, 2003: §1.3 and references therein). In Middle High German, word-final obstruents were subject to devoicing; in Yiddish, however, the alternations created by devoicing were leveled, with underlying voiced obstruents freely surfacing in word-final position.

Figure 2 English /l/ is a complex segment with coronal C-place and dorsal V-place. In most American English dialects with l-intrusion, the insertion of [l] as a hiatus breaker is licensed by V-place sharing with /ɔː/ but blocked by V-place disagreement with /ə/. The feature geometry assumed here, with the C-place node dominating the V-place node, is standard but is not crucial to the argument.

Table 2 Yiddish: surface underapplication of final devoicing

              'day'                                  'gift'
              before /-e/ loss   after /-e/ loss     before /-e/ loss   after /-e/ loss
NOM/ACC.SG    tac                tac                 gebe               geb
GEN.SG        tages              tages               gebe               geb
DAT.SG        tage               tag                 gebe               geb
NOM/ACC.PL    tage               tag                 gebe               geb
GEN.PL        tage               tag                 geben              geben
DAT.PL        tagen              tagen               geben              geben
(8)                        'day'    'days'
    Old High German        tag      tag-a
    Middle High German     tac      tag-e
    Yiddish                tog      teg

The conditions for reanalysis were created by a phonological process of apocope that targeted final /-e/. Apocope caused final devoicing to underapply massively in surface representations (Table 2). In (9), it is shown that a synchronic grammar could recapitulate this historical development by means of two phonological processes applying in counterfeeding order.
(9)                SING      PL
    input          /tag/     /tag-e/
    devoicing      tak
    apocope                  tag
    output         [tak]     [tag]

Yiddish learners, however, failed to acquire such a grammar: They were unable to posit inflectional
suffixes consisting of underlying /-e/ because this
vowel never surfaced.
Rule-based theories of phonology have never succeeded in delivering a satisfactory account of such
instances of input restructuring. Two fundamental
problems stand in their way. First, the learner's choice of input representations must be informed by the system of input-output mappings that he or she has acquired (and vice versa; Tesar and Smolensky, 2000: §5.2). However, rule-based theory has never produced an explicit formal account of the acquisition
of rule systems. To do so is probably impossible
because the grammar space defined by rule-based
phonological models is too poorly structured to be
searched effectively by an informed learner.
In rule-based frameworks, moreover, the formal
demands of descriptive adequacy conflict with the empirical evidence of acquisition and change (Bermúdez-Otero and Hogg, 2003: §2.2). Rule-based theories typically rely on lexical underspecification to solve
the duplication problem, which arises over the fact
that the well-formedness conditions that lexical representations obey statically coincide to a very large
extent with the well-formedness conditions that
phonological processes enforce dynamically in the
derivation of grammatically complex expressions
(Clayton, 1976). Acquiring underspecified lexical
representations requires a powerful learner actively
pursuing a strategy of lexicon minimization. Psycholinguistic and diachronic evidence, however, suggests
that learners in fact follow a conservative what-you-see-is-what-you-get strategy and require positive evidence to abandon the identity map, in which inputs
are identical with the corresponding outputs.
OT, in contrast, holds promising prospects for
research into input restructuring because it incurs
neither of these difficulties. First, the assumption of
a finite CON has enabled learnability experts to devise
fully formalized constraint ranking algorithms, which
can be drawn upon in input selection (Tesar and
Smolensky, 2000). Second, OT is an output-oriented
theory in that no constraints directly evaluate input
representations. Rather, the grammar must work in
such a way that every possible input is associated with
a well-formed output (richness of the base). This
removes the requirement of lexicon minimization.
In fact, the fundamental insight behind Prince and
Smolensky's (1993: §9.3) original formulation of
their lexicon optimization principle is that, when
combined with minimal constraint violation, output
orientation defines the identity map as the default
option in input selection. Consider, for example, a
constraint hierarchy such that there are two
potential input representations i1 and i2 for a given
output o. Since markedness constraints refer only
to outputs, the mappings i1→o and i2→o will tie
on markedness; they will differ solely in terms of
faithfulness violations. To achieve the most harmonic
mapping, then, the learner need only choose the input
representation that is closest to the output.
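A toy calculation makes the point. In the sketch below, the assumptions are all illustrative: a crude count of segmental mismatches stands in for the ranked faithfulness constraints, and the forms are arbitrary. Two candidate inputs tie on markedness by definition, so the learner simply takes the cheaper mapping, which is the identity map when the form never alternates:

```python
# Sketch of input optimization: markedness sees only outputs, so
# competing inputs for one output differ only in faithfulness cost.
# A crude mismatch count stands in for ranked faithfulness constraints.

def faithfulness_cost(inp, out):
    # Penalize length differences (MAX/DEP-style) and segmental
    # mismatches (IDENT-style); purely illustrative.
    cost = abs(len(inp) - len(out))
    cost += sum(1 for a, b in zip(inp, out) if a != b)
    return cost

output = "tak"                       # a nonalternating surface form
candidate_inputs = ["tak", "tag"]    # /tak/ is the identity map

best = min(candidate_inputs, key=lambda i: faithfulness_cost(i, output))
print(best)  # -> tak: absent alternations, the identity map wins
```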
In the early days of OT, historical phonologists were quick to realize the advantages of input optimization (see references in Holt, 2003a). Since then, however, progress has been halting and unsatisfactory: the theory of input selection remains underdeveloped and cannot in its current state serve the
needs of historical phonologists interested in reanalysis. This theoretical stagnation has been caused partly
by neglect: In strictly parallel versions of OT, once the
phonologist has satisfied himself or herself (1) that
the constraint hierarchy generates well-formed outputs for every possible input and (2) that there is a
viable input for every output, he or she has little
incentive to ask what input representation is actually
selected by the learner and how, crucial though these
questions are to the psycholinguist and to the historical linguist. In Stratal OT, however, the picture is
rather different, and it is hoped that research in this
framework will supply the want of an adequate theory of input selection (Bermúdez-Otero, 2006). In
what follows I outline a few promising avenues of
research.
First, it appears that the learning algorithm must
be set up in such a way that children search for a
single input representation for all the output alternants of each minimal grammatical unit at the current
level of analysis (although, of course, the search may
fail, as in cases of suppletion). Input optimization
mechanisms should be allowed to come into play
only when, for a given minimal unit, there is found
to be more than one possible input representation
meeting this requirement. Otherwise, in cases of alternation input optimization would cause the learner
to store every alternant in the lexicon as a means to
avoid unfaithful mappings (Prince and Smolensky,
1993: x9.3).
Several scholars have suggested that children require positive evidence from alternations in order to
depart from the identity map (Bermúdez-Otero, 2003: §4.4; Bermúdez-Otero and Hogg, 2003: §2.1;
Hayes, 2004). In this view, it is only when confronted
with alternations such as (10a) that children acquiring German contemplate input-output mappings that
violate the faithfulness constraint IDENT-[voice]
(10b):
(10a)   Ra:t       Ra:d-es
        wheel      wheel-GEN.SG
(10b)   IDENT-[voice]
        If α is a segment in the output and β is a correspondent of α in the input, then assign one violation mark if α and β do not have the same value for the feature [voice].

This assumption yields the right results in the Yiddish case discussed previously. After apocope, the phonological realization of inflectional feature complexes such as [DAT, SING] was nonalternatingly null: The learner therefore had no motivation for deviating from the identity map by positing suffixal /-e/.
An interesting line of enquiry, however, concerns
how learners may use evidence from alternations
in order to detect unfaithful mappings in nonalternating items (Bermúdez-Otero, 2003, 2006; McCarthy, 2005). Exploiting the resources of Stratal OT, Bermúdez-Otero proposed a principle of archiphonemic prudence to deal with this problem. The basic
idea is the following: If the learner discovers an unfaithful mapping /a/→[b] in alternating items at level l
(e.g., the phrase level), then he or she is required to
consider /a/ as a possible input representation for
nonalternating tokens of [b] as well; if, given current
constraint rankings, /a/ proves a viable input representation for some nonalternating token of [b], for
example, [bi], then the form that contains [bi] is set
aside; later in the acquisition process, the learner uses
the constraint hierarchy of the next higher level (e.g.,
the word level) to choose among the various possible
input representations for [bi].
The principle of archiphonemic prudence presupposes an account of how learners choose among competing input representations for an alternating item,
yet this is an area in which our understanding remains
particularly deficient. Inkelas (1995) and Tesar and
Smolensky (2000: §5.2) suggested that the faithfulness cost of each input representation is calculated by
adding faithfulness violations across the entire paradigm; Tesar and Smolensky called this paradigmatic
lexicon optimization. This is an appealingly simple
proposal, but it appears to make the wrong predictions with respect to analogical change. Consider, for
example, a hypothetical situation in which there are
two competing input representations i1 and i2 for a
given noun stem N in a language with a rich case
system. In addition, assume the following:
1. i1 allows the nominative form to be derived faithfully but causes a violation of the faithfulness constraint FAITH1 in the illative.
2. i2 allows the illative to be derived faithfully but
causes a violation of the faithfulness constraint
FAITH2 in the nominative.
3. FAITH1 dominates FAITH2.
In this situation, paradigmatic lexicon optimization
favors i2 since this input representation allows the
higher ranked faithfulness constraint FAITH1 to be
satisfied. Suppose, however, that the child is in a state
of transient underdetermination: i1 and i2 produce
different outputs for case forms of N that he or she
has not yet encountered in his or her trigger experience. In these circumstances, the child is vulnerable to
input restructuring, potentially leading to analogical
change. As we have seen, paradigmatic lexicon optimization favors i2, thereby creating pressure for analogical leveling from the illative to the other cases
(Figure 3). We know, however, that for morphological
reasons, leveling is in fact far more likely to proceed
from the nominative.
To avoid this problem, Bermúdez-Otero (2003: §4.4, 2006) proposed a weaker version of input optimization, which merely requires input representations to be Pareto-optimal.
(11) Input optimization: revised version
(11a) Input representations must be Pareto-optimal.
(11b) An input representation is Pareto-optimal if,
and only if, it has no competitor that (i)
generates all output alternants no less
efficiently and (ii) generates some output
alternant more efficiently.

Here, input efficiency is measured in terms of the violation of ranked faithfulness constraints, as in previous formulations of input optimization. According
to (11), however, two input representations are both
Pareto-optimal if one input performs better than the
other input in one paradigm cell but worse than
the other input in a different paradigm cell. In such
situations, principle (11) predicts that the choice between the two inputs will depend on morphological
or lexical criteria.
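Definition (11) amounts to Pareto dominance over per-cell faithfulness costs, as the toy sketch below shows. The numbers mirror the hypothetical nominative/illative scenario just discussed, but they are invented, and raw violation counts replace ranked constraints for simplicity:

```python
# Sketch of revised input optimization (11): keep an input unless some
# competitor is at least as faithful in every paradigm cell and more
# faithful in at least one (Pareto dominance). Costs are illustrative.

# COSTS[input][cell] = faithfulness violations incurred in that cell
COSTS = {
    "i1": {"nominative": 0, "illative": 1},  # faithful in the nominative
    "i2": {"nominative": 1, "illative": 0},  # faithful in the illative
}

def dominates(a, b):
    """True if input a is no worse than b in every cell and strictly
    better in at least one cell."""
    cells = COSTS[a]
    return (all(COSTS[a][c] <= COSTS[b][c] for c in cells)
            and any(COSTS[a][c] < COSTS[b][c] for c in cells))

pareto = [i for i in COSTS
          if not any(dominates(j, i) for j in COSTS if j != i)]
print(pareto)  # -> ['i1', 'i2']: both survive, so morphological or
               # lexical criteria must decide between them
```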

Finally, a word should be said about the rise of synchronic paradigm effects. Consider, for example, the English rule of homorganic cluster simplification that maps underlying /ŋg/ onto [ŋ] in the coda. As is
well-known, this process applies normally within
stem-level constructions but overapplies before
word-level suffixes and words beginning with a
vowel:
(12)  long          [lɒŋ]            normal application
      longitude     [lɒŋgɪtjuːd]     normal nonapplication
      longish       [lɒŋɪʃ]          overapplication
      long effect   [lɒŋ ɪfɛkt]      overapplication
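A rough computational rendering of this stratal pattern may help. The sketch below is an informal approximation, not the article's analysis: N and Q stand in for the symbols in (12), the vowel set and the simplify_ng function are my own inventions, and each level simply reapplies the coda rule to the output of the previous one:

```python
# Toy stratal derivation for (12) (illustrative assumptions only).
# N stands for the velar nasal; /g/ deletes after N unless a vowel
# follows within the current level's domain (i.e., /g/ can syllabify
# as an onset there).
VOWELS = set("aeiouIE")

def simplify_ng(form):
    out = []
    for i, ch in enumerate(form):
        nxt = form[i + 1] if i + 1 < len(form) else ""
        if ch == "g" and out and out[-1] == "N" and nxt not in VOWELS:
            continue  # homorganic cluster simplification in the coda
        out.append(ch)
    return "".join(out)

stem = simplify_ng("lQNg")             # stem level: -> "lQN"
print(simplify_ng("lQNgItjud"))        # stem-level suffix: g survives
print(simplify_ng(stem + "IS"))        # word level: "lQNIS" (overapplies)
print(simplify_ng("lQNgIS"))           # a one-level grammar would keep g
print(simplify_ng(stem + " Ifekt"))    # phrase level: "lQN Ifekt"
```

Because the word and phrase levels operate on the already simplified stem-level output, the rule overapplies in longish and long effect even though /g/ would be prevocalic there; this is the stratal alternative to output-output correspondence discussed next.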

Stratal OT generates this synchronic paradigm effect without recourse to output-output correspondence
constraints. From a diachronic viewpoint, the effect
can be seen to arise through successive rounds of
input restructuring at different levels in the grammar
(Figure 4). There is therefore no need to stipulate
innate biases in the ranking of output-output correspondence constraints (Hayes, 2004). In particular,
Stratal OT correctly predicts that, synchronically,
misapplication in lexical domains implies misapplication in phrasal domains (Hayes, 2000: 102). Diachronically, the theory accounts for the typical life
cycle of a phonological pattern, in which its domain
gradually shrinks along the way to morphologization
and lexicalization.
Figure 3 Paradigmatic lexicon optimization (Inkelas, 1995; Tesar and Smolensky, 2000) predicts analogical leveling in the wrong direction. If the pattern of faithfulness violations across the paradigm is the paramount criterion for input selection, input i2 will be selected, triggering leveling from the illative form. On morphological grounds, however, i1 is far more likely to be selected, with leveling from the nominative.

See also: A Priori Knowledge: Linguistic Aspects; Chomsky, Noam (b. 1928); Constructivism; E-Language versus I-Language; Elphinston, James (1721-1809); Experimental
Phonology; Feature Organization; Formal Models and
Language Acquisition; Formalism/Formalist Linguistics;
Generative Phonology; Infancy: Phonological Development; Inference: Abduction, Induction, Deduction;
Jakobson, Roman (1896-1982); Lexicalization; Linguistic Universals, Chomskyan; Linguistic Universals,
Greenbergian; Markedness; Morphological Change,
Paradigm Leveling and Analogy; Morphologization;
Morphophonemics; Neogrammarians;
Phonological

Figure 4 The life cycle of phonological patterns in Stratal OT. In the history of English, successive rounds of input restructuring at
progressively higher levels in the grammar cause the domain of homorganic cluster simplification to shrink. Stage I corresponds to the
formal speech of orthoepist James Elphinston (mid-18th century), stage II corresponds to Elphinstons colloquial speech, and stage III
corresponds to contemporary Received Pronunciation.

Phonological Change in Optimality Theory 505


Change: Lexical Phonology; Phonological Typology;
Phonological Universals; Phonology: Optimality Theory; PhonologyPhonetics Interface; Reductionism; Rule
Ordering and Derivation in Phonology; Sound Change:
Phonetics; Sound Change; Speech Perception; Suppletion; Syllable: Typology; 20th Century Linguistics: Overview of Trends; Underspecification.

Bibliography
Bach E & Harms R T (1972). How do languages get crazy rules? In Stockwell R P & Macaulay R K S (eds.) Linguistic change and generative theory. Bloomington: Indiana University Press. 1–21.
Bermúdez-Otero R (2003). The acquisition of phonological opacity. In Spenader J, Ericksson A & Dahl Ö (eds.) Variation within Optimality Theory: Proceedings of the Stockholm Workshop on Variation within Optimality Theory. Stockholm: Stockholm University, Department of Linguistics. 25–36. (ROA 593, Rutgers University Optimality Archive, http://roa.rutgers.edu.)
Bermúdez-Otero R (2006). Stratal Optimality Theory. Oxford: Oxford University Press.
Bermúdez-Otero R & Börjars K (2006). Markedness in phonology and in syntax: the problem of grounding. In Honeybone P & Bermúdez-Otero R (eds.) Linguistic knowledge: perspectives from phonology and from syntax. Lingua 116(2) (Special issue).
Bermúdez-Otero R & Hogg R M (2003). The actuation problem in Optimality Theory: phonologization, rule inversion, and rule loss. In Holt (ed.). 91–119.
Blevins J (1997). Rules of Optimality Theory: two case studies. In Roca I (ed.) Derivations and constraints in phonology. Oxford: Clarendon Press. 227–260.
Blevins J (2004). Evolutionary phonology: the emergence of sound patterns. Cambridge, UK: Cambridge University Press.
Boersma P (1998). Functional phonology: formalizing the interaction between articulatory and perceptual drives. The Hague, The Netherlands: Holland Academic Graphics.
Bybee J (2001). Phonology and language use. Cambridge, UK: Cambridge University Press.
Chomsky N (1986). Knowledge of language: its nature, origin, and use. Westport, CT: Praeger.
Clayton M L (1976). The redundance of underlying morpheme-structure conditions. Language 52, 295–313.
Gick B (1999). A gesture-based account of intrusive consonants in English. Phonology 16, 29–54.
Hale M (2000). Marshallese phonology, the phonetics–phonology interface and historical linguistics. The Linguistic Review 17, 241–257.
Hale M & Reiss C (2000). Phonology as cognition. In Burton-Roberts N, Carr P & Docherty G (eds.) Phonological knowledge: conceptual and empirical issues. Oxford: Oxford University Press. 161–184.
Hayes B (1999). Phonetically-driven phonology: the role of Optimality Theory and inductive grounding. In Darrell M, Moravcsik E, Newmeyer F J et al. (eds.) Functionalism and formalism in linguistics 1: general papers. Amsterdam: Benjamins. 243–285.
Hayes B (2000). Gradient well-formedness in Optimality Theory. In Dekkers J, van der Leeuw F & van de Weijer J (eds.) Optimality theory: phonology, syntax, and acquisition. Oxford: Oxford University Press. 88–120.
Hayes B (2004). Phonological acquisition in Optimality Theory: the early stages. In Kager R, Pater J & Zonneveld W (eds.) Constraints in phonological acquisition. Cambridge, UK: Cambridge University Press. 158–203.
Hayes B & Steriade D (2004). Introduction: the phonetic bases of phonological markedness. In Hayes B, Kirchner R & Steriade D (eds.) Phonetically based phonology. Cambridge, UK: Cambridge University Press. 1–33.
Holt D E (2003a). Remarks on Optimality Theory and language change. In Holt (ed.). 1–20.
Holt D E (ed.) (2003b). Optimality Theory and language change. Dordrecht, The Netherlands: Kluwer.
Hyman L M (2001). The limits of phonetic determinism in phonology: *NC revisited. In Hume E & Johnson K (eds.) The role of speech perception in phonology. San Diego: Academic Press. 141–302.
Inkelas S (1995). The consequences of optimization for underspecification. Proceedings of the North East Linguistic Society 25, 287–302.
Jakobson R (1929). Remarques sur l'évolution phonologique du russe comparée à celle des autres langues slaves. Travaux du Cercle Linguistique de Prague 2.
Kiparsky P (2004). Universals constrain change; change results in typological generalizations. Stanford University (http://www.stanford.edu/kiparsky/Papers/cornell.pdf).
McCarthy J J (2005). Taking a free ride in morphophonemic learning. Catalan Journal of Linguistics 4. (ROA 683, Rutgers University Optimality Archive, http://roa.rutgers.edu.)
McMahon A M S (2000). Change, chance, and optimality. Oxford: Oxford University Press.
Myers S (2002). Gaps in factorial typology: the case of voicing in consonant clusters. University of Texas at Austin. (ROA 509, Rutgers University Optimality Archive, http://roa.rutgers.edu.)
Ohala J J (1992). What's cognitive, what's not, in sound change. In Kellermann G & Morrissey M D (eds.) Diachrony within synchrony: language history and cognition. Frankfurt am Main: Peter Lang. 309–355.
Prince A & Smolensky P (1993). Optimality Theory: constraint interaction in generative grammar. Rutgers University Center for Cognitive Science Technical Report No. 2. Published with revisions 2004. Oxford: Blackwell.
Smith N V (1973). The acquisition of phonology: a case study. Cambridge, UK: Cambridge University Press.
Tesar B & Smolensky P (2000). Learnability in Optimality Theory. Cambridge: MIT Press.
Vennemann T (1972). Rule inversion. Lingua 29, 209–242.
Yu A C L (2004). Explaining final obstruent voicing in Lezgian: phonetics and history. Language 80, 73–97.


Phonological Change: Lexical Phonology


A McMahon, University of Edinburgh, Edinburgh, UK

© 2006 Elsevier Ltd. All rights reserved.

Although sound change was the main focus of phonological investigation in the 19th and early 20th centuries, generative phonological theory initially concentrated primarily on synchronic description and explanation. Where sound changes were dealt with at all, they were typically recast as rules in a synchronic grammar. The development of lexical phonology in the 1980s represented a partial refocusing on change, since the architecture of the model lends itself particularly well to analyzing the life-cycle of phonological processes as they enter and work through the grammar historically. Some of these organizational insights are maintained in the developing model of stratal optimality theory.

Phonological Theory and Sound Change

The essential concern of generative phonology, from Chomsky and Halle (1968) onward, is with what speakers of a language know. This prioritization leads to an absolute post-Saussurean division between synchrony and diachrony, and to an absolute focus on the former. Speech errors, dialect variation, and change all become external evidence, while the core object of enquiry is the synchronic system, determined by distribution and alternation and evidenced by introspection.

From the perspective of a phonologist interested in sound change, this narrowing of the field entails the problematic assumptions that phonological theory should not be concerned with explaining change, and that change cannot be used to evaluate theory. Where standard generative phonology deals with change at all, it is by reinterpreting changes as synchronic phonological rules, as Chomsky and Halle (1968) recast the historical English Great Vowel Shift as the synchronic Vowel Shift Rule. But paradoxically, envisaging a synchronic phonology as layers of historical sound changes causes problems both for synchrony and diachrony: if we are not concerned with explaining the source and development of sound changes, we end up with a rather static synchronic picture, which is not sufficiently flexible to deal with dialect differences and which brings with it increasing abstractness. A concern only with what the speaker knows, in other words, is likely to provide a model of the grammar which, at worst, is unlearnable and unknowable.

Lexical Phonology
Lexical phonology was developed in an attempt to reduce the abstractness of its standard generative
predecessor, and to account better for interactions
between phonology and morphology. Its architecture,
however, offers the additional benefit of making a
contribution to historical phonology.
Lexical phonology (Kiparsky, 1982; Mohanan,
1986; Halle and Mohanan, 1985) initially addressed
relationships between morphology and phonology.
For example, when adjective-forming -ic is added,
stress on the new, derived base shifts to the right, as
in underived atom as opposed to derived atomic.
However, this pattern is not observed with -ish, so
that yellow and yellowish both have initial stress. The
solution crucially involves reviewing the function and
shape of the lexicon, which had been seen as
a dictionary-like list, including notes of exceptional properties of particular items. Now the lexicon
becomes much more productive, incorporating morphological rules attaching affixes. These rules are
interspersed with phonological ones, and the whole
lexicon is divided into a series of levels, or strata.
Stress rules, which build phonological structure, are
taken to be located on level 1. Underived atom has
stress added initially; -ic is appended; and derived
atomic is again presented to the stress rules, which
now stress the longer form differently. However, yellow is not subject to level 1 morphological processes;
it passes to level 2, where -ish is added, but by then
yellowish is outside the scope of the level 1 stress
rules, and stress remains in the derived form where
it was assigned to the underived one. Likewise, if we
locate the rule of nasal deletion on level 1, we can
account for level 1 derived illegal from [in[legal]], but
level 2 derived unlawful, rather than *ullawful, from
[un[lawful]]. The morphological aspects of lexical
phonology are discussed in much greater detail by
Giegerich (1999).
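As a rough illustration of how interleaving affixation with level-ordered rules produces these stress facts, consider the following toy sketch in Python. It is our expository construction, not Kiparsky's or Mohanan's formalism; in particular, the "stress the penult" rule is a drastic simplification that happens to cover these examples.

```python
LEVEL1 = {"ic"}   # Latinate, stress-shifting suffixes
LEVEL2 = {"ish"}  # Germanic, stress-neutral suffixes

def stress_rule(sylls):
    """Toy level 1 stress rule: stress (capitalize) the penultimate
    syllable. Assumes words of at least two syllables."""
    out = [s.lower() for s in sylls]
    out[-2] = out[-2].upper()
    return out

def derive(base, suffix=None):
    form = stress_rule(base)                 # level 1: underived base
    if suffix in LEVEL1:                     # level 1 morphology feeds
        form = stress_rule(form + [suffix])  # reapplication of stress
    elif suffix in LEVEL2:                   # level 2: outside the
        form = form + [suffix]               # scope of the stress rule
    return "-".join(form)

print(derive(["a", "tom"]))           # A-tom
print(derive(["a", "tom"], "ic"))     # a-TOM-ic
print(derive(["yel", "low"], "ish"))  # YEL-low-ish
```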
This layering of lexical levels also rules out some of
the worst excesses of abstractness of standard generative phonology. Kiparsky (1982) uses the example of
English trisyllabic laxing, which shortens or laxes a
vowel followed by two or more syllables, as in declarative with [æ] or serenity with [ɛ], as opposed to declare with [eɪ] and serene with [iː]. The laxed
examples have level 1 suffixes, and we deduce that
trisyllabic laxing also applies on level 1. This is confirmed by the absence of its effects in bravery [eɪ] or weariness [iː], which become trisyllabic by the attachment of level 2 suffixes, outside the scope of laxing. Furthermore, ivory [aɪ] or Oberon [əʊ], which also
retain tense vowels, cannot undergo laxing because they are trisyllabic from the start, rather than becoming trisyllabic by level 1 suffixation: this reflects
the so-called derived environment condition, which
specifies that structure-changing phonological processes on level 1 can apply only when they are fed
by a morphological or prior phonological process.
Finally, items like camera [æ] or enemy [ɛ], which
surface with short vowels in the right environment,
would have been good candidates in standard generative phonology for derivation by free ride through
trisyllabic laxing; but in lexical phonology, since
these forms are underived, they are prohibited from
undergoing laxing by the effects of the derived environment condition. They must therefore be stored
with short, lax vowels too, approximating the underlying and surface forms and reducing abstractness. Application of level 1 phonological rules at
least is therefore restricted to cases where speakers
encounter alternations, as in declare/declarative,
serene/serenity, to motivate a phonological process.
Even here, according at least to McMahon (2000b)
and Bermúdez-Otero (1999), we should assume that
the underlying form is equivalent to the surface form
of the underived alternant, increasing transparency
and learnability.

Lexical Phonology and Sound Change

The Neogrammarian Controversy
Connections between synchrony and diachrony are strengthened by Labov's (1972, 1994, 2001) pioneering work, which demonstrated that variation and change are inextricably connected: accepting that there are simplifications of various kinds at play, we can nonetheless claim that today's variation is tomorrow's change.
However, whereas the 19th-century Neogrammarians argued that sound change is phonetically
gradual but lexically abrupt, proceeding by tiny, incremental steps but in all eligible words at once, the
lexical diffusionists (Wang, 1969; Chen and Wang,
1975) saw sound change as phonetically abrupt but
lexically gradual, spreading from word to word.
Labov (1981) attempted to resolve this controversy
by testing the two views against changes in progress,
but found evidence for both, concluding that
there are simply two different types of sound change.
While Neogrammarian change is gradient, finely
phonetically conditioned, often sociolinguistically
marked, and learnable, the diffusing type is discrete,
may be grammatically conditioned, and has lexical
exceptions.

Although standard generative phonology has nothing to say about these different types of change, they can be modeled in lexical phonology. The phonological processes mentioned above all operated in the lexicon; but lexical phonology also recognizes a postlexical component. Postlexical processes interact not with the morphology, but with the syntax, and can apply across word boundaries, as does Tyneside weakening (Carr, 1991), which changes /t/ to [r] in word-final intervocalic positions, as in not a chance, put it down, beat it. Postlexical rules are characteristically gradient, apply across the board without lexical exceptions, are phonetically conditioned, and can often be suspended stylistically. Lexical processes, however, tend to be categorical, may have lexical exceptions, are morphologically conditioned, and persist regardless of sociolinguistic factors such as formality. Kiparsky (1988) equates Neogrammarian sound changes with postlexical rules, and lexically diffusing sound changes with lexical rules. More accurately, we are dealing here with rule applications, since palatalization in English, for instance, applies both lexically and postlexically, though the different applications have different properties. Thus, while palatalization is obligatory word-internally in racial, it is optional postlexically in I'll race you, where we find variably either [sj] or [ʃ].

The Life-Cycle of Phonological Rules

The equation of two types of sound change with two types of phonological rules, however, is still a static comparison. Harris (1989) makes matters more dynamic by arguing that processes can begin as postlexical, gradient, Neogrammarian changes, but subsequently gain lexical characteristics, becoming lexical rules. Harris (1989) discusses æ-tensing, which operates in Philadelphia, New York, and Belfast English, for example, producing lax [æ] in tap, bat, panel, ladder, but the tense equivalent in pass, man, manning and man-hours. This process was clearly originally phonetically conditioned, operating gradiently in a hierarchy of environments with the likelihood of tensing increasing from voiceless stops, through voiced obstruents, to nasals and voiceless fricatives. However, it is now perceived categorically, and has become sensitive to morphological information, as shown by tensing in manning and man-hours. Such processes may apply on level 2, and later on level 1, where more fossilized and exception-prone processes are situated (in English, irregular inflection and Latinate affixes are typically level 1, while regular inflection and Germanic affixes are level 2). In other varieties, such as Standard Southern British English, æ-tensing has shifted even further from its postlexical, Neogrammarian roots: it is no longer synchronically productive at all, but by applying differentially and increasingly unpredictably, and acquiring lexical exceptions, it has caused underlying restructuring, leading to an opposition of short /æ/ in pat and long /ɑː/ in path.
Nor is æ-tensing the only example. McMahon (1991) argues that the Scottish Vowel Length Rule (Aitken, 1981), which lengthens tense vowels before /r/, voiced fricatives and word or morpheme boundaries as in leave, agree, agreed, fire, lies, tie and tied, is a partial phonologization of the English voicing effect process, which lengthens vowels before all voiced consonants. While the voicing effect rule is again gradient, the Scottish Vowel Length Rule is categorical, sensitive to morphology, and is beginning to diffuse lexically, creating exceptions like spider and viper with long vowels in short contexts. Both short /ʌɪ/ and long /aːi/ are perhaps becoming established as contrasting members of the Scots vowel inventory.
McMahon (2000b) suggests that changes may
follow two alternative life-cycle paths. Some, like
æ-tensing, shift gradually from postlexical to lexical, gradually diffusing until they cause underlying
restructuring and cease to be productive. Others,
like the Scottish Vowel Length Rule, additionally
neutralize a distinction (here, vowel length) as they
become lexical rules. This second type may include
those processes analyzed in standard generative phonology as involving rule inversion (Vennemann,
1972).

Future Work
Lexical phonology provides an appealing way of
modeling the difference between Neogrammarian
and diffusing sound changes, and of capturing the
insight that a synchronic process developing from
gradient, phonetic roots can later become a lexical
process. It will then shift upward through the lexical levels, gaining exceptions and becoming increasingly bound up with the morphology and less
finely phonetically conditioned, until in the end the
process ceases to apply productively, leaving only an
indication of its existence in underlying contrasts
or redistributions. This reintegrates synchronic and
diachronic phonology, showing both that phonological theory can increase our understanding of
change, and of how changes become phonological
processes, and that historical processes can provide
useful tests of theories.
However, there remain some questions for the
future. First, lexical phonology is primarily a
model of phonological organization. Consequently, it can model the percolation of change through the grammar well and enlighteningly, but inevitably has
less to say about the initiation of change. If many
sound changes are phonetically motivated, then we
must look to phonetics, and to phonological representation, to explain the initial phases. Zsiga (1997)
argues that we might postulate traditional phonological features or feature geometries at the lexical
level, but the gestures of articulatory phonology
(Browman and Goldstein, 1986, 1991) postlexically;
McMahon et al. (1994) suggest that gestures should
be the primary unit throughout the phonology, but
with different constraints on rule application at different levels. In either case, phonetic explanation
needs to be integrated into the model to allow for
the beginning of the cycle.
Second, although work continues in lexical phonology, we are now in a period where a different model is
in the ascendancy, with the advent and development
of optimality theory (Prince and Smolensky,
2004; Kager, 1999). Optimality theory replaces rules
plus constraints (whether on representations or on
rule applications) with constraints alone; moreover,
these constraints are universal and innate, at least in
the classical version of the theory (A. Prince and
P. Smolensky, 2004; McCarthy, 2002). Here, all constraints evaluate all possible outputs simultaneously
and in parallel, to establish the single, winning and
surfacing candidate that satisfies most high-ranking
constraints. Optimality theory is still under construction, and there are many areas of inclarity relevant to
the analysis of sound change, not least the role of
phonetics (McMahon, 2000a; and see Holt, 2003
for papers discussing the application of OT to
change). Furthermore, since classical OT is strictly
parallel, there is no obvious way of incorporating
the valuable insights lexical phonology brings to
change; but it has been suggested that partial serialism and level-ordering can be maintained in stratal
optimality theory (Kiparsky, in press; Bermúdez-Otero, 1999, in press). Bermúdez-Otero (in press),
for instance, argues that in this composite model
we are dealing with a life-cycle of constraint rankings,
which move from lower to higher levels in the grammar, driven by acquisition. Whichever theory we
adopt, the most likely prospect for a solution to
the many remaining problems must surely be one
that attempts to reconcile evidence and questions
from acquisition, change, phonetics, and synchronic
phonological patterns.
See also: Phonological Change in Optimality Theory; Pragmatics: Optimality Theory; Sound Change; Sound
Change: Phonetics.


Bibliography
Aitken A J (1981). The Scottish vowel length rule. In Benskin M & Samuels M L (eds.) So meny people, longages and tonges. Edinburgh: Middle English Dialect Project. 131–157.
Bermúdez-Otero R (1999). Constraint interaction in language change. Ph.D. diss., University of Manchester.
Bermúdez-Otero R (in press). The life-cycle of constraint rankings: studies in early English morphophonology. Oxford: Oxford University Press.
Browman C P & Goldstein L (1986). Towards an articulatory phonology. Phonology Yearbook 3, 219–252.
Browman C P & Goldstein L (1991). Gestural structures: distinctiveness, phonological processes, and historical change. In Mattingly I G & Studdert-Kennedy M (eds.) Modularity and the motor theory of speech perception. Hillsdale, NJ: Lawrence Erlbaum. 313–338.
Carr P (1991). Lexical properties of postlexical rules: postlexical derived environment and the Elsewhere Condition. Lingua 85, 41–54.
Chen M & Wang W S-Y (1975). Sound change: actuation and implementation. Language 51, 255–281.
Chomsky N & Halle M (1968). The sound pattern of English. New York: Harper & Row.
Giegerich H J (1999). Lexical strata in English: morphological causes, phonological effects. Cambridge: Cambridge University Press.
Halle M & Mohanan K P (1985). Segmental phonology of modern English. Linguistic Inquiry 16, 57–116.
Harris J (1989). Towards a lexical analysis of sound change in progress. Journal of Linguistics 25, 35–56.
Holt D E (ed.) (2003). Optimality Theory and language change. Dordrecht: Kluwer.
Kager R (1999). Optimality Theory. Cambridge: Cambridge University Press.
Kiparsky P (1982). Lexical phonology and morphology. In Yang I-S (ed.) Linguistics in the morning calm. Seoul: Hanshin. 3–91.
Kiparsky P (1988). Phonological change. In Newmeyer F J (ed.) Linguistics: the Cambridge survey, I: linguistic theory foundations. Cambridge: Cambridge University Press. 363–415.
Kiparsky P (in press). Paradigm effects and opacity. Stanford: CSLI Publications.
Labov W (1972). Sociolinguistic patterns. Philadelphia: University of Pennsylvania Press.
Labov W (1981). Resolving the Neogrammarian controversy. Language 57, 267–308.
Labov W (1994). Principles of linguistic change, vol. 1: internal factors. Oxford: Blackwell.
Labov W (2001). Principles of linguistic change, vol. 2: social factors. Oxford: Blackwell.
McCarthy J J (2002). A thematic guide to Optimality Theory. Cambridge: Cambridge University Press.
McMahon A (1991). Lexical phonology and sound change: the case of the Scottish Vowel Length Rule. Journal of Linguistics 27, 29–53.
McMahon A (2000a). Change, chance, and optimality. Oxford: Oxford University Press.
McMahon A (2000b). Lexical phonology and the history of English. Cambridge: Cambridge University Press.
McMahon A, Foulkes P & Tollfree L (1994). Gestural representations and Lexical Phonology. Phonology 11, 277–316.
Mohanan K P (1986). The theory of Lexical Phonology. Dordrecht: Reidel.
Prince A & Smolensky P (2004). Optimality theory. Oxford: Blackwell.
Vennemann T (1972). Rule inversion. Lingua 29, 209–242.
Wang W S-Y (1969). Competing changes as a cause of residue. Language 45, 9–25.
Zsiga E C (1997). Features, gestures and Igbo vowels: an approach to the phonology-phonetics interface. Language 73, 227–274.

Phonological Impairments, Sublexical


H W Buckingham, Louisiana State University,
Baton Rouge, LA, USA
S S Christman, University of Oklahoma,
Oklahoma City, OK, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction
In the present contribution to this volume, we will briefly discuss some recent work in neurolinguistic modeling that once again considers the human language cerebral system as a functional mosaic, more diffuse in its overall functional operation and slightly more parallel in its chronometry. Here, we will pay specific attention to phoneme structure and prosody, namely the units commonly referred to as tonemes. This allows us to explore prosody in a very digital/analytical way, where fundamental frequency control looks to be a property of the left hemisphere. Prosody is a multifunctional dynamic, not fully right- nor left-hemisphere-specific. We will then present some analysis of recent sublexical studies of paraphasia and point to some inconsistencies and weaknesses of each, concentrating on syllable structure complexity and how that complexity and its linguistic control is extremely important in drawing finer distinctions
among aphasic deficits, among error types as well as among aphasic types. As we will argue, it is most often
the case that differently categorized aphasics (especially those who are clearly in the posterior/
sensory fluent groups of Wernicke's as well as conductions and subgroupings of these) will likely produce
from time to time, and certainly in differing numbers,
most sublexical error types. The challenge, therefore,
is to find instances of significantly differing quantitative groupings among the error types for different
subject groupings. There is ample evidence that most
studies demonstrate precisely this kind of dissociation.
To this, we will simply reiterate the obvious: these
statistically differing groupings within error types are
qualitative in the sense of statistics, but quantitative,
still, in the sense of the error types themselves. We will
subsequently point out that this, in turn, is why Freud's
continuity thesis keeps rearing its ugly head in the
study of paraphasia in brain damage and why investigators continue to allude to the slips-of-the-tongue in
nonpathologically involved speakers. Lastly, we will
consider some recent work in the aphasic production
of neologisms, how and where they may be generated,
by whom, under what circumstances, whether they
involve anomia or not, and how they resolve to more
recognizable language. We will pay close attention to
whether or not the appearance of the neologism
adumbrates a lexical block, or whether it adumbrates
something that is actually quite different: the strictly
phonological transformation or deformation of a form
that by definition must have been retrieved at some
higher level of production. Lastly, we will follow the
logical predictions of neology recovery from these two
accounts.

Functional Metabolic Mosaics


To begin with, we would emphasize that many of the
recent findings from positron emission tomography as
well as from functional magnetic resonance imaging
and magnetoencephalography have provided support
for theories developed long before these advances
in the technology of cerebral observation: theories
such as motor–sensory reverberatory circuitry and
function in ongoing speech production with interconnective physiology and a left hemisphere specificity.
The origins of these theories date as far back as the
monograph of Karl Wernicke (1874). There is much
evidence from imaging work that John Hughlings Jackson (1866) was quite correct in his assumption
that the nondominant hemisphere was actively involved in many aspects of language comprehension
and that its role was increasingly observed in the
resolution of aphasia. Furthermore, there now exists
incontrovertible evidence from imaging for earlier motor theories of speech perception as well as for other theories that postulate the leading role that
acoustic memory of one sort or another has in speech
production. One can go even further into history
(Hartley, 1749) for the establishment of strong
motor–sensory associations in terms of muscle sense.
Such strong associative connections and the theories
they stemmed from would certainly accord with parallel cerebral metabolic mosaics so often revealed
during the production of phonological segments and
during the perception of the same.
Two modern neurolinguistic investigators, David
Poeppel and Greg Hickok (e.g., Hickok, 2000), have
designed a model based on their own imaging work
and that of others that is bilateral for comprehension,
thereby bringing the nondominant hemisphere into
the total picture for language perception, including
more than simply prosody and pragmatics, both of which also play important roles in the overall picture of online language processing. For them, processing begins with the introduction of an auditory signal that, very importantly, spreads to both hemispheres initially and only then gets parceled out in terms of what processing of that auditory signal can remain in the right and what processing is forced to dominant-hemisphere analysis, and why. In fact, it may very well be
that what is being seen in the metabolism of the right
hemisphere during comprehension are precisely those
computations of intonation and schematic knowledge
access from memory stores so crucial in our understanding of spoken language. Poeppel and Hickok are
careful to chart all relevant studies that demonstrate
a clear role for the nondominant hemisphere for
comprehension: the Wada test, split brain studies,
and single cell recordings.
Before going on, we want to emphasize that in
these imaging studies what we actually see is metabolism, and only metabolism. In order to infer what
may have been the work task, intricate statistical
subtractions must be performed on the signal to ferret
out the clutter, so to speak. The timing of events must
be closely controlled as well so that we can be sure
that the biochemically marked nutrients are introduced and arrive at the cerebral zones so their magnetic fields can be scanned radiographically, etc. (see
Uttal, 2001, for a sober evaluation of recent scanning technology and some of the limitations of the paradigm).
Hickok (2000) focuses upon the functional neuroanatomy of conduction aphasia and uses the right
hemisphere for comprehension and the left temporal
zones for the strictly phonological (with emphasis on
sublexical processing) auditory element, which in turn
fits into the acoustic–motor reverberatory system.
Holding to previous models scaffolded on techniques


other than imaging, he suggests two types of physiological interconnectionistic routes: in short, the insula or the arcuate fasciculus, which, as we have known for over 100 years, courses through the opercular regions of the temporal, parietal, and frontal
lobes. This, in turn, allows the Poeppel/Hickok model
to again reach into preimaging models and to suggest
that there are two kinds of conduction aphasias, one
perhaps involving lesions in and around the supramarginal gyrus (SMG) in such a way that the lesion
would extend to the operculum, thereby interfering
in one way or another with smooth acoustic–motor cooperation between Wernicke's area and Broca's area. An old notion, to be sure. Lesions to the left posterior supratemporal plane (location of sound-based representations), including Heschl's gyrus
(BA 41) and the planum temporale (PT), slightly inferior to the SMG, may produce a more anterior type
of conduction aphasia. In any event, Hickok and
Poeppel do point out that activation levels of the auditory signal, although bilateral, tend to be somewhat
stronger in the left.
Other recent significant studies of the perception
of phonemic elements, called tonemes, have been carried out by Jack Gandour and his colleagues. In one
(Gandour et al., 2000), the investigators looked at the
perception of tone in speakers of three languages:
Thai, Chinese, and English.
Tonemes are very short stretches of fast fluctuating
fundamental frequencies over the range of a vowel
production. As a simple example, the segmental
stretch of a CV, such as /ma/, may have a number
of differing tone patterns over the /a/, such that a
high-falling fundamental frequency (F0) pattern shift
will give you one word, while the /a/ with a tone
pattern of low-rising will give you quite another.
These represent minimal phonemic pair distinctions
and, like minimal pair distinctions, call for close, digital, and highly analytical processing skills upon the
acoustic spectrum. In these kinds of studies, success
on the part of the subjects depends upon the ability to
link those analytical analyses of the perceptual system
with words in their language. Each tone language
has its own set of parameters that fit to the lexicon.
Linguists frequently define pitch as F0, and therefore
what is involved physically is a set of fast-changing
pitches. Pitch, however, tends to avoid mention of
something so communicative that it would serve
as the sole element to distinguish one word from
another in some languages. The term toneme is
therefore applied to a rapid pitch change that calls up different words in the mind of the listener. Thai and Chinese are tone languages; English is not.
The method of study was PET, and there were three
to four subjects for each language. The subjects were

asked simply to listen to the prosodic, fast fluctuations of F0 and to press a 'same' or 'different' button
if they thought they heard the same pitch pattern or
not. For tonemes, only Thai patterns were chosen for
this study.
Several earlier imaging studies have shown activation in the left opercular regions of the frontal lobe,
very near Broca's area, during phonemic perception.
Recall that most consider both areas 44 and 45 in toto to be Broca's area (see Foundas et al., 1998, for
an extremely close and detailed neuroanatomical
study of pars triangularis [45] and pars opercularis
[44], using volumetric MRI). The Gandour et al.
(2000) study now shows this for the perception of
the toneme, if you will, a phoneme of prosody. All
subjects showed similar metabolic mosaics for the
perception of rapidly changing pitch patterns that
were nonlinguistic. However, only the Thai speakers
revealed a significantly added component to the mosaic of metabolism when the pitch changes matched
the tonemes of Thai. They were not only perceiving
the rapid pitch changes, and therefore able to press
same or different buttons, they were hearing words.
That is to say, the perception was linguistic. And, crucially, that added metabolic zone was in the left frontal operculum. Both the Chinese speakers and the
English speakers revealed similar patterns of metabolism, but no added left Broca's area metabolism.
Another equally sophisticated and significant study
along somewhat similar lines is found in Hsieh et al.
(2001). Here, however, there were 10 Chinese speakers and 10 English speakers, and all were analyzed as
they perceived consonants and vowels, as well as
pitches (nonspeech but physically similar to tones)
and tones. The general metabolic mosaic patterns
were different with each group of speakers, thus
providing evidence that the cerebral metabolic patterns were largely reflective of the fact that Chinese
and English involve different linguistic experiences.
Subjects either listened passively or were instructed to do same–different responding by clicking left or right, same vs. different. Subjects still had to click in the passive condition, but they simply had to alternate from one to the other for each presentation: a mindless task involving similar digit movements.
The findings here show a task-dependent mosaic
of metabolic functioning that reflects how acoustic,
segmental, and suprasegmental signals may or may
not directly tap into linguistic significance, with nondominant hemisphere mechanisms activated for cues
that eventually work themselves into dominant hemisphere activation. Broca's area on the left was activated for the Chinese-speaking group for consonants, vowels, tones, and pitches, while the right Broca's
area was activated for English speakers on the pitch


task. Since pitch is nonlinguistic on all views for English, this finding makes sense and again shows
the role of the nondominant hemisphere in processing
auditory stimuli at the beginning. Those pitches are
extremely rapid as well, but can be processed by the
right as long as they do not tap into anything linguistically meaningful. Chinese speakers, on the other
hand, appear to process temporal and spectral signals
in the left, not the right. Lateral effects are not predictable for very complex processing of rapid temporal and spectral change. Pitch patterns, then, along
with temporal/spectral signals for consonants and
vowels, are as likely to be in right or left hemisphere
for this or that language. Hsieh et al. (2001: 240)
write, 'Pitch processing is lateralized to the left hemisphere only when the pitch patterns are phonologically significant to the listener; otherwise to the right hemisphere when the pitch patterns are linguistically irrelevant.' Recall that in the previous study, only the
Thai-speaking group showed left activation (also in and around Broca's area) for the pitches that fit to the Thai tonemic system. Chinese speakers in that study showed absolutely no left Broca's area effect.
Again, Chinese is a tone language, but the substrate of
pitches is not the same as in Thai. One further finding
(among the many others that we cannot go into here)
was that the Chinese group showed increased metabolism in the left premotor cortex as well as the
gyral zones of Broca's area (44 and 45) on the four tasks, while only the vowel condition activated left Broca's area significantly for English speakers. The
pitch condition for English speakers in this study gave
rise to increased metabolism in the right Broca's
homologue. Again, we see a picture of right and left
processing for auditory input, but where that
auditory processing directly connects with linguistic
significance for some language, that processing will
drift leftwards or otherwise be attracted to the dominant hemisphere by the strength and dominance of
the language processor. Now, therefore, we have
growing evidence that the left Broca's area is involved
in linguistic perception. In addition to this, there is
increasing evidence as well that left posterior regions
are involved in linguistic production.
We witness several lines of evidence in modern
neurolinguistics that strengthen the classical aphasia
notion that the posterior sensory auditory cortex is in
many ways directly involved in speaking. Buckingham
and Yule (1987) have related the architectonic findings of Galaburda of giant Betz cells under the planum
temporale in level III. Not only does this system
connect with the arcuate fibers in the operculum,
but these regions as well show large concentrations of acetylcholine in the left temporal lobe. Since
that neurotransmitter is found as well in large

concentrations in left basal ganglia in right-handed people, it is assumed that it plays a motor role as well
for articulation, and thus would be well situated for
such a function in the left planum. In addition, Square
et al. (1997) speak increasingly of a posterior apraxia of speech (AOS), which would agree with older
theories of Liepmann (1905) and with more recent
models such as the one presented by Doreen Kimura
(1982). There has always been a certain amount of
tension in theories of AOS as to just how much of its
nature is phonetic and how much is phonological.
This question would only make sense, of course, if
there were a certain metric that would keep the
two apart, with clearly demarcated domains that
would allow for an interface of the two. We have
already seen a complex array of interactions between
the sensory and the motor for production and perception, both functionally and neuroanatomically.
The brain may very well turn out to be unable to
enforce a strict compartmentalization between the
two, and this would in turn lend support to the claims
of certain phonologists such as John Ohala (1990)
that in fact there is no such thing as a phonetics–phonology interface, since the two are fully intertwined. The whole issue may turn out to be more moot than originally thought. Finally, stemming
from the recent evidence for old functional notions
of the arcuate fasciculus, that tract's motor–sensory
mediation capacity has led to its establishment as part
and parcel of a subvocal rehearsal mechanism that is
crucially involved in short-term operational verbal
memory.

Recent Linguistic Aphasiological Studies of Sublexical Units

Most studies of segmental paraphasias in modern
terms include reference to syllable structure as well
as syllable complexity. Phonotactic patterns are closely scrutinized, but the interaction between phonotactics and the sonority scale is often only loosely
defined and only marginally used as a comparative
analytic metric. For example, many investigators
(e.g., Nickels and Howard, 2004) measure syllable
complexity largely by canonical form and nothing
else. An aphasic who reduced complexity of syllable
structure would simply change a CCVCC to perhaps
a CVC, a CV, or a VC. A simple assumption would be
that a CV is less complex than a CCV. The sonority
hierarchy, however, provides the aphasiologist with a
more powerful way to measure syllable complexity
that goes beyond phonotactics. Universal (not absolute) sonority ranking, going from least to most
sonorous, is: Obstruent < Nasal < Liquid < Glide < Vowel.
The distance from O to V is 4, from O to N is 1, from N to G is 2, and so forth. Onset structures have a
crescendo architecture, while coda structures have a
decrescendo architecture, the vowel being the nucleus
of the syllable with maximum sonority. The most
powerful complexity metric is the dispersion principle embedded in this theory. The calculation of dispersion is done by summing the inverses of the
squared values of all the distances of all elements in
the initial demisyllable (the Cs and the V). The dispersion value for an initial demisyllable, such as the
/pli/ of the word /pliz/, would be the following. From
O to L has a distance value of 2; from O to V has
a distance of 4, while the distance from L to V is 2.
The dispersion value here would be 1/4 + 1/16 + 1/4 = .56. Now, note, for instance, that if you measure the dispersion of the CV /yu/, you get 1/1 = 1.00, a higher value than calculated for the CCV /pli/. There is a smoother and
more steady crescendo going from O, then to L, then
to V. This is not the case for the /yu/. The principle
prefers sequences of two segments that differ as much
as possible on sonority ranking. Lower dispersion
values are less marked in initial demisyllables. Sonority relates to amplitude, resonance, vocalicness, and
vocal tract opening. Sharp discontinuities in these
features are what is preferred: maximum contrast (see
Ohala, 1992, where he stresses maximum discontinuity, which to him renders the sonority principle totally
redundant, or, at best, derivative). There is very little
contrast between a glide and a vowel; they are contiguous. Even more nonpreferred would be a sequence
of two segments of the same sonority value, a flat
sequence, 'flat' meaning that there is no crescendo nor decrescendo. Other than /s/ plus another obstruent in
English, which are numerous (/s/ is often considered
extrasyllabic by some phonologists, and thus that
problem would vanish), often processes intervene to
shift the syllable structure to a more preferred situation. Often, for instance, when two vowels end up
together, a glottal stop intervenes to break up that
undesirable sequence.
In a recent study, Nickels and Howard (2004) used
a powerful statistic to discover the crucial factor correlating with word production errors (phonemic
paraphasia). They could dissociate the effects of number of segments in a word, number of syllables in
a word, and the syllable complexity of the word.
Admittedly, the three are often conflated in studies
of paraphasia that simply put the onus on word
length for degree of paraphasia. Number of segments
and number of syllables are often intercorrelated, for
instance. A greater number of syllables would allow
for the possibility of more complex syllables. The
authors' statistic was such that it leveled out syllable
complexity, showing ultimately that only number of
syllables correlated with degree of accuracy in lexical

realization. The problem, of course, is that without considering sonority and its dispersion measurement, the simple use of canonical forms to measure syllable complexity is weak at best, and as we saw above, actually wrong in many of its predictions of complexity, a CV in some cases being more complex than a
CCV.
Furthermore, note that not all CVs are of equal complexity. A /pa/ (OV) = .06; a /na/ (NV) = .11; a /ra/ (LV) = .25; and a /ya/ (GV) = 1.00.
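The dispersion calculation just described is easy to state programmatically. The following minimal Python sketch (ours) assumes the five-point scale with unit steps given above; it reproduces the values for /pli/, /yu/, and the four CVs.

```python
from itertools import combinations

SONORITY = {"O": 1, "N": 2, "L": 3, "G": 4, "V": 5}

def dispersion(demisyllable):
    """Sum the inverse squared sonority distances over all pairs of
    elements in an initial demisyllable, e.g. 'OLV' for /pli/."""
    return sum(1 / (SONORITY[b] - SONORITY[a]) ** 2
               for a, b in combinations(demisyllable, 2))

print(round(dispersion("OLV"), 2))  # 0.56  /pli/ in 'please'
print(round(dispersion("GV"), 2))   # 1.0   /yu/, /ya/
print(round(dispersion("OV"), 2))   # 0.06  /pa/
print(round(dispersion("NV"), 2))   # 0.11  /na/
print(round(dispersion("LV"), 2))   # 0.25  /ra/
```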
In two recent studies by Romani and Calabrese
(1998) and Romani et al. (2002), the principles of sonority and dispersion were closely charted in the analysis of a nonfluent patient (Romani and Calabrese,
1998), and then in the 2002 study that patient contrasted with a typical fluent paraphasic speaker, with
a CT scan demonstrating a two-year-old CVA in the
left temporoparietal area. The previous 1998 study
had clearly shown that the nonfluent errors simplified
structures clearly along the lines of predictions from
sonority: more segmental transformations creating
less complexity in terms of sonority and syllable simplifications that followed sonority predictions as well.
In that study, Romani and Calabrese importantly
emphasize that since this patient was nonfluent with
great articulatory difficulty, there is reason to believe
that sonority patterns are grounded in motor speech
execution. Christman (1992, 1994), on the other
hand, showed that neologisms in a fluent aphasic also
tend to follow the patterns of sonority. Neologistic
structures abide quite rigidly by the architecture of
sonority, and consequently they demonstrate that the
principle filters up into the phonology, or otherwise
becomes phonologized. In this way, sonority in the big
picture can be at work in both fluent and nonfluent
aphasic production. Romani et al.'s (2002) fluent subject did not show such simplification tendencies and
to that effect did not reveal as much influence from
sonority. In general, the 2002 comparison study reported the following main differences between the
nonfluent DB and the fluent MM:
1. The majority of DB's errors, but not MM's errors, gave rise to simpler syllables.
2. Most of DB's errors involved consonants; MM's did not.
3. DB's substitutions were closer to the target and were influenced by frequency.
4. DB's errors were largely paradigmatic substitutions, while MM's were more involved with linear sequencing of segments.
5. MM showed no specific tendencies toward differential errors among vowels. DB, however, made the fewest errors on /a/; /a/ is the most sonorous of the vowels, since it has the most aperture.


In addition to sonority patterns, there are metrical patterns as well, and they, too, have distinct complexity levels. The trochee pattern of Strong–Weak is the most frequent meter for two-syllable words in English and is likely a very frequent rhythmic template in all human languages. Iambs are somewhat more complex, since the initial syllables are unstressed, initiation thereby being more difficult. Note that many children's deletions, as well as aphasics', are focused on unstressed and most often reduced syllables. Goodglass (1978) was one of the first aphasia researchers to point out that utterance-initial weak stresses are abnormally difficult for many types of patients, especially the nonfluent Broca's aphasics. He extended this observation to sentence-initial unstressed function words, such as the in the phrase 'the book is on the table'. Here he noted the extreme difficulties many Broca's aphasics had with producing the initial the, and thus initiating the sentence. He pointed out, however, that the patient could much more easily produce noninitial, unstressed the, internal to the sentence. It turns out that there is a metrical account for the differential deletability of the two the function words, and that the same account works as well for explaining why children who delete unstressed schwas most often delete the initial ones in words such as banana. Metrical feet are assigned in algorithmic fashion to a stretch of syllables or words from right to left. Details aside, nana gets a trochaic foot (SW), but the first syllable is not assigned a foot, and is therefore referred to as an unfooted schwa. The theory now says that the second schwa is protected by being within a foot unit. Of the two unstressed schwas, therefore, the unfooted one is more vulnerable to deletion. At the sentential level, a similar situation arises. Going from right to left, table is a trochee; on the is a trochee; book is is a trochee as well. There is no foot that can be assigned to the initial the, so that it is unfooted. An unfooted function word such as the first article the in the sentence is therefore more vulnerable than the internal the, which is protected by the trochaic foot domain on the.
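A small sketch can make the right-to-left assignment of trochaic feet concrete. This is our toy illustration: syllabification is supplied by hand, every foot is a positional (strong, weak) pair, and real metrical parsing is sensitive to stress and weight in ways glossed over here.

```python
def assign_feet(sylls):
    """Pair syllables into trochaic (S, W) feet from right to left;
    whatever is left over at the left edge remains unfooted."""
    feet, i = [], len(sylls)
    while i >= 2:
        feet.insert(0, (sylls[i - 2], sylls[i - 1]))
        i -= 2
    return sylls[:i], feet  # (unfooted material, feet)

# 'banana': the first schwa is unfooted and hence vulnerable to
# deletion; the second is protected inside the (na, na) trochee.
print(assign_feet(["ba", "na", "na"]))
# (['ba'], [('na', 'na')])

# 'the book is on the table': the initial 'the' ends up unfooted,
# while the internal 'the' is protected within the (on, the) foot.
print(assign_feet(["the", "book", "is", "on", "the", "ta", "ble"]))
# (['the'], [('book', 'is'), ('on', 'the'), ('ta', 'ble')])
```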
In the final tally, these new findings from the phonology of the syllable, of metrics, and of the suprasegmental constraints on rhythm and cadence have allowed us to better appreciate that AOS has much to do with loss of the rhythm and cadence of speech, and that these disruptions in turn cause many of the articulatory derailments seen in that syndrome. It also allows us to appreciate even further that the phonemic paraphasias of Wernicke's and conduction aphasics take place at levels much higher in the linguistic production system. Metrical and syllabic phonology have led to the postulation of frames or templates into which contents may be inserted: phonemes or words.

The picture is one of empty slots within these templatic frames and the access of the contents to fill those
slots. Finally, each element, structure or unit may be
dissociated from any other. The syllable itself is typically considered as an encasement with slots labeled
as onset, rime, nucleus, and coda and groups of segments placed there as segments in production. It is
stressed that syllables themselves are not subject to
productive sequencing, but rather their contents are.
Levelt et al. (1999) construct both phonological/syllabic frames and metrical frames. There has been a
long-standing article of faith held by many psycholinguists, which claims that when phonemes move in
linear ordering errors they move from and to the
same syllabic constituent slots: onsets move to onset
positions, nuclei move to nuclei positions and codas
move to coda positions. This has been variously
called the syllable constraint condition. For Levelt,
this constraint is overly lopsided in that according to
his numbers 80% of English language slips-of-thetongue involve syllable onsets, but these are crucially
word onsets as well. According to Levelts numbers,
slips not involving word onsets, . . . are too rare to be
analyzed for adherence to a positional constraint
(Levelt, 1999: 21). He rules out on other principles
the nucleus to nucleus observation, claiming that
phonotactic constraints are operating here, since a
vowel ! consonant will not likely result in a pronounceable string. In addition, Levelt feels that
vowels and coda consonants operate under more
tightly controlled conditions whereby . . . segments
interact with similar segments. Phonologists have
observed that there are often fewer coda consonants
than onset consonants, which is especially true for a
language like Spanish. The number of vowels in a
language is always smaller than the total number of
consonants. In any event, there is ample evidence
(also see Shattuck-Hufnagel, 1987) that onsets are
much less tied to the syllable than are codas, which
being sister nodes of the nucleus, both dominated by
the rime, are much more glued to the vowel of the
syllable. In terms of a qualitatively different status
for word onsets, there is evidence that the word
onset position is significantly more involved in the
phonemic carryover errors in recurrent perseveration
(e.g., Buckingham and Christman, 2004).
Two new notions of the nature of segmental targets have been introduced recently. One is seen in Square (1997) with her movement formulae. These are stored in posterior left temporoparietal cortex and seem very much like the centuries-old 'memories of articulatory procedures' of Jean-Baptiste Bouillaud (1825). Targets are now understood by some as idealized gestures for sound production, and voluntary speech would involve the access of these stored gestural engrams for articulation. Both of these conjure up theories of embodiment, and both are forms of representative memory (see Glenberg, 1997 for a cogent treatment of memory and embodiment).
Goldmann et al. (2001) have analyzed the effect of the phonetic surround in the production of phonemic paraphasias in the spontaneous speech of a Wernicke's aphasic. Through the use of a powerful statistic, the authors were able to control for chance occurrence of a copy of the error phoneme either in the past context or in the future context. The idea was that there could be a context effect that caused the phonemic paraphasia to occur. Chance baselines were established, and it was found that, relative to these baselines, the error-source distances were shorter than expected for anticipatory transpositions but not for perseverative transpositions. That could be taken to mean that anticipatory errors are more indicative of an aphasia than perseveratory errors, in that this patient seemed more unable to inhibit a copy of an element in line to be produced a few milliseconds ahead. Thus, this could be taken to support the claim that anticipatory bias in phonemic paraphasia correlates with severity. The authors also observed that many but not all anticipatory errors involved word onsets, mentioned earlier in this review as vulnerable to movement or substitution. The much larger distances between error and source for perseverations could have been due to slower decay rates of activated units whereby they return to their resting states. The patient's anticipation/perseveration ratio measured intermediate between that of a nonaphasic error corpus and that of a more severe aphasic speaker. One troubling aspect of this study was that a source was counted in the context whether or not it shared the same word or syllable position as the error. This may represent a slight stumbling block in interpreting the findings, since, Levelt notwithstanding, it would imply that syllable position had no necessary effect. The authors, however, presented their findings with caution, especially so because some recent work (Martin and Dell, 2004) has demonstrated a strong correlation between anticipatory errors and normality vs. perseveratory errors and abnormality. It is a long-noted fact that slips-of-the-tongue in normal subjects are more anticipatory than perseverative. Perseveration is furthermore felt by many to be indicative of brain damage. Martin and Dell (2004) set up an anticipation ratio, which is obviously higher in normal speakers' slips. They also find that more severe aphasics produce more perseverative paraphasias than anticipations, but that the ratio increases throughout recovery such that in the later stages patients produce fewer and fewer perseverations as opposed to anticipations: the anticipatory effect grows as the patients approach normality. On the logic that the improving aphasic should move in the direction of the normal subject, the anticipation ratio should increase. It may very well turn out that the anticipatory error will ultimately serve as a metric to measure recovery over time in aphasia.
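The recovery metric suggested here can be stated compactly. The sketch below assumes the ratio is computed as anticipations over anticipations plus perseverations; both the formula's exact form and the toy error corpora are illustrative assumptions, not data or code from Martin and Dell (2004):

```python
def anticipation_ratio(errors):
    # errors: iterable of 'A' (anticipation) or 'P' (perseveration).
    a = sum(1 for e in errors if e == 'A')
    p = sum(1 for e in errors if e == 'P')
    return a / (a + p) if (a + p) else float('nan')

acute_stage = ['P', 'P', 'A', 'P', 'P', 'A']   # hypothetical severe corpus
later_stage = ['A', 'A', 'P', 'A', 'A', 'A']   # hypothetical recovered corpus
print(anticipation_ratio(acute_stage))   # 0.33: perseveration-dominant
print(anticipation_ratio(later_stage))   # 0.83: approaching the normal profile
```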

Recovery from the Production of Neologisms

The question may be, and has been, asked: neologisms, from whence? From the beginning, it was simply thought that they stemmed from a complex array of literal or phonemic paraphasias. That is, it had originally been taken as an article of faith that neologisms originated from words that had been phonemically transformed to the extent that any transparency between error and target word was obliterated. There has never been any question that this account is a logical one, especially given the prevalence of phonemic paraphasia in fluent aphasia. Since the days of Wernicke (1874), M. Allen Starr (1889), and the late 19th-century linguists, through Arnold Pick (1931/1973) and up to the present (see Buckingham, 1989), error typologies have included anticipatory errors, perseverative errors, exchanges, and substitutions of phonemes in both normal subjects (slips-of-the-tongue) and aphasics. The phonemic paraphasia theory of neology was dubbed by Buckingham (1977) the 'conduction theory,' since conduction aphasics are marked by their phonemic paraphasias. This was the theory implied in the Boston Aphasia Exam and specifically invoked in Kertesz and Benson (1970) for the neologism. Note very importantly that this account of the production of neologisms implies, if it does not say so outright, that the problem is not with the retrieval of the word but rather with the phonological realization of that word. For this theory to hold true, the target word would presumably have to have been retrieved from the lexicon, because it must serve as the input to the component that transforms it.
Another possible account of neologisms would be to claim that straight away the patient had a word block whereby no target word would be forthcoming, and that nonetheless the patient continued talking or stopped responding. The question then becomes: how in this circumstance could the patient produce what would then be a surrogate for, or a masker of, the unaccessed word? By what aspects of phonetic production could the speaker produce the surrogate? The issue was introduced into modern neurolinguistic studies in Alajouanine (1956), Kertesz and Benson (1970), Buckingham and Kertesz (1976), and Butterworth (1979). Butterworth called this second account of neology the 'anomia theory,' and suggested the metaphor of a random generator as a principled device that could produce a phonetic form in the face of retrieval failure.
Butterworth had studied with Frieda Goldman-Eisler (1968), who had analyzed large stretches of spontaneous speech and had looked closely at the on-going lexical selection processes online. She had noted time delays before the production of nouns of high information (i.e., low redundancy) in the speech of normal subjects. Time delays for her indicated the action of word search, and that search would obviously be a bit more automatic and fast to the extent that the word sought was highly redundant, therefore carrying less information. Butterworth very cleverly extended his mentor's methods to the analysis of neologistic jargon stretches of spontaneous speech in Wernicke's aphasia. What he found was extremely interesting. Before neologisms that were totally opaque as to any possible target (subsequently termed 'abstruse neologisms' by A. R. Lecours (1982)), Butterworth noted clear delays of up to 400 ms before their production. Crucially, he did not notice this delay before phonemic paraphasias, where targets could nevertheless be clearly discerned, nor before semantic substitutions, related to the target. This indicated failed retrieval for Butterworth, and he went on to suggest that perseverative processes and nonce-word production capabilities could play a role in this random generator. It was 'random' for Butterworth, since his analysis showed that the actual phonemic makeup of neologisms did not follow the typical patterns of phoneme frequency in English. He never implied that random meant helter-skelter; he knew enough about phonotactic constraints in aphasic speech. Neither did he imply that the patient actually, with premeditated intentionality, produced the surrogate. The whole issue was subsequently treated in Buckingham (1981, 1985, 1990).
Each of these accounts of neology makes different predictions concerning recovery. The conduction theory predicts that as the patient recovers, target words will slowly but surely begin to reappear. Paraphasic infiltration will lessen throughout the months, and ultimately the word forms will be less and less opaque. In the endstage of recovery, there should be no anomia. The theory also predicts that the error distributions in the acute stage will include some errors with mild phonemic transformation, others with more, others with a bit more, etc., up to the completely opaque ones, i.e., a nonbimodal distribution. The anomia theory, on the other hand, predicts a bimodal distribution, with neologisms on one end and more or less simple phonemic paraphasias on the other, and few in the middle that are severe but not severe enough to render word recognizability opaque. The anomia theory predicts that during recovery patients with that underlying problem will generally show fewer and fewer neologisms, gaining better monitoring capacity to note the neologisms, perhaps ultimately holding back the surrogate productions as a mark of improvement in the aphasia. It is also highly likely that as the patient improves, the perseverations will lessen (e.g., Martin and Dell, 2004). What is clearly predicted at the endstage, however, is that the anomia may very well remain, but now with more stammering, pausing, and halting. This, then, would be more in line with what a normal speaker might do when faced with word-finding difficulties. Unhampered by additional sequelae, this is more or less what the pure anomic will do.
These predictions remained untested in the clinic until Kohn and Smith (1994) and Kohn et al. (1996). Essentially, and more specifically in 1996, Kohn and colleagues observed and described patients who, in the acute stages of their aphasia, produced neologisms. One group recovered to mild phonemic paraphasia with no noted anomia, while the other group no longer produced neologisms in the later recovery stages. This second group of patients, when producing the neology initially, appeared to Kohn and colleagues (1996: 132) to be invoking some kind of '. . . backup mechanism for reconstructing a phonological representation when either partial or no stored phonological information about a word is made available to the production system.'
Most connectionists have an easier time with the conduction theory of neologisms and in general provide only this one account of their generation. They generally weaken, in varying degrees, the connection strengths between the lexical and the phonemic levels in their models, while keeping decay rates at normal levels. They can thus quite easily simulate a gamut of phonemic errors (and they can also simulate segmental slips-of-the-tongue), the more severe bordering on the opaque. Their paradox, however, would be that the simple paraphasias would not render target words opaque, and thus interlevel transparency would be maintained between word and phoneme. On the other hand, with a bit more connection weight reduction, opaque forms may begin to be produced, and to the extent they are opaque, interlevel transparency will disappear. Connectionists often make the claim that interlevel transparency reveals qualitatively different kinds of errors or even different kinds of patients. Some have called errors that do not show between-level communication 'stupid,' while those that nevertheless reveal interlevel connectivity have been called 'smart.' Note that both kinds of patients in Kohn and colleagues' study would start out with 'stupid' errors: neologisms have no transparency concerning some target. But, by the anomia account, the errors would remain 'stupid' even into recovery, as long as the anomia was there, unless some semantically related errors began to appear; some probably did. On the other hand, by the conduction account, the errors would start off 'stupid' but end up 'smart.' Connectionists will have to tell us whether this scenario is a puzzle for them or not. Again, most connectionist modelers opt for the conduction theory of neologisms (e.g., Gagnon and Schwartz, 1996; Hillis et al., 1999). As an example of the conduction theory for neologisms in connectionist modeling, Hillis et al. (1999: 1820) wrote, 'This proposal would account for her phonemic paraphasias (when few nontarget subword units are activated) and neologisms (when many nontarget subword units are activated).' That is to say, few phonemic errors result in simple paraphasia, where the target is not rendered opaque, while many phonemic errors result in a neologism, where the target is, indeed, rendered opaque. The problem of the neologism, however, is still with us, and especially so if we do not admit at least two error routes, perhaps even a third, but time does not permit further consideration.
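The weight-reduction mechanism appealed to here can be caricatured in a few lines. The following toy simulation is my own construction under strong simplifying assumptions (one word, independent noise, no feedback), not a model drawn from Gagnon and Schwartz (1996) or Hillis et al. (1999); it shows only that progressively weakening the word-to-phoneme weights grades output from correct production through mild paraphasia to an opaque neologism:

```python
import random
random.seed(1)

TARGET = ['t', 'ei', 'b', 'l']                      # phonemes of a target word
INVENTORY = ['t', 'ei', 'b', 'l', 'p', 'k', 's', 'i', 'o', 'n']

def produce(weight):
    # weight: lexical-to-phoneme connection strength (1.0 = intact).
    output = []
    for phoneme in TARGET:
        target_activation = weight + random.gauss(0, 0.1)
        # Background noise from the rest of the phoneme layer is
        # unaffected by the lesion to the connection weights.
        noise_activation = max(random.gauss(0.4, 0.1) for _ in INVENTORY)
        output.append(phoneme if target_activation > noise_activation
                      else random.choice(INVENTORY))
    return output

for w in (1.0, 0.6, 0.3):   # intact, mildly lesioned, severely lesioned
    print(w, produce(w))    # grades from correct form to opaque neologism
```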

Conclusions
We have considered a vast array of findings on sublexical linguistic elements, their brain locations, and their tight sensory-motor links. We have claimed that many new imaging studies have conjured up and vindicated several earlier theories laid down long before the modern technology now before us in the neurosciences. We have seen the motor-sensory interface in the functional linguistic descriptions of phoneme-level production and perception, and we have even seen that much new work with modern technology has served to sharpen our understanding at somewhat closer levels, but has nonetheless vindicated much previous thinking, back to the Haskins Labs and even further back to the classic aphasia models of the late 1800s. The focus upon the perisylvian region in the left hemisphere has not changed, and in fact there is even more growing interest in charting the anatomy and function of the arcuate fasciculus, the opercular regions, and the insula. At a slight remove from the physical system, we have discussed and compared various new findings and principles in the phonetics and phonology of segments, syllables, and meter, and how they impact on sublexical processing in aphasia. Finally, we considered the enigma of the neologism and provided evidence that there are at least two quite reasonable accounts of how neologisms may be produced and under what conditions they may appear. This leads to a consideration of how recovery from neology may take different paths as the patient improves language control. Many questions remain.

Bibliography
Alajouanine T (1956). Verbal realization in aphasia. Brain 79, 1–28.
Bouillaud J B (1825). Recherches cliniques propres à démontrer que la perte de la parole correspond à la lésion des lobes antérieurs du cerveau, et à confirmer l'opinion de M. Gall, sur le siège de l'organe du langage articulé. Archives Générales de Médecine 8, 25–45.
Buckingham H W (1977). The conduction theory and neologistic jargon. Language and Speech 20, 174–184.
Buckingham H W (1981). Where do neologisms come from? In Brown J W (ed.) Jargonaphasia. New York: Academic Press. 39–62.
Buckingham H W (1985). Perseveration in aphasia. In Newman S & Epstein R (eds.) Current perspectives in dysphasia. Edinburgh: Churchill Livingstone. 113–154.
Buckingham H W (1989). Phonological paraphasia. In Code C (ed.) The characteristics of aphasia. London: Taylor & Francis. 89–110.
Buckingham H W (1990). Abstruse neologisms, retrieval deficits and the random generator. Journal of Neurolinguistics 5, 215–235.
Buckingham H W & Kertesz A (1976). Neologistic jargon aphasia. Amsterdam: Swets & Zeitlinger.
Buckingham H W & Yule G (1987). Phonemic false evaluation: theoretical and clinical aspects. Clinical Linguistics and Phonetics 1, 113–125.
Buckingham H W & Christman S S (2004). Phonemic carryover perseveration: word blends. Seminars in Speech and Language 25, 363–373.
Butterworth B (1979). Hesitation and the production of verbal paraphasias and neologisms in jargon aphasia. Brain and Language 8, 133–161.
Christman S S (1992). Abstruse neologism formation: parallel processing revisited. Clinical Linguistics and Phonetics 6, 65–76.
Christman S S (1994). Target-related neologism formation in jargon aphasia. Brain and Language 46, 109–128.
Foundas A L, Eure K F, Luevano L F & Weinberger D R (1998). MRI asymmetries of Broca's area: the pars triangularis and pars opercularis. Brain and Language 64, 282–296.
Gagnon D & Schwartz M F (1996). The origins of neologisms in picture naming by fluent aphasics. Brain and Cognition 32, 118–120.
Gandour J, Wong D, Hsieh L, Weinzapfel B, VanLancker D & Hutchins G D (2000). A cross-linguistic PET study of tone perception. Journal of Cognitive Neuroscience 12, 207–222.
Glenberg A M (1997). What memory is for. Behavioral and Brain Sciences 20, 1–55.
Goldmann R E, Schwartz M F & Wilshire C E (2001). The influence of phonological context in the sound errors of a speaker with Wernicke's aphasia. Brain and Language 78, 279–307.
Goldman-Eisler F (1968). Psycholinguistics: experiments in spontaneous speech. London: Academic Press.
Goodglass H (1978). Selected papers in neurolinguistics. München: Wilhelm Fink Verlag.
Hartley D (1749/1976). Observations on man: his frame, his duty and his expectations. Delmar, NY: Scholars' Facsimiles & Reprints.
Hickok G (2000). Speech perception, conduction aphasia, and the functional neuroanatomy of language. In Grodzinsky Y, Shapiro L P & Swinney D (eds.) Language and the brain: representation and processing. Multidisciplinary studies presented to Edgar Zurif on his 60th birthday. San Diego: Academic Press. 87–104.
Hillis A, Boatman D, Hart J & Gordon B (1999). Making sense out of jargon: a neurolinguistic and computational account of jargon aphasia. Neurology 53, 1813–1824.
Hsieh L, Gandour J, Wong D & Hutchins G (2001). Functional heterogeneity of inferior frontal gyrus is shaped by linguistic experience. Brain and Language 76, 227–252.
Hughlings-Jackson J (1866). Notes on the physiology and pathology of language: remarks on those cases of disease of the nervous system, in which defect of expression is the most striking symptom. Medical Times and Gazette 1. Reprinted in Brain 38. Also in Taylor J (ed.) (1958). Selected writings of John Hughlings-Jackson, vol. II. London: Hodder & Stoughton.
Kertesz A & Benson D F (1970). Neologistic jargon: a clinicopathological study. Cortex 6, 362–386.
Kimura D (1982). Left-hemisphere control of oral and brachial movements and their relation to communication. Philosophical Transactions of the Royal Society of London B 298, 135–149.
Kohn S & Smith K (1994). Distinctions between two phonological output disorders. Applied Psycholinguistics 15, 75–95.
Kohn S, Smith K & Alexander M (1996). Differential recovery from impairment to the phonological lexicon. Brain and Language 52, 129–149.
Lecours A R (1982). On neologisms. In Mehler J, Walker E & Garrett M (eds.) Perspectives on mental representation: experimental and theoretical studies of cognitive processes and capacities. Hillsdale, NJ: Erlbaum. 217–247.
Levelt W J M, Roelofs A & Meyer A S (1999). A theory of lexical access in speech production. Behavioral and Brain Sciences 22, 1–75.
Liepmann H (1905). Die linke Hemisphäre und das Handeln. Münchener Medizinische Wochenschrift 49, 2322–2326, 2375–2378. Translated into English by Kimura D (1980) in Translations from Liepmann's essays on apraxia. Research Bulletin #506. London, Ontario: Dept. of Psychology, University of Western Ontario.
Martin N & Dell G (2004). Perseverations and anticipations in aphasia: primed intrusions from the past and future. Seminars in Speech and Language 25, 349–362.
Nickels L & Howard D (2004). Dissociating effects of number of phonemes, number of syllables, and syllabic complexity on word production in aphasia: it's the number of phonemes that counts. Cognitive Neuropsychology 21, 57–78.
Ohala J (1990). There is no interface between phonology and phonetics: a personal view. Journal of Phonetics 18, 153–171.
Ohala J (1992). Alternatives to the sonority hierarchy for explaining segmental sequential constraints. In Ziolkowski M, Noske M & Deaton K (eds.) Papers from the 26th Regional Meeting of the Chicago Linguistic Society, vol. 2: The parasession on the syllable in phonetics and phonology. Chicago: Chicago Linguistic Society. 319–338.
Pick A (1931). Aphasia. In Handbuch der normalen und pathologischen Physiologie, 15. Heidelberg: Springer-Verlag. Translated into English and edited by Brown J W (1973). Aphasia. Springfield, IL: Charles C. Thomas.
Romani C & Calabrese A (1998). Syllable constraints on the phonological errors of an aphasic patient. Brain and Language 64, 83–121.
Romani C, Olson A, Semenza C & Grana A (2002). Patterns of phonological errors as a function of a phonological versus an articulatory locus of impairment. Cortex 38, 541–567.
Shattuck-Hufnagel S (1987). The role of word-onset consonants in speech production planning: new evidence from speech error patterns. In Keller E & Gopnik M (eds.) Motor and sensory processes of language. Hillsdale, NJ: Erlbaum. 17–51.
Square P A, Roy E A & Martin R E (1997). Apraxia of speech: another form of praxis disruption. In Rothi L J G & Heilman K M (eds.) Apraxia: the neuropsychology of action. Hove, East Sussex, UK: Psychology Press. 173–206.
Starr M A (1889). The pathology of sensory aphasia, with an analysis of fifty cases in which Broca's center was not diseased. Brain 12, 82–102.
Uttal W R (2001). The new phrenology: the limits of localizing cognitive processes in the brain. Cambridge, MA: The MIT Press.
Wernicke K (1874). The aphasia symptom complex: a psychological study on an anatomic basis. Breslau: Cohn & Weigert. Translated into English by Eggert G H (1977). Wernicke's works on aphasia: a sourcebook and review. The Hague: Mouton. 91–145.


Phonological Phrase
M Nespor, University of Ferrara, Ferrara, Italy
2006 Elsevier Ltd. All rights reserved.

Introduction
Connected speech is not just a sequence of segments
occurring one after the other. It also reflects an analysis into hierarchically organized constituents ranging
from the syllable and its constituent parts to the utterance (Selkirk, 1984; Nespor and Vogel, 1986).
Two elements that form a constituent of a certain
level have more cohesion between them than two
elements that straddle two constituents of the same
level. For example, in the Italian example in (1), the
syllables ra and men are closer together in (1a) than
in (1b), and in both cases they are closer than the
same syllables in (1c).
(1a) veramente
truly
(1b) era mente sublime
(it) was (an) exceptional mind
(1c) Vera mente sempre
Vera lies all the time

Phonological constituents reflect morphosyntactic


structure in that cohesion at the phonological level
often (though not necessarily) signals constituency at
the morphosyntactic level.
The cues that make us perceive two units as closer
together than two units of a higher order are various
and both segmental and prosodic in nature. That is, a
segment may undergo a change in its features within a
constituent of a certain type, but not outside of it, and
several types of prosodic phenomena can mark the
edges of constituents. The first type of phenomenon is
exemplified in (2): in some varieties of Standard Italian, an /s/ is voiced intervocalically only if it belongs to a phonological word, as in (2a), but not if it straddles two phonological words, as with the compounds and the clitic groups exemplified in (2b) and (2c), respectively.
(2a) ba[z]e, ca[z]a, me[z]e
basis, house, month
(2b) capo[s]ezione, para[s]ole
station-head, sunshade
(2c) lo [s]o, dice[s]i
(I) know it, one says

Another example of segmental change bound to a constituent is vowel harmony, e.g., in Turkish, exemplified in (3): the last vowel of a word's stem determines the quality of all vowels of the word's suffixes, but of no vowel that is external to the word.

(3a) ev
eve
evde evden
evler evlerden
house to . . . in . . . from . . . PL. from . . . PL.
(3b) ada
adaya
adada
adadan
adalar
island to . . .
in . . . from . . . PL.
adalardan
from . . . PL.
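The harmony pattern in (3) can be sketched as a rule keyed to the last stem vowel. The fragment below is an illustrative reduction of my own that covers only the plural suffix and only the front/back dimension; real Turkish harmony is richer (rounding harmony, further suffixes):

```python
FRONT_VOWELS = set('eiöü')
BACK_VOWELS = set('aıou')

def plural(stem):
    # The last stem vowel decides the suffix vowel: -ler after front
    # vowels, -lar after back vowels, matching evler vs. adalar in (3).
    last_vowel = [c for c in stem if c in FRONT_VOWELS | BACK_VOWELS][-1]
    return stem + ('ler' if last_vowel in FRONT_VOWELS else 'lar')

print(plural('ev'))   # evler
print(plural('ada'))  # adalar
```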

The second type of phenomenon, marking an edge of a prosodic constituent, can be exemplified with word stress, which in many languages occurs in a fixed position, as seen in (4) and (5) on the basis of Turkish and Hungarian, where stress systematically falls on the last and on the first syllable, respectively.

(4) Turkish: arkadaş, kelebek, beklemek, sivrisinek
    friend, butterfly, to wait, mosquito
(5) Hungarian: káposzta, katica, burgonya, fekete
    cabbage, ladybird, potato, black

The phonological phrase is one of the phrasal constituents of the phonological, or prosodic, hierarchy,
and we will see that it is crucial in the signaling of
syntactic constituency and, additionally, in providing
cues to the value of a basic syntactic parameter. Like
the other constituents of the prosodic hierarchy, it is
signaled both by phenomena bound to it and by edge
phenomena. These cues are used both in the processing of speech by adults, for example to disambiguate
certain sentences with an identical sequence of words
but different syntactic structures, and in language
acquisition by infants.

The Domain of the Phonological Phrase


The domain of the phonological phrase (φ) extends from one edge of a syntactic phrase to, and including, its head. There are thus two possibilities: in languages in which the phonological phrase starts from the left edge of a syntactic phrase (X″), it ends at the right edge of the phrasal head (X); in languages in which it starts from the right edge of a syntactic phrase, it ends at the left edge of the phrasal head. These two possibilities are illustrated in (6a) and (6b), where the material from the phrase edge through the head constitutes the domain of the phonological phrase in head-complement and complement-head languages, respectively.

(6a) X″[ ... [ ]X ...... ]
(6b) [ ...... X[ ] ... ]X″

Is it possible to predict which of the two options is chosen for a given language? It has been proposed that it is indeed, and that it depends on the relative location of heads and complements, i.e., on the direction of recursivity of a given language: in languages in which heads precede their complements, the first option is chosen; in languages in which heads follow their complements, the second option is chosen. The general definition of the domain of φ is given in (7), where C stands for clitic group, the constituent that dominates the phonological word and that includes one lexical head and all clitics phonologically dependent on it (Nespor and Vogel, 1986).

(7) The domain of φ consists of a C which contains a lexical head (X) and all Cs on its nonrecursive side up to the C that contains another head outside of the maximal projection of X.
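Definition (7) invites a procedural reading. The sketch below is a deliberately simplified construal of it (clitic groups are collapsed to single words, and the restructuring discussed below is omitted); it is offered to make the directionality concrete, not as the authors' algorithm:

```python
# A simplified procedural reading of definition (7). Assumptions: each
# clitic group is a single word, and 'heads' lists the lexical heads.
def phi_phrases(words, heads, head_initial=True):
    phrases, current = [], []
    for w in words:
        if not head_initial and w in heads and current:
            # Complement-head language: phi = head plus clitic groups on
            # its nonrecursive (right) side, so a head opens a new phi.
            phrases.append(current)
            current = []
        current.append(w)
        if head_initial and w in heads:
            # Head-complement language: phi = head plus clitic groups on
            # its nonrecursive (left) side, so a head closes the phi.
            phrases.append(current)
            current = []
    if current:
        phrases.append(current)
    return phrases

# Head-complement (French-like): a boundary falls after each head.
print(phi_phrases(['la', 'belle', 'fille'], {'fille'}))
# -> [['la', 'belle', 'fille']]
# Complement-head (Turkish-like): [benim] and [için] come out as
# separate phis here; the restructuring discussed below may merge them.
print(phi_phrases(['benim', 'için'], {'için'}, head_initial=False))
# -> [['benim'], ['için']]
```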

Thus, in both types of languages, one edge of the phrasal head and the opposite edge of the phrase are marked prosodically. That is, prosodic cues signal either the left edge of the phrase and the right edge of the head, or the left edge of the head and the right edge of the phrase. In both types of languages, phonological phenomena signal a certain level of cohesion among the elements that belong to the phonological phrase. One such phenomenon, in some central and southern varieties of Italian, is the lengthening of word-initial consonants. Specifically, if a word starts with a consonant other than /s/ followed by a consonant, this initial consonant is lengthened if preceded by a word ending in a stressed vowel. The phenomenon takes place if both words belong to the same phonological phrase, and it does not apply if a phonological phrase boundary intervenes between the two, as exemplified in (8).

(8a) [sarà [f:]atto]φ (*[f]) '(it) will be done'
(8b) [farà]φ [[t]ante torte]φ (*[t:]) '(I) will make many cakes'

In the same prosodic context, i.e., only when the two words are included in one phonological phrase, clash avoidance applies in many languages, a phenomenon that in the case of two adjacent syllables bearing word primary stress (a stress clash) destresses the first of the two, with the possible addition of a stress on a syllable to its left. This phenomenon is exemplified in (9) and (10) for Italian and English, respectively.

(9) Italian
(9a) [sarà fátto]φ → [sára fátto]φ '(it) will be done'
(9b) [farò]φ [tánte tórte]φ *[fáro]φ [tánte tórte]φ
(10) English
(10a) [thirtéen bóoks]φ → [thírteen bóoks]φ
(10b) [I introdúce]φ [Gréek books]φ *[I íntroduce]φ [Gréek books]φ
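The clash configuration and its repair can be rendered schematically. The fragment below is an illustrative sketch of my own: it simply destresses the first member of the clash and leaves out the optional leftward stress shift that the rule allows:

```python
# Syllables are (text, stressed?) pairs within one phi-phrase.
def avoid_clash(phi):
    out = list(phi)
    for i in range(len(out) - 1):
        if out[i][1] and out[i + 1][1]:   # adjacent primary stresses
            out[i] = (out[i][0], False)   # destress the first of the two
    return out

# thirteen books: 'teen' and 'books' clash, so 'teen' is destressed
# (the retraction onto 'thir-' seen in (10a) is not modeled here).
print(avoid_clash([('thir', False), ('teen', True), ('books', True)]))
```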

In some languages, and in specific, syntactically defined cases, i.e., in head-complement languages, when the syntactic head at the right edge of a φ is followed by a φ that exhaustively includes the head's complement or modifier, the two φs can be restructured into a single one. Thus, (11) contrasts with (10b), since in (11) Greek is a nonbranching complement of introduce, while in (10b) it is simply part of the complement.

(11) [I introdúce]φ [Gréek]φ → [I íntroduce Gréek]φ

The same holds for complement-head languages, but in this case φ restructuring takes place in the other direction: the φ containing a head at its left edge optionally includes the φ containing a nonbranching complement or modifier to its left.
Phonological phrases have also been shown to undergo final lengthening in many languages (e.g., Beckman and Edwards, 1990; Wightman et al., 1992; Cambier-Langeveld, 2000) and to be crucial for determining the anchoring of the tones of intonational melodies to the text (Hayes and Lahiri, 1991).
Why do we need a phonological constituent analysis into phonological phrases? Since phrasal phonological phenomena have the function of signaling syntactic structure, the alternative would be to state that certain syntactic constituents are the domain of phonological phenomena. One of the reasons to have a separate hierarchy to account for phonological phenomena is that the domains of phonological and (morpho-)syntactic constituents are not always isomorphic (Selkirk, 1984; Nespor and Vogel, 1986; Hayes, 1989; Truckenbrodt, 1999).
One reason for nonisomorphism lies in the fact that phonological structure is much shallower than syntactic structure, the latter, not the former, instantiating a recursive system. In addition, a constituent in the syntactic tree can be attached at different levels. For example, the two meanings of a sentence like I have seen the president with the binoculars have different syntactic structures according to the level at which the constituent with the binoculars is attached, but only one phonological structure, hence its ambiguity.
A second reason for nonisomorphism between the syntactic and the phonological structure at the φ level originates in the restructuring rule: in the syntactic tree, the word Greek is attached in a similar way to the word introduce in the two sentences in (10b) and (11). Yet in one case the two words are included in the same phonological phrase and in the other case they are not.
A third reason for nonisomorphism is the phonological constituency of closed-class items that precede or follow the head in right- and left-recursive languages, respectively. Thus, in many varieties of American English, a word like because is not part of the nominal phrase to its right and yet it is included in the same phonological phrase, as seen from the fact that clash avoidance takes place in (12).

(12) I came [becáuse Jóhn]φ could not come → . . . bécause Jóhn . . .

Besides the nonisomorphism argument, there is another reason to justify the prosodic hierarchy, and thus the phonological phrase, as a separate level of representation: the influence of syntax on phonology is restricted to just those constituents. No other domains for postlexical phonology are predicted to exist.
We can thus draw the conclusion that the phonological phrase must be a constituent of grammar because it defines a domain that is phonologically marked in many ways (of which more below), and it is not necessarily isomorphic to any syntactic constituent. It is identifiable because it is the domain of application of several phonological phenomena in different languages and, in certain cases, it disambiguates two sentences with the same sequence of words, such as those in (13).

(13a) Ho comprato delle fotografie di città [v]ecchie
      '(I) bought old pictures of cities'
(13b) Ho comprato delle fotografie di città [v:]ecchie
      '(I) bought pictures of old cities'

The lengthening of the initial consonant of vecchie in the second example, where vecchie refers to città, indicates that di città vecchie is a restructured φ, while its absence in the first example, where vecchie refers to fotografie, indicates that vecchie is a φ on its own: it cannot restructure with the preceding word because, though nonbranching, it is not its modifier.

Relative Prominence Within the Phonological Phrase
Within each constituent of the phonological hierarchy, relative prominence is assigned to its daughter constituents: one of the constituents is marked as strong and the others as weak. That is, within each constituent it is defined which edge is strong (i.e., most prominent, e.g., more stressed) and which is weak. Thus within a foot, either the leftmost or the rightmost syllable is strong; within a phonological word, either the leftmost or the rightmost foot is strong (Liberman and Prince, 1977).
The location of the strong element dominated by the phonological phrase has been proposed to depend on the value of the head-complement parameter, i.e., on the order of words within a syntactic phrase (Nespor and Vogel, 1986). In head-complement languages, the strongest element within a φ is the rightmost; in complement-head languages it is the leftmost. Thus, the element dominated by φ that bears the main prominence is either the rightmost or the leftmost depending on the recursive side of a specific language: in right-recursive languages, e.g., French, the strongest element is rightmost; in left-recursive languages, e.g., Turkish, it is leftmost, as exemplified in (14) and (15), respectively (the strongest words are moi and fille in (14), benim and güzel in (15)). French and Turkish are used in the examples because these languages are similar both in word primary stress location and in syllabic complexity.

(14a) [pour moi] 'for me'
(14b) [la belle fille] 'the beautiful girl'
(15a) [benim için] 'for me' ('me for')
(15b) [güzel kadın] '(the) beautiful woman'

Prominence at this level thus signals the value of an important syntactic parameter, which defines the recursive side of a phrase: in a language like French, an utterance will be a sequence of iambs, i.e., a weak syllable followed by a strong one (or, depending on the number of words it contains, of anapests, i.e., two weak syllables followed by a strong one); in a language like Turkish, an utterance will be a sequence of trochees, i.e., one strong syllable followed by a weak one (or of dactyls, i.e., one strong syllable followed by two weak ones).
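The proposed link between the head-complement parameter and φ-level rhythm amounts to a one-line rule. The following sketch is a constructed illustration, not code from the cited work:

```python
def phi_prominence(phi, head_complement=True):
    # The strong element is rightmost in head-complement languages,
    # leftmost in complement-head languages.
    strong = len(phi) - 1 if head_complement else 0
    return [(w, 'S' if i == strong else 'W') for i, w in enumerate(phi)]

print(phi_prominence(['pour', 'moi']))                           # iambic: W S
print(phi_prominence(['benim', 'için'], head_complement=False))  # trochaic: S W
```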
It should be noted that while in an unrestructured φ main stress always falls on the syntactic head, this is not so in restructured φs. In this case, the most prominent element is either the head's nonbranching complement or its nonbranching modifier. The relative prominence within phonological phrases thus does not directly signal the head of a phrase but, more abstractly, whether heads are preceded or followed by their complements.

The Function of the Phonological Phrase in Speech Processing and Language Acquisition
Given that the analysis of a string into phonological phrases can, in many cases, disambiguate potentially ambiguous sentences, it is plausible that its cues are used online in the processing of speech. It has in fact been shown that the location of phonological phrase boundaries affects lexical access in experiments with English-learning infants of 10 and 13 months of age (Gout et al., 2004). Infants were familiarized with two bisyllabic words and then presented with sentences that either contained one of the words or contained the same two syllables separated by a φ boundary. While 10-month-olds do not show any preference, 13-month-olds prefer the utterances containing the familiarization words, indicating that they hear the difference between two syllables that do or do not contain a φ boundary between them.
Recent results with adults suggest that in French, phonological phrase boundaries are used online to constrain lexical access (Christophe et al., 2004). Participants' reactions were slowed down by local lexical ambiguities within phonological phrases, but not across them. Thus, a target word chat ('cat') was responded to more slowly in son chat grincheux ('his grumpy cat'), where chagrin is also a word, than in son chat drogué ('his drugged cat'), where no competitor word starts with chad-. Instead, when the word chat was followed by a phonological phrase boundary, participants responded equally fast in both conditions, the condition containing a potential competitor and the one not containing any, e.g., chat in [son grand chat] [grimpait aux arbres] ('his big cat climbed up trees') and in [son grand chat] [dressait l'oreille] ('his big cat pricked up its ears'). That is, the competitor word chagrin is not recognized, thus showing that phonological phrase boundaries constrain lexical access online.
Given that the rhythmic pattern within phonological phrases reflects the value of the head-complement parameter, it has been proposed that this signal might be used by infants in the bootstrapping of the syntax of their language of exposure (Nespor et al., 1996): infants exposed to an iambic rhythm may deduce that in their language heads precede their complements; infants exposed to a trochaic rhythm may deduce that in their language heads follow their complements. The first step in showing the feasibility of this hypothesis has been made in an experiment showing that 6- to 12-week-old infants discriminate two languages solely on the basis of the location of prominence within phonological phrases (Christophe et al., 2003). French and Turkish were chosen because they have a similar syllabic structure and both languages have word-final stress. In addition, the sentences were delexicalized (all segments that share manner of articulation were reduced to a single place of articulation: stops to [p], fricatives to [s], and so on), so that no segmental information could give a cue to the specific language. Infants discriminate French from Turkish sentences containing branching φs but fail to discriminate sentences containing only nonbranching φs, where the different rhythmic pattern at the phonological phrase level, iambic or trochaic, is not realized.

See also: Phonological Words; Phrasal Stress; Prosodic Aspects of Speech and Language; Prosodic Cues of Discourse Units; Prosodic Morphology.

Bibliography
Beckman M E & Edwards J (1990). Lengthenings and shortenings and the nature of prosodic constituency. In Kingston J & Beckman M E (eds.) Papers in laboratory phonology I: between the grammar and physics of speech. Cambridge: Cambridge University Press. 152–178.
Cambier-Langeveld T (2000). Temporal marking of accents and boundaries. HIL Dissertations.
Christophe A, Guasti M T, Nespor M, Dupoux E & van Ooyen B (1997). Reflections on phonological bootstrapping: its role for lexical and syntactic acquisition. Language and Cognitive Processes 12, 585–612.
Christophe A, Nespor M, Dupoux E, Guasti M-T & van Ooyen B (2003). Reflexions on prosodic bootstrapping: its role for lexical and syntactic acquisition. Developmental Science 6(2), 213–222.
Christophe A, Peperkamp S, Pallier C, Block E & Mehler J (2004). Phonological phrase boundaries constrain lexical access: I. Adult data. Journal of Memory and Language 51, 523–547.
Gout A, Christophe A & Morgan J L (2004). Phonological phrase boundaries constrain lexical access: II. Infant data. Journal of Memory and Language 51, 548–567.
Hayes B (1989). The prosodic hierarchy in meter. In Kiparsky P & Youmans G (eds.) Phonetics and phonology: rhythm and meter. New York: Academic Press. 201–260.
Hayes B & Lahiri A (1991). Bengali intonational phonology. Natural Language and Linguistic Theory 9, 47–96.
Liberman M & Prince A (1977). On stress and linguistic rhythm. Linguistic Inquiry 8(2), 249–336.
Nespor M, Guasti M T & Christophe A (1996). Selecting word order: the Rhythmic Activation Principle. In Kleinhenz U (ed.) Interfaces in phonology. Berlin: Akademie Verlag. 1–26.
Nespor M & Vogel I (1986). Prosodic phonology. Dordrecht: Foris.
Selkirk E O (1984). Phonology and syntax: the relation between sound and structure. Cambridge, MA: MIT Press.
Truckenbrodt H (1999). On the relation between syntactic phrases and phonological phrases. Linguistic Inquiry 30, 219–255.
Wightman C W, Shattuck-Hufnagel S, Ostendorf M & Price P J (1992). Segmental duration in the vicinity of prosodic phrase boundaries. Journal of the Acoustical Society of America 91, 1707–1717.


Phonological Typology
M Hammond
2006 Elsevier Ltd. All rights reserved.
This article is reproduced from the previous edition, volume 6, pp. 3124–3126, 1994, Elsevier Ltd.

Phonological typology is a classification of linguistic systems based on phonological properties. There are four basic kinds of typology: areal or genetic typologies; typologies based on surface phonological properties; typologies based on some underlying phonological property; and parametric typologies. Examples of each of these are reviewed below.
In addition, phonological typology can refer to a classification of the elements that make up a phonological system. For example, articulatory descriptors like velar and labial form part of a typology of speech sounds. Such a typology can be based on the surface elements of a language or on deeper constructs. These kinds of typologies are discussed as well.

Genetic and Areal Typology

A genetic typology is one based on the developmental relationships within language groups, for example, Germanic versus Slavic, or Bantu versus Indo-European. Such relationships are not in and of themselves phonological, but often coincide with phonological similarities. These similarities arise from phonological properties that stem from the common period, but also from shared innovations that occur after the languages separate. These shared innovations are a consequence of the structural pressures engendered by the phonological system created in the shared period of evolution.
For example, the Slavic languages can be distinguished as a genetic type within the Indo-European language family. The Slavic languages are characterized by a number of properties, including phonological properties. In the phonological domain, they exhibit a set of three palatalizations that applied in the common period. These palatalizations front consonants in the environment of an adjacent front vowel. The chart in Figure 1 shows the results of the three palatalizations for velar consonants adjacent to front vowels. Interestingly, some of the daughter languages have undergone subsequent palatalizations as well. For example, Russian velars before front vowels are palatalized, as indicated in Figure 1.
The Germanic languages within Indo-European can also be distinguished. These languages can be characterized by the consonantal shifts underlying Grimm's and Verner's Laws. As with Slavic, many of the daughter languages have undergone subsequent similar shifts. The chart in Figure 2 shows the results of the traditional formulation of Grimm's Law for velar consonants (Proto-Indo-European > Germanic). It also shows the results of the Standard German (High German) consonant shift (Germanic > High German). Notice how the affrication of the voiceless velar partially recurs in the two shifts (k → x and k → kx/k).
Phonological typologies based on genetic relationships are thus useful in that they have led to an investigation of recurring perseverative developments like those above.
An areal typology is based on geographic proximity. Such relationships also often coincide with phonological properties. For example, many languages of
the Indian subcontinent have murmured consonants,
irrespective of whether they belong to the same family
or not. African languages frequently have tonal contrasts. These phonological similarities are thus in
some cases a consequence of common origins (genetic
similarity), but in other cases they are borrowed from
neighboring languages.

Surface Typology
Surface typologies are built on surface properties of phonological systems. For example, click languages are those languages that have sounds produced with a velaric ingressive airstream. Stress languages have variations in prominence marked on different syllables.
It is in the domain of phonological inventories that some of the most intensive work on surface typology has been done. In the tradition inaugurated by Joseph Greenberg (and continued by his students), work on phonological typology has gone hand in hand with the development of phonological universals (see Phonological Universals). Investigating the basic types of phonological inventories has led to universals regarding those inventories. Maddieson (1984: 69) cited the following example of a universal: 'The presence of a voiceless, laryngealized or breathy voiced nasal implies the presence of a plain voiced nasal with the same place of articulation.' This universal results from investigating the typology underlying languages with nasal consonants, which leads one to ask whether the distribution of those segments is free or whether there are restrictions that would result in an interesting typology. If Maddieson's universal is correct, then it makes sense to distinguish languages containing voiceless, laryngealized, or breathy voiced nasals from languages without those sounds. It makes sense because languages of the former sort must necessarily have plain voiced nasals with the same place of articulation.
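Implicational universals of this kind reduce to a set-inclusion check. The snippet below is a constructed illustration with invented mini-inventories; none of it comes from Maddieson's survey data:

```python
def satisfies_nasal_universal(nasals):
    """nasals: set of (place, phonation) pairs, phonation being 'plain'
    or a marked value such as 'voiceless' or 'breathy'. The universal
    holds if every place with a marked nasal also has a plain one."""
    marked_places = {p for p, phon in nasals if phon != 'plain'}
    plain_places = {p for p, phon in nasals if phon == 'plain'}
    return marked_places <= plain_places

print(satisfies_nasal_universal({('bilabial', 'voiceless'),
                                 ('bilabial', 'plain')}))       # True
print(satisfies_nasal_universal({('bilabial', 'voiceless')}))   # False: no plain /m/
```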
As a second example, consider whether languages containing the sound /qʷʼ/ (a voiceless labialized ejective uvular stop) should be distinguished from languages without such a sound. If something follows from this distinction, then it makes sense to make it. Maddieson (1984) proposed the following universal: if a language has /qʷʼ/, it also has /q/ and /kʷʼ/. Such a universal motivates the basic typology.

Underlying Typology
Typologies can be built on underlying properties of phonological systems as well. For example, three basic kinds of suprasegmental systems can be distinguished underlyingly: accent, tone, and stress. Accent languages, like Japanese, mark surface pitch contrasts underlyingly with a diacritic mark associated with particular syllables (perhaps even a linked tone, as in a tone language, or metrical structure, as in a stress language). The surface patterns are arrived at by associating a specific melody with the diacritically marked word. Archetypical tone languages are also typified by surface pitch contrasts, but have a different underlying structure: in a tone language, the underlying representation of a word is associated with a tonal melody. Stress languages, like English, are typified by metrical structure assignment at some point in the derivation (see Metrical Phonology). However, at the surface, stresses may be realized by pitch contrasts, as in Greek, resulting in a superficial conflation of all three systems. That is, the three kinds of systems may not be distinguishable phonetically. The underlying structural contrasts are indicated with the hypothetical words in Figure 3. In Figure 3, the accent language marks a particular syllable; the tone language includes a specific tonal melody; the stress language includes no underlying specification.

Parametric Typology
One of the most exciting developments in phonological typology has been parametric typology. A parametric typology is built on parameters, which correspond to the points where languages can exhibit variation. In this respect, parametric typologies are like other typologies. However, parameters are also supposed to correspond to the particular choices that a child must make in learning his or her language.
One parameter that has been proposed is directionality of syllabification. This parameter has several consequences with respect to the size of onset clusters and the placement of epenthetic vowels. Languages exhibiting left-to-right syllabification can be distinguished from those exhibiting right-to-left syllabification. The diagram in Figure 4 shows how different syllabifications are achieved when CCVC syllables are aligned from left to right and from right to left.
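A toy implementation makes the effect of this parameter visible. The CCVC-maximal template and the greedy matching below are assumptions chosen for illustration; the point is only that parsing direction changes where the syllable breaks, and hence potential epenthesis sites, fall:

```python
import re

# A toy directional syllabifier over C/V skeleta.
def syllabify(skeleton, left_to_right=True):
    if left_to_right:
        return re.findall(r'C{0,2}VC?', skeleton)
    # Right-to-left: parse the reversed string with the mirror-image
    # template, then undo both reversals.
    matches = re.findall(r'C?VC{0,2}', skeleton[::-1])
    return [m[::-1] for m in matches][::-1]

print(syllabify('CCVCVC', left_to_right=True))   # ['CCVC', 'VC']
print(syllabify('CCVCVC', left_to_right=False))  # ['CCV', 'CVC']
```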
Another example of a parameter is headedness in metrical constituents. In most versions of the metrical theory of stress, the head of a constituent occurs on either the left or the right edge of a constituent, as in Figure 5. The x's at the lower level of the hierarchy mark syllables; the x's at the higher level mark headship (see Metrical Phonology). Headedness has consequences with respect to the initial assignment of stress and to subsequent stress shifts that arise when a vowel becomes unstressable for some reason. As with the directionality of syllabification, headedness corresponds to one of the choices that language learners supposedly make.

Typologies of Elements
Typologies can also be applied to the elements of phonological systems. The simplest case of this is distinctive feature theory. A feature like [high] imposes a classification on speech sounds, as in Figure 6. The feature [high] provides for a typology of speech sounds in any language (see Distinctive Features).
Typologies have been applied to other machinery of generative grammar as well. For example, the ordering of any two phonological rules can be categorized in terms of whether the first rule creates environments for the second (feeding) or eliminates potential cases for the second (bleeding) (see Rule Ordering and Derivation in Phonology). This gives rise to a typology of ordering relationships. The rules themselves have been typologized as well: for example, lexical versus postlexical, and structure-building versus structure-changing in the theory of lexical phonology (see Lexical Phonology and Morphology).
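The feeding relation can be shown with two invented rules applied in both orders; the rules themselves are a constructed toy, not an attested grammar:

```python
import re

raise_final = lambda w: re.sub(r'e$', 'i', w)   # e -> i / __ # (word-finally)
palatalize = lambda w: w.replace('ti', 'tSi')   # t -> tS / __ i

word = 'mate'
# Raising first creates a 'ti' sequence for palatalization: feeding order.
print(palatalize(raise_final(word)))   # 'matSi'
# Palatalization first finds no 'ti', so raising counterfeeds it.
print(raise_final(palatalize(word)))   # 'mati'
```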

Evaluating Typologies

In and of itself, classifying the objects of inquiry is of little interest. A typology is of use only to the extent that it aids in the construction of a theory or follows from a theory. A good typology is one where the proposed distinction between phonological systems or elements correlates with other phonological properties. Specifically, a typology is highly valued to the extent that it provides a classificatory scheme based on a multiplicity of factors as opposed to just a few.

See also: Distinctive Features; Lexical Phonology and Morphology; Metrical Phonology; Phonological Universals; Rule Ordering and Derivation in Phonology.

Bibliography
Maddieson I (1984). Patterns of sounds. Cambridge: Cambridge University Press.

Phonological Universals
M Hammond, University of Arizona, Tucson, AZ, USA
2006 Elsevier Ltd. All rights reserved.

Introduction
To understand what a phonological universal is, we must first agree on what phonology is. In point of fact, much of the divergence between different views of phonological universals stems from this question. A very simple characterization of phonology would be the study of the sound systems of language. This definition leaves a number of issues open and is therefore a reasonable, albeit ambiguous, starting point. Given this definition, a phonological universal is a statement about how sound systems may and may not vary across the languages of the world.
There have been a number of approaches to phonological universals over the history of phonology, and this article surveys some of the more influential ones of the last 50 years.

Greenbergian Universals
Greenbergian universals are named for the work of Joseph Greenberg and his collaborators (see, for example, Greenberg, 1978). Here are a few examples:
1. All languages have vowels.
2. All languages have consonants.
3. No language has an apicovelar stop.
4. The presence of a voiceless, laryngealized, or breathy voiced nasal implies the presence of a plain voiced nasal with the same place of articulation (Maddieson, 1984).
5. All languages have some syllables that begin with consonants (Jakobson, 1962).
6. All languages have some syllables that end with vowels (Jakobson, 1962).
7. Nearly all languages have /i, a, u/ (Maddieson, 1984).
8. Front vowels are usually unrounded, back vowels are usually rounded (Maddieson, 1984).



There are a number of properties that are indicative of greenbergian universals. First, their logical structure is transparent. For example, the first universal given above can be interpreted directly as a logical restriction with universal scope, e.g., for all x, if x is a language, then x has vowels: ∀x (L(x) → V(x)).
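Schematically, and in the same notation, the absolute and implicational patterns can be written as below; the predicate letters P and Q for the implicational case are ours, chosen for illustration.

    \forall x\,\bigl(L(x) \rightarrow V(x)\bigr)                % absolute: universal 1
    \forall x\,\bigl((L(x) \land P(x)) \rightarrow Q(x)\bigr)   % implicational: e.g., universal 4, with
                                                                % P = 'has a laryngealized nasal' and
                                                                % Q = 'has a plain voiced nasal'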
A second general property of greenbergian universals is that they usually apply to relatively superficial elements of a phonological system. For example, greenbergian universals typically, though not exclusively, refer to elements of the phonetic, not the phonological, inventory. Likewise, greenbergian universals are typically not formulated over relatively abstract phonological elements like features, rules, constraints, or prosodic elements, but are typically formulated over segments or classes of segments. (The examples above that refer to syllables or the phonemic inventory are thus counterexamples.)
Finally, greenbergian universals are typically arrived at on the basis of empirical typological work, rather than by deduction from some theory; they are thus more empiricist than rationalist on that philosophical continuum.
All of these features are relatively uncontroversial and can be used to argue the virtues or shortcomings of this approach to phonological universals. For example, the fact that these types of universals are typically formulated to refer to superficial entities can be taken as a virtue, since the properties of relatively superficial entities are more amenable to direct examination. Thus, universals over such entities are more falsifiable. On the other hand, one might argue that universals over superficial entities are less plausible, on the assumption that the cognitive elements that underlie those entities, and phonology generally, are of a significantly different character.
Notice that the list above also includes statistical generalizations, e.g., generalizations that include qualifiers like 'nearly' or 'usually'. These obviously cannot
be interpreted in the same way as absolute restrictions
and are not as readily falsified. For example, a universal
that says that all languages have vowels can be falsified
by finding a language that has no vowels. On the
other hand, finding a language that does not have
/i, a, u/ does not falsify a universal that says that most
languages have those sounds. Rather, to falsify a statistical universal, one must present an appropriate
sampling of languages that violate the universal.
Since the statistical universal was presumably already determined by such a sample, it follows that, to falsify it, a sample of similar or greater size must be assembled. (Though see Bybee, 1985 for another view.)
Statistical greenbergian universals are therefore troubling because they are harder to falsify.

Rule-based Universals
The advent of generative phonology in the late 1960s
came with a different approach to phonological universals (Chomsky and Halle, 1968: henceforth SPE).
This approach to phonological universals can
be characterized as rule-based or formal(ist).
The theory is formalized as a notation for writing
language-specific phonological generalizations, or
rules. The basic idea behind the approach to universality in this theory is that the restrictions of the notation
are intended to mirror the innate capabilities of the
language learner. Thus, if a rule can be written with
the notation, then the theory is claiming that that is a
possible rule. If a rule cannot be written with the
notation, then it is an impossible rule.
In addition, this theory posited an evaluation metric whereby the set of possible rules could be compared with each other. It was thought at the time that
rules/analyses that were simpler in terms of the notation were more natural. They would show up more
frequently in languages, and they were preferred by
the language learner if more than one formalization
could describe the available data.
Feature theory provides a simple example of how
this theory made claims about universality. Vowel
height is treated in terms of two binary-valued features: [high] and [low]. The following chart shows
how these features can be used to group vowel heights
in different ways in a hypothetical five-vowel system.
(1) class         vowels        features
    high          i, u          [+high]
    mid           e, o          [-high, -low]
    low           a             [+low]
    high & mid    i, u, e, o    [-low]
    mid & low     e, o, a       [-high]
    high & low    i, u, a       ?
Notice how this simple theory does not provide a way to group high and low vowels together, excluding mid vowels (unless we explicitly allow for disjunctive feature specification). This disparity would seem to make the prediction that, if phonological generalizations must be couched in terms of the feature theory, then we would expect to find no phonological generalizations that are restricted to high and low vowels, but not mid vowels. Things are more complex, as we'll see below.
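The gap can be verified mechanically. The sketch below (ours) enumerates every class that a single conjunction of [±high] and [±low] values can pick out in the five-vowel system of (1); the set {i, u, a} never appears in the output.

    # Sketch: which vowel-height classes can a single conjunction of
    # [±high] and [±low] pick out? (Hypothetical five-vowel system.)
    from itertools import product

    vowels = {"i": (1, 0), "u": (1, 0),   # (high, low)
              "e": (0, 0), "o": (0, 0),
              "a": (0, 1)}

    # A specification fixes each feature to + (1), - (0), or leaves it
    # unspecified (None).
    for high, low in product((1, 0, None), repeat=2):
        cls = {v for v, (h, l) in vowels.items()
               if (high is None or h == high) and (low is None or l == low)}
        print(f"[high={high}, low={low}] -> {sorted(cls)}")

    # No printed class is {a, i, u}: high and low vowels cannot be grouped
    # to the exclusion of mid vowels by one conjunctive specification.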
Let's now consider the relative markedness of rules. Consider a hypothetical example like the following.
(2) [+vocalic] → [+nasal] / __ [+nasal]

This rule nasalizes a vowel before a nasal segment.

(3) [+vocalic] → [+nasal] / __ [+nasal, +coronal]


This rule nasalizes a vowel before a nasal segment that is coronal.

The evaluation metric maintains that if both analyses are consistent with some set of data, then the learner will choose the first. The metric is based on a simple counting of elements in the rules. Since the first rule has fewer feature references than the second, it is preferred.
In terms of acquisition, this theory would entail that children overgeneralize, rather than overrestrict. In terms of typology, we would expect, all else being equal, that phenomena that can be described more simply using this formalism will occur more frequently than phenomena that require a more complex formal description.
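A minimal rendering of the metric as symbol counting follows; this is an idealization (ours), since SPE counts the symbols of the full structural description rather than a flat feature list.

    # Sketch of the SPE evaluation metric as symbol counting: a rule is
    # represented here simply as a list of its feature references.
    rule_2 = ["+vocalic", "+nasal", "+nasal"]               # rule (2)
    rule_3 = ["+vocalic", "+nasal", "+nasal", "+coronal"]   # rule (3)

    def cost(rule):
        """Fewer feature references = simpler = preferred."""
        return len(rule)

    preferred = min((rule_2, rule_3), key=cost)
    print(preferred is rule_2)   # True: (2) is chosen when both fit the data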
In retrospect, we can see that the rule formalism is extremely powerful; it allows for an infinite number of types of rules. It also allows a language to have an unbounded number of actual rules. For example, the generalization above about vowel height can be evaded by positing two separate rules referring to vowel height. If high and low vowels are nasalized in some context, but mid vowels are not, the pattern can be accommodated by making use of curly braces:
(4) [+vocalic, {+high, +low}] → [+nasal] / __ [+nasal]

or by positing two separate nasalization rules:

(5) [+vocalic, +high] → [+nasal] / __ [+nasal]

(6) [+vocalic, +low] → [+nasal] / __ [+nasal]

These latter analyses are not as highly valued by the evaluation metric as the first one, but they are possible. Hence, the putative universal is not absolute, but a claim about acquisition and/or markedness, viz., all else being equal, language learners will avoid generalizations that group high and low vowels together to the exclusion of mid vowels, and such generalizations will be relatively rare crosslinguistically.
There are several other aspects of generative era universals that are worth pointing out. First, as we have seen, these generalizations tend to hold of more abstract entities, e.g., rules. While this application may be more appealing in terms of our notions of how phonological cognition might work, it does render these generalizations less falsifiable. It would seem far easier to look for a language with an apicovelar stop than to find a language that has a rule nasalizing low and high vowels before a nasal segment.
Another very important aspect of generative universals of this era is that they were typically not proposed based on direct typological investigation. Rather, it was supposed that detailed examination of any single language, English in the case of SPE, would lead to reasonable hypotheses about the shape of universal grammar and phonological (and syntactic) universals.

Nonlinear Phonology
In the early 1970s, phonological theory underwent
a shift, from a focus on the linear rules of SPE to
a focus on richer nonlinear representations for suprasegmental phenomena, i.e., stress, tone, syllable
structure, and prosodic morphology. The richer
representations required for these phenomena were,
in fact, later extended to account for traditionally
segmental phenomena as well.
This change in the focus of phonology entailed a
concomitant shift in the form of phonological universals. While previously the focus had been on the form
of rules, now attention shifted to the form of representations. Here are a few examples of universals that
stem from this era.
1. No-Crossing Constraint: autosegmental association lines cannot cross.
2. Sonority Hierarchy: syllables must form a sonority
peak.
3. Foot Theory: stress is assigned by building from a
restricted set of metrical feet.
The No-Crossing Constraint (Goldsmith, 1979) holds that tonal autosegments, elements initially representing pitches on a separate tier of representation, are linked to segmental elements via association lines. That linking partially respects linear order in that the lines connecting these two strings of elements may not cross each other. Thus, the first representation below for the hypothetical word bádàga is a legal one, but the second is not. (Acute accent represents a high tone and grave accent a low tone; H represents a high autosegment and L a low one.)
(7) [association-line diagrams not reproduced]

This constraint is intended as an absolute restriction on possible phonological representations. (Though see Bagemihl, 1989 for a more restrictive view of what this constraint holds over.)
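The constraint is easy to state computationally. In the sketch below (our encoding), an association line is a pair of tier positions, and two lines cross exactly when their tone order and their segment order disagree.

    # Sketch: checking the No-Crossing Constraint. An association is a pair
    # (tone position, segment position); two lines cross iff one pair's
    # tone order and segment order are inverted relative to the other's.
    from itertools import combinations

    def crosses(assocs):
        return any((t1 - t2) * (s1 - s2) < 0
                   for (t1, s1), (t2, s2) in combinations(assocs, 2))

    # H and L on the tonal tier, vowels indexed on the segmental tier.
    legal = [(0, 0), (1, 1)]      # H->V1, L->V2: lines parallel
    illegal = [(0, 1), (1, 0)]    # H->V2, L->V1: lines cross

    print(crosses(legal), crosses(illegal))   # False True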
The Sonority Hierarchy has been around for many years, but is elaborated in nonlinear generative terms most clearly by Steriade (1982). The basic idea is that each syllable must constitute a sonority peak, where sonority refers to the intrinsic loudness of the segments involved (Chomsky and Halle, 1968). Given that segments like [a], [r], and [t] are ranked in that order in terms of their sonority, it follows from the Sonority Hierarchy that a string like [atra] can be syllabified in either of the first two ways indicated below, but not as the third.
(8) [a.tra]   [at.ra]   *[atr.a]

The last syllabification given would have a syllable composed of [atr], with two higher-sonority elements separated by a lower-sonority element. This instance is ruled out by the Sonority Hierarchy.
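A sketch of the peak requirement, using an invented three-step sonority scale covering just [a], [r], and [t]:

    # Sketch: every syllable must rise to a single sonority peak and then
    # fall. The numeric scale is a toy (a > r > t).
    SONORITY = {"a": 3, "r": 2, "t": 1}

    def is_peak_shaped(syllable):
        son = [SONORITY[seg] for seg in syllable]
        peak = son.index(max(son))
        rising = all(son[i] < son[i + 1] for i in range(peak))
        falling = all(son[i] > son[i + 1] for i in range(peak, len(son) - 1))
        return rising and falling

    for parse in (["a", "tra"], ["at", "ra"], ["atr", "a"]):
        print(parse, all(is_peak_shaped(syl) for syl in parse))
    # ['a', 'tra'] True; ['at', 'ra'] True; ['atr', 'a'] False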
The Sonority Hierarchy is not as restrictive as one
would hope. Languages differ in the sonority that they
attribute to the same segments and in how elaborated
the hierarchy is. In addition, languages allow for a
variety of mechanisms whereby segments can be left
unsyllabified or partially syllabified in peripheral
positions.
Finally, the third example above is Foot Theory (Halle and Vergnaud, 1977; McCarthy, 1982; Hayes, 1981, 1995). The basic idea is that stress is assigned by building prosodic constituents across a span. For example, alternating stress in a language like English is achieved by building binary left-headed units across the word; thus a word like Àpalàchicóla would be footed as below.
(9) (Àpa)(làchi)(cóla)

The claim for universality comes from the fact that any stress system must be described using such feet, and the fact that only a restricted set of feet is available. Metrical stress theory is a complex domain, but the basic logic is simple: there is a restrictive syntax of footing available and all stress systems should be describable using that syntax.
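The following sketch (ours) builds binary left-headed feet from the left edge over a hand-syllabified word; it deliberately ignores main-stress assignment, extrametricality, and the rest of the parametric theory.

    # Sketch: alternating stress via binary left-headed (trochaic) feet
    # built left-to-right; an apostrophe marks each foot's stressed head.
    def foot(syllables):
        feet, i = [], 0
        while i < len(syllables):
            feet.append(tuple(syllables[i:i + 2]))   # binary foot (final unary allowed)
            i += 2
        return [("'" + s if j == 0 else s) for ft in feet for j, s in enumerate(ft)]

    print(foot(["a", "pa", "la", "chi", "co", "la"]))
    # ["'a", 'pa', "'la", 'chi', "'co", 'la']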
Nonlinear approaches to phonological universals
grow out of enriched phonological representations,
attention to suprasegmental phenomena, and increased typological focus. They exhibit the following
properties.
First, like linear generative universals, they are
rooted in a notational scheme. Typically, they are expressed as constraints that govern what a phonological representation may look like, rather than as a rule
notation. Like linear SPE-style universals, the basic
idea is that the notational restrictions mirror our
cognitive limits in learning a phonological system.
Those notational restrictions on representations thus
embody phonological universals.

Second, in contrast to linear generative universals, they are arrived at through typological investigation. What constitutes a reasonable sample is quite indeterminate, however. In some cases, simply adding a new language to the set of languages discussed in the literature has been sufficient to posit new universals. In other cases, a more compendious approach is taken. For example, Hayes (1981) was a comprehensive investigation of a large number of stress systems. The latter is obviously the desideratum, though it's not clear if it is always possible (or always necessary).

Third, most nonlinear universals are sharp constraints on what is possible, rather than preferences expressed as an evaluation metric. This paradigm is clearly a step forward, as sharp universals, rather than tendencies, are of course more falsifiable.

Optimality-Theoretic Universals
Phonological universals have a rather different character in Optimality Theory (OT) (Prince and Smolensky,
1993; McCarthy and Prince, 1993).
Let's exemplify this approach with basic syllable structure. The ONSET constraint requires that syllables have onsets. The NOCODA constraint requires that syllables have no codas. These two constraints militate for unmarked syllables and interact with other constraints that militate for lexical distinctness: PARSE and FILL. (It is convenient to describe the theory before the development of Correspondence Theory [McCarthy and Prince, 1995].)
In a language like English, PARSE and FILL outrank
ONSET and NOCODA, allowing onsetless syllables and
codas to arise when present lexically. In a language
like Hawaiian, where onsets are required and codas
ruled out entirely, the ranking is the other way around.
The following constraint tableau shows how this hierarchy works for English with a word like [ks].
(10) [constraint tableau not reproduced]

Since PARSE and FILL are ranked highest, any candidate that violates one or both of them is ruled
out. Consequently, ONSET and NOCODA are
ranked lower, and can be violated. The upshot is
that a form like [ks] surfaces as is, with a coda and
without an onset.
It works the other way around in Hawaiian.


(11) [constraint tableau not reproduced]

Here, the ranking is the other way around. As a result, modifications to the form are preferred to violating ONSET and NOCODA, giving rise to [t].
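Strict domination can be modeled as lexicographic comparison of violation profiles. The sketch below uses our own toy candidate set and hand-entered violation counts (for an input along the lines of /ak/), not Prince and Smolensky's actual tableaux; it shows the two rankings selecting different winners.

    # Sketch: strict-domination evaluation. PARSE is violated by deletion,
    # FILL by epenthesis; counts are entered by hand for illustration.
    CANDIDATES = {
        "ak":    {"ONSET": 1, "NOCODA": 1, "PARSE": 0, "FILL": 0},
        "ʔa.kə": {"ONSET": 0, "NOCODA": 0, "PARSE": 0, "FILL": 2},
        "a":     {"ONSET": 1, "NOCODA": 0, "PARSE": 1, "FILL": 0},
    }

    def optimal(ranking):
        # lexicographic comparison of violation vectors = strict domination
        return min(CANDIDATES, key=lambda c: [CANDIDATES[c][k] for k in ranking])

    print(optimal(["PARSE", "FILL", "ONSET", "NOCODA"]))  # 'ak': English-type
    print(optimal(["ONSET", "NOCODA", "PARSE", "FILL"]))  # 'ʔa.kə': Hawaiian-type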
Prince and Smolensky (1993) gave this analysis as
an optimality theoretic derivation of what they term
the Jakobsonian syllable typology, the claim that all
languages have some syllables that end in vowels and
some syllables that begin in consonants (Jakobson,
1962). This universal pattern follows from their account. If the only constraints that enforce markedness
patterns for onsets and codas are ONSET and NOCODA,
then this result will be the case.
Here, two traditional implicational universals follow from the existence of only certain kinds of constraints. Other universals follow from other aspects of OT. Recall that the basic structure of the theory is that there is a fixed set of universal constraints. All phonological variation follows from language-particular ranking of those constraints. The theory also allows for universal ranking of some constraints. Some universals follow from universal ranking of relevant constraints.
The example Prince and Smolensky used to demonstrate this is syllabification in Imdlawn Tashlhiyt Berber (Tachelhit) (Dell and Elmedlaoui, 1985). The basic observation is that syllabification in this language is subject to a principle whereby less sonorous segments prefer to be onsets and more sonorous segments prefer to be nuclei or peaks. This pattern is subordinated to other principles, e.g., medial syllables must have onsets.
The fact that sonority plays the role indicated
is shown by forms like [tzMt], rather than [tZmt].
(Capital letters indicate syllable peaks.) Onsets can be
simple or complex in this context and the preferred
form differs in that its peak is more sonorous.
To capture these generalizations, Prince and Smolensky propose top-ranked ONSET and a set of constraints on sonority: *P/t ≫ *P/m ≫ *P/u ≫ *P/a. These constraints penalize different segments as syllable peaks. The basic idea is that these constraints are ranked by sonority, with constraints against low-sonority segments as peaks outranking constraints against high-sonority segments as peaks. This captures the generalization that the choice of what segments can be peaks is governed by sonority.

This explanation captures the universal that the Sonority Hierarchy governs syllabification if all constraints referring to sonority in the context of syllabification are universally ranked in this fashion. (This
idea that constraints and universal constraint rankings are grounded in phonetic properties like sonority
also showed up in Grounded phonology [Archangeli
and Pulleyblank, 1994].) That is, while language
variation is generally captured via constraint ranking,
not all constraint ranking is free. Since constraints
referring to sonority are fixed in their respective ranking, it follows that all phonological variation will
respect the Sonority Hierarchy.
Summarizing the approach to universals in OT, universals of two sorts are posited: specific constraints and specific constraint rankings. These universals have a more typological basis, though the data that have been adduced have not come from explicit language surveys. Unlike previous generative approaches, OT-style universals are not representational. Instead, the focus is on the form of constraints, rather than the form of representations. (This shift is an interesting swing of the pendulum in that attention has swung from rules to representations to constraints [Anderson, 1985].)

Evaluating Universals
The approaches to phonological universals reviewed
above are quite varied, but we can distinguish them
along the following parameters.
First, these approaches differ to the degree that they
are formalized or notational. Generative approaches
have typically been of this sort, though they have
varied in whether universals govern rules, representations, or constraints. Greenbergian universals have
typically not been formal in this sense.
Second, these approaches have differed in the extent to which universals have been based on explicit
typological sampling. Early greenbergian universals
were based on such samples, but early generative universals were not. Over the evolution of generative
phonology, there has been more and more typological basis to universals, though work like Hayes (1981)
or Hayes (1995) still stands as an exception, rather
than the rule. (It is interesting to compare the work of
Hayes to an early paper by Hyman, 1977. The latter is
a preliminary typological investigation of stress from
an essentially greenbergian perspective.)
Third, approaches to universals have differed in
terms of abstractness. Greenbergian universals have
typically been concrete, applying to elements that are
more or less observable directly. Early generative
universals were applicable to more abstract entities,
and thus less easily falsified on direct examination
of language phenomena.


There are a number of issues that we have not addressed yet, though. For example, how restrictive are different approaches to phonological universals? All else being equal, we would like our universals to limit the set of possible languages as much as possible. Greenbergian universals are quite amenable to this sort of calculation. (See Hammond et al., 1989 for more discussion of this sort of analysis.) For example, a universal that says all languages have some property x, or that no languages have that property, is in effect 100% restrictive: all languages in any potential sample must have, or not have, the relevant property.
An implicational greenbergian universal is far less
restrictive. Consider a universal of the following form:
if a language has property x, then it also has property y.
This statement is far less restrictive in that, if a language in the sample does not have property x, then it
is irrelevant to the generalization. If we were to make
the assumption that x is randomly distributed, we
could say that such a universal is 50% restrictive.
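The arithmetic behind these figures can be checked by enumeration, as in this short Python sketch (ours; 'restrictive' is taken in the simple sense just described):

    # Sketch: with two binary properties x and y, four language types are
    # logically possible; 'if x then y' rules out exactly the (x, not-y) type.
    types = [(x, y) for x in (True, False) for y in (True, False)]

    implication = [(x, y) for (x, y) in types if not x or y]
    print(len(implication), "of", len(types), "types survive")   # 3 of 4

    # If x is randomly distributed, only the half of the sample with x is
    # constrained at all; this is the sense of '50% restrictive' above.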
We've already seen that, abstractness and other issues aside, the universals of early generative phonology were a lot more difficult to assess in this way. How would we assess, in as theory-neutral a fashion as possible, the restrictiveness of the claim that low and high vowels cannot be grouped together into a single rule?
On the other hand, optimality theoretic universals can be assessed in these terms. For example, the fact that onsets are subject to the ONSET constraint (and no others) entails the phonological universal that Prince and Smolensky discussed: all languages have some syllables with onsets.
There are two issues that we have not fully
addressed yet, however. As noted above, Prince and
Smolensky (1993) argued that the Sonority Hierarchy
is grounded in the phonetic definition of sonority.
Other authors have elevated this definition to a
general claim about constraints in OT: all constraints must be grounded in phonetics (Archangeli
and Pulleyblank, 1994). If such a claim can be maintained, it would add to the falsifiability of universals, but in a novel direction. A putative universal, expressed as a constraint or as a fixed constraint ranking, can be tested by constructing an appropriate language sample or by examining the phonetic underpinnings of the proposal.
On the other hand, the proposal leaves open the
following question. If some phonological fact can be
explained by some phonetic fact, should the latter
be formalized (redundantly) in the phonology? An
alternative view would have it that phonetic explanation should be strictly separated from phonological
explanation.
The other question that has been left open is what to do about statistical generalizations. We have seen examples of these in both the greenbergian and early generative frameworks. In the former, we saw universals that included caveats like 'most' or 'often'. In the latter, we saw that the effect of the evaluation metric is a preference for certain structures or patterns.
In Optimality Theory, statistical universals can surface in two ways. First, some scholars have proposed that certain constraint rankings should characterize the child language learner's initial state. These rankings would only be changed on direct experience. In the absence of relevant experience, the default rankings would persevere to adulthood (Smolensky, 1996).
On such a view, all else being equal, we might expect
languages consistent with the default rankings to be
more frequent.
Another way that statistical universals might be expressed in OT is with a version of OT that exhibits statistical or stochastic ranking, rather than strict ranking (Boersma, 1998; Hammond, 2003). The basic idea would be that a statistical universal would be expressible by universally ranking some constraint A above some other constraint B. Since the framework exhibits statistical ranking, this relationship wouldn't be absolute, but would be probabilistic or stochastic. Such an approach makes the interesting prediction that in OT, statistical generalizations can only be captured with statistical ranking, rather than by positing certain constraints rather than others, e.g., ONSET rather than NOONSET.
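A minimal sketch of stochastic ranking in this spirit follows; the constraint names, mean ranking values, and noise level are invented, and Boersma's actual learning algorithm is not reproduced.

    # Sketch: each constraint has a fixed mean ranking value; at evaluation
    # time a noisy copy is drawn, so a universal mean ranking A > B yields
    # a probabilistic rather than absolute preference.
    import random

    MEANS = {"A": 100.0, "B": 95.0}   # A universally outranks B on average

    def sample_ranking(noise=2.0):
        drawn = {c: random.gauss(m, noise) for c, m in MEANS.items()}
        return sorted(drawn, key=drawn.get, reverse=True)

    trials = 10_000
    a_first = sum(sample_ranking()[0] == "A" for _ in range(trials)) / trials
    print(a_first)   # close to 1 but not equal to it: a statistical universal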
See also: Generative Phonology; Linguistic Universals,
Chomskyan; Linguistic Universals, Greenbergian; Phonological Typology; Phonology: Optimality Theory; Phonology: Overview.

Bibliography
Anderson S R (1985). Phonology in the twentieth century:
theories of rules and theories of representations. Chicago:
University of Chicago Press.
Archangeli D & Pulleyblank D (1994). Grounded phonology. Cambridge: MIT Press.
Bagemihl B (1989). The crossing constraint and backwards
languages. Natural Language and Linguistic Theory 7,
481–549.
Boersma P (1998). Functional phonology. Ph.D. diss.,
University of Amsterdam.
Bybee J L (1985). Morphology. Philadelphia: John Benjamins
Press.
Chomsky N & Halle M (1968). The sound pattern of
English. New York: Harper & Row.
Dell F & Elmedlaoui M (1985). Syllabic consonants and
syllabification in Imdlawn Tashlhiyt berber. Journal
of African Languages and Linguistics 7, 105–130.
Goldsmith J (1979). Autosegmental phonology. New York:
Garland.



Greenberg J H (ed.) (1978). Universals of human language:
phonology 2. Stanford: Stanford University Press.
Halle M & Vergnaud J-R (1977). Metrical structures in
phonology: a fragment of a draft. MIT manuscript.
Hammond M (2003). Phonotactics and probabilistic ranking. In Carnie A, Harley H & Willie M (eds.) Formal
approaches to function in grammar: in honor of Eloise
Jelinek. Amsterdam: John Benjamins. 319–332.
Hammond M, Moravcsik E & Wirth J (1989). The explanatory role of language universals. In Hammond M,
Moravcsik E & Wirth J (eds.) Studies in syntactic typology.
John Benjamins. 1–22.
Hayes B (1981). A metrical theory of stress rules. New
York: Garland.
Hayes B (1995). Metrical stress theory. Chicago: University
of Chicago Press.
Hyman L (1977). On the nature of linguistic stress. In
Hyman L (ed.) Studies in Stress and Accent 4. USC.
37–82.

Jakobson R (1962). Selected writings 1: Phonological studies. The Hague: Mouton.


Maddieson I (1984). Patterns of sounds. Cambridge: Cambridge University Press.
McCarthy J (1982). Formal problems in semitic phonology
and morphology. Bloomington: IULC.
McCarthy J & Prince A (1993). Prosodic morphology. U.
Mass. manuscript.
McCarthy J & Prince A (1995). Faithfulness and reduplicative identity. In Beckman J, Dickey L & Urbanczyk S
(eds.) U. Mass. Occasional Papers in Linguistics 18:
Papers in Optimality Theory. 249–384.
Prince A & Smolensky P (1993). Optimality Theory. U.
Mass and U. of Colorado. manuscript.
Smolensky P (1996). On the comprehension/production
dilemma in child language. Linguistic Inquiry 27,
720–731.
Steriade D (1982). Greek prosodies and the nature of
syllabification. Ph.D. diss., MIT.

Phonological Words
I Vogel, University of Delaware, Newark, DE, USA

2006 Elsevier Ltd. All rights reserved.

Introduction

The Phonological Word (or Prosodic Word) is located within the phonological hierarchy between the constituents defined in purely phonological terms (i.e., mora, syllable, foot) and those that involve a mapping from syntactic structure (i.e., clitic group, phonological phrase, intonational phrase, utterance). The phonological word (PW) itself involves a mapping from morphological structure onto phonological structure. (See, among others, Selkirk, 1980, 1984, 1986, 1996; Nespor and Vogel, 1982, 1986.)

The PW is motivated on the grounds that it (a) correctly delimits the domain of phonological phenomena (i.e., phonological rules, stress assignment, phonotactic constraints), (b) comprises a domain distinct from those defined in morphosyntactic terms, and (c) has consistent properties across languages. The Morphosyntactic Word (MW), by contrast, fails to meet these criteria, typically providing strings that are either too small (e.g., the, a) or too large (e.g., re-randomizations, dishwasher repairman). (See Hall and Kleinhenz, 1999 for papers addressing these issues in a variety of languages.)

Delimitation of the Phonological Word

The PW typically comprises at least a morphological stem. In addition, we must consider the place of affixes, clitics, and the members of compounds.

The Phonological Word and Compounds

While the members of compounds constitute a single MW, the phonological phenomena associated with a single member tend not to apply across the members.

Consider a phonotactic constraint of Dutch: schwa may not be preceded and followed by the same consonant, except /s/ and /n/. We thus find eik[ə]l 'acorn', but not *eik[ə]k, and the derived word sted-[ə]ling 'city dweller' but not *kaal-[ə]ling 'bald person'. The constraint is not observed, however, across the members of a compound (e.g., vestibul[ə] lamp 'hall lamp') (Booij, 1999: 57).

Similarly, northern Italian s-Voicing (SV) (i.e., /s/ → [z] / V __ [+son]) operates within a morpheme (e.g., e[z]atto 'exact'), and between a stem and a following derivational or inflectional suffix (e.g., genero[z]-ità 'generosity'; genero[z]-e 'generous, f. pl.'), but not across the members of a compound (e.g., *porta-[z]apone 'soap dish') (cf. Nespor and Vogel, 1986).

Furthermore, while primary stress is assumed to be a property of a word, we often observe more than one stress in compounds (i.e., one on each member). The fact that one of the stresses is typically assigned relatively more prominence is due to a distinct stress rule operating on the entire compound, for example, the well-known compound stress rule of English, which stresses the first word (e.g., water buffalo).

It is thus clear that each member of a compound must form its own PW.



The Phonological Word and Affixes

Certain affixes participate with a stem in a unified phonological way; however, others do not. The former, but not the latter, are considered part of the PW with the stem.

Consider the well-known distinction in English between the stress-shifting and stress-neutral affixes. While the former participate in word stress assignment, the latter do not. For example, stress falls on the first syllable in cóurage, but shifts when certain affixes are present (e.g., couráge-ous). In other cases, however, additional affixes do not affect the stress of a stem (e.g., sávage and sávage-ly).
Another example is northern Italian SV, which was seen above to apply within a string consisting of a stem and following suffixes. It does not, however, apply across a prefix and a stem (e.g., *ri-[z]alutare 're-greet'), indicating that, differently from the suffixes, prefixes must not be included in the PW.
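The domain sensitivity of SV lends itself to a short sketch (ours; the voicing context is simplified to plain intervocalic position, and PW membership is supplied by hand):

    # Sketch: a PW-domain rule. /s/ voices between vowels inside a PW;
    # the rule never applies across a PW boundary, so prefixes, clitics,
    # and other compound members are unaffected.
    VOWELS = set("aeiou")

    def s_voice(pw):
        out = list(pw)
        for i in range(1, len(out) - 1):
            if out[i] == "s" and out[i - 1] in VOWELS and out[i + 1] in VOWELS:
                out[i] = "z"
        return "".join(out)

    def apply_sv(pws):
        return [s_voice(pw) for pw in pws]     # each PW treated separately

    print(apply_sv(["generosita"]))        # ['generozita']: stem + suffixes
    print(apply_sv(["ri", "salutare"]))    # unchanged: prefix outside the PW
    print(apply_sv(["porta", "sapone"]))   # unchanged: compound members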
There is significant overlap between the affixes included within the PW and the '+ boundary' or 'level 1' affixes of earlier models of phonology; the excluded affixes are typically '# boundary' or 'level 2' affixes. A crucial difference between the approaches, however, lies in the ordering of affixes inherent in the earlier models, since all boundary/level 1 affixation and rules were presumed to apply before those of subsequent levels, often resulting in ordering paradoxes.

Since morphological ordering of affixes is not relevant to PW structure, such paradoxes do not arise: a string is simply subject to a given (PW) phenomenon if it is within the PW domain, and not otherwise.
Consider the morphological structure of undesirability: [[un[[desir]V abil]Adj]Adj ity]N. The first affix added to the stem desire is -able (-abil here). This is considered a level 1 suffix since it is often adjacent to a root and is followed by other level 1 suffixes (e.g., -ity), although it is not stress-shifting (cf. Selkirk, 1982). The next affix added is the level 2 prefix un-. The problem arises with the subsequent addition of -ity, a stress-shifting level 1 affix, which should not be permitted to follow the addition of a level 2 affix. In terms of the PW, however, this is not problematic: both -able and -ity form part of the PW with the root, while un- does not, yielding the phonological structure: un[desir-abil-ity]PW. The claim inherent in this structure is that stress and any other phonological phenomena that apply to the PW domain will include the two suffixes, but not the prefix.
Similarly, paradoxes such as transformational grammarian (i.e., [[transformational grammar]N ian]N) are avoided. While morphologically the level 1 stress-shifting suffix -ian is added to the entire compound, phonologically it is simply included with the adjacent stem in a PW; the first member of the compound forms its own PW: [transformational]PW [grammar-ian]PW.
The Phonological Word and Clitics

While clitics, a category that includes various types of function words, are MWs, they do not exhibit the same phonological properties as PWs (cf. among others Hall, 1999). Indeed, such words tend not to constitute minimal (bimoraic) word structures, as discussed below, and they do not bear stress.

They may also exhibit different phonotactic constraints, such as the interdental fricative in English. We find the voiceless fricative at the beginning of content words, clearly PWs (e.g., [θ]ing, [θ]ink), while at the beginning of function words, at least some of which would be considered clitics, we find the voiced fricative (e.g., the [ðə], them [ðɛm] or [ðəm]) (cf. Booij, 1999, among others).
Since clitics do not qualify as PWs, they have in some cases been analyzed as part of the PW with the stem (e.g., Inkelas, 1989; Selkirk, 1986, 1996). Problems arise, however, where the clitics do not, in fact, participate in the phonological phenomena of the PW. For example, in Italian, SV fails to apply between a stem and either a proclitic or an enclitic (e.g., *ti=[z]aluto '(I) greet you'; *guardando=[z]i 'looking at oneself', where = indicates the juncture between a stem and a clitic).
Slightly more complex cases arise where it appears that clitics do participate in phonological phenomena with the stem. Consider the Lucanian dialect of Italian (cf. Peperkamp, 1997). As in Standard Italian, primary stress falls on one of the last three syllables of a word. When certain enclitics are added, however, stress always appears on the penultimate syllable, regardless of the original stress of the stem and the number of clitics:

(1a) /vn:e le/ → [ven:l:e]  'sell it'
(1b) /ra me le/ → [ram:l:e]  'give me it'
(1c) /man:ate me le/ → [man:ate m l:e]  'send me it'
(Peperkamp, 1997: 191)

Since the enclitics participate in stress assignment, it is argued that they are part of the PW. The stress patterns are different, however, for PWs without clitics and strings with clitics. By labeling the two structures as the same type of constituent, we obscure the generalizations about Lucanian stress: (a) that word (PW) stress may not fall farther left than the antepenultimate syllable and (b) that certain types of strings with enclitics may only exhibit penultimate stress. Thus, while clitics may participate in phonological phenomena with a stem, the phenomena are different from those observed within the PW.



Phonological Word Domain

We can now provide a simple mapping of morphological structure onto the PW:

Phonological Word Domain
The PW consists of a single morphological stem plus any relevant affixes.

The crucial implications of this statement are that (a) the members of a compound form their own PWs, (b) only affixes, but not clitics, may be part of the PW, and (c) the inclusion of affixes in a PW is language-specific, determined either by systematic criteria (e.g., in Italian suffixes but not prefixes are included) or by idiosyncratic marking. We must now consider the organization of these elements within the phonological hierarchy.

Geometry of the Phonological Word

Minimality and the Strict Layer Hypothesis

It has been suggested that the PW is universally subject to a Minimality Condition: it must consist of at least one foot, where a foot must consist of two moras (e.g., McCarthy and Prince, 1986, 1990), although languages such as French and Japanese also permit monomoraic words (cf. Ito, 1990; Ito and Mester, 1992).

A central principle of early models of the phonological (prosodic) hierarchy, the Strict Layer Hypothesis (SLH), led to systematic violations of the Minimality Condition. According to the SLH, a phonological constituent may only dominate constituents of the immediately lower level. Thus, the constituent C that dominates the PW may only dominate PWs. This resulted in structures such as the Italian example below (cf. Nespor and Vogel, 1986). The label C is used to avoid the controversy about the constituent that dominates the PW.

(2) [[ve]PW [le]PW [ri]PW [lavo]PW]C
      CL     CL     Pre    Stem
    '(I) rewash them for you' (lit. 'for you them re- (I) wash')

The clitics and prefixes that are excluded from the stem's PW are each labeled as PW merely to satisfy the SLH. Since they do not exhibit the properties of PWs, this results in incorrect predictions about their phonological behavior. It was thus subsequently proposed that the SLH be somewhat relaxed, either by (a) permitting recursive structures or (b) allowing a constituent to skip levels and dominate items more than one level lower in the phonological hierarchy (e.g., Ito and Mester, 1992; Vogel, 1999).

Recursivity

Recursivity permits items excluded from a PW, such as those in (2), to be structured either as in (3a), forming a separate PW within a higher PW, or (3b), included serially into a higher PW.

(3a) [[ve le ri]PW [lavo]PW]PW

(3b) [ve [le [ri [lavo]PW]PW]PW]PW

Both options yield enriched constituent structures, raising the following questions: (a) are the additional PWs independently motivated and (b) do they make correct phonological predictions?

Linguistic constituents are motivated by evidence that they share properties with other strings of the same constituent type. Thus, (3a) predicts that the PW composed of the clitics and prefix will exhibit similar properties to other PWs. In Italian, the syllable bearing primary stress must be heavy; in the case of a CV syllable, Vowel Lengthening (VL) applies (e.g., /ve le/ → [veːle] 'sails'). Excluding the prefix from (3a), we have a similar string preceding lavo: [[ve le]PW [lavo]PW]PW '(I) wash them for you'. The first PW should thus be homophonous with the word for 'sails', with a long vowel in the first syllable. VL is not observed in this string, however, so labeling it a PW makes an incorrect prediction. An analogous problem arises with enclitics, where we again find stress and VL on the first syllable of vele in (4a), but not in the similar string in (4b).
(4a) [lavando]PW [vele]PW → ... [veːle]PW
     'washing sails'
(4b) [[lavando]PW [ve le]PW]PW → ... *[veːle]PW
       Stem        CL CL
     'washing them for you' (lit. 'washing for-you them')

The alternative type of recursive structure introduces a different problem, as illustrated below.

(5) [[[lavando]PW ve]PW le]PW

Main stress in Italian may fall no farther left than the antepenultimate syllable of a word, with the exception of a small number of verb forms. In (5), the stem PW lavándo 'washing' has penultimate stress; when the first enclitic, ve, is added, the stress becomes antepenultimate, still an acceptable word stress pattern. Adding the second enclitic, le, however, results in a PW with stress on the pre-antepenultimate syllable, and hence a violation of the stress constraint. We could limit the stress constraint to PWs without internal PW brackets; however, this would merely be a stipulation. Moreover, if the larger PWs do not exhibit the same properties as the innermost PW, it is unclear in what way they are PWs (cf. Vogel, 1994, 1999).


A recursive structure has also been suggested for compounds: each member of the compound is a PW, and these together form a larger PW (e.g., [[sewing]PW [machine]PW]PW). Again, it is not clear in what way the external PW is similar to the internal PWs since, for example, stress is assigned differently to the two types of strings (i.e., word vs. compound stress).
Phonological Word Adjoiners

Kabak and Vogel (2001) identify the elements excluded from the stem's PW as Phonological Word Adjoiners (PWAs). All PWAs share a common property: they introduce a PW boundary where they are in contact with the stem PW (i.e., ...stem]PW XPWA... or ...XPWA [PW stem...). PWAs combine with the PW within the higher constituent C (i.e., Clitic Group or Phonological Phrase) and crucially predict that structures with PWAs and PWs themselves may exhibit different properties. Once a PWA is excluded from a PW, subsequent items are also excluded, as illustrated in Turkish:

(6) [giydir]PW  mePWA  -di     -niz
    wear-CAUS   NEG    -PAST   -2PL
    'You didn't let (it) be worn.'
    (Kabak and Vogel, 2001: 343)

The PW is closed before the PWA me. This permits the correct assignment of stress on dir, according to the well-known (phonological) word-final stress rule of Turkish. Complicated analyses involving exceptional stress patterns are thus avoided, and there is no increase in the richness of the phonological constituent structure.
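The PWA idea lends itself to a small sketch (ours; the morph segmentation, syllabification with '.', and the PWA flags are all supplied by hand):

    # Sketch: the PW closes before the first PWA; word-final stress is
    # assigned inside the PW only, and later morphs stay outside it.
    def stress(morphemes, pwa_flags):
        pw = []
        for morph, is_pwa in zip(morphemes, pwa_flags):
            if is_pwa:
                break               # PW closed; PWA and later morphs excluded
            pw.append(morph)
        syllables = [s for m in pw for s in m.split(".")]
        syllables[-1] = syllables[-1].upper()    # final stress within the PW
        return ".".join(syllables) + " + " + "-".join(morphemes[len(pw):])

    print(stress(["giy.dir", "me", "di", "niz"], [False, True, False, False]))
    # giy.DIR + me-di-niz : stress on 'dir', as in (6)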

Conclusions

It is clear that the Phonological Word, not the morphosyntactic word, is the appropriate domain for numerous phonological phenomena across languages, and within different theoretical frameworks. The examples presented here could also be analyzed in OT in terms of alignment constraints, which yield the crucial mismatch between the PW and the MW, and different constraint rankings, which permit different roles for minimality. Furthermore, the minimal (bimoraic) PW has been reported as the basis of children's early words in a variety of languages, although in languages such as French that do not require minimality, children's early words are not necessarily bimoraic (cf. Demuth and Johnson, 2004).

The cross-linguistic validity of the PW is thus clear, though questions remain regarding details of the PW's content and structure, including the roles of minimality, clitics, and recursivity.

See also: Clitics; Lexical Phonology and Morphology;

Word Stress; Word.

Bibliography
Booij G (1999). The prosodic word in phonotactic generalizations. In Hall T A & Kleinhenz U (eds.) Studies on
the phonological word. Philadelphia: John Benjamins.
47–72.
Demuth K & Johnson M (2004). Truncation to subminimal
words in early French. Ms. Brown University.
Hall T A (1999). German function words. In Hall T A &
Kleinhenz U (eds.) Studies on the phonological word.
Philadelphia: John Benjamins. 9131.
Hall T A & Kleinhenz U (eds.) (1999). Studies on the
phonological word. Philadelphia: John Benjamins.
Inkelas S (1989). Prosodic constituency in the lexicon.
Ph.D. diss. Stanford University.
Ito J (1990). Prosodic minimality in Japanese. In Deaton
K, Noske M & Ziolkowski M (eds.) Papers from the
parasession on the syllable in phonetics and phonology.
CLS. 26-II. 213–239.
Ito J & Mester A (1992). Weak layering and word binarity.
Santa Cruz: Linguistics Research Center.
Kabak B & Vogel I (2001). Stress in Turkish. Phonology
18, 315–360.
McCarthy J & Prince A (1986). Prosodic morphology.
Ms. U Mass Amherst and Brandeis University.
McCarthy J & Prince A (1990). Foot and word in prosodic
morphology: the Arabic broken plural. NLLT 8,
209–283.
Nespor M & Vogel I (1982). Prosodic domains of external
sandhi rules. In Hulst H V D & Smith N (eds.) The
structure of phonological representations. I. Dordrecht:
Foris. 225–255.
Nespor M & Vogel I (1986). Prosodic phonology.
Dordrecht: Foris.
Peperkamp S (1997). Prosodic words. The Hague: Holland
Academic Graphics.
Selkirk E (1980). Prosodic domains in phonology: Sanskrit
revisited. In Aronoff M & Kean M L (eds.) Juncture.
Saratoga, CA: Anma Libri. 107–129.
Selkirk E (1982). The syntax of words. Cambridge: MIT
Press.
Selkirk E (1984). Phonology and syntax: the relation
between sound and structure. Cambridge: MIT Press.
Selkirk E (1986). On derived domains in sentence phonology. Phonology 3, 371–405.
Selkirk E (1996). The prosodic structure of function words.
In Morgan J & Demuth K (eds.) Signal to syntax: bootstrapping from speech to grammar in early acquisition.
Mahwah, NJ: Lawrence Erlbaum Associates. 187–213.
Vogel I (1994). Phonological interfaces in Italian. In
Mazzola M (ed.) Issues and theory in Romance linguistics: selected papers from the Linguistics Symposium on
Romance Languages XXIII. 109–125.
Vogel I (1999). Subminimal constituents in prosodic
phonology. In Hannahs S J & Davenport M (eds.)
Phonological structure. Philadelphia: John Benjamins.
249–267.


Phonological, Lexical, Syntactic, and Semantic Disorders in Children
D L Molfese, M J Maguire, V J Molfese, N Pratt, E D Ratajczak, L E Fentress and P J Molfese, University of Louisville, Louisville, KY, USA

2006 Elsevier Ltd. All rights reserved.

Brain Measures Common to the Study of Language Disabilities

A variety of procedures is currently used to investigate brain processes underlying language disabilities. These include functional magnetic resonance imaging (fMRI), MRI, positron emission tomography (PET), magnetoencephalography (MEG), and event-related potentials (ERPs). Each procedure, of course, has its strengths and weaknesses. MRI provides information concerning the morphology of brain structures, whereas fMRI monitors hemodynamic processes, such as changes in brain functions reflected during extended periods of language processing. PET operates in a somewhat similar fashion but tracks the flow of radioactive elements injected into the blood to identify areas actively engaged in a language task. MEG can detect small fluctuations in the brain's magnetic field in response to task demands, and ERPs, a portion of the ongoing EEG that is time locked to the onset of a stimulus event, can reflect rapid changes in the brain's encoding and processing of a speech sound, word, or even sentence. All procedures enable investigators to map linguistic and cognitive functions onto brain structures (Fonaryova Key et al., in press).
Although studies of brain processing usually use
one of these procedures, it is clear that much is to be
gained from using a combination of procedures.
For example, the high temporal sensitivity of ERP
techniques can provide a means for determining the
sequential relationships that exist between the specific areas of brain activation identified through fMRI
(Georgiewa et al., 2002). Moreover, convergence in source localization across fMRI, MEG, and ERP procedures ensures that solutions are not biased by particular approaches but may reflect different aspects of what occurs in the brain in response to stimulus input or task demands (see Hugdahl et al., 1998).
In general, differences are noted in brain responses
and structures for different disabilities (Harter et al.,
1988a, 1988b), but there are similarities across disabilities as well. Brain differences could relate to general cognitive processing differences (e.g., attention)
that may be impaired in some types of disabilities
(Holcomb et al., 1986), or brain differences could
reflect the involvement of different structures in response to task demands. Generally, brain structure

and functional differences have been thought to be


related to poor language function in general (Molfese
and Segalowitz, 1988) and to dyslexia in particular
(Eckert et al., 2001; Frank and Pavlakis, 2001).
Orton (1937) as well as Travis (1931) held the belief
that early signs of lateralization serve to identify children at risk for developmental language disorders.
More recent investigations continue to indicate that
differences in cerebral asymmetry associated with
atypical organization of the left hemisphere are a
marker for dyslexic children (Heim and Keil, 2004).
However, although reports often link hemisphere differences and language disorders, current thinking
indicates that the pathology as well as the neurophysiology of developmental language disabilities are a
great deal more complex than originally thought
and extend well beyond the classically defined language areas of the brain (Eden et al., 1996). For
example, some point to the neural circuitry to account for brain organizational differences between
impaired and nonimpaired children, as well as between children with different types of language disabilities (Eden et al., 1996; Leonard et al., 2002;
Sarkari et al., 2002). For example, dyslexic readers
fail to exhibit the usual network of anterior and posterior brain areas over left hemisphere regions,
whereas children with attention deficit hyperactivity disorder appear to have an abnormality in the
prefrontal and striatal regions.
For the purposes of the present chapter, the review of brain structures and functions involved in language disabilities is limited to autism, developmental dyslexia, Down syndrome, specific language impairment (SLI), and Williams syndrome. Links between brain and behavior in these developmental disabilities are highlighted.

Autism
Autism is a neurodevelopmental disorder characterized by impairments in language, communication,
imagination, and social relations (American Psychiatric Association, 1994). Estimates of occurrence in the
general population range from approximately 1 in
200 to 1 in 1000 (Fombonne, 1999). Although nearly
25% of children with autism have essentially normal
vocabulary and grammatical abilities (Kjelgaard and
Tager-Flusberg, 2001), another 25% may remain
mute for their entire lives (Lord and Paul, 1997).
Many underlying language problems found in autistic children are believed to be linked to social and emotional deficits. Although the leading causes of autism remain unknown, the interplay of multiple genes with multiple environmental factors is considered to play a role (Akshoomoff et al., 2002).
General Brain Imaging Results for Autism

Most imaging studies of children with autism are carried out with sedated children and thus focus on brain structures rather than functional differences (Rapin and Dunn, 2003). Even so, structural differences noted in autistic populations are often contradictory. The most consistent findings include increased cerebellar hemisphere, parieto-temporal lobe, and total brain volume. Current research findings also show that the size of the amygdala, hippocampus, and corpus callosum may differ from that of normals (see Brambilla et al., 2003, for a review).
Social Brain Differences and Autism

Neurologically, many of the social aspects of language acquisition (e.g., social orienting, joint attention, and responding to emotional states of others) are tied to differences in the medial temporal lobe (amygdala and hippocampus), which is larger in autistic children than in age-matched controls (Brambilla et al., 2003; Sparks et al., 2002). This brain region is thought to be related to performance on deferred imitation tasks, a skill that may be important in language acquisition (Dawson et al., 1998). Further, the increase in amygdala size may have consequences for important skills such as discriminating facial expressions (Adophs et al., 1995; Whalen et al., 1998) and joint attention (Sparks et al., 2002). Both skills appear to be important for language acquisition and are often impaired or absent in autism.
Phonology and Autism

A shift in the latency of the first positive peak in the ERP (P1) and the following first large negative peak (N1) to speech sounds in typically developing children is believed to result from maturational changes related to synaptogenesis, myelinogenesis, and dendritic pruning, possibly reflecting cortical auditory system maturation (Bruneau et al., 1997; Eggermont, 1988; Houston and McClelland, 1985). Findings with autistic children for these two ERP peaks are mixed: some studies report longer N1 latencies in children with autism (Dunn et al., 1999; Seri et al., 1999), whereas others report shorter N1 latencies with autistic children (Oades et al., 1988) or no differences between autistic and control children (Kemner et al., 1995; Lincoln et al., 1995).
In MEG, the correlate of the N1 is the M100 (or N1m). Gage et al. (2003) found that the M100 shifted in latency in both the left and right hemispheres with age for typically developing children listening to tones, but only in the left hemisphere for autistic children. This neural activation was localized to supratemporal sites, reflecting activity of the auditory cortex. Overall, children with autism also exhibited delayed M100 latencies compared to controls, indicating a fundamental difference in the auditory processing of autistic children.
Semantics and Autism

When given a semantic (meaning) categorization task, autistic children exhibit no differences in the N400 between deviant and target words, unlike age-matched controls (Dunn et al., 1999). Surprisingly, autistic children's categorizing errors were not higher than those of controls, indicating that although the autistic children could categorize based on semantics, they could not attend to the global context and could not discern that one ending was more common than another. As a result, their brains appeared not to process out-of-category words as deviant.
Many autistic children have limited word knowledge and limited comprehension of meaning in connected speech (Dunn et al., 1999). Mental words, such as think, believe, and know, are rarely part of the autistic child's vocabulary (Happé, 1995); this is speculated to be caused by differences in the limbic system and to be consistent with the problems these children have in processing emotional information (Dawson et al., 1998).

Dyslexia
Developmental dyslexia refers to the abnormal acquisition of reading skills during the normal course of
development despite adequate learning and instructional opportunities and normal intelligence. Estimates are that 5–10% of school-age children fail to
learn to read normally (Habib, 2000). Dyslexia can
exist in isolation, but more commonly it occurs with
other disabilities, such as dyscalculia (mathematical skill impairment) and attention deficit disorder, both
with and without hyperactivity. Studies of dyslexia
usually indicate the involvement of left-hemisphere
perisylvian areas during the reading process. The specific areas identified vary somewhat depending on the
component of reading being engaged in, but overall the
extrastriate visual cortex, inferior parietal regions,
superior temporal gyrus, and inferior frontal cortex
appear to be activated.
When one examines specific skills, visual word
form processing is associated with occipital and
occipitotemporal sites, whereas reading-relevant phonological processing has been associated with superior temporal, occipitotemporal, and inferior frontal sites of the left hemisphere. However, there is some variation in the scientific reports. For example, although some studies report a hemisphere asymmetry in the area of the planum temporale related to dyslexia (Frank and Pavlakis, 2001), others report no such effect (Heiervang et al., 2000).
A number of studies have identified brain anatomical differences that distinguish dyslexic from normal
brains (see Hynd and Semrud-Clikeman [1989] for an
earlier review). For example, Eckert et al. (2003),
using MRI scans, reported that dyslexics exhibited
significantly smaller right anterior lobes of the cerebellum, pars triangularis bilaterally, and brain volume than controls. Correlation analyses showed
that these neuroanatomical measurements relate to
reading, spelling and language measures of dyslexia
(see also Grünling et al., 2004). Although earlier
studies report hemisphere differences in the region
of the planum temporale between dyslexics and controls, more recent studies investigating the morphology of the perisylvian cortical area in a clinical sample
of children failed to find morphological differences at
this locale that were associated with the diagnosis of
dyslexia (Hiemenz and Hynd, 2000). Scientists have
also reported differences between dyslexics and controls in the corpus callosum, the band of fibers connecting the two hemispheres (von Plessen et al., 2002). These researchers reported differences in the posterior midbody/isthmus region that contains interhemispheric fibers from primary and secondary auditory cortices, a finding that converges with other reports of developmental differences during the late childhood years, coinciding with reading skill development.
General Brain Imaging Results for Dyslexia

There are general consistencies across phonological, semantic, and syntactic processing in that enhanced
activation of the left extrastriate cortex is found when
visuospatial, orthographic, phonologic, and semantic
processing demands are placed on the dyslexic group
(Backes et al., 2002).
Researchers argue that variations in brain processing relate to language and cultural factors, a finding that parallels behavioral investigations of language
differences. For example, using fMRI, Siok et al.
(2004) reported that functional disruption of the left
middle frontal gyrus is associated with impaired
reading of the Chinese language (a logographic rather
than alphabetic writing system). No disruption was
found for the left temporoparietal brain regions. Siok
et al. argue that such differences reflect two deficits
during reading: the conversion of the orthography
(characters) to syllables, and the mapping of the orthography onto the semantics. Both processes, the authors argue, are mediated by the left middle frontal gyrus that coordinates and integrates various information about written Chinese characters in verbal
and spatial working memory (see also Eckert et al.,
2001; Grigorenko, 2001).
Phonology and Dyslexia

Dyslexic readers show less activation of both the temporal and the prefrontal cortex during phonologic processing (Backes et al., 2002). Intriguingly, similar areas of lowered activation are seen in other populations with reading problems (e.g., neurofibromatosis; see Backes et al., 2002), reinforcing the notion that inferior frontal and superior temporal brain areas support reading skills.
When magnetic source imaging (MSI) was employed during phonological tasks, Papanicolaou
et al. (2003) reported consistent brain maps across
children that differentiate between dyslexic and
nondyslexic children in the left and right posterior
temporal regions. Moreover, following reading interventions with the dyslexic children, brain sources shifted from the right to the left hemisphere, indicating that intervention normalizes activation as the child's brain moves from an ineffective to a more efficient use of brain structures and pathways (Simos et al., 2002; for
replication, see Temple et al., 2003).
MEG investigations into the perception of speech
cues such as voice onset time (VOT) indicate that
children with dyslexia experienced a sharp peak of
relative activation in right temporoparietal areas between 300 and 700 milliseconds poststimulus onset, a
point markedly later in time (~500 milliseconds) relative to normal readers. This increased late activation
in right temporoparietal areas was correlated with
reduced performance on phonological processing
measures (Breier et al., 2003). Further, there are
data indicating an early relation between the perception of speech cues in early infancy and the emergence
of reading disorders as late as 8 years of age (Molfese and Molfese, 1985; Molfese, 2000; Molfese et al., 2005; Lyytinen et al., 2003). These studies indicate
that infants who go on to develop normal language
skills generate ERPs over left frontal and temporal
brain regions that discriminate between speech
sounds, whereas ERPs collected from infants at risk
for developing a reading disorder fail to discriminate
between these same sounds.
In phonologically based tasks such as rhyming, fMRI differences are found between dyslexic and control children (Corina et al., 2001). During phonological judgment, dyslexics generated more activity than controls in the right than in the left inferior temporal gyrus and
in left precentral gyrus (see Georgiewa et al. [1999]
for replication). During lexical judgment, dyslexics showed less activation than controls in the bilateral
middle frontal gyrus and more activation than controls in the left orbital frontal cortex. In an ERP study
paralleling this study, Lovrich et al. (1996) reported
that rhyme processing produced more pronounced
group differences than semantic processing at about
480 milliseconds, with a relatively more negative distribution for the impaired readers at centroparietal
sites. By 800 milliseconds, the impaired readers displayed a late positivity that was delayed in latency
and that was of larger amplitude at frontal sites than
that for the average readers.
When brain activation patterns were studied in
dyslexic and nonimpaired children during pseudoword and real-word reading tasks that required phonologic analysis, differences were noted in posterior
brain regions, including parietotemporal sites and
sites in the occipitotemporal area. Reading skill overall was positively correlated with the magnitude of
activation in the left occipitotemporal region, an area similarly found to discriminate between adult
groups of readers and nonreaders (Shaywitz et al.,
2002). A similar effect was demonstrated using
MEG (Simos et al., 2000).
Semantics and Dyslexia

During lexical judgment, less activation in the bilateral middle frontal gyrus and more activation in the left orbital frontal cortex occurred for dyslexic compared to nondyslexic children (Corina et al., 2001). In a related task, in which children read words and pronounceable nonwords, fMRI results detected a hyperactivation of the left inferior frontal gyrus in dyslexic children. ERPs collected from the same children converged with the fMRI findings and showed a topographic difference between groups at the left frontal electrodes in a time window of 250–600 milliseconds
after stimulus onset. A related study by Molfese et al.
(in press) reported similar findings, as well as a slower
rate of word processing over left hemisphere electrode sites in dyslexic children compared to normal
and advanced readers.
Reading and Dyslexia

Relatively few studies have investigated brain activation when the child is reading continuous text (Backes
et al., 2002). One exception is a report by Johnstone
et al. (1984), who monitored silent and oral reading,
noting that reading difficulty affected the central and
parietal ERPs of dyslexics but not those of the controls. In
addition, different patterns of asymmetry were found
for the two groups in silent compared to oral reading
at midtemporal placements.

Down Syndrome
Down syndrome (DS) is characterized by a number of physical features and learning impairments, as
well as IQ scores that may range from 50 to 60.
Individuals with DS typically are microcephalic and
have cognitive and speech impairments, as well as
neuromotor dysfunction. In addition, problems generally occur in language, short-term memory, and
task shifting. Typical language problems involve
delays in articulation, phonology, vocal imitation,
mean length of utterance (MLU), verbal comprehension, and expressive syntax. Spontaneous language
is often telegraphic, with a drastic reduction in the
use of function words: articles, prepositions, and pronouns (Chapman et al., 2002). Language deficits may
arise from abnormalities noted within the temporal
lobe (Welsh, 2003). Individuals afflicted with DS
commonly suffer from a mild to moderate hearing
loss (78% of DS children have a hearing loss; Stoel-Gammon, 1997), which may partially account for the
delay in phonological processing and poor articulation.
DS occurs in approximately 1 in 800–1,000 live births. Ninety to 95% of cases are caused by a full trisomy of chromosome 21, and 5% result from translocation or mosaicism. Considerable individual variability exists in cognitive development among those afflicted, with the greatest deficits observed with full trisomy 21, where specific genes have been associated with brain development (specifically cerebellar development) and produce Alzheimer-type neuropathology, neuronal cell loss, accelerated aging, and so on (Capone, 2001). Individuals with DS commonly exhibit neuropathology resembling that seen in Alzheimer disease, with some patients showing symptoms beginning as early as age 35 years.
General Brain Imaging Results for DS

Brains of DS individuals have a characteristic morphologic appearance that includes decreased size and weight, a foreshortening of the anterior-posterior diameter, reduced frontal lobe volume, and
flattening of the occiput. The primary cortical gyri
may appear wide, whereas secondary gyri are often
poorly developed or absent, with shallow sulci and
reduced cerebellar and brain stem size (Capone,
2004). MRI studies indicate a volume reduction of approximately 18% for the whole brain, including the cerebral cortex, white matter, and cerebellum (Pinter et al.,
2001a). Hippocampal dysfunction occurs in DS
(Pennington et al., 2003), perhaps because of the
reduced size of the hippocampus, as determined by
MRI (Pinter et al., 2001b), and the cerebral cortex

has fewer neurons at all cortical layers. In addition,
dendritic spines appear longer and thinner than in
matched controls (Capone, 2004; Seidl et al., 1997).
Studies using MEG indicate atypical cerebral specialization, showing a greater activation of the right
hemisphere in DS when compared to normal controls
(Welsh, 2002). This greater activation is confirmed by
PET studies (Nadel, 2003), indicating that the brain
of the DS individual is working harder to process
information, although less effectively.
Brain morphology in DS does not differ dramatically from normals throughout the first 6 months of life.
Delayed myelination within the cerebral hemispheres, basal ganglia, cerebellum, brain stem, and nerve tracts (fibers linking frontal and temporal lobes) becomes evident after 6 months (Nadel, 2003; Capone,
2004). Other critical periods of brain development
affected by DS include neuronal differentiation, proliferation, and organization. A reduction in neuronal
number and density was noted for most brain areas
examined, specifically within interneurons and pyramidal neurons. However, this differs on a case-to-case basis and has been hypothesized as a potential explanation for the spectrum of neurodevelopmental impairment observed (Capone, 2004).
Phonology and DS

Research on neural function indicates that in DS there may be a delay in the development of the auditory system (Nadel, 2003). Phonological delays exhibited in DS cases are often linked to differences in anatomy and central nervous system development. In addition, limits on auditory working memory
and hearing may account for deficits observed in
phonological processing (Tager-Flusberg, 1999).
Semantics and DS

Dichotic listening tasks involving DS children generally result in a left-ear advantage, indicating that these
individuals use their right hemisphere to process
speech (Welsh, 2002). On the basis of such findings,
Capone (2004) argued that difficulties in semantic
processing in DS occur from a reduction in cerebral
and cerebellar volume. In addition, the corpus callosum is thinner in the DS brain in the rostral fifth, the
area associated with semantic communication. Welsh
(2002) speculated that the thinner corpus callosum
isolates the two hemispheres from each other, making
it more difficult to integrate verbal information.
Vocabulary growth in DS children is increasingly delayed with age (Chapman et al., 2002). Studies using
dichotic listening tasks report a left-ear advantage for
DS, indicating that lexical operations are carried out
primarily in the right hemisphere, a finding opposite

to that found with normal developing children. In


fact, individuals with DS who exhibit the most severe
language deficits demonstrate the most atypical ear
advantage (Welsh, 2002).
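Ear advantages in dichotic listening are commonly summarized with a laterality index computed from the number of items correctly reported per ear. The formula below is one standard convention, not necessarily the one used in the studies cited, and the scores are hypothetical:

```python
def laterality_index(right_correct, left_correct):
    """LI = (R - L) / (R + L): positive values indicate a right-ear
    (left-hemisphere) advantage; negative values a left-ear advantage,
    as reported for DS listeners."""
    total = right_correct + left_correct
    if total == 0:
        raise ValueError("no correct reports in either ear")
    return (right_correct - left_correct) / total

# Hypothetical scores: more left-ear items reported, as described for DS.
print(laterality_index(right_correct=18, left_correct=26))  # about -0.18
```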
Syntax and DS

Children with DS exhibit a delay in syntax production that generally becomes evident with the emergence of two-word utterances, and syntax is often
more severely impaired than lexical development
(Chapman, 1997). Verbal short-term memory may
be affected, limiting the ability to understand syntactic relations. Research on short-term memory points
to hippocampal dysfunction in DS children (Pinter
et al., 2001a). MRI studies of adults with DS highlight the possibility that reductions in volume size
observed in DS may contribute to the development
of language and memory deficits. It has been hypothesized that the language deficits observed in children with DS are primarily related to memory and
learning and are most associated with deficits observed in the hippocampal region (Nadel, 2003).

Specific Language Impairment


It is estimated that approximately 7% of the 5-year-old population is characterized by specific language impairment (SLI), and that SLI is three times more likely in males than in females. The basic criteria underlying this disorder include normal intelligence (IQ of 85 or higher), language impairment (a language test score 1.25 standard deviations below the mean or lower), no neurological dysfunctions or structural anomalies, successful completion of a hearing screening, and no impairment in social interactions. Speculations as to causes focus on biological and environmental issues, but with no resolution. Because of the heterogeneity of the phenotype, it is difficult to study this population as a single unit (Leonard, 1998). As a consequence, results and conclusions from any particular study are limited to the specific subset of SLI under study.
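The inclusionary and exclusionary criteria listed above lend themselves to a simple screening check. The sketch below mirrors the stated thresholds; the field names and the encoding of the cutoffs are illustrative assumptions:

```python
def meets_sli_criteria(nonverbal_iq, language_z, passed_hearing_screen,
                       neuro_anomaly, social_impairment):
    """Screening check mirroring the criteria listed above.

    nonverbal_iq          : criterion is 85 or higher
    language_z            : language test z-score; criterion is -1.25 or lower
    passed_hearing_screen : True if the hearing screening was passed
    neuro_anomaly         : True if neurological dysfunction or a structural
                            anomaly is present (exclusionary)
    social_impairment     : True if social interaction is impaired (exclusionary)
    """
    return (nonverbal_iq >= 85
            and language_z <= -1.25
            and passed_hearing_screen
            and not neuro_anomaly
            and not social_impairment)

print(meets_sli_criteria(102, -1.5, True, False, False))  # True
```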
Phonology and SLI

A phonological processing delay exists in children with SLI: the children have problems distinguishing similar spoken sounds (e.g., /b/ vs. /p/) from one another, and show lower accuracy in processing speech sounds at rapid interstimulus intervals (ISIs). Improvement occurs with age in SLI children;
however, the plateau reached is still below normal
levels. The ERP patterns of older SLI children, compared with those of same-age and younger control children, correlate with those of the younger population in response to auditory tone presentation. An auditory immaturity hypothesis has been proposed as a basis for the delay in phonological processing in SLI; this hypothesis points to the auditory system as the basis for the developmental delays found in SLI children (Bishop and McArthur, 2004). In fact,
an fMRI study showed that individuals with SLI
had less activation in brain regions specific to language processing as well as phonological awareness
(Hugdahl et al., 2004). Furthermore, the mismatch negativity (MMN), a component of the ERP that indexes stimulus discrimination, reveals a deficit in the discrimination of CV (consonant-vowel) syllables differing in place of articulation in SLI children (Uwer et al., 2002). Infants as young as 8 weeks of age who are at risk for SLI already show delayed MMN latencies when presented with auditory speech sounds (Friedrich et al., 2004). These findings indicate that delays in discrimination skills are present from an early stage of
development.
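The MMN referred to here is conventionally derived as a deviant-minus-standard difference wave. A minimal sketch under assumed sampling and windowing parameters (the placeholder arrays stand in for real epoched EEG data):

```python
import numpy as np

fs = 1000                                    # assumed sampling rate in Hz;
                                             # sample 0 = stimulus onset
standard_trials = np.random.randn(200, 600)  # placeholder epochs
deviant_trials = np.random.randn(50, 600)    # placeholder epochs

# Average within condition, then subtract: MMN = deviant - standard.
mmn_wave = deviant_trials.mean(axis=0) - standard_trials.mean(axis=0)

# With real data the MMN appears as a frontocentral negativity roughly
# 100-250 ms post-stimulus; its mean amplitude in that window is the
# usual dependent measure.
window = slice(int(0.10 * fs), int(0.25 * fs))
mmn_amplitude = mmn_wave[window].mean()
print(mmn_amplitude)
```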
Semantics and SLI

Semantic abilities are problematic in SLI. Investigations into the neural substrate of these issues have
made some headway in recent years. In particular,
the N400 (Kutas and Hillyard, 1980), a large negative
component of the ERP that correlates with semantic
ability and occurs approximately 400 milliseconds
after a stimulus begins, is altered in populations of
SLI children, as well as in their parents. This brain
component is enhanced in fathers of SLI children
compared to controls in response to the unexpected
ending of a sentence (Ors et al., 2001). For example,
the N400 response is normally larger in response to
the last word in the sentence, The train runs on the
banana than if the final word is track. Atypical
N400 amplitudes also are found in children with
other language deficits (Neville et al., 1993). MEG
studies have pinpointed the lateral temporal region
as the origin of the N400 response (Simos et al.,
1997). Intracortical depth recordings in response to
written words point to the medial temporal structures
near the hippocampus and amygdala (Smith et al.,
1986).
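N400 effects such as those described above are typically quantified as the mean voltage in a latency window centered near 400 milliseconds, compared across expected and unexpected sentence endings. A small illustrative sketch with assumed window bounds and synthetic data:

```python
import numpy as np

def mean_amplitude(erp, times, t_min=0.30, t_max=0.50):
    """Mean voltage in a latency window (bounds here are illustrative)."""
    mask = (times >= t_min) & (times <= t_max)
    return erp[mask].mean()

# The N400 effect is the unexpected-minus-expected difference, e.g.,
# endings like '... runs on the banana' vs. '... runs on the track'.
times = np.linspace(-0.2, 0.8, 1001)
erp_expected = np.zeros_like(times)                        # placeholder data
erp_unexpected = -2.0 * ((times > 0.3) & (times < 0.5))    # fake N400
n400_effect = (mean_amplitude(erp_unexpected, times)
               - mean_amplitude(erp_expected, times))
print(n400_effect)  # negative: larger N400 for the unexpected ending
```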

Williams Syndrome
Williams-Beuren syndrome (WS) results from a rare genetic defect (about 1 in 20,000 births) caused by a microdeletion on chromosome 7 (Levitin et al., 2003).
The known genetic etiology of WS allows researchers to identify developmental abnormalities associated with WS from birth. Characteristics of WS include
dysmorphic facial features, mental retardation, and a
unique behavioral phenotype (Bellugi et al., 1999,

2000; Levitin et al., 2003). Recently, Mervis and colleagues (Mervis, in press; Mervis et al., 2003) have
formulated a cognitive profile for WS by analyzing the
relative weaknesses and strengths often associated
with the genetic syndrome. Markers for this profile
include a very low IQ and weakness in visuospatial
construction, as well as strengths in recognition of
faces, verbal memory, and language abilities. These
findings have been replicated by other researchers
(Galaburda et al., 2003).
Anatomical Aspects of WS

MRI studies note anatomical differences in brain morphology in WS that include a bilateral decrease in the dorsal posterior regions with an increase in the superior temporal gyrus, frontal lobe, and amygdala (Galaburda et al., 2003).
Schmitt et al. (2001a) recorded MRI images in 20
individuals with WS (age: 19–44 years) compared to
20 age- and gender-matched participants. In WS
adults, the midsagittal corpus callosum was reduced
in total area, and within the corpus callosum, the
isthmus and splenium were disproportionately smaller. However, the frontal lobe and cerebellum were
similar in size to those of controls (Schmitt et al.,
2001b). The decrease in volume within the corpus
callosum and the parietal lobe has led many researchers to speculate that these findings could explain
visuospatial weaknesses in this population (Schmitt
et al., 2001a; Eckert et al., 2005). Other studies indicate abnormal clustering of neurons in the visual
cortex (Lenhoff et al., 1997). In contrast, the language strength predominantly found in WS children may be explained, based on the MRI data, by the relatively unimpaired frontal lobe and cerebellum and the
enlarged planum temporale (auditory region), particularly in the left hemisphere (Lenhoff et al., 1997;
Bellugi et al., 1999, 2000).
Semantics and WS

Studies indicate that WS children are capable of semantic organization, although the onset is often
delayed (Mervis and Bertrand, 1997; Mervis, in
press). WS children tend to list low-frequency words
when asked to complete the task (Mervis, in press). In
studies of lexical and semantic processing, unique
ERP patterns are recorded from WS children to auditory stimuli during a sentence completion task that
includes anomalous words at the end of the sentence (Bellugi et al., 2000). In general, the expected N400 component associated with anomalous words in WS
was more evenly distributed across the scalp, with no
hemispheric interaction (Bellugi et al., 2000). This
finding is unusual, given the left-hemisphere activation common in typically developing children (Bellugi et al., 2000). In addition, during the positive peak at
50 milliseconds, WS individuals produced an abnormally large spike. A smaller than normal negative
peak at 100 milliseconds and a large positive peak at
200 milliseconds can be seen in the WS population,
but not within normal controls (Bellugi et al., 1999).
Phonology and WS

Little research has examined the neural bases for phonological processing in children with WS. A current study by Fonaryova Key et al. (in progress) examined the brain's response to speech syllables (/ba/ and /ga/) in eight children with WS (age: 4.03–4.64 years). The results indicate that the left hemisphere of WS children is engaged in discriminating
between different speech sounds, rather than showing
the lack of hemisphere differences that Bellugi et al.
(2000) would predict. In addition, variations in the
second large positive ERP component (P2) to speech
sounds correlated highly with a range of language and
verbal abilities, such as those needed for performance
on the Matrices subtest of the K-BIT.
Syntax and WS

In normal, age-matched controls, ERP responses to nouns, adjectives, and verbs (open-class words) tend to evoke an N400 peak in the right posterior lobe. Words like articles, conjunctions, and prepositions (closed-class words) elicit an early negativity peak in the anterior portion of the left hemisphere (Bellugi et al., 1999). Using ERPs to open- and closed-class word stimuli, WS subjects do not display the typical evoked pattern at the N400 peak for open-class words in the right hemisphere; instead, a
negativity in the left hemisphere was found (see
Bellugi et al., 1999). For closed-class words, the typical left-hemisphere pattern found in normal subjects
is not found in individuals with WS (Bellugi et al.,
1999). These findings indicate that the neural functional organization for syntactic processing is different for individuals with WS, even though results from
MRI studies report frontal lobe and cerebellum sizes similar to those of age- and sex-matched controls (Bellugi
et al., 1999, 2000).

Summary and Conclusion


Across the five developmental disability areas reviewed
here, much is already known about the underlying
neural bases for some impaired phonological processes,
but exceptionally little is known concerning the neural
underpinnings of other deficits involving syntactical
processing. At the same time, in areas where some
research is available, it is evident that language deficits
are not unique to a single syndrome and do not result from dysfunction in a single, discrete brain structure. Rather, language disorders are multidimensional and involve neural processes that arise out of complex interactions between multiple cortical brain regions and neural pathways, as well as from genetic factors whose phenotypic expression is mediated through dynamic environmental factors. There are, no doubt, other as-yet-unknown factors. Clearly, we are still in the earliest stages of our quest to understand the complex relationships that exist between developmental language disabilities and the brain. Lest we get discouraged, it is important to keep in mind that we at least have begun that quest.

Acknowledgments
This work was supported in part by grants to
D. L. M. (NIH/NHLB HL01006, NIH/NIDCD
DC005994, NIH/NIDA DA017863) and V. J. M.
(DOE R215R000023, HHS 90XA0011).

Bibliography
Adolphs R, Tranel D, Damasio H & Damasio A R (1995).
Fear and the human amygdala. Journal of Neuroscience 15, 5879–5891.
Akshoomoff N, Pierce K & Courchesne E (2002). The neurobiological basis of autism from a developmental perspective. Development and Psychopathology 14, 613–634.
Alt M, Plante E & Creusere M (2004). Semantic features in
fast mapping: performance of preschoolers with specific
language impairment versus preschoolers with normal
language. Journal of Speech, Language and Hearing
Research 47, 407–420.
American Psychiatric Association (1994). Diagnostic and
statistical manual of mental disorders (4th edn.).
Washington, DC: American Psychiatric Association.
Backes W, Vuurman E, Wennekes R, Spronk P, Wuisman M,
van Engelshoven J & Jolles J (2002). Atypical brain activation of reading processes in children with developmental
dyslexia. Journal of Child Neurology 17, 867–871.
Bellugi U, Lichtenberger L, Mills D, Galaburda A &
Korenber J R (1999). Bridging cognition, the brain and
molecular genetics: evidence from Williams syndrome.
Trends in Neurosciences 22(5), 197–207.
Bellugi U, Lichtenberger L, Jones W & Lai Z (2000). I. The
neurocognitive profile of Williams syndrome: a complex
pattern of strengths and weaknesses. Journal of Cognitive
Neuroscience 12(1), 7–29.
Billingsley R L, Jackson E F, Slopis J M, Swank P R,
Mahankali S & Moore B D 3rd (2003). Functional magnetic resonance imaging of phonologic processing in neurofibromatosis 1. Journal of Child Neurology 18, 731–740.
Bishop D V M & McArthur G M (2004). Immature cortical responses to auditory stimuli in specific language
impairment: evidence from ERPs to rapid tone
sequences. Developmental Science 7, 11–18.



Brambilla P, Hardan A, Ucelli di Nemi S, Perez J, Soares J C
& Barale F (2003). Brain anatomy and development in
autism: review of structural MRI studies. Brain Research
Bulletin 61, 557–569.
Breier J I, Simos P G, Fletcher J M, Castillo E M, Zhang W &
Papanicolaou A C (2003). Abnormal activation of temporoparietal language areas during phonetic analysis in
children with dyslexia. Neuropsychology 17, 610–621.
Bruneau N, Roux S, Guérin P & Barthélémy C (1997). Temporal prominence of auditory evoked potentials (N1 wave) in 4–8-year-old children. Psychophysiology 34, 32–38.
Buckley S (2000). Teaching reading to develop speech and
language. Presented at the 3rd International Conference
on Language and Cognition in Down Syndrome, Portsmouth, UK.
Camarata S & Yoder P (2002). Language transactions
during development and intervention: theoretical implications for developmental neuroscience. International
Journal of Neuroscience 20, 459–465.
Capone G T (2001). Down syndrome: advances in molecular biology and the neurosciences. Developmental and
Behavioral Pediatrics 22, 40–59.
Capone G T (2004). Down syndrome: genetic insights and
thoughts of early intervention. Infants and Young Children 17, 45–58.
Chapman R S (1997). Language development in children
and adolescents with Down syndrome. Mental Retardation and Developmental Disabilities Research Reviews 3,
307–312.
Chapman R S, Hesketh L J & Kistler D J (2002). Predicting
longitudinal change in language production and comprehension in individuals with Down syndrome: hierarchical
linear modeling. Journal of Speech, Language, and
Hearing Research 45, 902–915.
Corina D P, Richards T L, Serafini S, Richards A L, Steury
K, Abbott R D, Echelard D R, Maravilla K R &
Berninger V W (2001). fMRI auditory language differences between dyslexic and able reading children. Neuroreport 12, 1195–1201.
Dawson G, Meltzoff A N & Osterling J (1998). Children
with autism fail to orient to naturally occurring social stimuli. Journal of Autism and Developmental Disorders 28(6), 479–485.
Dawson G, Webb S, Schellenberg G D, Dager S, Friedman
S, Aylward E & Richards T (2002). Defining the broader
phenotype of autism: genetic, brain, and behavioral perspectives. Development and Psychopathology 14, 581–611.
Dunn M, Vaughan H, Jr., Kreuzer J & Kurtzberg D (1999).
Electrophysiologic correlates of semantic classification
in autistic and normal children. Developmental Neuropsychology 16, 79–99.
Eckert M A, Hu D, Eliez S, Bellugi U, Galaburda A,
Korenberg J, Mills D & Reiss A L (2005). Evidence for
superior parietal impairment in Williams syndrome. Neurology 64(1), 152–153.
Eckert M A, Leonard C M, Richards T L, Aylward E H,
Thomson J & Berninger V W (2003). Anatomical correlates of dyslexia: frontal and cerebellar findings. Brain 126, 482–494.
Eckert M A, Lombardino L J & Leonard C M (2001).
Planar asymmetry tips the phonological playground
and environment raises the bar. Child Development 72,
988–1002.
Eden G F, VanMeter J W, Rumsey J M & Zeffiro T A
(1996). The visual deficit theory of developmental dyslexia. Neuroimage 4, S108–S117.
Eggermont J J (1988). On the rate of maturation of sensory
evoked potentials. Electroencephalography and Clinical
Neurophysiology 70, 293–305.
Fombonne E (1999). The epidemiology of autism: a review. Psychological Medicine 29, 768–786.
Fonaryova Key A P, Dove G O & Maguire M J (in press).
Linking brainwaves to the brain: an ERP primer. Developmental Neuropsychology.
Fonaryova Key A P, Mervis C & Molfese D L (in progress).
ERPs to speech sounds over the left hemisphere are
linked to language and cognitive abilities in 4-year-old
children with Williams syndrome.
Frank Y & Pavlakis S G (2001). Brain imaging in neurobehavioral disorders. Pediatric Neurology 25, 278–287.
Friedrich M, Weber C & Friederici A D (2004). Electrophysiological evidence for delayed mismatch response in infants at risk for specific language impairment. Psychophysiology 41, 772–782.
Gage N M, Siegel B & Roberts T P L (2003). Cortical
auditory system maturational abnormalities in children
with autism disorder: an MEG investigation. Developmental Brain Research 144, 201–209.
Galaburda A M, Hollinger D, Mills D, Reiss A, Korenberg
J R & Bellugi U (2003). Williams syndrome: a summary of cognitive, electrophysiological, anatomofunctional, microanatomical and genetic findings. Revista de Neurologia 36(1), 132–137.
Georgiewa P, Rzanny R, Gaser C, Gerhard U J, Vieweg U,
Freesmeyer D, Mentzel H J, Kaiser W A & Blanz B
(2002). Phonological processing in dyslexic children: a study combining functional imaging and event-related potentials. Neuroscience Letters 318, 5–8.
Georgiewa P, Rzanny R, Hopf J M, Knab R, Glauche V,
Kaiser W A & Blanz B (1999). fMRI during word
processing in dyslexic and normal reading children.
Neuroreport 10, 3459–3465.
Grigorenko E L (2001). Developmental dyslexia: an update on genes, brains, and environments. Journal of
Child Psychology and Psychiatry 42, 91–125.
Grünling C, Ligges M, Huonker R, Klingert M, Mentzel
H J, Rzanny R, Kaiser W A, Witte H & Blanz B
(2004). Dyslexia: the possible benefit of multimodal
integration of fMRI- and EEG-data. Journal of Neural
Transmission 111, 951–969.
Habib M (2000). The neurological basis of developmental
dyslexia: an overview and working hypothesis. Brain
123, 2373–2399.
Happé F G E (1995). The role of age and verbal ability in the theory of mind task performance of subjects with autism. Child Development 66, 843–855.



Harter M R, Anllo-Vento L, Wood F B & Schroeder M M
(1988a). Separate brain potential characteristics in children with reading disability and attention deficit disorder: color and letter relevance effects. Brain and
Cognition 7, 115–140.
Harter M R, Diering S & Wood F B (1988b). Separate brain
potential characteristics in children with reading disability and attention deficit disorder: relevance-independent
effects. Brain and Cognition 7, 54–86.
Heiervang E, Hugdahl K, Steinmetz H, Inge Smievoll A,
Stevenson J, Lund A, Ersland L & Lundervold A (2000).
Planum temporale, planum parietale and dichotic listening in dyslexia. Neuropsychologia 38, 1704–1713.
Heim S & Keil A (2004). Large-scale neural correlates of
developmental dyslexia. European Child and Adolescent
Psychiatry 13, 125–140.
Hiemenz J R & Hynd G W (2000). Sulcal/gyral pattern
morphology of the perisylvian language region in developmental dyslexia. Brain and Language 74, 113–133.
Holcomb P J, Ackerman P T & Dykman R A (1986).
Auditory event-related potentials in attention and
reading disabled boys. International Journal of Psychophysiology 3, 263–273.
Houston H G & McClelland R J (1985). Age and gender
contributions to intersubject variability of the auditory
brainstem potentials. Biological Psychiatry 20, 419–430.
Hugdahl K, Gunderson H, Brekke C, Thomsen T &
Morten-Rimol L (2004). fMRI brain activation in a
Finnish family with specific language impairment compared with a normal control group. Journal of Speech,
Language, and Hearing Research 47, 162–172.
Hugdahl K, Heiervang E, Nordby H, Smievoll A I,
Steinmetz H, Stevenson J & Lund A (1998). Central
auditory processing, MRI morphometry and brain laterality: applications to dyslexia. Scandinavian Audiology Supplement 49, 26–34.
Hynd G W & Semrud-Clikeman M (1989). Dyslexia and
brain morphology. Psychological Bulletin 106, 447–482.
Johnstone J, Galin D, Fein G, Yingling C, Herron J & Marcus
M (1984). Regional brain activity in dyslexic and control
children during reading tasks: visual probe event-related
potentials. Brain and Language 21, 233–254.
Joseph J, Noble K & Eden G (2001). The neurobiological
basis of reading. Journal of Learning Disabilities 34,
566–579.
Kemner C, Verbaten M N, Cuperus J M & Camfferman G
(1995). Auditory event-related brain potentials in autistic children and three different control groups.
Biological Psychiatry 38, 150–165.
Kjelgaard M M & Tager-Flusberg H (2001). An investigation of language impairment in autism: implications for
genetic subgroups. Language and Cognitive Processes
16, 287–308.
Kutas M & Hillyard S A (1980). Reading senseless
sentences: brain potentials reflect semantic incongruity.
Science 207, 203–205.
Lenhoff H M, Wang P P, Greenberg F & Bellugi U (1997).
Williams syndrome and the brain. Scientific American
277(6), 68–73.

Leonard C M, Lombardino L J, Walsh K, Eckert M A, Mockler J L, Rowe L A, Williams S & DeBose C B (2002). Anatomical risk factors that distinguish dyslexia from SLI predict reading skill in normal children. Journal of Communication Disorders 35, 501–531.
Leonard L (1998). Children with specific language impairment. Cambridge, MA: MIT Press.
Levitin D J, Menon V, Schmitt J F, Eliez S, White C D, Glover
G H, Kadis J, Korenberg J R, Bellugi U & Reiss A L (2003).
Neural correlates of auditory perception in Williams
syndrome: an fMRI study. Neuroimage 18(1), 74–82.
Lincoln A J, Courchesne E, Harms L & Allen M (1995).
Sensory modulation of auditory stimuli in children with
autism and receptive developmental language disorder:
event related brain potential evidence. Journal of Autism
and Developmental Disorders 25, 521–539.
Lord C & Paul R (1997). Language and communication in
autism. In Cohen D J & Volkmar F R (eds.) Handbook
of autism and pervasive development disorders, 2nd edn.
New York: Wiley.
Lovrich D, Cheng J C & Velting D M (1996). Late cognitive brain potentials, phonological and semantic classification of spoken words, and reading ability in children.
Journal of Clinical and Experimental Neuropsychology
18, 161–177.
Lyytinen H, Leppänen P H T, Richardson U & Guttorm T K (2003). Brain functions and speech perception in infants at risk for dyslexia. In Csépe V (ed.) Dyslexia: different brain, different behaviour. Dordrecht: Kluwer. 113–152.
Marchman V A, Wulfeck B & Weismer S E (1999). Morphological productivity in children with normal language
and SLI: a study of the English past tense. Journal of
Speech, Language and Hearing Research 42, 206–219.
Mervis C (in press). Recent research on language abilities in Williams-Beuren syndrome: a review. In Morris C A, Wang P P & Lenhoff H (eds.) Williams-Beuren syndrome: research and clinical perspectives. Baltimore,
MD: Johns Hopkins University Press.
Mervis C B & Bertrand J (1997). Developmental relations
between cognition and language: evidence from Williams
syndrome. In Adamson L B & Romski M A (eds.) Communication and language acquisition: discoveries from
atypical development. New York: Brookes. 75–106.
Mervis C B, Robinson B F, Rowe M L, Becerra A M &
Klein-Tasman B P (2003). Language abilities of individuals who have Williams syndrome. In Abbeduto L (ed.)
International Review of Research in Mental Retardation,
vol. 27. Orlando, FL: Academic Press. 35–81.
Molfese D L (2000). Predicting dyslexia at 8 years using neonatal brain responses. Brain and Language 72, 238–245.
Molfese D L & Betz J C (1988). Electrophysiological
indices of the early development of lateralization for
language and cognition and their implications for
predicting later development. In Molfese D L & Segalowitz S J (eds.) The developmental implications of brain lateralization for language and cognitive development. New York: Guilford Press. 171–190.
Molfese D L & Molfese V J (1985). Electrophysiological
indices of auditory discrimination in newborn infants: the
bases for predicting later language development. Infant
Behavior and Development 8, 197–211.
Molfese D L, Fonaryova Key A, Kelly S, Cunningham N,
Terrell S, Fergusson M, Molfese V & Bonebright T (in press). Dyslexic, average, and above average readers
engage different and similar brain regions while reading.
Journal of Learning Disabilities.
Molfese D L, Fonaryova Key A P, Maguire M J, Dove G O
& Molfese V J (2005). Event-related evoked potentials
(ERPs) in speech perception. In Pisoni D & Remez R
(eds.) Handbook of speech perception. London: Blackwell.
Nadel L (2003). Down's syndrome: a genetic disorder in biobehavioral perspective. Genes, Brain and Behavior 2, 156–166.
Neville H J, Coffey S A, Holcomb P J & Tallal P (1993).
The neurobiology of sensory and language processing
in language-impaired children. Journal of Cognitive
Neuroscience 5, 235–253.
Oades R D, Walker M K & Geffen L B (1988). Event-related potentials in autistic and healthy children on an auditory choice reaction time task. International Journal of Psychophysiology 6(1), 25–37.
Ors M, Lindgren M, Berglund C, Hagglund K, Rosen I &
Blennow G (2001). The N400 component in parents of
children with specific language impairment. Brain and
Language 77, 60–71.
Orton S (1937). Reading, writing and speech problems in
children. New York: Horton.
Papanicolaou A C, Simos P G, Breier J I, Fletcher J M,
Foorman B R, Francis D, Castillo E M & Davis R N
(2003). Brain mechanisms for reading in children with
and without dyslexia: a review of studies of normal development and plasticity. Developmental Neuropsychology
24, 593–612.
Pennington B F, Moon J, Edgin J, Stedron J & Nadel L
(2003). The neuropsychology of Down syndrome: evidence for hippocampal dysfunction. Child Development
74, 75–93.
Pinter J D, Brown W E, Eliez S, Schmitt J, Capone G T &
Reiss A L (2001a). Amygdala and hippocampal volumes
in children with Down syndrome: a high-resolution MRI
study. Neurology 56, 972–974.
Pinter J D, Stephan E, Schmitt J, Capone G T & Reiss A L (2001b). Neuroanatomy of Down's syndrome: a high-resolution MRI study. The American Journal of Psychiatry 158, 1659–1671.
Rapin I & Dunn M (2003). Update on the language disorders of individuals on the autistic spectrum. Brain and
Development 25, 166–172.
Sarkari S, Simos P G, Fletcher J M, Castillo E M, Breier J I
& Papanicolaou A C (2002). Contributions of magnetic
source imaging to the understanding of dyslexia. Seminars in Pediatric Neurology 9, 229–238.

Schmitt J F, Eliez S, Warsofsky L S, Bellugi U & Reiss A L (2001a). Corpus callosum morphology of Williams syndrome: relation to genetics and behaviour. Developmental Medicine and Child Neurology 43(3), 155–159.
Schmitt J F, Eliez S, Warsofsky L S, Bellugi U & Reiss A L
(2001b). Enlarged cerebellar vermis in Williams syndrome. Journal of Psychiatric Research 35(4), 225–229.
Seidl R, Hauser E, Bernert G, Marx M, Freilinger M & Lubec G (1997). Auditory evoked potentials in young patients with Down syndrome. Event-related potentials (P3) and histaminergic system. Cognitive Brain Research 5, 301–309.
Seri S, Cerquiglini A, Pisani F & Curatolo P (1999). Autism in tuberous sclerosis: evoked potential evidence for a deficit in auditory sensory processing. Clinical Neurophysiology 110, 1825–1830.
Shaywitz B A, Shaywitz S E, Pugh K R, Mencl W E,
Fulbright R K, Skudlarski P, Constable R T, Marchione
K E, Fletcher J M, Lyon G R & Gore J C (2002). Disruption of posterior brain systems for reading in children
with developmental dyslexia. Biological Psychiatry 52,
101–110.
Simos P G, Breier J I, Fletcher J M, Foorman B R, Bergman
E, Fishbeck K & Papanicolaou A C (2000). Brain activation profiles in dyslexic children during non-word
reading: a magnetic source imaging study. Neuroscience
Letters 290, 61–65.
Simos P G, Fletcher J M, Bergman E, Breier J I, Foorman B R,
Castillo E M, Davis R N, Fitzgerald M & Papanicolaou
A C (2002). Dyslexia-specific brain activation profile
becomes normal following successful remedial training.
Neurology 58, 1203–1213.
Simos P G, Basile L F H & Papanicolaou A C (1997).
Source localization of the N400 response in a sentence-reading paradigm using evoked magnetic fields and magnetic resonance imaging. Brain Research 762, 29–39.
Siok W T, Perfetti C A, Jin Z & Tan L H (2004). Biological
abnormality of impaired reading is constrained by culture. Nature 431, 71–76.
Smith M E, Stapleton J M & Halgren E (1986). Human
medial temporal lobe potentials evoked in memory and
language tasks. Electroencephalography and Clinical Neurophysiology 63, 145–159.
Sparks G F, Friedman S D, Shaw D W, Aylward E H,
Echelard D, Artru A A, Maravilla K R, Giedd H N,
Munson J, Dawson G & Dager S R (2002). Brain
structural abnormalities in young children with autism
spectrum disorder. Neurology 59, 184–192.
Stoel-Gammon C (1997). Phonological development in
Down syndrome. Mental Retardation and Developmental Disabilities Research Reviews 3, 300–306.
Stoel-Gammon C (2001). Down syndrome: developmental
patterns and intervention strategies. Down Syndrome
Research and Practice 7, 93–100.
Tager-Flusberg H (1999). Language development in atypical children. In Barrett M (ed.) The development of
language. London: UCL Press. 311–348.
Temple E, Deutsch G K, Poldrack R A, Miller S L, Tallal P,
Merzenich M M & Gabrieli J D (2003). Neural deficits in children with dyslexia ameliorated by behavioral remediation: evidence from functional MRI. Proceedings of the National Academy of Sciences USA 100, 2860–2865.
Travis L (1931). Speech pathology. New York: Appleton-Century.
Uwer R, Albert R & von Suchodoletz W (2002). Automatic processing of tones and speech stimuli in children
with specific language impairment. Developmental
Medicine and Child Neurology 44, 527–532.
von Plessen K, Lundervold A, Duta N, Heiervang E,
Klauschen F, Smievoll A I, Ersland L & Hugdahl K (2002). Less developed corpus callosum in dyslexic subjects: a structural MRI study. Neuropsychologia 40, 1035–1044.
Welsh T N, Digby E & Simon D (2002). Cerebral specialization and verbal-motor integration in adults with
and without Down syndrome. Brain and Language 84,
153169.
Whalen P J, Rauch S L, Etcoff N L, McInerney S C, Lee M B
& Jenike M A (1998). Masked presentation of emotional face expression modulates amygdala activity without explicit knowledge. Journal of Neuroscience 18,
411–418.

Phonology in the Production of Words


N Schiller, Maastricht University, Maastricht,
The Netherlands
© 2006 Elsevier Ltd. All rights reserved.

Phonological Representations in the Mental Lexicon
How are words represented in the brain? Words have
a meaning and a form, and presumably these two
aspects of words are represented and processed separately in different areas of the brain (for a recent
overview see Indefrey and Levelt, 2004). For instance,
each act of speech production is planned in advance
and starts with the intention to talk about a specific
meaning which is to be conveyed to the interlocutor(s). Therefore, the first step in speech production is
called conceptualization (Levelt, 1989). In this phase,
the content of an utterance is represented as prelinguistic units or concepts. During the next step, called
formalization, concepts become lexicalized, i.e., lexical entries corresponding to the concepts are retrieved.
Formalization can be divided into two processes,
namely, grammatical encoding and phonological
encoding (Levelt et al., 1999). This division is based
on empirical data, such as speech errors. Garrett
(1975) already observed that there are at least two
categories of exchange errors, i.e., word exchanges
and segment (phoneme) exchanges. An example of a
word exchange is 'laboratory in your own computer' (Fromkin, 1971); laboratory and computer belong to different syntactic phrases, but they are of the same syntactic word class, i.e., nouns. Segment or phoneme exchanges, in contrast, typically result from the same syntactic phrase, but from words of different syntactic word classes, e.g., 'our queer dean' (instead of 'our dear queen'; an original spoonerism). This pattern of word and segment exchanges can be explained by assuming that word exchanges occur during grammatical encoding, whereas segment exchanges occur during subsequent phonological encoding. During grammatical encoding, the syntactic structure of
an utterance is specified including the syntactic word
class of an individual word, but not its phonological
form. That is why words of the same word class
are exchanged, no matter what their phonological
make up is. In contrast, during phonological encoding the words of an utterance have already been selected, i.e., their syntactic word class information can
no longer influence the planning process, but their
phonological form is still to be specified. During this
specification segments or phonemes from adjacent
words can accidentally become active at the same
time, and then they can be exchanged and result in a
sound error.
Since then, on-line experimental evidence for
the division between grammatical and phonological
encoding has been obtained. Schriefers et al. (1990)
asked Dutch participants in the laboratory to name
pictures while presenting them with auditory distracter words. When the distracter words were semantically, i.e., categorically, related to the target picture
name (e.g., gieter 'watering can'), participants were slower to name the picture of a rake (hark) compared to an unrelated distracter word (e.g., bel 'bell') (see Figures 1–3). However, this happened only when the distracter words were presented slightly before picture onset or simultaneously with picture onset (see Figure 4). When the distracter words were phonologically related to the picture name (e.g., harp 'harp'), however, the naming of hark was faster than in the unrelated control condition (see Figures 5–7).
However, this effect disappeared when the phonologically related distracter words were presented before
picture onset (see Figure 8).
Figure 1 Picture naming with a semantically related distracter word. Participants' task is to name the picture and ignore the word. It is known that such a situation yields Stroop-like interference, i.e., participants are influenced by the distracter word when naming the picture.
Figure 2 Picture naming with an unrelated distracter word.
Figure 3 The semantic interference effect in speech production. Naming latencies are slower when the distracter word is semantically related to the picture than when it is unrelated. The results are taken from a study by Schriefers et al. (1990).
Figure 4 The time course of semantic interference in speech production. The effect occurs only when the distracter word is presented slightly (e.g., 150 ms) before picture onset or simultaneously with the picture.

The received account for the semantic interference effect (hark–gieter) is that the lexical entry gieter does
not only receive activation from the auditory presentation of the distracter word, but also, via the conceptual network ('garden utilities'), from the picture of the hark (rake), because there are connections between conceptually similar entries. Therefore, gieter ('watering can') is a stronger lexical competitor than the unrelated distracter bel ('bell'), which does not receive activation from the picture of the rake (see also Levelt et al., 1999: 10–11). The
phonological facilitation is accounted for by assuming that the phonological distracter harp preactivates
segments (phonemes) in the production network.
The segments that are shared between distracter and
target (/h/, /a/, /r/) can be selected faster when the
target picture name hark is phonologically encoded.
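The competition account can be made concrete with a toy calculation: the more activation a competitor receives (from the spoken distracter plus, for related distracters, spreading activation from the picture's concept), the harder target selection becomes. This is an illustrative sketch with arbitrary numbers, not the actual model of Levelt et al. (1999):

```python
def selection_difficulty(target, competitors):
    """Luce-style choice ratio: harder selection (slower naming) when
    competitors are active relative to the target. Toy measure only."""
    return 1 - target / (target + sum(competitors))

picture_to_name = 1.0   # activation the picture lends its own name (hark)
word_to_lexicon = 1.0   # activation a spoken distracter lends its entry
spread = 0.4            # conceptual spreading from the picture to gieter

# Related distracter (gieter): input from the spoken word AND the picture.
related = selection_difficulty(picture_to_name, [word_to_lexicon + spread])
# Unrelated distracter (bel): input from the spoken word only.
unrelated = selection_difficulty(picture_to_name, [word_to_lexicon])
print(related > unrelated)  # True: more competition, slower naming
```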
One can infer from this pattern that semantically (categorically) related distracters have an influence on the speech production process at an earlier point in time, namely during lexical selection, than phonologically related distracter words, which only show an influence during phonological encoding (see Figure 9).
This article is about phonology in the production of
words. A model of phonological encoding is provided
by Levelt and Wheeldon (1994) and has been further
developed since then (see Figure 10). This model
describes word form encoding processes that follow
the selection of a word from the mental lexicon. Once
a word has been selected from the mental lexicon, it
has to be encoded in a form that can finally be used
to control the neuromuscular commands necessary
for the execution of articulatory movements (see
Guenther, 2003 for a recent overview). When accessing a word's form for phonological encoding, speakers retrieve segmental and metrical information.

Figure 5 Picture naming with a phonologically related distracter word.

Figure 6 Picture naming with an unrelated distracter word.

Figure 7 The phonological facilitation effect in speech production. Naming latencies are faster when the distracter word is phonologically related to the picture than when it is unrelated. The results are taken from a study by Schriefers et al. (1990).

Figure 8 The time course of phonological facilitation in speech production. The effect occurs only when the distracter word is presented simultaneously with the picture or slightly (e.g., 150 ms) later than picture onset.

During segmental encoding, the segments (phonemes) of a word and their order have to be retrieved. For the
word lepel ('spoon') this would be the segments /l/1, /e/2, /p/3, /e/4, /l/5. During metrical retrieval, a metrical frame has to be retrieved, i.e., the number of syllables and the location of the lexical stress. For the example lepel, the metrical frame would include two syllable slots, the first of which bears lexical stress. Furthermore, the syllable or consonant–vowel (CV) structure of the individual syllables
of the word may be retrieved (Dell, 1988; but see
Roelofs and Meyer, 1998). Once the segmental and
the metrical information has been retrieved, it is combined during a process called segment-to-frame association. During this process, the previously retrieved
segments are combined from word beginning to end
with their corresponding metrical frame. The resulting phonological string is syllabified according to
universal and language-specific syllabification rules.
A fully prosodified phonological word is generated, which forms the basis for the activation of syllables in a mental syllabary (Levelt and Wheeldon, 1994). Presumably, the units in the syllabary can be conceived of
as precompiled articulatory motor programs of syllabic size. These motor programs may be represented
in terms of gestural scores, i.e., a phonetic plan that
specifies the relevant articulatory gestures and their
relative timing (see Goldstein and Fowler, 2003 for a
review). The final step includes the execution of these
gestures by the articulatory apparatus. This results in
overtly produced speech (see Figure 11).
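The pipeline just described (retrieve segments, retrieve the metrical frame, associate the two, syllabify, address the syllabary) can be made concrete in a short sketch. The following Python toy is an illustration only, not an implementation from the literature: the lexicon, the maximize-onset syllabifier, and all names are invented simplifications.

```python
# Toy sketch of word-form encoding in the spirit of Levelt and Wheeldon (1994).
# All names and data structures are hypothetical simplifications.

VOWELS = set("aeiou")

# Segmental and metrical information are stored separately, as in the model:
# an ordered segment list plus a metrical frame (syllable count, stress slot).
LEXICON = {
    "lepel": {"segments": list("lepel"),
              "frame": {"syllables": 2, "stress": 1}},
}

def syllabify(segments):
    """Group segments into syllables, maximizing onsets."""
    syllables, onset = [], []
    i = 0
    while i < len(segments):
        if segments[i] not in VOWELS:          # consonants join the onset
            onset.append(segments[i])
            i += 1
            continue
        nucleus, coda = [segments[i]], []
        i += 1
        while i < len(segments) and segments[i] not in VOWELS:
            nxt = segments[i + 1] if i + 1 < len(segments) else None
            if nxt in VOWELS:                  # save it for the next onset
                break
            coda.append(segments[i])
            i += 1
        syllables.append("".join(onset + nucleus + coda))
        onset = []
    return syllables

def encode(word):
    """Segment-to-frame association: combine the retrieved segments with the
    retrieved metrical frame, marking the stressed syllable in capitals."""
    entry = LEXICON[word]
    sylls = syllabify(entry["segments"])
    assert len(sylls) == entry["frame"]["syllables"]  # frame must be satisfied
    stress = entry["frame"]["stress"]
    # Each resulting syllable would then address a precompiled motor
    # program in the mental syllabary.
    return [s.upper() if n == stress else s for n, s in enumerate(sylls, 1)]

print(encode("lepel"))  # ['LE', 'pel']: two syllables, initial stress
```

The point of the sketch is the division of labor: segments and the frame are stored separately and brought together late, which is what makes the context-dependent syllabification discussed below cheap to state.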
One puzzling feature of this mechanism is why
segments and metrical frame are retrieved independently from memory when both types of information
are reunified slightly later. However, while this may
seem puzzling when considering single, isolated word
production, it is not when the production of words in
context is taken into account. For instance, syllabification does not respect lexical boundaries since the
domain of syllabification is the phonological word (not the lexical word).

Figure 9 Schematic illustration of the time course of semantic and phonological effects in speech production. SOA = stimulus onset asynchrony.

Figure 10 A model of phonological encoding (after Levelt and Wheeldon, 1994).

Figure 11 Example of the phonological encoding of a picture name.

Let us take the example of the verb to type. Type is a monosyllabic CVC word. Now
consider the words ty.pist (someone who types; dots
indicate syllable boundaries), ty.ping (the gerund), or
the phrase ty.pe it. In all of these examples, the coda /p/ of type /taIp/ becomes the onset of a second syllable. In the example ty.pe it, it even straddles the
lexical boundary between type and it. Therefore,
it is important to bear in mind that segments (phonemes) are not inserted into a lexical word frame, but
into a phonological word frame. The phonological
word, however, is a context-dependent unit. It can
solely consist of the lexical word type as in type
faster, or unstressed function words such as it can
cliticize to it as in type it faster, yielding ty.pe it
/taI.pIt/. A corollary of context-dependent syllabification in speech production is that it would not make much sense to store syllable boundaries with the word forms in the mental lexicon because syllable boundaries change as a function of the phonological
context. The so-called syllable position constraint
observed in sound errors (i.e., onsets exchange with
onsets, nuclei with nuclei, etc.) can probably not hold
as an argument for stored syllable frames because it
may just be a reflection of the general tendency of
segments to interact with phonemically similar segments. Therefore, it makes more sense to postulate
that syllables are not stored with their lexical entries
(Levelt et al., 1999). Rather, syllable boundaries
will be generated on-line during the construction of
phonological words to yield maximally pronounceable syllables. This architecture lends maximal flexibility to the speech production system in all possible
contexts.
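As a worked illustration of this point (again a toy, with invented helper names and a crude phoneme notation in which 'aI' stands for the diphthong and 'I' for the lax vowel), applying a maximize-onset rule to the phonological word rather than to the lexical word derives ty.pe it:

```python
# Toy illustration: syllable boundaries are computed over the phonological
# word, so the coda /p/ of "type" becomes an onset once "it" cliticizes.

VOWELS = {"aI", "I"}

def syllable_boundaries(segments):
    """Crude maximize-onset rule: place a boundary before every consonant
    that immediately precedes a vowel."""
    return [i for i in range(1, len(segments))
            if segments[i] not in VOWELS
            and i + 1 < len(segments) and segments[i + 1] in VOWELS]

def syllabified(segments):
    out = list(segments)
    for b in reversed(syllable_boundaries(segments)):
        out.insert(b, ".")
    return "".join(out)

print(syllabified(["t", "aI", "p"]))            # taIp     type: /p/ is a coda
print(syllabified(["t", "aI", "p", "I", "t"]))  # taI.pIt  ty.pe it: /p/ is an onset
```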

Segmental Encoding
Speech error research has been an important source
of information for the understanding of segmental
encoding (see Shattuck-Hufnagel, 1979 for an overview).
The vast majority of sound errors are single-segment
errors, but sometimes also consonant clusters get substituted, deleted, shifted, or exchanged. Most often,
the word onset is involved in a sound error, although
sometimes also nuclei (beef needle instead of beef
noodle) or codas (god to seen instead of gone
to seed) form part of an error. This points to the
general importance of the word onset in phonological
encoding (see Schiller, 2004 for more details). Some
errors suggest the involvement of phonological
features in planning phonological words, e.g., glear
plue sky (Fromkin, 1971). In this latter example, it

seems as if only the feature [VOICE] changed position, although two independent segmental sound errors (i.e., /k/ → /g/ and /b/ → /p/) cannot be excluded,
either. In fact, often the target and the error only
differ in one single phonological feature, and there is
a tendency for more specified segments to substitute for less specified segments, e.g., documentation → documendation; /t/ [−VOICE] → /d/ [+VOICE] (Stemberger, 1991). The reason for this addition
bias is not entirely clear. One important question
which featural errors raise concerns the representation of segments during on-line processing: are segments represented as phonemic units or as bundles of
phonological features?
In speech production, metalinguistic evidence
(backward talking, language games, etc.) as well as
speech errors (the vast majority of the phonological
slips concern a single phoneme) suggests that the segment is the smallest unit of speech planning. However, as mentioned above, there are some speech
errors which might imply a representation in terms
of phonological features. In Levelt et al.'s (1999)
model, the features of the segments in a syllable
were accessed in parallel. Moreover, Roelofs (1999)
showed that a difference in a single phonological
feature (e.g., been 'leg' [+VOICE], bos 'forest' [+VOICE], pet 'cap' [−VOICE]) is enough to spoil
the so-called preparation effect (see below). This suggests that segments are planning units independent of
their phonological features. However, this finding
does not exclude that features may play a role in
planning a word form. In fact, there are instances
when subphonemic specification is required in speech
production (e.g., I scream and ice cream are segmentally identical, i.e., /aI.skrim/), and it is as yet not clear
how exactly subphonemic details can form part of the
theory (see also McQueen et al., 2003).
Time Course of Segmental Processing

One important question in word processing is the time course of the processes involved. For instance,
does semantic processing precede phonological processing in speech production or do these two processes occur in parallel? Similarly, are the segments of a
word encoded one after the other or are they encoded
in parallel? It was argued above on the basis of empirical evidence (e.g., sound errors) as well as on
theoretical grounds that word forms are planned in
terms of abstract units called segments or phonemes.
Meyer (1990, 1991) had participants produce sets of
words that either overlapped in the onset phoneme
(hut 'tent', heks 'witch', hiel 'heel'), or in the first two phonemes (hamer 'hammer', haring 'herring', hagel 'hail'), or in the first three phonemes (haver 'oats', haven 'haven', havik 'hawk'), or in the final phonemes (haard 'stove', paard 'horse', kaart 'map'). These
were the so-called homogeneous conditions, which
were compared to so-called heterogeneous conditions
in which the words did not overlap at all (hut 'tent', dans 'ballet', klip 'cliff'). Reaction times were found to
be faster when the beginning of the target words could
be planned in advance but not when the final part
could be prepared. The magnitude of this preparation
effect depended on the size of the string that could be
prepared, i.e., the more phonemes overlapped among
the words within a set, the larger the preparation
effect. Importantly, this was only true for beginning-overlap, but not for end-overlap, suggesting that the
phonological planning of words is strictly sequential,
i.e., proceeding in a left-to-right fashion from the
beginning of words to their end. When the onset phoneme is not known, nothing can be prepared.
Wheeldon and Levelt (1995) provided additional
evidence for the incremental nature of segmental phonological encoding. They required bilingual Dutch–English participants to internally generate Dutch translations of English prompt words, which were presented via headphones. However, participants did not overtly produce the Dutch words but self-monitored them internally for previously specified
segments. For example, participants would hear the
English prompt word hitchhiker and were asked to
press a button on a button box in front of them if the
Dutch translation (lifter) contained the phoneme /t/.
Thus, for hitchhiker, participants would press the button as fast as possible, whereas for cream cheese
(roomkaas) they would not. The button press latencies varied as a function of the target phoneme in the
translation word. That is, participants were faster
when the prespecified phoneme (e.g., /t/) was in onset position (e.g., garden wall–tuinmuur) than when it occurred in the middle (e.g., hitchhiker–lifter) or at the end of the translation word (e.g., napkin–servet). The earlier the target phoneme occurred in
the Dutch word, the shorter the decision latencies (see
Figure 12). These data have been interpreted as support for the claim of rightward incremental encoding.
Furthermore, these effects have been localized at the
phonological word level, i.e., when segments and metrical frames are combined, because metrical stress
location influences the effect. Moreover, Wheeldon
and Levelt (1995) observed a significant increase in
monitoring times when two segments were separated
by a syllable boundary. One possibility is that the
monitoring difference between the target segments
at the syllable boundary (e.g., fiet.ser vs. lif.ter)
might be due to the existence of a marked syllable
boundary or a syllabification process that slows down
the encoding of the second syllable.

Figure 12 Mean reaction times of phoneme monitoring as a function of the position of the target phoneme in the word form. The results are taken from a study by Wheeldon and Levelt (1995).

Metrical Encoding
Roelofs and Meyer (1998) investigated how much
information about the metrical structure of words is
stored in memory. Possible candidates are lexical
stress, number of syllables, and syllable structure. In
one experiment, for instance, they compared the production latencies for sets of homogeneous disyllabic
words such as ma.NIER (manner; capital letters
indicate stressed syllables), ma.TRAS (mattress),
and ma.KREEL (mackerel) with sets including
words with a variable number of syllables such
as ma.JOOR (major), ma.TE.rie (matter), and
ma.LA.ri.a (malaria). Lexical stress was kept constant (always on the second syllable). Relative to a
heterogeneous control condition, there was strong
and reliable facilitation for the disyllabic sets but
not for the sets with variable numbers of syllables.
This showed that the number of syllables of a word
must be known to the phonological encoding system.
Hence, this information must be part of the metrical
representation of words.
Similarly, the production of sets of homogeneous
trisyllabic words with constant stress (e.g., ma.RI.ne 'navy', ma.TE.rie 'matter', ma.LAI.se 'depression', ma.DON.na 'madonna') and variable stress (e.g., ma.RI.ne 'navy', ma.nus.CRIPT 'manuscript', ma.TE.rie 'matter', ma.de.LIEF 'daisy') was
measured and compared to the corresponding heterogeneous sets. Again, facilitation was obtained for the
constant sets but not for the variable ones. Therefore,
one can conclude that the availability of stress information is indispensable for planning of polysyllabic
words, at least when stress is in nondefault position.
However, CV structure did not yield an effect. When
the production latencies for words with a constant CV structure (e.g., bres 'breach', bril 'glasses', brok 'piece', brug 'bridge'; all CCVC) were compared to words with a variable CV structure (e.g., brij 'porridge', CCVV; brief 'letter', CCVVC; bron 'source', CCVC; brand 'fire', CCVCC), relative to the corresponding heterogeneous conditions, no difference was found, suggesting that the metrical structure speakers retrieve does not contain information about the CV or syllable structure of a word.

Figure 13 Summary of the stress-monitoring latencies. Depicted are the reaction times of three experiments, two with disyllabic words (example words LEpel–liBEL and TOren–toMAAT) and one with trisyllabic words (example words asPERge–artiSJOK) as a function of the position of the lexical stress in the picture name. The results are taken from a study by Schiller et al. (2005).
Time Course of Metrical Processing

To investigate the time course of metrical processing, Schiller and colleagues employed a tacit naming task
and asked participants to decide whether the disyllabic name of a visually presented picture had initial or
final stress. Their hypothesis was that if metrical
encoding is a parallel process, then there should not
be any differences between the decision latencies for
initial and final stress. If, however, metrical encoding
is also a rightward incremental process just like
segmental encoding, then decision latencies for picture names with initial stress should be faster than for
picture names with final stress. The latter turned out
to be the case (Schiller et al., 2005). However, Dutch, like other Germanic languages, has a strong preference for initial stress. More than 90% of the words
occurring in Dutch have stress on the first syllable.
Therefore, this effect might have been due to a default
strategy. However, when pictures with trisyllabic
names were tested, participants were still faster to
decide that a picture name had penultimate stress
(e.g., asPERge 'asparagus') than that it had final stress (e.g., artiSJOK 'artichoke'). This result suggests
that metrical encoding proceeds from the beginning
to the end of words, just like segmental encoding (see
Figure 13).


Syllables
The contribution of the syllable in the speech production process is quite controversial. Studies by Ferrand
et al. (1996) reported a syllable-priming effect in
French speech production. The visually masked
prime ca primed the naming of ca.rotte better than
the naming of car.table. Similarly, the prime car
primed the naming of car.table better than the naming
of ca.rotte (see Figure 14). This effect is a production
equivalent of the syllabic effect reported by Mehler
et al. (1981). Ferrand et al. (1996) concluded that
the output phonology must be syllabically structured
since the effect disappears in a task that does not
make a phonological representation necessary, such
as a lexical decision task. Furthermore, Ferrand et al.
(1996) argued that their data are compatible with
Levelt's idea of a mental syllabary, i.e., a library of
syllable-sized motor programs. Interestingly, Ferrand
et al. (1997) also report a syllable-priming effect for
English. This is surprising considering the fact that
Cutler et al. (1983) could not obtain a syllabic effect in English speech perception.
However, when Schiller (1998) tried to replicate
the syllabic effects in Dutch speech production, he
failed to find a syllabic effect. Instead, what he
obtained was a clear segmental overlap effect, i.e.,
the more overlap between prime and target picture
name, the faster the naming latencies. That is, the
prime kan yielded faster responses than ka not only for the picture of a pulpit (kan.sel) but also for the picture of a canoe (ka.no) (see Figure 15).
Similar results were obtained in the auditory modality, i.e., presenting either /ro/ or /rok/ when Dutch
participants were requested to produce either ro.ken
(to smoke) or rook.te (smoked). In fact, in the
auditory modality a segmental overlap effect was also obtained, i.e., /rok/ was a better prime than /ro/, independent of the target. The failure to find a syllable-priming effect in Dutch is in agreement with the
statement that syllables are never retrieved during
phonological encoding (Levelt et al., 1999).

Figure 14 Mean reaction times (picture-naming latencies) per prime and target category in the Ferrand et al. (1996) study.

The syllable-priming effect found by Ferrand et al. (1996) in French can be accounted for by assuming that the segments in the prime are coded with their
corresponding syllable structure information. For instance, the prime pal preactivates segments specified
for syllable position in the perceptual network, e.g., p(onset), a(nucleus), and l(coda). Active phonological segments in the perceptual network can directly affect
the corresponding segment nodes in the production
lexicon. Therefore, the prime matches with the target
pal.mier, but not with pa.lace because the /l/ in pal is
specified for coda and not for onset.
The segmental overlap effect is not restricted to
Dutch. When Schiller (2000) tried to replicate the
Ferrand et al. (1997) results for English with better-controlled material, no syllabic effect was obtained
but a segmental overlap effect was. These English
data are interesting because in English there is phonological equivalence between corresponding syllable
structures. For example, pi /paI/ matches phonologically the first syllable in pilot but not in pillow, and pil
/pIl/ matches phonologically the first syllable in
pillow but not in pilot. Nevertheless, the prime pil
yielded faster responses than pi for both pilot and
pillow (see Figure 16).
Figure 15 Mean reaction times (picture-naming latencies) per prime and target category in the Schiller (1998) study.

Figure 16 Mean reaction times (picture-naming latencies) per prime and target category in the Schiller (2000) study.

Either the contribution of vowels is less important in segmental priming, or consonants and vowels have different time courses of activation (Berent and Perfetti, 1995), consonants being faster than vowels and therefore more effective. Further testing revealed
that there is no syllable effect in Spanish, but a small
segmental overlap effect (Schiller et al., 2002), and no
syllabic effect in French when a larger set of materials
is tested (Schiller et al., 2002). Taken together, these
results support the idea that syllables are not retrieved, but created on-line during phonological
encoding.
Mental Syllabary

The existence of a mental syllabary is a hotly debated topic. The original idea for a library of articulatory routines comes from work on speech errors
(Crompton, 1981; Levelt, 1989). The idea was that
precompiled motor programs of syllable size could
help reduce the computational load during speech
production if they form the basic units of articulatory
programming. This idea is attractive from a lexicostatistical point of view since the majority of the
speech in Dutch (about 85%) can be produced with
a minority of the Dutch syllables (only 5% of all
Dutch syllables). Therefore, Levelt and Wheeldon
(1994) tested this idea in an experiment comparing
the production latencies of words differing in syllable
frequencies. For instance, there were words in the
experiment that consisted of high-frequency syllables
(e.g., bo.ter 'butter') and words that were made up from low-frequency syllables (e.g., gi.raf 'giraffe')
while word frequency was controlled. Results showed
that words with high-frequency syllables were named
significantly faster than words with low-frequency
syllables, independent of word frequency. Levelt and
Wheeldon (1994) took this finding as evidence for
a separate store from which syllabic units can be
recruited during speech production. However, syllable frequency correlates highly with segment or
phoneme frequency. Therefore, the effect reported
by Levelt and Wheeldon (1994) could just as well be attributed to segment frequency. When segment frequency was controlled, a small set of awkward word
stimuli remained and the syllable frequency effect
disappeared.
Although syllables cannot be primed in Dutch,
Cholin et al. (2004) found that syllable structure can
be prepared in the planning of speech production.
Syllables probably emerge at the interface between
phonological and phonetic encoding. In a follow-up
study, the same authors found significant syllable
frequency effects in pseudoword production when
segment frequency was controlled for (Cholin et al.,
in press). This latest result strongly supports the notion of a mental syllabary that mediates between
abstract phonological syllables and phonetic syllables, which are conceived of as precompiled gestural

scores to control the execution of an articulatory motor program.

Summary and Conclusion


In this article, I described the role of phonology in the
production of words. A model of phonological encoding was described. Certain aspects of this model, such
as the role of segments and metrical frames, were
discussed in more detail. It was argued on the basis
of speech error and reaction time data that segments
rather than phonological features play a role in production planning, while more subphonemic detail is
necessary to account for the speech comprehension
data. Furthermore, the nature of metrical frames was
described and it was argued that segments as well as
lexical stress are encoded rightward incrementally.
Finally, the role of syllables in speech production
was sketched and the role of a mental syllabary was
discussed. It is concluded that more research on phonological processing is necessary to specify aspects of
the model that are currently underspecified.

See also: Dutch; English in the Present Day (since ca. 1900); Phonology–Phonetics Interface; Speech Errors as Evidence in Phonology; Speech Errors: Psycholinguistic Approach; Speech Production; Spoken Language Production: Psycholinguistic Approach; Syllable: Phonology; Word Stress.

Bibliography
Berent I & Perfetti C A (1995). A rose is a REEZ: the two-cycles model of phonology assembly in reading English. Psychological Review 102, 146–184.
Cholin J, Levelt W J M & Schiller N O (in press). Effects of syllable frequency in speech production. Cognition.
Cholin J, Schiller N O & Levelt W J M (2004). The preparation of syllables in speech production. Journal of Memory and Language 50, 47–61.
Crompton A (1981). Syllables and segments in speech production. Linguistics 19, 663–716.
Cutler A, Mehler J, Norris D & Segui J (1983). A language-specific comprehension strategy. Nature 304, 159–160.
Dell G S (1988). The retrieval of phonological forms in production: tests of predictions from a connectionist model. Journal of Memory and Language 27, 124–142.
Ferrand L, Segui J & Grainger J (1996). Masked priming of word and picture naming: the role of syllabic units. Journal of Memory and Language 35, 708–723.
Ferrand L, Segui J & Humphreys G W (1997). The syllable's role in word naming. Memory and Cognition 35, 458–470.



Fromkin V A (1971). The non-anomalous nature of anomalous utterances. Language 47, 27–52.
Garrett M F (1975). The analysis of sentence production. In Bower G H (ed.) The psychology of learning and motivation 9. San Diego: Academic Press. 133–177.
Goldstein L & Fowler C A (2003). Articulatory phonology: a phonology for public language use. In Schiller & Meyer (eds.). 159–207.
Guenther F (2003). Neural control of speech movements. In Schiller & Meyer (eds.). 209–239.
Indefrey P & Levelt W J M (2004). The spatial and temporal signatures of word production components. Cognition 92, 101–144.
Levelt W J M (1989). Speaking: from intention to articulation. Cambridge, MA: MIT Press.
Levelt W J M & Wheeldon L (1994). Do speakers have access to a mental syllabary? Cognition 50, 239–269.
Levelt W J M, Roelofs A & Meyer A S (1999). A theory of lexical access in speech production. Behavioral and Brain Sciences 22, 1–75.
McQueen J M, Dahan D & Cutler A (2003). Continuity and gradedness in speech processing. In Schiller & Meyer (eds.). 39–78.
Mehler J, Dommergues J Y, Frauenfelder U & Segui J (1981). The syllable's role in speech segmentation. Journal of Verbal Learning and Verbal Behavior 20, 298–305.
Meyer A S (1990). The time course of phonological encoding in language production: the encoding of successive syllables of a word. Journal of Memory and Language 29, 524–545.
Meyer A S (1991). The time course of phonological encoding in language production: phonological encoding inside a syllable. Journal of Memory and Language 30, 69–89.
Roelofs A (1999). Phonological segments and features as planning units in speech production. Language and Cognitive Processes 14, 173–200.
Roelofs A & Meyer A S (1998). Metrical structure in planning the production of spoken words. Journal of Experimental Psychology: Learning, Memory, and Cognition 24, 922–939.
Schiller N O (1998). The effect of visually masked syllable primes on the naming latencies of words and pictures. Journal of Memory and Language 39, 484–507.
Schiller N O (2000). Single word production in English: the role of subsyllabic units during phonological encoding. Journal of Experimental Psychology: Learning, Memory, and Cognition 26, 512–528.
Schiller N O (2004). The onset effect in word naming. Journal of Memory and Language 50, 477–490.
Schiller N O & Meyer A S (eds.) (2003). Phonology and phonetics in language comprehension and production: differences and similarities. Berlin: Mouton de Gruyter.
Schiller N O, Costa A & Colomé A (2002). Phonological encoding of single words: in search of the lost syllable. In Gussenhoven C & Warner N (eds.) Laboratory phonology 7. Berlin: Mouton de Gruyter. 35–59.
Schiller N O, Jansma B M, Peters J & Levelt W J M (2005). Monitoring metrical stress in polysyllabic words. Language and Cognitive Processes.
Schriefers H, Meyer A S & Levelt W J M (1990). Exploring the time course of lexical access in language production: picture–word interference studies. Journal of Memory and Language 29, 86–102.
Shattuck-Hufnagel S (1979). Speech errors as evidence for a serial ordering mechanism in sentence production. In Cooper W E & Walker E C T (eds.) Sentence processing. New York: Halsted Press. 295–342.
Stemberger J P (1991). Apparent anti-frequency effects in language production: the addition bias and phonological underspecification. Journal of Memory and Language 30, 161–185.
Wheeldon L & Levelt W J M (1995). Monitoring the time course of phonological encoding. Journal of Memory and Language 34, 311–334.

Phonology: Optimality Theory


D B Archangeli, University of Arizona, Tucson,
AZ, USA
© 2006 Elsevier Ltd. All rights reserved.

Introduction
Optimality theory, introduced in the early 1990s
(Prince and Smolensky, 1993; McCarthy and Prince,
1993a,b), offers an extremely simple formal model of
language, with far-reaching implications for how language works. The formal component of each grammar consists of a ranked ordering of a universal set
of constraints; this ordering is used to identify the
best pairing between a given input and all potential

outputs. Languages differ not by the constraints used, but by the ranking used: the constraints are universal. Where a particular constraint falls in the constraint hierarchy of a language determines how roundly violated that constraint will be, because all constraints are violable. The ranking determines which violations matter for which input–output mappings (for overviews, see Archangeli and
Langendoen, 1997; Kager, 1999; McCarthy, 2002,
2004).
During the 1970s and into the 1980s, phonological
research centered on phonological representations.
As representations were better understood, the rules
relating those representations to each other seemed to


simplify and become more illuminating. However, by the late 1980s, this simplification of rules did not
lead to a more viable theory of rules. Rather, in order
to account for language phenomena, both rules and
constraints were used. Much of phonological exploration in the late 1980s and early 1990s focused on
the role of constraints in grammars (e.g., Archangeli
and Pulleyblank, 1994). A separate class of research
programs explored derivationless phonology (e.g.,
Goldsmith, 1993). Optimality theory blends these
two lines of exploration, using constraints (instead
of a rule-based derivation) to mediate between input
and output.
Understanding the architecture of optimality theory
is a prerequisite to exploring some of the many ramifications of that architecture. Throughout this article,
the examples are phonological because the theory is
most developed through phonological studies. However, as discussed in the closing section, optimality
theoretic analyses of syntactic and morphological
structure are among the extensions of the model.

Understanding the Model


The architecture of optimality theory is deceptively
simple. It is shown schematically in Figure 1. Humans
are equipped with a universal set of constraints (CON)
stating what is possible or impossible in language.
Learning a language consists, in part, of learning
a ranked ordering for that set of constraints, i.e.,
the constraint hierarchy for that language. The correct
constraint hierarchy selects the best pairing between a
given input and an output candidate (CAND). An operation known as generator (GEN) creates the set of
output candidates from which the optimal one is
selected. Candidate selection results from evaluation
(EVAL), an operation that compares the various candidates to the constraint hierarchy in order to select the
best match.
GEN and EVAL

GEN and EVAL are the only operations available in this theory. The function of GEN is to map the relations
between any given input and the candidate set. This
mapping shows correspondences between the elements of the input and the elements in each candidate.
Figure 1 Schematic of the optimality theory model.

For example, with an input /abc/ and the candidate [dab], we might imagine a one-to-one mapping such that input /a/ maps to [d], input /b/ to [a], and input /c/
to [b] (these relations are shown graphically in Figure
2A). We could also imagine that the input /a,b/ maps
directly to candidate [a,b], but that input /c/ has no
counterpart in the candidate, and candidate [d] has
no counterpart in the input (Figure 2B).
EVAL is responsible for determining the best pairing
between the input and candidate set in each language.
For EVAL to function in a language-particular fashion,
it takes into account the input, the candidate set (i.e.,
the results of GEN), and the constraint hierarchy as
defined for the language in question. In classic optimality theory, EVAL compares each candidate to the requirements of the highest-ranked constraint (Ci) and eliminates all candidates except those with the fewest violations of Ci. EVAL then
examines the remaining viable candidates against the
second constraint (Cj) in the language and again eliminates all but those with the fewest violations of Cj. EVAL continues through the constraint hierarchy in this fashion, eliminating candidates until only
one remains, which is the optimal candidate, given
the input and the constraint hierarchy. (Alternative
versions of GEN and EVAL have been proposed in order
to make the model finite and so to introduce the
potential of psychological reality; the versions presented here are the classic functions, as introduced
in the original works.)
Tableaux

Practitioners of optimality theory use two conventions for visually representing the function of EVAL
for a given input. First, the candidate set is limited
to those most likely to be contenders for the best
match. Second, the evaluation of those candidates is
depicted in a chart, called a tableau. The tableau
helps to understand even relatively simple cases.
To understand how a tableau works, it is useful to
start with a concrete, but limited, example: a hypothetical CON with only two conflicting constraints in
it, Ci and Cj. This allows for two languages. In one
language, the constraint hierarchy ranks Ci above Cj, denoted Ci ≫ Cj, whereas in the other language, the opposite ranking holds, Cj ≫ Ci.

Figure 2 Possible relations between a given input and a potential candidate.

Assume for the moment that for a given input, GEN produces only two candidates, CAND1 and CAND2, such that CAND1 violates only Ci and CAND2 violates only Cj. A bit of thought reveals that in the first language, L1, CAND2
will be preferred, despite violating Cj. This is because
Ci outranks Cj and the only competing candidate
violates Ci. The opposite case holds in the second
language, L2, in which CAND1 wins. The bit of
thought involved in working through this is aided
considerably by a tableau, a visual presentation of
the logic. Figure 3 shows tableaux for the two cases
just described; in each case, the optimal candidate is
identified with a pointing finger symbol (☞). Each
tableau is arranged with the input in the top left corner and the constraints arrayed across the top from
left to right (corresponding to highest ranked to lowest ranked). Candidates go in the leftmost column.
Each cell then shows a pairing between a candidate
and a constraint. An asterisk is put in cells corresponding to the pairing of a candidate with a constraint that the candidate violates. For example, in
Language 1 (Figure 3A), the winning CAND2 violates
Cj, as shown by the asterisk in the cell pairing CAND2
and Cj. The notation *! is used to identify the constraint violation that eliminates a particular candidate, i.e., a fatal violation. This occurs when there is
some other candidate with fewer violations for the
constraint in question. Again, in Language 1, CAND1
fatally violates the top-ranked Ci, shown by the *! in
the cell pairing CAND1 and Ci.
In Language 2, CAND2 (which violates Cj) is eliminated by Cj because Cj is the highest ranking constraint under consideration; this fatal violation is
indicated by *! notation. Since CAND1 does not violate Cj, it remains in the competition. Also, because
there are no other candidates to consider, CAND1 is
selected as the optimal candidate, despite its violation
of the lower ranked Ci. The gray-shaded cells in each
tableau indicate constraint–candidate pairings that
are no longer relevant because a decision has already
been made.
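The evaluation logic behind these tableaux is mechanical enough to state in a few lines of code. The following Python fragment is a toy sketch of classic EVAL as successive filtering, not an implementation from the OT literature; constraints are modeled as functions that return violation counts, and all names are invented.

```python
# Toy sketch of classic EVAL: filter the candidate set constraint by
# constraint, keeping only candidates with the fewest violations each time.

def evaluate(candidates, hierarchy):
    """hierarchy: violation-counting functions, highest ranked first."""
    survivors = list(candidates)
    for constraint in hierarchy:
        fewest = min(constraint(c) for c in survivors)
        survivors = [c for c in survivors if constraint(c) == fewest]
        if len(survivors) == 1:        # a single optimal candidate remains
            break
    return survivors[0]

# The two-constraint example just described: CAND1 violates only Ci,
# CAND2 violates only Cj.
Ci = lambda cand: 1 if cand == "CAND1" else 0
Cj = lambda cand: 1 if cand == "CAND2" else 0

print(evaluate(["CAND1", "CAND2"], [Ci, Cj]))  # Language 1 (Ci >> Cj): CAND2
print(evaluate(["CAND1", "CAND2"], [Cj, Ci]))  # Language 2 (Cj >> Ci): CAND1
```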

Figure 3 Tableaux for two simple languages, varying by the ranking of the two constraints; in each case, the optimal candidate is identified with a pointing finger symbol (☞).

CON, the Constraint Set

The remaining component to examine is the constraint set. There are two basic types of constraints, Faithfulness constraints and Markedness constraints.

Faithfulness Constraints Faithfulness constraints evaluate the goodness of the match between the input and the candidate. A complete match is fully faithful; it violates no Faithfulness constraints: input /abc/ maps to candidate [abc] in a fully faithful manner. By contrast, candidates such as [ab], [abd], and [dabc] violate some Faithfulness constraint(s). The insight behind Faithfulness constraints is that languages need to be able to distinguish a variety of words. In order to do so, it is important that the constraint system have the ability to maintain differences between inputs. Without Faithfulness, all inputs would converge on the same output, because the only constraints would be Markedness constraints, which govern preferred outputs.

Faithfulness constraints formally encode the variety of ways in which there can be a match between input and candidate. Violations are incurred by mismatches. For example, there can be a mismatch because the output contains something that is not in the input; there can also be a mismatch because the output does not contain something that is present in the input (recall Figure 2). Each distinct type of mismatch is characterized by a distinct type of Faithfulness constraint (see McCarthy and Prince, 1995).

Markedness Constraints Markedness constraints are evaluated based on output candidate form alone. Each language has its own distinct sound patterns, including the syllable shapes permitted, the inventory of sounds from which words are made, and the sound sequences allowed. Markedness constraints formally define these preferred configurations. Intriguingly, taken as a group, they also reflect cross-linguistic patterns of markedness, in the sense defined in Chomsky and Halle (1968). Including Markedness constraints in CON, the universal set of constraints, directly encodes universal properties of language in the grammar of each specific language.

Summary

Universal grammar provides a set of constraints, CON, and two operations, GEN and EVAL. The language learner acquires inputs and a ranking of constraints, comprising the language's constraint hierarchy. The constraint hierarchy for a given language provides the ranking of Markedness and Faithfulness constraints for that language. This ranking results in interaction among the different types of constraints, which determines the shape of output forms, and so defines possible words for the language. GEN creates input/candidate pairings and EVAL uses the constraint hierarchy to evaluate those pairings to find the best match for a given input.

Consequences
Testing optimality theory against a variety of different linguistic effects shows that numerous effects
follow directly from the architecture of the model.
There is no need to add particular constraints, types
of representations, or additional operations to the
straightforward model sketched in the preceding
section. This section presents a number of phonological phenomena and shows how optimality theory explains these effects through the interaction of
Markedness and Faithfulness constraints (for a lucid
and thorough discussion of these and other results of
optimality theory, see McCarthy (2002)).
Inventories

The term inventory refers to the set of sounds of a language: the velar stops k, g are in the inventory of English but the velar fricatives x, ɣ are not. Quite
often in phonological analysis a distinction is made between the underlying inventory (the set of sounds from which lexical entries are made) and the surface inventory (the set of sounds from which surface representations are made); in analyses of English, the underlying inventory typically does not include aspirated voiceless stops, yet they are part of the surface inventory.
Within optimality theory, a distinction between
input and output inventories is impossible because
there are no constraints on the input and so no way
of encoding an input inventory. The only constraints
available are those in the constraint hierarchy of the
language, governing the relation between input and
output (Faithfulness) or the well-formedness of the
output (Markedness). The constraints on the well-formedness of output forms determine the sound set that is realized in the language. The absence of restrictions on the input gives rise to a concept called 'richness of the base'. Richness refers to a lack of constraints, and the base is the set of possible inputs.
Formally, the free combination of linguistic primitives
produces the set of possible inputs. The wide variety
of sounds available in the input is filtered through
the constraint hierarchy. In particular, Markedness constraints limit the types of sounds possible in the output, whereas Faithfulness allows for a variety of
sounds to surface. It is the interaction of Faithfulness and Markedness constraints that produces the
inventory effect.
Consider, for example, a language without front
round vowels (see Figure 4). Under optimality theory,
this would be expressed by a Markedness constraint, possibly *[round, front]: no [round] and [front] combination. Since no front round vowels occur in the
language, this would be a high-ranked constraint.
Crucially, this constraint must outrank the Faithfulness constraint for one of [front] and [round], as
the tableaux in Figure 4 reveal. Should a candidate
contain a front round vowel, it will incur a fatal
violation even if that vowel is completely identical
to the input vowel. The relative ranking of Faithfulness to [front] and to [round] determines whether an
input /y/ maps to [i] (Figure 4A) because FaithFront is high ranking, or to [u] (Figure 4B) because FaithRound is high ranking.
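This example can be checked mechanically in the same toy style as before (the feature sets and constraint names below are simplifications invented for illustration, not definitions from the source):

```python
# Toy check of the front-round-vowel example: *[round, front] dominates
# one of the two Faithfulness constraints, which decides where /y/ maps.

FEATURES = {"y": {"front", "round"}, "i": {"front"}, "u": {"round"}}

star_round_front = lambda inp, out: int({"front", "round"} <= FEATURES[out])
faith_front = lambda inp, out: int(("front" in FEATURES[inp]) !=
                                   ("front" in FEATURES[out]))
faith_round = lambda inp, out: int(("round" in FEATURES[inp]) !=
                                   ("round" in FEATURES[out]))

def best(inp, candidates, hierarchy):
    survivors = list(candidates)
    for con in hierarchy:
        fewest = min(con(inp, c) for c in survivors)
        survivors = [c for c in survivors if con(inp, c) == fewest]
    return survivors[0]

cands = ["y", "i", "u"]
print(best("y", cands, [star_round_front, faith_front, faith_round]))  # i
print(best("y", cands, [star_round_front, faith_round, faith_front]))  # u
```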
Formally, there are no limits on the input. The
consequence for the practitioner of optimality theory
is the need to consider very carefully all sorts of
peculiar inputs, to be sure that the constraint hierarchy proposed for a language does indeed result in the
identification of appropriate input–output pairings.
There have been efforts to curtail this property,
most notably the concept of lexicon optimization.

Figure 4 Tableaux for a language without front round vowels; the optimal candidate is identified with a pointing finger symbol (☞).

However, as McCarthy (2002: 78) pointed out, lexicon optimization 'exists mainly to provide reassurance that the familiar underlying forms are still identifiable in the rich base. It is irrelevant as a principle of grammar.'
Distributional Restrictions

Closely related to the concept of inventories, and handled by optimality theory in much the same way, is the notion of distributional restrictions. Distributional restrictions refer to cases in which certain sounds
is the notion of distributional restrictions. Distribution restrictions refer to cases in which certain sounds
in a language are permitted only in certain environments. A familiar form of distributional restriction is
complementary distribution. An example of this is
aspirated and plain voiceless stops in English: aspirated stops are found initially and before stressed
vowels, and plain stops are found elsewhere. Another
type of distributional restriction is contextual neutralization: a particular distinction is eliminated (neutralized) in a particular context. For example, English
allows a wide variety of consonant clusters at the end
of a syllable, but disallows a nasal-stop sequence
when the two consonants have different places of
articulation: nt, nd, and mp are acceptable, whereas
np, md, and mg are not acceptable.
The interaction between particular types of Faithfulness and Markedness constraints readily captures
these distributional restrictions. The Faithfulness
constraints in question require identity between the
input and the output (e.g., IDENT(voice): corresponding input and output segments have the same voicing
value). The Markedness constraints involve a prohibition against a particular type of segment (e.g.,
OBSVOI: no voiced obstruents) and a requirement for
that prohibited type of segment in a particular context (e.g., *NC(-voice): after nasals, obstruents are
voiced). As shown in Figure 5, there are three interactions of interest among constraints such as these
(see Pater, 1999). First, if IDENT(voice) is the highest
ranked of the three, then input obstruents such as /t/
and /d/ will surface as [t] and [d], respectively, because
the input value for [voice] matches the output value,
regardless of context. This is simply the expected case
with high-ranking Faithfulness: the two classes of
obstruents occur in the same environments, as is the
case with alveolars in English (want, wand).
Suppose, however, that the top-ranked constraint is
*NC(-voice), prohibiting nasal-voiceless obstruent
sequences. In this case, there are two outcomes,
depending on the ranking of the other two constraints. When OBSVOI (obstruents are voiceless) is
ranked above IDENT(voice), voiced and voiceless
obstruents occur in complementary distribution,
with voiced obstruents following nasals and voiceless
ones elsewhere. Figure 6A shows the input /nt/, which does not faithfully survive because of the high-ranked constraint prohibiting such clusters. The
other two constraints are irrelevant: the cluster [nd]
surfaces. The other inputs of interest are shown in
Figure 6B,C, in which either input, /t/ or /d/, is
mapped to output [t] because the prohibition against
voiced obstruents outranks faithfulness. Schematically, complementary distribution is characterized
by Contextual Markedness ≫ Generic Markedness ≫ Faithfulness. Proponents of optimality theory
note that complementary distribution is predicted
simply by the existence of Markedness and Faithfulness constraints. Nothing additional needs to be
added to the formal model.
The ranking Contextual Markedness ≫ Faithfulness ≫ Generic Markedness gives rise to contextual
neutralization. Returning to the preceding constraint
set, the relevant constraint hierarchy is depicted in the
tableaux in Figure 7. The tableau in Figure 7A shows
the positional case wherein, after a nasal, the voiced/
voiceless distinction is neutralized to voiced obstruents. By contrast, the other two tableaux (Figure 7B,C)
show that input voiced and voiceless obstruents
surface without change when they do not follow a
nasal. This ranking, then, produces contextual neutralization, and again, this type of sound pattern is
expected, given the nature of constraint interaction
(see also Wilson (2000) on contextual neutralization).
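The three rankings can likewise be verified mechanically. In the sketch below (a self-contained Python toy; the string-based constraint definitions are simplified stand-ins for IDENT(voice), OBSVOI, and *NC(-voice) as glossed above), each ranking reproduces the pattern described in the text.

```python
# Toy check of the three rankings discussed in the text. The constraint
# definitions are simplified string-based stand-ins (illustration only).

VOICED, VOICELESS, NASAL = set("bdg"), set("ptk"), set("mn")

def ident_voice(inp, out):   # IDENT(voice): output voicing matches the input
    return sum(1 for i, o in zip(inp, out)
               if (i in VOICED) != (o in VOICED))

def obs_voi(inp, out):       # OBSVOI: no voiced obstruents
    return sum(1 for o in out if o in VOICED)

def nc_voiceless(inp, out):  # *NC(-voice): no nasal + voiceless obstruent
    return sum(1 for a, b in zip(out, out[1:])
               if a in NASAL and b in VOICELESS)

def best(inp, candidates, hierarchy):
    survivors = list(candidates)
    for con in hierarchy:
        fewest = min(con(inp, c) for c in survivors)
        survivors = [c for c in survivors if con(inp, c) == fewest]
    return survivors[0]

# IDENT(voice) top-ranked: both voicing values surface faithfully (want, wand).
print(best("nt", ["nt", "nd"], [ident_voice, nc_voiceless, obs_voi]))  # nt

# Contextual >> Generic Markedness >> Faithfulness: complementary
# distribution (voiced obstruents after nasals, voiceless elsewhere).
print(best("nt", ["nt", "nd"], [nc_voiceless, obs_voi, ident_voice]))  # nd
print(best("d", ["t", "d"], [nc_voiceless, obs_voi, ident_voice]))     # t

# Contextual Markedness >> Faithfulness >> Generic Markedness:
# contextual neutralization (voicing preserved except after nasals).
print(best("nt", ["nt", "nd"], [nc_voiceless, ident_voice, obs_voi]))  # nd
print(best("d", ["t", "d"], [nc_voiceless, ident_voice, obs_voi]))     # d
```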
Figure 5 Tableaux showing that top-ranked Faithfulness obviates lower-ranked Markedness constraints; the optimal candidate is identified with a pointing finger symbol (☞).

Processes and Conspiracies

One class of phenomena that might be presented as a serious challenge to optimality theory is the phonological process, depicted schematically by the A → B / C__D formula (Chomsky and Halle, 1968). This type of characterization of a phonological process includes
an element identifying the target (A), a statement of
the change (B), and a characterization of the context
in which the change takes place (C__D). Optimality
theory does not have the luxury of encoding processes
in packages of this sort: optimality theory has only
constraints with which to achieve these results. In
fact, however, this is viewed as a strength of the
model: optimality theory requires that the elements
of the process be separated from each other.
A Markedness constraint *CAD prohibits the target
string CAD, while separate Markedness constraints
identify the preferred configuration. The implication
of separating the target from the change is that languages are predicted to resolve CAD in different
ways. In the classic CAD case, the illicit sequence
maps to CBD. However, two other alternatives are
possible: BAD and CAB. Under optimality theory, the
interaction of constraints determines exactly what
surfaces in place of the prohibited CAD. For example, many languages do not allow consonant clusters
(CCs) within syllables. Interestingly, as a class, these
languages use different methods to handle CC

sequences that could end up in a single syllable: a consonant might be deleted, a vowel might be added, or a consonant might be realized as a vowel. In each case, the Markedness constraint *CC prohibits consonant clusters within a single syllable, but the resolution depends on other constraints, such as the relative rankings of Faithfulness constraints preventing the addition or loss of a segment, or the addition/loss of certain features (for a consonant to be realized as a vowel). Optimality theory predicts a variety of resolutions to configurations that are ruled out by a high-ranked Markedness constraint, and this variety is found in languages.

Figure 6 Tableaux showing the same constraints as in Figure 5, with different inputs to show complementary distribution (voiced obstruents after nasals; voiceless obstruents elsewhere); the optimal candidate is identified with a pointing finger symbol (☞).
Another result of the separation of processes forced
by optimality theory is an account of the class of
cases called conspiracies in the generative literature
(Kisseberth, 1970). A conspiracy is said to exist in
a language when several different processes all converge on a single output configuration. This is common with syllabification, whereby epenthesis,
deletion, and vowel shortening (three independent
processes) all converge on maximally bimoraic syllables with no complex margins (CV, CVC, and CVV).
Under optimality theory, conspiracies are characterized by high-ranking Markedness constraints that

define the general properties of the language. The ranking of other Markedness and Faithfulness constraints below those top-ranked constraints serves to define exactly how the conspiracy plays out with respect to each input.

Figure 7 A change in the ranking shown in Figure 6 produces contextual neutralization; the optimal candidate is identified with a pointing finger symbol (☞).
Emergence of the Unmarked

As has already been demonstrated with the preceding tableaux, even constraints that are dominated by
other constraints can play a deciding role. This happens whenever the higher ranked constraint does not
make a decision between competing candidates. The
emergence of the unmarked (McCarthy and Prince,
1994) refers precisely to this case: some Markedness
constraint is sufficiently outranked to be roundly violated in the surface forms of the language. However, when the higher ranked constraints fail to determine a single optimal candidate, the dominated
Markedness constraint can cast the deciding vote,
and the unmarked configuration emerges. For example, in Kinande, there are 10 vowels, 5 with advanced
tongue root and 5 with retracted tongue root. There is
also a harmony pattern whereby all vowels in a single
word are typically either advanced or retracted.
However, this harmony pattern is mediated in various
ways. In one context, when all other constraints fail
to select the optimal pattern, the critical factor is
that the affected vowel is front: the unmarked combination [advanced, front] emerges (Archangeli and
Pulleyblank, 2002). This type of sound pattern is
particularly intriguing because it is an obvious and
expected result of optimality theory. The interaction
of Markedness and Faithfulness constraints predicts just this type of case. By contrast, it cannot
be characterized as a coherent class in a rule-based
derivational model.
Universals and Typology

Each Markedness constraint prefers or bans some configuration. Under optimality theory, constraints
are universal. Putting these two properties of the
model together means that the Markedness constraints in optimality theory express universal preferences. The model also claims that universals need not
be surface true in all languages. This follows from the
principles that constraints are violable and that languages vary in how the constraints are ranked in
constraint hierarchies. In this fashion, optimality theory answers one question that has dogged studies of
language universals: why it is so difficult to find universals that are indeed universal. Optimality theory's
answer is that the constraints are universal, but they
are also violable.
There is a second side to universals that often
receives less attention: the universal absence of some configuration. As pointed out in McCarthy (2002: 108), '[i]f no permutation of the constraints in CON
produces a language with property P, then languages
with P are predicted not to exist.' This does, of
course, lead to a research challenge: when P is found
to exist after all, does that disconfirm the theory, or
does it mean that the constraints in CON are not yet
the correct set of constraints?
A further result of positing a rankable CON is the
claim that each different possible ranking corresponds to a possible language. In this way, optimality
theory is inherently typological. This fact about the
model introduces a neat way of checking a particular
analysis. Suppose an analysis posits three constraints,
crucially ranked Ci ≫ Cj ≫ Ck. Because constraints
can be ranked differently in different languages, if
these three constraints are members of CON, there
should then be languages that correspond to the
other possible rankings of these three constraints:

Ci ≫ Cj ≫ Ck
Ci ≫ Ck ≫ Cj
Cj ≫ Ci ≫ Ck
Cj ≫ Ck ≫ Ci
Ck ≫ Ci ≫ Cj
Ck ≫ Cj ≫ Ci
This observation is known as the factorial typology:
for n constraints, freely ranked, there are n! possible
languages. The research challenge is to check each of
those possible languages to ascertain whether they
are attested or, if not attested, at least appear to be
feasible as human languages.
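As a quick illustration (a sketch, not from the source; the constraint names are placeholders), the factorial typology of a small constraint set can be enumerated directly:

from itertools import permutations

constraints = ['Ci', 'Cj', 'Ck']
for ranking in permutations(constraints):
    # Each total ranking corresponds to one predicted language.
    print(' >> '.join(ranking))

# n constraints yield n! rankings: 3! = 6, matching the list above.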
There is one caveat about the free ranking of constraints: certain phenomena, such as the sonority hierarchy, suggest that there might be some sets of
constraints with fixed rankings with respect to other
members of the set. Accepting fixed rankings within
CON restricts the number of possible grammars somewhat but does not change the essence of the factorial
typology.
Summary

This section has provided several illustrations of consequences of optimality theory, based on the essential architecture of the model. Some of these
consequences show how optimality theory handles
familiar phonological phenomena such as inventories, segment distribution, and phonological processes. Other consequences, such as the typological
and universal effects, are perhaps more profound in
that they show how optimality theory as a model broadly encompasses aspects of language structure that are independent of a particular language. At the
same time, through the rankings of the universal constraint set CON, these effects actually conspire to form
each individual language.

Extensions
Optimality theory was introduced as a model of synchronic adult phonological grammar. Since the inception of the model, research has explored extending the
model to better understand domains as diverse as learnability, syntax, semantics, language change, and language variation. The following sections introduce the
optimality theory perspective on a few of these topics.
Learnability

The clear formal properties of optimality theory lend themselves to explorations of learnability, i.e., whether the constraint hierarchy for a particular language is
learnable. Other formal aspects of the model, such as
the infinite candidate set created by GEN, suggest that
the grammar would be totally unlearnable: The learner would get stuck generating the infinite set of candidates for the first input attempted, let alone the
morass of trying to sort out the correct ranking among the n! possible rankings of CON. The learnability question
has been explored since the inception of optimality
theory (Tesar and Smolensky, 1998; Riggle, 2004).
The most successful learnability model, recursive
constraint demotion, starts with the learner knowing
a few important input–output pairings. An alternative that does not require such knowledge is the
genetic algorithm (Pulleyblank and Turkel, 2000);
this model starts with random triples, i.e., an input, an output, and a constraint hierarchy. With each iteration,
the better triples increase their chances of survival
while the poorly matched triples (i.e., ones for
which the input–output pairing is not selected by the
constraint hierarchy in the triple) decrease their
chances of survival.
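The demotion idea can be sketched in a few lines (an illustrative simplification in Python, not Tesar and Smolensky's implementation; the constraint names and winner–loser data are invented). Constraints that never prefer a loser over the attested winner are installed in the highest available stratum; the pairs they account for are discarded; and the procedure recurses on the rest:

def rcd(constraints, pairs):
    # pairs: winner-loser comparisons, each mapping a constraint name
    # to (violations of winner, violations of loser).
    hierarchy, remaining, pool = [], list(pairs), set(constraints)
    while pool:
        # Installable now: constraints preferring no loser in the
        # still-unexplained pairs.
        stratum = {c for c in pool
                   if all(p[c][0] <= p[c][1] for p in remaining)}
        if not stratum:
            raise ValueError('data inconsistent with any ranking')
        hierarchy.append(sorted(stratum))
        # A pair is explained once an installed constraint prefers
        # its winner; the rest of the constraints are demoted below.
        remaining = [p for p in remaining
                     if not any(p[c][0] < p[c][1] for c in stratum)]
        pool -= stratum
    return hierarchy

# Invented datum: the winner avoids a cluster (*CC) at the cost of
# an epenthetic segment (DEP).
pairs = [{'*CC': (0, 1), 'DEP': (1, 0), 'MAX': (0, 0)}]
print(rcd(['*CC', 'DEP', 'MAX'], pairs))
# [['*CC', 'MAX'], ['DEP']] -- i.e., *CC (and MAX) outrank DEP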
Optimality theory is a model of the structure of
language, but it is not necessarily a model of what
people actually do when they use a language. The
relation between optimality theory and psychological
reality is yet to be clarified; thus, discussions of learnability, and even more so of acquisition, may be
somewhat premature.
Language Change/Variation

Language variation refers to the different forms of a language at a particular point in time, such as the
many dialects and idiolects of English spoken now.
Language change refers to the different forms of
a language over time; an example would be the difference between the English of Chaucer and the English of the 21st century. Under optimality theory,
both variation and change are being explored in a
number of ways. A very intriguing line of thought
about variation and change is the view that these
effects result from the interaction between phonetic
changes and Faithfulness, which impact the constraint hierarchy (Kennedy, 2003). For example, in a
constraint hierarchy configured for trochaic feet, in
which word-final vowels are phonetically lost, some
kind of restructuring is necessary. One option would
be for stress to shift; another would be that the final
syllable is stressed, despite not being in a trochaic
foot. The cases Kennedy discussed retain stress on
the final syllable: This requires significant restructuring of the constraint hierarchy, which in turn impacts
the way that reduplication works.
Challenges

Opacity is the greatest empirical challenge to optimality theory. It simply does not follow from the
basic architecture of the model and, to date, there
is no broadly accepted solution to the problem.
Opacity, a term from derivational phonology, refers
to cases where (a) it appears that a rule has applied,
but should not have, or (b) it appears that a rule has
not applied but should have. Examples of both
types are found in the Yokuts dialects. The vowel
/i/ harmonizes to [u] following [u], but not following
[o]:
• Type (a): Surface strings of […o…u…] occur where […o…i…] would be expected (the [o] corresponds to an input long /u/, which lowers to a mid vowel). The constraints inducing harmony in Yokuts refer to vowel height so that in other contexts, the [o…i] sequence can occur. The harmony constraints outrank IDENT(round) so that harmony can take place. HARMONY, a Markedness constraint, is evaluated with respect to the surface form, and [o…u] does not conform to the height restrictions and so should be eliminated, as it is in other contexts.
• Type (b): In one dialect, […u…i…] strings also surface where […u…u…] is expected (the [u] corresponds to an input short /o/, which raises before [i]). The challenge to optimality theory is again that the HARMONY constraint is a Markedness constraint and so is evaluated on the output alone, and so is unable to distinguish the [u…i] sequence that derives from an input /o/ and therefore should be exempt from HARMONY.
Efforts to resolve the opacity challenge all share the
property of introducing an additional level of
representation. One type of response is to identify a particular candidate that the winner must match in some critical way: this approach is called sympathy,
in that the winner is sympathetic to some aspect of a
nonoptimal candidate (McCarthy, 1998). Another
type of response introduces additional representation
overtly, by allowing multiple constraint hierarchies,
thus the output of one hierarchy serves as the input
to the next hierarchy. These stratal systems are
strongly reminiscent of the strata or levels of lexical
phonology (see Bermúdez-Otero, 2005).
The universality of CON is a significant practical
challenge to optimality theory. In the practice of optimality theory, it is not uncommon to find rather
specific constraints proposed that do not have clear
motivation as universals. Establishing the viability
of the languages predicted by the factorial typology
involving these overly specific constraints is perhaps
the best means of establishing them as universals.
See also: Morphology: Optimality Theory; Phonological
Change in Optimality Theory; Phonological Change: Lexical Phonology; Phonology: Overview; Syntax: Optimality
Theory.

Bibliography
Archangeli D & Langendoen D T (eds.) (1997). Optimality
theory: an overview. Oxford: Blackwell Publishers.
Archangeli D & Pulleyblank D (1994). Grounded phonology. Cambridge, MA: MIT Press.
Archangeli D & Pulleyblank D (2002). Kinande vowel
harmony: domains, grounded conditions, and one-sided
alignment. Phonology 19(2), 139–188.
Bermúdez-Otero R (2005). Stratal optimality theory: synchronic and diachronic applications (Oxford Studies in Theoretical Linguistics). Oxford: Oxford University Press. (In press.)
Chomsky N & Halle M (1968). The sound pattern of
English. New York: Harper and Row.
Goldsmith J (1993). The last phonological rule: reflections
on constraints and derivations. Chicago: University of
Chicago Press.
Kager R (1999). Optimality theory. Cambridge: Cambridge
University Press.

Kennedy R (2003). Confluence in phonology: evidence from Micronesian reduplication. Doctoral diss., University of Arizona.
Kisseberth C (1970). On the functional unity of phonological rules. Linguistic Inquiry 1, 291–306.
McCarthy J (1998). Sympathy and phonological opacity.
Rutgers Optimality Archive ROA 252-0398.
McCarthy J (2002). A thematic guide to optimality theory.
Cambridge: Cambridge University Press.
McCarthy J (ed.) (2004). Optimality theory in phonology: a
reader. Malden, MA and Oxford: Blackwell.
McCarthy J & Prince A (1993a). Prosodic morphology I:
constraint interaction and satisfaction. Rutgers Optimality Archive ROA 482-1201.
McCarthy J & Prince A (1993b). Generalized alignment.
Rutgers Optimality Archive ROA-7.
McCarthy J & Prince A (1994). The emergence of the
unmarked: optimality in prosodic morphology. In
Gonzalez M (ed.) Proceedings of the North East Linguistic Society 24. 333–379.
McCarthy J & Prince A (1995). Faithfulness and reduplicative identity. In Beckman J, Dickey L & Urbanczyk S
(eds.) Papers in optimality theory: University of
Massachusetts occasional papers 18. [Rutgers Optimality
Archive ROA-60.] 249–384.
Pater J (1999). Austronesian nasal substitution and other NC effects. In Kager R, van der Hulst H & Zonneveld W (eds.) The prosody–morphology interface. Cambridge: Cambridge University Press.
Prince A & Smolensky P (1993). Optimality theory:
constraint interaction in generative grammar. Rutgers
University Center for Cognitive Science Technical
Report 2.
Pulleyblank D & Turkel W J (2000). Learning phonology:
genetic algorithms and Yoruba tongue-root harmony. In
Dekkers J, van der Leeuw F & van de Weijer J (eds.)
Optimality theory: phonology, syntax, and acquisition.
Oxford: Oxford University Press. 554–591.
Riggle J (2004). Generation, recognition, and learning in
finite state optimality theory. Doctoral diss., University
of California at Los Angeles.
Tesar B & Smolensky P (1998). Learnability in optimality
theory. Linguistic Inquiry 29, 229–268.
Wilson C (2000). Targeted constraints: an approach to
contextual neutralization in optimality theory. Doctoral
diss., Johns Hopkins University, Baltimore, MD.


Phonology: Overview
R Wiese, Philipps University, Marburg, Germany
© 2006 Elsevier Ltd. All rights reserved.

Phonology – What Is It About?


Phonology is that part of language which comprises the
systematic and functional properties of sound in language. The term phonology is also used, with the
ambiguity also found with other terms used for the
description of languages, for the study of those systematic features of sound in language. In this sense, it refers
to a subdiscipline of linguistics. It was the first such
subdiscipline in which the view of language as an object
with particular structural properties was developed
successfully. Phonology seeks to discover those systematic properties in the domain of sound structure, and
find the regularities and principles behind them both
for individual languages and for language in general.
More recently, phonology has become considerably
diversified and has found a number of applications.
The emphasis on systematicity in the definition
above derives from the observation that behind the
infinitely varying properties of each token of speech
there is an identifiable set of invariant, recurring,
more abstract properties. The hypothesis that such a
phonological system exists is largely due to Saussure
(see Saussure, 1916) and to the phonologists of the
early structuralist school, both in Europe (the Prague
school and the British school) and in the United States
(American structuralism); see the survey by Anderson
(1985).
Phonology, from its beginnings, has stood in a close,
but sometimes strained, relation to the other science of
linguistic sounds, phonetics. Phonetics studies the concrete, physical features of sound in language, often
called speech. As the function of phonology is to
make linguistic items, which are represented by rather
abstract symbols, pronounceable and understandable,
it is intimately related to phonetics. But while phonetics is interested in the concrete, continuously varying
features of articulation, sound transmission (acoustics), and auditory perception, the subject of phonology is thought to be a set of discrete, symbolic categories
which belong to the cognitive, and not the physical,
domain. This distinction can be interpreted either as a
rather strict and principled one, or as one which is
gradual and of less importance.

The Categories of Phonology


The Phoneme

The first of the invariant categories identified in phonology was the phoneme. The discovery of the phoneme, the smallest unit of sound which causes a form to differ in meaning from other forms, can justly
be identified as the origin of phonology. While the
groundwork for phonology was laid in the analysis
of many ancient and modern languages, around the
beginning of the 20th century the work by Ferdinand
de Saussure (Saussure, 1916), Jan Baudouin de
Courtenay (Baudouin de Courtenay, 1972), and Franz
Boas (Boas, 1889) was crucial for the formulation
of the phonemic analysis in a stricter sense. This was
followed in the first half of the 20th century by several important formulations of phonological theory,
notably by Edward Sapir (Sapir, 1921), Leonard
Bloomfield (Bloomfield, 1933), Roman Jakobson
(Jakobson, 1939) and Nikolay Trubetzkoy (Trubetzkoy,
1967). A unit of sound is a phoneme if it functions to distinguish lexical items from each other
in terms of meaning, and if it cannot be broken up
any further in such a way that other lexical units emerge.
The German (Standard German) word mein ([maɪn] 'my'), for example, clearly is a different word from nein ([naɪn] 'no'). At the same time, neither [m] nor
[n] can be split up in such a way that other words of
German appear. Together with similar comparisons
relating [m] and [n] to other segments of German,
the so-called minimal pair just presented constitutes
evidence that /m/ and /n/ are phonemes of this
language.
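The discovery procedure behind such comparisons is mechanical enough to sketch (illustrative Python, not from the source; the three-word lexicon and its rough transcriptions are assumptions made for the example):

lexicon = {
    'mein': ['m', 'aɪ', 'n'],   # 'my'
    'nein': ['n', 'aɪ', 'n'],   # 'no'
    'Wein': ['v', 'aɪ', 'n'],   # 'wine'
}

def minimal_pairs(lexicon):
    # Yield word pairs whose transcriptions differ in exactly one segment.
    words = list(lexicon)
    for i, w1 in enumerate(words):
        for w2 in words[i + 1:]:
            s1, s2 = lexicon[w1], lexicon[w2]
            if len(s1) == len(s2):
                diffs = [(a, b) for a, b in zip(s1, s2) if a != b]
                if len(diffs) == 1:
                    yield w1, w2, diffs[0]

for w1, w2, (a, b) in minimal_pairs(lexicon):
    print(f'{w1} ~ {w2}: evidence that /{a}/ and /{b}/ contrast')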
A phoneme therefore is a contrastive structural unit
of language, and is related to, but not identical with, a
concrete sound. Phonemes within a language form
phonemic systems, by means of the contrasts between
phonemes. The nasals /m, n, ŋ/, for example, form
part of the phoneme system of German, and contrast
systematically with an otherwise similar subsystem,
that of the voiceless consonants /p, t, k/. Seen abstractly, a phoneme is nothing but the set of such
contrasts; seen more concretely, a phoneme is a class
of related sounds. Phonology also aims at the study
of the properties of phonemic systems, and has established many patterns and principles which hold for
such systems; see, in particular, Maddieson (1984).
The conception of phonemes as objects in human cognition is largely due to Sapir (1933), who argued that
speakers must have mental representations of sounds,
and that these mental objects cannot be identical to the
concrete realizations of such phonemes.
The Phonological Features

The phoneme turns out to be an important, but not the only, category needed for the phonological description of languages. Most importantly, phonemes
can be shown to function in groups which share identifiable properties. This observation, largely due to Trubetzkoy and Jakobson, led to the formulation
of a theory of distinctive features. Such features describe the classes of phonemes by assigning the same
feature to all members of a class. At the same time,
features define sounds (phonemes or not) and express
their composite nature. Thus, a sound [p] may be
assigned the feature [labial], on the grounds that closure of the lips is a crucial ingredient of this sound.
The description of this sound then includes the feature [labial], and the class of labial consonants (and
perhaps vowels such as [u]) is defined as the set of
sounds bearing this feature.
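Seen computationally, a feature is simply a property shared by a set of segments, and a class is the set of segments bearing all the features in question. A minimal sketch (illustrative Python; the feature assignments are simplified placeholders, not a worked-out feature system):

features = {
    'p': {'consonant', 'labial', 'voiceless'},
    'b': {'consonant', 'labial', 'voiced'},
    'm': {'consonant', 'labial', 'nasal', 'voiced'},
    't': {'consonant', 'coronal', 'voiceless'},
    'u': {'vowel', 'labial', 'voiced'},
}

def natural_class(*required):
    # All segments bearing every one of the required features.
    return {seg for seg, fs in features.items() if set(required) <= fs}

print(sorted(natural_class('labial')))               # ['b', 'm', 'p', 'u']
print(sorted(natural_class('labial', 'consonant')))  # ['b', 'm', 'p']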
Finally, the features can also be seen as the set of
atomic units from which all larger units of sound are
constructed, in all languages. In this function, the set
of phonological features is part of the cognitive equipment which allows human beings to use language. In
phonological theory, since the work by Jakobson et al.
(1952), the hypothesis has been proposed that there is
indeed a small set of features from which all sound
segments can be built up. This feature set thus constitutes something like the set of atoms from which
all larger units derive. Several sets of features have
been proposed since then, most influentially that in
Chomsky and Halle (1968).
Phonological Processes

Sounds, whether phonemes or not, often relate to each other in systematic ways. A first type of
relation is that between a phoneme and its (possibly
several) realizations. This relation can be described
by means of rules which specify precisely in which
way a phoneme is modified under specific circumstances, such as influence from its right-hand or left-hand neighbors. For example, a vowel will often
be nasalized if standing before a nasal consonant.
This approach was most thoroughly proposed by
Chomsky and Halle (1968) in a rule-based phonological theory. In this theory, processes are not only those
which describe the relation between a phoneme
and its realizations, but also those which relate phonemes of several connected word forms. In this view,
a rule is also called upon to describe the relation
between the sounds in, for example, relate with final
/t/ and relation with final /s/ in the same morpheme,
but before the suffix -ion. In other theories, processes
of this type (often riddled by exceptions in one direction or the other) are placed in a separate component of
grammar called morphophonology. Thirdly, phonological processes can also be observed if language is
looked at diachronically. Thus, the changes leading
from the Middle English vowel system including /i:/,
as in wife or mice, to the Modern English vowel
system with the diphthong /aɪ/ in the corresponding

items can be described as the result of a set of phonological processes, in this case the so-called Great
Vowel Shift.
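The nasalization rule mentioned above can serve as a worked example of a feature-changing rule, V → [+nasal] / _ [+nasal] (a sketch in Python, not from the source; the segment classes are ad hoc stand-ins for feature matrices):

VOWELS = set('aeiou')
NASALS = set('mnŋ')
NASALIZED = {'a': 'ã', 'e': 'ẽ', 'i': 'ĩ', 'o': 'õ', 'u': 'ũ'}

def nasalize(segments):
    # Apply the rule to each vowel standing before a nasal consonant.
    out = []
    for seg, nxt in zip(segments, segments[1:] + ' '):
        out.append(NASALIZED[seg] if seg in VOWELS and nxt in NASALS
                   else seg)
    return ''.join(out)

print(nasalize('kan'))   # kãn -- vowel nasalized before /n/
print(nasalize('kat'))   # kat -- no nasal follows; no change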
Generative phonology in the version proposed by
Chomsky and Halle (1968) and in other works was
strongly process oriented in giving rules a central
place in the model. Phonological rules are based on
features in the sense introduced above and apply
to underlying forms (abstract phonemes) to derive
surface forms. More recent theories, in particular
declarative phonology and optimality theory, diverge
from this view. Processes are now modeled not by
means of feature-changing rules, but as the surface
result of static constraints which put different
demands on sounds under different conditions.
Prosody and Its Categories

Phonology in many recent conceptions also is a theory of phonological grouping, that is, of units of a size
larger than the phoneme. The size of these groupings
ranges from the small subphonemic unit (such as
the atomic features) to the most comprehensive one,
often called the utterance. These units are usually
assumed to be hierarchically related to each other.
The most widely used of the prosodic units is the
syllable. It usually consists of a vowel and some flanking consonants, which may or may not be present.
The principles of syllable structure largely determine
what sequences of sounds are permissible in a language. Furthermore, the syllable is an important
domain for phonological processes. Syllables often
come in two types, stressed and unstressed, as in the
English word Piccadilly [ˌpɪkəˈdɪlɪ], which consists of
two stressed and two unstressed syllables, resulting in
an alternating sequence of stressed and unstressed
syllables. Groupings consisting of a stressed syllable
plus any unstressed syllables are identified as
members of a further prosodic category, called the
foot. (Other languages display other types of feet;
see Hayes, 1995 for a typology of feet.)
The example just given also illustrates that stress
(also called accent) is a phonological phenomenon. It
can even be phonemic in the sense of being contrastive, as in English ímport (n.) with initial stress versus impórt (v.) with final stress. Such phenomena of stress
can be observed in units of several sizes. The domain
of word stress is the unit called the prosodic word
or phonological word, and the domain of stress in
larger units is the phonological or prosodic phrase.
Thus, prosodic words and prosodic phrases are
other units of phonology. This hierarchy extends upwards to the intonational phrase, the domain of intonational patterns. These units also serve as the
domains of phonological restrictions or processes.
The phonological word, for example, is the domain for vowel harmony in many languages, as in Hungarian or Turkish.
A further prosodic phenomenon is that of tone.
Tone is the lexically contrastive use of pitch movement, as in Standard Chinese (Mandarin Chinese),
where one and the same syllable can be combined
with several (four) such pitch levels and movements,
with the result that most of the combinations between
segmental syllables and tones are different lexical
items. The phenomena of stress and tone and their
analysis in phonology were to a large extent responsible for developments in phonological theory which
led beyond the view that a phonological representation is nothing but a string of segments (with segments
built up from elementary features). Tone and stress
turn out to be inherently nonsegmental in the sense
that they relate to units both larger and smaller than
an individual sound segment. There may be more
than one tone per segment, and there may be one
tone relating to more than one segment, the tone
sometimes spanning a domain as large as a whole
word. These observations led to a new theory of
representations in phonology, called autosegmental
phonology.

Phonology in Its Relation to Other Parts of Language

The relation of phonology to phonetics has been discussed in the section 'Phonology – What Is It About?'
above. As the phonology of a language is the systematic use of phonetically given material, the interface
between phonology and phonetics is crucial in any
phonological theory. A phonological system being
part of the structural system of language, which
includes at least morphology, syntax, and semantics,
its relation to these components of language needs to
be discussed as well. In most structuralist conceptions
of language and grammar, several levels of linguistic
analysis must be distinguished. While the phonological level of analysis contains at least a phonemic and
an allophonic level, phonology is also closely related
to morphology, simply because morphemes are signs
relating a meaning with a phonological form. Furthermore, the concatenation of morphemes often causes
the changes identified as morphophonological above.
Finally, there are often prosodic requirements on both
simplex and complex words, requiring a further close
interaction between phonology and morphology.
Phonology relates to syntax mostly through the formation of phonological phrases and intonational
phrases. By determining (at least in part) how such

prosodic units are formed, syntax indirectly influences the placement of phrasal stress and intonation
contours. As for semantics, the main connection to
phonology is again through intonation. Intonation is
constrained by phonology (through the principles
governing the assignment of tonal features within
intonational phrases), syntax, and semantics. The
result of such constraints deriving from syntax, semantics, and phonology is often called information
structure, a particular packaging and ordering of
information within a sentence.
See also: Autosegmental Phonology; Chinese (Mandarin): Phonology; Declarative Approaches to Phonology; Foot; Hungarian: Phonology; Intonation; Kimatuumbi: Phonology; Phonemic Analysis; Phonological Phrase; Phonological Words; Phonology: Optimality Theory; Phonology–Phonetics Interface; Prosodic Morphology; Structuralist Phonology: Prague School; Syllabic Constituents; Syllable: Phonology; Tone: Phonology.

Bibliography
Anderson S R (1985). Phonology in the twentieth century:
theories of rules and theories of representations. Chicago/
London: University of Chicago Press.
Baudouin de Courtenay J ([1895] 1972). An attempt at a
theory of phonetic alternations. In Stankiewicz E (ed.)
Selected writings of Baudouin de Courtenay. Bloomington,
Indiana: Indiana University Press. 144–212.
Bloomfield L (1933). Language. New York: H. Holt.
Boas F (1889). On alternating sounds. American Anthropologist 2, 47–53.
Chomsky N A & Halle M (1968). The sound pattern of
English. New York: Harper and Row.
Hayes B P (1995). Metrical stress theory: principles and
case studies. Chicago/London: University of Chicago
Press.
Jakobson R (1939). Observations sur le classement phonologique des consonnes. In Proceedings of the 3rd
International Congress of Phonetic Sciences. 34–41.
Jakobson R, Fant G & Halle M (1952). Preliminaries to
speech analysis. Cambridge, MA: MIT Press.
Maddieson I (1984). Patterns of sound. Cambridge:
Cambridge University Press.
Sapir E (1921). Language. New York: Harcourt, Brace &
World.
Sapir E (1933). La réalité psychologique du phonème. Journal de Psychologie Normale et Pathologique 30, 247–265.
Saussure F de (1916). Cours de linguistique générale. Paris:
Payot.
Trubetzkoy N S ([1939] 1967). Grundzüge der Phonologie. Göttingen: Vandenhoeck & Ruprecht.


Phonology–Phonetics Interface
M Gordon, University of California, Santa Barbara, CA, USA
© 2006 Elsevier Ltd. All rights reserved.

Sound Inventories

Much of the early work on the phonology–phonetics interface advanced perceptual and articulatory explanations for recurring patterns in segment inventories
(see Ohala, 1997 for a brief history of research on the
phonology–phonetics interface). In their seminal
work on vowels, Liljencrants and Lindblom (1972)
explored the hypothesis that vowel inventories are
shaped in response to a requirement that vowels be
maximally distinct from each other in the perceptual
domain. They found a relatively close fit between
attested vowel inventories and a perceptual model of
vowel quality.
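The dispersion idea can be given a numerical sketch (illustrative Python; the formant values are rough textbook-style numbers, and Euclidean formant distance is a crude stand-in for Liljencrants and Lindblom's perceptual metric): pick the inventory whose closest vowel pair is as far apart as possible.

from itertools import combinations
from math import dist

# Candidate vowels with rough (F1, F2) values in Hz.
candidates = {'i': (280, 2250), 'e': (400, 2000), 'a': (700, 1200),
              'o': (450, 800), 'u': (300, 850)}

def min_pairwise(inventory):
    # Dispersion score: distance between the two closest vowels.
    return min(dist(candidates[v], candidates[w])
               for v, w in combinations(inventory, 2))

def best_inventory(size):
    # Exhaustively choose the size-n inventory maximizing dispersion.
    return max(combinations(candidates, size), key=min_pairwise)

print(best_inventory(3))   # favors the point vowels ('i', 'a', 'u')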
Lindblom and Maddieson (1988) built on the earlier work on vowels by incorporating an articulatory
component in their account of consonant inventory
construction. They hypothesized that consonants are
divided into three classes according to articulatory
difficulty. Within each class and starting from the
simplest articulations, languages adopt sounds
that are maximally distinct from each other in the
perceptual domain. After a minimal threshold of perceptual distinctness is reached within each level of
articulatory difficulty, an increase in inventory size
requires expansion into the subspace of immediately
greater articulatory complexity. Perceptual fractionation of this subspace continues until it is perceptually crowded, thereby necessitating the incorporation
of sounds belonging to the next tier of articulatory
complexity.
Stevens's (1989) quantal theory also appealed to a
combination of perceptual and articulatory factors,
arguing that /a/, /i/, and /u/ are the most common
vowels cross-linguistically because they occupy
regions of articulatory and perceptual stability such
that variation in their articulation has only small
acoustic and perceptual consequences.
The program of research on phonetic explanations
for phonological patterns extends to a variety of other
phenomena (see Ohala, 1997 for an overview), such
as consonant assimilation (Ohala, 1990), nasalization (Wright, 1986; Ohala and Ohala, 1993), voicing
patterns in obstruents (Westbury and Keating, 1985;
Ohala, 1997), and vowel harmony (Ohala, 1994).

and, in many cases, has challenged the traditional


generative conception of the relationship between
phonetics and phonology. The generative tradition
has always assumed that phonetic factors ultimately
constrain certain aspects of phonology. For example,
it is noncontroversial that phonological features
have a phonetic basis, grounded either in acoustic
(Jakobson et al., 1952) or articulatory (Chomsky and
Halle, 1968) properties or a combination of both
(Stevens and Keyser, 1989). Furthermore, sonority
scales are assumed to be projected from phonetic
prominence (see Parker, 2002 on phonetic correlates
of the sonority hierarchy). Nevertheless, the phonetic
factors claimed to underlie certain phonological
properties did not fall within the purview of study
of generative phonologists, as phonetics was considered to fall outside of the core grammar module. It
was not until the 1980s that phonologists began to
examine the phonetic properties of individual languages in order to gain insight into their phonologies. Pioneering work by Pierrehumbert (1980) and
Pierrehumbert and Beckman (1988) on intonation
and by Keating (1988, 1990), Cohn (1993), and
Huffman (1993) on segments explored the link between feature specification and phonetic properties,
adopting the hypothesis that each feature has a phonetic target value or window of permissible values
and that intervening elements that are phonologically
unspecified with respect to a feature owe their phonetic realization to interpolation between specified
elements. To take an example from Cohns work, a
vowel that is phonologically unspecified with respect
to nasality and occurs between an oral and a nasal
consonant is predicted to gradually increase in nasality throughout its duration from the oral target
associated with the preceding consonant to the nasal
target characterizing the following consonant.
This contrasts with a [+nasal] vowel, which shows a
steady state peak in nasality throughout most of its
duration.
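A numerical sketch of the contrast (illustrative Python, not from the source; the 0-to-1 nasality scale and the number of sampling points are assumptions):

def interpolate(start, end, steps):
    # Linear interpolation between the flanking phonetic targets.
    return [start + (end - start) * i / (steps - 1) for i in range(steps)]

# /bVn/ with V unspecified for nasality: gradual rise from the oral
# target of /b/ (0.0) to the nasal target of /n/ (1.0).
print([round(x, 2) for x in interpolate(0.0, 1.0, 5)])
# [0.0, 0.25, 0.5, 0.75, 1.0]

# A phonologically [+nasal] vowel instead holds a steady plateau.
print(interpolate(1.0, 1.0, 5))   # [1.0, 1.0, 1.0, 1.0, 1.0]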
Crucially, the line of research on phonetic interpolation and phonological specification adopts the
traditional generative modular conception of the
grammar, in which the phonetics component is fed
by the phonology. As in the phonology module, a
series of language-specific phonetic implementation rules were postulated within the phonetics component to account for cross-linguistic variation in
phonetic realization.

Phonetics in Optimality Theory

The 1990s saw an increased fusion of phonetic and phonological research exploring the hypothesis
that phonetic factors shape phonological systems. Building on some of the insights of the natural phonology program (Stampe, 1972) and earlier phonetic
research, many phonologists began to explicitly
model phonetic motivations in the formal theory.
This integrated version of phonetics and phonology
took a complementary position to that of traditional
generative grammar by building phonetic explanations directly into the formalism. The advent of Optimality Theory (Prince and Smolensky, 1993), with its
hierarchically ranked well-formedness conditions,
provided a suitable phonological framework for
developing a phonetically informed model of phonology, termed phonetically driven OT (see Hayes et al.,
2004 for a cross-section of works within this framework). Constraints could be grounded in principles
of phonetic naturalness previously postulated by
phoneticians.
The literature in phonetically driven OT has become extensive in recent years and includes both
perceptual and articulatory-based accounts of various
phenomena. A key feature of this body of research is
that it demonstrates the nonarbitrary nature of many
phonological patterns, showing that these patterns
are predictable from independently known facts
about articulation and perception. I discuss some of
this research now.
Archangeli and Pulleyblank (1994) introduce the
notion of phonetic grounding in phonological cooccurrence restrictions, arguing that commonly observed interactions between the feature [ATR] and
vowel height and backness features find substantive
support in phonetic facts. For example, the fact that
low vowels are cross-linguistically [-ATR] is attributed to the retracted position of the tongue root during
the production of low vowels. Hayes (1999) explored
the articulatory basis for obstruent voicing patterns,
finding that a number of voicing asymmetries are
natural from an aerodynamic standpoint: Phonological voicing is most likely for places of articulation,
e.g., bilabial, and contexts, e.g., postnasal, that are
phonetically best suited to sustaining the transglottal
airflow necessary for voicing.
Much of the work in phonetically driven OT
appeals to perceptual factors. Flemming (1995)
invokes a perceptual account of a number of assimilatory phenomena. Building on the theory of acoustic
features advanced in Jakobson et al. (1952),
Flemming proposes a feature set based on vowel
formants to account for the assimilation data. For
example, the triggering of rounding in the high front
vowel /i/ preceding retroflexes in Wembawemba, an
extinct Pama–Nyungan language of Victoria, Australia (Hercus, 1986), does not find a straightforward
explanation in acoustic terms but can be interpreted

as assimilation of the third formant, which is lowered


by both retroflexion and lip rounding.
Jun (1996) and Steriade (2001) examine the typology of consonant assimilation, finding asymmetries in
the direction of assimilation dependent on both manner and place of articulation. Jun finds a close correspondence between assimilation patterns and the
results of earlier experiments designed to assess the
relative salience of different perceptual cues to
consonants. Perceptual cues to consonants may be
classified as internal to the consonant (e.g., closure
duration, voicing, energy during the closure) or external (e.g., formant transitions for adjacent vowels,
fundamental frequency values in adjacent sounds,
voice-onset time, burst amplitude). For example, in
consonant clusters, the consonant on the left typically
assimilates to the consonant on the right rather than
vice versa, owing to the greater perceptual salience of
formant transitions coming out of a consonant into a
following vowel relative to transitions from vowel
into a following consonant (see also Ohala, 1990).
Furthermore, sounds with more robust internal cues
(e.g., fricatives) are less likely to assimilate than those
with weaker cues (e.g., nasals and stops).
Steriade (1999) explores the perceptual basis for
laryngeal neutralization, finding that the likelihood
of neutralization increases as the robustness of perceptual cues to laryngeal features decreases. She
showed that her cue-based approach to neutralization
offers a better fit to the observed patterns than a
syllable-based account in which neutralization is
claimed to occur in coda position (Lombardi, 1995).
For example, voicing neutralization in some languages only affects a subset of codas, those occurring
in contexts that are particularly ill-suited to recovering external consonant cues, such as word-finally and
before an obstruent, as in Lithuanian, or only before
another consonant, as in Hungarian. Conversely, neutralization can also affect syllable onsets if the
neutralized consonant occurs before an obstruent, as
in Russian.

Interlanguage Phonetic Variation and Phonology

Another area of research dealing with the phonology–phonetics interface searches for correlations between language-specific phonetic properties and cross-linguistic parameterization in phonological patterns.
The phenomenon of weight-sensitive stress has been
a fruitful area of language-specific links between
phonetic and phonological properties. The primary
source of cross-linguistic variation in weight is the
treatment of closed syllables. In some stress systems,
such as the one found in Latin, CVC is treated as
