Phonetic Detail in Connected Speech
Aims
To consider how the phonetic detail of speech reflects linguistic structure and function
To explore the hypothesis that phonetic detail is central to speech understanding
To raise questions about whether the identification of particular phonological units is necessary for speech perception
SPEECH SIGNAL
Long-domain coarticulation
Coarticulation is not restricted to adjacent segments
Some coarticulatory effects cross syllable and word boundaries:
lip rounding, nasalisation, resonance effects of //
Long-domain coarticulation can be perceptually useful: in a forced-choice test (e.g. belly vs berry), listeners could correctly identify a missing // when surrounding vowels and consonants were replaced by noise (West, 2000)
[Figure: two panels with Time (s) on the horizontal axis]
[ktnaI] /ktnaIt/
[knaI]
A more parsimonious model of speech perception would make use of the systematic phonetic information in the signal to assign probable word boundaries
Grammatical status
Phonetic detail differs between tokens of the same word form when they serve different grammatical functions, e.g.
Function vs content words: for is much more reduced than four (Lavoie, 2002)
Different grammatical roles:
infinitival to tends to be shorter and more reduced than prepositional to
complementizer that tends to be shorter and more reduced than pronominal that
(Jurafsky, 2002)
Listeners are sensitive to these differences between function words and content words
(Baker, 2008)
No content words begin with /ð/, and it occurs within only a few content words:
mother, other, bother, father, smithereens, soothe, smoothie
Connected speech processes differ for // in content words and function words
content
at] ban thatch [ban n [ba at] at] *[ban
function
ban that [banat]
ban zips
(Local, 2003)
Listeners report hearing /ð/ in tokens like these
They can generally distinguish win noes from win those, even when the fricative is completely nasalized
Manuel suggests that low F2 at the release of the nasal is a cue to dental rather than alveolar place of articulation
[Figure: vertical axis Frequency (kHz)]
lime vs. I'm
lime: content word; I'm: pronoun + be
In ordinary lexical items, word-final m does not usually assimilate to the place of articulation of the following consonant in English, while n does
However, in the grammatical chunk I'm (= I am), assimilation regularly happens in everyday talk
aIm aIn aI
aI m
aI n
aI
aI w
I'm contrasts with:
you're, she's, he's, we're, they're, it's
Productive: true prefix (Pr), e.g. mistimes
Unproductive: pseudo prefix (Ps), e.g. mistakes
Same phonemes, /mɪst/, but phonetic detail (PD) systematically reflects morphological status
Compared across true prefix (Pr) mistimes and pseudo prefix (Ps) mistakes:
properties of the periodic part: duration, abruptness of [m] boundary, F2 frequency (etc.)
relative durations, e.g. periodicity : aperiodicity (fric), fricative : silence (closure)
VOT
Baker, Smith, Hawkins ICPhS 2007; Baker 2008 PhD dissertation
Intelligibility in noise is worse when phonetic detail and morphological status are mismatched
Details of sound patterns can signal morphemic status
Rachel Baker, PhD 2008; Baker, Hawkins and Smith, 2007, ASA; Wurm, 1997
Summary so far...
Phonetic detail maps onto many different abstract units of linguistic structure
one sound chunk can signal many abstract units (with different probability levels)
For any given unit U
(e.g. segment, syllable, foot, word...)
U is functionally inseparable from its context
If something in the context changes, there will probably be consequences for the phonetic realisation and perception of U
Only when the whole context is taken into account is the systematicity of phonetic detail evident
Taking account of context can help to make sense of variation that would otherwise appear random
Context
Context does not just refer to linguistic structure, but also to probabilistic factors:
word frequency, neighbourhood density, predictability
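To make one of these probabilistic factors concrete, here is a minimal Python sketch of neighbourhood density, assuming the common working definition: the number of lexicon entries that differ from a word by exactly one phoneme substitution, deletion, or insertion. The toy lexicon, its rough transcriptions, and the function names are illustrative assumptions, not material from the studies cited here.

```python
# Hypothetical toy lexicon: word -> tuple of phoneme symbols.
# The transcriptions are rough and purely illustrative.
LEXICON = {
    "cat": ("k", "a", "t"),
    "bat": ("b", "a", "t"),
    "cap": ("k", "a", "p"),
    "cot": ("k", "o", "t"),
    "at": ("a", "t"),
    "cast": ("k", "a", "s", "t"),
    "dog": ("d", "o", "g"),
}


def is_neighbour(a, b):
    """True if a and b differ by exactly one phoneme
    substitution, deletion, or insertion."""
    if a == b or abs(len(a) - len(b)) > 1:
        return False
    if len(a) == len(b):
        # Same length: exactly one position may differ (substitution).
        return sum(x != y for x, y in zip(a, b)) == 1
    # Lengths differ by one: deleting one phoneme from the longer
    # sequence must yield the shorter one.
    shorter, longer = sorted((a, b), key=len)
    return any(longer[:i] + longer[i + 1:] == shorter
               for i in range(len(longer)))


def neighbourhood_density(word, lexicon):
    """Count the entries in lexicon that are neighbours of word."""
    target = lexicon[word]
    return sum(is_neighbour(target, phones)
               for other, phones in lexicon.items() if other != word)


if __name__ == "__main__":
    for w in ("cat", "dog"):
        print(w, neighbourhood_density(w, LEXICON))
```

In this toy lexicon, cat has a dense neighbourhood (bat, cap, cot, at, cast) while dog has none. A real calculation would need a full pronunciation lexicon and, often, frequency weighting of the neighbours.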
Is the identification of sublexical units always a necessary stage in speech perception?
Is the identification of words always a necessary stage in speech perception?
Summary
Phonetic detail signals all sorts of linguistic categories, and linguistic and communicative functions
Taking a broader context into account can help make sense of seemingly random variation:
linguistic structure, prosodic structure, probabilistic factors, register, attitude, speaker, situation...
Much phonetic detail is learnable and perceptually salient, and at least some of it makes speech easier to understand in at least some circumstances
If we take understanding meaning to be the aim of listening to speech (rather than phoneme or word identification), it may ultimately be easier to make sense of the variation in the speech signal