SLIPS OF THE EAR
Z.S. Bond
Ohio University
The study of speech production and analysis…asks how the person’s mental representations enter into articulation and perception. (Chomsky, 1996, p. 23).
1. Introduction
In everyday conversation, speakers employ various reductions and simplifications of their utterances, so that what they say departs from the clarity norms found in formal speech or laboratory recordings. Both listeners and speakers are sometimes engaged in other tasks while carrying on a conversation, distracted, or preoccupied with their own ideas, so listeners vary in the amount of attention they pay to speech. Not surprisingly, sometimes listeners fail to understand what a speaker has said. Instead, a listener perceives, clearly and distinctly, something that does not correspond to the speaker's utterance. The following is a typical example. At a doctoral dissertation defense, a member of the audience heard the candidate say 'chicken dance', a phrase that had absolutely no connection with the dissertation topic of early literacy. Then she saw a proper name on a graphic: Schikedanz. The listener suspected that something was wrong from the inappropriateness of what she had heard and recovered the speaker's intended utterance from subsequent information.
Over the past years, I have collected approximately 1000 examples of slips of the ear taking place in everyday casual conversation. For a few of the misperceptions, I was a participant in a conversation as either a speaker or a listener. Interested friends, students and colleagues have contributed the majority. I have described this data set in Slips of the Ear: Errors in the Perception of Casual Conversation (1999). All my examples are from English as spoken in the United States and Canada. Many of the examples reported here appeared in the original publication.
Investigations of speech errors have a long history. Though slips of the tongue have received the most attention, slips of the ear have been described from a linguistic or psycholinguistic point of view since the seminal work of Meringer (1908) and Meringer and Mayer (1895). Meringer and Mayer included forty-seven German misperceptions in their corpus. They observed that stressed vowels tend to be perceived correctly whereas consonants are misperceived more readily. Celce-Murcia (1980) analyzed these misperceptions as well as her own collection and noted that many misperceptions showed grammatical coherence coupled with a lack of appropriateness to the conversation or situation. Celce-Murcia suggested that dialect differences might be one cause of misperceptions. Labov (1994) has found that more than one fourth of the misperceptions in his collection are traceable to dialect differences and observed that nasal and liquid consonants promote misperceptions by obscuring vowel quality.
Slips of the ear can also be found in popular collections, where they are sometimes known as ‘mondegreens,’ a relatively obscure term coined to honor the misperception of a line from a ballad:
They hae slain the Earl of Murray
And laid him on the green → And Lady Mondegreen.
Because children's misperceptions or misunderstandings tend to be humorous, they are particularly common in these popular collections. Often the misunderstandings concern material learned by rote. Some examples have become apocryphal, for example 'One Nation and a Vegetable' and 'I lead the pigeons to the flat' as misunderstandings of the Pledge of Allegiance, and 'Jose, can you see?' from the Star-Spangled Banner. There are popular collections of mysterious ailments such as 'very close veins' and 'fire balls of the uterus' and of song lyrics such as 'There's a can of fish all over the world,' 'Don't cry for me, Marge and Tina,' and 'Row, row, row your boat…Life's a butter dream' (see Edwards, 1995).
Misperceptions and misunderstandings have also been consciously created in humorous writing, from Gilbert and Sullivan operettas to cartoons. In The Pirates of Penzance, the hero is mistakenly apprenticed as a pirate, instead of a pilot, because his nursemaid misunderstood her instructions. In the cartoon strip The Family Circus, the children talk about the windshield whappers and the Umpire State building. Whether some reported misperceptions are created or spontaneous is unclear. One example, circulated by e-mail, is a husband's note to his wife: Someone from the Guyna College called. They said Pabst beer is normal.
Both spontaneous and artfully created misperceptions provide language-based humor, in that many have a wild appropriateness. Spontaneous misperceptions do something more. Slips of the ear, that is, misperceptions and misunderstandings, provide a unique window into the ways listeners use linguistic knowledge in understanding speech. They show that listeners use the phonetics, phonology, lexicon, and syntax of their language in understanding speech.
2. Phonetic knowledge
Although the majority of slips of the ear show a relatively complex relationship between the speaker's utterance and the listener's misperception, in a portion of errors a listener misperceives a single segment. That is, the speaker's utterance and the listener's perception differ in only one segment. It is logical to assume that phonetic information which is rarely misperceived provides reliable information, whereas phonetic information which is frequently misperceived is less reliable. Misperceptions of single segments involve consonants much more frequently than vowels.
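The notion of a single-segment misperception can be made concrete with a minimal sketch in Python. The function name and the broad transcriptions below are illustrative assumptions, not part of the original data set; the two word pairs, great → grape and math → mouth, are slips cited elsewhere in this chapter.

```python
def single_segment_difference(target, percept):
    """Return (position, target_segment, perceived_segment) when the two
    transcriptions differ in exactly one segment; otherwise return None."""
    if len(target) != len(percept):
        # A different segment count means something was added or lost,
        # so this is not a simple one-segment substitution.
        return None
    diffs = [(i, t, p) for i, (t, p) in enumerate(zip(target, percept)) if t != p]
    return diffs[0] if len(diffs) == 1 else None


# Broad transcriptions, assumed for illustration only.
great = ["g", "r", "eɪ", "t"]
grape = ["g", "r", "eɪ", "p"]
math = ["m", "æ", "θ"]
mouth = ["m", "aʊ", "θ"]

print(single_segment_difference(great, grape))  # (3, 't', 'p')
print(single_segment_difference(math, mouth))   # (1, 'æ', 'aʊ')
```

A comparison of this kind only identifies substitutions. Slips that add or lose a segment, or that restructure the utterance more broadly, would need an alignment step that this sketch deliberately omits.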
2.1 Vowel misperceptions
All collectors of slips of the ear have observed that simple stressed vowel misperceptions are exceedingly rare. In my collection, only 5% of the misperceptions involved stressed vowels as the only error. Although errors in which a stressed vowel is replaced by a very different vowel, such as
It's like a math problem → mouth problem
do occur, they are highly unusual, first, in that the misperceived vowel is not in a phonetic environment which affects vowel quality and, second, in that the phonetic distance between the target and the misperception is considerable. More commonly, vowel misperceptions occur in consonantal environments which affect vowel quality, such as the liquids /r, l/ and the nasals, as Labov (1994) has observed. Misperceptions primarily involve vowel height; other perceptual dimensions are misperceived much less frequently. The following are examples of typical stressed vowel errors:
Alan → Ellen
Wendy will come → windy
Cherri and me → cheery and me.
Experimental data support the resilience of stressed vowels to misperception. In examining the role of consonants vs. vowels in the perception of fluent speech, Cole, Massaro, Yan, Mak and Fanty (2001) report that listeners identify about twice as many words accurately when they have vowel information available as when they have consonant information available. Similarly, Neel, Bradlow and Pisoni (1996/97) found more consonant than vowel misidentifications in spoken sentences. When listeners are given incorrect vowel information, they find it much more disruptive than incorrect consonant information. Bond and Small (1983) asked listeners to shadow passages which contained mispronounced consonants and mispronounced stressed vowels. Listeners had little difficulty recovering intended words when they contained mispronounced consonants but recovered only 15% of the words containing mispronounced vowels.
Pisoni (1981) has argued that stressed syllables provide 'an island of reliability,' that is, reliable phonetic information which listeners use to interpret the stream of speech. Altman and Carter (1989) also argue for the informational value of stressed syllables. Grosjean and Gee (1987) and Cutler and her colleagues (Cutler and Norris, 1988; Butterfield and Cutler, 1988; Cutler and Butterfield, 1992) have proposed that stressed syllables provide information for segmenting the continuous speech stream. Only when a task requires strategic responses do listeners prefer consonant information to vowel information (see van Ooijen, 1996).
Even though stressed vowels seem to provide reliable information, the stress pattern of target words is occasionally misperceived, always with accompanying phonetic restructuring of some sort. For example:
giving an award → giving an oral
roll up the back window → patrol the back window.
I'm in the political science department → pickle science department.
Misperceptions of unstressed vowels are more common than misperceptions of stressed vowels, suggesting that the status of unstressed vowels in speech is relatively fluid. The quality of vowels in unstressed syllables may be misperceived or unstressed syllables may be perceptually lost or added.
Misinterpretations of the intended quality of unstressed vowels may occur in content words, as in:
Grammar Workshop → grandma workshop
More commonly, vowel quality misperceptions occur in function words, as in:
Attacks in the ear → a tax on the ear
They took footprints when you're born → in the dorm
Errors of this type may be less perceptual than grammatical, in that listeners tend to report hearing function words which are appropriate to the form of an utterance.
When unstressed vowels are added or lost, the shape of the target word changes, because adding or omitting unstressed vowels necessarily changes the number of syllables. For example:
evolution of tense systems → intense systems
Dec writer → decorator
You can spend a mint eating → a minute
In the first slip, the listener heard a spurious initial syllable; in the second, the listener failed to detect a word boundary and altered the phonological shape of the word by reporting a spurious medial syllable. In the third misperception, the listener added a syllable at the end of the target word.
The reverse type of misperception, loss of unstressed vowels as part of the loss of unstressed syllables, is approximately equally common. For example:
My coffee cup refilled → my coffee cup fell
Accidents → actions
I teach speech science → speech signs.
The first example shows some phonological restructuring as well as loss of an initial syllable. In spite of the phonological restructuring, the verb maintains past tense, though the verb form becomes irregular. In the second example, a medial syllable is lost. In the third example, a syllabic nasal loses its syllabicity.
Slips of the ear typically lose or add only one syllable. Although errors affecting more than one syllable occur, they are relatively rare without considerable concomitant restructuring of the target utterance.
2.2 Consonant misperceptions
Slips of the ear affecting consonants are much more plentiful than vowel slips, whether as misperceptions of single segments or as parts of errors involving a more extensive mismatch between a target utterance and its misperception. Consonants may be lost or added, or one consonant may be substituted for another.
Consonants may be lost at any position within a word; the two examples below show consonant loss in initial and final position.
When their condition → air condition
The only poor meet → the only poor me.
Final consonants are lost much more frequently than initial consonants, undoubtedly because they tend to receive weak and indistinct articulation.
Spurious consonant perceptions can be seen in the two examples:
else → elfs
Tapas bars → topless bars
Though these particular errors do not have any obvious phonetic motivation, a number of consonant additions were associated with word boundary misassignments. For example, the spurious consonant in:
Slip of the ear → slip of the year
may have resulted when the listener interpreted the final segment of the article as a word-initial glide, distributing a single segment over two words.
Consonants may be substituted for each other relatively freely. Table 1 shows the variety of possible misperceptions of one target consonant, the voiceless alveolar stop /t/.
==========================================
Table 1. Misperceptions of /t/
------------------------------------------
Misperceived as another stop:
    great → grape
    at least this part of it → park
    training for great books → grade books
Misperceived as a fricative:
    Tagalog → Thagalog
    She had on a trench suit → a French suit
Misperceived as an affricate:
    I'll bet that'll be a teary program → cheery program
Misperceived as a resonant:
    booty → boolie
    Fifth Street → fifth string
==========================================
Examples of misperceptions of a resonant, the bilabial nasal /m/, are given in Table 2.
==========================================
Table 2. Misperceptions of /m/
------------------------------------------
Misperceived as a stop:
    I'm getting married this Friday → buried
    Ma'am → Pam
Misperceived as another resonant:
    I'm trying to find some matches → latches
    Key lime pie → key line pie
==========================================
In consonant substitutions, basic manner of articulation categories tend to be maintained in that resonants are most commonly misperceived as resonants and obstruents as obstruents. Consonant misperceptions involving substitutions tend to be more common in word-initial position than elsewhere, in a ratio of 2 to 1.
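A rough tally over the examples in Tables 1 and 2 illustrates what it means for a substitution to maintain its basic manner category. The sketch below is an illustration only: the segment classes are a simplified assumption, and the target/percept pairs are one reading of the tabled examples. Because those examples were chosen to show variety, the resulting count says nothing about proportions in the full collection.

```python
# Simplified major-class membership for English consonants (an assumption
# made for this sketch, not a classification given in the source).
OBSTRUENTS = {"p", "b", "t", "d", "k", "g", "f", "v", "θ", "ð",
              "s", "z", "ʃ", "ʒ", "tʃ", "dʒ", "h"}
RESONANTS = {"m", "n", "ŋ", "l", "r", "w", "j"}


def major_class(segment):
    """Classify a consonant as 'obstruent' or 'resonant'."""
    if segment in OBSTRUENTS:
        return "obstruent"
    if segment in RESONANTS:
        return "resonant"
    raise ValueError(f"unclassified segment: {segment}")


def class_preserved(target, percept):
    """True when target and percept belong to the same major class."""
    return major_class(target) == major_class(percept)


# (target, percept) pairs read off Tables 1 and 2.
SUBSTITUTIONS = [
    ("t", "p"), ("t", "k"), ("t", "d"),   # great/grape, part/park, great/grade
    ("t", "θ"), ("t", "f"), ("t", "tʃ"),  # Tagalog, trench/French, teary/cheery
    ("t", "l"), ("t", "ŋ"),               # booty/boolie, Street/string
    ("m", "b"), ("m", "p"),               # married/buried, Ma'am/Pam
    ("m", "l"), ("m", "n"),               # matches/latches, lime/line
]

kept = sum(class_preserved(t, p) for t, p in SUBSTITUTIONS)
print(f"{kept} of {len(SUBSTITUTIONS)} tabled substitutions stay within their major class")
```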
Even though many of the contributors to the collection of slips of the ear speak other languages or are familiar with them, English misperceptions are almost invariably built out of the inventory of English segments. In the data set, there is only one exception:
Patwin → paʔwin
The alveolar stop of the target utterance may very well have been produced as an allophonic glottal stop. The listener, an anthropological linguist, did not compensate for the phonological reduction and reported perceiving a glottal stop with which she was familiar from her work with other languages.
2.3 Segment order
Sometimes slips of the ear resulted in a change in the order of segments or of syllables. These errors suggest that listeners take advantage of global information distributed in the target utterance.
There was only one example of misordering of adjacent segments, the type of misordering traditionally termed metathesis:
They're all Appalachian whites → Appalachian waste.
Misorderings of nonadjacent segments within a syllable were more common, as in:
Falstaff → Flagstaff
I can ink it in → I can nick it in
Do lions have manes? → have names
Misorderings also crossed syllable boundaries, as in
Acton Road → Atkin Road
and even word boundaries:
I'm making boats → taking notes
Spun toffee → fun stocking.
Friar Tuck pizza → Kentucky Fried pizza
In investigations of slips of the tongue, a generalization with almost no exceptions is that segments retain their position within syllable onsets and rhymes. That generalization does not hold for slips of the ear.
Misorderings seem to involve considerable restructuring of the phonology of the target utterance, so the details provided by any one example are certainly not definitive. Nevertheless, the overall impression is that listeners take advantage of information which is not sequential. Whether listeners operate within a fixed time window or are simply opportunistic is not clear. They treat the phonetic information which specifies words as if it were a braid, in which cues for individual segments overlap (see Mattys, 1997).
3. Phonological knowledge
Listeners appear to use knowledge of the phonology of their language in understanding casual conversation, as shown by misperceptions which seem to result from listener attempts to deal with phonological reductions and language varieties. Listeners also show sensitivity to phonotactics, the permissible shape of words.
3.1 Phonological reduction
In casual speech, listeners hear various kinds of pronunciations which differ from the shape words have in their canonical form. Most of the time, these reductions provide no difficulty for listeners in that they report reduced forms as intended by speakers. Sometimes listeners make an error by treating the phonetic stream literally, rather than recovering the intended utterance. At other times, they treat an utterance as if it had undergone phonological reduction, even when it has not. Both of the misperceptions
find me → fine me
in harmony with the text → test
probably represent accurate responses to words in which consonant clusters have been reduced, a literal interpretation of the phonetic material. In the same way, in the misperception
traitor → trader
the listener was presented with a flap, the typical realization of intervocalic alveolar stop, /t/ (see Patterson and Connine, 2001). The listener recovered a homophonous word with the flap as a realization of intervocalic /d/. In the misperception
I tripped on a tent pole → tadpole,
the listener probably failed to detect the nasalized vowel which would provide the only cue for the nasal consonant.
The reverse of these errors, treating an utterance as if it has undergone reduction, indicates that listeners use phonological knowledge in recovering the intended utterances. In the misperceptions:
Mrs. Winner → Winter
Fine sunny weather → fine Sunday weather
The Old Creek Inn was deserted → creek end
the listeners recover spurious consonants, consonants which could have been omitted in a reduced pronunciation. Similarly, in the misperception
You can weld with it--braze → braids
the listener recovered a consonant which could have been omitted in producing a consonant cluster.
Although there is debate about the nature of the mechanisms responsible, experimental evidence also supports the idea that listeners compensate for specific kinds of phonological reductions. Marslen-Wilson and colleagues (Marslen-Wilson, Nix and Gaskell, 1995; Gaskell and Marslen-Wilson, 1997) used words which had assimilated point of articulation, such as 'leam' for 'lean' in 'lean meat'. Listener perceptual interpretations were sensitive to possible assimilations from context. Based on statistical differences in the occurrence of flaps, Patterson and Connine (2001) suggest that some words may be represented in the mental lexicon as reduced forms, requiring no perceptual compensation.
3.2 Phonological well-formedness
Almost invariably, listener misperceptions were phonologically well-formed. There is only one counter-example in the data set, a misperception by a child:
The men are out lumbering in the forests → are out tlumbering.
The child misperception served as a target in another misperception. The speaker was describing slips of the ear to an adult colleague and mentioned that one example had violated English phonotactic constraints. In spite of the introduction which might have been expected to prepare the listener for what was to come, he 'corrected' the sequence to something more acceptable in English:
tlumbering → klumbering.
A similar example involved a proper name. The listener adjusted the non-English syllable onset to fit English phonotactics:
Sruti → Trudy.
In the misperception
a fancy seductive letter → a fancy structive letter
the listener probably failed to detect a short unstressed vowel in the first syllable; he reported the voiced /d/ as voiceless, as is appropriate in English syllable onsets even when he perceived a non-word.
3.3 Language varieties
When listeners hear speech produced in a different dialect or with a foreign accent, their misperceptions can take two forms, just as in the case of phonological reductions. Listeners can perceive the phonetic detail veridically and recover something other than the intended utterance or they can compensate inappropriately for the dialect or accent characteristics of the speaker.
In the misperception
Kings → kangs
the listener reported hearing a non-word when presented with the nasalized vowel produced by the speaker from the South. Similarly, in the misperception
That's a special → spatial
the listener failed to compensate for the tensing of lax vowels characteristic of speech in southeast Ohio.
Veridical perception of phonetic detail leading to an error in recovering the intended utterance can also result from foreign accent. For example, a speaker with a noticeable Eastern European accent produced a flap for the English rhotic, as would be appropriate in her native language. The listener treated the flap as a reduction of an English alveolar stop:
barrel → bottle.
Listeners may also misperceive in attempting to compensate for speaker characteristics, using expectations about the phonology of various dialects. One vowel misperception seems to have resulted from attempting to compensate for dialect differences.
Wattsville → Whitesville
The speaker from South Carolina gave the name of a town, and the listener from Ohio corrected for the monophthongal vowel he believed the Southern speaker was employing. Similarly, in the misperception
It's Lawson → Larson
the Ohio listener corrected for the supposed r-less pronunciation of a speaker from the East Coast of the United States.
4. Lexical knowledge
Listeners report their misperceptions in words and claim that they hear words, in that words are the consciously available result of the perceptual process. Their errors suggest strategies which they employ in partitioning the stream of speech and finding discrete items in the mental lexicon.
4.1 Nonwords
Because slips of the ear occasionally result in the perception of nonwords, phonological sequences which do not map onto any existing lexical item, it seems that one way in which listeners access the mental lexicon is through a phonological code.
Undoubtedly, there are multiple reasons for misperceptions leading to non-words. In the case of proper names or specialized vocabulary, listeners may simply not have sufficient knowledge to recover the intended utterance. Two misperceptions of this kind might be
The anechoic chamber → the ambionic chamber
The mining of Haiphong harbor → Haithong harbor
Some perceptions of nonwords resulted from a failure to compensate for the dialect of the speaker. The misperception mentioned above
Kings → kangs
is an example. Another example of the same type is:
Call Star Fire → /sta fa/.
The listener was told to call Star Fire, the name of a gas station, by a speaker using an r-less dialect. The listener perceived the phonetic material accurately, but could not find an appropriate proper name in her lexicon, and did not have sufficient knowledge about the dialect of the speaker to make an appropriate compensation.
Sometimes common words were misperceived as well, without any obvious motivation in the linguistic or non-linguistic environment. For example:
The article → the yarticle
Sitter problems → sinter
Paula played with Tom → polyp laden /Tam/.
4.2 Word boundaries
Because casual speech is a continuous stream without clearly marked word boundaries, listeners have to segment the stream in some way in order to find phonological sequences to compare with words in the mental lexicon. Slips of the ear involving word boundaries suggest that listeners employ stressed syllables as aids in segmentation.
In the simplest case of word boundary errors, all properties of the target utterance correspond with the perceived utterance except for the presence of word boundaries. A classic error of this type is:
acute back pain → a cute back pain.
The listener perceived the phonological material accurately but misanalyzed the speaker's utterance, interpreting the initial unstressed syllable as an article. Listeners may fail to detect word boundaries, insert spurious word boundaries, or shift the location of a word boundary.
In all cases of word boundary loss which do not involve radical phonological restructuring of some sort, the environment for the loss is a stressed syllable followed by an unstressed syllable, as in the following:
We're going to pour him into the car → purim into the car
Chris De Pino → Christofino
He works in an herb and spice shop → an urban spice shop
Many perceptions of spurious word boundaries were also associated with stressed vowels in that phonetic information was analyzed as if word boundaries preceded stressed syllables. For example:
attacks in the ear → a tax on the ear
Americana → a Mary Canna
At the parasession → at the Paris session.
Because other phonetic cues to word boundary location are available, there are exceptions to the tendency to add word boundaries before stressed vowels. One student misanalyzed a Japanese surname as including a given name, placing the spurious word boundary after the first syllable:
Yoshimura → Yo Shimura.
Shifted word boundaries typically involved misassigning a consonant from word-initial to word-final position, or the reverse, as in
I need a loose crew → loose screw
We could give them an ice bucket → a nice bucket
Dix [Dixon] Ward → Dick Sward
The typical errors in word boundary placement indicate that listeners employ expectations about the structure of their language. As Cutler and Carter (1987) have documented, the great majority of English nouns begin with a stressed syllable. Listeners use this expectation about the structure of English and partition the continuous speech stream employing stressed syllables.
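The stress-based partitioning described above can be sketched as a simple heuristic: open a new word at every stressed syllable and attach unstressed syllables to the word already in progress. The function name, the syllabifications, and the stress markings below are assumptions made for illustration. The heuristic is a deliberate simplification of proposals such as Cutler and Norris (1988) and is not a model put forward in this chapter.

```python
def segment_by_stress(syllables):
    """Group (syllable, is_stressed) pairs into hypothesized words,
    placing a word boundary before every stressed syllable."""
    words = []
    for text, is_stressed in syllables:
        if is_stressed or not words:
            words.append([text])      # boundary: start a new word
        else:
            words[-1].append(text)    # unstressed: extend the current word
    return words


# Assumed syllabification and stress for two targets from the collection.
acute_back_pain = [("a", False), ("cute", True), ("back", True), ("pain", True)]
attacks_in_the_ear = [("a", False), ("tacks", True), ("in", False),
                      ("the", False), ("ear", True)]

print(segment_by_stress(acute_back_pain))
# [['a'], ['cute'], ['back'], ['pain']]
print(segment_by_stress(attacks_in_the_ear))
# [['a'], ['tacks', 'in', 'the'], ['ear']]
```

In both cases the heuristic splits the initial unstressed syllable from the stressed syllable that follows, which parallels the reported slips 'a cute back pain' and 'a tax on the ear'. It makes no attempt to recognize function words, so the unstressed syllables following 'tacks' are simply left attached to it.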
4.3 Content words and function words
Sometimes listeners seem to be extremely inattentive to phonetic information and report a content word only vaguely related to the speaker's utterance. These substitutions are curious in that the misperceptions seem to come from a semantic domain appropriate to the target.
Athens → Akron
pathology → psychology
Stockholm → Scotland
sounds interesting → sounds intriguing.
It is possible that when errors such as these occur, a listener is functioning in a discourse mode, aware of little more than the gist of the conversation, paying almost no attention to the details of what the speaker is saying. Voss (1984) has reported some similar examples.
Because function words tend to be unstressed in ordinary conversation, they are misperceived or adjusted relatively frequently. Listeners misperceive function words sometimes in the context of other misperceptions, sometimes not:
Did you put the food out for him → for them
When were you here → Why were you here?
I think I see a place → his face
In the first two examples, the listeners misidentified a function word. In the third example, the listener misperceived a content word as well as a function word.
Function words may result from misperceptions of word boundaries. The listeners heard spurious word boundaries and interpreted unassigned phonological material as appropriate function words.
Jefferson Starship → Jeffers and starship
You swallowed a watermelon → You smiled at a watermelon
I've been doing research → a search.
Sometimes, listeners also adjust function words to be appropriate to a misperceived utterance, either reporting a spurious word or not reporting any trace of one. In the slip of the ear
hypnotic age regression → hypnotic aid to regression,
the listener probably misperceived 'age' as 'aid' and then added a preposition as required. In the reverse type of error, the listener loses a function word:
change for a dollar → exchange a dollar.
The word 'exchange' does not require a preposition, so the preposition is simply absent in the reported perception.
4.4 Morphology
Perceptual errors related to morphology primarily involved inflectional rather than derivational affixes, most commonly the plural suffix. In these misperceptions, either a morphologically simple word was interpreted as a plural, or the reverse, a plural word was interpreted as mono-morphemic. In these misperceptions, the target utterance contains phonetic material which can be interpreted as a plural, either in the word itself or in the initial consonant of the following word.
Her niece was in the hospital → her knees
on an island with a moat surrounding it → with moats surrounding it
In the reverse error, a fricative representing a plural was interpreted as part of the stem:
matches → mattress.
Some plural forms did not have phonological support in the target utterance but rather appeared by conforming the perception to grammatical requirements. For example, in the misperception:
It'll be a confusing weekend → You're confusing weekends
the listener may have misinterpreted the beginning of the utterance and supplied the appropriate plural form. Similarly, in
It will be done next year → in six years
the listener supplied a plural form appropriate to the numeral.
In other errors involving the perception of inflectional morphology, listeners displayed a strong tendency to interpret morphologically complex forms as mono-morphemic. In these errors, phonological material was sometimes reinterpreted; at other times, phonological material was simply lost. Some examples are given in Table 3.
=============================================
Table 3. Errors in the perception of morphology
---------------------------------------------
A mono-morphemic word is interpreted as a possessive:
    A loose end in this problem → a leaf's end
A possessive is interpreted as part of the stem:
    Olga's son → the sun
    Skipper's treat → trick or treat
A mono-morphemic word is interpreted as a verb:
    I was through on a bus → I was thrown off a bus
A verbal suffix is interpreted as part of the stem:
    Citrus craving → citrus gravy
    This friend of ours who visited → is an idiot
=============================================
Some evidence that morphemes have an independent status in the lexicon is provided by errors in which the stem is misperceived but retains its morphological affixes. For example, in the misperception
Bloomfield's personality was warped here → Whorfed here
the listener retained the same verb tense suffix in the misperception even when the target was misinterpreted as a proper noun, forming a nonce form.
In comparison with misperceptions of inflectional morphology, errors involving derivational morphology were rare. The few examples appear to be phonological rather than involving any specific morphological structure. For example:
Felicity conditions → ballistic conditions
He hasn't heard of any viable reasons → buyable reasons.
Although there is some evidence that morphological affixes have an independent status as elements of the lexicon, most of the errors affecting morphology in some way appear to be primarily phonological, that is, based on misperceptions of phonological information. All things being equal, morphologically complex words are analyzed as mono-morphemic rather than the reverse, and morphological suffixes are adjusted to fit grammatical requirements (see Bond, 1999, for further discussion).
5. Syntax
Most slips of the ear are local, typically affecting words or short phrases. When slips of the ear involve relatively lengthy stretches of speech, the misperceptions can show considerable divergence from the target utterances. Short slips do not provide much information about syntax, while long but radically restructured slips make it difficult to determine exactly what was misperceived.
5.1 Well-formed and ill-formed utterances
Most slips of the ear produced syntactically well-formed utterances in that the erroneously perceived portions did not create syntactic deviance. On occasion, misperceptions created utterances which listeners were unable to parse. In the misperception
has knocked real dents → has not real dents,
the listener misinterpreted the verb 'knocked' as the negative 'not' and indicated incomprehension before the speaker could complete his utterance. This misperception suggests that listeners almost immediately map what they hear into syntactic structures.
Even though listeners seem to expect well-formed utterances, a portion of misperceptions were ungrammatical to various degrees. Some showed minor deviations from well-formedness, such as missing articles:
I just got back from Denison → from dentist
Wouldn't she look good with a ring in her nose → oregano nose.
Other slips departed further from well-formedness. Probably the most syntactically deviant misperception was:
We offered six → we Alfred six
in which a verb was misinterpreted as a proper name without any adjustments of the remainder of the phrase. Apparently, although listeners expect syntactic well-formedness, syntactic structure does not constrain interpretation of utterances to the same degree that phonological structure does.
5.2 Constituents
As a minimum, sentence understanding requires that listeners locate constituents and assign structural relationships. Consequently, we would expect that misperceived utterances preserve the integrity of constituents. The misperception data support the idea that constituents function as perceptual units.
First, misperceptions which involve misordering of segments were almost always located within constituents. In the collection of misperceptions, there were only four apparent counterexamples, errors in the order of segments which seem to cross constituent boundaries in some way:
I have to eat too → I have eighty-two
She wants to be a teacher → She wants me to teach her
without your mother along → without your mother-in-law
my three-ninety class → my three-D night class.
All four of these misperceptions also involve considerable phonological restructuring of the target utterance.
Second, word boundary misperceptions seem to be restricted to constituents. There was only one word boundary misperception which clearly crossed a major constituent boundary. The slip took place in the context of a riddle:
What goes 'zzub, zzub, zzub'?
A bee flying backwards → A beef lying backwards.
Perhaps the listener was prepared to suspend normal expectations when faced with a very odd question.
Major syntactic constituents are typically produced with a unified intonation contour and with the constituent boundaries phonetically characterized in some way (see, e.g. Gussenhoven and Jacobs, 1998, for phonological descriptions). Very few word boundary misperceptions crossed intonational contours. The few examples involve direct address, such as:
I know what happened to our ice, Andy → to high Sandy
or an explanation:
Sonic, the hedgehog → son of the hedgehog.
Even misperceptions which restructured the phonology and lexicon quite radically seem to maintain the overall phrasing of the target utterance. For example,
I wasn't getting anywhere with all those vowel adjustments → with all those bottles of aspirin.
How've you been? → Got a minute?
It seems likely that the phonological structure of constituents provides a scaffolding which guides listener perception. Phrases defined by intonation seem to serve as units of segmentation with which listeners begin syntactic analysis.
5.3 Argument structure and function
Even though constituents seem to be resistant to misperception, their function and internal structure can be misanalyzed in many different ways. There seem to be two primary causes of syntactic misanalyses, often operating jointly: listeners recover a word which is phonetically similar to the target but belongs to a different part of speech, or they mislocate word boundaries.
A misperception which leads to an incorrect part of speech assignment to a word can have consequences at any level of syntactic analysis. When the listener recovered a verb instead of an interrogative, she interpreted the speaker's question as a command:
Where are your jeans? → Wear your jeans,
changing the function of the utterance. A student wished her classmate good luck, using a phrase from the movie Star Wars. The listener reinterpreted the auxiliary and following noun phrase as a noun:
May the Force be with you → metaphors be with you
A verb was interpreted as a homophonous noun with an adjustment of the function word:
I'm going to try to get it towed → to get a toad.
In the reverse part of speech assignment, a noun was interpreted as a verb:
structure, style and usage → instruct your style and usage.
In the misperception
I'm going to go downstairs and do some laminating → lemon eating
a noun was interpreted as a verb with a direct object.
Quite often, listeners adjust or 'edit' portions of utterances so that they have the appropriate lexical items for well-formedness. In the misperception:
Missed the news → must 'a [have] snoozed
the listener interpreted the word 'missed' as the near homophone 'must' and the remainder of the utterance as the continuation of the verb phrase. In
John's nose is on crooked → John knows his own cooking
the noun 'nose' was interpreted as the homophonous verb 'knows' and the remainder of the utterance was interpreted as the required argument. Swinney (1979) and Tanenhaus, Leiman and Seidenberg (1979) have reported that the same phenomenon, recovering homophones regardless of their part of speech, can be observed in experimental situations.
Misperceptions of word boundaries affected the perceived syntactic structure of phrases. For example, conjoined nouns or noun phrases were interpreted as a single noun phrase:
cinema and photography → cinnamon photography
a purse and a billfold → a personal billfold
The reverse error, interpreting a noun phrase as a conjoint, also occurred but more rarely:
Jefferson Starship → Jeffers and starship.
Apparently, almost any kind of phrase or clause may be misanalyzed. For example, a prepositional phrase was interpreted as a modifier in:
a plate of lasagna → potato lasagna.
A relative clause was interpreted as a predicate in:
This friend of ours who visited → This friend of ours is an idiot.
No one syntactic property or characteristic of target utterances was invariably perceived correctly, thus serving to provide reliable syntactic information. Neither the purpose of the utterance as a whole nor the structure of any of its parts survived misperceptions reliably. Overall phrasing and rhythm, however, tended to be preserved.
6. Semantics and pragmatics
Listeners do not appear to be constrained by semantic plausibility or contextual appropriateness. There are numerous misperceptions which involve radical changes in phonology and syntax, completely lacking in semantic appropriateness. Some examples:
After the rubber boat had been wrecked in the squall → After the rubber boot had been erected in the squirrel
I'm going to go back to bed until the news → I'm going to go back to bed and crush the noodles
I seem to be thirsty → I sing through my green Thursday
A linguini is a noodle → a lean Wheatie
My interactive Pooh → Mayan rack of Pooh
Languages allow speakers to say novel and unexpected things. Listeners, in turn, are willing to entertain novel and unexpected utterances.
7. Summary and conclusions
Phonetic errors show that stressed vowels resist misperception in comparison with consonants. The status of unstressed vowels is much more fluid in that they are readily lost or added and, particularly in function words, changed to make the word fit grammatical requirements. The fundamental manner of articulation feature, obstruent vs. resonant, is somewhat resistant to misperception, in that resonants tend to be perceived as resonants and obstruents tend to be perceived as obstruents. Misperceptions are not equally likely in all positions in a word: consonant substitutions tend to occur word-initially; consonant loss tends to affect final consonants. Finally, listeners take advantage of phonetic information wherever it is available, sometimes making errors about the order of segments.
Misperceptions suggest that listeners act in accordance with knowledge of the phonology of their language. They expect utterances to be phonologically well formed, and they 'correct' consonant sequences which do not correspond to English phonotactic constraints. Listener misperceptions which result from compensating for phonological reductions suggest that listeners act as if phonological knowledge guided their interpretations of reduced speech. In the same way, listeners appear to have expectations about the phonological characteristics of dialects and sometimes use these in understanding speakers from other dialect areas.
The lexical representations which listeners employ involve a phonological code, as indicated by the perceptual occurrence of non-words. Listeners seem to employ stressed syllables to locate word boundaries, showing knowledge of the statistical structure of their language. They are not particularly attentive to function words, adding or modifying them as needed by the structure of phrases or sentences.
Listener misanalyses of syntactic structure appear to be related to misplaced word boundaries and misassigned part of speech roles. That is, listeners recover a homophonous or nearly homophonous word from a different lexical category than the word in the target utterance. No one syntactic property resists misperception. However, phrases defined by intonation contours seem to be resistant to error and perhaps provide a scaffolding for syntactic interpretation. Listeners are open to extremely implausible utterances, not at all constrained by semantic or pragmatic appropriateness. It may be that the perception of an odd or unusual utterance leads listeners to question what they have heard and to detect a perceptual error, a slip of the ear.
I should end with a note of caution. Slips of the ear are not directly observable. Rather, slips of the ear become available through listener reports. Because slips have been collected from spontaneous, casual conversations, the speakers’ target utterances are also not available. Rather, the data set for slips of the ear consists of speakers’ intentions and listener reports of their perceptions. These difficulties are not unique to perceptual errors but rather characterize most investigations based on observations of fleeting actions. Though any one slip may be misreported by a listener or depend on an undetected error by a speaker, when many slips share characteristics, we may be reasonably sure that they represent real perceptual processes.
Listeners are faced with a phonetic stream, what Sapir (1921) calls the 'rumble of speech'; it is rapid and inconsistent, full of assimilations, deletions and many other kinds of reductions. Most of the time, listeners untangle the rumble of speech and recover speaker intentions. They do this by applying strategies based on their knowledge of the structure of their language.
ACKNOWLEDGEMENTS
My thanks to Scott Jarvis, Emilia Marks and Verna Stockmal for providing helpful comments on an earlier version of this paper.
REFERENCES
Altmann, G. and Carter, D. (1989) 'Lexical stress and lexical discriminability: Stressed syllables are more informative, but why?' Computer Speech & Language, 3, 265-275.
Bond, Z. S. Slips of the Ear: Errors in the Perception of Casual Conversation. (San Diego, CA: Academic Press. 1999).
Bond, Z.S. 'Morphological errors in casual conversation'. Brain and Language, 68 (1999), 144-150.
Bond, Z. S. and Small, L. H. (1983). 'Voicing, vowel, and stress mispronunciations in continuous speech.' Perception and Psychophysics, 34, 470-474.
Butterfield, S. and Cutler, A. (1988). 'Segmentation errors by human listeners: Evidence for a prosodic segmentation strategy.' Proceedings of Speech '88 (pp. 827-833). Edinburgh.
Celce-Murcia, M. 'On Meringer's corpus of "slips of the ear".' Errors of Linguistic Performance: Slips of the Tongue, Ear, Pen and Hand (pp. 199-211). V. A. Fromkin (ed.) (New York: Academic Press, 1980).
Chomsky, N. ‘Language and thought: Some reflections on venerable themes.’ Powers and Prospects: Reflections on Human Nature and the Social Order (pp. 1-30) (Boston: South End Press, 1996).
Cole, R. A., Massaro, D. W., Yan, Yonghong, Mak, B., and Fany, M. (2001). 'The role of vowels versus consonants to word recognition in fluent speech.' URL: http://mambo.ucsc.edu/psl/dwm.
Cutler, A. and Butterfield, S. (1992) 'Rhythmic cues to speech segmentation: Evidence from juncture misperception,' Journal of Memory & Language, 31, 218-236.
Cutler, A. and Carter, D. M. (1987) 'The predominance of strong initial syllables in the English vocabulary' Computer Speech and Language, 2, 133-142.
Cutler, A. and Norris, D. (1988) 'The role of strong syllables in segmentation for lexical access,' Journal of Experimental Psychology: Human Perception & Performance, 14, 113-121.
Edwards, G. 'Scuse Me while I Kiss this Guy and other Misheard Lyrics (New York: Simon & Schuster, 1995).
Gaskell, M. G. and Marslen-Wilson, W. (1997). 'Integrating form and meaning: A distributed model of speech perception.' Language and Cognitive Processes, 12, 613-656.
Grosjean, F. and Gee, J. P. (1987). 'Prosodic structure and spoken word recognition.' Cognition, 25, 135-155.
Gussenhoven, C. and Jacobs, H. Understanding Phonology. (London: Arnold, 1998).
Meringer, R. Aus dem Leben der Sprache: Versprechen, Kindersprache, Nachahmungstrieb (Berlin: B. Behr, 1908).
Meringer, R. and Mayer, K. Versprechen und Verlesen (Stuttgart: G. J. Goschensche Verlagshandlung, 1895).
Marslen-Wilson, W., Nix, A. and Gaskell, G. (1995) 'Phonological variation in lexical access: Abstractness, inference and English place assimilation,' Language and Cognitive Processes, 10, 285-308.
Mattys, S. L. (1997) 'The use of time during lexical processing and segmentation: A review,' Psychonomic Bulletin & Review, 4, 310-329.
Neel, A. T., Bradlow, A. R. and Pisoni, D. B. (1996/97). 'Intelligibility of normal speech: II. Analysis of transcription errors.' Research on Speech Perception, Progress Report No. 21. (Bloomington: Indiana University).
Patterson, D. and Connine, C. M. 'A corpus analysis of variant frequency in American English flap production.' Paper presented at Acoustical Society of America, June, 2001.
Sapir, Edward. Language. (New York: Harcourt, Brace, 1921).
Swinney, D. A. (1979) 'Lexical access during sentence comprehension: (Re)consideration of context effects.' Journal of Verbal Learning and Verbal Behavior, 18, 645-659.
Van Ooijen, B. (1996). 'Vowel mutability and lexical selection in English: Evidence from a word reconstruction task.' Memory and Cognition, 24, 573-583.
Voss, B. Slips of the Ear: Investigations into the Speech Perception Behaviour of German Speakers of English (Tübingen, Germany: Gunter Narr Verlag, 1984).