Naturalistic Speech Misperception

Project Summary:

For my PhD thesis, I investigated speech misperception in its most naturalistic form, namely slips of the ear. Patterns of error in speech perception shed light on psycholinguistics, speech segmentation, models of diachrony and more. I also compared the patterns found in naturalistic and laboratory data in English.

Thesis Abstract:

This thesis presents a new corpus containing ≈ 5,000 instances of naturally occurring misperception of conversational English. The corpus brings together existing naturalistic corpora under a standardised format for orthographic and phonetic transcriptions and metadata. It is available at the SEAR Project (www.searproject.org).

I examined bottom-up phonetic/phonological factors and top-down lexical factors for their contributions to misperception in naturalistic settings. On the feature level, voicing/place/manner confusions were best explained using sonority, featural underspecification (Lahiri and Reetz, 2002) and markedness (Lombardi, 2002), and vowel height/backness confusions using perceived similarity (Steriade, 2001) and chain shifts (Labov, 1994a).

On the segment level, I found that confusions can be explained with acoustic/featural distances, and that experimental conditions with extreme signal-to-noise ratios and narrow bandwidths were less ecologically valid. Furthermore, three well-known sound changes (TH-fronting, velar nasal fronting and back vowel fronting) were consistently found in both naturalistic and experimental data.
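
To make the notion of featural distance concrete, here is a minimal Python sketch. It is an illustration only, not the measure used in the thesis: the three-way feature table (voicing, place, manner), the function name featural_distance and the choice of a simple mismatch count are all assumptions made for this example.

```python
# Toy feature table: (voicing, place, manner) for a few English consonants.
# This simplified system is an assumption for illustration; it is not the
# feature system or the distance metric used in the thesis.
FEATURES = {
    "p": ("voiceless", "bilabial", "plosive"),
    "b": ("voiced", "bilabial", "plosive"),
    "t": ("voiceless", "alveolar", "plosive"),
    "d": ("voiced", "alveolar", "plosive"),
    "s": ("voiceless", "alveolar", "fricative"),
    "f": ("voiceless", "labiodental", "fricative"),
    "θ": ("voiceless", "dental", "fricative"),
    "n": ("voiced", "alveolar", "nasal"),
    "ŋ": ("voiced", "velar", "nasal"),
}


def featural_distance(a: str, b: str) -> int:
    """Count the features on which two segments differ (0 to 3)."""
    return sum(x != y for x, y in zip(FEATURES[a], FEATURES[b]))


if __name__ == "__main__":
    # TH-fronting (θ heard as f) and velar nasal fronting (ŋ heard as n)
    # each involve a change in place only.
    for intended, perceived in [("θ", "f"), ("ŋ", "n"), ("p", "d"), ("b", "s")]:
        print(intended, perceived, featural_distance(intended, perceived))
```

Under this toy measure, the segment pairs involved in TH-fronting and velar nasal fronting differ in place only, so they sit at the smallest non-zero distance. A finer-grained feature set or an acoustic distance would give a more graded picture.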

On the syllable level, codas are more likely to be misperceived than nuclei/onsets in monosyllables, but onsets are more likely to be misperceived in polysyllables. Fewer errors occur in stressed syllables than in unstressed syllables in polysyllabic words, but not in monosyllables. Initial syllables are more likely to be misperceived than medial syllables, which in turn are more prone to misperception than final syllables.

On the word level, when listeners misperceived a word, they tended to perceive a word of similar frequency to the intended word, but crucially not a more frequent word. This supports the graceful degradation account of a malfunctioning system (Vitevitch, 2002). On the utterance level, listeners were sensitive to the predictability of a word, suggesting that less predictable words are more likely to be misperceived.

Together, these analyses establish the naturalistic corpus as an ecologically valid resource and a benchmark of misperception, bridge the gap between experimental and naturalistic studies, and highlight the need to examine misperception with units larger than nonsense syllables.