In an era of increasing globalization and immigration, the speech heard on a daily basis is becoming less homogeneous, and various accents and dialects permeate everyday activities as interactions between people from different language backgrounds increase. Speakers of different languages use different acoustic information to decide which sounds they are hearing. For example, while both English and Spanish have a sound contrast between "p" and "b," listeners attend to different aspects of the sound signal (different "acoustic cues") to distinguish these sounds in the two languages. This means that a person listening to another language, or to foreign-accented speech, may not be attending to the parts of the signal that are relevant for the speaker, which can reduce intelligibility. For effective speech comprehension, listeners must rapidly accommodate to different accents by shifting their attention to the aspects of the sounds that are relevant for the speaker. The proposed dissertation research explores how one's native language shapes the way sound categories are produced and perceived, as well as the extent to which listeners adapt when confronted with changes in these categories.

The first part of the research will explore which acoustic cues are most relevant for distinguishing contrasts like "p" vs. "b" in English and Spanish. The second set of experiments will explore how the "boundaries" of these sound categories can be shifted. Perhaps the most striking example of accommodation is provided by bilinguals, who switch from the sounds of one language to those of the other effortlessly and instantaneously. However, monolingual listeners also shift their sound category boundaries to accommodate to new accents. The proposed research will explore the plasticity of sound categories, that is, the speed and extent with which listeners shift their category boundaries when confronted with speech that is pronounced differently than usual, as happens when hearing an unfamiliar foreign accent. The research will also investigate whether bilinguals show more plasticity than monolinguals. Together, the results of the proposed experiments will increase understanding of how sound categories are structured in different languages and which factors contribute to the plasticity of these categories, with the broader aim of providing insight into the challenges involved in understanding foreign or foreign-accented speech.

Project Report

When learning the sounds of a foreign language, the most obvious challenge is mastering "exotic" sounds that aren’t part of our native language. However, even sounds that we think of as "the same" are pronounced differently in different languages. For example, although both English and Spanish have ‘B’ and ‘P’ sounds, a ‘P’ in English does not sound the same as a ‘P’ in Spanish, and using the English versions of the sounds when speaking Spanish (or vice versa) contributes to a foreign accent. Furthermore, because of these differences, Spanish and English listeners also hear these sounds differently: a sound that is heard clearly as ‘B’ by an English listener is heard just as clearly as a ‘P’ by a Spanish listener.

Although these differences seem small (and in part because they are so small), they pose particular difficulty for language learners. They are also relevant for anyone communicating with people from different language backgrounds; in other words, given the current increase in globalization and immigration, the majority of the population. Furthermore, these sorts of fine-grained differences between languages, and even between individual speakers, underlie one of the biggest challenges for automatic speech recognition systems (such as the voice recognition feature found on mobile phones): adapting to new speakers and accents.

The first goal of this project was to get an accurate picture of the differences between similar sounds in Spanish and English by studying monolingual speakers of each language (in Arizona and Mexico) and Spanish-English bilinguals (in Arizona). In these experiments, which involved training research assistants from the bilingual community in Arizona and from Mexico, we measured several properties of speakers’ pronunciations of the sounds to determine how speakers produce the sounds differently in the two languages.

To study differences in how listeners hear sounds in the different languages, we manipulated the properties that differ in pronunciation to create a large, systematically varying acoustic space encompassing the sounds of interest, then asked speakers of each language to listen to each sound in this space and tell us what they heard. Based on these responses, we were able to create a "map" of each listener’s sound categories and compare how sounds are heard differently across languages. The results of these studies, which show the specific ways in which similar sounds differ across languages, allow us to predict which difficulties are most likely to be encountered by language learners, which in turn can form the basis for more targeted and efficient learning materials.
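To make the idea of a category "map" concrete, the sketch below fits a logistic categorization function to two-alternative "p"/"b" responses and reads off the cue weights and the category boundary. This is a minimal illustration, not the project’s actual stimuli or analysis code: the choice of cues (voice onset time and onset f0, two cues commonly studied for this contrast), the response data, and the use of scikit-learn are all assumptions made for the example.

```python
# Illustrative sketch: estimating a listener's category boundary and cue
# weights from hypothetical two-alternative "p"/"b" responses.
import numpy as np
from sklearn.linear_model import LogisticRegression

# One row per stimulus: [VOT in ms, onset f0 in Hz]; 1 = heard "p", 0 = heard "b".
# These values are invented for illustration.
X = np.array([[0, 98], [10, 102], [20, 96], [30, 104],
              [40, 118], [50, 112], [60, 121], [70, 116]], dtype=float)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Standardize the cues so the fitted coefficients are comparable "cue weights".
Xz = (X - X.mean(axis=0)) / X.std(axis=0)
model = LogisticRegression().fit(Xz, y)
w_vot, w_f0 = model.coef_[0]
print(f"relative cue weights: VOT={w_vot:.2f}, f0={w_f0:.2f}")

# For a single cue, the category boundary is the 50% crossover point,
# i.e., where the log-odds equal zero: boundary = -intercept / weight.
vot_only = LogisticRegression().fit(X[:, :1], y)
boundary_ms = -vot_only.intercept_[0] / vot_only.coef_[0, 0]
print(f"VOT boundary: {boundary_ms:.1f} ms")
```

In a fit like this, the relative size of the standardized weights indicates how heavily a listener relies on each cue, which is the kind of cross-language difference the first set of experiments was designed to measure.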
A second part of the project focused on how listeners adapt their sound categories when confronted with a new speaker or accent. Every time we talk to someone new, and particularly when the speaker and listener do not share the same first language, we must adapt to the idiosyncrasies of that speaker’s accent. Although humans do this quickly and apparently effortlessly, the details of how this adaptation happens are not well understood, and despite recent large technological advances, adaptation to new speakers remains a major source of difficulty for speech recognition systems. We created artificial accents by manipulating the specific sound properties discussed above, then played the different accents to listeners. By analyzing the ways that listeners adapted to different sorts of artificial accents, we were able to identify a specific mechanism by which listeners adapt to new accents. Furthermore, the same sort of adaptation process was used by all of the language groups we tested: monolinguals, bilinguals, and foreign language learners. The generality and relative simplicity of this adaptation mechanism suggest that it could be integrated into automatic speech recognition systems to improve their flexibility and accuracy.
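The report does not spell out the mechanism itself, so the following is a toy sketch of one generic family of adaptation models consistent with this description: the listener incrementally updates each category’s running cue statistics from incoming tokens and re-centers the boundary between them. The function, the cue values, the learning rate, and the assumption that each token’s category is known (e.g., from disambiguating lexical context) are all hypothetical, not the mechanism the project identified.

```python
# Toy sketch of incremental boundary adaptation over a single cue (VOT, in ms).
# All numbers are hypothetical.
def adapt_boundary(tokens, mean_b=10.0, mean_p=60.0, rate=0.2):
    """tokens: (vot_ms, label) pairs heard from a new speaker; label is 'b' or 'p',
    standing in for whatever context disambiguates the intended category."""
    for vot, label in tokens:
        if label == "b":
            mean_b += rate * (vot - mean_b)   # nudge the /b/ mean toward the token
        else:
            mean_p += rate * (vot - mean_p)   # nudge the /p/ mean toward the token
    return (mean_b + mean_p) / 2.0            # boundary re-centered between means

# An artificial "accent" whose /p/ tokens have unusually short VOTs pulls the
# boundary downward from its initial midpoint (35 ms), mimicking rapid accommodation.
shifted_accent = [(5, "b"), (35, "p"), (8, "b"), (30, "p"), (6, "b"), (32, "p")]
print(f"boundary after exposure: {adapt_boundary(shifted_accent):.1f} ms")
```

An update rule of this incremental form is speaker-independent and cheap to compute, which is the sense in which a simple, general adaptation mechanism could plausibly be built into an adaptive speech recognizer.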

Budget Start: 2013-09-01
Budget End: 2015-02-28
Fiscal Year: 2013
Total Cost: $12,903
Name: University of Arizona
City: Tucson
State: AZ
Country: United States
Zip Code: 85719