When we listen, we rapidly and reliably decode speakers' intentions, and we mostly do so independently of whom we are talking to. Yet anyone who has interacted with an automated speech recognition system (e.g., while booking a flight) is painfully aware that speech recognition is a computationally hard problem: although we hardly ever become aware of it, the physical signal corresponding to, for example, one speaker's "b" can be identical to another speaker's "p", making it hard for computers to distinguish between them. How, then, does the human brain accomplish this task with such apparent ease?
This NSF-funded workshop brings together researchers from computer science, linguistics, and the cognitive sciences to discuss and investigate how the brain achieves robust language understanding despite this variability. The invited speakers are internationally known experts. Representatives from both industry and academia will present on the state of the art in automated speech recognition, implicit learning during language understanding, and the neural systems underlying speech perception. The workshop will take place in conjunction with the 2013 Linguistic Society of America Summer Institute--the largest international linguistics summer school--and will thereby provide training to a large number of young language researchers.
To understand each other, listeners must decode the linguistic signal in order to infer speakers' intended messages. This process is necessarily noisy: no two instances of a sound category (e.g., a /p/) are exactly alike in terms of the physical signal that corresponds to them. To further complicate communication, speakers differ in the way they realize the same sounds, concepts, and propositions. The challenges that this variability poses for comprehension are perhaps most obvious at the level of speech perception. For example, the acoustic distribution corresponding to one speaker's /p/ can be physically more similar to the acoustic distribution of another speaker's /b/ than to that speaker's /p/. That is, not only is the linguistic signal intrinsically noisy, resulting in distributions of acoustic features corresponding to linguistic categories, but these distributions also vary depending on the speaker. This lack of invariance makes language understanding a challenging computational problem. Understanding how the human brain overcomes this problem can help in developing better speech technology (such as automatic speech recognition and background noise suppression for cell phones). It can also lead to a better understanding of deficits that make it difficult for some listeners to understand others.

We organized and conducted a workshop that brought together researchers working on this problem from linguistic, psycholinguistic, and computational perspectives. Over 120 students, post-docs, and faculty, including academics and representatives of related industries, attended the workshop at the 2013 Linguistic Society of America Summer Institute (the largest gathering of language researchers in the world). The workshop facilitated new interdisciplinary collaboration, including potential collaborations (still in the planning stages) between industry and academia.
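To make the lack-of-invariance problem concrete, the sketch below simulates hypothetical voice onset time (VOT) distributions for two imaginary speakers. All numerical values (the means, standard deviations, and category boundary) are assumptions chosen purely for illustration, not data from the workshop; the point is only that a category boundary tuned to one speaker misclassifies another speaker's productions.

```python
# Illustrative sketch: hypothetical VOT distributions (in ms) for two speakers.
# All values are made up for illustration of the lack-of-invariance problem.
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Speaker A produces relatively short VOTs; Speaker B relatively long ones.
speaker_a = {"/b/": rng.normal(0, 10, n), "/p/": rng.normal(40, 10, n)}
speaker_b = {"/b/": rng.normal(35, 10, n), "/p/": rng.normal(90, 10, n)}

# Speaker A's /p/ tokens lie closer to Speaker B's /b/ than to Speaker B's /p/ ...
print("mean VOT, Speaker A /p/:", speaker_a["/p/"].mean().round(1))
print("mean VOT, Speaker B /b/:", speaker_b["/b/"].mean().round(1))
print("mean VOT, Speaker B /p/:", speaker_b["/p/"].mean().round(1))

# ... so a single fixed /b/-/p/ boundary cannot work for both speakers.
boundary = 45.0  # a boundary that separates Speaker B's categories well
for name, tokens in [("A /p/", speaker_a["/p/"]), ("B /b/", speaker_b["/b/"])]:
    labeled_p = (tokens > boundary).mean()
    print(f"Speaker {name}: fraction labeled /p/ with fixed boundary = {labeled_p:.2f}")
```

Under these assumed values, most of Speaker A's /p/ tokens fall below the boundary and would be labeled /b/, illustrating why listeners (and speech recognizers) must adapt their category boundaries to individual speakers rather than rely on invariant acoustic cues.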