The major aim of our research is to understand how the dynamic visual and auditory components of vocal expressions (e.g., speech) are combined behaviorally and physiologically to enhance communication. For example, holding a conversation among a group of individuals at a party means that all around you are the sounds of voices, laughter and music. In this mixture of sounds, the problem your brain is confronted with is to deftly detect when a person is saying something and discriminate what she is saying. To make this task easier, our brains do not rely entirely on the person's voice, but also take advantage of the movement of the person's face while she speaks. The motions of the mouth provide spatial and temporal cues. These multidimensional cues enhance detection and discrimination of voices. The focus of our work will be on what role the auditory cortex plays in integrating faces and voices, and how its role may differ from that of the more traditional association areas such as the superior temporal sulcus. We have four main hypotheses. First, we hypothesize that the magnitude of the behavioral advantage, in terms of multisensory benefits on reaction times, will relate to the response magnitude and response latency of auditory cortical neurons. To address this, we will record from the lateral belt auditory cortex during the performance of an audiovisual vocal detection task in noise. Second, we predict that the auditory cortex will show a rhythm preference for normal speech relative to slowed or sped-up speech and that this preference will also result in greater spiking output, greater spike-speech phase locking, or both. Third, we hypothesize that the role of this rhythm is to "chunk" the auditory signal into manageable units to allow for further, more efficient processing of the fine structure of vocalizations.
We will then test the possibility that a rhythmic visual signal could compensate for disruptions in the rhythmicity of the auditory component of vocalizations; we will test this both behaviorally and at the level of auditory cortical signals. Fourth, we hypothesize that processes occurring in the superior temporal sulcus during the same detection and discrimination tasks will be different from those occurring in the auditory cortex. This difference will arise primarily because the superior temporal sulcus receives supra-threshold inputs from both the auditory and visual modalities, whereas the auditory cortex receives only a modulatory, sub-threshold influence from the visual modality. Our work has the potential to illuminate the neurophysiological mechanisms that go awry in a number of communication disorders. First, relative to typically developing children, children with autism spectrum disorders exhibit impaired neural processing and impaired detection of audiovisual speech in noisy backgrounds. Second, a recent theory of dyslexia suggests that dyslexics are impaired at linking phonological sounds with vision. Third, relative to normal individuals, schizophrenic patients are particularly impaired at discriminating audiovisual versus auditory-only speech in noisy backgrounds. One likely substrate for these impairments is the temporal lobe, where faces and voices are first combined neurophysiologically.

Public Health Relevance

In a number of communication disorders, there are deficits in audiovisual integration. Patients with schizophrenia and children with autism spectrum disorders both exhibit impaired detection of audiovisual speech sounds in noisy backgrounds, and audiovisual speech processing in dyslexics shows abnormal patterns of brain activity. By investigating the relationship between 1) audiovisual detection and temporal lobe neurophysiology, and 2) audiovisual discrimination, the rhythmic nature of speech, and temporal lobe neurophysiology, we are trying to understand what goes awry in these and other communication disorders.

National Institutes of Health (NIH)
National Institute of Neurological Disorders and Stroke (NINDS)
Research Project (R01)
Program Officer
Babcock, Debra J
Princeton University
Schools of Arts and Sciences
United States
Ghazanfar, Asif A; Liao, Diana A (2018) Constraints and flexibility during vocal development: Insights from marmoset monkeys. Curr Opin Behav Sci 21:27-32
Liao, Diana A; Zhang, Yisi S; Cai, Lili X et al. (2018) Internal states and extrinsic factors both determine monkey vocal production. Proc Natl Acad Sci U S A 115:3978-3983
Teramoto, Yayoi; Takahashi, Daniel Y; Holmes, Philip et al. (2017) Vocal development in a Waddington landscape. Elife 6:
Fitch, W Tecumseh; de Boer, Bart; Mathur, Neil et al. (2016) Monkey vocal tracts are speech-ready. Sci Adv 2:e1600723
Borjon, Jeremy I; Takahashi, Daniel Y; Cervantes, Diego C et al. (2016) Arousal dynamics drive vocal production in marmoset monkeys. J Neurophysiol 116:753-64
Ghazanfar, Asif A; Zhang, Yisi S (2016) The autonomic nervous system is the engine for vocal development through social feedback. Curr Opin Neurobiol 40:155-160
Choi, Jung Yoon; Takahashi, Daniel Y; Ghazanfar, Asif A (2015) Cooperative vocal control in marmoset monkeys via vocal feedback. J Neurophysiol 114:274-83
Ghazanfar, Asif A; Takahashi, Daniel Y (2014) The evolution of speech: vision, rhythm, cooperation. Trends Cogn Sci 18:543-53
Borjon, Jeremy I; Ghazanfar, Asif A (2014) Convergent evolution of vocal cooperation without convergent evolution of brain size. Brain Behav Evol 84:93-102
Ghazanfar, Asif A; Eliades, Steven J (2014) The neurobiology of primate vocal communication. Curr Opin Neurobiol 28:128-35
