The major aim of our research is to understand how the dynamic visual and auditory components of vocal expressions (e.g., speech) are combined, behaviorally and physiologically, to enhance communication. Consider holding a conversation at a party: all around you are the sounds of voices, laughter, and music. In this mixture of sounds, your brain is confronted with the problem of detecting when a person is saying something and discriminating what she is saying. To make this task easier, the brain does not rely entirely on the person's voice, but also takes advantage of the movements of her face while she speaks. The motions of the mouth provide spatial and temporal cues, and these multidimensional cues enhance both the detection and the discrimination of voices. The focus of our work is the role the auditory cortex plays in integrating faces and voices, and how that role may differ from that of more traditional association areas such as the superior temporal sulcus.

We have four main hypotheses. First, we hypothesize that the magnitude of the behavioral advantage, in terms of multisensory benefits on reaction times, will relate to the response magnitude and response latency of auditory cortical neurons. To address this, we will record from the lateral belt auditory cortex during the performance of an audiovisual vocal detection task in noise. Second, we predict that the auditory cortex will show a rhythm preference for normal speech relative to slowed or sped-up speech, and that this preference will manifest as greater spiking output, greater spike-speech phase locking, or both. Third, we hypothesize that the role of this rhythm is to "chunk" the auditory signal into manageable units, allowing further, more efficient processing of the fine structure of vocalizations. We will then test whether a rhythmic visual signal can compensate for disruptions in the rhythmicity of the auditory component of vocalizations; we will test this both behaviorally and at the level of auditory cortical signals. Fourth, we hypothesize that the processes occurring in the superior temporal sulcus during the same detection and discrimination tasks will differ from those occurring in the auditory cortex, primarily because the superior temporal sulcus receives supra-threshold inputs from both the auditory and visual modalities, whereas the auditory cortex receives only a modulatory, sub-threshold influence from the visual modality.

Our work has the potential to illuminate the neurophysiological mechanisms that go awry in a number of communication disorders. First, relative to typically developing children, children with autism spectrum disorders exhibit impaired neural processing and impaired detection of audiovisual speech in noisy backgrounds. Second, a recent theory of dyslexia suggests that dyslexics are impaired at linking phonological sounds with vision. Third, relative to normal individuals, patients with schizophrenia are particularly impaired at discriminating audiovisual versus auditory-only speech in noisy backgrounds. One likely substrate for these impairments is the temporal lobe, where faces and voices are first combined neurophysiologically.
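To make the second hypothesis concrete, the sketch below illustrates one common way spike-speech phase locking can be quantified: band-pass the speech amplitude envelope in a theta-like rhythm band, take its instantaneous phase via the Hilbert transform, and measure how tightly spike times cluster around a preferred phase. This is a minimal illustrative sketch, not the proposal's actual analysis pipeline; all signal names, parameter values, and the synthetic data are our own assumptions.

```python
# Minimal sketch (illustrative assumptions only) of spike-speech phase locking:
# the mean resultant vector length of envelope phases sampled at spike times
# (0 = no locking, 1 = perfect locking).
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def spike_speech_phase_locking(envelope, spike_times, fs, band=(3.0, 8.0)):
    """envelope: speech amplitude envelope sampled at fs (Hz);
    spike_times: spike times in seconds; band: theta-like rhythm band (Hz)."""
    # Band-pass the envelope around the hypothesized speech rhythm.
    b, a = butter(2, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
    theta = filtfilt(b, a, envelope)
    # Instantaneous phase of the rhythmic component.
    phase = np.angle(hilbert(theta))
    # Envelope phase at each spike time.
    idx = np.clip((np.asarray(spike_times) * fs).astype(int), 0, len(phase) - 1)
    spike_phases = phase[idx]
    # Mean resultant vector length (phase-locking value).
    return np.abs(np.mean(np.exp(1j * spike_phases)))

# Synthetic example: a 5 Hz modulated envelope with spikes biased toward the
# envelope peaks should yield a value well above chance.
fs = 1000.0
t = np.arange(0, 10, 1 / fs)
envelope = 1 + np.cos(2 * np.pi * 5 * t)
spike_times = t[np.random.rand(t.size) < 0.002 * envelope]
print(spike_speech_phase_locking(envelope, spike_times, fs))
```

Comparing such a measure across normal, slowed, and sped-up vocalizations is one way the predicted rhythm preference could appear as stronger phase locking for the normal speech rhythm.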

Public Health Relevance

In a number of communication disorders, there are deficits in audiovisual integration. Patients with schizophrenia and children with autism spectrum disorders both exhibit impaired detection of audiovisual speech sounds in noisy backgrounds, and audiovisual speech processing in dyslexics shows abnormal patterns of brain activity. By investigating the relationships between (1) audiovisual detection and temporal lobe neurophysiology and (2) audiovisual discrimination, the rhythmic nature of speech, and temporal lobe neurophysiology, we aim to understand what goes awry in these and other communication disorders.

Agency: National Institutes of Health (NIH)
Type: Research Project (R01)
Program Officer: Babcock, Debra J
Institution: Princeton University, Schools of Arts and Sciences, United States
Borjon, Jeremy I; Ghazanfar, Asif A (2014) Convergent evolution of vocal cooperation without convergent evolution of brain size. Brain Behav Evol 84:93-102
Ghazanfar, Asif A; Takahashi, Daniel Y (2014) The evolution of speech: vision, rhythm, cooperation. Trends Cogn Sci 18:543-53
Ghazanfar, Asif A; Eliades, Steven J (2014) The neurobiology of primate vocal communication. Curr Opin Neurobiol 28:128-35
Ghazanfar, Asif A; Takahashi, Daniel Y (2014) Facial expressions and the evolution of the speech rhythm. J Cogn Neurosci 26:1196-207
Ghazanfar, Asif A (2013) Multisensory vocal communication in primates and the evolution of rhythmic speech. Behav Ecol Sociobiol 67:
Ghazanfar, Asif A; Morrill, Ryan J; Kayser, Christoph (2013) Monkeys are perceptually tuned to facial expressions that exhibit a theta-like speech rhythm. Proc Natl Acad Sci U S A 110:1959-63
Hasson, Uri; Ghazanfar, Asif A; Galantucci, Bruno et al. (2012) Brain-to-brain coupling: a mechanism for creating and sharing a social world. Trends Cogn Sci 16:114-21
Shepherd, Stephen V; Lanzilotto, Marco; Ghazanfar, Asif A (2012) Facial muscle coordination in monkeys during rhythmic facial expressions and ingestive movements. J Neurosci 32:6105-16
Chandrasekaran, Chandramouli; Lemus, Luis; Trubanova, Andrea et al. (2011) Monkeys and humans share a common computation for face/voice integration. PLoS Comput Biol 7:e1002165
Ghazanfar, Asif A; Chandrasekaran, Chandramouli; Morrill, Ryan J (2010) Dynamic, rhythmic facial expressions and the superior temporal sulcus of macaque monkeys: implications for the evolution of audiovisual speech. Eur J Neurosci 31:1807-17

Showing the most recent 10 out of 16 publications