To speak so a listener understands, the speaker has to accurately produce the sounds of his or her language. While this may seem effortless for most people, speech production involves complex mental and physical processes, including the activation of speech muscles and precisely timed movements in the vocal tract (e.g., the combination of movements in the mouth, jaw, and so on). The unique properties of the individual's speech organs (e.g., size of mouth), combined with the developmental changes of these properties over a lifetime, directly influence the way speech sounds are produced by each individual. How does the human brain accomplish this feat of continually tuning the control of the vocal tract so that it always produces the sounds desired? With support from the National Science Foundation, the investigators will study how the speaking process involves the brain predicting the sensory feedback and correcting the control of the vocal tract when the feedback does not match the prediction. While previous research suggests that this prediction and correction process does occur during speaking, there is little information about how the circuitry in the brain would accomplish such a process. In the proposed research the investigators will examine the time course of neural responses to auditory feedback perturbations (brief changes in pitch, amplitude, or formant frequencies) during speaking. They will use magnetoencephalography (MEG) to record from healthy individuals and electrocorticography (ECoG) to record from epilepsy patients who have electrodes implanted in their brains to localize seizures. Both methods allow neural activity in the brain to be recorded at millisecond time resolution.

The results of these experiments will allow for the testing of different models that have been proposed to explain the neural substrate of speech motor control. The outcome of the research will help relate the control of speaking to what is known in other domains of motor control research, and lead to a more complete understanding of the control of movements in humans. The use of advanced functional neuroimaging to study the neural basis of speaking will provide a special opportunity to train and educate a wide range of graduate students, post-doctoral trainees, and medical students who will get involved in the research. The proposed research will also further the development of multi-user research facilities, especially the UCSF Biomagnetic Imaging Laboratory, which has one of a limited number of MEG scanner facilities in the US.

Project Report

How do we learn to speak? A big part of the answer is that we listen to ourselves. We need to hear the sounds that we produce, known as auditory feedback, when we first learn to speak. We know this because children born deaf don’t learn to speak unless their hearing is restored with cochlear implants. But even after we learn to speak, we need auditory feedback to maintain our ability to speak. If speakers become deaf, the pitch and loudness of their speech degrade almost immediately, and the intelligibility of their speech begins to decline. Auditory feedback, therefore, is critically important for speaking, and so if we wish to understand how the brain controls speaking, we must understand how it processes auditory feedback. Our lab has developed a model of how the brain processes auditory feedback during speaking. This model, called the state feedback control (SFC) model, posits that as we speak, our brain generates predictions of the sounds we expect to hear given what we’re intending to say, and that auditory feedback is compared with this prediction. If auditory feedback deviates from these expectations, the resulting error drives changes in our speaking that attempt to cancel the error between the auditory feedback and its predictions. In this grant, we tested our SFC model of how the brain controls speaking by determining where auditory feedback is processed in the brain. In several experiments, we had subjects produce a long, drawn-out "ah" while wearing headphones and a microphone. The experimental setup allowed us to intercept the speech at the microphone, pass it through a computer that altered how the speech sounded, and return the altered auditory feedback to the subject via the headphones in real time. In the main experiments of this grant, the computer altered the auditory feedback by briefly perturbing it: it suddenly raised or lowered the pitch of the speech.
This external perturbation caused the subject to change his/her production in a way that opposed the feedback perturbation; that is, the subject compensated for the feedback perturbation. Using this pitch perturbation procedure, we were able to clearly show auditory feedback playing a role in speech production. When auditory feedback was perturbed, this generated a rapid sequence of neural events in the brain leading to a compensatory change in speaking. We examined these neural events using two very different methods. First, we used a technique called electrocorticography (ECoG), which is the recording of electric potentials directly on the brain’s surface. These experiments were conducted in epilepsy patients who had electrode grids implanted in their brains for clinical purposes. These patients willingly participated in our pitch perturbation studies, allowing us to examine the brain activity during speech feedback compensation. Second, we used a technique called magnetoencephalography (MEG), which is the non-invasive recording of the brain’s magnetic fields. We conducted pitch perturbation experiments during MEG in healthy adult subjects, which allowed us to examine activity arising from the whole brain during speech compensation. Taken together, the ECoG and MEG recordings showed that auditory cortex (1) detected the feedback perturbations, then (2) signaled areas of motor cortex to compensate, and finally (3) received an updated prediction of what the auditory feedback would now sound like after compensation. These results supported our SFC model of how the brain produces speech. But the results also yielded some additional unexpected and exciting findings. The MEG data showed that the right hemisphere appears to play an even larger role than the left in detecting feedback perturbations and generating compensations.
We also found that the left and right cortical areas communicated quite frequently with each other in the time leading up to compensation. Finally and importantly, we found that the neural responses to the auditory feedback perturbations we created artificially in the laboratory were also present in natural, unperturbed speech. Our experimental findings will allow us to extend and refine our SFC model for speech motor control, and generate new predictions that we intend to test in subsequent studies. The project findings not only impact research on the neuroscience of speech, but also provide interesting data for the broader fields of motor control and cognitive neuroscience. The project findings may also have specific clinical impact on the understanding and treatment of voice disorders like spasmodic dysphonia. More generally, as we learn more about other communication disorders like stuttering and the hypophonic dysarthria (weak voice) that can accompany Parkinson’s disease (PD) or some subtypes of dementia, there is emerging evidence for abnormalities in auditory feedback processing in these disorders. Thus, the findings of this project may ultimately lead to targeted diagnosis and treatments for a wide range of communication disorders, which would positively affect the quality of life of many people in society.
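
As a rough illustration of the prediction-and-correction loop that the SFC model describes, the sketch below simulates a speaker holding a steady pitch while the auditory feedback is briefly shifted. It is a toy example, not the investigators' model: the function name, the target pitch, the size of the shift, and the correction gain are all hypothetical values chosen for illustration, and the predicted feedback is simply assumed to equal the intended pitch.

```python
def simulate_compensation(target_hz=200.0, shift_hz=20.0, gain=0.1,
                          n_steps=50, onset=10, offset=40):
    """Toy prediction-error loop for a pitch perturbation (illustrative only).

    All parameter values are assumptions, not figures from the project.
    Returns a list of (step, produced_hz, heard_hz) tuples.
    """
    produced = target_hz  # the speaker starts out on target
    trace = []
    for step in range(n_steps):
        # The "computer" shifts the feedback only during the perturbation window.
        perturbation = shift_hz if onset <= step < offset else 0.0
        heard = produced + perturbation

        # Assumed prediction: the speaker expects to hear the intended pitch.
        error = heard - target_hz
        trace.append((step, produced, heard))

        # The error drives a change in production that opposes it.
        produced -= gain * error
    return trace


if __name__ == "__main__":
    for step, produced, heard in simulate_compensation():
        print(f"step {step:2d}  produced {produced:7.2f} Hz  heard {heard:7.2f} Hz")
```

In this sketch an upward shift in the feedback pulls the produced pitch downward, and the produced pitch drifts back toward the target once the shift ends. The simple update rule eventually cancels the whole shift, which is a property of the toy loop and not a claim about how fully real speakers compensate.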

Funding agency: National Science Foundation (NSF)
Division: Division of Behavioral and Cognitive Sciences (BCS)
Program Officer: Akaysha Tang
Institution: University of California San Francisco, San Francisco, United States