Midbrain dopamine neurons are thought to drive associative learning by signaling reward prediction error (RPE), or actual minus expected reward. Based on dopamine RPE signaling, computational and empirical studies have produced detailed models of how reinforcement learning could be implemented in the brain. In particular, the temporal difference (TD) learning model has been a cornerstone in understanding how dopamine RPEs could drive associative learning. Classically, TD learning imparts value to features that serially track the passage of elapsed time relative to observable stimuli. In the real world, however, sensory stimuli provide ambiguous information about the hidden state of the environment, leading to the proposal that TD learning might instead operate over an inferred distribution of hidden states (a "belief state"). Although this hypothesis has gained traction in theories of reinforcement learning, empirical evidence is lacking. To test this hypothesis in Aim 1, dopamine neurons will be recorded while mice perform either of two novel classical conditioning tasks. In both tasks, the timing of reward delivery relative to the conditioned stimulus is varied across trials. In the first task, reward is always given. In the second task, reward is occasionally omitted. Preliminary data show a striking difference in dopamine signaling between these two tasks, which is well explained by a model that incorporates the animal's intra-trial inference that reward may be omitted in the second task. These preliminary results provide evidence in favor of an associative learning rule that combines cached values with hidden state inference.
Aim 2 then seeks to understand which cortical regions shape hidden state inference in the dopamine system. This Aim will consist of cortical electrophysiology (Aim 2a) and chemogenetic cortical inactivation (Aim 2b) as mice perform the classical conditioning tasks described above. The results of this proposal will provide critical experimental data towards understanding how reinforcement learning is actually implemented in the brain. This has broad relevance to both basic and translational science. In the healthy brain, robust reinforcement learning ensures that animals can maximize rewards within their environments. In the diseased brain, reinforcement learning may also play an important role. For instance, addiction has been cast as an example of maladaptive and destructive reinforcement learning. Aberrant dopamine signaling in schizophrenia is thought to underlie the reinforcement of "positive" symptoms such as auditory hallucination. Therefore, examining the regulation of dopamine signaling and constructing a more accurate model of reinforcement learning is of great importance in understanding both the healthy and diseased brain.
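
As a rough illustration of the ideas in the abstract, the sketch below shows a TD reward prediction error computed over a belief state, together with a Bayesian update of the belief that reward has been omitted as time passes without reward. It is a minimal sketch under stated assumptions, not the model used in the proposal: the function names, the linear value function, the learning rate, the discount factor, and the omission probability are all hypothetical choices made for illustration.

```python
def belief_value(weights, belief):
    """Value of a belief state: weighted sum over hidden-state probabilities.

    Assumes a linear value function, which is an illustrative choice.
    """
    return sum(w * b for w, b in zip(weights, belief))


def td_error(reward, value_now, value_next, gamma=0.9):
    """Reward prediction error in TD form: r + gamma * V(next) - V(now)."""
    return reward + gamma * value_next - value_now


def td_update(weights, belief, next_belief, reward, alpha=0.1, gamma=0.9):
    """One TD(0) step over belief states; returns (new_weights, rpe).

    alpha (learning rate) and gamma (discount) are hypothetical defaults.
    """
    delta = td_error(
        reward,
        belief_value(weights, belief),
        belief_value(weights, next_belief),
        gamma,
    )
    # Credit is assigned to each hidden state in proportion to its probability.
    new_weights = [w + alpha * delta * b for w, b in zip(weights, belief)]
    return new_weights, delta


def omission_belief(p_omit, cdf_reward_by_now):
    """Posterior probability of an omission trial, given no reward so far.

    p_omit: prior probability that reward is omitted (hypothetical value).
    cdf_reward_by_now: probability that reward, if scheduled, would already
        have been delivered by the current time in the trial.
    """
    no_reward_yet = p_omit + (1.0 - p_omit) * (1.0 - cdf_reward_by_now)
    if no_reward_yet == 0.0:
        raise ValueError("observing no reward has zero probability here")
    return p_omit / no_reward_yet


if __name__ == "__main__":
    # Belief over two hidden states: [waiting for reward, omission trial].
    weights = [0.0, 0.0]
    weights, rpe = td_update(weights, [1.0, 0.0], [0.0, 1.0], reward=1.0)
    print("RPE:", rpe, "weights:", weights)
    # The belief in omission grows as the trial goes on without reward.
    for cdf in (0.0, 0.5, 0.9):
        print(cdf, omission_belief(0.1, cdf))
```

When reward is always delivered (`p_omit = 0`), the omission belief stays at zero throughout the trial, so in this sketch only the second task produces a belief that shifts within a trial.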

Public Health Relevance

Dysfunction of the midbrain dopamine system has been implicated in neuropsychiatric pathologies, such as depression, anxiety, addiction, and schizophrenia, yet therapeutic strategies for these conditions lack efficiency and specificity. In this proposal, we aim to more fully understand dopamine signaling in the healthy brain. By doing this, we can gain insight into the mechanistic underpinnings of conditions involving aberrant dopamine signaling, which is needed to develop more effective treatment strategies for neuropsychiatric disease.

Agency
National Institutes of Health (NIH)
Institute
National Institute of Mental Health (NIMH)
Type
Individual Predoctoral NRSA for M.D./Ph.D. Fellowships (ADAMHA) (F30)
Project #
5F30MH112242-02
Application #
9526911
Study Section
Special Emphasis Panel (ZRG1)
Program Officer
Van'T Veer, Ashlee V
Project Start
2017-07-01
Project End
2019-04-30
Budget Start
2018-07-01
Budget End
2019-04-30
Support Year
2
Fiscal Year
2018
Total Cost
Indirect Cost
Name
Harvard Medical School
Department
Internal Medicine/Medicine
Type
Schools of Medicine
DUNS #
047006379
City
Boston
State
MA
Country
United States
Zip Code
Starkweather, Clara Kwon; Gershman, Samuel J; Uchida, Naoshige (2018) The Medial Prefrontal Cortex Shapes Dopamine Reward Prediction Errors under State Uncertainty. Neuron 98:616-629.e6