The overall goal of this project is to develop a reinforcement learning (RL) theory of motivation, understood here as motivational salience, and to test the conclusions of this theory using experimental observations obtained in the ventral pallidum (VP). Animals' actions depend on the shifting values of internal demands determined by physiological or behavioral conditions, such as thirst, hunger, addiction, or specific nutrient deficiency. These need-based modulations of the perceived values of reinforcements (reward or punishment) are described by a mathematical variable called motivational salience or, simply, motivation. Including motivation adds a new level of complexity to RL theory and allows it to generate flexible ongoing behaviors. Here, we will investigate how motivation can be learned by neuronal networks to generate complex adaptive behaviors, and we will compare the conclusions of our theory with observations from VP circuits. Previous studies indicate that the VP plays an important role in a variety of behaviors, potentially by influencing motivational salience. In vivo recordings suggest that VP neuron firing correlates with motivational states. Lesions and pharmacological and optogenetic manipulations of the VP cause profound changes in behaviors motivated by natural rewards or drugs of addiction. Dysfunction of this structure is linked to depression and drug addiction in humans. Our theoretical results suggest that distinct classes of neurons in the VP should play essential roles in representing either positive or negative motivational states. We further hypothesize that local functional interactions within the VP are critical for generating the signals that guide motivated behaviors. Consistent with predictions of RL theory, in our preliminary studies, we found that individual VP neurons could be classified as either positive or negative 'motivation neurons', as the activities of these neurons represented both expected values of outcomes and motivational states.
When population activity is considered, representations of outcome expectation can be distinguished from representations of motivation, which fluctuate according to the animals' physiological states. Based on the preliminary data, we devised an integrated approach, combining computational analysis and theory (Koulakov lab) with advanced molecular genetic tools, optogenetics, chemogenetics, electrophysiology, and imaging in behaving mice (Li lab), to test our hypotheses through the following Aims:
Aim 1. To develop methods for identifying motivation in the population activity of VP neurons. Here we will use novel behavioral and computational methods to disambiguate representations of motivation and outcome expectation in neuronal responses.
Aim 2. To develop a reinforcement learning theory of motivation and to test its predictions using responses of VP neurons. Here we will develop a Q-learning theory of motivation and compare networks trained using this theory to responses of VP neurons.
Aim 3. To identify the circuit basis of representations of motivation in VP neuronal populations. We will identify the network structure in Q-learning networks with motivation, and test predictions using opto- and chemogenetic manipulations in VP.
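To make the idea of Q-learning with motivation concrete, the following is a minimal tabular sketch and not the model proposed in the Aims. It assumes, purely for illustration, that motivation scales the reward multiplicatively, that there are two motivational states (thirsty, hungry) and two actions, and that the motivational state is resampled at random after every step. All names and numbers (STATES, ACTIONS, BASE_REWARD, MOTIVATION, train, greedy_policy) are hypothetical.

```python
import random

# Hypothetical toy setup (an assumption, not taken from the proposal).
STATES = ("thirsty", "hungry")          # motivational states
ACTIONS = ("seek_water", "seek_food")   # available actions

# Nominal value of each outcome, independent of need (assumed).
BASE_REWARD = (1.0, 1.0)
# MOTIVATION[s][a]: assumed multiplicative weight of outcome a in state s.
MOTIVATION = ((1.0, 0.1),   # thirsty: water is salient, food is not
              (0.1, 1.0))   # hungry: food is salient, water is not


def train(steps=20000, alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning in which the reward is modulated by motivation."""
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in STATES]
    s = rng.randrange(len(STATES))
    for _ in range(steps):
        # Epsilon-greedy action selection.
        if rng.random() < epsilon:
            a = rng.randrange(len(ACTIONS))
        else:
            a = max(range(len(ACTIONS)), key=lambda i: q[s][i])
        # Perceived reward = motivation weight * nominal reward (assumption).
        r = MOTIVATION[s][a] * BASE_REWARD[a]
        # The motivational state is resampled at random (assumption).
        s_next = rng.randrange(len(STATES))
        # Standard Q-learning temporal-difference update.
        td_target = r + gamma * max(q[s_next])
        q[s][a] += alpha * (td_target - q[s][a])
        s = s_next
    return q


def greedy_policy(q):
    """Return the highest-valued action name for each motivational state."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda i: row[i])] for row in q]


q_table = train()
policy = dict(zip(STATES, greedy_policy(q_table)))
print(policy)
```

In this sketch the learned action values depend on the motivational state, so the same outcome is pursued or ignored depending on the current need. A network-based version, as described in Aim 2, would replace the table with a function approximator that takes motivation as an input.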
The neural mechanisms of motivated behaviors remain unclear. In the proposed research program, we will determine the precise circuit mechanisms and computations by which neurons in the ventral pallidum participate in modulating motivated behaviors. Findings from this project will have important clinical implications, as impairments in motivational processes are core features of depression and drug addiction.