Various neuropsychiatric conditions lead to failures in generating accurate models of the reward environment or to an inability to use those models to guide flexible behavior, very often manifesting as impaired reversal learning. The anterior cingulate cortex (ACC) and the orbitofrontal cortex (OFC) are frontocortical regions important for flexible reinforcement learning and have been theorized to work in a hierarchy of parallel processes for reward-based choice. In OFC, there is priority encoding of lower-level attributes such as the reward-predictive value of sensory cues, the palatability of specific rewards, and the current stimulus-reward mappings relevant to behavior. In ACC, these variables are thought to be multiplexed for higher-level computations of reward prediction error (RPE) and confidence/uncertainty of predictions, which are used to monitor performance and update behavioral strategies when necessary (particularly overall trial strategy following positive feedback, i.e., WinStay). These computations may depend upon propagation of spikes from OFC to ACC. However, it remains poorly understood how flexible reward learning is mediated by interactions between OFC and ACC. Here we will investigate this question using a robust animal model of adaptive learning under uncertainty: stimulus-based probabilistic reversal learning (PRL). In freely behaving rats, we will use a combination of in vivo 1-photon calcium imaging and electrophysiology, chemogenetics, and closed-loop neural control of reward delivery to examine how OFC and ACC regulate PRL. Using technology that we have recently developed for online decoding of calcium activity, we will regulate reward delivery based upon neural activity in ACC and OFC, a novel strategy for testing whether flexible reward learning depends upon accurate neural representations in these frontocortical areas.
To date, we have: demonstrated effective DREADDs manipulation in vivo and in transduced cortical slices; designed and tested custom electrode arrays to perform chronic in vivo electrophysiological recordings in these areas simultaneously; and imaged ensemble activity time-locked to behavior, which has proven stable over multiple sessions and is therefore ideal for studying learning. Using these technical advances as a platform, we propose, across two Aims, to identify the precise cortico-cortical mechanisms for encoding variables in flexible reinforcement learning. Collectively, these experiments will: 1) shed new light on the signaling signatures of cortical regions and their respective roles in flexible reinforcement learning, 2) accelerate groundbreaking closed-loop experiments in which reversal learning is controlled in real time using decoded neural expectation, and 3) yield signals that can eventually be compared in animal models of psychopathology, which show known failures in reversal learning. These novel and unconventional approaches make the R21 mechanism ideal for the proposed work.
Many neuropsychiatric disorders are typified by failures in using models of the reward environment to flexibly change behavior. We outline a plan to discover frontocortical signatures and interactions underlying this learning.