Neural signatures of model-based and model-free reinforcement learning across prefrontal cortex and striatum.

Miranda B., Butler JL., Malalasekera WMN., Behrens TEJ., Dayan P., Kennerley SW.

Animals integrate knowledge about how the state of the environment evolves to choose actions that maximise reward. Such goal-directed behaviour, or model-based (MB) reinforcement learning (RL), can flexibly adapt choice to changes and is thus distinct from simpler habitual, or model-free (MF), RL strategies. Previous inactivation and neuroimaging work implicates prefrontal cortex (PFC) and the caudate region of the striatum in MB-RL; however, little is known about its implementation at the single-neuron level. Here, we recorded from two PFC regions, the dorsal anterior cingulate cortex (ACC) and dorsolateral PFC (DLPFC), and two striatal regions, caudate and putamen, while two rhesus macaques performed a sequential decision-making (two-step) task in which MB-RL involves knowledge about the statistics of reward and state transitions. All four regions, but particularly the ACC, encoded the rewards received and tracked the probabilistic state transitions that occurred. However, ACC (and to a lesser extent caudate) encoded the key variables of the task, namely the interaction between reward, transition, and choice, which underlies MB decision-making. ACC and caudate neurons also encoded MB-derived estimates of choice values. Moreover, caudate value estimates of the choice options flipped when a rare transition occurred, demonstrating a value update based on structural knowledge of the task. The striatal regions were unique (relative to PFC) in encoding the current and previous rewards with opposing polarities, reminiscent of dopaminergic neurons and indicative of an MF prediction error. Our findings provide a deeper understanding of selective and temporally dissociable neural mechanisms underlying goal-directed behaviour.

Original publication

DOI

10.7554/eLife.106032

Type

Journal article

Publication Date

2026-06-22

Keywords

anterior cingulate cortex, caudate nucleus, decision-making, neuroscience, reinforcement learning, rhesus macaque, value encoding, Animals, Macaca mulatta, Prefrontal Cortex, Decision Making, Reward, Corpus Striatum, Male, Reinforcement Machine Learning, Neurons, Reinforcement, Psychology