Midbrain dopamine neurons compute inferred and cached value prediction errors in a common framework

Abstract

Midbrain dopamine neurons have been proposed to signal reward prediction errors as defined in temporal difference (TD) learning algorithms. While these models have been extremely powerful in interpreting dopamine activity, they typically do not use value derived through inference in computing errors. This is important because much real-world behavior – and thus many opportunities for error-driven learning – is based on such predictions. Here, we show that error-signaling rat dopamine neurons respond to the inferred, model-based value of cues that have not been paired with reward and do so in the same framework as they track the putative cached value of cues previously paired with reward. This suggests that dopamine neurons access a wider variety of information than contemplated by standard TD models and that, while their firing conforms to predictions of TD models in some cases, they may not be restricted to signaling errors from TD predictions.

DOI: http://dx.doi.org/10.7554/eLife.13665.001
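
The TD prediction error mentioned in the abstract can be illustrated with a minimal sketch. It assumes a hypothetical trial in which a single cue is followed by reward, a pre-cue value fixed at zero (the cue arrives unpredictably), and illustrative parameters alpha, gamma, and reward that are not taken from the article. It is a textbook-style TD(0) update, not the authors' model.

```python
def td_error(reward, value_next, value_current, gamma=1.0):
    """TD prediction error: delta = r + gamma * V(s') - V(s)."""
    return reward + gamma * value_next - value_current


def simulate_cue_reward(n_trials=200, alpha=0.1, gamma=1.0, reward=1.0):
    """Learn a cached value for one cue that is always followed by reward.

    Assumptions (illustrative only): the state before the cue has value 0,
    and the state after reward is terminal with value 0.
    Returns the final cue value and a list of (cue error, reward error) pairs.
    """
    v_cue = 0.0
    history = []
    for _ in range(n_trials):
        # Error at cue onset: no reward yet, transition from pre-cue to cue.
        delta_cue = td_error(0.0, v_cue, 0.0, gamma)
        # Error at reward delivery: transition from cue to terminal state.
        delta_reward = td_error(reward, 0.0, v_cue, gamma)
        # Cached value of the cue is updated from the reward-time error.
        v_cue += alpha * delta_reward
        history.append((delta_cue, delta_reward))
    return v_cue, history


if __name__ == "__main__":
    v_cue, history = simulate_cue_reward()
    print("first trial errors (cue, reward):", history[0])
    print("last trial errors (cue, reward):", history[-1])
```

Under these assumptions the error starts at the time of reward and, as the cue's cached value grows, shifts to cue onset, which is the pattern TD accounts of dopamine firing are usually taken to predict.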

eLife digest

Learning is driven by discrepancies between what we think is going to happen and what actually happens. These discrepancies, or ‘prediction errors’, trigger changes in the brain that support learning. These errors are signaled by neurons in the midbrain – called dopamine neurons – that fire rapidly in response to unexpectedly good events, and thereby instruct other parts of the brain to learn about the factors that occurred before the event. These events can be rewards, such as food, or cues that have predicted rewards in the past.

Yet we often anticipate, or infer, rewards even if we have not experienced them directly in a given situation. This inference reflects our ability to mentally simulate likely outcomes or consequences of our actions in new situations based upon, but going beyond, our previous experiences. These inferred predictions of reward can alter error-based learning just like predictions based upon direct experience; but do inferred reward predictions also alter the error signals from dopamine neurons?

Sadacca et al. tested this question by exposing rats to cues while recording the activity of dopamine neurons from the rats’ midbrains. In some cases, the cues directly predicted rewards based on the rats’ previous experience; in other cases, the cues predicted rewards only indirectly and based on inference. Sadacca et al. found that the dopamine neurons fired in similar ways in response to the cues in both of these situations. This result is consistent with the proposal that dopamine neurons use both types of information to calculate errors in predictions. These findings provide a mechanism by which dopamine neurons could support a much broader and more complex range of learning than previously thought.

DOI: http://dx.doi.org/10.7554/eLife.13665.002
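
The distinction between a directly experienced (cached) prediction and an inferred one can be sketched as follows. The cue names A, B, C, D, the transition table, and the values are hypothetical and chosen only for illustration: B has been paired with reward, A has only ever been followed by B, and C and D were never associated with reward. This is a toy one-step lookahead, not the analysis used in the article.

```python
# Cached values: learned only from direct cue-reward pairings (hypothetical).
CACHED = {"A": 0.0, "B": 1.0, "C": 0.0, "D": 0.0}

# Learned cue-to-cue transition model (hypothetical): A -> B, C -> D.
MODEL = {"A": {"B": 1.0}, "C": {"D": 1.0}}


def cached_value(cue, cached):
    """Value available from direct experience with reward alone."""
    return cached.get(cue, 0.0)


def one_step_inferred_value(cue, cached, model, gamma=1.0):
    """Value inferred by looking one step ahead through the transition model."""
    successors = model.get(cue, {})
    return sum(prob * gamma * cached.get(nxt, 0.0)
               for nxt, prob in successors.items())


def cue_onset_error(value, baseline=0.0):
    """Prediction error at cue onset, assuming a pre-cue value of `baseline`."""
    return value - baseline


if __name__ == "__main__":
    for cue in ("A", "C"):
        cached_err = cue_onset_error(cached_value(cue, CACHED))
        inferred_err = cue_onset_error(one_step_inferred_value(cue, CACHED, MODEL))
        print(cue, "cached-only error:", cached_err, "inferred error:", inferred_err)
```

In this sketch a purely cached scheme predicts no error to cue A, whereas the lookahead scheme predicts a positive error to A but not to C, which is the kind of difference the digest describes.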

Author and article information

Contributors

Timothy EJ Behrens: Role: Reviewing editor

Journal

Journal ID (nlm-ta): eLife

Journal ID (iso-abbrev): Elife

Journal ID (hwp): eLife

Journal ID (publisher-id): eLife

Title: eLife

Publisher: eLife Sciences Publications, Ltd

ISSN (Electronic): 2050-084X

Publication date (Electronic, pub): 07 March 2016

Publication date Collection: 2016

Volume: 5

Electronic Location Identifier: e13665

Affiliations

[1] Intramural Research program of the National Institute on Drug Abuse, National Institutes of Health, Bethesda, United States

[2] Department of Anatomy and Neurobiology, University of Maryland School of Medicine, Baltimore, United States

[3] Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, United States

[4] University College London, United Kingdom

[5] University College London, United Kingdom

Author notes

brian.sadacca@nih.gov (BFS);

geoffrey.schoenbaum@nih.gov (GS)

Article

Publisher ID: 13665

DOI: 10.7554/eLife.13665

PMC ID: 4805544

PubMed ID: 26949249

SO-VID: e608491e-56d0-4a48-9801-a706bdd56450

License:

This is an open-access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.

History

Date received: 09 December 2015

Date accepted: 03 March 2016

Funding

Funded by: FundRef http://dx.doi.org/10.13039/100000026, National Institute on Drug Abuse;

Award ID: IRP

Award Recipient: Geoffrey Schoenbaum

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Custom metadata

elife-xml-version: 2.5

Author impact statement: Midbrain dopamine neurons in rats signal discrepancies between predicted and actual rewards, regardless of whether the rewards are predicted on the basis of experience or inference.

ScienceOpen disciplines: Life sciences

Keywords: dopamine, prediction error, rat, single unit
