      Is Open Access

      Differential effects of reward and punishment in decision making under uncertainty: a computational study



          Computational models of learning have proved largely successful in characterizing potential mechanisms that allow humans to make decisions in uncertain and volatile contexts. We report here findings that extend existing knowledge and show that a modified reinforcement learning model, which has separate parameters according to whether the previous trial gave a reward or a punishment, can provide the best fit to human behavior in decision making under uncertainty. More specifically, we examined the fit of our modified reinforcement learning model to human behavioral data in a probabilistic two-alternative decision making task with rule reversals. Our results demonstrate that this model predicted human behavior better than a series of other models based on reinforcement learning or Bayesian reasoning. Unlike the Bayesian models, our modified reinforcement learning model does not include any representation of rule switches. When our task is considered purely as a machine learning task, to gain as many rewards as possible without trying to describe human behavior, the performance of modified reinforcement learning and Bayesian methods is similar. Others have used various computational models to describe human behavior in similar tasks; however, we are not aware of any that have compared Bayesian reasoning with reinforcement learning modified to differentiate rewards and punishments.
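The core idea of such a modified reinforcement learning model can be sketched as a Q-learning-style update whose learning rate depends on the valence of the previous outcome. The sketch below is illustrative only, assuming a two-alternative task with outcomes coded +1 (reward) and -1 (punishment); the names `alpha_reward`, `alpha_punish`, and `beta` are assumptions for this sketch, not parameters taken from the paper.

```python
import math
import random

def update_q(q, action, outcome, alpha_reward, alpha_punish):
    """One value update with an outcome-dependent learning rate.

    q: dict mapping action -> current value estimate
    outcome: +1 for a reward, -1 for a punishment
    """
    alpha = alpha_reward if outcome > 0 else alpha_punish
    q[action] += alpha * (outcome - q[action])
    return q

def softmax_choice(q, beta, rng):
    """Choose between actions "A" and "B" with a softmax (logistic) rule."""
    p_a = 1.0 / (1.0 + math.exp(-beta * (q["A"] - q["B"])))
    return "A" if rng.random() < p_a else "B"

# Example: a reward updates values faster than a punishment when
# alpha_reward > alpha_punish.
q = {"A": 0.0, "B": 0.0}
update_q(q, "A", +1, alpha_reward=0.5, alpha_punish=0.1)
update_q(q, "A", -1, alpha_reward=0.5, alpha_punish=0.1)
```

Fitting such a model to behavior would then amount to choosing the two learning rates and the softmax temperature that maximize the likelihood of a participant's choice sequence.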


          Most cited references (18)


          The neural basis of loss aversion in decision-making under risk.

          People typically exhibit greater sensitivity to losses than to equivalent gains when making decisions. We investigated neural correlates of loss aversion while individuals decided whether to accept or reject gambles that offered a 50/50 chance of gaining or losing money. A broad set of areas (including midbrain dopaminergic regions and their targets) showed increasing activity as potential gains increased. Potential losses were represented by decreasing activity in several of these same gain-sensitive areas. Finally, individual differences in behavioral loss aversion were predicted by a measure of neural loss aversion in several regions, including the ventral striatum and prefrontal cortex.
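The behavioral asymmetry described here is often summarized with a value function in which losses are weighted more heavily than equivalent gains. A minimal sketch, assuming a linear value function and a hypothetical loss-aversion coefficient `lam` (λ ≈ 2 is a commonly cited ballpark, not a figure from this study):

```python
def subjective_value(x, lam=2.0):
    """Piecewise-linear value: losses weighted lam times as much as gains."""
    return x if x >= 0 else lam * x

def accept_gamble(gain, loss, lam=2.0):
    """Accept a 50/50 gamble iff its expected subjective value is positive."""
    ev = 0.5 * subjective_value(gain, lam) + 0.5 * subjective_value(-loss, lam)
    return ev > 0
```

Under λ = 2, a 50/50 gamble offering +$10 / -$6 is rejected even though its expected monetary value is positive, which is the qualitative pattern the study's participants exhibit.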

            Distinct roles for direct and indirect pathway striatal neurons in reinforcement

            Dopamine signaling is implicated in reinforcement learning, but the neural substrates targeted by dopamine are poorly understood. Here, we bypassed dopamine signaling itself and tested how optogenetic activation of dopamine D1- or D2-receptor-expressing striatal projection neurons influenced reinforcement learning in mice. Stimulating D1-expressing neurons induced persistent reinforcement, whereas stimulating D2-expressing neurons induced transient punishment, demonstrating that activation of these circuits is sufficient to modify the probability of performing future actions.

              Dynamic dopamine modulation in the basal ganglia: a neurocomputational account of cognitive deficits in medicated and nonmedicated Parkinsonism.

               M. Frank (2004)
              Dopamine (DA) depletion in the basal ganglia (BG) of Parkinson's patients gives rise to both frontal-like and implicit learning impairments. Dopaminergic medication alleviates some cognitive deficits but impairs those that depend on intact areas of the BG, apparently due to DA "overdose." These findings are difficult to accommodate with verbal theories of BG/DA function, owing to complexity of system dynamics: DA dynamically modulates function in the BG, which is itself a modulatory system. This article presents a neural network model that instantiates key biological properties and provides insight into the underlying role of DA in the BG during learning and execution of cognitive tasks. Specifically, the BG modulates the execution of "actions" (e.g., motor responses and working memory updating) being considered in different parts of the frontal cortex. Phasic changes in DA, which occur during error feedback, dynamically modulate the BG threshold for facilitating/suppressing a cortical command in response to particular stimuli. Reduced dynamic range of DA explains Parkinson and DA overdose deficits with a single underlying dysfunction, despite overall differences in raw DA levels. Simulated Parkinsonism and medication effects provide a theoretical basis for behavioral data in probabilistic classification and reversal tasks. The model also provides novel testable predictions for neuropsychological and pharmacological studies, and motivates further investigation of BG/DA interactions with the prefrontal cortex in working memory.
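The gating mechanism described above can be caricatured in a few lines: phasic DA bursts strengthen a Go pathway, DA dips strengthen a NoGo pathway, and a cortical command is facilitated only when Go outweighs NoGo. This is a deliberately crude sketch of the idea, not Frank's network; `da_burst` and `da_dip` are hypothetical scaling parameters standing in for the dynamic range of phasic DA.

```python
def train_gating(feedback_seq, da_burst=1.0, da_dip=1.0, lr=0.1):
    """Caricature of Go/NoGo pathway learning.

    Positive feedback -> phasic DA burst -> strengthen Go (D1) pathway.
    Negative feedback -> phasic DA dip  -> strengthen NoGo (D2) pathway.
    Shrinking da_burst/da_dip mimics a reduced dynamic range of DA.
    """
    go, nogo = 0.0, 0.0
    for fb in feedback_seq:
        if fb > 0:
            go += lr * da_burst
        else:
            nogo += lr * da_dip
    return go, nogo

def gate(go, nogo):
    """Facilitate the considered cortical command only if Go outweighs NoGo."""
    return go > nogo
```

Shrinking the phasic scaling parameters slows learning from the corresponding feedback, which is one way to picture how a single underlying change in DA dynamic range could produce the distinct deficit patterns the model simulates.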

                Author and article information

                Frontiers in Neuroscience
                Frontiers Media S.A.
                21 February 2014
                Volume 8
                1School of Computing, University of Leeds, Leeds, West Yorkshire, UK
                2Neuroscience and Psychiatry Unit, University of Manchester, Manchester, UK
                3School of Business, Monash University, Bandar Sunway, Malaysia
                Author notes

                Edited by: Peter Bossaerts, École Polytechnique Fédérale de Lausanne, Switzerland

                Reviewed by: Daniel McNamee, California Institute of Technology, USA; Agnieszka A. Tymula, University of Sydney, Australia

                *Correspondence: Marc de Kamps, School of Computing, University of Leeds, Leeds, West Yorkshire LS2 9JT, UK e-mail: m.deKamps@

                This article was submitted to Decision Neuroscience, a section of the journal Frontiers in Neuroscience.

                Copyright © 2014 Duffin, Bland, Schaefer and de Kamps.

                This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

                Page count
                Figures: 9, Tables: 2, Equations: 16, References: 31, Pages: 13, Words: 10922
                Original Research Article

