      Goal-Directed and Habit-Like Modulations of Stimulus Processing during Reinforcement Learning

      research-article

          Abstract

          Recent research has shown that perceptual processing of stimuli previously associated with high-value rewards is automatically prioritized even when rewards are no longer available. It has been hypothesized that such reward-related modulation of stimulus salience is conceptually similar to an “attentional habit.” Recording event-related potentials in humans during a reinforcement learning task, we show strong evidence in favor of this hypothesis. Resistance to outcome devaluation (the defining feature of a habit) was shown by the stimulus-locked P1 component, reflecting activity in the extrastriate visual cortex. Analysis at longer latencies revealed a positive component (corresponding to the P3b, from 550–700 ms) sensitive to outcome devaluation. Therefore, distinct spatiotemporal patterns of brain activity were observed corresponding to habitual and goal-directed processes. These results demonstrate that reinforcement learning engages both attentional habits and goal-directed processes in parallel. Consequences for brain and computational models of reinforcement learning are discussed.

          Significance Statement

          The human attentional network adapts to detect stimuli that predict important rewards. A recent hypothesis suggests that the visual cortex automatically prioritizes reward-related stimuli, driven by cached representations of reward value; that is, stimulus–response habits. Alternatively, the neural system may track the current value of the predicted outcome. Our results demonstrate for the first time that visual cortex activity is increased for reward-related stimuli even when the rewarding event is temporarily devalued. In contrast, longer-latency brain activity was specifically sensitive to transient changes in reward value. Therefore, we show that both habit-like attention and goal-directed processes occur in the same learning episode at different latencies. This result has important consequences for computational models of reinforcement learning.

          Author and article information

          Journal
          J Neurosci
          The Journal of Neuroscience
          Society for Neuroscience
          0270-6474
          1529-2401
          Published: 15 March 2017
          Volume: 37
          Issue: 11
          Pages: 3009-3017
          Affiliations
          [1]School of Psychology, UNSW Australia, Sydney, New South Wales 2052, Australia
          Author notes
          Correspondence should be addressed to David Luque, School of Psychology, UNSW Australia, Sydney, New South Wales 2052, Australia. d.luque@unsw.edu.au

          Author contributions: D.L., T.B., R.W.M., O.G., T.J.W., and M.E.L.P. designed research; D.L. and B.N.J. performed research; D.L. analyzed data; D.L., T.B., R.W.M., B.N.J., O.G., T.J.W., and M.E.L.P. wrote the paper.

          Author information
          http://orcid.org/0000-0002-3457-9204
          http://orcid.org/0000-0002-5018-1239
          http://orcid.org/0000-0002-9833-9998
          Article
          PMCID: PMC6596732
          Publisher ID: 3205-16
          DOI: 10.1523/JNEUROSCI.3205-16.2017
          PMID: 28193692
          Copyright © 2017 the authors 0270-6474/17/373009-09$15.00/0
          History
          Received: 16 October 2016
          Revised: 5 January 2017
          Accepted: 31 January 2017
          Categories
          Research Articles
          Behavioral/Cognitive

          Keywords: attention, event-related potentials, reward, learning, habit, goal-directed
