      Is Open Access

      Active inference and learning

      review-article


          Highlights

          • Optimal behaviour is quintessentially belief based.

          • Behaviour can be described as optimising expected free energy.

          • Expected free energy entails pragmatic and epistemic value.

          • Habits are learned by observing one’s own goal directed behaviour.

          • Habits are then selected online during active inference.

          Abstract

          This paper offers an active inference account of choice behaviour and learning. It focuses on the distinction between goal-directed and habitual behaviour and how they contextualise each other. We show that habits emerge naturally (and autodidactically) from sequential policy optimisation when agents are equipped with state-action policies. In active inference, behaviour has explorative (epistemic) and exploitative (pragmatic) aspects that are sensitive to ambiguity and risk respectively, where epistemic (ambiguity-resolving) behaviour enables pragmatic (reward-seeking) behaviour and the subsequent emergence of habits. Although goal-directed and habitual policies are usually associated with model-based and model-free schemes, we find the more important distinction is between belief-free and belief-based schemes. The underlying (variational) belief updating provides a comprehensive (if metaphorical) process theory for several phenomena, including the transfer of dopamine responses, reversal learning, habit formation and devaluation. Finally, we show that active inference reduces to a classical (Bellman) scheme, in the absence of ambiguity.
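The split of behaviour into a risk-sensitive (pragmatic) part and an ambiguity-sensitive (epistemic) part can be illustrated with a minimal sketch. It assumes a one-step, discrete-state setting in which expected free energy is written as risk (the divergence between predicted and preferred outcomes) plus ambiguity (the expected entropy of the outcome likelihood). The function name, argument names and toy numbers are hypothetical and are not taken from the paper.

```python
import math

def expected_free_energy(A, qs, C):
    """One-step expected free energy G = risk + ambiguity (names are hypothetical).

    A  : likelihood P(o|s) as a list of rows, indexed A[o][s]
    qs : predicted state probabilities Q(s|pi) under a policy
    C  : prior preferences P(o) over outcomes (must be > 0 where outcomes are predicted)
    """
    n_o, n_s = len(A), len(qs)
    # Predicted outcomes: Q(o|pi) = sum_s P(o|s) Q(s|pi)
    qo = [sum(A[o][s] * qs[s] for s in range(n_s)) for o in range(n_o)]
    # Risk: KL[Q(o|pi) || P(o)]
    risk = sum(q * math.log(q / C[o]) for o, q in enumerate(qo) if q > 0)
    # Ambiguity: expected entropy of P(o|s) under Q(s|pi)
    ambiguity = -sum(
        qs[s] * A[o][s] * math.log(A[o][s])
        for s in range(n_s) for o in range(n_o) if A[o][s] > 0
    )
    return risk + ambiguity, risk, ambiguity

precise = [[1.0, 0.0], [0.0, 1.0]]  # outcomes identify the hidden state
vague = [[0.5, 0.5], [0.5, 0.5]]    # outcomes carry no information about the state
qs = [0.5, 0.5]
C = [0.5, 0.5]                      # flat preferences, so risk is zero in both cases

G_precise, _, amb_precise = expected_free_energy(precise, qs, C)
G_vague, _, amb_vague = expected_free_energy(vague, qs, C)
```

In this toy case the precise likelihood has zero ambiguity, so only the risk term can distinguish policies. That is the regime in which the abstract states that active inference reduces to a classical (Bellman) scheme. The vague likelihood adds an ambiguity cost of log 2, so a policy leading to such states scores worse under this sketch.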

          Most cited references (67)

          Is Open Access

          Life as we know it

          This paper presents a heuristic proof (and simulations of a primordial soup) suggesting that life—or biological self-organization—is an inevitable and emergent property of any (ergodic) random dynamical system that possesses a Markov blanket. This conclusion is based on the following arguments: if the coupling among an ensemble of dynamical systems is mediated by short-range forces, then the states of remote systems must be conditionally independent. These independencies induce a Markov blanket that separates internal and external states in a statistical sense. The existence of a Markov blanket means that internal states will appear to minimize a free energy functional of the states of their Markov blanket. Crucially, this is the same quantity that is optimized in Bayesian inference. Therefore, the internal states (and their blanket) will appear to engage in active Bayesian inference. In other words, they will appear to model—and act on—their world to preserve their functional and structural integrity, leading to homoeostasis and a simple form of autopoiesis.

            Bayesian surprise attracts human attention.

            We propose a formal Bayesian definition of surprise to capture subjective aspects of sensory information. Surprise measures how data affects an observer, in terms of differences between posterior and prior beliefs about the world. Only data observations which substantially affect the observer's beliefs yield surprise, irrespectively of how rare or informative in Shannon's sense these observations are. We test the framework by quantifying the extent to which humans may orient attention and gaze towards surprising events or items while watching television. To this end, we implement a simple computational model where a low-level, sensory form of surprise is computed by simple simulated early visual neurons. Bayesian surprise is a strong attractor of human attention, with 72% of all gaze shifts directed towards locations more surprising than the average, a figure rising to 84% when focusing the analysis onto regions simultaneously selected by all observers. The proposed theory of surprise is applicable across different spatio-temporal scales, modalities, and levels of abstraction.

              From the ventral to the dorsal striatum: devolving views of their roles in drug addiction.

              We revisit our hypothesis that drug addiction can be viewed as the endpoint of a series of transitions from initial voluntary drug use to habitual, and ultimately compulsive, drug use. We especially focus on the shifts in striatal control over drug seeking behaviour that underlie these transitions, since functional heterogeneity of the striatum was a key area of Ann Kelley's research interests and one in which she made enormous contributions. We also discuss the hypothesis in light of recent data that the emergence of a compulsive drug seeking habit reflects both a shift to dorsal striatal control over behaviour and impaired prefrontal cortical inhibitory control mechanisms. We further discuss aspects of the vulnerability to compulsive drug use and in particular the impact of impulsivity. In writing this review we acknowledge the untimely death of an outstanding scientist and a dear personal friend. Copyright © 2013 Elsevier Ltd. All rights reserved.

                Author and article information

                Contributors
                Journal
                Neurosci Biobehav Rev
                Neuroscience and Biobehavioral Reviews
                Pergamon Press
                ISSN: 0149-7634, 1873-7528
                1 September 2016
                Volume: 68
                Pages: 862-879
                Affiliations
                [a] The Wellcome Trust Centre for Neuroimaging, UCL, 12 Queen Square, London, United Kingdom
                [b] Max Planck UCL Centre for Computational Psychiatry and Ageing Research, London, United Kingdom
                [c] Centre for Neurocognitive Research, University of Salzburg, Salzburg, Austria
                [d] Neuroscience Institute, Christian-Doppler-Klinik, Paracelsus Medical University Salzburg, Salzburg, Austria
                [e] Caltech Brain Imaging Center, California Institute of Technology, Pasadena, USA
                [f] Institute of Cognitive Sciences and Technologies, National Research Council, Rome, Italy
                Author notes
                [*] Corresponding author at: The Wellcome Trust Centre for Neuroimaging, Institute of Neurology, 12 Queen Square, London WC1N 3BG, United Kingdom. k.friston@ucl.ac.uk
                Article
                PII: S0149-7634(16)30133-6
                DOI: 10.1016/j.neubiorev.2016.06.022
                PMCID: PMC5167251
                PMID: 27375276
                © 2016 The Authors

                This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

                History
                : 5 March 2016
                : 15 June 2016
                : 17 June 2016
                Categories
                Article

                Neurosciences
                Keywords: active inference, habit learning, Bayesian inference, goal-directed, free energy, information gain, Bayesian surprise, epistemic value, exploration, exploitation
