      Is Open Access

      DORA The Explorer: Directed Outreaching Reinforcement Action-Selection

      Preprint

          Abstract

          Exploration is a fundamental aspect of Reinforcement Learning, typically implemented using stochastic action-selection. Exploration, however, can be more efficient if directed toward gaining new world knowledge. Visit counters have proven useful both in practice and in theory for directed exploration. However, a major limitation of counters is their locality. While there are a few model-based solutions to this shortcoming, a model-free approach is still missing. We propose \(E\)-values, a generalization of counters that can be used to evaluate the propagating exploratory value over state-action trajectories. We compare our approach to commonly used RL techniques, and show that using \(E\)-values improves learning and performance over traditional counters. We also show how our method can be implemented with function approximation to efficiently learn continuous MDPs. We demonstrate this by showing that our approach surpasses state-of-the-art performance in the Freeway Atari 2600 game.
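The abstract describes \(E\)-values as a generalization of counters but does not give an update rule. The following is a minimal tabular sketch, assuming that \(E\)-values start at 1 and are learned with a SARSA-style update on a zero-reward copy of the MDP, and that a generalized counter is read off as \(\log_{1-\alpha} E(s,a)\). These details, along with the function names `update_e_value` and `generalized_counter` and the default parameter values, are assumptions for illustration and are not taken from the text above.

```python
import math


def update_e_value(e_values, state, action, next_state, next_action,
                   alpha=0.1, gamma_e=0.9):
    """One SARSA-style update of an E-value with zero reward (assumed rule).

    e_values maps (state, action) pairs to floats; unseen pairs default to 1.0.
    """
    key = (state, action)
    current = e_values.get(key, 1.0)
    # Bootstrap from the next state-action pair, so exploratory value
    # propagates along trajectories instead of staying local.
    bootstrap = e_values.get((next_state, next_action), 1.0)
    e_values[key] = (1.0 - alpha) * current + alpha * gamma_e * bootstrap
    return e_values[key]


def generalized_counter(e_value, alpha=0.1):
    """Convert an E-value to a counter-like quantity, log base (1 - alpha)."""
    return math.log(e_value) / math.log(1.0 - alpha)


if __name__ == "__main__":
    # With gamma_e = 0 nothing propagates, and the generalized counter
    # behaves like an ordinary visit counter.
    e_values = {}
    for _ in range(3):
        update_e_value(e_values, "s", "a", "s_next", "a_next", gamma_e=0.0)
    print(generalized_counter(e_values[("s", "a")]))
```

Under these assumptions, setting `gamma_e` to 0 gives \(E(s,a) = (1-\alpha)^n\) after \(n\) visits, so the generalized counter reduces to the ordinary visit count. A positive `gamma_e` makes a pair that leads to unexplored pairs look less visited than its raw count suggests, which is one way to address the locality limitation the abstract mentions.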


                Author and article information

                Date: 11 April 2018
                arXiv ID: 1804.04012
                License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
                Note: Final version for ICLR 2018
                Categories: cs.LG cs.AI stat.ML
                Subjects: Machine learning, Artificial intelligence
