26
views
0
recommends
+1 Recommend
1 collections
    0
    shares
      • Record: found
      • Abstract: found
      • Article: found
      Is Open Access

      BAIL: Best-Action Imitation Learning for Batch Deep Reinforcement Learning

      Preprint

      Read this article at

      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          Abstract

          The field of Deep Reinforcement Learning (DRL) has recently seen a surge in research in batch reinforcement learning, which aims for sample-efficient learning from a given data set without additional interactions with the environment. In the batch DRL setting, commonly employed off-policy DRL algorithms can perform poorly and sometimes even fail to learn altogether. In this paper, we propose a new algorithm, Best-Action Imitation Learning (BAIL), which unlike many off-policy DRL algorithms does not involve maximizing Q functions over the action space. Striving for simplicity as well as performance, BAIL first selects from the batch the actions it believes to be high-performing actions for their corresponding states; it then uses those state-action pairs to train a policy network using imitation learning. Although BAIL is simple, we demonstrate that BAIL achieves state of the art performance on the Mujoco benchmark.

          Related collections

          Most cited references4

          • Record: found
          • Abstract: not found
          • Conference Proceedings: not found

          MuJoCo: A physics engine for model-based control

            Bookmark
            • Record: found
            • Abstract: not found
            • Article: not found

            A survey of robot learning from demonstration

              Bookmark
              • Record: found
              • Abstract: not found
              • Book Chapter: not found

              Batch Reinforcement Learning

                Bookmark

                Author and article information

                Journal
                27 October 2019
                Article
                1910.12179
                f2808558-06e0-44e0-972e-73568407bc06

                http://arxiv.org/licenses/nonexclusive-distrib/1.0/

                History
                Custom metadata
                cs.LG cs.AI stat.ML

                Machine learning,Artificial intelligence
                Machine learning, Artificial intelligence

                Comments

                Comment on this article