
      Shallow Updates for Deep Reinforcement Learning

      Preprint


          Abstract

          Deep reinforcement learning (DRL) methods such as the Deep Q-Network (DQN) have achieved state-of-the-art results in a variety of challenging, high-dimensional domains. This success is mainly attributed to the power of deep neural networks to learn rich domain representations for approximating the value function or policy. Batch reinforcement learning methods with linear representations, on the other hand, are more stable and require less hyperparameter tuning, yet they need substantial feature engineering to achieve good results. In this work we propose a hybrid approach, the Least Squares Deep Q-Network (LS-DQN), which combines rich feature representations learned by a DRL algorithm with the stability of a linear least squares method. We do this by periodically re-training the last hidden layer of a DRL network with a batch least squares update. Key to our approach is a Bayesian regularization term for the least squares update, which prevents over-fitting to the more recent data. We tested LS-DQN on five Atari games and demonstrated significant improvement over vanilla DQN and Double-DQN. We also investigated the reasons for the superior performance of our method. Interestingly, we found that the performance improvement can be attributed to the large batch size used by the LS method when optimizing the last layer.
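
The last-layer update described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes one common form of the idea: a ridge-style least squares solve per action, with the regularizer centered on the network's current last-layer weights, so the solution is pulled toward those weights rather than toward zero. The function name `ls_last_layer_update`, its arguments, and the choice of regression targets are hypothetical and chosen for illustration only.

```python
import numpy as np


def ls_last_layer_update(phi, actions, targets, w_prior, lam=1.0):
    """Refit per-action last-layer weights with regularized least squares.

    All names here are illustrative assumptions, not taken from the paper.

    phi:      (N, d) features from the last hidden layer for N transitions
    actions:  (N,) integer action taken in each transition
    targets:  (N,) regression targets, e.g. r + gamma * max_a' Q(s', a')
    w_prior:  (num_actions, d) current last-layer weights, used as prior mean
    lam:      regularization strength; larger values stay closer to w_prior
    """
    w_new = np.array(w_prior, dtype=float, copy=True)
    d = phi.shape[1]
    for a in range(w_prior.shape[0]):
        mask = actions == a
        if not mask.any():
            # No data for this action: keep the current weights.
            continue
        X, y = phi[mask], targets[mask]
        # Solve (X^T X + lam * I) w = X^T y + lam * w_prior[a].
        A = X.T @ X + lam * np.eye(d)
        b = X.T @ y + lam * w_prior[a]
        w_new[a] = np.linalg.solve(A, b)
    return w_new
```

In this sketch, a very large `lam` leaves the weights essentially unchanged, while a very small `lam` reduces to an ordinary least squares fit on the batch. In a training loop, a call like this would be made every so often on a large batch of stored transitions, with the result written back into the network's output layer.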


                Author and article information

                Date: 2017-05-21
                arXiv: 1705.07461
                License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
                arXiv categories: cs.AI cs.LG stat.ML
                Keywords: Machine learning, Artificial intelligence
