Meta-Gradient Reinforcement Learning with an Objective Discovered Online

There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

Abstract

Deep reinforcement learning includes a broad family of algorithms that parameterise an internal representation, such as a value function or policy, by a deep neural network. Each algorithm optimises its parameters with respect to an objective, such as Q-learning or policy gradient, that defines its semantics. In this work, we propose an algorithm based on meta-gradient descent that discovers its own objective, flexibly parameterised by a deep neural network, solely from interactive experience with its environment. Over time, this allows the agent to learn how to learn increasingly effectively. Furthermore, because the objective is discovered online, it can adapt to changes over time. We demonstrate that the algorithm discovers how to address several important issues in RL, such as bootstrapping, non-stationarity, and off-policy learning. On the Atari Learning Environment, the meta-gradient algorithm adapts over time to learn with greater efficiency, eventually outperforming the median score of a strong actor-critic baseline.

Related collections

Author and article information

Journal

Publication date Created: 16 July 2020

Article

ArXiV ID: 2007.08433

SO-VID: fa19e441-68a0-4ccc-99d1-3138e9e30341

License:

http://arxiv.org/licenses/nonexclusive-distrib/1.0/

History

Custom metadata

Categories cs.LG cs.AI stat.ML

ScienceOpen disciplines: Machine learning,Artificial intelligence

Data availability:

ScienceOpen disciplines: Machine learning, Artificial intelligence

Meta-Gradient Reinforcement Learning with an Objective Discovered Online

Read this article at

Abstract

Related collections

Annual Reviews AI, Machine Learning, and Society

Author and article information

Journal

Article

History

Custom metadata

Comments

Comment on this article

Similar content 89