Is Open Access

BAIL: Best-Action Imitation Learning for Batch Deep Reinforcement Learning

Preprint

Author(s): Xinyue Chen, Zijian Zhou, Zheng Wang, Che Wang, Yanqiu Wu, Qing Deng, Keith Ross

Publication date (created): 27 October 2019

Abstract

The field of Deep Reinforcement Learning (DRL) has recently seen a surge in research in batch reinforcement learning, which aims for sample-efficient learning from a given data set without additional interactions with the environment. In the batch DRL setting, commonly employed off-policy DRL algorithms can perform poorly and sometimes even fail to learn altogether. In this paper, we propose a new algorithm, Best-Action Imitation Learning (BAIL), which, unlike many off-policy DRL algorithms, does not involve maximizing Q functions over the action space. Striving for simplicity as well as performance, BAIL first selects from the batch the actions it believes to be high-performing for their corresponding states; it then uses those state-action pairs to train a policy network using imitation learning. Although BAIL is simple, we demonstrate that it achieves state-of-the-art performance on the MuJoCo benchmark.
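The two-stage idea in the abstract (select promising state-action pairs from the batch, then imitate them) can be illustrated with a minimal Python sketch. This is a toy under stated assumptions, not the authors' implementation: the abstract does not say how BAIL decides which actions are high-performing, so the sketch simply keeps the top fraction of pairs by observed return, and it replaces the policy network with a one-dimensional least-squares fit. The function names, the `keep_fraction` parameter, and the toy one-step environment are all hypothetical.

```python
import random


def discounted_returns(rewards, gamma=0.99):
    """Discounted return-to-go for each step of one episode."""
    returns, running = [], 0.0
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def select_best_actions(batch, keep_fraction=0.25):
    """Keep the (state, action) pairs with the highest observed returns.

    ASSUMPTION: a global top-fraction cut is a stand-in for BAIL's
    selection step, which the abstract does not specify.
    """
    ranked = sorted(batch, key=lambda item: item[2], reverse=True)
    n_keep = max(1, int(len(ranked) * keep_fraction))
    return [(state, action) for state, action, _ in ranked[:n_keep]]


def fit_linear_policy(pairs):
    """Imitation step: least-squares fit of action = w * state + b.

    ASSUMPTION: a 1-D linear policy replaces the policy network.
    """
    n = len(pairs)
    mean_s = sum(s for s, _ in pairs) / n
    mean_a = sum(a for _, a in pairs) / n
    var_s = sum((s - mean_s) ** 2 for s, _ in pairs)
    cov_sa = sum((s - mean_s) * (a - mean_a) for s, a in pairs)
    w = cov_sa / var_s if var_s > 0 else 0.0
    b = mean_a - w * mean_s
    return w, b


# Hypothetical toy batch: one-step episodes collected by a random
# behavior policy. The reward is highest when the action equals the state.
random.seed(0)
batch = []
for _ in range(2000):
    s = random.uniform(-1.0, 1.0)
    a = random.uniform(-2.0, 2.0)
    (ret,) = discounted_returns([-(a - s) ** 2])
    batch.append((s, a, ret))

# Imitating the whole batch versus imitating only the selected pairs.
w_all, b_all = fit_linear_policy([(s, a) for s, a, _ in batch])
w_best, b_best = fit_linear_policy(select_best_actions(batch))
print("clone of full batch:     w=%.2f b=%.2f" % (w_all, b_all))
print("clone of selected pairs: w=%.2f b=%.2f" % (w_best, b_best))
```

In this toy setting the best action equals the state, so the policy cloned from the selected pairs should have a slope near 1, while cloning the full random batch should give a slope near 0. Note that no maximization of a Q function over the action space appears anywhere in the sketch, which is the property the abstract emphasizes.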

Author and article information

Article

arXiv ID: 1910.12179

SO-VID: f2808558-06e0-44e0-972e-73568407bc06

License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/

Custom metadata

Categories cs.LG cs.AI stat.ML

ScienceOpen disciplines: Machine learning, Artificial intelligence
