A State-Distribution Matching Approach to Non-Episodic Reinforcement Learning

Abstract

While reinforcement learning (RL) provides a framework for learning through trial and error, translating RL algorithms into the real world has remained challenging. A major hurdle to real-world application is that algorithms are developed in an episodic setting, where the environment is reset after every trial, in contrast with the continual, non-episodic nature of the real world encountered by embodied agents such as humans and robots. Prior work has considered an alternating approach in which a forward policy learns to solve the task and a backward policy learns to reset the environment, but to what initial state distribution should the backward policy reset the agent? Assuming access to a few demonstrations, we propose a new method, MEDAL, that trains the backward policy to match the state distribution in the provided demonstrations. This keeps the agent close to the task-relevant states, allowing for a mix of easy and difficult starting states for the forward policy. Our experiments show that MEDAL matches or outperforms prior methods on three sparse-reward continuous control tasks from the EARL benchmark, with 40% gains on the hardest task, while making fewer assumptions than prior works.
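The abstract does not say how the backward policy is made to match the demonstration state distribution. One common way to set up such matching is adversarial: train a classifier to tell demonstration states from states the agent visits, then reward the backward policy for reaching states the classifier labels as demonstration-like. The toy sketch below illustrates that idea only. The 1-D states, the logistic classifier, the reward form `-log(1 - p_demo)`, and all function names are assumptions made for illustration, not the authors' implementation.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_state_classifier(demo_states, visited_states, steps=2000, lr=0.1):
    """Fit a 1-D logistic classifier (label 1 = demonstration state, 0 = visited state)."""
    w, b = 0.0, 0.0
    data = [(s, 1.0) for s in demo_states] + [(s, 0.0) for s in visited_states]
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for s, y in data:
            err = sigmoid(w * s + b) - y  # gradient of cross-entropy w.r.t. the logit
            grad_w += err * s
            grad_b += err
        w -= lr * grad_w / len(data)
        b -= lr * grad_b / len(data)
    return w, b


def backward_reward(state, w, b, eps=1e-6):
    """Hypothetical backward-policy reward: larger when the state looks demo-like."""
    p_demo = sigmoid(w * state + b)
    return -math.log(max(1.0 - p_demo, eps))


rng = random.Random(0)
demo_states = [rng.gauss(0.0, 0.5) for _ in range(200)]     # toy "demonstration" states
visited_states = [rng.gauss(3.0, 0.5) for _ in range(200)]  # toy states the agent drifted to

w, b = train_state_classifier(demo_states, visited_states)
near_demo_reward = backward_reward(0.0, w, b)
far_from_demo_reward = backward_reward(3.0, w, b)
```

In this toy setup a state near the demonstrations receives a higher reward than one far from them, so a policy maximizing this reward would be pulled back toward demonstration-like states. In a full system the classifier and the backward policy would be updated in alternation, with the forward policy trained on the task reward in between.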

Author and article information

Publication date: 10 May 2022 (created)

arXiv ID: 2205.05212

SO-VID: 1bd541a0-9fa4-4ad6-a3ba-5aa2417fa0e0

License: http://creativecommons.org/licenses/by/4.0/

Categories: cs.LG, cs.AI, cs.RO

ScienceOpen disciplines: Robotics, Artificial intelligence
