Deep Policy Optimization with Temporal Logic Constraints

Abstract

Temporal logics, such as linear temporal logic (LTL), offer a precise means of specifying tasks for (deep) reinforcement learning (RL) agents. In our work, we consider the setting where the task is specified by an LTL objective and there is an additional scalar reward that we need to optimize. Previous works either focus on learning an LTL task-satisfying policy alone or are restricted to finite state spaces. We make two contributions. First, we introduce an RL-friendly approach to this setting by formulating the problem as a single optimization objective. Our formulation guarantees that an optimal policy is reward-maximal among the policies that maximize the likelihood of satisfying the LTL specification. Second, we address a sparsity issue that often arises for LTL-guided deep RL policies by introducing Cycle Experience Replay (CyclER), a technique that automatically guides RL agents towards the satisfaction of an LTL specification. Our experiments demonstrate the efficacy of CyclER in finding performant deep RL policies in both continuous and discrete experimental domains.

Author and article information

Publication date (created): 17 April 2024

Article

arXiv ID: 2404.11578

SO-VID: b9d9ab56-541a-47ba-92b3-7e7a3f2c192c

License: http://creativecommons.org/licenses/by/4.0/

Custom metadata

Comments: preprint, 8 pages

Categories: cs.LG, cs.AI, cs.FL

ScienceOpen disciplines: Theoretical computer science, Artificial intelligence
