Dynamic robotic tracking of underwater targets using reinforcement learning

Masmitja, I.; Martin, M.; O’Reilly, T.; Kieft, B.; Palomeras, N.; Navarro, J.; Katija, K

doi:10.1126/scirobotics.ade7811

ScienceOpen: research and publishing network

For Publishers

For Researchers

Blog
About

Search
Advanced search

views

recommends

Record: found
Abstract: found
Article: not found

Dynamic robotic tracking of underwater targets using reinforcement learning

Author(s): I. Masmitja ¹ ^, ² , M. Martin ³ ^, ⁴ , T. O’Reilly ² , B. Kieft ² , N. Palomeras ⁵ , J. Navarro ¹ , K. Katija ²

Publication date Created: July 26 2023

Publication date (Print): July 26 2023

Journal: Science Robotics

Publisher: American Association for the Advancement of Science (AAAS)

Read this article at

ScienceOpenPublisher

Bookmark

There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

Abstract

To realize the potential of autonomous underwater robots that scale up our observational capacity in the ocean, new techniques are needed. Fleets of autonomous robots could be used to study complex marine systems and animals with either new imaging configurations or by tracking tagged animals to study their behavior. These activities can then inform and create new policies for community conservation. The role of animal connectivity via active movement of animals represents a major knowledge gap related to the distribution of deep ocean populations. Tracking underwater targets represents a major challenge for observing biological processes in situ, and methods to robustly respond to a changing environment during monitoring missions are needed. Analytical techniques for optimal sensor placement and path planning to locate underwater targets are not straightforward in such cases. The aim of this study was to investigate the use of reinforcement learning as a tool for range-only underwater target-tracking optimization, whose promising capabilities have been demonstrated in terrestrial scenarios. To evaluate its usefulness, a reinforcement learning method was implemented as a path planning system for an autonomous surface vehicle while tracking an underwater mobile target. A complete description of an open-source model, performance metrics in simulated environments, and evaluated algorithms based on more than 15 hours of at-sea field experiments are presented. These efforts demonstrate that deep reinforcement learning is a powerful approach that enhances the abilities of autonomous robots in the ocean and encourages the deployment of algorithms like these for monitoring marine biological systems in the future.

Abstract

Deep reinforcement learning was used on marine robots to autonomously track moving underwater targets in the ocean.

Related collections

Most cited references 59

Record: found
Abstract: found
Article: not found

Human-level control through deep reinforcement learning.

Volodymyr Mnih, Koray Kavukcuoglu, David Silver … (2015)

The theory of reinforcement learning provides a normative account, deeply rooted in psychological and neuroscientific perspectives on animal behaviour, of how agents may optimize their control of an environment. To use reinforcement learning successfully in situations approaching real-world complexity, however, agents are confronted with a difficult task: they must derive efficient representations of the environment from high-dimensional sensory inputs, and use these to generalize past experience to new situations. Remarkably, humans and other animals seem to solve this problem through a harmonious combination of reinforcement learning and hierarchical sensory processing systems, the former evidenced by a wealth of neural data revealing notable parallels between the phasic signals emitted by dopaminergic neurons and temporal difference reinforcement learning algorithms. While reinforcement learning agents have achieved some successes in a variety of domains, their applicability has previously been limited to domains in which useful features can be handcrafted, or to domains with fully observed, low-dimensional state spaces. Here we use recent advances in training deep neural networks to develop a novel artificial agent, termed a deep Q-network, that can learn successful policies directly from high-dimensional sensory inputs using end-to-end reinforcement learning. We tested this agent on the challenging domain of classic Atari 2600 games. We demonstrate that the deep Q-network agent, receiving only the pixels and the game score as inputs, was able to surpass the performance of all previous algorithms and achieve a level comparable to that of a professional human games tester across a set of 49 games, using the same algorithm, network architecture and hyperparameters. This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.

0 comments Cited 1497 times – based on 0 reviews      Review now

Bookmark

Record: found
Abstract: found
Article: not found

Mastering the game of Go with deep neural networks and tree search.

David Silver, Aja Huang, Chris J Maddison … (2016)

The game of Go has long been viewed as the most challenging of classic games for artificial intelligence owing to its enormous search space and the difficulty of evaluating board positions and moves. Here we introduce a new approach to computer Go that uses 'value networks' to evaluate board positions and 'policy networks' to select moves. These deep neural networks are trained by a novel combination of supervised learning from human expert games, and reinforcement learning from games of self-play. Without any lookahead search, the neural networks play Go at the level of state-of-the-art Monte Carlo tree search programs that simulate thousands of random games of self-play. We also introduce a new search algorithm that combines Monte Carlo simulation with value and policy networks. Using this search algorithm, our program AlphaGo achieved a 99.8% winning rate against other Go programs, and defeated the human European Go champion by 5 games to 0. This is the first time that a computer program has defeated a human professional player in the full-sized game of Go, a feat previously thought to be at least a decade away.