LVIS: Learning from Value Function Intervals for Contact-Aware Robot Controllers

Abstract

Guided policy search is a popular approach for training controllers for high-dimensional systems, but it has a number of pitfalls. Non-convex trajectory optimization has local minima, and non-uniqueness in the optimal policy itself can mean that independently optimized samples do not describe a coherent policy from which to train. We introduce LVIS, which circumvents the issue of local minima through global mixed-integer optimization and the issue of non-uniqueness through learning the optimal value function (or cost-to-go) rather than the optimal policy. To avoid the expense of solving the mixed-integer programs to full global optimality, we instead solve them only partially, extracting intervals containing the true cost-to-go from early termination of the branch-and-bound algorithm. These interval samples are used to weakly supervise the training of a neural net which approximates the true cost-to-go. Online, we use that learned cost-to-go as the terminal cost of a one-step model-predictive controller, which we solve via a small mixed-integer optimization. We demonstrate the LVIS approach on a cart-pole system with walls and a planar humanoid robot model and show that it can be applied to a fundamentally hard problem in feedback control: control through contact.
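
To make the idea of weak supervision from intervals concrete, here is a minimal sketch of one possible interval loss. The abstract does not state the loss the authors use, so the squared hinge penalty below is an assumption, and the names `interval_loss` and `interval_loss_grad` are hypothetical. The sketch penalizes a predicted cost-to-go only when it falls outside the interval [lower, upper] returned by an early-terminated branch-and-bound solve, and applies no penalty inside it.

```python
import numpy as np


def interval_loss(pred, lower, upper):
    """Mean squared violation of the interval [lower, upper].

    Assumed form (not taken from the paper): zero when lower <= pred <= upper,
    quadratic in the distance to the nearest bound otherwise.
    """
    pred, lower, upper = (np.asarray(a, dtype=float) for a in (pred, lower, upper))
    below = np.maximum(lower - pred, 0.0)  # amount by which pred undershoots
    above = np.maximum(pred - upper, 0.0)  # amount by which pred overshoots
    return float(np.mean(below**2 + above**2))


def interval_loss_grad(pred, lower, upper):
    """Gradient of interval_loss with respect to each prediction."""
    pred, lower, upper = (np.asarray(a, dtype=float) for a in (pred, lower, upper))
    below = np.maximum(lower - pred, 0.0)
    above = np.maximum(pred - upper, 0.0)
    return 2.0 * (above - below) / pred.size


# Three hypothetical samples: one inside its interval, one above, one below.
pred = [1.0, 5.0, 0.0]
lower = [0.0, 2.0, 1.0]
upper = [2.0, 4.0, 3.0]
print(interval_loss(pred, lower, upper))
print(interval_loss_grad(pred, lower, upper))
```

Under this assumed loss, a sample whose bounds are loose constrains the network only weakly, while a sample solved to optimality (lower equal to upper) reduces to ordinary squared-error regression.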


Author and article information

Publication date (created): 15 September 2018

arXiv ID: 1809.05802

SO-VID: 17de9c9a-3a8c-4890-a76c-95ee1762f39c

License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/

Comments: 7 pages, 8 figures. Submitted to the 2019 IEEE International Conference on Robotics and Automation (ICRA 2019).

Categories: cs.RO

ScienceOpen disciplines: Robotics
