      Is Open Access

      Policy Optimization Provably Converges to Nash Equilibria in Zero-Sum Linear Quadratic Games

      Preprint

          Abstract

          We study the global convergence of policy optimization for finding the Nash equilibria (NE) in zero-sum linear quadratic (LQ) games. To this end, we first investigate the landscape of LQ games, viewing it as a nonconvex-nonconcave saddle-point problem in the policy space. Specifically, we show that despite their nonconvexity and nonconcavity, zero-sum LQ games have the property that the stationary point of the objective with respect to the feedback control policies constitutes the NE of the game. Building upon this, we develop three projected nested-gradient methods that are guaranteed to converge to the NE of the game. Moreover, we show that all of these algorithms enjoy both global sublinear and local linear convergence rates. Simulation results are then provided to validate the proposed algorithms. To the best of our knowledge, this work appears to be the first to investigate the optimization landscape of LQ games and to provably show the convergence of policy optimization methods to the Nash equilibria. Our work serves as an initial step toward understanding the theoretical aspects of policy-based reinforcement learning algorithms for zero-sum Markov games in general.
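          The nested-gradient idea in the abstract can be illustrated on a scalar game. The following is a minimal sketch, not the authors' algorithms: it omits the projection step, and the dynamics, cost weights, step sizes, and function names (`cost`, `grad`, `nested_gradient`, `riccati_reference`) are all illustrative assumptions rather than anything taken from the paper. The inner loop runs gradient ascent on the maximizer's gain for a fixed minimizer gain; the outer loop takes one gradient-descent step on the minimizer's gain. A closed-form scalar Riccati solution is included as a reference point.

```python
import math

# Hypothetical scalar zero-sum LQ game (all constants are assumptions):
#   x_{t+1} = A*x_t + B*u_t + C*v_t
#   stage cost = Q*x^2 + RU*u^2 - RV*v^2
#   linear feedback policies u = -k*x (minimizer), v = -l*x (maximizer)
A, B, C, Q, RU, RV = 0.9, 1.0, 0.5, 1.0, 1.0, 5.0


def cost(k, l):
    """Infinite-horizon cost from x_0 = 1; infinite if the loop is unstable."""
    m = A - B * k - C * l  # closed-loop multiplier
    if abs(m) >= 1.0:
        return math.inf
    return (Q + RU * k * k - RV * l * l) / (1.0 - m * m)


def grad(k, l):
    """Analytic partial derivatives (d cost/dk, d cost/dl) in the stable region."""
    m = A - B * k - C * l
    d = 1.0 - m * m
    n = Q + RU * k * k - RV * l * l
    dk = (2.0 * RU * k * d - 2.0 * n * m * B) / (d * d)
    dl = (-2.0 * RV * l * d - 2.0 * n * m * C) / (d * d)
    return dk, dl


def nested_gradient(k=0.5, l=0.0, outer=200, inner=50,
                    step_k=0.1, step_l=0.05):
    """Unprojected nested gradient: inner ascent on l, outer descent on k."""
    for _ in range(outer):
        for _ in range(inner):
            l += step_l * grad(k, l)[1]   # approximate best response of maximizer
        k -= step_k * grad(k, l)[0]       # descent step at (k, best response)
    return k, l


def riccati_reference():
    """Saddle-point gains from the scalar Riccati equation P = Q + A^2 P / (1 + s P)."""
    s = B * B / RU - C * C / RV
    beta = 1.0 - s * Q - A * A
    p = (-beta + math.sqrt(beta * beta + 4.0 * s * Q)) / (2.0 * s)
    d = 1.0 + s * p
    return A * B * p / (RU * d), -A * C * p / (RV * d), p
```

          Calling `nested_gradient()` and comparing its result with `riccati_reference()` gives a quick check that, for these assumed constants, the stationary point reached by the gradient iterations coincides with the saddle point of the game. The initial gain `k=0.5` is chosen so that the closed loop starts stable; a different start may require the projection that this sketch leaves out.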

                Author and article information

                Date: 31 May 2019
                arXiv: 1906.00729
                License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
                arXiv categories: cs.LG cs.GT cs.SY math.OC stat.ML
                Subjects: Numerical methods; Theoretical computer science; Performance, Systems & Control; Machine learning; Artificial intelligence
