      Path Following Control for Underactuated Airships with Magnitude and Rate Saturation

      research-article


          Abstract

          This paper proposes a reinforcement learning (RL) based path-following strategy for underactuated airships subject to magnitude and rate saturation. The control problem is first modeled as a Markov decision process (MDP). An error-bounded line-of-sight (LOS) guidance law is then investigated to restrict the state space. Subsequently, a proximal policy optimization (PPO) algorithm is employed to approximate the optimal action policy through trial and error. Because the policy selects its actions from the action space, magnitude and rate saturation can be avoided. Simulation results for circular, general, broken-line, and anti-wind path-following tasks demonstrate that the proposed control scheme transfers to new tasks without adaptation and offers satisfactory real-time performance and robustness.
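
          The abstract's claim that drawing actions from the action space avoids magnitude and rate saturation can be illustrated with a small sketch. This is not the authors' implementation: the class name `SaturatedActuator`, the limits, the control period, and the choice to treat the policy output as a normalized rate command are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class SaturatedActuator:
    """Hypothetical actuator model; all limits here are illustrative assumptions."""
    max_magnitude: float  # assumed bound on |u|
    max_rate: float       # assumed bound on |du/dt|
    dt: float             # assumed control period in seconds
    value: float = 0.0    # current actuator command u

    def step(self, action: float) -> float:
        """Apply a normalized policy action in [-1, 1] as a bounded rate command."""
        # Clamp the raw action so it always lies inside the action space.
        a = max(-1.0, min(1.0, action))
        # Interpret the action as a fraction of the maximum allowed rate.
        delta = a * self.max_rate * self.dt
        # Integrate the increment, then clamp to the magnitude limit.
        candidate = self.value + delta
        self.value = max(-self.max_magnitude, min(self.max_magnitude, candidate))
        return self.value


def rollout(actuator: SaturatedActuator, actions: list[float]) -> list[float]:
    """Feed a sequence of policy actions through the actuator and record commands."""
    return [actuator.step(a) for a in actions]


if __name__ == "__main__":
    actuator = SaturatedActuator(max_magnitude=0.5, max_rate=1.0, dt=0.1)
    # Deliberately oversized actions: the resulting commands stay within both limits.
    commands = rollout(actuator, [5.0] * 10 + [-5.0] * 3)
    print(commands)
```

          Under these assumptions, every command the environment sees respects both limits by construction, so a learning algorithm such as PPO never has to discover the limits through penalties. Whether the paper parameterizes its action space this way is not stated in the abstract.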


                Author and article information

                Journal
                Sensors (Basel)
                Sensors (Basel, Switzerland)
                MDPI
                ISSN: 1424-8220
                Published: 15 December 2020
                Volume: 20
                Issue: 24
                Article number: 7176
                Affiliations
                [1] School of Aeronautic Science and Engineering, Beijing University of Aeronautics and Astronautics, Beijing 100191, China; ghbuaaa@buaa.edu.cn (H.G.); oujiajun@buaa.edu.cn (J.O.); yuanjiace@buaa.edu.cn (J.Y.)
                [2] Frontier Institute of Science and Technology Innovation, Beijing University of Aeronautics and Astronautics, Beijing 100191, China; guoxiao@buaa.edu.cn
                [3] School of Electronic and Information Engineering, Beijing University of Aeronautics and Astronautics, Beijing 100191, China
                Author notes
                [*] Correspondence: louwenjie@buaa.edu.cn
                Author information
                https://orcid.org/0000-0002-2880-7653
                https://orcid.org/0000-0002-1202-3635
                Article
                sensors-20-07176
                DOI: 10.3390/s20247176
                PMCID: PMC7765289
                PMID: 33333882
                © 2020 by the authors.

                Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

                History
                Received: 15 October 2020
                Accepted: 10 December 2020
                Categories
                Article

                Biomedical engineering
                reinforcement learning, path following, underactuated airships, magnitude and rate saturation
