      Is Open Access

      Managing engineering systems with large state and action spaces through deep reinforcement learning

      Preprint


          Abstract

Decision-making for engineering systems can be efficiently formulated as a Markov Decision Process (MDP) or a Partially Observable MDP (POMDP). Typical MDP and POMDP solution procedures utilize offline knowledge about the environment and provide detailed policies for relatively small systems with tractable state and action spaces. In large multi-component systems, however, the sizes of these spaces easily explode, as system states and actions scale exponentially with the number of components, while the environment dynamics are difficult to describe in explicit form for the entire system and may only be accessible through numerical simulators. To address these issues, an integrated Deep Reinforcement Learning (DRL) framework is introduced in this work. The Deep Centralized Multi-agent Actor Critic (DCMAC), an off-policy actor-critic DRL approach, is developed, providing efficient life-cycle policies for large multi-component systems operating in high-dimensional spaces. Apart from deep function approximations that parametrize large state spaces, DCMAC also adopts a factorized representation of the system actions, and is thus able to designate individualized component- and subsystem-level decisions while maintaining a centralized value function for the entire system. DCMAC compares well against Deep Q-Network (DQN) solutions and exact policies, where applicable, and outperforms optimized baselines based on time-based, condition-based, and periodic policies.
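The factorized action representation described in the abstract can be sketched as follows: each component gets its own small policy head, while a single scalar value function covers the joint system state. This is a toy illustration of the representational idea, not the paper's DCMAC implementation — the linear heads, feature sizes, and names (`FactorizedActor`, `centralized_value`) are all invented for the example.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

class FactorizedActor:
    """Toy factorized actor: one small linear head per component.

    A flat policy over the joint action space would have to score
    n_actions ** n_components joint actions; factorizing keeps the
    parameter count linear in the number of components.
    """
    def __init__(self, n_components, n_features, n_actions, seed=0):
        rng = random.Random(seed)
        # one (n_features x n_actions) weight matrix per component
        self.weights = [
            [[rng.gauss(0.0, 0.1) for _ in range(n_actions)]
             for _ in range(n_features)]
            for _ in range(n_components)
        ]

    def action_probs(self, state):
        """state: one feature vector per component. Returns one action
        distribution per component; their product gives the joint policy."""
        probs = []
        for comp_w, feats in zip(self.weights, state):
            n_actions = len(comp_w[0])
            logits = [
                sum(f * row[a] for f, row in zip(feats, comp_w))
                for a in range(n_actions)
            ]
            probs.append(softmax(logits))
        return probs

def centralized_value(state, value_weights):
    """Single scalar value for the whole system: here just a linear
    function of the concatenated component features (a stand-in for
    the centralized critic network)."""
    flat = [f for feats in state for f in feats]
    return sum(f * w for f, w in zip(flat, value_weights))

n_components, n_features, n_actions = 5, 4, 3
actor = FactorizedActor(n_components, n_features, n_actions)
state = [[1.0, 0.0, 0.0, 0.0] for _ in range(n_components)]
dists = actor.action_probs(state)
value = centralized_value(state, [0.05] * (n_components * n_features))

joint_actions = n_actions ** n_components                  # 243 joint actions
factorized_params = n_components * n_features * n_actions  # only 60 weights
```

The contrast between `joint_actions` and `factorized_params` is the point of the factorization: the joint action space grows exponentially with the number of components, while the per-component heads grow only linearly, yet a single centralized value signal can still be used to train all heads together.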

Most cited references

• SARSOP: Efficient Point-Based POMDP Planning by Approximating Optimally Reachable Belief Spaces
• Life-Cycle Cost Design of Deteriorating Structures
• Planning under Uncertainty for Robotic Tasks with Mixed Observability

Author and article information

Date: 05 November 2018
Article: arXiv:1811.02052
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Subjects: cs.SY cs.LG cs.MA
Keywords: Performance, Systems & Control, Artificial intelligence
