      Is Open Access

      Characterizing the Performance of Executing Many-tasks on Summit

      Preprint


          Abstract

          Many scientific workloads comprise many tasks, where each task is an independent simulation or analysis of data. Executing millions of tasks on heterogeneous HPC platforms requires scalable dynamic resource management and multi-level scheduling. RADICAL-Pilot (RP), an implementation of the Pilot abstraction, addresses these challenges and serves as an effective runtime system for executing such workloads. In this paper, we characterize the performance of executing many tasks using RP when interfaced with JSM and PRRTE on Summit: RP is responsible for resource management and task scheduling on the acquired resources; JSM or PRRTE enacts the placement and launching of scheduled tasks. Our experiments provide lower bounds on the performance of RP when integrated with JSM and PRRTE. Specifically, for workloads of homogeneous, single-core, 15-minute-long tasks we find that: PRRTE scales better than JSM for > O(1000) tasks; PRRTE overheads are negligible; and PRRTE supports optimizations that lower the impact of overheads and enable resource utilization of 63% when executing O(16K) 1-core tasks over 404 compute nodes.
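To make the 63% figure concrete, the following is a rough sketch of how a utilization number of this kind can be computed. It rests on assumptions that the abstract does not state: utilization is taken to be task core-time divided by available core-time over the whole run (the paper may define it differently), each Summit node is assumed to expose 42 user-accessible cores, and the workload is assumed to run one task per core in a single generation. The function names are illustrative and are not part of RP, JSM, or PRRTE.

```python
def utilization(n_tasks, task_seconds, cores_per_task, total_cores, makespan_seconds):
    """Fraction of available core-time spent running tasks (assumed definition)."""
    if total_cores <= 0 or makespan_seconds <= 0:
        raise ValueError("total_cores and makespan_seconds must be positive")
    busy_core_seconds = n_tasks * task_seconds * cores_per_task
    available_core_seconds = total_cores * makespan_seconds
    return busy_core_seconds / available_core_seconds


def implied_makespan(n_tasks, task_seconds, cores_per_task, total_cores, target_utilization):
    """Total run time that would yield the given utilization (same assumed definition)."""
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")
    busy_core_seconds = n_tasks * task_seconds * cores_per_task
    return busy_core_seconds / (total_cores * target_utilization)


# Illustrative inputs; only NODES, the task duration, and 0.63 come from the abstract.
NODES = 404
CORES_PER_NODE = 42                   # assumption: user-accessible cores per Summit node
TOTAL_CORES = NODES * CORES_PER_NODE  # 16968, consistent with "O(16K)" 1-core tasks
N_TASKS = TOTAL_CORES                 # assumption: one task per core, single generation
TASK_SECONDS = 15 * 60                # 15-minute tasks

makespan = implied_makespan(N_TASKS, TASK_SECONDS, 1, TOTAL_CORES, 0.63)
print(f"implied makespan: {makespan / 60:.1f} minutes")
```

Under these assumptions, 63% utilization corresponds to a total run of roughly 24 minutes for 15 minutes of useful work per core, with the remainder being overhead such as scheduling, placement, and launching.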


                Author and article information

                Date: 08 September 2019
                arXiv: 1909.03057
                Record ID: abbd0cb8-0a70-4b7f-986b-1dbb06f08f4a
                License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
                Category: cs.DC

                Networking & Internet architecture
