      Is Open Access

      Running a Pre-Exascale, Geographically Distributed, Multi-Cloud Scientific Simulation

      Preprint

          Abstract

          As we approach the Exascale era, it is important to verify that the existing frameworks and tools will still work at that scale. Moreover, public Cloud computing has been emerging as a viable solution for both prototyping and urgent computing. Using the elasticity of the Cloud, we have thus put in place a pre-exascale HTCondor setup for running a scientific simulation in the Cloud, with the chosen application being IceCube's photon propagation simulation. That is, this was not purely a demonstration run; it was also used to produce valuable and much-needed scientific results for the IceCube collaboration. In order to reach the desired scale, we aggregated GPU resources across 8 GPU models from many geographic regions across Amazon Web Services, Microsoft Azure, and the Google Cloud Platform. Using this setup, we reached a peak of over 51k GPUs, corresponding to almost 380 PFLOP32s, for a total integrated compute of about 100k GPU hours. In this paper we describe the setup and the problems that were discovered and overcome, and give a short description of the actual science output of the exercise.

          Author and article information

          Journal
          16 February 2020
          Article
          2002.06667

          http://creativecommons.org/licenses/by-nc-sa/4.0/

          Custom metadata
          18 pages, 5 figures, 4 tables, to be published in Proceedings of ISC High Performance 2020
          cs.DC

          Networking & Internet architecture
