1
views
0
recommends
+1 Recommend
0 collections
    0
    shares
      • Record: found
      • Abstract: found
      • Article: found
      Is Open Access

      Automatic Horizontal Fusion for GPU Kernels

      Preprint
      , , ,

      Read this article at

      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          Abstract

          We present automatic horizontal fusion, a novel optimization technique that complements the standard kernel fusion techniques for GPU programs. Unlike the standard fusion, whose goal is to eliminate intermediate data round trips, our horizontal fusion technique aims to increase the thread-level parallelism to hide instruction latencies. We also present HFuse, a new source to source CUDA compiler that implements automatic horizontal fusion. Our experimental results show that horizontal fusion can speed up the running time by 2.5%-60.8%. Our results reveal that the horizontal fusion is especially beneficial for fusing kernels with instructions that require different kinds of GPU resources (e.g., a memory-intensive kernel and a compute-intensive kernel).

          Related collections

          Author and article information

          Journal
          02 July 2020
          Article
          2007.01277
          846bc11f-b029-4493-b62a-a204f13aa931

          http://arxiv.org/licenses/nonexclusive-distrib/1.0/

          History
          Custom metadata
          cs.DC cs.PL

          Programming languages,Networking & Internet architecture
          Programming languages, Networking & Internet architecture

          Comments

          Comment on this article