      An Improved Algorithm for Optimizing MapReduce Based on Locality and Overlapping

      research-article

          Abstract

          MapReduce is currently the most popular programming model for big data processing, and Hadoop is a well-known MapReduce implementation platform. However, Hadoop jobs suffer from imbalanced workloads during the reduce phase and use the available computing and network resources inefficiently. In some cases, these problems lead to serious performance degradation in MapReduce jobs. To resolve these problems, we propose two algorithms, the Locality-Based Balanced Schedule (LBBS) and Overlapping-Based Resource Utilization (OBRU), that optimize the Locality-Enhanced Load Balance (LELB) and the Map, Local reduce, Shuffle, and final Reduce (MLSR) phases. LBBS collects partition information from input data during the map phase and generates balanced schedule plans for the reduce phase. OBRU uses computing and network resources efficiently by overlapping the local reduce, shuffle, and final reduce phases. Experimental results show that the LBBS and OBRU algorithms yield significant improvements in load balancing. With LBBS and OBRU applied, job performance improves by 15% over models that use LELB and MLSR.
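
          The abstract does not spell out how LBBS builds its schedule, so the following is only a rough illustration of the general idea of turning per-partition size information gathered during the map phase into a balanced reduce plan. It is a generic greedy "largest partition first" heuristic with a locality tie-break, not the authors' algorithm; the function name `balanced_schedule`, the input layout, and the sample sizes are all assumptions made for this sketch.

```python
def balanced_schedule(partition_sizes, nodes):
    """Assign each partition to a reducer node (illustrative sketch, not LBBS).

    partition_sizes: {partition_id: {node: bytes_of_that_partition_on_node}}
    nodes: list of node names that can run reduce tasks
    Returns (plan, load): partition -> node, and total bytes assigned per node.
    """
    load = {n: 0 for n in nodes}
    plan = {}
    # Place the largest partitions first so small ones can even out the loads.
    order = sorted(
        partition_sizes,
        key=lambda p: sum(partition_sizes[p].values()),
        reverse=True,
    )
    for p in order:
        local = partition_sizes[p]
        total = sum(local.values())
        # Pick the least-loaded node; on a tie, prefer the node that already
        # holds the most data for this partition (less data to shuffle).
        target = min(nodes, key=lambda n: (load[n], -local.get(n, 0)))
        plan[p] = target
        load[target] += total
    return plan, load


# Hypothetical partition statistics collected during the map phase.
nodes = ["n1", "n2"]
sizes = {
    "p0": {"n1": 60, "n2": 10},
    "p1": {"n1": 5, "n2": 40},
    "p2": {"n1": 20, "n2": 20},
}
plan, load = balanced_schedule(sizes, nodes)
print(plan)
print(load)
```

          Greedy placement of this kind only approximates an optimal balance; it is shown because it makes the trade-off between load balance and data locality easy to see.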

          Author and article information

          Journal
          Tsinghua Science and Technology
          Tsinghua University Press (Xueyan Building, Tsinghua University, Beijing 100084, China )
          1007-0214
          05 December 2018
          Volume: 23
          Issue: 6
          Pages: 744-753
          Affiliations
          [1]∙ Jianjiang Li, Jie Wang, and Xiaolei Yang are with the Department of Computer Science and Technology, University of Science and Technology Beijing, Beijing 100083, China. E-mail: lijianjiang@ustb.edu.cn; wangjieblingbling@126.com; chinayangxiaolei@163.com.
          [2]∙ Bin Lyu was with the Department of Computer Science and Technology, University of Science and Technology Beijing, Beijing 100083, China, when he did this research; he is now with the University of Southern California, Los Angeles, CA 90089, USA.
          [3]∙ Jie Wu is with the Department of Computer and Information Sciences, Temple University, Philadelphia, PA 19122, USA. E-mail: jiewu@temple.edu.
          Author notes
          * To whom correspondence should be addressed. E-mail: colinlvbin@gmail.com.

          Jianjiang Li is currently an associate professor at the University of Science and Technology Beijing, China. He received the PhD degree in computer science and technology from Tsinghua University in 2005. He was a visiting scholar at Temple University from Jan. 2014 to Jan. 2015. His current research interests include parallel computing, cloud computing, and parallel compilation.

          Jie Wang is currently a master's student at the University of Science and Technology Beijing, China. She received the BS degree from Tangshan Normal University in 2015. Her current research interests include cloud computing and parallel computing.

          Bin Lyu is currently a master's student at the University of Southern California, USA. He received the BS degree from the University of Science and Technology Beijing in 2017. His current research interests include cloud computing and parallel computing.

          Jie Wu is the Associate Vice Provost for International Affairs at Temple University and an IEEE Fellow. He also serves as Director of the Center for Networked Computing and as Laura H. Carnell Professor. His current research interests include mobile computing and wireless networks, routing protocols, cloud and green computing, network trust and security, and social network applications. He received the PhD degree from Florida Atlantic University in 1989.

          Xiaolei Yang received the master's degree from the University of Science and Technology Beijing, China, in 2016. His current research interests include parallel computing, cloud computing, and recommender systems.

          Article
          1007-0214-23-6-744
          10.26599/TST.2018.9010115
          a2582f6d-ebf6-4571-8ec8-cab91487a625
          Copyright © 2018
          History
          : 23 July 2018
          : 10 August 2018

          Software engineering,Data structures & Algorithms,Applied computer science,Computer science,Artificial intelligence,Hardware architecture
          load balance,overlapping,MapReduce,data locality
