Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal Subscription Code 80-970

2018 Impact Factor: 1.129

Front. Comput. Sci. 2015, Vol. 9, Issue 1: 128-141. https://doi.org/10.1007/s11704-014-4138-y
RESEARCH ARTICLE
SAMES: deadline-constraint scheduling in MapReduce
Xite WANG, Derong SHEN, Mei BAI, Tiezheng NIE, Yue KOU, Ge YU
College of Information Science & Engineering, Northeastern University, Shenyang 110819, China
Abstract

MapReduce is a popular parallel data-processing system, and task scheduling is one of its core techniques. In many applications, users require that their MapReduce jobs complete before specific deadlines. This paper therefore proposes a novel scheduling algorithm based on the most effective sequence (SAMES) for deadline-constraint jobs in MapReduce. First, based on the characteristics of MapReduce, we propose a sequence-based execution strategy for MapReduce jobs and introduce a new concept, the effective sequence (ES). We then design efficient approaches for finding ESes and choose the most effective sequence (MES) for job execution. We also propose methods for updating the MES and for handling exceptions. Finally, we verify the effectiveness of SAMES through experiments; the results show that SAMES is an efficient scheduling algorithm for deadline-constraint jobs in MapReduce.
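
The abstract does not define the effective sequence or the criterion for the most effective one, so the following is only a toy sketch of the general idea of sequence-based, deadline-aware scheduling, not the SAMES algorithm itself. It assumes, purely for illustration, that jobs run one after another on the whole cluster, that each job's duration is known in advance, that a sequence counts as "effective" when every job finishes by its deadline, and that the preferred sequence is the feasible one with the smallest sum of completion times. The names `Job`, `meets_deadlines`, and `pick_sequence` are hypothetical.

```python
from itertools import permutations
from typing import NamedTuple, Optional, Sequence, Tuple


class Job(NamedTuple):
    name: str
    duration: float  # assumed known run time when the job has the whole cluster
    deadline: float  # absolute deadline, in the same time unit as duration


def completion_times(seq: Sequence[Job]) -> list:
    """Finish time of each job when the jobs run back to back from time 0."""
    now = 0.0
    finished = []
    for job in seq:
        now += job.duration
        finished.append(now)
    return finished


def meets_deadlines(seq: Sequence[Job]) -> bool:
    """Assumed notion of an 'effective' sequence: no job misses its deadline."""
    return all(done <= job.deadline
               for job, done in zip(seq, completion_times(seq)))


def pick_sequence(jobs: Sequence[Job]) -> Optional[Tuple[Job, ...]]:
    """Brute-force search over all orderings (only sensible for a few jobs).

    Returns the feasible ordering with the smallest sum of completion times,
    or None when no ordering meets every deadline.
    """
    feasible = [seq for seq in permutations(jobs) if meets_deadlines(seq)]
    if not feasible:
        return None
    return min(feasible, key=lambda seq: sum(completion_times(seq)))


if __name__ == "__main__":
    jobs = [Job("A", 4, 10), Job("B", 2, 3), Job("C", 3, 9)]
    best = pick_sequence(jobs)
    print([job.name for job in best] if best else "no feasible sequence")
```

Enumerating every permutation grows factorially with the number of jobs, which is presumably why the paper designs dedicated approaches for finding ESes rather than searching exhaustively.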

Keywords: MapReduce, scheduling, deadline
Corresponding Author(s): Xite WANG   
Issue Date: 09 February 2015
 Cite this article:   
Xite WANG, Derong SHEN, Mei BAI, et al. SAMES: deadline-constraint scheduling in MapReduce[J]. Front. Comput. Sci., 2015, 9(1): 128-141.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-014-4138-y
https://academic.hep.com.cn/fcs/EN/Y2015/V9/I1/128
[1] Han Yao HUANG, Kyung Tae KIM, Hee Yong YOUN. Determining node duty cycle using Q-learning and linear regression for WSN[J]. Front. Comput. Sci., 2021, 15(1): 151101-.
[2] Zeinab ASKARI, Avid AVOKH. EMSC: a joint multicast routing, scheduling, and call admission control in multi-radio multi-channel WMNs[J]. Front. Comput. Sci., 2020, 14(5): 145503-.
[3] Zhuo WANG, Qun CHEN, Bo SUO, Wei PAN, Zhanhuai LI. Reducing partition skew on MapReduce: an incremental allocation approach[J]. Front. Comput. Sci., 2019, 13(5): 960-975.
[4] Libing WU, Lei NIE, Samee U. KHAN, Osman KHALID, Dan WU. A V2I communication-based pipeline model for adaptive urban traffic light scheduling[J]. Front. Comput. Sci., 2019, 13(5): 929-942.
[5] Lin WANG, Depei QIAN, Rui WANG, Zhongzhi LUAN, Hailong YANG, Huaxiang ZHANG. A novel index system describing program runtime characteristics for workload consolidation[J]. Front. Comput. Sci., 2019, 13(3): 489-499.
[6] Yihong GAO, Huadong MA. StreamTune: dynamic resource scheduling approach for workload skew in video data center[J]. Front. Comput. Sci., 2018, 12(4): 669-681.
[7] Cheqing JIN, Jie CHEN, Huiping LIU. MapReduce-based entity matching with multiple blocking functions[J]. Front. Comput. Sci., 2017, 11(5): 895-911.
[8] Mei BAI,Junchang XIN,Guoren WANG,Roger ZIMMERMANN,Xite WANG. Skyline-join query processing in distributed databases[J]. Front. Comput. Sci., 2016, 10(2): 330-352.
[9] Qi WANG,Donghui WANG,Chaohuan HOU. Exploiting write power asymmetry to improve phase change memory system performance[J]. Front. Comput. Sci., 2015, 9(4): 566-575.
[10] Huiju WANG,Furong LI,Xuan ZHOU,Yu CAO,Xiongpai QIN,Jidong CHEN,Shan WANG. HC-Store: putting MapReduce’s foot in two camps[J]. Front. Comput. Sci., 2014, 8(6): 859-871.
[11] JongHyuk LEE,SungJin CHOI,JoonMin GIL,Taeweon SUH,HeonChang YU. A scheduling algorithm with dynamic properties in mobile grid[J]. Front. Comput. Sci., 2014, 8(5): 847-857.
[12] Najme MANSOURI. Network and data location aware approach for simultaneous job scheduling and data replication in large-scale data grid environments[J]. Front. Comput. Sci., 2014, 8(3): 391-408.
[13] Kok-Lim Alvin YAU, Kae Hsiang KWONG, Chong SHEN. Reinforcement learning models for scheduling in wireless networks[J]. Front Comput Sci, 2013, 7(5): 754-766.
[14] Huafeng YU, Yue MA, Thierry GAUTIER, Loïc BESNARD, Jean-Pierre TALPIN, Paul Le GUERNIC, Yves SOREL. Exploring system architectures in AADL via Polychrony and SynDEx[J]. Front Comput Sci, 2013, 7(5): 627-649.
[15] Kenli LI, Zhao TONG, Dan LIU, Teklay TESFAZGHI, Xiangke LIAO. A PTS-PGATS based approach for data-intensive scheduling in data grids[J]. Front Comput Sci Chin, 2011, 5(4): 513-525.