Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal Subscription Code 80-970

2018 Impact Factor: 1.129

Front. Comput. Sci.    2015, Vol. 9 Issue (6) : 904-918    https://doi.org/10.1007/s11704-015-4560-9
RESEARCH ARTICLE
Adaptive and scalable load balancing for metadata server cluster in cloud-scale file systems
Quanqing XU1,*, Rajesh Vellore ARUMUGAM1, Khai Leong YONG1, Yonggang WEN2, Yew-Soon ONG2, Weiya XI1
1. Data Storage Institute, Agency for Science, Technology and Research, Singapore 138632, Singapore
2. School of Computer Engineering, Nanyang Technological University, Singapore 639798, Singapore
Abstract

Big data is an emerging term in the storage industry; it refers to data analytics on big storage, i.e., cloud-scale storage. In cloud-scale (or EB-scale) file systems, balancing request workloads across a metadata server cluster is critical for avoiding performance bottlenecks and improving quality of service. Many good approaches have been proposed for load balancing in distributed file systems. Some of them focus on global namespace balancing, making the metadata distribution across metadata servers as uniform as possible. However, they do not work well under skewed request distributions, which impair load balancing but simultaneously increase the effectiveness of caching and replication. In this paper, we propose Cloud Cache (C2), an adaptive and scalable load balancing scheme for metadata server clusters in EB-scale file systems. It combines adaptive cache diffusion with an adaptive replication scheme to cope with the request load balancing problem, and it can be integrated into existing distributed metadata management approaches to efficiently improve their load balancing performance. C2 runs as follows: 1) adaptive cache diffusion runs first: if a node is overloaded, load-shedding is used; otherwise, load-stealing is used; and 2) the adaptive replication scheme runs second: if a very popular metadata item (or at least two items) causes a node to be overloaded, adaptive replication is used, in which the very popular item is not split across several nodes by adaptive cache diffusion because of its knapsack property. Trace-driven simulations demonstrate the efficiency and scalability of C2.
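
The two-step control flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Node` class, the `capacity` threshold, the `plan` function, and the rule that an item counts as "very popular" when its request rate alone exceeds the node's capacity are all assumptions made for illustration, since the abstract does not specify them.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical metadata server: name, request-rate budget, per-item rates."""
    name: str
    capacity: float                              # assumed overload threshold
    rates: dict = field(default_factory=dict)    # item id -> observed request rate

    @property
    def load(self) -> float:
        return sum(self.rates.values())


def plan(node: Node) -> list:
    """Return the mechanisms the two steps would select for this node."""
    actions = []
    # Step 1: adaptive cache diffusion. Overloaded nodes shed load,
    # all other nodes try to steal load.
    if node.load > node.capacity:
        actions.append("load-shedding")
    else:
        actions.append("load-stealing")
    # Step 2: adaptive replication. An item is indivisible (knapsack
    # property), so one that overloads the node by itself cannot be
    # fixed by moving it; it is replicated instead. The threshold used
    # here is an assumption of this sketch.
    for item, rate in sorted(node.rates.items()):
        if rate > node.capacity:
            actions.append(f"replicate:{item}")
    return actions


hot = Node("mds1", capacity=100.0, rates={"/a": 30.0, "/b": 150.0})
cool = Node("mds2", capacity=100.0, rates={"/c": 20.0})
print(plan(hot))
print(plan(cool))
```

In this sketch, diffusion handles imbalance that can be corrected by moving whole items between nodes, while replication is reserved for the case where a single item is too hot for any one node.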

Keywords metadata management      load balancing      adaptive cache diffusion      adaptive replication      cloud-scale file systems     
Corresponding Author(s): Quanqing XU   
Just Accepted Date: 08 June 2015   Issue Date: 10 November 2015
 Cite this article:   
Quanqing XU,Rajesh Vellore ARUMUGAM,Khai Leong YONG, et al. Adaptive and scalable load balancing for metadata server cluster in cloud-scale file systems[J]. Front. Comput. Sci., 2015, 9(6): 904-918.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-015-4560-9
https://academic.hep.com.cn/fcs/EN/Y2015/V9/I6/904
[1] Yihong GAO, Huadong MA. StreamTune: dynamic resource scheduling approach for workload skew in video data center[J]. Front. Comput. Sci., 2018, 12(4): 669-681.
[2] Xiong FU, Juzhou CHEN, Song DENG, Junchang WANG, Lin ZHANG. Layered virtual machine migration algorithm for network resource balancing in cloud computing[J]. Front. Comput. Sci., 2018, 12(1): 75-85.
[3] Cheqing JIN, Jie CHEN, Huiping LIU. MapReduce-based entity matching with multiple blocking functions[J]. Front. Comput. Sci., 2017, 11(5): 895-911.
[4] Yaobin HE, Haoyu TAN, Wuman LUO, Shengzhong FENG, Jianping FAN. MR-DBSCAN: a scalable MapReduce-based DBSCAN algorithm for heavily skewed data[J]. Front. Comput. Sci., 2014, 8(1): 83-99.
[5] Xuejun YANG, Xiangke LIAO, Weixia XU, Junqiang SONG, Qingfeng HU, Jinshu SU, Liquan XIAO, Kai LU, Qiang DOU, Juping JIANG, Canqun YANG. TH-1: China’s first petaflop supercomputer[J]. Front. Comput. Sci., 2010, 4(4): 445-455.
[6] Wenchao JIANG, Matthias BAUMGARTEN, Yanhong ZHOU, Hai JIN. A bipartite model for load balancing in grid computing environments[J]. Front. Comput. Sci., 2009, 3(4): 503-523.
[7] YANG Xuejun, WANG Panfeng, DU Yunfei, ZHOU Haifang. A data-distributed parallel algorithm for wavelet-based fusion of remote sensing images[J]. Front. Comput. Sci., 2007, 1(2): 231-240.