Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal Subscription Code 80-970

2018 Impact Factor: 1.129

Front. Comput. Sci.    2021, Vol. 15 Issue (2) : 152501    https://doi.org/10.1007/s11704-019-9099-8
RESEARCH ARTICLE
Hierarchical data replication strategy to improve performance in cloud computing
Najme MANSOURI1,2, Mohammad Masoud JAVIDI1,2, Behnam Mohammad Hasani ZADE1,2
1. Department of Computer Science, Shahid Bahonar University of Kerman, Kerman 76169-14111, Iran
2. Mahani Mathematical Research Center, Shahid Bahonar University of Kerman, Kerman 76169-14111, Iran
Abstract

Cloud computing is attracting growing interest as a new trend in data management. Data replication has been widely applied to improve data access in distributed systems such as Grid and Cloud. However, because each site has finite storage capacity, copies that would be useful for future jobs can be wastefully deleted and replaced with less valuable ones. It is therefore important to have an appropriate replication strategy that can dynamically store replicas while satisfying quality of service (QoS) requirements and storage capacity constraints. In this paper, we present a dynamic replication algorithm named hierarchical data replication strategy (HDRS). HDRS consists of three parts: replica creation, which adaptively increases the number of replicas based on an exponential growth or decay rate; replica placement, which is based on the access load and a labeling technique; and replica replacement, which is based on the expected future value of a file. We evaluate different dynamic data replication methods using the CloudSim simulator. Experiments demonstrate that HDRS can reduce response time and bandwidth usage compared with other algorithms. This indicates that HDRS can identify a popular file and replicate it to the best site. The method avoids useless replications and decreases access latency by balancing the load across sites.
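
The abstract does not give the formulas behind replica creation, so the following is only a minimal sketch of how an exponential growth or decay rate could drive a replication decision. It assumes per-interval access counts for a file, estimates the average relative change between consecutive intervals, and extrapolates one interval ahead. The function names, the averaging scheme, and the threshold are hypothetical illustrations and are not taken from HDRS.

```python
def average_growth_rate(access_counts):
    """Mean relative change between consecutive intervals (assumed model).

    A positive value means accesses are growing, a negative value means decay.
    Intervals with zero accesses are skipped to avoid division by zero.
    """
    rates = []
    for prev, curr in zip(access_counts, access_counts[1:]):
        if prev > 0:
            rates.append((curr - prev) / prev)
    return sum(rates) / len(rates) if rates else 0.0


def predict_next_accesses(access_counts):
    """Extrapolate the next interval's access count as N_next = N_last * (1 + r)."""
    if not access_counts:
        return 0.0
    r = average_growth_rate(access_counts)
    return access_counts[-1] * (1 + r)


def should_replicate(access_counts, threshold):
    """Hypothetical rule: replicate when predicted accesses exceed a threshold."""
    return predict_next_accesses(access_counts) > threshold


if __name__ == "__main__":
    growing = [10, 20, 40]   # accesses double each interval
    decaying = [40, 20, 10]  # accesses halve each interval
    print(predict_next_accesses(growing), should_replicate(growing, 50))
    print(predict_next_accesses(decaying), should_replicate(decaying, 50))
```

Under these assumptions, a file whose accesses are growing is flagged for replication before its load peaks, while a file with decaying accesses is left alone even if its past counts were high.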

Keywords: cloud computing, data replication, multi-tier architecture, simulation, load balance
Corresponding Author(s): Najme MANSOURI   
Just Accepted Date: 19 November 2019   Issue Date: 01 December 2020
 Cite this article:   
Najme MANSOURI,Mohammad Masoud JAVIDI,Behnam Mohammad Hasani ZADE. Hierarchical data replication strategy to improve performance in cloud computing[J]. Front. Comput. Sci., 2021, 15(2): 152501.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-019-9099-8
https://academic.hep.com.cn/fcs/EN/Y2021/V15/I2/152501