1. State Key Laboratory of Software Development Environment, Beihang University, Beijing 100191, China
2. School of Computer Science and Engineering, Beihang University, Beijing 100191, China
3. High Performance Computing Center, Beihang University, Beijing 100191, China
4. College of Software, Beihang University, Beijing 100191, China
5. Hebei Key Laboratory of Agricultural Big Data, Hebei Agricultural University, Baoding 071001, China
6. College of Computer Science and Technology, Zhejiang University, Hangzhou 310013, China
The authors of this paper previously proposed the global virtual data space system (GVDS), which aggregates the scattered and autonomous storage resources of China’s national supercomputer grid (National Supercomputing Center in Guangzhou, National Supercomputing Center in Jinan, National Supercomputing Center in Changsha, Shanghai Supercomputing Center, and Computer Network Information Center of the Chinese Academy of Sciences) into a single storage system that spans the wide area network (WAN) and provides unified management of global storage resources in China. The GVDS has been successfully deployed in the China National Grid environment. However, when remote data is accessed and shared over the WAN, the GVDS transmits data redundantly and wastes a large amount of network bandwidth. In this paper, we propose an edge cache system that supplements the GVDS and improves the performance of upper-level applications that access and share remote data. Specifically, we first design the architecture of the edge cache system and then study its key technologies: an edge cache index mechanism based on double-layer hashing, an edge cache replacement strategy based on the GDSF algorithm, request routing based on consistent hashing, and cluster membership maintenance based on the SWIM protocol. The experimental results show that, in terms of function, the edge cache system implements the relevant operations (read, write, deletion, modification, etc.) and is compatible with the POSIX interface. In terms of performance, when the accessed file resides in the edge cache system, it greatly reduces the amount of data transmitted and increases the data access bandwidth, so that its performance is close to that of a network file system in a local area network (LAN).
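The request routing named above relies on consistent hashing, and a minimal sketch can make the idea concrete. The code below is not the authors' implementation: the class `HashRing`, the use of MD5, the virtual-node count, and the node names are all assumptions chosen for illustration. It shows how a file's global path could be mapped to an edge cache node so that removing a node remaps only the paths that node owned.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a ring position (MD5 is an illustrative choice)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring with virtual nodes (hypothetical helper)."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes      # virtual nodes per physical node (assumed value)
        self._points = []         # sorted ring positions
        self._owner = {}          # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        """Place the node's virtual nodes on the ring."""
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            if point not in self._owner:
                bisect.insort(self._points, point)
            self._owner[point] = node

    def remove(self, node: str) -> None:
        """Take the node's virtual nodes off the ring."""
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            if self._owner.get(point) == node:
                del self._owner[point]
                self._points.remove(point)

    def route(self, global_path: str) -> str:
        """Return the node owning the first ring position after the path's hash."""
        if not self._points:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._points, _hash(global_path))
        return self._owner[self._points[idx % len(self._points)]]
```

With a ring such as `HashRing(["edge-a", "edge-b", "edge-c"])`, calling `route("/gvds/zone/file1")` returns one of the three node names. After `remove("edge-c")`, paths that were routed to `edge-a` or `edge-b` keep their owner, which is the property that makes this scheme attractive when cluster membership changes.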
Jiantong HUO, Yaowen XU, Zhisheng HUO, Limin XIAO, Zhenxue HE. Research on key technologies of edge cache in virtual data space across WAN. Front. Comput. Sci., 2023, 17(1): 171102.
global path: file global path; ioproxy_info: the IO agent information that manages the file; relative path: the relative path of the IO agent; token: customer certificate; return: cached metadata information or failed
global path: file global path; ioproxy_info: the IO agent information that manages the file; buf: file buffer; offset: start position of file writing; length: the length of the data written in the file; return: the amount of data written, 0 (write successful), -1 (write failed)
global path: file global path; ioproxy_info: the IO agent information that manages the file; relative path: the relative path of the IO agent; buf: file buffer; offset: start position of file reading; length: the length of the data read in the file; return: the amount of data read (read successful), -1 (read failed)
File closing interface of edge cache system
Int64 EdgeCacheClose (string global path, string ioproxy_info, string relative path)
global path: file global path; ioproxy_info: the IO agent information that manages the file; relative path: the relative path of the IO agent; return: 0 (closed successfully), -1 (close failed)
Tab.2

| Interface | Specification | Return value | Interface description |
|---|---|---|---|
| Cache file add interface | POST /cache; JSON value; \r\n | Value 200 OK; 500 Internal Server Error | The management interface adds the file corresponding to the specified global path to the edge cache system |
| Cache file query interface | POST /cache; JSON value; \r\n | Value 200 OK; 500 Internal Server Error | The management interface queries the file corresponding to the specified global path in the edge cache system |
| Cache file update interface | POST /cache; JSON value; \r\n | Value 200 OK; 500 Internal Server Error | The management interface synchronizes the file corresponding to the specified global path in the edge cache system |
| Cache file delete interface | POST /cache; JSON value; \r\n | Value 200 OK; 500 Internal Server Error | The management interface deletes the file corresponding to the specified global path in the edge cache system |
| Edge cache cluster status query interface | GET /cluster | Value 200 OK; 500 Internal Server Error | The management interface obtains the cluster status information of the edge cache service nodes |
| Edge cache storage capacity status query interface | GET /storage | Value 200 OK; 500 Internal Server Error | The management interface queries the storage capacity status of the edge cache system |
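To illustrate how a client might address the management interfaces in Tab.2, the sketch below builds (but does not send) HTTP requests with Python's standard library. Only the methods, the paths `/cache`, `/cluster`, and `/storage`, and the 200/500 status codes come from the table. The base URL, the JSON field names `action` and `global_path`, and the helper function names are assumptions made for this example, since the table does not specify the JSON body.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8080"  # hypothetical address of an edge cache node


def build_cache_request(action: str, global_path: str, base_url: str = BASE_URL):
    """Build a POST /cache request.

    The JSON fields "action" and "global_path" are assumed; the source only
    states that the body is a JSON value.
    """
    body = json.dumps({"action": action, "global_path": global_path})
    return urllib.request.Request(
        base_url + "/cache",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_status_request(resource: str, base_url: str = BASE_URL):
    """Build a GET /cluster or GET /storage request."""
    if resource not in ("cluster", "storage"):
        raise ValueError("resource must be 'cluster' or 'storage'")
    return urllib.request.Request(f"{base_url}/{resource}", method="GET")


def is_success(status_code: int) -> bool:
    """Tab.2 lists 200 OK for success and 500 Internal Server Error for failure."""
    return status_code == 200
```

A request built this way could be passed to `urllib.request.urlopen` against a running service; the real body schema would have to be taken from the system's own documentation.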