Kun WANG, Song WU, Shengbang LI, Zhuo HUANG, Hao FAN, Chen YU, Hai JIN
National Engineering Research Center for Big Data Technology and System, Services Computing Technology and System Lab, Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
Container-based virtualization is becoming increasingly popular in cloud computing due to its efficiency and flexibility. Resource isolation is a fundamental property of containers. Existing works have shown that weak resource isolation can cause significant performance degradation for containerized applications, and have proposed ways to enhance resource isolation. However, current studies have rarely discussed the isolation problems of page cache, which is a key resource for containers. Containers leverage the memory cgroup to control page cache usage. Unfortunately, the existing policy introduces two major problems in a container-based environment. First, containers can use more memory than their cgroup limit allows, effectively breaking memory isolation. Second, the OS kernel has to evict page cache to make space for newly arrived memory requests, slowing down containerized applications. This paper performs an empirical study of these problems and demonstrates their performance impact on containerized applications. We then propose pCache (precise control of page cache), which addresses these problems by dividing page cache into private and shared parts and controlling each kind separately and precisely. To do so, pCache leverages two new techniques: fair account (f-account) and evict on demand (EoD). F-account splits the charge for shared page cache among containers according to each container's share, which prevents containers from using memory for free and thereby enhances memory isolation. EoD reduces unnecessary page cache evictions to avoid the resulting performance impact. The evaluation results demonstrate that our system can effectively enhance memory isolation for containers and achieve substantial performance improvement over the original page cache management policy.
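The accounting idea behind f-account can be illustrated with a small user-space sketch. It is not the pCache implementation: the function names, the `page_users` mapping, and the equal split among sharers are all assumptions made for illustration, since the abstract only says the charge is split "based on per-container share". The baseline mirrors the first-touch charging that the Linux memory cgroup applies to shared pages.

```python
from collections import defaultdict

PAGE_SIZE = 4096  # bytes; assumed page size for this illustration


def charge_first_touch(page_users):
    """Baseline: charge each cached page entirely to the first container
    that touched it (first entry of the page's user list)."""
    charges = defaultdict(float)
    for users in page_users.values():
        charges[users[0]] += PAGE_SIZE
    return dict(charges)


def charge_fair(page_users):
    """Hypothetical f-account-style charging: split each page's cost
    equally among all containers using it (the equal split is an
    assumption, not the paper's stated formula)."""
    charges = defaultdict(float)
    for users in page_users.values():
        share = PAGE_SIZE / len(users)
        for container in users:
            charges[container] += share
    return dict(charges)


# Hypothetical workload: pages 0 and 2 are shared by containers A and B,
# page 1 is private to A. A touched every page first.
example = {0: ["A", "B"], 1: ["A"], 2: ["A", "B"]}
first_touch = charge_first_touch(example)
fair = charge_fair(example)
```

In this example, first-touch charging bills container B nothing even though it uses two cached pages, while the split charging bills B for half of each shared page. That is the "using memory for free" effect that f-account is meant to remove.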