Frontiers of Computer Science

ISSN 2095-2228

ISSN 2095-2236(Online)

CN 10-1014/TP

Postal Subscription Code 80-970

2018 Impact Factor: 1.129

Front. Comput. Sci.    2024, Vol. 18 Issue (4) : 184504    https://doi.org/10.1007/s11704-023-1346-3
Networks and Communication
Multi-user reinforcement learning based task migration in mobile edge computing
Yuya CUI (1, 2), Degan ZHANG (1), Jie ZHANG (3), Ting ZHANG (4), Lixiang CAO (1), Lu CHEN (1)
1. Tianjin Key Lab of Intelligent Computing and Novel Software Technology, Tianjin University of Technology, Tianjin 300384, China
2. School of Internet of Things Engineering, Jiangsu Vocational College of Information Technology, Wuxi 214153, China
3. School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing 100044, China
4. School of Sports Economics and Management, Tianjin University of Sport, Tianjin 301617, China
Abstract

Mobile Edge Computing (MEC) is a promising computing paradigm, and dynamic service migration is one of its key technologies. To maintain service continuity in a dynamic environment, mobile users need to migrate tasks between multiple servers in real time. Because user movement is uncertain, frequent migration increases delay and cost, while not migrating leads to service interruption. Designing an optimal migration strategy is therefore very challenging. In this paper, we investigate the multi-user task migration problem in a dynamic environment and minimize the average service delay while satisfying the migration cost budget. To optimize the service delay and migration cost, we propose an adaptive weight deep deterministic policy gradient (AWDDPG) algorithm, and we adopt centralized training with distributed execution to handle the high-dimensional problem. Experiments show that the proposed algorithm greatly reduces the migration cost and service delay compared with other related algorithms.
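
The problem stated in the abstract can be read as a constrained optimization. The formulation below is only an illustrative sketch: the symbols \pi (migration policy), T_n (service delay of user n) and C_n (migration cost of user n) are assumptions introduced here, and a single system-wide budget is also an assumption. N and Cost_{budget} follow the notation of Tab.1. The paper's own formulation may differ.

```latex
\min_{\pi} \; \frac{1}{N} \sum_{n=1}^{N} T_n(\pi)
\qquad \text{s.t.} \qquad
\sum_{n=1}^{N} C_n(\pi) \le Cost_{budget}
```

Under this reading, migrating more often can lower T_n by keeping the task close to the user but raises C_n, which is the trade-off the abstract describes.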

Keywords: mobile edge computing, mobility, service migration, deep reinforcement learning, deep deterministic policy gradient
Corresponding Author(s): Degan ZHANG   
Just Accepted Date: 11 May 2023   Issue Date: 12 July 2023
 Cite this article:   
Yuya CUI,Degan ZHANG,Jie ZHANG, et al. Multi-user reinforcement learning based task migration in mobile edge computing[J]. Front. Comput. Sci., 2024, 18(4): 184504.
 URL:  
https://academic.hep.com.cn/fcs/EN/10.1007/s11704-023-1346-3
https://academic.hep.com.cn/fcs/EN/Y2024/V18/I4/184504
Fig.1  System model
Fig.2  DDPG framework
  
Fig.3  AWDDPG framework
Fig.4  Centralized training framework based on AWDDPG
  
Parameter                                                 Value
Number of IoT devices, N                                  [60–140]
Number of MEC servers, M                                  [15, 20]
Input data size, b_n                                      [100, 500] KB
Bandwidth, B_m                                            100 MHz
Computing power (CPU frequency) of edge servers, F_m      [10, 30] GHz
Migration cost budget, Cost_budget                        [0.5–3] GJ
Channel gain, G_{m,n}                                     10^-6
Transmission power, P_n                                   1 W
White noise power                                         10^-9 W
Penalty, C                                                -0.6
Communication distance of the server                      [50, 100] m
Bandwidth between mobile device and MEC server            15 MHz
Tab.1  Experimental parameters
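
To give a feel for the magnitudes in Tab.1, the following sketch estimates the upload time of one task. It assumes a single interference-free link whose rate follows the Shannon capacity formula; that rate model, the function names, and the convention 1 KB = 1000 bytes are assumptions made for illustration and are not taken from the paper.

```python
import math

# Values from Tab.1; the rate model below is an assumption.
BANDWIDTH_HZ = 15e6    # bandwidth between mobile device and MEC server
TX_POWER_W = 1.0       # transmission power P_n
CHANNEL_GAIN = 1e-6    # channel gain G_{m,n}
NOISE_POWER_W = 1e-9   # white noise power


def uplink_rate_bps(bandwidth_hz, tx_power_w, channel_gain, noise_power_w):
    """Shannon capacity in bit/s of one interference-free link."""
    snr = tx_power_w * channel_gain / noise_power_w
    return bandwidth_hz * math.log2(1.0 + snr)


def upload_delay_s(data_kb, rate_bps):
    """Time in seconds to upload data_kb kilobytes (1 KB taken as 1000 bytes)."""
    return data_kb * 1000 * 8 / rate_bps


rate = uplink_rate_bps(BANDWIDTH_HZ, TX_POWER_W, CHANNEL_GAIN, NOISE_POWER_W)
for size_kb in (100, 500):  # the two ends of the input data size range
    print(f"{size_kb} KB -> {upload_delay_s(size_kb, rate) * 1e3:.2f} ms")
```

Under these assumptions the link carries about 150 Mbit/s, so the largest input in Tab.1 (500 KB) uploads in roughly 27 ms. Execution time on the server is not estimated here because the table gives no cycles-per-bit figure.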
Parameter                        Value
Memory pool capacity, D          10000
Mini-batch size, K               32
Learning rate of the Actor       0.0001
Learning rate of the Critic      0.001
κa/κc                            0.001
Discount factor, μ               0.9
Tab.2  AWDDPG parameter setting
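
The settings in Tab.2 map onto two standard building blocks of DDPG-style training: an experience replay memory and a soft update of the target networks. The sketch below shows both with the tabulated values. Tab.2 does not define κa/κc, so treating it as the soft-update rate of the actor and critic target networks is an assumption, as are all class and function names. This is generic DDPG machinery, not the AWDDPG algorithm itself.

```python
import random
from collections import deque

MEMORY_CAPACITY = 10000  # memory pool capacity D (Tab.2)
BATCH_SIZE = 32          # mini-batch size K (Tab.2)
KAPPA = 0.001            # κa/κc, ASSUMED to be the soft-update rate


class ReplayMemory:
    """Fixed-size FIFO buffer of (state, action, reward, next_state) tuples."""

    def __init__(self, capacity=MEMORY_CAPACITY):
        # A deque with maxlen silently discards the oldest item when full.
        self.buffer = deque(maxlen=capacity)

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size=BATCH_SIZE):
        # Uniform sampling without replacement.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def soft_update(target_params, online_params, kappa=KAPPA):
    """Move each target parameter a fraction kappa toward its online value."""
    return [(1.0 - kappa) * t + kappa * o
            for t, o in zip(target_params, online_params)]


memory = ReplayMemory()
for step in range(MEMORY_CAPACITY + 50):
    memory.push((step, 0, 0.0, step + 1))  # dummy transitions
batch = memory.sample()
new_target = soft_update([0.0, 1.0], [1.0, 1.0])
```

With a rate this small the target networks change slowly, which is the usual way DDPG stabilizes the critic's learning targets.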
Fig.5  The convergence of AWDDPG. (a) The rewards of AWDDPG and DDPG algorithms; (b) the loss function of the AWDDPG algorithm
Fig.6  The effect of learning efficiency on average completion time
Fig.7  System reward with different discount factors. (a) μ=0.9; (b) μ=0.8
Fig.8  The migration cost and execution delay. (a) The migration cost of the whole system; (b) the task execution delay of the whole system; (c) the geographic information between the edge server and the mobile user at time t
Fig.9  The comparisons of average completion time. (a) The average completion time of different input data sizes; (b) the average completion time of different numbers of users; (c) the average completion time of different numbers of MEC servers; (d) the average completion time of different migration cost budgets
Fig.10  The average migration cost for different input data sizes
  
  
  
  
  
  