A multi-agent collaboration scheme for energy-efficient task scheduling in a 3D UAV-MEC space

doi:10.1631/FITEE.2300393

Frontiers of Information Technology & Electronic Engineering

2024, Vol. 25, Issue 6: 824-838 https://doi.org/10.1631/FITEE.2300393

Yang LI¹, Ziling WEI¹, Jinshu SU¹,², Baokang ZHAO¹

¹. College of Computer, National University of Defense Technology, Changsha 410073, China
². Academy of Military Sciences, Beijing 100091, China


Abstract:

Multi-access edge computing (MEC) provides computing services at the network edge to meet the heavy processing demands of intelligent applications. Because unmanned aerial vehicles (UAVs) are highly maneuverable, they can serve as temporary aerial edge nodes that provide edge services to ground users in MEC. However, the MEC environment is usually dynamic and complicated, so selecting appropriate service strategies for multiple UAVs is challenging. Moreover, most existing work on UAV-MEC assumes that UAVs fly at a fixed height, i.e., within a two-dimensional plane, which neglects the importance of flight height. In this paper, taking co-channel interference into account, we investigate an energy-efficiency optimization problem that maximizes the number of fulfilled tasks, in which multiple UAVs in a three-dimensional space collaboratively perform task computation for ground users. To optimize energy efficiency, that is, to maximize the number of fulfilled tasks while minimizing flight energy consumption, we seek the optimal flight and sub-channel selection strategies for the UAVs and the optimal scheduling strategies for the tasks. Based on the multi-agent deep deterministic policy gradient (MADDPG) algorithm, we propose a curiosity-driven and twin-networks-structured MADDPG (CTMADDPG) algorithm to solve the formulated problem. It uses an intrinsic reward to encourage agents to explore the state space, which avoids convergence to a sub-optimal strategy. Furthermore, it adopts twin critic networks to reduce the probability of Q-value overestimation and thereby stabilize updates. Simulation results show that CTMADDPG outperforms the benchmark algorithms in maximizing the energy efficiency of the whole system.
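
The two mechanisms named in the abstract can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes that the curiosity signal is the prediction error of a forward model, as in common curiosity-driven methods, and that the twin critics are combined by taking the minimum of their target estimates, as in clipped double Q-learning; the abstract specifies neither detail. The function names, the weight `eta`, and the discount factor `gamma` are hypothetical.

```python
from typing import Sequence


def intrinsic_reward(predicted_next_state: Sequence[float],
                     actual_next_state: Sequence[float],
                     eta: float = 0.5) -> float:
    """Curiosity bonus: scaled squared error of a forward model's prediction.

    Assumption: a larger prediction error indicates a less familiar state,
    so the agent receives a larger exploration bonus.
    """
    squared_error = sum((p - a) ** 2
                        for p, a in zip(predicted_next_state, actual_next_state))
    return eta * 0.5 * squared_error


def twin_critic_target(extrinsic_reward: float,
                       curiosity_bonus: float,
                       q1_next: float,
                       q2_next: float,
                       gamma: float = 0.95,
                       done: bool = False) -> float:
    """TD target built from the smaller of two target-critic estimates.

    Taking the minimum counteracts the upward bias of a single critic.
    """
    total_reward = extrinsic_reward + curiosity_bonus
    if done:
        # No bootstrapping from the next state at the end of an episode.
        return total_reward
    return total_reward + gamma * min(q1_next, q2_next)


if __name__ == "__main__":
    bonus = intrinsic_reward([1.0, 2.0], [1.0, 4.0], eta=0.5)
    target = twin_critic_target(1.0, bonus, q1_next=10.0, q2_next=8.0, gamma=0.5)
    print(bonus, target)  # prints: 1.0 6.0
```

In this toy call the second critic's lower estimate (8.0) is the one used, so a single over-optimistic critic cannot inflate the target on its own.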

Key words: Multi-access edge computing; Multi-agent reinforcement learning; Unmanned aerial vehicles; Task scheduling

Received: 2023-06-01 Published: 2024-08-06

Corresponding Author(s): Ziling WEI, Jinshu SU

E-mail: liyang20@nudt.edu.cn; weiziling@nudt.edu.cn; sjs@nudt.edu.cn; bkzhao@nudt.edu.cn

Cite this article:

Yang LI, Ziling WEI, Jinshu SU, Baokang ZHAO. A multi-agent collaboration scheme for energy-efficient task scheduling in a 3D UAV-MEC space. Front. Inform. Technol. Electron. Eng, 2024, 25(6): 824-838.

Link to this article:

https://academic.hep.com.cn/fitee/CN/10.1631/FITEE.2300393
https://academic.hep.com.cn/fitee/CN/Y2024/V25/I6/824

