Resource Allocation Based on Alternating Direction Multiplier Method and Deep Reinforcement Learning Algorithm

Journal of Beijing University of Posts and Telecommunications ›› 2022, Vol. 45 ›› Issue (6): 126-130.

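GUO Xingkang¹, SUN Jun²

1. Jiangsu Provincial Wireless Communications Laboratory, Nanjing University of Posts and Telecommunications
2. Nanjing University of Posts and Telecommunications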

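Received: 2022-03-01 Revised: 2022-07-06 Online: 2022-12-28 Published: 2022-11-24
Corresponding author: SUN Jun E-mail: sunjun@njupt.edu.cn
Supported by:
Open project of provincial and ministerial key laboratories; National Natural Science Foundation of China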

Abstract

To study resource allocation in dense networks under limited channel state information, a model-driven learning framework that combines the alternating direction method of multipliers with a deep reinforcement learning algorithm is proposed. Unlike data-driven frameworks, this framework can be modeled one-to-one for the specific problem at hand. The modeling of the resource allocation problem includes: alternately optimizing base station selection, power allocation, and subcarrier allocation with the alternating direction method of multipliers; using the deep reinforcement learning algorithm to optimize the weights, solve the objective function, and improve the algorithm's performance; using only effective channel state information rather than redundant information, which reduces communication overhead; and tightening the constraints on the parameters for users' minimum quality of service requirements, so that cell spectral efficiency is maximized while user experience is guaranteed. Simulation results show that the model-driven learning framework converges within a small number of iterations.

Key words: dense network, model-driven, resource allocation, deep reinforcement learning, alternating direction method of multipliers

CLC number: TN929.5

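GUO Xingkang, SUN Jun. Resource Allocation Based on Alternating Direction Multiplier Method and Deep Reinforcement Learning Algorithm[J]. Journal of Beijing University of Posts and Telecommunications, 2022, 45(6): 126-130.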
