Journal of Beijing University of Posts and Telecommunications

  • EI-indexed core journal

2022, Vol. 45, Issue (6): 126-130.

• Papers

Resource Allocation Based on Alternating Direction Multiplier Method and Deep Reinforcement Learning Algorithm

Guo Xingkang (郭兴康)1, Sun Jun (孙君)2

  1. Jiangsu Provincial Wireless Communication Laboratory, Nanjing University of Posts and Telecommunications
  2. Nanjing University of Posts and Telecommunications
  • Received: 2022-03-01 Revised: 2022-07-06 Online: 2022-12-28 Published: 2022-11-24
  • Corresponding author: Sun Jun E-mail: sunjun@njupt.edu.cn
  • Supported by:
    Open project of provincial and ministerial key laboratories; National Natural Science Foundation of China

Abstract: To address resource allocation in dense networks under limited channel state information (CSI), a model-driven learning framework is proposed that combines the alternating direction method of multipliers (ADMM) with a deep reinforcement learning (DRL) algorithm. Unlike data-driven frameworks, it allows the model to be tailored one-to-one to the specific problem. The resource allocation model comprises the following: base station selection, power allocation, and subcarrier allocation are optimized alternately with ADMM; a DRL algorithm optimizes the weights and solves the objective function, improving the performance of the algorithm; only effective CSI, rather than redundant information, is used, which reduces communication overhead; and the constraint on users' minimum quality of service requirements is tightened, so that cell spectral efficiency is maximized while user experience is guaranteed. Simulation results show that the model-driven learning framework converges within a small number of iterations.

Key words: dense network, model-driven, resource allocation, deep reinforcement learning, alternating direction multiplier method

CLC number:
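
The abstract does not give the authors' formulation, so the following is only a minimal sketch of how ADMM splits a resource allocation problem into simple alternating updates. It assumes a toy single-cell setting that is not taken from the paper: sum-rate maximization of log(1 + g_i p_i) under a total power budget. The function names (`project_budget`, `admm_power_allocation`, `water_filling`), the channel gains, and the penalty `rho` are all hypothetical. Base station selection, subcarrier allocation, the quality of service constraint, and the DRL weight tuning are omitted.

```python
import numpy as np


def project_budget(v, total):
    """Euclidean projection of v onto {z >= 0, sum(z) = total}."""
    u = np.sort(v)[::-1]                    # entries in descending order
    css = np.cumsum(u) - total
    k = np.arange(1, v.size + 1)
    keep = u - css / k > 0                  # true for a leading prefix
    theta = css[keep][-1] / k[keep][-1]     # shift that meets the budget
    return np.maximum(v - theta, 0.0)


def admm_power_allocation(gain, total, rho=1.0, iters=2000):
    """Toy ADMM: maximize sum(log(1 + gain * p)) s.t. p >= 0, sum(p) = total.

    Split as f(p) = -sum(log(1 + gain * p)) and h(z) = indicator of the
    budget set, coupled by the consensus constraint p = z (assumed setup).
    """
    p = np.zeros_like(gain)
    z = np.zeros_like(gain)
    u = np.zeros_like(gain)                 # scaled dual variable
    for _ in range(iters):
        v = z - u
        # p-update: per-user closed-form minimizer of
        # -log(1 + g * p) + (rho / 2) * (p - v)^2 (root of a quadratic)
        disc = (rho * (1.0 + gain * v)) ** 2 + 4.0 * rho * gain ** 2
        p = (rho * (gain * v - 1.0) + np.sqrt(disc)) / (2.0 * rho * gain)
        # z-update: project onto the power budget
        z = project_budget(p + u, total)
        # dual update: accumulate the consensus residual
        u = u + p - z
    return z


def water_filling(gain, total):
    """Closed-form reference solution of the same toy problem."""
    inv = np.sort(1.0 / gain)               # inverse gains, ascending
    for k in range(gain.size, 0, -1):
        level = (total + inv[:k].sum()) / k
        if level > inv[k - 1]:              # k users receive power
            break
    return np.maximum(level - 1.0 / gain, 0.0)


gains = np.array([2.0, 1.0, 0.5, 0.1])      # hypothetical channel gains
allocation = admm_power_allocation(gains, total=1.0)
reference = water_filling(gains, total=1.0)
```

Comparing `allocation` with `reference` is a quick sanity check for this toy case only. In a framework like the one the abstract describes, a parameter such as `rho` is the kind of weight a learning agent could tune, but that link is an assumption here rather than something the abstract specifies.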