Journal of Beijing University of Posts and Telecommunications

  • EI core journal

Journal of Beijing University of Posts and Telecommunications ›› 2023, Vol. 46 ›› Issue (2): 9-14.

• Computing Power Network and Distributed Cloud •

Joint Intelligent Optimization Scheme for Deterministic Scheduling and Routing in Computing Power Networks

Sun Guowei (孙国玮)1, Xu Fangmin (许方敏)1, Zhu Jinyu (朱瑾瑜)2, Zhang Hengsheng (张恒升)2, Zhao Chenglin (赵成林)3

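  1. Beijing University of Posts and Telecommunications
  2. Institute of Technology and Standards Research, China Academy of Information and Communications Technology
  3. School of Information and Communication Engineering, Beijing University of Posts and Telecommunications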
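  • Received: 2022-07-26 Revised: 2022-09-13 Online: 2023-04-28 Published: 2023-05-14
  • Corresponding author: Xu Fangmin (许方敏) E-mail: xufm@bupt.edu.cn
  • Funding:
    Research on the theory and methods of molecular signal detection in complex biological scenarios; 2021 Industrial Internet Innovation and Development Project; Research on key technologies for wireless edge intelligent collaboration in industrial Internet scenarios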

Deterministic Scheduling and Routing Joint Intelligent Optimization Scheme in Computing First Network

  • Received:2022-07-26 Revised:2022-09-13 Online:2023-04-28 Published:2023-05-14

Abstract: The computing first network (CFN) integrates heterogeneous computing power information with the network to improve resource utilization and transmission efficiency, while time-sensitive networking (TSN) guarantees low-latency, highly reliable transmission; combining the two enables efficient deterministic forwarding. However, jointly deciding resource scheduling and route planning in the CFN together with the gate control arrangement in TSN leads to too many decision variables, excessive computational complexity, and insufficient optimization performance. To address these problems, a fusion architecture is proposed that combines gate control arrangement based on IEEE 802.1Qbv with computing network route planning and computing resource scheduling. Building on deep reinforcement learning, an improved algorithm, RBDQN (reward-back deep Q-learning), is proposed to optimize the gate control list, and a greedy algorithm assists route path planning. The algorithm builds its utility function from three optimization indicators: average delay, energy consumption, and user satisfaction. Simulation results show that, compared with a genetic algorithm, RBDQN reduces the convergence time of small-scale scheduling problems by more than a factor of two and reduces it by a factor of several tens for multi-service, multi-node computing network problems. The algorithm also keeps the model from falling into a local optimum: compared with traditional DQN, its decisions improve the utility function value by more than 10%, and the convergence time needed to reach the same utility value drops by about 50%.

Key words: time-sensitive network, computing first network, deep reinforcement learning
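
The gate control list that the abstract refers to can be pictured as a cyclic timetable saying which traffic-class queues may transmit in each time slot. The following is a minimal Python sketch of that idea only, not the paper's RBDQN scheme: `GateEntry`, `wait_until_open`, and the example durations are hypothetical names and values chosen for illustration, and frame transmission time and guard bands are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateEntry:
    """One row of an illustrative gate control list (hypothetical type)."""
    duration_us: int        # how long this row stays active, in microseconds
    open_queues: frozenset  # traffic classes (0-7) whose gates are open


def wait_until_open(gcl, queue, t_us):
    """Return how long (in microseconds) a frame of `queue` arriving at
    absolute time `t_us` waits before its gate is open. The list repeats
    cyclically; a frame arriving while the gate is open waits 0."""
    if not any(queue in entry.open_queues for entry in gcl):
        raise ValueError(f"queue {queue} never opens in this gate control list")
    cycle_us = sum(entry.duration_us for entry in gcl)
    offset = t_us % cycle_us
    start = 0
    # Walk two consecutive cycles so a wrap-around into the next cycle is found.
    for entry in list(gcl) * 2:
        end = start + entry.duration_us
        if queue in entry.open_queues and end > offset:
            return max(0, start - offset)
        start = end
    raise AssertionError("unreachable: the queue opens at least once per cycle")


# Assumed example: 100 us reserved for time-critical class 7,
# followed by 400 us shared by best-effort classes 0 and 1.
example_gcl = [
    GateEntry(100, frozenset({7})),
    GateEntry(400, frozenset({0, 1})),
]

if __name__ == "__main__":
    print(wait_until_open(example_gcl, 7, 150))  # class 7 waits for the next cycle
    print(wait_until_open(example_gcl, 0, 50))   # class 0 waits for its window
```

An optimizer such as the one described in the abstract would search over the rows of such a list; here the list is fixed and only the resulting queuing delay is computed.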

CLC number: