Journal of Beijing University of Posts and Telecommunications

  • EI core journal

Journal of Beijing University of Posts and Telecommunications ›› 2020, Vol. 43 ›› Issue (2): 87-93. doi: 10.13190/j.jbupt.2019-103

• Paper

An Integrated Energy Service Channel Optimization Mechanism Based on Deep Reinforcement Learning

MA Qing-liu1, YU Peng1, WU Jia-hui1, XIONG Ao1, YAN Yong2

  1. State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China;
  2. State Grid Zhejiang Electric Power Company Limited, Hangzhou 310007, China
  • Received: 2019-05-31 Published: 2020-04-28
  • Corresponding author: YU Peng (1986-), male, associate professor, E-mail: yupeng@bupt.edu.cn
  • About the author: MA Qing-liu (1994-), male, master's student.
  • Supported by:
    State Grid Corporation of China Science and Technology Project "Key Technology Research, Development, and Application Demonstration of a Highly Trusted Intelligent Sensing and Interactive Integrated Service System" (52110418002V)

Abstract: To keep an integrated energy system running stably, the communication network that carries integrated energy services must be highly reliable and low-risk. Based on the channel requirements of these services, a deep reinforcement learning algorithm is proposed to find globally optimal paths for large-scale integrated energy services over the power communication network that carries them. The method takes overall delay and network load balance as its objectives, trains on the network topology and saves the model, and then obtains the optimal result through iterative learning. Simulation results show that the paths found by this method keep the overall delay short while also keeping the network load balanced. In addition, when the network is large and the number of services is high, the deep reinforcement learning algorithm can effectively improve computational efficiency.

Key words: deep reinforcement learning, routing optimization, time delay, load balancing
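
The abstract names two objectives, overall delay and network load balance, but this page does not give their formulas. The following is a minimal sketch of how such a joint objective might be scored for one candidate routing, under stated assumptions: the function names, the weight `delay_weight`, and the use of the standard deviation of per-link service counts as the imbalance measure are all hypothetical choices for illustration, not the authors' definitions.

```python
from statistics import pstdev


def path_links(path):
    """Return the undirected links (as sorted node pairs) along a node path."""
    return [tuple(sorted(pair)) for pair in zip(path, path[1:])]


def routing_cost(paths, link_delay, delay_weight=0.5):
    """Score a candidate routing; lower is better.

    paths: one node sequence per service.
    link_delay: {(u, v): delay} with u < v, covering every link in the network.
    delay_weight: assumed trade-off weight in [0, 1] (not from the paper).
    Returns (cost, total_delay, imbalance).
    """
    load = {link: 0 for link in link_delay}
    total_delay = 0.0
    for path in paths:
        for link in path_links(path):
            load[link] += 1                 # one more service on this link
            total_delay += link_delay[link]
    # Assumed imbalance measure: spread of service counts across all links.
    imbalance = pstdev(list(load.values()))
    cost = delay_weight * total_delay + (1 - delay_weight) * imbalance
    return cost, total_delay, imbalance


# Toy 4-node ring with unit link delays; two services from A to C.
ring = {("A", "B"): 1.0, ("B", "C"): 1.0, ("C", "D"): 1.0, ("A", "D"): 1.0}
concentrated = routing_cost([["A", "B", "C"], ["A", "B", "C"]], ring)
balanced = routing_cost([["A", "B", "C"], ["A", "D", "C"]], ring)
```

In this toy case both routings have the same total delay, so the balanced one scores lower only because its links carry equal load. A reinforcement learning agent could use the negated cost as its reward, although the reward actually used in the paper may differ.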

CLC number: