Journal of Beijing University of Posts and Telecommunications

  • EI

Journal of Beijing University of Posts and Telecommunications, 2019, Vol. 42, Issue (6): 43-48, 57. doi: 10.13190/j.jbupt.2019-140

• Papers •

Traffic Distribution Algorithm Based on Multi-Agent Reinforcement Learning

CHENG Chao1, TENG Jun-jie2, ZHAO Yan-ling3, SONG Mei1

  1. School of Electronic Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China;
  2. China Financial Certification Authority, Beijing 100054, China;
  3. Instrumentation Technology and Economy Institute, Beijing 100055, China
  • Received: 2019-07-10 Online: 2019-12-28 Published: 2019-11-15
  • Corresponding author: SONG Mei (1960-), female, professor, doctoral supervisor. E-mail: songm@bupt.edu.cn
  • About the author: CHENG Chao (1993-), male, master's student.
  • Supported by:
    National Key Research and Development Program of China (2018YFB1201500); National Natural Science Foundation of China (61871046); Beijing Natural Science Foundation (L171011); Beijing Municipal Major Special Project (Z181100003118012)

Abstract: Most research on traditional traffic engineering strategies focuses on constructing and solving mathematical models, whose computational complexity is too high. To address this, an experience-driven traffic distribution algorithm based on multi-agent reinforcement learning is proposed. The algorithm distributes traffic effectively over pre-computed paths without solving complex mathematical models, and so uses network resources efficiently and fully. It is trained centrally on the software-defined networking controller and, once training is complete, is executed in a distributed manner on the access switches or routers, which also avoids frequent interaction with the controller. Experimental results show that, compared with the shortest-path and equal-cost multi-path algorithms, the new algorithm effectively reduces the end-to-end delay of the network and increases its throughput.

Key words: traffic engineering, multi-agent reinforcement learning, software-defined networking, delay, throughput
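
The idea of distributing traffic over pre-computed paths, and the two baselines named in the abstract, can be illustrated with a minimal sketch. This is not the paper's algorithm: the two-path topology, the link capacities, the demand, and the helper names `link_loads` and `max_utilization` are all hypothetical, and the fixed ratio `[0.25, 0.75]` merely stands in for the split that a trained agent might output. Maximum link utilization is used here as a simple proxy for congestion; the paper itself reports delay and throughput.

```python
from typing import Dict, List, Tuple

Link = Tuple[str, str]


def link_loads(paths: List[List[str]], ratios: List[float],
               demand: float) -> Dict[Link, float]:
    """Split `demand` over `paths` by `ratios` and return the load on each link."""
    if len(paths) != len(ratios):
        raise ValueError("one ratio per path is required")
    if any(r < 0 for r in ratios) or abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must be non-negative and sum to 1")
    loads: Dict[Link, float] = {}
    for path, ratio in zip(paths, ratios):
        # Consecutive node pairs along the path are its links.
        for link in zip(path, path[1:]):
            loads[link] = loads.get(link, 0.0) + ratio * demand
    return loads


def max_utilization(loads: Dict[Link, float],
                    capacity: Dict[Link, float]) -> float:
    """Return the highest load-to-capacity ratio over all loaded links."""
    return max(load / capacity[link] for link, load in loads.items())


# Hypothetical topology: two pre-computed, equal-hop paths from s to t.
paths = [["s", "a", "t"], ["s", "b", "t"]]
capacity = {
    ("s", "a"): 10.0, ("a", "t"): 10.0,   # low-capacity path
    ("s", "b"): 30.0, ("b", "t"): 30.0,   # high-capacity path
}
demand = 12.0

# Shortest path: all traffic on a single path (here the first one).
shortest = link_loads(paths, [1.0, 0.0], demand)
# Equal-cost multi-path: an even split regardless of capacity.
ecmp = link_loads(paths, [0.5, 0.5], demand)
# Assumed stand-in for a learned policy: a capacity-aware split.
weighted = link_loads(paths, [0.25, 0.75], demand)

for name, loads in [("shortest", shortest), ("ecmp", ecmp), ("weighted", weighted)]:
    print(name, round(max_utilization(loads, capacity), 3))
```

In this toy setting the single-path choice overloads the low-capacity path, the even split still leaves it twice as loaded as the other, and the capacity-aware split balances both paths. A learning agent would have to discover such ratios from experience rather than from a solved optimization model.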

CLC Number: