Collaborative caching mechanism of nodes based on cooperative game and deep learning

Journal of Beijing University of Posts and Telecommunications, 2024, Vol. 47, Issue (3): 55-61.

Jin Ning, Zhou Wenqian, Zhou Xuying, Jin Xiaoping

College of Information Engineering, China Jiliang University

Received: 2023-04-25 Revised: 2023-06-11 Online: 2024-06-30 Published: 2024-06-13
Corresponding author: Zhou Xuying E-mail: xuyingzhou@cjlu.edu.cn
Funding:
National Natural Science Foundation of China (62201539); Fundamental Research Funds of China Jiliang University (2022YW61)

Abstract

Abstract: As wireless mobile communication develops, the tension between users' rapidly growing content demands and limited wireless network resources is intensifying. Sharing cached content among edge nodes through device-to-device (D2D) communication can improve users' quality of experience and reduce the traffic burden on the core network. For scenarios in which node cache space is limited, this paper models the collaborative caching problem as a cooperative game that accounts for interaction costs and individual rationality, with the aim of optimizing system utility. The game is analyzed in two cases according to whether utility can be transferred between nodes. In the transferable utility (TU) game, the conditions under which nodes form a stable grand coalition are derived, and the core is proved to be non-empty when the nodes' coalition costs satisfy certain conditions. In the non-transferable utility (NTU) game, rational nodes cannot be guaranteed to form a stable grand coalition, and the number of possible coalitions grows sharply with the number of users. A coalition formation algorithm based on deep reinforcement learning is therefore proposed to ensure that stable coalitions form within a limited time. Theoretical analysis and simulation results show that the proposed algorithm converges to a Nash-stable optimal or asymptotically optimal solution and outperforms the comparison algorithms.

Key words: node collaboration, content sharing, cooperative game, deep reinforcement learning
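
To make the NTU case in the abstract concrete, the following is a minimal Python sketch of what "Nash-stable" means for a coalition structure and why exhaustive search scales poorly. It is not the paper's deep reinforcement learning algorithm: it simply enumerates every partition of a small node set and tests each one. The payoff function `toy_utility` and its constants are hypothetical, chosen only for illustration; the paper's actual utility model is not given on this page.

```python
from typing import Callable, FrozenSet, Iterator, List

Coalition = FrozenSet[int]
Partition = List[Coalition]
Utility = Callable[[int, Coalition], float]


def partitions(nodes: List[int]) -> Iterator[Partition]:
    """Yield every partition of `nodes` into non-empty coalitions."""
    if not nodes:
        yield []
        return
    first, rest = nodes[0], nodes[1:]
    for sub in partitions(rest):
        # Put `first` into each existing coalition in turn ...
        for i, coalition in enumerate(sub):
            yield sub[:i] + [coalition | {first}] + sub[i + 1:]
        # ... or let it form a coalition of its own.
        yield sub + [frozenset({first})]


def toy_utility(node: int, coalition: Coalition) -> float:
    """Hypothetical per-node payoff: sharing gain minus interaction cost."""
    peers = len(coalition) - 1
    gain = 1.0 * min(peers, 2)  # assumption: gain saturates after two peers
    cost = 0.4 * peers          # assumption: cost grows linearly with peers
    return gain - cost


def is_nash_stable(partition: Partition, utility: Utility) -> bool:
    """True if no node strictly gains by moving alone to another coalition
    of the partition or by leaving to stand on its own."""
    for coalition in partition:
        for node in coalition:
            current = utility(node, coalition)
            targets = [c | {node} for c in partition if c != coalition]
            targets.append(frozenset({node}))
            if any(utility(node, t) > current + 1e-12 for t in targets):
                return False
    return True


if __name__ == "__main__":
    nodes = [0, 1, 2, 3]
    all_partitions = list(partitions(nodes))
    stable = [p for p in all_partitions if is_nash_stable(p, toy_utility)]
    print(len(all_partitions), "partitions,", len(stable), "Nash-stable")
```

The number of partitions follows the Bell numbers (15 for four nodes, 52 for five), so this brute-force check is only feasible for very small networks, which is consistent with the abstract's motivation for a learning-based coalition formation algorithm.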

CLC number: TN929.52

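Jin Ning, Zhou Wenqian, Zhou Xuying, Jin Xiaoping. Collaborative caching mechanism of nodes based on cooperative game and deep learning[J]. Journal of Beijing University of Posts and Telecommunications, 2024, 47(3): 55-61.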
