Journal of Beijing University of Posts and Telecommunications

  • EI core journal

Journal of Beijing University of Posts and Telecommunications ›› 2024, Vol. 47 ›› Issue (5): 29-34.

• Paper

Traffic Classification Using Domain-Based Graph Matching

DU Yuxin (杜玉鑫), HE Mingshu (何明枢), LU Zikui (路子逵), WANG Xinlei (王欣雷), WANG Xiaojuan (王小娟)

  1. Beijing University of Posts and Telecommunications
  • Received: 2023-10-18 Revised: 2023-12-15 Online: 2024-10-28 Published: 2024-11-10
  • Corresponding author: HE Mingshu E-mail: hemingshu@bupt.edu.cn

Abstract: To address encrypted traffic, unevenly distributed classes, and user-privacy concerns in network traffic classification, this paper proposes a domain-based graph matching method. The method characterizes network flows using only non-content features and uses graph matching to reduce the imbalance between classes, combining coarse-grained clustering with reliable graph matching. First, an unsupervised clustering framework examines the distributions and class similarities of traffic data using a small set of features; the clustering removes network-specific differences by aggregating network sessions into a few clusters, each described by its extracted primary features. Next, the correlations between clusters from the same network are used to construct a similarity graph. Finally, a graph matching algorithm that combines a graph neural network (GNN) with a graph matching network (GMN) identifies reliable correspondences between the cluster relationships of different networks. Clusters in the test network are thereby associated with clusters in the initial (training) network, so that each test cluster can be labeled according to its associated training cluster. Simulation results show that the proposed method reaches a classification accuracy of 96.8%, significantly outperforming existing methods.

Key words: coarse-grained clustering, traffic classification, graph matching algorithm, primary features
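
The three stages named in the abstract (clustering, similarity graph construction, and cross-network matching followed by label transfer) can be illustrated with a toy sketch. This is not the authors' implementation: plain k-means stands in for the unsupervised clustering framework, and an exhaustive search over permutations stands in for the learned GNN/GMN matcher. The two-dimensional flow features, the class names `web`, `video`, and `voip`, and the constant offset between the two networks are all hypothetical.

```python
from collections import Counter
from itertools import permutations
from math import dist


def kmeans(points, k, iters=20):
    """Lloyd's k-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[min(range(k), key=lambda i: dist(p, centers[i]))].append(p)
        # Recompute each centroid; keep the old one if a cluster is empty.
        centers = [tuple(sum(x) / len(g) for x in zip(*g)) if g else c
                   for g, c in zip(groups, centers)]
    assign = [min(range(k), key=lambda i: dist(p, centers[i])) for p in points]
    return centers, assign


def similarity_graph(centers):
    """Dense weighted adjacency matrix; similarity decays with centroid distance."""
    n = len(centers)
    return [[0.0 if i == j else 1.0 / (1.0 + dist(centers[i], centers[j]))
             for j in range(n)] for i in range(n)]


def match_graphs(g_train, g_test):
    """Exhaustive stand-in for learned matching (feasible only for tiny graphs).

    Returns perm, where perm[i] is the training node matched to test node i.
    """
    n = len(g_train)

    def cost(perm):
        return sum((g_test[i][j] - g_train[perm[i]][perm[j]]) ** 2
                   for i in range(n) for j in range(n))

    return min(permutations(range(n)), key=cost)


# Hypothetical labeled flows from the training network (two non-content features).
train = [((0.0, 0.0), "web"), ((0.2, 0.1), "web"),
         ((4.0, 0.0), "video"), ((4.1, 0.2), "video"),
         ((0.0, 9.0), "voip"), ((0.1, 9.2), "voip")]
# Hypothetical unlabeled flows from a test network whose features are offset.
test_flows = [(24.0, 5.0), (24.1, 5.2), (20.0, 14.0),
              (20.1, 14.2), (20.0, 5.0), (20.2, 5.1)]

train_centers, train_assign = kmeans([p for p, _ in train], 3)
test_centers, test_assign = kmeans(test_flows, 3)

# Label each training cluster by majority vote over its flows.
cluster_label = {}
for c in range(3):
    votes = Counter(lbl for (_, lbl), a in zip(train, train_assign) if a == c)
    cluster_label[c] = votes.most_common(1)[0][0]

perm = match_graphs(similarity_graph(train_centers), similarity_graph(test_centers))
predicted = [cluster_label[perm[a]] for a in test_assign]
print(predicted)
```

The matching step compares only the relationships between clusters, not the raw centroids, so in this toy setting the offset between the two networks does not affect the result. Exhaustive search grows factorially with the number of clusters, which is one reason a learned matcher would be preferred beyond a handful of clusters.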

CLC Number: