Journal of Beijing University of Posts and Telecommunications

  • EI-indexed core journal


Adaptive Density Peak Clustering Algorithm Based on Relative Quantities

Shao Zhuang (邵壮)

  1. Xidian University
  • Received: 2023-11-01 Revised: 2023-12-26 Published: 2024-07-18
  • Corresponding author: Shao Zhuang

Adaptive density peak clustering based on comparative quantities

  • Received: 2023-11-01 Revised: 2023-12-26 Published: 2024-07-18

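Abstract: Since its publication in Science in 2014, the density peak clustering (DPC) algorithm has been widely discussed and applied because of its simplicity and effectiveness. Studies have nevertheless found that DPC has several clear shortcomings. To address them, an adaptive density peak clustering algorithm based on relative quantities (ACDPC) is proposed. The algorithm introduces a new quantity, the relative local density, to identify cluster centers, which weakens the effect of differing cluster densities within a dataset on the clustering result. It uses another new quantity, the relative connectivity distance, to measure the similarity between clusters, which eliminates the effect of differing cluster sizes and makes the algorithm applicable to a wider range of datasets. By constructing an information entropy function of the relative local density, the algorithm determines the relevant parameters adaptively from the characteristics of the dataset, which makes it more intelligent. A new point allocation strategy avoids chain reactions. Experimental results show that ACDPC achieves considerably better clustering performance than the standard DPC algorithm and its improved variants.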

Keywords: clustering, density peak, adaptive density peak clustering, relative local density, relative connectivity distance

Abstract: Density peak clustering (DPC) was published in the journal Science in 2014 and has been widely discussed and applied because of its efficiency and simplicity. However, studies have found that it has some obvious shortcomings. To overcome these deficiencies, a novel clustering algorithm named adaptive density peak clustering based on comparative quantities (ACDPC) is proposed. In the improved algorithm, a new quantity named relative local density is used to identify cluster centers, which weakens the influence of differing cluster densities in the dataset. Another quantity, called relative connectivity distance, is applied to measure the similarity between clusters, which effectively eliminates the influence of differing cluster sizes in the dataset. Together, these enhance the applicability of the algorithm to different datasets. By constructing an information entropy function of the relative local density, parameters can be determined adaptively according to the characteristics of the dataset, which improves the intelligence of the algorithm. A new allocation strategy is proposed to avoid the 'chain reaction' effect. Experimental results show that, compared with DPC and its improved variants, the ACDPC algorithm performs considerably better.

Key words: clustering, density peak, adaptive density peak clustering, relative local density, relative connectivity distance
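
The abstract takes the standard DPC procedure as given. For orientation, the following is a minimal sketch of that baseline only (local density, distance to the nearest denser point, and assignment of each remaining point to the cluster of its nearest denser neighbor); it is not the ACDPC algorithm, whose relative quantities and entropy function are not specified in this abstract. The function name `dpc`, the parameter names, and the choice of centers by the largest product of density and distance are assumptions made for illustration; the original DPC paper selects centers by inspecting a decision graph.

```python
import math


def dpc(points, d_c, n_clusters):
    """Baseline density peak clustering sketch (not the ACDPC variant).

    points     : list of coordinate tuples
    d_c        : cutoff distance for the local density (user-chosen)
    n_clusters : number of centers to pick (assumption: chosen by rho * delta)
    """
    n = len(points)
    # Pairwise Euclidean distances.
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density rho_i: number of other points strictly closer than d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]

    # Process points from densest to sparsest; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    delta = [0.0] * n    # distance to the nearest point earlier in `order`
    parent = [-1] * n    # index of that nearest denser point
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest point has no denser neighbor: use its largest distance.
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i] = dist[i][j]
        parent[i] = j

    # Pick centers as the points with the largest rho * delta.
    centers = sorted(range(n), key=lambda i: (-rho[i] * delta[i], i))[:n_clusters]
    labels = [-1] * n
    for c, i in enumerate(centers):
        labels[i] = c

    # Each remaining point inherits the label of its nearest denser neighbor.
    # Visiting in density order guarantees the parent is already labeled.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return rho, delta, labels
```

Because every non-center point simply copies the label of its nearest denser neighbor, one wrong assignment propagates to all points that depend on it. This is the "chain reaction" that the abstract says the new allocation strategy avoids.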
