Journal of Beijing University of Posts and Telecommunications

  • EI core journal

Journal of Beijing University of Posts and Telecommunications ›› 2017, Vol. 40 ›› Issue (4): 54-59. doi: 10.13190/j.jbupt.2017.04.009

• Papers •

Robust Outlier Detection Algorithm Based on k-Nearest Neighbor Region Center Migration

ZHAO Jian-long¹, QU Hua¹,², ZHAO Ji-hong²,³

  1. School of Software Engineering, Xi'an Jiaotong University, Xi'an 710049, China;
  2. School of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an 710049, China;
  3. School of Communication and Information Engineering, Xi'an University of Posts and Telecommunications, Xi'an 710061, China
  • Received: 2016-09-12 Online: 2017-08-28 Published: 2017-07-10
  • About the authors: ZHAO Jian-long (b. 1992), male, PhD candidate, E-mail: z.jl199235@stu.xjtu.edu.cn; QU Hua (b. 1961), male, professor, doctoral supervisor.
  • Supported by:
    National Natural Science Foundation of China (61371087 and 61531013)

Abstract: Most distance-based and density-based outlier detection algorithms are sensitive to the nearest neighbor parameter k. To address this problem, a robust outlier detection criterion, the k-nearest neighbor region center offset outlier factor (COOF), is proposed to characterize the degree of abnormality of each data object. Each data object lies in a region formed with its k nearest neighbors, and the center of this region migrates as the nearest neighbor parameter k changes. In general, an outlier affects the center offset of the k-nearest neighbor region more than a normal object does. Based on this observation, the COOF of each data object is defined as the accumulated offset produced as the nearest neighbor parameter increases from one to k, and a robust outlier detection algorithm is built on COOF. Experiments on synthetic and real data show that COOF is insensitive to the parameter k and detects outliers more stably and accurately than the distance-based k nearest neighbor algorithm, the local distance-based outlier factor, and the density-based local outlier factor.

Key words: outlier detection, k nearest neighbor, local outlier factor, center offset outlier factor
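
The abstract defines COOF only in words, so the following Python sketch is one possible reading of it, not the authors' implementation. It assumes that the region for a data object consists of the object itself plus its i nearest neighbors, that the region center is the arithmetic mean of those points, that distances are Euclidean, and that the score is the plain, unnormalized sum of the center displacements as i grows from 1 to k. The function name `coof_scores` and the sample data are hypothetical.

```python
import math


def _euclid(a, b):
    """Euclidean distance between two equal-length coordinate sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def coof_scores(points, k):
    """Accumulated neighbor-region center offset for every point (sketch).

    Assumption: the region of a point is the point plus its i nearest
    neighbors, and its center is the mean of those points.
    """
    pts = [tuple(float(v) for v in p) for p in points]
    n = len(pts)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")

    scores = []
    for i, p in enumerate(pts):
        # Indices of the other points, nearest first (ties keep input order).
        neighbors = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: _euclid(p, pts[j]),
        )[:k]

        total = list(p)   # running coordinate sum of the region
        center = list(p)  # center before any neighbor is added
        score = 0.0
        for size, j in enumerate(neighbors, start=2):
            total = [t + v for t, v in zip(total, pts[j])]
            new_center = [t / size for t in total]
            # Accumulate how far the center moved when this neighbor joined.
            score += _euclid(center, new_center)
            center = new_center
        scores.append(score)
    return scores


# Hypothetical data: a small cluster near the origin and one distant point.
data = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5), (10, 10)]
scores = coof_scores(data, k=3)
most_abnormal = max(range(len(scores)), key=scores.__getitem__)
```

Under these assumptions, an isolated point pulls its region center a long way toward the cluster as neighbors are added, so its accumulated offset is large, while a point inside the cluster sees the center move only slightly. Whether the paper normalizes the offsets or weights them by k is not stated in the abstract.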

CLC Number: