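Journal of Beijing University of Posts and Telecommunications

  • EI core journal

Journal of Beijing University of Posts and Telecommunications ›› 2017, Vol. 40 ›› Issue (1): 36-41. doi: 10.13190/j.jbupt.2017.01.006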

• Papers •

Feature Selection Algorithm Based on Redundancy Analysis

QIU Li-ke, GUO Zhong-wen, LIU Qing, LIU Ying-jian, QIU Zhi-jin

  1. College of Information Science and Engineering, Ocean University of China, Qingdao 266200, China
  • Received: 2016-08-28 Online: 2017-02-28 Published: 2017-03-14
  • About the authors: QIU Li-ke (born 1979), female, PhD candidate, E-mail: qllike@163.com; GUO Zhong-wen (born 1965), male, professor, doctoral supervisor.
  • Funding:
    National Natural Science Foundation of China (61379127, 61379128)

Abstract: To address the difficulty of identifying redundant features, this article analyzes the relationship between two kinds of correlation, feature-feature correlation and feature-target correlation, and derives criteria for determining redundant features. On this basis, approximately redundant features are defined and a feature selection algorithm based on redundancy analysis is proposed. The algorithm removes irrelevant features and redundant features in two separate steps. Experimental results show that the proposed algorithm effectively reduces feature dimensionality and improves prediction accuracy.

Key words: feature selection, relevance, redundancy, Pearson correlation coefficient, prediction
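
The two-step idea in the abstract (drop irrelevant features, then drop redundant ones) can be illustrated with a generic Pearson-correlation filter. This is only a minimal sketch and not the authors' algorithm: the abstract does not state the paper's redundancy criterion or its definition of approximately redundant features, so the function names `pearson_abs` and `select_features`, the thresholds `relevance_min` and `redundancy_max`, and the greedy ordering are all assumptions made for illustration.

```python
import numpy as np


def pearson_abs(x, y):
    """Absolute Pearson correlation of two 1-D arrays; 0.0 if either is constant."""
    xc = x - x.mean()
    yc = y - y.mean()
    denom = np.sqrt((xc ** 2).sum() * (yc ** 2).sum())
    if denom == 0.0:
        return 0.0
    return float(abs((xc * yc).sum() / denom))


def select_features(X, y, relevance_min=0.1, redundancy_max=0.9):
    """Return sorted column indices kept by a two-step correlation filter.

    Both thresholds are hypothetical values chosen for this sketch.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    relevance = [pearson_abs(X[:, j], y) for j in range(X.shape[1])]

    # Step 1: remove irrelevant features (weak correlation with the target).
    candidates = [j for j in range(X.shape[1]) if relevance[j] >= relevance_min]

    # Step 2: remove redundant features. Visit candidates from most to least
    # relevant and keep one only if it is not strongly correlated with any
    # feature that has already been kept.
    candidates.sort(key=lambda j: relevance[j], reverse=True)
    selected = []
    for j in candidates:
        if all(pearson_abs(X[:, j], X[:, k]) < redundancy_max for k in selected):
            selected.append(j)
    return sorted(selected)


# Toy data: column 1 is a near-copy of column 0, column 2 is unrelated to y.
x0 = np.arange(8.0)
x2 = np.array([1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0])
x1 = 2.0 * x0 + 0.1 * x2
X = np.column_stack([x0, x1, x2])
y = 3.0 * x0 + 1.0

print(select_features(X, y))  # prints [0]
```

In this toy example, column 2 is removed in the first step because its correlation with the target is zero, and column 1 is removed in the second step because it is almost perfectly correlated with the more relevant column 0.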

CLC number: