Journal of Beijing University of Posts and Telecommunications

  • EI core journal

Journal of Beijing University of Posts and Telecommunications ›› 2016, Vol. 39 ›› Issue (s1): 72-75. doi: 10.13190/j.jbupt.2016.s.017

• Papers •

Sample Weighting Based Gene Feature Selection Model

RUI Lan-lan, ZHANG Jie, GUO Shao-yong, XIONG Ao

  1. State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Received: 2015-06-06 Online: 2016-06-28 Published: 2016-06-28
  • About the authors: RUI Lan-lan (b. 1979), female, associate professor; ZHANG Jie (b. 1990), female, master's student, E-mail: zhangheminjun@163.com.
  • Supported by:

    Foundation for Innovative Research Groups of the National Natural Science Foundation of China (61121061); National Natural Science Foundation of China (61302078, 61372108); Beijing Higher Education Young Elite Teacher Project (YETP0476)

Abstract:

To address the characteristics of gene expression profile data, a gene feature selection model based on sample weighting is proposed. First, a method for computing sample weights is presented. Second, the information gain criterion is improved by incorporating the sample weights and is used to measure the amount of information carried by each gene; the redundancy of information among genes is treated as gene noise interference, and gene feature selection models without and with de-noising are established. Finally, using four classifiers (support vector machine, logistic regression, neural network, and decision tree), the proposed model is compared with common gene selection models. Experimental results show that the proposed model achieves good stability without sacrificing classification performance.

Key words: feature selection, information gain, sample weight, noise interference
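
The abstract does not state how the sample weights enter the information gain criterion, so the following is only a minimal sketch under one plausible assumption: each sample contributes its weight, rather than a count of one, to the probability estimates in the entropy terms. The function names, the use of already discretized expression levels, and the requirement that weights be positive are all assumptions of this illustration, not details taken from the paper.

```python
import math
from collections import defaultdict


def weighted_entropy(labels, weights):
    """Shannon entropy (in bits) of the labels, counting each sample by its weight.

    Assumes all weights are positive.
    """
    total = sum(weights)
    mass = defaultdict(float)
    for label, weight in zip(labels, weights):
        mass[label] += weight
    return -sum((m / total) * math.log2(m / total) for m in mass.values() if m > 0)


def weighted_information_gain(feature, labels, weights):
    """Information gain of a discretized feature under sample weights.

    Hypothetical formulation: H_w(labels) - sum_v P_w(feature = v) * H_w(labels | v),
    where every probability is estimated from weight mass instead of raw counts.
    """
    total = sum(weights)
    groups = defaultdict(lambda: ([], []))
    for value, label, weight in zip(feature, labels, weights):
        groups[value][0].append(label)
        groups[value][1].append(weight)
    conditional = sum(
        (sum(ws) / total) * weighted_entropy(ys, ws) for ys, ws in groups.values()
    )
    return weighted_entropy(labels, weights) - conditional
```

With uniform weights this reduces to the ordinary information gain, so a gene whose discretized level separates the two classes perfectly scores 1 bit and a gene independent of the class scores 0. Ranking genes by this score and keeping the top few would correspond to the model without de-noising; the de-noising step described in the abstract is not sketched here because the source gives no detail on it.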

CLC number: