Journal of Beijing University of Posts and Telecommunications

  • EI-indexed core journal

Journal of Beijing University of Posts and Telecommunications ›› 2006, Vol. 29 ›› Issue (s2): 136-138. doi: 10.13190/jbupt.2006s2.136.300

• Papers •

Research on an Automatic Text Classification Model Based on Principal Component Analysis

ZHANG Jin1,2, LI Guang3, CAO Wu4, HU Rui-fen1

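  1. Department of Biomedical Engineering, Zhejiang University, Hangzhou 310027; 2. Software School, Hunan University, Changsha 410082; 3. State Key Laboratory of Industrial Control Technology, Zhejiang University, Hangzhou 310027; 4. School of Education, State University of New York at Potsdam, New York, USA
  • Received: 2006-09-27 Revised: 1900-01-01 Online: 2006-11-30 Published: 2006-11-30
  • Corresponding author: ZHANG Jin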

Research on Auto Text Classification Model Based on PCA

ZHANG Jin1,3, LI Guang2, CAO Wu4, HU Rui-fen1

  1. Dept. of Biomedical Engineering, Zhejiang University, Hangzhou 310027, China;
    2. National Laboratory of Industrial Control Technology, Zhejiang University, Hangzhou 310027, China;
    3. Software School, Hunan University, Changsha 410082, China;
    4. School of Education, State University of New York at Potsdam, New York, USA
  • Received: 2006-09-27 Revised: 1900-01-01 Online: 2006-11-30 Published: 2006-11-30
  • Contact: ZHANG Jin

Abstract:

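A text classification model based on a BP neural network and principal component analysis is proposed. The model uses principal component analysis to reduce the dimensionality of the feature matrix, and the parameters of the BP network are optimized step by step through a large number of simulation experiments. Simulation experiments on the 20 Newsgroups dataset show that the model performs well and achieves high classification accuracy.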

Key words: text classification, back-propagation neural network, principal component analysis

Abstract:

Based on BP neural networks and PCA, an automatic text classification model is presented. The model uses PCA to reduce the dimensionality of the feature matrix, and its parameters are optimized step by step using the results of a large number of simulations. Experimental results on the 20 Newsgroups dataset show that the model performs well and achieves high text classification accuracy.

Key words: text classification, back-propagation neural network, principal component analysis
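
The pipeline summarized in the abstract (PCA to shrink the feature matrix, then a back-propagation network for classification) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the synthetic Poisson "term count" corpus, the number of retained components, the hidden-layer size, the learning rate, and all function names below are assumptions chosen only for illustration, and the paper's own experiments use the 20 Newsgroups dataset.

```python
import numpy as np

def pca_fit(X, k):
    """Fit PCA on X (n_docs x n_terms); return the mean and top-k directions."""
    mean = X.mean(axis=0)
    # Rows of Vt are orthonormal principal directions, sorted by variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def pca_transform(X, mean, components):
    """Project centered documents onto the retained principal directions."""
    return (X - mean) @ components.T

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_bp(Z, Y, hidden=8, lr=0.5, epochs=1000, seed=0):
    """Train a one-hidden-layer sigmoid network with batch back-propagation
    on squared error. All hyperparameters here are illustrative guesses."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0, 0.5, (Z.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.5, (hidden, Y.shape[1]))
    b2 = np.zeros(Y.shape[1])
    losses = []
    for _ in range(epochs):
        H = sigmoid(Z @ W1 + b1)            # hidden activations
        O = sigmoid(H @ W2 + b2)            # output activations
        err = O - Y
        losses.append(float((err ** 2).mean()))
        dO = err * O * (1 - O)              # error signal at the output layer
        dH = (dO @ W2.T) * H * (1 - H)      # error propagated to the hidden layer
        W2 -= lr * H.T @ dO / len(Z)
        b2 -= lr * dO.mean(axis=0)
        W1 -= lr * Z.T @ dH / len(Z)
        b1 -= lr * dH.mean(axis=0)
    return (W1, b1, W2, b2), losses

def predict(params, Z):
    W1, b1, W2, b2 = params
    return sigmoid(sigmoid(Z @ W1 + b1) @ W2 + b2).argmax(axis=1)

def make_toy_corpus(n_per_class=40, n_terms=30, seed=42):
    """Synthetic two-class term-count matrix (an assumption, not real text)."""
    rng = np.random.default_rng(seed)
    rates = np.full((2, n_terms), 0.2)      # background term rate
    rates[0, :10] = 3.0                     # terms typical of class 0
    rates[1, 10:20] = 3.0                   # terms typical of class 1
    X = np.vstack([rng.poisson(rates[c], (n_per_class, n_terms)) for c in (0, 1)])
    labels = np.repeat([0, 1], n_per_class)
    return X.astype(float), labels

def run_demo(k=5):
    X, labels = make_toy_corpus()
    order = np.random.default_rng(0).permutation(len(X))
    train, test = order[::2], order[1::2]
    # Fit PCA on the training documents only, then reuse it for the test set.
    mean, comps = pca_fit(X[train], k)
    Z_train = pca_transform(X[train], mean, comps)
    Z_test = pca_transform(X[test], mean, comps)
    params, losses = train_bp(Z_train, np.eye(2)[labels[train]])
    accuracy = float((predict(params, Z_test) == labels[test]).mean())
    return comps, Z_train, losses, accuracy
```

Calling `run_demo()` returns the retained components, the reduced training matrix, the per-epoch training loss, and the held-out accuracy. In this sketch the network sees 5 inputs per document instead of 30, which is the kind of reduction the abstract attributes to PCA. Tuning the hidden-layer size or learning rate by repeated runs would correspond loosely to the parameter optimization the abstract mentions.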

CLC number: