Journal of Beijing University of Posts and Telecommunications

  • EI-indexed core journal

Journal of Beijing University of Posts and Telecommunications ›› 2019, Vol. 42 ›› Issue (1): 114-119. doi: 10.13190/j.jbupt.2018-067

• Research Report •

A News Recommendation Method Based on VSM and Bisecting K-means Clustering

YUAN Ren-jin, CHEN Gang, LI Feng, WEI Shuang-jian

  1. Institute of Geospatial Information, Information Engineering University, Zhengzhou 450052, China
  • Received: 2018-04-16 Online: 2019-02-28 Published: 2019-03-08
  • Corresponding author: CHEN Gang (1971-), male, professor, doctoral supervisor, E-mail: chengang_vge@sina.com
  • About the author: YUAN Ren-jin (1994-), male, master's student.
  • Supported by:
    National Natural Science Foundation of China (41301428)

Abstract: Personalized recommendation is one way to address the information overload that massive volumes of news data impose on users. To improve the personalized news-reading experience, a news recommendation method that combines the vector space model (VSM) with Bisecting K-means clustering is proposed. First, the news texts are vectorized: the vector space model and the TF-IDF algorithm are used to construct news feature vectors. Next, the Bisecting K-means clustering algorithm is applied to cluster the set of news feature vectors. The clustered news set is then divided into a training set and a test set, and a three-level "user-news category-news" user interest model is built from the training set. Finally, cosine similarity is used to compute the news recommendation results, which are compared against the test set. User-based collaborative filtering, item-based collaborative filtering, and a recommendation method combining the vector space model with K-means clustering serve as baselines. The experimental results show that the proposed method is feasible and improves precision, recall, and F-value.

Key words: personalized recommendation, vector space model, Bisecting K-means clustering algorithm, user interest model
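
The pipeline summarized in the abstract (TF-IDF vectors, Bisecting K-means, a per-category interest profile, cosine-similarity ranking, then precision/recall/F) can be illustrated with a toy sketch in pure Python. This is not the authors' implementation. All function names are hypothetical, the deterministic seeding and the "split the cluster with the largest SSE" rule are assumptions, and the three-level interest model is reduced here to one mean vector per news category that the user has read.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Turn tokenized, non-empty documents into dense TF-IDF vectors."""
    vocab = sorted({t for d in docs for t in d})
    df = Counter(t for d in docs for t in set(d))  # document frequency
    n = len(docs)
    vectors = []
    for d in docs:
        tf = Counter(d)
        vectors.append([tf[t] / len(d) * math.log(n / df[t]) for t in vocab])
    return vectors

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def mean(vectors):
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def sse(vectors, ids):
    c = mean([vectors[i] for i in ids])
    return sum(sq_dist(vectors[i], c) for i in ids)

def two_means(vectors, ids, iters=20):
    """Split one cluster (a list of indices) in two with plain 2-means."""
    # Assumed seeding: the first member and the member farthest from it.
    c0 = vectors[ids[0]]
    c1 = vectors[max(ids, key=lambda i: sq_dist(vectors[i], c0))]
    split = None
    for _ in range(iters):
        left = [i for i in ids if sq_dist(vectors[i], c0) <= sq_dist(vectors[i], c1)]
        right = [i for i in ids if i not in left]
        if not left or not right or (left, right) == split:
            break  # degenerate split or converged: keep the last valid split
        split = (left, right)
        c0 = mean([vectors[i] for i in left])
        c1 = mean([vectors[i] for i in right])
    return split

def bisecting_kmeans(vectors, k):
    """Repeatedly bisect the cluster with the largest SSE until k clusters exist."""
    clusters = [list(range(len(vectors)))]
    while len(clusters) < k:
        worst = max(clusters, key=lambda ids: sse(vectors, ids))
        if sse(vectors, worst) < 1e-12:
            break  # nothing left worth splitting
        clusters.remove(worst)
        clusters.extend(two_means(vectors, worst))
    return clusters

def recommend(vectors, clusters, read_ids, top_n=3):
    """Rank unread articles from the categories the user has already read."""
    scored = []
    for ids in clusters:
        seen = [i for i in ids if i in read_ids]
        if not seen:
            continue  # no recorded interest in this category
        profile = mean([vectors[i] for i in seen])  # simplified interest vector
        scored += [(cosine(profile, vectors[i]), i) for i in ids if i not in read_ids]
    return [i for _, i in sorted(scored, reverse=True)[:top_n]]

def precision_recall_f(recommended, relevant):
    hits = len(set(recommended) & set(relevant))
    p = hits / len(recommended) if recommended else 0.0
    r = hits / len(relevant) if relevant else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

# Toy corpus (made up for illustration): three sports and three finance items.
docs = [
    ["football", "match", "goal"],
    ["football", "league", "goal"],
    ["match", "league", "football"],
    ["stock", "market", "bank"],
    ["bank", "market", "rate"],
    ["stock", "rate", "bank"],
]
vectors = tfidf_vectors(docs)
clusters = bisecting_kmeans(vectors, k=2)
recs = recommend(vectors, clusters, read_ids={0, 1})
print(clusters, recs, precision_recall_f(recs, relevant=[2]))
```

In this sketch a user who has read only sports items receives candidates only from the sports cluster, which is the effect the category layer of the interest model is meant to produce. A real system would replace the toy tokens with segmented news text and tune the number of clusters.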

CLC Number: