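Journal of Beijing University of Posts and Telecommunications

  • EI core journal

Journal of Beijing University of Posts and Telecommunications, 2011, Vol. 34, Issue (s1): 55-58. doi: 10.13190/jbupt.2011s1.55.duanrx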

• Article

User Intent Clustering with the HDP Topic Model

Duan Ruixue (段瑞雪), Wang Xiaojie (王小捷), Sun Yueping (孙月萍), Li Wenfeng (李文峰)

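  1. School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876
  • Publication date: 2011-10-28 Release date: 2011-10-28
  • About the authors: Duan Ruixue (born 1984), female, Ph.D. candidate, E-mail: duanruixue@gmail.com; Wang Xiaojie (born 1969), male, professor and doctoral supervisor
  • Supported by:

    National Natural Science Foundation of China (90920006)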

Clustering User Goals Based on Hierarchical Dirichlet Process Topic Model

    

     

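Abstract (translated from the Chinese):

To better understand user intent in web search, we propose clustering user intents with the hierarchical Dirichlet process (HDP). Verbs are good indicators of user intent, so clustering verbs well yields a better clustering of user intents. Each verb is represented as a document made up of the nouns that have a dependency relation with the verb and the nouns that co-occur with it. Experimental results show that the HDP model, which adds a document level, achieves better clustering performance than the latent Dirichlet allocation model and the Dirichlet process mixture model (DPMM).

Keywords (translated from the Chinese): user intent, dependency relation, verb clustering, latent Dirichlet allocation model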

Abstract:

To better understand users' goals in web search, an approach is proposed that uses the hierarchical Dirichlet process (HDP) model to cluster the verbs that express those goals. Verbs are good indicators of users' goals, so good verb clustering leads to good clustering of users' goals. Each verb is represented by the nouns that co-occur with it and the nouns that have a dependency relation with it. Experiments show that HDP performs better at verb clustering than latent Dirichlet allocation (LDA) and the Dirichlet process mixture model (DPMM).

Key words: user goals, dependency relation, verb clustering, latent Dirichlet allocation
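
The verb representation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Token` format, the `build_verb_documents` helper, the `dep:`/`co:` feature prefixes, and the toy English queries are all assumptions made for illustration, and the queries are assumed to be already tagged and dependency-parsed.

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class Token(NamedTuple):
    text: str
    pos: str   # coarse part-of-speech tag, e.g. "VERB" or "NOUN"
    head: int  # index of the syntactic head within the query, -1 for the root


def build_verb_documents(queries):
    """Map each verb to a bag of nouns taken from the queries it appears in."""
    docs = defaultdict(Counter)
    for tokens in queries:
        for i, tok in enumerate(tokens):
            if tok.pos != "VERB":
                continue
            for j, other in enumerate(tokens):
                if other.pos != "NOUN":
                    continue
                # A noun is dependency-linked if it attaches to the verb
                # or the verb attaches to it.
                linked = other.head == i or tok.head == j
                # Keeping linked and merely co-occurring nouns as separate
                # features is an assumption of this sketch.
                prefix = "dep:" if linked else "co:"
                docs[tok.text][prefix + other.text] += 1
    return docs


# Hypothetical pre-parsed queries.
queries = [
    [Token("buy", "VERB", -1), Token("cheap", "ADJ", 2), Token("laptop", "NOUN", 0)],
    [Token("download", "VERB", -1), Token("song", "NOUN", 0),
     Token("for", "ADP", 3), Token("phone", "NOUN", 1)],
    [Token("buy", "VERB", -1), Token("phone", "NOUN", 0)],
]

docs = build_verb_documents(queries)
for verb, bag in docs.items():
    print(verb, dict(bag))
```

Each resulting bag of nouns plays the role of one "verb document"; a collection of such documents could then be passed to an HDP topic model, with verbs grouped by their inferred topic proportions.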

CLC number: