Journal of Beijing University of Posts and Telecommunications

  • EI-indexed core journal

Journal of Beijing University of Posts and Telecommunications ›› 2012, Vol. 35 ›› Issue (3): 91-94. doi: 10.13190/jbupt.201203.91.wangch

• Research Reports •

A Topic Tracking Oriented Dirichlet Process Mixture Model

WANG Chan, WANG Xiao-jie, YUAN Cai-xia

  1. Center of Intelligent Science and Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Received: 2011-07-18 Revised: 2011-11-26 Online: 2012-06-28 Published: 2012-02-29
  • Contact: WANG Chan E-mail: wchan@bupt.edu.cn
  • About the authors: WANG Chan (b. 1986), female, Ph.D. candidate, E-mail: wchan@bupt.edu.cn; WANG Xiao-jie (b. 1969), male, professor and doctoral supervisor
  • Supported by:

    Major Research Plan cultivation project of the National Natural Science Foundation of China (90920006)

Abstract:

A Dirichlet process mixture model that makes effective use of information about known topics is proposed for topic tracking. Prior knowledge of the known topics is incorporated into Gibbs sampling during model inference, which yields the relevance of each new story to the known topics. Experiments show that the method does not require large-scale in-domain training data and can significantly improve topic tracking performance with only a few on-topic seed stories.

Key words: topic tracking, Dirichlet process mixture model, Gibbs sampling, known topics
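
The abstract does not spell out the sampler, so the following is only a minimal sketch of one way the idea could look, not the authors' algorithm. It assumes a Dirichlet-multinomial mixture under a Chinese restaurant process prior, a single known topic held as a fixed cluster built from the seed stories, and relevance measured as the fraction of post-burn-in sweeps in which a story is assigned to that cluster. The function names, the hyperparameters `alpha` and `beta`, and the word-id representation of stories are all illustrative assumptions.

```python
import math
import random
from collections import Counter


def log_predictive(doc, counts, total, vocab_size, beta):
    """Log Dirichlet-multinomial probability of a story's words given a cluster's counts."""
    n_doc = sum(doc.values())
    mass = total + vocab_size * beta
    lp = math.lgamma(mass) - math.lgamma(mass + n_doc)
    for word, n in doc.items():
        lp += math.lgamma(counts[word] + beta + n) - math.lgamma(counts[word] + beta)
    return lp


def relevance_to_known_topic(seed_stories, new_stories, vocab_size,
                             alpha=1.0, beta=0.1, sweeps=200, burn_in=50, seed=0):
    """Collapsed Gibbs sampling; returns, per new story, the share of kept sweeps
    in which it sits in the known-topic cluster (an assumed relevance measure)."""
    if not seed_stories:
        raise ValueError("at least one seed story is required")
    rng = random.Random(seed)
    docs = [Counter(story) for story in new_stories]
    KNOWN = 0  # cluster id reserved for the known topic; seed stories are never resampled
    counts = {KNOWN: Counter()}
    for story in seed_stories:
        counts[KNOWN].update(story)
    totals = {KNOWN: sum(counts[KNOWN].values())}
    sizes = {KNOWN: len(seed_stories)}
    assign = [None] * len(docs)
    hits = [0] * len(docs)
    next_id = 1
    empty = Counter()

    for sweep in range(sweeps):
        for i, doc in enumerate(docs):
            k = assign[i]
            if k is not None:
                # Remove story i from its current cluster before resampling it.
                counts[k].subtract(doc)
                totals[k] -= sum(doc.values())
                sizes[k] -= 1
                if sizes[k] == 0 and k != KNOWN:
                    del counts[k], totals[k], sizes[k]
            # Existing clusters: CRP weight (cluster size) times predictive likelihood.
            ids = list(sizes)
            logw = [math.log(sizes[c])
                    + log_predictive(doc, counts[c], totals[c], vocab_size, beta)
                    for c in ids]
            # New cluster: CRP weight alpha times likelihood under the prior alone.
            ids.append(next_id)
            logw.append(math.log(alpha) + log_predictive(doc, empty, 0, vocab_size, beta))
            top = max(logw)
            weights = [math.exp(x - top) for x in logw]
            k = rng.choices(ids, weights=weights)[0]
            if k == next_id:
                counts[k], totals[k], sizes[k] = Counter(), 0, 0
                next_id += 1
            counts[k].update(doc)
            totals[k] += sum(doc.values())
            sizes[k] += 1
            assign[i] = k
            if sweep >= burn_in and k == KNOWN:
                hits[i] += 1

    kept = sweeps - burn_in
    return [h / kept for h in hits]
```

In this sketch stories are lists of integer word ids. Calling `relevance_to_known_topic(seed_stories, new_stories, vocab_size)` returns one score in [0, 1] per new story, and thresholding that score would give an on-topic or off-topic decision. Pinning the seed stories is what lets a handful of them steer the sampler without any additional training corpus.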

CLC number: