Journal of Beijing University of Posts and Telecommunications

  • EI core journal

Journal of Beijing University of Posts and Telecommunications, 2006, Vol. 29, Issue (4): 1-5. doi: 10.13190/jbupt.200604.1.050

• Papers •

Chinese Pronominal Coreference Resolution Based on Decision Tree (基于决策树的汉语代词共指消解)

WANG Zhi-qiang (王智强), LI Lei (李蕾), WANG Cong (王枞)

  1. Research Center of Intelligent Sciences and Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Received: 2005-07-13 Revised: 1900-01-01 Online: 2006-08-30 Published: 2006-08-30
  • Contact: WANG Zhi-qiang

Abstract:

A decision tree method that combines statistics with rules is proposed for Chinese pronominal coreference resolution. Rules filter out negative examples whose attributes conflict, which partly compensates for the decision tree's tendency to ignore dependencies between attributes. The method is evaluated on the Chinese Treebank, with coreference relations and feature vectors annotated by hand: candidates are first filtered by the rules, and the C4.5 decision tree algorithm then selects the antecedent. The overall resolution success rate is 82.59%; the rates for personal pronouns and demonstrative pronouns are 87.60% and 75.21%, respectively.

Key words: natural language understanding, coreference resolution, Chinese pronoun, decision tree, filter rules

CLC Number:
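
The two-stage procedure described in the abstract (rule filtering, then C4.5 selection) can be sketched in Python as follows. This is a minimal illustration and not the authors' implementation: the attribute names (`gender`, `number`, `is_subject`, `same_sentence`), the toy examples, and the conflict rule are all assumptions, since the abstract does not list the actual features or rules. The second stage only computes the gain ratio that C4.5 uses to choose a split; it does not build or prune a full tree.

```python
from collections import Counter
from math import log2


def example(p_gender, p_number, c_gender, c_number, is_subject, same_sentence, coref):
    """Build one hypothetical (pronoun, candidate antecedent) training pair."""
    return {
        "p_gender": p_gender, "p_number": p_number,   # pronoun attributes
        "c_gender": c_gender, "c_number": c_number,   # candidate attributes
        "is_subject": is_subject,                     # assumed feature
        "same_sentence": same_sentence,               # assumed feature
        "coref": coref,                               # hand-labelled class
    }


# Toy data invented for illustration only.
EXAMPLES = [
    example("female", "singular", "female", "singular", "yes", "no", True),
    example("female", "singular", "male", "singular", "yes", "yes", False),
    example("female", "singular", "unknown", "singular", "no", "yes", False),
    example("unknown", "plural", "unknown", "plural", "yes", "yes", True),
    example("unknown", "plural", "unknown", "singular", "no", "no", False),
    example("male", "singular", "male", "singular", "no", "no", False),
    example("male", "singular", "male", "singular", "yes", "yes", True),
]


def conflicts(ex):
    """Assumed filter rule: known gender or number values must agree."""
    for attr in ("gender", "number"):
        p, c = ex["p_" + attr], ex["c_" + attr]
        if p != "unknown" and c != "unknown" and p != c:
            return True
    return False


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def gain_ratio(examples, feature):
    """C4.5 split criterion: information gain divided by split information."""
    labels = [ex["coref"] for ex in examples]
    groups = {}
    for ex in examples:
        groups.setdefault(ex[feature], []).append(ex["coref"])
    remainder = sum(len(g) / len(examples) * entropy(g) for g in groups.values())
    split_info = entropy([ex[feature] for ex in examples])
    gain = entropy(labels) - remainder
    return gain / split_info if split_info > 0 else 0.0


# Stage 1: drop pairs whose attributes conflict (guaranteed negative examples).
kept = [ex for ex in EXAMPLES if not conflicts(ex)]

# Stage 2: pick the feature C4.5 would split on first, by gain ratio.
FEATURES = ["is_subject", "same_sentence"]
best = max(FEATURES, key=lambda f: gain_ratio(kept, f))
print(len(kept), "pairs kept; first split on", best)
```

Filtering before training removes pairs that a single rule already decides, so the tree does not have to learn the interaction between, say, pronoun gender and candidate gender from separate attributes.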