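Journal of Beijing University of Posts and Telecommunications

  • EI-indexed core journal

Journal of Beijing University of Posts and Telecommunications ›› 2017, Vol. 40 ›› Issue (6): 115-119. doi: 10.13190/j.jbupt.2016-220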

• Research Reports •

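Entity Discovery and Linking Method Based on Random Walk with Restart

TAN Yong-mei1, ZHENG Di1, LIU Shu-wen1, LÜ Xue-qiang2

  1. Intelligence Science and Technology Center, Beijing University of Posts and Telecommunications, Beijing 100876, China;
  2. Beijing Key Laboratory of Internet Culture and Digital Dissemination Research, Beijing Information Science and Technology University, Beijing 100101, China
  • Received: 2016-12-12 Online: 2017-12-28 Published: 2017-12-28
  • About the author: TAN Yong-mei (born 1975), female, associate professor, E-mail: ymtan@bupt.edu.cn.
  • Supported by:
    Open Project of the Beijing Key Laboratory of Internet Culture and Digital Dissemination Research (ICDD201703); General Program of the National Natural Science Foundation of China (61671070)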

Entity Discovery and Linking Approach Based on Random Walk with Restart

TAN Yong-mei1, ZHENG Di1, LIU Shu-wen1, LÜ Xue-qiang2   

  1. Intelligence Science and Technology Center, Beijing University of Posts and Telecommunications, Beijing 100876, China;
  2. Beijing Key Laboratory of Internet Culture and Digital Dissemination Research, Beijing Information Science and Technology University, Beijing 100101, China
  • Received: 2016-12-12 Online: 2017-12-28 Published: 2017-12-28

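Abstract: An entity discovery and linking method based on random walk with restart is proposed. A random walk is performed on a graph built from a subset of the entities in the knowledge base, which yields distributed representations of entities and mentions; the entity with the highest similarity to a mention is then chosen as its linked entity. In the 2015 Tri-Lingual Entity Discovery and Linking evaluation task the method obtained an F-score of 0.665, higher than the other participating systems. The experimental results show that the method effectively overcomes the feature sparsity problem and reduces the effect of popularity differences on the results.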

Key words: entity linking, semantic similarity, random walk

Abstract: An entity discovery and linking approach based on random walk with restart is presented. Entities and documents are given a unified semantic representation: the probability distribution obtained from a random walk on a subgraph of the knowledge base. Using this distributed representation, the entity most similar to each mention is selected as the linking result. The method achieved an F-score of 0.665 on the entity linking section of the TAC 2015 TEDL task, outperforming the other participating systems. The results indicate that the method can overcome the feature sparsity problem and is less susceptible to popularity bias.

Key words: entity linking, semantic relatedness, random walk
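
The idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The toy graph, the restart probability `alpha = 0.15`, the use of cosine similarity, and the choice to build the document representation by restarting uniformly from a set of context entities are all assumptions made for illustration; the function names (`random_walk_with_restart`, `link_mention`) are hypothetical.

```python
import math

def random_walk_with_restart(graph, restart, alpha=0.15, tol=1e-10, max_iter=1000):
    """Approximate the stationary distribution of a random walk with restart.

    graph:   dict node -> list of neighbour nodes (every node is a key)
    restart: dict node -> restart probability (values sum to 1, keys in graph)
    alpha:   probability of jumping back to the restart distribution (assumed value)
    """
    nodes = list(graph)
    p = {n: restart.get(n, 0.0) for n in nodes}
    for _ in range(max_iter):
        # Every step, a fraction alpha of the mass returns to the restart nodes.
        nxt = {n: alpha * restart.get(n, 0.0) for n in nodes}
        for n in nodes:
            nbrs = graph[n]
            if nbrs:
                # The rest of the mass is split evenly among the neighbours.
                share = (1.0 - alpha) * p[n] / len(nbrs)
                for m in nbrs:
                    nxt[m] += share
            else:
                # Dangling node: send its mass back to the restart nodes.
                for m, r in restart.items():
                    nxt[m] += (1.0 - alpha) * p[n] * r
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p

def cosine(u, v):
    """Cosine similarity between two distributions stored as dicts."""
    dot = sum(u[k] * v.get(k, 0.0) for k in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def link_mention(graph, candidates, context_entities, alpha=0.15):
    """Pick the candidate whose walk distribution is closest to the document's."""
    # Assumption: the document is represented by a walk that restarts
    # uniformly from the entities found in its context.
    uniform = {e: 1.0 / len(context_entities) for e in context_entities}
    doc_sig = random_walk_with_restart(graph, uniform, alpha)
    scores = {
        c: cosine(random_walk_with_restart(graph, {c: 1.0}, alpha), doc_sig)
        for c in candidates
    }
    return max(scores, key=scores.get), scores

# Toy knowledge-base subgraph (made up for illustration, undirected).
GRAPH = {
    "United_States": ["Michael_Jordan", "Michael_I_Jordan"],
    "Michael_Jordan": ["Chicago_Bulls", "NBA", "United_States"],
    "Chicago_Bulls": ["Michael_Jordan", "NBA"],
    "NBA": ["Michael_Jordan", "Chicago_Bulls"],
    "Michael_I_Jordan": ["Machine_learning", "UC_Berkeley", "United_States"],
    "Machine_learning": ["Michael_I_Jordan", "UC_Berkeley"],
    "UC_Berkeley": ["Michael_I_Jordan", "Machine_learning"],
}

# Mention "Jordan" in a document that also mentions the Chicago Bulls and the NBA.
best, scores = link_mention(
    GRAPH,
    candidates=["Michael_Jordan", "Michael_I_Jordan"],
    context_entities=["Chicago_Bulls", "NBA"],
)
print(best, scores)
```

In this toy setting both candidates have the same number of graph neighbours, so the choice is driven by how much each candidate's walk distribution overlaps with the document's, not by how well connected the candidate is. Because every entity and document is represented over the same set of graph nodes, similarity can be computed even when a mention has few surface features.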

CLC number: