Journal of Beijing University of Posts and Telecommunications

  • EI core journal

Journal of Beijing University of Posts and Telecommunications ›› 2024, Vol. 47 ›› Issue (5): 128-134.

• Research Report •

Multi-Channel Residual Hybrid Dilated Convolution with Attention for Word Sense Disambiguation

ZHANG Chunxiang, ZHANG Yulong, GAO Xueyao

  1. Harbin University of Science and Technology
  • Received: 2023-09-07 Revised: 2023-11-22 Online: 2024-10-28 Published: 2024-11-10
  • Corresponding author: GAO Xueyao E-mail: xueyao_gao@163.com
  • Funding:
    National Natural Science Foundation of China; China Postdoctoral Science Foundation; Natural Science Foundation of Heilongjiang Province

Abstract: To address the insufficient generalization ability of current word sense disambiguation (WSD) models, a WSD model based on Multi-Channel Residual Hybrid Dilated Convolution with Attention (MRHA) is proposed. Linguistic knowledge is used to construct disambiguation features, which are vectorized in three different ways to form a three-channel word embedding matrix; positional encoding is then deeply fused with this matrix. A complex convolutional encoder is designed to increase the expressive ability of the model. Experiments are conducted on SemEval-2007: Task#5 and SemEval-2021: Task#2. The results show that, compared with the recent models Word Sense Disambiguation Using Clustered Sense Labels (CSL) and Multi-Channel Convolutional Neural Networks with Multi-Head Attention (MCNN-MA), the average bias of the proposed method is reduced by 1.345% and 2.157%, respectively.

Key words: word sense disambiguation, linguistic knowledge, hybrid dilated convolution, convolutional encoder
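To make the terms "hybrid dilated convolution", "residual", and "multi-channel" in the abstract concrete, the following is a minimal illustrative sketch in plain Python. It is not the authors' MRHA implementation. The abstract does not give the dilation rates, kernel size, activation, or channel fusion rule, so the rates (1, 2, 5), the kernel size of 3, the ReLU activation, the averaging of channels, and all function names are assumptions chosen for illustration. Attention and positional encoding are omitted, and each token is reduced to a single scalar feature to keep the example short.

```python
from typing import List, Sequence


def dilated_conv1d(x: Sequence[float], kernel: Sequence[float],
                   dilation: int) -> List[float]:
    """Zero-padded, same-length 1-D dilated convolution (cross-correlation)."""
    center = len(kernel) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            # Taps are spaced `dilation` positions apart around position i.
            pos = i + (j - center) * dilation
            if 0 <= pos < len(x):  # positions outside the sequence act as zeros
                acc += w * x[pos]
        out.append(acc)
    return out


def relu(x: Sequence[float]) -> List[float]:
    return [max(0.0, v) for v in x]


def residual_hdc_block(x: Sequence[float],
                       kernels: Sequence[Sequence[float]],
                       dilations: Sequence[int] = (1, 2, 5)) -> List[float]:
    """Stack dilated convolutions with mixed rates, then add the input back.

    The rates (1, 2, 5) are an assumed example of a "hybrid" schedule.
    """
    h = list(x)
    for kernel, d in zip(kernels, dilations):
        h = relu(dilated_conv1d(h, kernel, d))
    # Residual (skip) connection: output = input + transformed input.
    return [a + b for a, b in zip(x, h)]


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Number of input positions one output of the stacked layers can see."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def multi_channel_encode(channels: Sequence[Sequence[float]],
                         kernels: Sequence[Sequence[float]],
                         dilations: Sequence[int] = (1, 2, 5)) -> List[float]:
    """Encode each embedding channel separately, then average them (assumed fusion)."""
    encoded = [residual_hdc_block(c, kernels, dilations) for c in channels]
    return [sum(vals) / len(vals) for vals in zip(*encoded)]
```

Under these assumptions, `receptive_field(3, (1, 2, 5))` evaluates to 17: three stacked size-3 kernels with mixed dilation rates cover 17 token positions, whereas three undilated layers would cover only 7. This wider context per layer is the usual motivation for mixing dilation rates.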

CLC number: