Journal of Beijing University of Posts and Telecommunications

  • EI Core Journal

Journal of Beijing University of Posts and Telecommunications, 2023, Vol. 46, Issue (4): 91-96, 122.

Chinese Spelling Correction Model Based on Gated Feature Fusion

ZHOU Yuhao, SUN Zhe, WU Xiaofei, YU Ke   

  1. Beijing University of Posts and Telecommunications
  • Received: 2022-06-28  Revised: 2022-09-26  Online: 2023-08-28  Published: 2023-08-24

Abstract: In Chinese spelling correction, fusing the semantic, phonetic and glyph information of Chinese characters with equal weight lets incorrect pronunciation or glyph information degrade model performance. To address this problem, a Chinese spelling correction model based on gated feature fusion is proposed. It uses adaptive gates to selectively fuse semantic, phonetic and glyph information, which improves both the performance and the interpretability of the model. An improved four corner code is used to encode the glyph features of Chinese characters, and the glyph similarity confusion set used in the pre-training stage is expanded on this basis. A pre-training mask strategy based on confusion set replacement enables the model to learn the error knowledge contained in the text. On the public SIGHAN13, SIGHAN14 and SIGHAN15 datasets, the proposed model achieves correction F1-scores of 78.7%, 67.8% and 77.7%, respectively, which are 1.5%, 1.5% and 1.0% higher than those of the best baseline model.
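
The selective fusion described in the abstract can be illustrated with a minimal sketch. The abstract does not give the gate equations, so everything below is an assumption made for illustration: each gate is a scalar sigmoid over the concatenated semantic and auxiliary vectors, and the fused vector is the semantic vector plus the gated phonetic and glyph vectors. The names `gate`, `gated_fusion`, `w_p`, `b_p`, `w_g` and `b_g` are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gate(weights, bias, semantic, feature):
    """Scalar gate in (0, 1) from the concatenated semantic and auxiliary vectors.

    Assumed form (not from the paper): sigmoid(w . [semantic; feature] + b).
    """
    concat = list(semantic) + list(feature)
    score = sum(w * x for w, x in zip(weights, concat)) + bias
    return sigmoid(score)


def gated_fusion(semantic, phonetic, glyph, w_p, b_p, w_g, b_g):
    """Fuse three per-character feature vectors of equal length.

    Assumed form: fused = semantic + g_p * phonetic + g_g * glyph, so a gate
    near 0 suppresses an unreliable phonetic or glyph signal.
    """
    g_p = gate(w_p, b_p, semantic, phonetic)
    g_g = gate(w_g, b_g, semantic, glyph)
    fused = [s + g_p * p + g_g * g for s, p, g in zip(semantic, phonetic, glyph)]
    return fused, g_p, g_g


# Toy 2-dimensional features; zero weights and biases give gates of exactly 0.5.
fused, g_p, g_g = gated_fusion(
    semantic=[1.0, 2.0], phonetic=[2.0, 2.0], glyph=[4.0, 0.0],
    w_p=[0.0] * 4, b_p=0.0, w_g=[0.0] * 4, b_g=0.0,
)
```

In a trained model the gate weights would be learned, and the gate values can be inspected per character to see how much phonetic or glyph information was admitted, which is one plausible reading of the interpretability claim.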

Key words: Chinese spelling correction, pre-training, gated feature fusion, four corner code
