Journal of Beijing University of Posts and Telecommunications

  • EI core journal

Journal of Beijing University of Posts and Telecommunications, 2022, Vol. 45, Issue (4): 13-18, 57. doi: 10.13190/j.jbupt.2021-191

• Special Topics on Intelligent Medical •

Traditional Chinese Medicine Symptom Normalization Approach Based on Pre-Trained Language Models

XIE Yonghong1,2, TAO Hu1,2, JIA Qi1,2, YANG Shibing1,2, HAN Xinliang2   

  1. School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China;
  2. Beijing Key Laboratory of Knowledge Engineering for Materials Science, Beijing 100083, China
  • Received: 2021-09-01 Online: 2022-08-28 Published: 2022-09-03

Abstract: To address the problem in traditional Chinese medicine that a single symptom can be described literally in many different ways and that one symptom can correspond to multiple normalized descriptions, a two-stage framework based on pre-trained language models is proposed. In the first stage, following the definition and classification of symptoms, a multi-label text classification model semantically partitions the symptom descriptions to obtain candidate normalized symptom words. In the second stage, an entity matching model scores and ranks the candidate normalized symptom words, and several strategies are designed to perform a second recall of the results to improve performance. The highest-scoring candidate word under each semantic label is then taken as the normalization result. Experimental results show that the proposed method outperforms traditional methods on the symptom normalization problem. Furthermore, the results obtained with different pre-trained language models on the symptom normalization task are compared and analyzed to verify the effectiveness of the proposed method.
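The two-stage flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the pre-trained multi-label classifier is replaced by a hypothetical cue-word lookup (`classify`), the pre-trained entity matching model by a character-bigram Jaccard score (`match_score`), and the vocabulary, labels, and English example phrases are invented purely for illustration. The second-recall strategies are omitted because the abstract does not specify them.

```python
from typing import Callable, Dict, List, Set

# Hypothetical normalized symptom vocabulary, grouped by semantic label.
VOCAB: Dict[str, List[str]] = {
    "head": ["headache", "dizziness"],
    "sleep": ["difficulty falling asleep", "frequent dreams"],
}

# Hypothetical cue words; a stand-in for a multi-label text classifier.
CUES: Dict[str, Set[str]] = {
    "head": {"head", "dizzy"},
    "sleep": {"sleep", "dream"},
}


def classify(description: str) -> List[str]:
    """Stage 1 stand-in: return every semantic label whose cue occurs."""
    text = description.lower()
    return [label for label, cues in CUES.items()
            if any(cue in text for cue in cues)]


def _bigrams(s: str) -> Set[str]:
    """Set of character bigrams of a lowercased string."""
    s = s.lower()
    return {s[i:i + 2] for i in range(len(s) - 1)}


def match_score(description: str, candidate: str) -> float:
    """Stage 2 stand-in: Jaccard similarity over character bigrams."""
    a, b = _bigrams(description), _bigrams(candidate)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def normalize(
    description: str,
    classify_fn: Callable[[str], List[str]] = classify,
    score_fn: Callable[[str, str], float] = match_score,
) -> Dict[str, str]:
    """Keep the highest-scoring candidate under each predicted label."""
    result: Dict[str, str] = {}
    for label in classify_fn(description):
        candidates = VOCAB[label]  # stage 1 narrows the candidate set
        result[label] = max(candidates,
                            key=lambda c: score_fn(description, c))
    return result


if __name__ == "__main__":
    print(normalize("headache and trouble falling asleep"))
```

Because `normalize` returns one term per predicted label, a single free-text description can map to several normalized symptoms, which is the one-to-many case the abstract targets. Swapping real models in would only require passing different `classify_fn` and `score_fn` callables.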

Key words: traditional Chinese medicine, symptom normalization, entity matching, semantic classification, pre-trained language model

CLC Number: