Journal of Beijing University of Posts and Telecommunications

  • EI-indexed core journal

Journal of Beijing University of Posts and Telecommunications ›› 2024, Vol. 47 ›› Issue (3): 30-35.


Mutual Learning Prototype Network for Few-shot Text Classification

LIU Jun1, QIN Xiaorui1, TAO Jian1, DONG Hongfei1, LI Xiaoxu2

  1. China Aero-Polytechnology Establishment
  2. School of Computer and Communication, Lanzhou University of Technology

  • Received: 2023-06-02  Revised: 2023-09-25  Online: 2024-06-30  Published: 2024-06-13
  • Corresponding author: LIU Jun  E-mail: jun.liu@163.com
  • Supported by:
    National Natural Science Foundation of China (62176110); Technical Foundation Project of the Equipment Development Department of the Central Military Commission (221ZHK11015)


Abstract: Few-shot prototype networks are regarded as one of the effective methods for solving few-shot text classification problems. However, existing methods usually rely on only a single prototype for training and inference, which makes them susceptible to noise and other factors and results in insufficient generalization ability. To address this problem, a Mutual Learning Prototype Network (MLProtoNet) for few-shot text classification is proposed. While retaining the existing approach of computing the prototype directly from text embedding features, this paper introduces a BERT network, feeding the text embedding features into BERT to generate a new prototype. Then, using a mutual learning algorithm, the two prototypes mutually constrain each other and exchange knowledge to filter out inaccurate semantic information. This process aims to enhance the feature extraction capability of the model and to improve classification accuracy through joint decision making by the two prototypes. Experimental results on two few-shot text classification datasets confirm the effectiveness of the proposed approach. Specifically, on the FewRel dataset, the method improves accuracy by 2.97% over the current best method in the 5-way 1-shot setting and by 1.99% in the 5-way 5-shot setting.

Key words: artificial intelligence, text classification, few-shot learning, mutual learning, prototype network
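The general mechanism the abstract describes — class-mean prototypes, distance-based classification, a mutual-learning term that lets two prototype heads constrain each other, and an averaged joint prediction — can be sketched in NumPy. This is an illustrative sketch under stated assumptions, not the paper's implementation: the two "views" of the support embeddings here merely stand in for the direct-embedding branch and the BERT-refined branch, and all data and names are toy placeholders.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def prototypes(support, labels, n_way):
    # standard ProtoNet prototype: the mean of each class's support embeddings
    return np.stack([support[labels == c].mean(axis=0) for c in range(n_way)])

def class_probs(query, protos):
    # softmax over negative squared Euclidean distances to each prototype
    d = ((query[:, None, :] - protos[None, :, :]) ** 2).sum(-1)
    return softmax(-d)

def symmetric_kl(p, q, eps=1e-12):
    # mutual-learning regularizer: each head's predictions constrain the other's
    kl_pq = (p * (np.log(p + eps) - np.log(q + eps))).sum(-1)
    kl_qp = (q * (np.log(q + eps) - np.log(p + eps))).sum(-1)
    return float((kl_pq + kl_qp).mean() / 2)

# toy 2-way 2-shot episode with 4-dim embeddings (illustrative numbers only)
rng = np.random.default_rng(0)
support = rng.normal(size=(4, 4))
labels = np.array([0, 0, 1, 1])
query = rng.normal(size=(3, 4))

# two embedding views stand in for the direct and BERT-refined branches
protos_a = prototypes(support, labels, n_way=2)
protos_b = prototypes(support + 0.1 * rng.normal(size=support.shape), labels, n_way=2)

p_a = class_probs(query, protos_a)
p_b = class_probs(query, protos_b)
joint = (p_a + p_b) / 2           # joint decision of the two prototype heads
loss_ml = symmetric_kl(p_a, p_b)  # added to each head's classification loss
```

In training, the symmetric-KL term would be added to each head's cross-entropy loss so the two prototypes exchange knowledge; at inference, the averaged distribution `joint` gives the final prediction.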
