Journal of Beijing University of Posts and Telecommunications

  • EI Core Journal

Journal of Beijing University of Posts and Telecommunications ›› 2024, Vol. 47 ›› Issue (4): 50-56.

• Special Topic: Systematic Artificial Intelligence •

Domain-Specific Question Answering System Construction Approach Integrated with Large Language Model

QI Siyang, HU Huiyun, LI Hongbing, LI Qi, XIAO Bo

  1. School of Artificial Intelligence, Beijing University of Posts and Telecommunications
  • Received: 2023-12-28 Revised: 2024-03-26 Online: 2024-08-28 Published: 2024-08-26
  • Corresponding author: XIAO Bo E-mail: xiaobo@bupt.edu.cn
  • Supported by:
    National Natural Science Foundation of China; Beijing University of Posts and Telecommunications Graduate Innovation and Entrepreneurship Project

Abstract: Building a domain-specific question answering system typically runs into high data costs, complex knowledge construction, and large differences among datasets from different domains. To address these challenges, an approach that integrates large language models with domain-specific knowledge is proposed. Most existing methods split the local knowledge corpus into segments and then store and match those segments directly. During retrieval-augmented generation, the semantic match between the query text and the segment content is weak, which lowers the quality of the generated text. A query semantic alignment method based on prompt engineering, the prompt aligned retrieval generation approach, is therefore proposed. It generates hypothetical question-answer pairs to unify the semantic space of user queries and the corpus, thereby improving the retrieval efficiency of domain knowledge and the accuracy of answers. Experiments show that the proposed approach avoids the high cost of model training, can be built and deployed quickly across different vertical domains, and outperforms other methods.

Key words: vertical domain, large language model, question answering system, knowledge base

CLC Number:
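The abstract's central idea, matching a user query against generated hypothetical questions rather than against raw corpus segments, can be illustrated with a minimal sketch. This is not the authors' implementation. The segment texts, the canned questions (standing in for questions a large language model would be prompted to produce), the bag-of-words cosine similarity (standing in for an embedding model), and all function names are assumptions made for illustration only.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Bag-of-words cosine similarity (a stand-in for embedding similarity)."""
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(v * v for v in va.values())) * \
        math.sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0

# Hypothetical corpus segments (illustrative data, not from the paper).
SEGMENTS = {
    "c1": "Devices returned within 30 days receive a full refund.",
    "c2": "The router supports WPA3 encryption and dual-band operation.",
}

def generate_hypothetical_questions(segment_id):
    """Assumed stand-in for an LLM prompt that asks which questions
    a segment could answer; here the questions are hard-coded."""
    canned = {
        "c1": ["How long do I have to send a product back?",
               "Can I get my money back?"],
        "c2": ["Which security protocol does the router use?",
               "Does the router work on 5 GHz?"],
    }
    return canned[segment_id]

def build_question_index(segments):
    """Index each segment under its hypothetical questions."""
    return [(q, sid) for sid in segments
            for q in generate_hypothetical_questions(sid)]

def retrieve_aligned(query, index):
    """Match the query against hypothetical questions, return the segment id."""
    _, best_id = max(index, key=lambda entry: cosine(query, entry[0]))
    return best_id

def retrieve_raw(query, segments):
    """Baseline: match the query directly against segment text."""
    return max(segments, key=lambda sid: cosine(query, segments[sid]))

query = "Can I get my money back if I send the product back?"
index = build_question_index(SEGMENTS)
print("aligned:", retrieve_aligned(query, index))
print("raw:", retrieve_raw(query, SEGMENTS))
```

In this toy setup the query shares almost no vocabulary with the refund segment itself, so direct segment matching selects the unrelated router segment, while matching in question space selects the refund segment. A real system would replace the canned questions with model output and the word-overlap score with a learned embedding.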