Journal of Beijing University of Posts and Telecommunications

  • EI core journal

Journal of Beijing University of Posts and Telecommunications, 2024, Vol. 47, Issue (2): 123-129.


A Retrieval Model of Engineering Consulting Report Based on Joint Semantic and Association Matching

  

  • Received: 2023-03-03 Revised: 2023-05-05 Online: 2024-04-28 Published: 2024-01-24

Abstract: Writing an engineering consulting report requires the writer to collect and read a large number of government policy documents, news reports, and similar material, which leads to high labor costs and long writing cycles. Using text retrieval technology to intelligently match relevant paragraphs and recommend them to writers has therefore become particularly important. This paper proposes a text retrieval model for engineering consulting reports, abbreviated as JSAM, which combines semantic matching and association matching to achieve accurate and efficient retrieval between titles and paragraphs, and can effectively assist the writing of engineering consulting reports. First, a text retrieval corpus for engineering consulting reports is constructed, and the contrastive learning model SimCSE is fine-tuned on this corpus. The resulting parameters are used to initialize a Vanilla BERT model, and the corpus text is fed into this model to obtain the semantic matching score. In parallel, the text and keyword information are represented as word-level sememe vectors by the SAT model and fed into the deep text interaction model DRMM to obtain the association matching score. The semantic and association matching scores are normalized and then combined by weighted fusion to give the final matching score, which completes text retrieval between titles and paragraphs. Compared with the baseline model CEDR-DRMM, JSAM combines contextual vector representations with text interaction matching, improves P@20 by 4.03 percentage points, and effectively improves text retrieval performance.
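
The final fusion step and the P@20 metric can be illustrated with a minimal sketch. The abstract does not state which normalization is used, how the weights are set, or the exact form of the fusion, so the min-max normalization, the weighted-sum form, the weight `alpha`, and the function names `min_max_normalize`, `fuse_scores`, and `precision_at_k` below are all assumptions made for illustration, not the authors' implementation.

```python
from typing import Dict, List, Set


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]. Min-max is an assumed choice of normalization."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All candidates scored the same; no ranking signal to preserve.
        return {pid: 0.0 for pid in scores}
    return {pid: (s - lo) / (hi - lo) for pid, s in scores.items()}


def fuse_scores(
    semantic: Dict[str, float],
    association: Dict[str, float],
    alpha: float = 0.5,  # hypothetical weight on the semantic score
) -> Dict[str, float]:
    """Normalize both score sets, then combine them with a weighted sum."""
    sem = min_max_normalize(semantic)
    assoc = min_max_normalize(association)
    return {pid: alpha * sem[pid] + (1.0 - alpha) * assoc[pid] for pid in sem}


def precision_at_k(ranked_ids: List[str], relevant_ids: Set[str], k: int) -> float:
    """Fraction of the top-k ranked paragraphs that are relevant (P@k)."""
    hits = sum(1 for pid in ranked_ids[:k] if pid in relevant_ids)
    return hits / k


# Toy scores for three candidate paragraphs of one title (made-up values).
semantic = {"p1": 0.2, "p2": 0.8, "p3": 0.5}
association = {"p1": 3.0, "p2": 1.0, "p3": 2.0}

fused = fuse_scores(semantic, association, alpha=0.7)
ranking = sorted(fused, key=fused.get, reverse=True)
```

With these toy values, `ranking` orders the paragraphs by fused score, and `precision_at_k(ranking, relevant_ids, 20)` would compute P@20 for a real candidate list. Normalizing before fusion matters because the two models produce scores on different scales.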

Key words: text retrieval, joint ranking, word vector, character vector, sememe
