Journal of Beijing University of Posts and Telecommunications

  • EI core journal

Journal of Beijing University of Posts and Telecommunications, 2024, Vol. 47, Issue (4): 36-43.

• Special topic: Systematic Artificial Intelligence

The Instruction Tuning of Large Language Models with Multi-Modal Recommendation Instruction

HAO Bowen1,3, LIU Yifei2, LI Liyao3, WANG Jie1, PENG Yan1

  1. School of Management, Capital Normal University
  2. School of Mathematical Sciences, Capital Normal University
  3. Fujian Provincial University Engineering Research Center for Intangible Cultural Heritage Digitization and Multi-Source Information Fusion, Fujian Polytechnic Normal University
  • Received: 2023-12-19 Revised: 2024-01-17 Online: 2024-08-28 Published: 2024-08-26
  • Corresponding author: PENG Yan E-mail: pengyan@cnu.edu.cn
  • Funding:
    Open Fund of the Fujian Provincial University Engineering Research Center; National Natural Science Foundation of China

Abstract: Instruction tuning of large language models on multimodal instructions has proven effective in giving them the ability to solve the corresponding multimodal tasks. To further enable large language models to perform multimodal zero-shot and few-shot recommendation, a multimodal recommendation large language model is proposed. The model uses ChatGLM2-6B as its base and is trained on a multimodal recommendation dataset that contains both text and image information. ChatGPT and GPT-4 are used to construct instructions for generating multimodal user profiles and item attributes, together with zero-shot and few-shot recommendation instructions. The model is fine-tuned with the parameter-efficient P-tuning v2 method, which requires only a single A100 40GB graphics processing unit, and the resulting model is used for multimodal zero-shot and few-shot recommendation tasks. Experimental results show that the proposed model significantly outperforms existing baseline models.

Key words: multimodal recommendation instructions, large language model, instruction tuning
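
The abstract mentions zero-shot and few-shot recommendation instructions but does not give their templates. As a rough illustration only, the sketch below shows one way such an instruction record might be assembled from a user's interaction history and a candidate list. Everything in it is an assumption made for illustration: the `Item` fields, the prompt wording, the `prompt`/`response` keys, and the idea of representing each image by a text caption are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class Item:
    """A multimodal item; the field names are hypothetical."""
    title: str
    image_caption: str  # assumed: image content already rendered as text


def describe(items: Sequence[Item]) -> str:
    """Render items as a numbered list mixing text and image information."""
    return "\n".join(
        f"{i}. {it.title} (image: {it.image_caption})"
        for i, it in enumerate(items, start=1)
    )


def build_record(
    history: Sequence[Item],
    candidates: Sequence[Item],
    target: Item,
    shots: Sequence[Tuple[Sequence[Item], Item]] = (),
) -> Dict[str, str]:
    """Build one instruction-tuning record (illustrative template only).

    With no shots this yields a zero-shot instruction; each shot prepends a
    solved (history, chosen item) demonstration, yielding a few-shot one.
    """
    parts: List[str] = []
    for demo_history, demo_choice in shots:
        parts.append(
            "Example interaction history:\n" + describe(demo_history)
            + f"\nExample answer: {demo_choice.title}"
        )
    parts.append("The user has interacted with:\n" + describe(history))
    parts.append("Candidate items:\n" + describe(candidates))
    parts.append("Which candidate should be recommended next? Answer with its title.")
    return {"prompt": "\n\n".join(parts), "response": target.title}
```

Calling `build_record(history, candidates, target)` gives a zero-shot record, and passing one or more `(history, chosen_item)` pairs as `shots` gives a few-shot record. A list of such records could then serve as input to a parameter-efficient fine-tuning run, though the data format the authors actually used is not stated here.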
