Journal of Beijing University of Posts and Telecommunications

  • EI core journal

Journal of Beijing University of Posts and Telecommunications ›› 2019, Vol. 42 ›› Issue (1): 61-67. doi: 10.13190/j.jbupt.2018-040

• Papers •

Image Sentiment Analysis with Multimodal Discriminative Embedding Space

LÜ Guang-rui, CAI Guo-yong, LIN Yu-ming

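  1. Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin 541004, China
  • Received: 2018-03-20 Online: 2019-02-28 Published: 2019-03-08
  • Corresponding author: CAI Guo-yong (born 1971), male, professor, supervisor of master's students. E-mail: ccgycai@guet.edu.cn
  • About the author: LÜ Guang-rui (born 1989), male, master's student.
  • Supported by:
    National Natural Science Foundation of China (61763007, 61562014); Guangxi Natural Science Foundation (2017JJD160017); Guangxi Key Laboratory of Trusted Software (kx201503)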

Image Sentiment Analysis with Multimodal Discriminative Embedding Space

LÜ Guang-rui, CAI Guo-yong, LIN Yu-ming

  1. Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin 541004, China
  • Received: 2018-03-20 Online: 2019-02-28 Published: 2019-03-08

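Abstract: To address the affective gap and the large intra-class variance in image sentiment analysis, a weakly supervised method is proposed that jointly exploits the deep latent correlation between the visual and textual modalities, deep linear discrimination of the visual modality, and fusion of the mid-level semantics of images. A multimodal deep network architecture is used to find a latent embedding space in which the deep correlation between the visual and textual modalities is maximized and the visual modality is deeply discriminative; in this latent space, the semantic mapped features of the text are transferred to the discriminative visual mapped features of the image. Building on the attention mechanism, an attention network covering the mapped features in the latent space is designed for sentiment classification. Experimental results on real datasets show that the proposed method achieves better sentiment classification accuracy.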

Key words: sentiment analysis, latent correlation, linear discrimination, multimodal network, attention mechanism

Abstract: To alleviate the affective gap and the large intra-class variance in visual sentiment analysis, a new method is proposed that simultaneously exploits the deep latent correlations between the visual and textual modalities, deep linear discrimination of the visual modality, and weak supervision from mid-level semantic features of images. First, a multimodal deep network architecture is used to find a latent embedding space in which the deep correlations between the visual and textual modalities are maximized while the visual modality remains deeply discriminative. In this latent space, the semantic features extracted from text can be transferred to the discriminative visual features extracted from images. Second, drawing on the attention mechanism, an attention network is presented that takes the features extracted in the latent space as input and is trained as a sentiment classifier. Experiments on real datasets show that the proposed approach achieves better sentiment classification accuracy than state-of-the-art approaches.

Key words: sentiment analysis, latent correlation, linear discrimination, multimodal network, attention mechanism

CLC number:
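
The abstract describes a latent space that is scored on two things at once: how strongly the visual and textual projections correlate, and how well the visual projection separates the sentiment classes. The sketch below illustrates that idea in a deliberately reduced form and is not the authors' implementation. It assumes linear one-dimensional projections instead of deep networks, uses the Pearson correlation and the Fisher ratio as stand-ins for the deep correlation and deep discrimination terms, and combines them with a weight `alpha`. The function names, the toy data, and `alpha` are all hypothetical.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def project(rows, weights):
    """Project each feature vector onto one direction (dot product)."""
    return [sum(w * x for w, x in zip(weights, row)) for row in rows]


def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def fisher_ratio(values, labels):
    """Between-class scatter over within-class scatter for 1-D values.

    Assumes at least one class has non-zero spread (within > 0).
    """
    overall = mean(values)
    between = 0.0
    within = 0.0
    for cls in set(labels):
        group = [v for v, lab in zip(values, labels) if lab == cls]
        centre = mean(group)
        between += len(group) * (centre - overall) ** 2
        within += sum((v - centre) ** 2 for v in group)
    return between / within


def embedding_score(visual, textual, labels, w_visual, w_textual, alpha=0.5):
    """Hypothetical combined score: cross-modal correlation plus visual
    class separation. The weighting by alpha is an assumption made for
    illustration only."""
    z_visual = project(visual, w_visual)
    z_textual = project(textual, w_textual)
    correlation = pearson(z_visual, z_textual)
    separation = fisher_ratio(z_visual, labels)
    return alpha * correlation + (1 - alpha) * separation


# Toy data (invented): four image/text pairs, two sentiment classes.
visual = [[1.0, 0.2], [1.2, 0.1], [3.0, 0.3], [3.2, 0.0]]
textual = [[0.9, 0.5], [1.1, 0.4], [2.9, 0.6], [3.3, 0.5]]
labels = [0, 0, 1, 1]

# Projecting the visual features onto their first axis separates the classes
# and tracks the text projection; the second axis does neither.
good = embedding_score(visual, textual, labels, [1.0, 0.0], [1.0, 0.0])
bad = embedding_score(visual, textual, labels, [0.0, 1.0], [1.0, 0.0])
print(good > bad)
```

In this toy setting the score simply ranks candidate projection directions. A learned model would instead optimize the projections, and the Fisher ratio is unbounded while the correlation lies in [-1, 1], so the two terms would normally need rescaling before being mixed.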