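Journal of Beijing University of Posts and Telecommunications

  • EI core journal

Journal of Beijing University of Posts and Telecommunications ›› 2022, Vol. 45 ›› Issue (1): 69-74. doi: 10.13190/j.jbupt.2021-104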

• Papers •

Traditional Costume Image Semantic Segmentation Based on Improved EMA Unit

ZHAO Haiying, ZHU Hui, HOU Xiaogang

  1. School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China
  • Received: 2021-06-01 Published: 2022-02-28 Released online: 2021-12-16
  • Corresponding author: HOU Xiaogang (born 1984), male, doctoral student, E-mail: houxiaogang05@bupt.edu.cn
  • About the author: ZHAO Haiying (born 1972), female, associate professor
  • Supported by:
    National Key Research and Development Program of China (2020YFF0305304); Fundamental Research Funds project of Beijing University of Posts and Telecommunications (2020RC26)

Abstract: In traditional costume image segmentation, easily confused labels and the loss of small objects make object edge details difficult to preserve. To address this, a semantic segmentation network model is proposed: a residual expectation-maximization attention network based on convolutional attention features. The model first uses ResNeXt-50 as the backbone network for shared features and introduces a set of parallel convolutional attention modules in the feature extraction stage, which suppress invalid features and make the features of the target region more salient. The residual idea is then used to optimize the expectation-maximization attention (EMA) unit, addressing gradient explosion or vanishing during iteration and thereby better establishing the relationships between positions in the feature map, which yields a semantic segmentation model based on saliency fusion learning. Finally, qualitative and quantitative experiments on a traditional ethnic costume dataset verify the effectiveness of the proposed model: the mean intersection over union (mIoU) reaches 83.91%, the best result among comparable algorithms.
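The abstract does not say where the residual connection enters the EMA unit, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It runs the usual expectation-maximization attention iteration (a softmax E-step, then a weighted-mean M-step with L2-normalized bases) on a flattened feature map and adds the input back to the reconstruction as a skip connection. The function names, the number of bases, the iteration count, and the placement of the skip connection are all hypothetical.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def l2_normalize(vec, eps=1e-6):
    """Scale a vector to (approximately) unit L2 norm."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / (norm + eps) for v in vec]


def residual_ema(x, bases, n_iters=3):
    """Sketch of an EMA unit with an assumed skip connection.

    x:     N x C list of pixel features (flattened feature map)
    bases: K x C list of initial bases
    Returns (output, bases, responsibilities); requires n_iters >= 1.
    """
    n, c, k = len(x), len(x[0]), len(bases)
    for _ in range(n_iters):
        # E-step: responsibility of each base for each pixel
        z = [softmax([sum(a * b for a, b in zip(xi, mu)) for mu in bases])
             for xi in x]
        # M-step: each base becomes the responsibility-weighted mean of pixels
        new_bases = []
        for j in range(k):
            weight = sum(z[i][j] for i in range(n)) + 1e-6
            mu = [sum(z[i][j] * x[i][d] for i in range(n)) / weight
                  for d in range(c)]
            new_bases.append(l2_normalize(mu))
        bases = new_bases
    # Re-estimate every pixel from the compact bases
    recon = [[sum(z[i][j] * bases[j][d] for j in range(k)) for d in range(c)]
             for i in range(n)]
    # Assumed residual step: add the input back to the reconstruction
    out = [[a + b for a, b in zip(xi, ri)] for xi, ri in zip(x, recon)]
    return out, bases, z


random.seed(0)
features = [[random.random() for _ in range(4)] for _ in range(6)]  # N=6, C=4
init_bases = [l2_normalize([random.gauss(0, 1) for _ in range(4)])
              for _ in range(2)]                                    # K=2
output, final_bases, resp = residual_ema(features, init_bases)
```

Because the output is the input plus a reconstruction, the identity path gives gradients a direct route around the iterative steps, which is one plausible reading of how a residual design could ease the gradient problems the abstract mentions.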

Key words: deep learning, traditional costume, feature extraction, attention mechanism, semantic segmentation
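For readers unfamiliar with the metric quoted in the abstract, mean intersection over union can be computed roughly as in the sketch below. It assumes flat lists of integer class labels and skips classes that appear in neither the prediction nor the ground truth; the abstract does not state the evaluation protocol used in the paper, so the function name `mean_iou` and these conventions are illustrative only.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection over union across classes.

    pred, target: equal-length flat lists of integer class labels.
    Classes absent from both pred and target are skipped (an assumption).
    """
    ious = []
    for cls in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
        union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example with three classes and six pixels
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)  # per-class IoU: 1/3, 2/3, 1/2
```

In the toy example the per-class values average to 0.5; a reported figure such as 83.91% would be this same average taken over the dataset's classes.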

CLC number: