Journal of Beijing University of Posts and Telecommunications

  • EI core journal

Journal of Beijing University of Posts and Telecommunications ›› 2020, Vol. 43 ›› Issue (5): 98-104, 117. doi: 10.13190/j.jbupt.2020-033

• Papers •

Commodity Classification of Online Based on High-Level Feature Fusion

LIU Yi-chen, SUN Hua-zhi, MA Chun-mei, JIANG Li-fen, ZHONG Chang-hong

  1. School of Computer and Information Engineering, Tianjin Normal University, Tianjin 300387, China
  • Received: 2020-04-23 Published: 2021-03-11
  • Corresponding author: MA Chun-mei (1985-), female, lecturer, E-mail: mcmxhd@163.com
  • About the author: LIU Yi-chen (1995-), male, master's student.
  • Supported by:
    National Natural Science Foundation of China (61702370); Natural Science Foundation of Tianjin (18JCYBJC85900, 18JCQNJC70200); Tianjin Science and Technology Development Strategy Research Program (17ZLZXZF00530); Scientific Research Program of the Tianjin Municipal Education Commission (JW1702)

Abstract: To classify commodities automatically from their text titles, a commodity classification model based on high-level feature fusion (HFF) is proposed. First, a low-level feature representation of the title text based on character embedding and word embedding is proposed, which yields a stronger representation of the structure of commodity titles. Second, a mechanism that combines self-attention, a convolutional neural network, and channel attention is proposed to enhance the low-level features of the title and obtain high-level enhanced features. Finally, the high-level enhanced features of the character embedding and the word embedding are fused into a comprehensive feature of the commodity title, which is used for automatic commodity classification. Experiments were conducted on a dataset of commodity titles. The results show that the classification accuracy of HFF on third-level commodity categories reaches 84.348%, and that its recall and F1 value reach 47.8% and 49.4%, respectively, outperforming existing advanced short-text classification methods applicable to commodity title classification.

Key words: commodity classification, short text classification, feature fusion, feature enhancement, attention mechanism
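
The pipeline described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the weights are random and untrained, the self-attention is parameter-free, the channel attention is reduced to a sigmoid gate on each channel's mean, and fusion is assumed to be plain concatenation, which the abstract does not specify. All names, sizes (`DIM`, `NUM_FILTERS`, `WIDTH`, `NUM_CLASSES`), and the example title are hypothetical.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible
DIM, NUM_FILTERS, WIDTH, NUM_CLASSES = 4, 5, 2, 3  # hypothetical sizes

def rand_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def embed(tokens, table):
    # Low-level features: one (random, untrained) vector per token.
    for tok in tokens:
        if tok not in table:
            table[tok] = [rng.uniform(-0.5, 0.5) for _ in range(DIM)]
    return [table[tok] for tok in tokens]

def self_attention(x):
    # Parameter-free scaled dot-product attention over the token sequence.
    out = []
    for query in x:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(DIM) for key in x]
        weights = softmax(scores)
        out.append([sum(w * vec[i] for w, vec in zip(weights, x)) for i in range(DIM)])
    return out

def conv1d_relu(x, filters):
    # Each filter spans WIDTH consecutive tokens and yields one channel.
    channels = []
    for filt in filters:
        channel = []
        for start in range(len(x) - WIDTH + 1):
            window = [v for vec in x[start:start + WIDTH] for v in vec]
            channel.append(max(0.0, sum(w * v for w, v in zip(filt, window))))
        channels.append(channel)
    return channels

def channel_attention(channels):
    # Simplified gate (assumption): sigmoid of each channel's mean activation.
    gated = []
    for channel in channels:
        gate = 1.0 / (1.0 + math.exp(-sum(channel) / len(channel)))
        gated.append([gate * v for v in channel])
    return gated

def enhance(tokens, table, filters):
    # Self-attention, then CNN, then channel attention, then max-pooling.
    # Assumes the sequence has at least WIDTH tokens.
    x = self_attention(embed(tokens, table))
    return [max(channel) for channel in channel_attention(conv1d_relu(x, filters))]

char_table, word_table = {}, {}
char_filters = rand_matrix(NUM_FILTERS, WIDTH * DIM)
word_filters = rand_matrix(NUM_FILTERS, WIDTH * DIM)
classifier = rand_matrix(NUM_CLASSES, 2 * NUM_FILTERS)

def fuse(words):
    # Character branch and word branch, fused by concatenation (assumption).
    chars = list("".join(words))
    return enhance(chars, char_table, char_filters) + enhance(words, word_table, word_filters)

def predict(words):
    fused = fuse(words)
    return softmax([sum(w * f for w, f in zip(row, fused)) for row in classifier])

# Hypothetical pre-segmented title meaning "wireless Bluetooth earphones".
title_words = ["无线", "蓝牙", "耳机"]
fused = fuse(title_words)
probs = predict(title_words)
print(len(fused), [round(p, 3) for p in probs])
```

Because the weights are untrained, the printed probabilities carry no meaning; the sketch only shows how the character-level and word-level branches are enhanced separately and then combined before classification.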

CLC number: