Journal of Beijing University of Posts and Telecommunications

  • EI core journal

Journal of Beijing University of Posts and Telecommunications ›› 2020, Vol. 43 ›› Issue (5): 91-97. doi: 10.13190/j.jbupt.2020-056

• Papers

Dynamic Gesture Recognition Based on Video Data Characteristics

XIE Xiao-yan1, ZHAO Huan1, JIANG Lin2

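  1. School of Computer, Xi'an University of Posts and Telecommunications, Xi'an 710121;
  2. Integrated Circuit Design Laboratory, Xi'an University of Science and Technology, Xi'an 710054
  • Received: 2020-06-07 Published: 2021-03-11
  • Corresponding author: ZHAO Huan (1995-), male, master's student, E-mail: 750746730@qq.com
  • About the author: XIE Xiao-yan (1972-), female, professor, master's supervisor.
  • Funding:
    National Natural Science Foundation of China (61834005, 61772417, 61602377); Shaanxi Province International Science and Technology Cooperation Program (2018KW-006); Yulin Science and Technology Plan Project (2019-133)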

Dynamic Gesture Recognition Based on Characteristics of Encoded Video Data

XIE Xiao-yan1, ZHAO Huan1, JIANG Lin2   

  1. School of Computer, Xi'an University of Posts & Telecommunications, Xi'an 710121, China;
  2. Integrated Circuit Design Laboratory, Xi'an University of Science and Technology, Xi'an 710054, China
  • Received: 2020-06-07 Published: 2021-03-11

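Abstract: To address the low environmental adaptability and high computational complexity of existing dynamic gesture recognition methods, a dynamic gesture recognition method based on video data characteristics is proposed. The density-based clustering algorithm DBSCAN extracts motion trend features directly from the motion vectors in the encoded video data. A random forest then classifies the motion trends, and the dynamic gesture is recognized by combining them with hand shape features extracted by a convolutional neural network (CNN). Experimental results show that the method achieves average recognition rates of 94.22% and 94.48% for the dynamic gestures in the University of Cambridge and Northwestern University data sets, respectively, and that it reduces gesture recognition time by 85% compared with a recognition method that combines a CNN with a long short-term memory network. When the background image is complex and the lighting is insufficient, the method still maintains a high recognition rate, showing good robustness.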

Key words: dynamic gesture recognition, motion vector, DBSCAN, random forest, convolutional neural network

Abstract: To address the poor scene adaptability and high computational complexity of existing dynamic gesture recognition methods, a method based on the characteristics of encoded video data is proposed. First, density-based spatial clustering of applications with noise (DBSCAN) is used to extract motion trend features from the motion vectors. Then, the motion trends are classified by a random forest. Finally, the dynamic gesture is recognized by combining the motion trend with the hand shape features extracted by a convolutional neural network (CNN). Experiments show that the proposed method achieves average recognition rates of 94.22% and 94.48% on the University of Cambridge and Northwestern University hand gesture data sets, respectively. Compared with a scheme that combines a CNN with long short-term memory, the gesture recognition time is reduced by 85%. The method still maintains a high recognition rate for complex backgrounds with insufficient illumination, which indicates good robustness.

Key words: dynamic gesture recognition, motion vector, density-based spatial clustering of applications with noise, random forest, convolutional neural network
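
The first stage described in the abstract, clustering motion vectors with DBSCAN to obtain a motion trend, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the input layout `(x, y, dx, dy)` per block, the parameters `eps`, `min_pts`, and `min_mag`, the choice to cluster block positions, and the quantization of the trend into eight directions. The random forest and CNN stages are not shown.

```python
import math

def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns one label per point, -1 meaning noise."""
    labels = [None] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1          # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue             # already assigned; do not expand again
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point, keep growing
    return labels

def motion_trend(motion_vectors, eps=1.5, min_pts=3, min_mag=1.0):
    """Hypothetical trend feature: direction bin (0..7) of the largest moving cluster.

    motion_vectors: iterable of (x, y, dx, dy), block position plus displacement.
    Bin 0 is +x, bin 2 is +y, bin 4 is -x, bin 6 is -y. Returns None if no cluster.
    """
    # Assumption: blocks with near-zero displacement are treated as static background.
    moving = [mv for mv in motion_vectors if math.hypot(mv[2], mv[3]) >= min_mag]
    if not moving:
        return None
    labels = dbscan([(x, y) for x, y, _, _ in moving], eps, min_pts)
    clusters = {}
    for label, mv in zip(labels, moving):
        if label != -1:
            clusters.setdefault(label, []).append(mv)
    if not clusters:
        return None
    biggest = max(clusters.values(), key=len)
    mean_dx = sum(mv[2] for mv in biggest) / len(biggest)
    mean_dy = sum(mv[3] for mv in biggest) / len(biggest)
    angle = math.atan2(mean_dy, mean_dx)
    return round(angle / (math.pi / 4)) % 8

# Usage: a 3x3 patch of blocks moving in +x, plus one isolated outlier block.
patch = [(x, y, 4.0, 0.0) for x in range(3) for y in range(3)]
frame = patch + [(10, 10, 0.0, -5.0)]
trend = motion_trend(frame)
```

In a pipeline of the kind the abstract outlines, a sequence of such per-frame trend values would be the input to the motion classifier, while the hand shape would be recognized separately from the image content.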
