Journal of Beijing University of Posts and Telecommunications

  • EI Core Journal

Journal of Beijing University of Posts and Telecommunications ›› 2020, Vol. 43 ›› Issue (2): 129-134.doi: 10.13190/j.jbupt.2019-071

• REPORTS •

A Visual Object Tracking Algorithm Based on Features Extracted by Deep Residual Network

MA Su-gang1,2, ZHAO Xiang-mo1, HOU Zhi-qiang2, WANG Zhong-min2,3, SUN Han-lin2   

  1. School of Information Engineering, Chang'an University, Xi'an 710064, China;
    2. School of Computer Science and Technology, Xi'an University of Posts and Telecommunications, Xi'an 710121, China;
    3. Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi'an University of Posts and Telecommunications, Xi'an 710121, China
  • Received: 2019-04-30 Published: 2020-04-28

Abstract: Because objects are easily lost in complex scenes, a scale-adaptive visual object tracking algorithm based on deep residual network (ResNet) features is proposed. First, ResNet is used to extract multi-layer deep features from the image region of interest. Because the rectified linear unit (ReLU) activation function suppresses target features, only the outputs of convolutional layers before the ReLU function are selected. Second, translation filters based on the kernelized correlation filter are constructed on the extracted multi-layer features, and the resulting response maps are combined by weighted fusion; the target position is taken as the location with the largest response value. After the target position is determined, the target is sampled at multiple scales, and Felzenszwalb histogram of oriented gradients (fHOG) features are extracted from each scaled image. On this basis, a scale correlation filter is constructed to estimate the target scale accurately. In experiments on OTB100, the proposed algorithm is compared with six related algorithms. The results show that it achieves a high tracking success rate and accuracy and can adapt to scale variation, background clutter, and other complex scenes.
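
The translation step described in the abstract (one correlation filter per feature layer, followed by weighted fusion of the response maps) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a linear kernel rather than whatever kernel the paper uses, it assumes all layers have already been resized to a common spatial size, and the function names (`gaussian_label`, `train_filter`, `respond`, `fuse_and_locate`), the regularization value, and the fusion weights are hypothetical. Random arrays stand in for ResNet features.

```python
import numpy as np

def gaussian_label(shape, sigma=2.0):
    """Desired response: a Gaussian peak at (0, 0) with circular wrap-around."""
    h, w = shape
    ys = np.minimum(np.arange(h), h - np.arange(h))
    xs = np.minimum(np.arange(w), w - np.arange(w))
    d2 = ys[:, None] ** 2 + xs[None, :] ** 2
    return np.exp(-0.5 * d2 / sigma ** 2)

def train_filter(feat, label, lam=1e-4):
    """Train a linear-kernel correlation filter on features of shape (H, W, C).

    Returns the dual coefficients in the Fourier domain and the
    Fourier transform of the training features (the model template).
    """
    model_f = np.fft.fft2(feat, axes=(0, 1))
    # Linear kernel auto-correlation, summed over channels.
    kxx = np.sum(model_f * np.conj(model_f), axis=2).real / feat.size
    alpha_f = np.fft.fft2(label) / (kxx + lam)
    return alpha_f, model_f

def respond(alpha_f, model_f, feat):
    """Response map of a trained filter on new features of the same shape."""
    z_f = np.fft.fft2(feat, axes=(0, 1))
    kxz = np.sum(z_f * np.conj(model_f), axis=2) / feat.size
    return np.fft.ifft2(alpha_f * kxz).real

def fuse_and_locate(responses, weights):
    """Weighted sum of per-layer response maps; returns the signed peak offset."""
    fused = sum(w * r for w, r in zip(weights, responses))
    dy, dx = np.unravel_index(np.argmax(fused), fused.shape)
    h, w = fused.shape
    # Offsets past the midpoint wrap around to negative displacements.
    if dy > h // 2:
        dy -= h
    if dx > w // 2:
        dx -= w
    return int(dy), int(dx)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-ins for two layers of deep features (assumption: same spatial size).
    layers = [rng.standard_normal((32, 32, 8)) for _ in range(2)]
    label = gaussian_label((32, 32))
    models = [train_filter(f, label) for f in layers]
    # Simulate target motion as a circular shift of the features.
    moved = [np.roll(f, (3, -5), axis=(0, 1)) for f in layers]
    responses = [respond(a, m, z) for (a, m), z in zip(models, moved)]
    print(fuse_and_locate(responses, weights=[0.6, 0.4]))
```

In this sketch the fused peak gives the estimated displacement of the target between frames. The scale step in the abstract would follow the same pattern, with a separate correlation filter trained over fHOG features sampled at several scales.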

Key words: visual object tracking, deep residual network, kernelized correlation filter, deep learning, scale estimation

CLC Number: