Long-term Visual Object Tracking of Arbitrary Objects Based on Switching Between Traditional Method and Deep Learning Technique

Document Type : Original Article


1 Faculty of Electrical and Computer Engineering, University of Tabriz, Tabriz, Iran

2 Professor of Electrical Engineering


Visual tracking of arbitrary objects is a fundamental and challenging problem in machine vision. It has traditionally been approached by building a model of the target and training it on data from the same video. Most trackers struggle to match the results of the most popular methods once real-time, online operation is required. This article presents STD-Siam, a tracking framework based on the Siamese network that learns online and tracks in real time. Because a Siamese network has limited capacity for online training and cannot cope with the challenges of long-term tracking, STD-Siam switches between a traditional tracker and a deep learning tracker, training both to resolve the ambiguity between the target and the background in each scenario. Training data is first generated by the traditional tracker and then expanded with augmentation so that the deep network can be trained well. The method runs at 66 FPS and, despite its simplicity, achieves good results compared with similar current algorithms and tracks the target over the long term. This faster-than-real-time speed is due to a spike detector in the frequency domain, which accurately computes the selected target candidates and avoids blindly scanning the entire image, thereby reducing the computational burden.
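
The abstract does not specify how the frequency-domain spike detector is built. As a rough illustration of the general idea only, the sketch below locates a target by computing a cross-correlation response with the FFT and taking its strongest peak, rather than sliding a window over every image position. The function name `correlation_peak` and the use of NumPy are assumptions made for this example and are not taken from STD-Siam.

```python
import numpy as np


def correlation_peak(search, template):
    """Return (row, col) of the strongest circular cross-correlation peak.

    Hypothetical helper for illustration; not the STD-Siam detector.
    `search` and `template` are 2-D arrays, template no larger than search.
    """
    search = np.asarray(search, dtype=float)
    template = np.asarray(template, dtype=float)

    # Zero-pad the mean-removed template to the size of the search region.
    padded = np.zeros_like(search)
    h, w = template.shape
    padded[:h, :w] = template - template.mean()

    # Correlation theorem: corr = IFFT( FFT(search) * conj(FFT(template)) ).
    spectrum = np.fft.fft2(search - search.mean()) * np.conj(np.fft.fft2(padded))
    response = np.fft.ifft2(spectrum).real

    # The spike in the response map marks the best-matching offset.
    row, col = np.unravel_index(np.argmax(response), response.shape)
    return int(row), int(col)
```

The whole response map is obtained from three FFTs, which is why frequency-domain matching is commonly used when a tracker must run faster than real time. A practical tracker would additionally window the patches and judge the sharpness of the peak before trusting it.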


Main Subjects