In this post, we will cover how to use OpenCV’s multi-object tracking API implemented using the MultiTracker class. We will share code in both C++ and Python.
Before we dive into the details, please check previous posts listed below on Object Tracking to understand the basics of single object trackers implemented in OpenCV.
Why do we need Multi-Object Tracking?
Most beginners in Computer Vision and Machine Learning learn about object detection. If you are a beginner, you may wonder why we need object tracking at all. Can’t we just detect objects in every frame?
Let’s explore a few reasons why tracking is useful.
First, when there are multiple objects (say people) detected in a video frame, tracking helps establish the identity of the objects across frames.
Second, in some cases, object detection may fail but it may still be possible to track the object because tracking takes into account the location and appearance of the object in the previous frame.
Third, some tracking algorithms are very fast because they do a local search instead of a global search. So we can obtain a very high frame rate for our system by performing object detection every n-th frame and tracking the object in intermediate frames.
So, why not track the object indefinitely after the first detection? A tracking algorithm may sometimes lose track of the object it is tracking. For example, when the motion of the object is too large, a tracking algorithm may not be able to keep up. So many real-world applications use detection and tracking together.
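The detect-every-n-th-frame idea above can be sketched as a simple scheduling rule. The interval and the function name below are hypothetical, chosen purely for illustration; in a real system you would tune the interval to your detector's speed and your tracker's drift.

```python
# Sketch of a hybrid detect/track schedule: run the expensive detector
# only on every n-th frame, and the cheap tracker on the frames in between.
DETECT_EVERY = 10  # hypothetical interval; tune per detector/tracker

def choose_action(frame_index, detect_every=DETECT_EVERY):
    # Re-detect on frames 0, n, 2n, ...; track on all frames in between
    return "detect" if frame_index % detect_every == 0 else "track"

schedule = [choose_action(i) for i in range(12)]
```

In the main video loop, you would call the detector whenever `choose_action` returns `"detect"` (re-initializing the trackers with fresh boxes) and the tracker's `update` otherwise.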
In this tutorial, we will focus on just the tracking part. The objects we want to track will be specified by dragging a bounding box around them.
Recently, more advanced state-of-the-art trackers have come to light. One of them is DeepSORT, which uses a more advanced association metric than its predecessor, SORT. It uses a YOLO network as the object detection model and can retain person identities for up to 30 frames (the default, which can be changed). FairMOT uses a joint detection and re-ID approach to give real-time results. It can use YOLOv5 or DLA-34 as the backbone and is superior in the re-identification task.
MultiTracker : OpenCV’s Multiple Object Tracker
The MultiTracker class in OpenCV provides an implementation of multi-object tracking. It is a naive implementation because it processes the tracked objects independently without any optimization across the tracked objects.
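This "naive" design can be illustrated in a few lines of plain Python. The sketch below is not OpenCV's actual MultiTracker implementation, just an illustration of the idea; `DummyTracker` is a hypothetical stand-in for any object exposing `init(frame, bbox)` and `update(frame)` in the style of OpenCV's single object trackers.

```python
# Sketch of a naive multi-object tracker: a plain list of independent
# single-object trackers, each updated on its own.
class NaiveMultiTracker:
    def __init__(self):
        self.trackers = []

    def add(self, tracker, frame, bbox):
        # Initialize the single tracker on the given frame and remember it
        tracker.init(frame, bbox)
        self.trackers.append(tracker)

    def update(self, frame):
        # Update every tracker independently; no information is shared
        # between the tracked objects
        results = [t.update(frame) for t in self.trackers]
        ok = all(success for success, _ in results)
        boxes = [box for _, box in results]
        return ok, boxes

class DummyTracker:
    """Hypothetical tracker that pretends every object moves 1 pixel right per frame."""
    def init(self, frame, bbox):
        self.bbox = bbox

    def update(self, frame):
        x, y, w, h = self.bbox
        self.bbox = (x + 1, y, w, h)
        return True, self.bbox

multi = NaiveMultiTracker()
multi.add(DummyTracker(), frame=None, bbox=(10, 20, 30, 40))
multi.add(DummyTracker(), frame=None, bbox=(50, 60, 30, 40))
ok, boxes = multi.update(frame=None)
```

Because each tracker runs its own local search, a failure or drift in one tracked object does not affect the others, which is exactly the behavior of OpenCV's MultiTracker.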
Let’s go over the code step by step to find out how we can use OpenCV’s multi-object tracking API.
Step 1: Create a Single Object Tracker
A multi-object tracker is simply a collection of single object trackers. We start by defining a function that takes a tracker type as input and creates a tracker object. OpenCV has 8 different tracker types: BOOSTING, MIL, KCF, TLD, MEDIANFLOW, GOTURN, MOSSE, CSRT.
If you want to use the GOTURN tracker, please make sure to download the caffe model.
In the code below, given the name of the tracker class, we return the tracker object. This will be later used to populate the multi-tracker.
Python
from __future__ import print_function
import sys
import cv2
from random import randint
trackerTypes = ['BOOSTING', 'MIL', 'KCF','TLD', 'MEDIANFLOW', 'GOTURN', 'MOSSE', 'CSRT']
def createTrackerByName(trackerType):
    # Create a tracker based on tracker name
    if trackerType == trackerTypes[0]:
        tracker = cv2.TrackerBoosting_create()
    elif trackerType == trackerTypes[1]:
        tracker = cv2.TrackerMIL_create()
    elif trackerType == trackerTypes[2]:
        tracker = cv2.TrackerKCF_create()
    elif trackerType == trackerTypes[3]:
        tracker = cv2.TrackerTLD_create()
    elif trackerType == trackerTypes[4]:
        tracker = cv2.TrackerMedianFlow_create()
    elif trackerType == trackerTypes[5]:
        tracker = cv2.TrackerGOTURN_create()
    elif trackerType == trackerTypes[6]:
        tracker = cv2.TrackerMOSSE_create()
    elif trackerType == trackerTypes[7]:
        tracker = cv2.TrackerCSRT_create()
    else:
        tracker = None
        print('Incorrect tracker name')
        print('Available trackers are:')
        for t in trackerTypes:
            print(t)
    return tracker
C++
Note: In addition to including opencv2/opencv.hpp, you also need to include opencv2/tracking.hpp.
#include <opencv2/opencv.hpp>
#include <opencv2/tracking.hpp>
using namespace cv;
using namespace std;
vector<string> trackerTypes = {"BOOSTING", "MIL", "KCF", "TLD", "MEDIANFLOW", "GOTURN", "MOSSE", "CSRT"};
// create tracker by name
Ptr<Tracker> createTrackerByName(string trackerType)
{
    Ptr<Tracker> tracker;
    if (trackerType == trackerTypes[0])
        tracker = TrackerBoosting::create();
    else if (trackerType == trackerTypes[1])
        tracker = TrackerMIL::create();
    else if (trackerType == trackerTypes[2])
        tracker = TrackerKCF::create();
    else if (trackerType == trackerTypes[3])
        tracker = TrackerTLD::create();
    else if (trackerType == trackerTypes[4])
        tracker = TrackerMedianFlow::create();
    else if (trackerType == trackerTypes[5])
        tracker = TrackerGOTURN::create();
    else if (trackerType == trackerTypes[6])
        tracker = TrackerMOSSE::create();
    else if (trackerType == trackerTypes[7])
        tracker = TrackerCSRT::create();
    else
    {
        cout << "Incorrect tracker name" << endl;
        cout << "Available trackers are: " << endl;
        for (vector<string>::iterator it = trackerTypes.begin(); it != trackerTypes.end(); ++it)
            cout << " " << *it << endl;
    }
    return tracker;
}
Step 2: Read First Frame of a Video
A multi-object tracker requires two inputs:
- A video frame
- Location (bounding boxes) of all objects we want to track.
Given this information, the tracker tracks the location of these specified objects in all subsequent frames.
In the code below, we first load the video using the VideoCapture class and read the first frame. This will be used later to initialize the MultiTracker.
Python
# Set video to load
videoPath = "videos/run.mp4"
# Create a video capture object to read videos
cap = cv2.VideoCapture(videoPath)
# Read first frame
success, frame = cap.read()
# quit if unable to read the video file
if not success:
    print('Failed to read video')
    sys.exit(1)
C++
// set default values for tracking algorithm and video
string videoPath = "videos/run.mp4";
// Bounding boxes for the objects we want to track
vector<Rect> bboxes;
// create a video capture object to read videos
cv::VideoCapture cap(videoPath);
Mat frame;
// quit if unable to read the video file
if (!cap.isOpened())
{
    cout << "Error opening video file " << videoPath << endl;
    return -1;
}
// read first frame
cap >> frame;
Step 3: Locate Objects in the First Frame
Next, we need to locate objects we want to track in the first frame. The location is simply a bounding box.
OpenCV provides a function called selectROI that pops up a GUI for selecting bounding boxes (also called regions of interest, or ROIs).
In the C++ version, the selectROIs function allows you to obtain multiple bounding boxes at once, but the Python version of selectROI returns just one bounding box. So, in the Python version, we need a loop to obtain multiple bounding boxes.
For every object, we also select a random color to display the bounding box.
The code is shown below.
Python
## Select boxes
bboxes = []
colors = []
# OpenCV's selectROI function doesn't work for selecting multiple objects in Python
# So we will call this function in a loop till we are done selecting all objects
while True:
    # draw bounding boxes over objects
    # selectROI's default behaviour is to draw box starting from the center
    # when fromCenter is set to false, you can draw box starting from top left corner
    bbox = cv2.selectROI('MultiTracker', frame)
    bboxes.append(bbox)
    colors.append((randint(0, 255), randint(0, 255), randint(0, 255)))
    print("Press q to quit selecting boxes and start tracking")
    print("Press any other key to select next object")
    k = cv2.waitKey(0) & 0xFF
    if k == 113:  # q is pressed
        break
print('Selected bounding boxes {}'.format(bboxes))
C++
// Get bounding boxes for first frame
// selectROI's default behaviour is to draw box starting from the center
// when fromCenter is set to false, you can draw box starting from top left corner
bool showCrosshair = true;
bool fromCenter = false;
cout << "\n==========================================================\n";
cout << "OpenCV says press c to cancel objects selection process" << endl;
cout << "It doesn't work. Press Escape to exit selection process" << endl;
cout << "\n==========================================================\n";
cv::selectROIs("MultiTracker", frame, bboxes, showCrosshair, fromCenter);
// quit if there are no objects to track
if (bboxes.size() < 1)
    return 0;
vector<Scalar> colors;
getRandomColors(colors, bboxes.size());
The getRandomColors function is rather simple:
// Fill the vector with random colors
void getRandomColors(vector<Scalar>& colors, int numColors)
{
    RNG rng(0);
    for (int i = 0; i < numColors; i++)
        colors.push_back(Scalar(rng.uniform(0, 255), rng.uniform(0, 255), rng.uniform(0, 255)));
}
Step 4: Initialize the MultiTracker
Until now, we have read the first frame and obtained bounding boxes around objects. That is all the information we need to initialize the multi-object tracker.
We first create a MultiTracker object and add as many single object trackers to it as we have bounding boxes. In this example, we use the CSRT single object tracker, but you can try other trackers by changing the trackerType variable below to one of the 8 tracker types mentioned at the beginning of this post. The CSRT tracker is not the fastest, but it produced the best results in many of the cases we tried.
You can also mix different tracker types inside the same MultiTracker, but of course, it rarely makes sense to do so.
The MultiTracker class is simply a wrapper for these single object trackers. As we know from our previous post, a single object tracker is initialized using the first frame and the bounding box indicating the location of the object we want to track. The MultiTracker passes this information over to the single object trackers it wraps internally.
Python
# Specify the tracker type
trackerType = "CSRT"
# Create MultiTracker object
multiTracker = cv2.MultiTracker_create()
# Initialize MultiTracker
for bbox in bboxes:
    multiTracker.add(createTrackerByName(trackerType), frame, bbox)
C++
// Specify the tracker type
string trackerType = "CSRT";
// Create multitracker
Ptr<MultiTracker> multiTracker = cv::MultiTracker::create();
// Initialize multitracker
for (int i = 0; i < bboxes.size(); i++)
    multiTracker->add(createTrackerByName(trackerType), frame, Rect2d(bboxes[i]));
Step 5: Update MultiTracker & Display Results
Finally, our MultiTracker is ready and we can track multiple objects in a new frame. We use the update method of the MultiTracker class to locate the objects in a new frame. Each bounding box for each tracked object is drawn using a different color.
Python
# Process video and track objects
while cap.isOpened():
    success, frame = cap.read()
    if not success:
        break

    # get updated location of objects in subsequent frames
    success, boxes = multiTracker.update(frame)

    # draw tracked objects
    for i, newbox in enumerate(boxes):
        p1 = (int(newbox[0]), int(newbox[1]))
        p2 = (int(newbox[0] + newbox[2]), int(newbox[1] + newbox[3]))
        cv2.rectangle(frame, p1, p2, colors[i], 2, 1)

    # show frame
    cv2.imshow('MultiTracker', frame)

    # quit on ESC button
    if cv2.waitKey(1) & 0xFF == 27:  # Esc pressed
        break
C++
while (cap.isOpened())
{
    // get frame from the video
    cap >> frame;

    // stop the program if we reached the end of the video
    if (frame.empty()) break;

    // update the tracking result with the new frame
    multiTracker->update(frame);

    // draw tracked objects
    for (unsigned i = 0; i < multiTracker->getObjects().size(); i++)
    {
        rectangle(frame, multiTracker->getObjects()[i], colors[i], 2, 1);
    }

    // show frame
    imshow("MultiTracker", frame);

    // quit on ESC button
    if (waitKey(1) == 27) break;
}