In this post, we will cover how to use OpenCV’s multi-object tracking API implemented using the MultiTracker class. We will share code in both C++ and Python.
Before we dive into the details, please check previous posts listed below on Object Tracking to understand the basics of single object trackers implemented in OpenCV.
Why do we need Multi-Object Tracking?
Most beginners in Computer Vision and Machine Learning learn about object detection first. If you are a beginner, you may wonder why we need object tracking at all. Can’t we just detect objects in every frame?
Let’s explore a few reasons why tracking is useful.
First, when there are multiple objects (say people) detected in a video frame, tracking helps establish the identity of the objects across frames.
Second, in some cases, object detection may fail but it may still be possible to track the object because tracking takes into account the location and appearance of the object in the previous frame.
Third, some tracking algorithms are very fast because they do a local search instead of a global search. So we can obtain a very high frame rate for our system by performing object detection every n-th frame and tracking the object in intermediate frames.
So, why not track the object indefinitely after the first detection? A tracking algorithm may sometimes lose track of the object it is tracking. For example, when the motion of the object is too large, a tracking algorithm may not be able to keep up. So many real-world applications use detection and tracking together.
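The detect-every-n-th-frame pattern described above can be sketched in a few lines. This is only an illustration of the control flow; the detect() and track() functions below are hypothetical stand-ins for a real detector and tracker, not OpenCV API calls:

```python
# Hybrid detection + tracking loop: run the (expensive) detector every
# DETECT_EVERY frames and the (cheap) tracker on the frames in between.
DETECT_EVERY = 10

def detect(frame):
    # Stand-in for an object detector; returns a list of bounding boxes.
    return [(10, 10, 50, 80)]

def track(frame, boxes):
    # Stand-in for a tracker update; returns refreshed boxes for this frame.
    return boxes

def process(frames):
    boxes = []
    history = []
    for i, frame in enumerate(frames):
        if i % DETECT_EVERY == 0:
            boxes = detect(frame)        # re-initialize from detection
        else:
            boxes = track(frame, boxes)  # cheap local update in between
        history.append(boxes)
    return history
```

With 30 frames, the detector runs only on frames 0, 10 and 20; the tracker fills in the rest, which is where the frame-rate win comes from.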
In this tutorial, we will focus on just the tracking part. The objects we want to track will be specified by dragging a bounding box around them.
Recently, more advanced state-of-the-art trackers have come to light. One of them is DeepSORT, which uses a more advanced association metric than its predecessor, SORT. It uses a YOLO network as the object detection model and can retain person identities for up to 30 frames (the default, which can be changed). FairMOT uses a joint detection and re-identification approach to produce real-time results. It can use YOLOv5 or DLA-34 as the backbone and is superior in the re-identification task.
MultiTracker: OpenCV’s Multiple Object Tracker
The MultiTracker class in OpenCV provides an implementation of multi-object tracking. It is a naive implementation because it processes the tracked objects independently without any optimization across the tracked objects.
Let’s go over the code step by step to find out how we can use OpenCV’s multi-object tracking API.
Step 1: Create a Single Object Tracker
A multi-object tracker is simply a collection of single object trackers. We start by defining a function that takes a tracker type as input and creates a tracker object. OpenCV has 8 different tracker types: BOOSTING, MIL, KCF, TLD, MEDIANFLOW, GOTURN, MOSSE, CSRT.
If you want to use the GOTURN tracker, please make sure to download the Caffe model first.
In the code below, given the name of the tracker class, we return the tracker object. This will be later used to populate the multi-tracker.
Python
from __future__ import print_function
import sys
import cv2
from random import randint
trackerTypes = ['BOOSTING', 'MIL', 'KCF','TLD', 'MEDIANFLOW', 'GOTURN', 'MOSSE', 'CSRT']
def createTrackerByName(trackerType):
  # Create a tracker based on tracker name
  if trackerType == trackerTypes[0]:
    tracker = cv2.TrackerBoosting_create()
  elif trackerType == trackerTypes[1]:
    tracker = cv2.TrackerMIL_create()
  elif trackerType == trackerTypes[2]:
    tracker = cv2.TrackerKCF_create()
  elif trackerType == trackerTypes[3]:
    tracker = cv2.TrackerTLD_create()
  elif trackerType == trackerTypes[4]:
    tracker = cv2.TrackerMedianFlow_create()
  elif trackerType == trackerTypes[5]:
    tracker = cv2.TrackerGOTURN_create()
  elif trackerType == trackerTypes[6]:
    tracker = cv2.TrackerMOSSE_create()
  elif trackerType == trackerTypes[7]:
    tracker = cv2.TrackerCSRT_create()
  else:
    tracker = None
    print('Incorrect tracker name')
    print('Available trackers are:')
    for t in trackerTypes:
      print(t)
  return tracker
C++
Note: In addition to including opencv2/opencv.hpp, you also need to include opencv2/tracking.hpp.
#include <opencv2/opencv.hpp>
#include <opencv2/tracking.hpp>
using namespace cv;
using namespace std;
vector<string> trackerTypes = {"BOOSTING", "MIL", "KCF", "TLD", "MEDIANFLOW", "GOTURN", "MOSSE", "CSRT"};
// create tracker by name
Ptr<Tracker> createTrackerByName(string trackerType)
{
  Ptr<Tracker> tracker;
  if (trackerType == trackerTypes[0])
    tracker = TrackerBoosting::create();
  else if (trackerType == trackerTypes[1])
    tracker = TrackerMIL::create();
  else if (trackerType == trackerTypes[2])
    tracker = TrackerKCF::create();
  else if (trackerType == trackerTypes[3])
    tracker = TrackerTLD::create();
  else if (trackerType == trackerTypes[4])
    tracker = TrackerMedianFlow::create();
  else if (trackerType == trackerTypes[5])
    tracker = TrackerGOTURN::create();
  else if (trackerType == trackerTypes[6])
    tracker = TrackerMOSSE::create();
  else if (trackerType == trackerTypes[7])
    tracker = TrackerCSRT::create();
  else {
    cout << "Incorrect tracker name" << endl;
    cout << "Available trackers are: " << endl;
    for (vector<string>::iterator it = trackerTypes.begin(); it != trackerTypes.end(); ++it)
      cout << " " << *it << endl;
  }
  return tracker;
}
Step 2: Read First Frame of a Video
A multi-object tracker requires two inputs:
- A video frame
- Location (bounding boxes) of all objects we want to track.
Given this information, the tracker tracks the location of these specified objects in all subsequent frames.
In the code below, we first load the video using the VideoCapture class and read the first frame. This will be used later to initialize the MultiTracker.
Python
# Set video to load
videoPath = "videos/run.mp4"
# Create a video capture object to read videos
cap = cv2.VideoCapture(videoPath)
# Read first frame
success, frame = cap.read()
# quit if unable to read the video file
if not success:
  print('Failed to read video')
  sys.exit(1)
C++
// set default values for tracking algorithm and video
string videoPath = "videos/run.mp4";
// Initialize MultiTracker with tracking algo
vector<Rect> bboxes;
// create a video capture object to read videos
cv::VideoCapture cap(videoPath);
Mat frame;
// quit if unable to read video file
if(!cap.isOpened())
{
  cout << "Error opening video file " << videoPath << endl;
  return -1;
}
// read first frame
cap >> frame;
Step 3: Locate Objects in the First Frame
Next, we need to locate objects we want to track in the first frame. The location is simply a bounding box.
OpenCV provides a function called selectROI that pops up a GUI to select bounding boxes (also called a Region of Interest (ROI)).
In the C++ version, selectROI allows you to obtain multiple bounding boxes, but in the Python version, it returns just one bounding box. So, in the Python version, we need a loop to obtain multiple bounding boxes.
For every object, we also select a random color to display the bounding box.
The code is shown below.
Python
## Select boxes
bboxes = []
colors = []
# OpenCV's selectROI function doesn't work for selecting multiple objects in Python
# So we will call this function in a loop till we are done selecting all objects
while True:
  # draw bounding boxes over objects
  # selectROI's default behaviour is to draw box starting from the center
  # when fromCenter is set to false, you can draw box starting from top left corner
  bbox = cv2.selectROI('MultiTracker', frame)
  bboxes.append(bbox)
  colors.append((randint(0, 255), randint(0, 255), randint(0, 255)))
  print("Press q to quit selecting boxes and start tracking")
  print("Press any other key to select next object")
  k = cv2.waitKey(0) & 0xFF
  if (k == 113):  # q is pressed
    break
print('Selected bounding boxes {}'.format(bboxes))
C++
// Get bounding boxes for first frame
// selectROI's default behaviour is to draw box starting from the center
// when fromCenter is set to false, you can draw box starting from top left corner
bool showCrosshair = true;
bool fromCenter = false;
cout << "\n==========================================================\n";
cout << "OpenCV says press c to cancel objects selection process" << endl;
cout << "It doesn't work. Press Escape to exit selection process" << endl;
cout << "\n==========================================================\n";
cv::selectROIs("MultiTracker", frame, bboxes, showCrosshair, fromCenter);
// quit if there are no objects to track
if(bboxes.size() < 1)
  return 0;
vector<Scalar> colors;
getRandomColors(colors, bboxes.size());
The getRandomColors function is rather simple:
// Fill the vector with random colors
void getRandomColors(vector<Scalar>& colors, int numColors)
{
  RNG rng(0);
  for(int i=0; i < numColors; i++)
    colors.push_back(Scalar(rng.uniform(0,255), rng.uniform(0, 255), rng.uniform(0, 255)));
}
Step 4: Initialize the MultiTracker
Until now, we have read the first frame and obtained bounding boxes around objects. That is all the information we need to initialize the multi-object tracker.
We first create a MultiTracker object and add as many single object trackers to it as we have bounding boxes. In this example, we use the CSRT single object tracker, but you can try other tracker types by changing the trackerType variable below to one of the 8 tracker types mentioned at the beginning of this post. The CSRT tracker is not the fastest, but it produced the best results in many of the cases we tried.
You can also mix different tracker types inside the same MultiTracker, though it rarely makes sense to do so.
The MultiTracker class is simply a wrapper for these single object trackers. As we know from our previous post, a single object tracker is initialized using the first frame and the bounding box indicating the location of the object we want to track. The MultiTracker passes this information over to the single object trackers it is wrapping internally.
Python
# Specify the tracker type
trackerType = "CSRT"
# Create MultiTracker object
multiTracker = cv2.MultiTracker_create()
# Initialize MultiTracker
for bbox in bboxes:
  multiTracker.add(createTrackerByName(trackerType), frame, bbox)
C++
// Specify the tracker type
string trackerType = "CSRT";
// Create multitracker
Ptr<MultiTracker> multiTracker = cv::MultiTracker::create();
// Initialize multitracker
for(int i=0; i < bboxes.size(); i++)
  multiTracker->add(createTrackerByName(trackerType), frame, Rect2d(bboxes[i]));
Step 5: Update MultiTracker & Display Results
Finally, our MultiTracker is ready and we can track multiple objects in a new frame. We use the update method of the MultiTracker class to locate the objects in a new frame. Each bounding box for each tracked object is drawn using a different color.
Python
# Process video and track objects
while cap.isOpened():
  success, frame = cap.read()
  if not success:
    break
  # get updated location of objects in subsequent frames
  success, boxes = multiTracker.update(frame)
  # draw tracked objects
  for i, newbox in enumerate(boxes):
    p1 = (int(newbox[0]), int(newbox[1]))
    p2 = (int(newbox[0] + newbox[2]), int(newbox[1] + newbox[3]))
    cv2.rectangle(frame, p1, p2, colors[i], 2, 1)
  # show frame
  cv2.imshow('MultiTracker', frame)
  # quit on ESC button
  if cv2.waitKey(1) & 0xFF == 27:  # Esc pressed
    break
C++
while(cap.isOpened())
{
  // get frame from the video
  cap >> frame;
  // Stop the program if reached end of video
  if (frame.empty()) break;
  // Update the tracking result with new frame
  multiTracker->update(frame);
  // Draw tracked objects
  for(unsigned i=0; i < multiTracker->getObjects().size(); i++)
  {
    rectangle(frame, multiTracker->getObjects()[i], colors[i], 2, 1);
  }
  // Show frame
  imshow("MultiTracker", frame);
  // quit on ESC button
  if (waitKey(1) == 27) break;
}
I cannot find the Tracker definitions in OpenCV 3.4.1.
Did you compile with opencv_contrib?
I couldn’t find the library (opencv_trackingxxxd.lib) for Ptr; Visual Studio shows an error when I try to compile.
Hi, I have a similar problem with the Threshold module. I am using Python 3.6 / 64-bit, OpenCV 3.4.1. Installation via Anaconda (packages OpenCV, libopencv and py-opencv).
Any idea what is wrong?
Hi Roman! Sorry for the late reply. You should compile opencv from source with opencv_contrib.
Hello, thanks for simplifying the process of learning OpenCV with well-documented code.
I wanted to know how I can detect tracking failure for a single object in the MultiTracker, so that I can remove it from the list of objects being tracked and perhaps re-run detection to add new objects?
Thanks, Isaac. The multi-tracker does not do a good job in this respect. But it is a naive implementation, and so it is not doing anything very special. You can write a simple class that can hold an array of trackers and that will give you all the flexibility you are looking for.
Thank you. I was implementing that, but I have always wondered whether I was reinventing the wheel or whether there is a more efficient way of doing so.
I was also looking for a way to clean up a multitracker, but I did not succeed (like many others on the internet).
There is no method to remove the trackers from the multitracker, and every call to selectROIs adds new rectangles to the multitracker.
Maybe Isaac comes with a solution?
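Following Satya’s suggestion above, a thin wrapper around a plain list of single object trackers gives you the add/remove flexibility that cv2.MultiTracker lacks. Below is a minimal sketch; the class name and structure are illustrative, and the only assumption it makes is OpenCV’s convention that a tracker’s update(frame) returns an (ok, box) pair:

```python
class RemovableMultiTracker:
    """Holds single-object trackers in a plain list so failed ones can be dropped."""

    def __init__(self):
        self.trackers = []

    def add(self, tracker):
        # tracker: any object whose update(frame) returns (ok, box)
        self.trackers.append(tracker)

    def update(self, frame):
        # Update every tracker; keep only the ones that still report success.
        boxes, alive = [], []
        for t in self.trackers:
            ok, box = t.update(frame)
            if ok:
                alive.append(t)
                boxes.append(box)
        self.trackers = alive
        return boxes
```

In real code, each entry would be something like createTrackerByName('CSRT') initialized with tracker.init(frame, bbox), and after each update you could re-run detection to add() replacements for the trackers that were dropped.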
Thanks for the articles. This interface got me excited to try it out. Unfortunately, due to the issue discussed here, I think the OpenCV multi-tracker, as it exists, is of little value. I wonder what their imagined use model is. Thanks.
What happens if
a) a bounding box goes out of the domain ?
b) a bounding box (and the interesting things inside it) is hidden by another moving object?
Sometimes a tracker can recover when the object is occluded for a frame or two, but if the occlusion lasts a long time it cannot. That is why trackers are used in conjunction with detection.
Great article..my object trackers worked perfectly thanks Satya Mallick.
Thanks, Mark.
Thanks for making the tutorial; however, I can’t create a cv2.Tracker … in OpenCV 3.2.0. I also installed opencv-contrib-python but it still does not exist. I am using Python 3.6.
I am not sure if the tracker class was available in OpenCV 3.2. I have tested it on 3.4.1
Hey Satya,
I tried compiling the code above, but after selecting the first bounding box, it freezes. When I press q to stop selecting, it freezes once again. When I simultaneously press q and Esc, it outputs the bounding box dimensions; however, there is always a (0,0,0,0) box and I get the following error: Process finished with exit code 136 (interrupted by signal 8: SIGFPE)
I want to mention that I’m trying to do this live, with VideoCapture instead of with a pre-recorded video
To accept rectangles in selectROIs(…) you have to press Space, not q; to exit selectROIs(…), press Esc.
I experimented with the MOSSE, KCF and MedianFlow trackers; all work fine. But only with MedianFlow does update() return rectangles that vary with the size of the object.
This gives some clue about movements in the 3rd dimension (towards and away from the camera). But after some time the rectangles grow and do not return to their original size. I use the default params. Maybe someone knows how to change these to prevent the growing? (It looks as if increasing params.maxLevel, e.g. from 5 to 50, has a positive effect…)
I think the tracker is “learning” about the environment of the detected objects, which causes the growing. Maybe there is a way to make the tracker return now and then to the original rectangle from selectROIs(…)?
CSRT and GOTURN also vary the size.
Most trackers fail when asked to track for very long durations. So, you have to update them with a detector once in a while.
For MedianFlow maxLevel will help track larger motions but it becomes computationally more expensive.
Thanks Satya,
CSRT shows very little variation in my rectangles; GOTURN does not work (at the moment). I read about the long-duration problems; they are not easy to solve, even with sophisticated algorithms.
There is also a MultiTrackerTLD (MultiTracker_Alt); maybe that offers more possibilities for timed updates with the original rectangles or pictures?
Can I have your C++ code? Thank you, my email is [email protected]
Follow the link below
Multi-Object Tracker
Satya Mallick ~ Can CUDA on an NVIDIA GPU speed up “Multiple Object Tracking” for tracking people on camera? Thank you!
None of these trackers are production ready. Can you recommend a production grade algorithm ?
Strange developer problem:
while developing (OSX 10.11.6, Qt Creator, C++, OpenCV 3.4.1 and opencv_contrib), TrackerMedianFlow works fine in my app. When making a standalone version (with all the frameworks, dylibs and plugins packed into an OSX app), only with this particular tracker do I get:
dyld: lazy symbol binding failed: Symbol not found: __ZN2cv23buildOpticalFlowPyramidERKNS_11_InputArrayERKNS_12_OutputArrayENS_5Size_IiEEibiib Referenced from: […]/Contents/MacOS/./../Frameworks/libopencv_tracking.3.4.1.dylib.
libopencv_tracking.3.4.1.dylib is present in the app and other tracker types work fine.
Is something wrong with buildOpticalFlowPyramid?
Is this an OpenCV bug or do I miss some CMake flags while building the (shared) libraries?
This problem is called “name mangling”. I have a note on this problem from 10 years back :).
It says you encounter this issue if you have a function that is not defined in the right namespace or classname. For example, if a method was supposed to be MyClass::MyMethod(int, int) but while implementing it you forgot the class name and simply did MyMethod(int, int).
Not sure how to solve your exact problem, but I hope this pointer was useful.
Hmm, then it seems to be an OpenCV issue. I tested this with a simple app based on the code of your MultiObjectTracker and got exactly the same error, only with TrackerMedianFlow, not with the others.
I posted the issue on GitHub too:
https://github.com/opencv/opencv_contrib/issues/1723
I wonder how can one get object ID out of the Multitrack object – it helps to know if an object has left the scene or not.
For that I would suggest writing your own multi object tracker using the single object trackers in OpenCV.
https://learnopencv.com/object-tracking-using-opencv-cpp-python/
That will give you access to when individual trackers lose track. Unfortunately, the tracking failed flag output by these trackers is not always reliable.
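Such a hand-rolled multi-object tracker can also carry the object IDs the question above asks about. The sketch below is illustrative (the class and its names are not an OpenCV API); it assumes, as before, that each tracker’s update(frame) returns an (ok, box) pair, and keep in mind Satya’s caveat that the success flag is not always reliable:

```python
class IdentityTracker:
    """Pairs each single-object tracker with a stable integer ID, so you can
    tell which object left the scene when its tracker reports failure."""

    def __init__(self):
        self.entries = {}      # id -> tracker
        self._next_id = 0

    def add(self, tracker):
        obj_id = self._next_id
        self._next_id += 1
        self.entries[obj_id] = tracker
        return obj_id

    def update(self, frame):
        # Returns (id -> box) for objects still tracked, plus the IDs just lost.
        results, lost = {}, []
        for obj_id, t in list(self.entries.items()):
            ok, box = t.update(frame)
            if ok:
                results[obj_id] = box
            else:
                lost.append(obj_id)
                del self.entries[obj_id]   # drop the failed tracker
        return results, lost
```

Because IDs are never reused, the `lost` list tells you exactly which objects disappeared between frames, which is the signal you need to re-run detection for them.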
Hello sir, thanks for sharing this tutorial. The program works well with the video provided with the code, but when I use another video I get a problem. I’m attaching a snapshot; please reply.
https://uploads.disquscdn.com/images/d55e9928156abe5484e5d1e311415df381a1cc9b9b84b4abd2a8bed7991dae4c.png
Can anybody send me the zip of the code? I tried to download it by clicking the download link, but it did not arrive.
Sorry Benya,
I will check what is going on. Please use this link
https://github.com/spmallick/learnopencv/tree/master/MultiObjectTracker
Satya
Thanks, the pictures are displaying but not moving at all… what can I do?
Every time I try to download the code, it links me to the subscription page; even after subscribing, I still can’t download the code. Anyone have the same issue?
I will check what is going on. Please use this link
https://github.com/spmallick/learnopencv/tree/master/MultiObjectTracker
Hi Satya,
is there any way to retrieve the x and y coordinates of the tracking boxes (ROIs) in all frames?
Got it!
Hello
I get this unhandled exception (translated from Italian): “Unhandled exception generated: read access violation. **_Pnext** was 0x100000009.”
The program starts and allows me to select which elements I want to track, but a little while after I press “esc” the exception is thrown. I think the problem is in the vector standard header at line 1943.
How can I solve it?
Hello, thanks for sharing well-documented code.
I ran the code, but I am not able to select multiple objects. How can I use the code for tracking multiple objects at a time?