 Edit

# Simple Background Estimation in Videos using OpenCV (C++/Python) In many computer vision applications, the processing power at your disposal is low. In such cases, we have to use simple, yet effective techniques.

In this post, we will cover one such technique for estimating the background of a scene when the camera is static and there are some moving objects in the scene. This scenario is not uncommon. For example, many traffic and surveillance cameras are rigidly fixed.

This post has been tested on OpenCV 4.4.

## Temporal Median Filtering

To understand the idea we are going to describe in this post, let us consider a simpler problem in 1D.

Suppose we are estimating a quantity (say the temperature of the room) every 10 milliseconds.

Let’s say, the temperature of the room is 70 degrees Fahrenheit.

In the above figure, we have shown measurements from two thermometers — a good thermometer and a bad thermometer.

I’ve partnered exclusively with OpenCV.org to bring you official courses in AI, Computer Vision, and Deep Learning to take you on a structured path from first steps to mastery.

The good thermometer shown on the left reports 70 degrees with some level of Gaussian noise. To get a more accurate estimate of temperature, we can simply average the values over a few seconds. Since the noise is Gaussian with positive and negative values, the average will cancel out the noise. Indeed, the average value in this specific case is 70.01.

The bad thermometer, on the other hand, behaves like the good thermometer most of the time but occasionally, the numbers are completely wrong.

In fact, if we take an average of the numbers reported by the bad thermometer we get 71.07 degrees. This is clearly an overestimate.

Can we still get a good estimate of temperature?

The answer is YES. When the data contains outliers, the median is a more robust estimate of the value we are trying to estimate.

The median is the mid-value of the data when it is sorted in ascending or descending order.

The median of the curve shown above is 70.05 degrees which is a much better estimate than 71.07 degrees.

The only downside is that the median is more expensive to calculate compared to the mean/average.

## Using the Median for Background Estimation

Now, let’s return to the problem of estimating the background when the camera is static.

We can assume most of the time, every pixel sees the same piece of the background because the camera is not moving. Occasionally, a car or other moving object comes in the front and obscure the background.

For a video sequence, we can randomly sample a few frames (say 25 frames).

In other words, for every pixel, we now have 25 estimates of the background. As long as a pixel is not covered by a car or other moving object more than 50% of the time, the median of the pixel over these 25 frames will give a good estimate of the background at that pixel.

We can repeat this for every pixel and recover the entire background.

## Background Estimation Code (C++ / Python)

Now, let’s look at the actual code.

### Python code

```import numpy as np
import cv2
from skimage import data, filters

# Open Video
cap = cv2.VideoCapture('video.mp4')

# Randomly select 25 frames
frameIds = cap.get(cv2.CAP_PROP_FRAME_COUNT) * np.random.uniform(size=25)

# Store selected frames in an array
frames = []
for fid in frameIds:
cap.set(cv2.CAP_PROP_POS_FRAMES, fid)
frames.append(frame)

# Calculate the median along the time axis
medianFrame = np.median(frames, axis=0).astype(dtype=np.uint8)

# Display median frame
cv2.imshow('frame', medianFrame)
cv2.waitKey(0)
```

### C++ code

```#include <opencv2/opencv.hpp>
#include <iostream>
#include <random>

using namespace std;
using namespace cv;
```

We will create a couple of functions to calculate the median frame.

```int computeMedian(vector<int> elements)
{
nth_element(elements.begin(), elements.begin()+elements.size()/2, elements.end());

//sort(elements.begin(),elements.end());
return elements[elements.size()/2];
}

cv::Mat compute_median(std::vector<cv::Mat> vec)
{
// Note: Expects the image to be CV_8UC3
cv::Mat medianImg(vec.rows, vec.cols, CV_8UC3, cv::Scalar(0, 0, 0));

for(int row=0; row<vec.rows; row++)
{
for(int col=0; col<vec.cols; col++)
{
std::vector<int> elements_B;
std::vector<int> elements_G;
std::vector<int> elements_R;

for(int imgNumber=0; imgNumber<vec.size(); imgNumber++)
{
int B = vec[imgNumber].at<cv::Vec3b>(row, col);
int G = vec[imgNumber].at<cv::Vec3b>(row, col);
int R = vec[imgNumber].at<cv::Vec3b>(row, col);

elements_B.push_back(B);
elements_G.push_back(G);
elements_R.push_back(R);
}

medianImg.at<cv::Vec3b>(row, col)= computeMedian(elements_B);
medianImg.at<cv::Vec3b>(row, col)= computeMedian(elements_G);
medianImg.at<cv::Vec3b>(row, col)= computeMedian(elements_R);
}
}
return medianImg;
}
```
```int main(int argc, char const *argv[])
{
std::string video_file;
if(argc > 1)
{
video_file = argv;
} else
{
video_file = "video.mp4";
}

VideoCapture cap(video_file);
if(!cap.isOpened())
cerr << "Error opening video file\n";

// Randomly select 25 frames
default_random_engine generator;
uniform_int_distribution<int>distribution(0,
cap.get(CAP_PROP_FRAME_COUNT));

vector<Mat> frames;
Mat frame;

for(int i=0; i<25; i++)
{
int fid = distribution(generator);
cap.set(CAP_PROP_POS_FRAMES, fid);
Mat frame;
cap >> frame;
if(frame.empty())
continue;
frames.push_back(frame);
}
// Calculate the median along the time axis
Mat medianFrame = compute_median(frames);

// Display median frame
imshow("frame", medianFrame);
waitKey(0);
}
```

As you can see, we randomly select 25 frames and calculate the median of every pixel over the 25 frames. This median frame is a good estimate of the background as long as every pixel sees the background at least 50% of the time.

The result is shown below The estimate of the background is the median frame calculated by finding the median value of every pixel over 25 frames.

## Frame differencing (C++ / Python)

The obvious next question is if we can create a mask for every frame which shows parts of the image that are in motion.

This is accomplished in the following steps

1. Convert the median frame to grayscale.
2. Loop over all frames in the video. Extract the current frame and convert it to grayscale.
3. Calcualte the absolute difference between the current frame and the median frame.
4. Threshold the above image to remove noise and binarize the output.

Let’s check out the code.

### Python code

```# Reset frame number to 0
cap.set(cv2.CAP_PROP_POS_FRAMES, 0)

# Convert background to grayscale
grayMedianFrame = cv2.cvtColor(medianFrame, cv2.COLOR_BGR2GRAY)

# Loop over all frames
ret = True
while(ret):

# Convert current frame to grayscale
frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
# Calculate absolute difference of current frame and
# the median frame
dframe = cv2.absdiff(frame, grayMedianFrame)
# Treshold to binarize
th, dframe = cv2.threshold(dframe, 30, 255, cv2.THRESH_BINARY)
# Display image
cv2.imshow('frame', dframe)
cv2.waitKey(20)

# Release video object
cap.release()

# Destroy all windows
cv2.destroyAllWindows()

```

### C++ code

```//  Reset frame number to 0
cap.set(CAP_PROP_POS_FRAMES, 0);

// Convert background to grayscale
Mat grayMedianFrame;
cvtColor(medianFrame, grayMedianFrame, COLOR_BGR2GRAY);

// Loop over all frames
while(1)
{
cap >> frame;

if (frame.empty())
break;

// Convert current frame to grayscale
cvtColor(frame, frame, COLOR_BGR2GRAY);

// Calculate absolute difference of current frame and the median frame
Mat dframe;
absdiff(frame, grayMedianFrame, dframe);

// Threshold to binarize
threshold(dframe, dframe, 30, 255, THRESH_BINARY);

// Display Image
imshow("frame", dframe);
waitKey(20);
}

cap.release();
return 0;
}
```

## Results

The video below shows the output of background estimation and frame differencing.

FREE Python Course
We have designed this Python course in collaboration with OpenCV.org for you to build a strong foundation in the essential elements of Python, Jupyter, NumPy and Matplotlib.
FREE OpenCV Crash Course
We have designed this FREE crash course in collaboration with OpenCV.org to help you take your first steps into the fascinating world of Artificial Intelligence and Computer Vision. The course will be delivered straight into your mailbox.   