Minified OpenCV Haar and LBP Cascades

In this post, I am sharing Haar and LBP object detection cascades that perform the same as the cascades shipped with OpenCV but have much smaller file sizes. I will also explain the ideas behind this minification.

Why do you care about the file size?

Let’s say you are building a mobile app. Downloading the app is the user’s first experience with it, and the smaller the app, the better that first experience. You do not want a 2MB face detector in your app. Smaller models also load a bit faster because the parser has less text to parse. So, we strive for:

Less but better

How to reduce the size of OpenCV Haar and LBP Cascades?

I am sharing three ideas for reducing the size of Haar and LBP cascades. Two of these ideas are implemented in the cascades I have shared.

Idea 1 : Minify XML by removing white spaces

Haar and LBP cascades that come with OpenCV are simple XML files. They contain a lot of white space (spaces, newlines, tabs) that plays no role in defining the cascade, so we simply remove it. Note that not all white space can be removed. For example, inside the cascades you will find structures of this form:

<internalNodes>
    0 -1 367 -1.2275323271751404e-02
</internalNodes>

You should not remove the white spaces between the numbers.
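Here is a minimal sketch of this idea in Python. It is not the script used to produce the shared cascades; the function name minify_whitespace is hypothetical, and the sketch assumes the file has no attribute values or text in which white space is significant. Check that OpenCV still loads the result before relying on it.

```python
import re

def minify_whitespace(xml_text):
    """Remove white space that is not needed to separate values."""
    # Collapse every run of spaces, tabs and newlines into a single space.
    # This keeps one separator between the numbers inside an element.
    text = re.sub(r"\s+", " ", xml_text)
    # Drop the space directly after a closing '>' and directly before an opening '<'.
    text = re.sub(r">\s+", ">", text)
    text = re.sub(r"\s+<", "<", text)
    return text.strip()

sample = "<internalNodes>\n    0 -1 367 -1.2275323271751404e-02\n</internalNodes>"
print(minify_whitespace(sample))
```

The numbers stay separated by exactly one space, which is the part that must be preserved.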

Idea 2 : Remove unnecessary precision

I occasionally tuned into the news about the 2016 US Presidential Election. Nate Silver, who is famous for calling many election results, predicted a 24.8% chance of a Trump win. One of my favorite authors, Nassim Nicholas Taleb, countered that prediction with:

Quite insulting to probability when someone “estimates” noise with precision: 24.8%, not 25 or 24.

His point is that polling is extremely noisy. Why create a false sense of precision when none exists?

Precision is often unnecessary. The OpenCV cascades contain many floating point numbers written out in double precision. You can truncate them without affecting detection accuracy in practice. E.g. -1.2275323271751404e-02 can be replaced by -1.2275e-02.
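The truncation can be sketched with a regular expression. This is an illustration, not the script behind the shared cascades: truncate_precision is a hypothetical name, the default of 4 digits simply mirrors the example above, and the pattern only touches numbers written in exponent notation. Numbers without an exponent and integers are left unchanged.

```python
import re

# A number in exponent notation: sign and integer part, mantissa digits, exponent.
_FLOAT = re.compile(r"(-?\d+\.)(\d+)(e[+-]?\d+)", re.IGNORECASE)

def truncate_precision(text, digits=4):
    """Keep only the first `digits` mantissa digits of each exponent-notation number."""
    def shorten(match):
        head, mantissa, exponent = match.groups()
        return head + mantissa[:digits] + exponent
    return _FLOAT.sub(shorten, text)

line = "0 -1 367 -1.2275323271751404e-02"
print(truncate_precision(line))
```

Because this truncates rather than rounds, each value can move slightly toward zero, so it is worth comparing detections from the original and the shortened cascade on a few images.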

Idea 3 : Binarize the cascades

The final idea is to convert the XML file into a binary format. Of course, this makes the file incompatible with the OpenCV loader, and you will have to write your own loader for the cascade. But if the model is large, it may be worth the effort.
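To give a feel for the savings, here is a small sketch that packs the numbers of one node into bytes with Python's struct module. The layout (a 4-byte count followed by 32-bit floats) is made up for illustration and is not an OpenCV format; a real binary cascade would also have to encode the stage and feature structure, and would need a matching custom loader.

```python
import struct

def pack_floats(values):
    """Pack values as a little-endian uint32 count followed by float32 numbers."""
    return struct.pack(f"<I{len(values)}f", len(values), *values)

def unpack_floats(blob):
    """Inverse of pack_floats: read the count, then that many float32 numbers."""
    (count,) = struct.unpack_from("<I", blob, 0)
    return list(struct.unpack_from(f"<{count}f", blob, 4))

node_text = "0 -1 367 -1.2275e-02"
node = [float(v) for v in node_text.split()]
blob = pack_floats(node)
print(len(node_text), "characters as text,", len(blob), "bytes as binary")
```

Storing values as 32-bit floats is itself a form of the precision reduction from Idea 2, since each number keeps only about 7 significant decimal digits.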

Minified OpenCV Cascades

I have implemented ideas 1 and 2. You can download the minified cascades by clicking here. Minified cascade filenames have the prefix “mallick_”.

I have listed a few examples of the decrease in file size.

Cascade Name	File size	Minified file size
haarcascade_eye.xml	336K	180K
haarcascade_frontalface_default.xml	912K	488K
haarcascade_frontalface_alt_tree.xml	2.6M	1.4M
lbpcascade_frontalface.xml	52K	36K

In the shared repository, I have included a tester script (tester.py) that loads a Haar cascade and its minified version and displays both results on the same frame from your webcam. The blue box denotes the output of the original cascade and the red box denotes the output of the minified cascade. The blue box is deliberately made 4 pixels smaller in width and height for display purposes. Here is the usage:

python tester.py haarcascades/haarcascade_frontalface.xml
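For readers who want to see the comparison logic without opening the repository, the following is a rough sketch of what such a tester might do. It is not the actual tester.py. The helper names are hypothetical, and it assumes the minified file sits in the same folder as the original with the “mallick_” prefix added to the filename.

```python
import os

BLUE = (255, 0, 0)  # OpenCV stores colors in BGR order
RED = (0, 0, 255)

def minified_path(cascade_path, prefix="mallick_"):
    """Assumed layout: the minified file is next to the original, with a prefix."""
    folder, name = os.path.split(cascade_path)
    return os.path.join(folder, prefix + name)

def shrink(box, pixels=4):
    """Shrink an (x, y, w, h) box by `pixels` in width and height, keeping its center."""
    x, y, w, h = box
    return (x + pixels // 2, y + pixels // 2, w - pixels, h - pixels)

def draw_comparison(frame, cascade_path):
    """Draw original detections in blue (shrunk) and minified detections in red."""
    import cv2  # imported here so the helpers above work without OpenCV installed
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    original = cv2.CascadeClassifier(cascade_path)
    minified = cv2.CascadeClassifier(minified_path(cascade_path))
    for box in original.detectMultiScale(gray):
        x, y, w, h = (int(v) for v in shrink(box))
        cv2.rectangle(frame, (x, y), (x + w, y + h), BLUE, 2)
    for box in minified.detectMultiScale(gray):
        x, y, w, h = (int(v) for v in box)
        cv2.rectangle(frame, (x, y), (x + w, y + h), RED, 2)
    return frame
```

If the two cascades agree, every red box should enclose a slightly smaller blue box, which is why the blue one is shrunk.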
