In this post, I am sharing Haar and LBP object detection cascades that have the same performance as the OpenCV cascades, but they have much smaller file sizes. I will also explain the ideas used in this minification.
Why do you care about the file size?
Let’s say you are building a mobile app. Downloading the app is the first experience the user has with it. The smaller the app, the better that first experience. You do not want a 2MB face detector in your app. Smaller models will also load a bit faster because the parser has less text to read. So, we strive for:
Less but better
How to reduce the size of OpenCV Haar and LBP Cascades?
I am sharing three ideas for reducing the size of Haar and LBP cascades. Two of these ideas are implemented in the cascades I have shared.
Idea 1 : Minify XML by removing white spaces
Haar and LBP cascades that come with OpenCV are simple XML files. They contain a lot of whitespace (spaces, new lines, tabs) that is used only for indentation and does nothing to define the cascade. So we just remove the unnecessary whitespace. Note that not all whitespace can be removed. E.g. inside the cascades you will find structures of this form.
<internalNodes>
0 -1 367 -1.2275323271751404e-02
</internalNodes>
You should not remove the whitespace between the numbers, because it is the only thing separating one value from the next.
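Here is a minimal sketch of this idea in Python. It is not the script used to produce the shared cascades; the function name `minify_xml_whitespace` is made up for illustration, and it assumes the file has no text content where runs of whitespace are significant. It collapses every run of whitespace to a single space and then drops the spaces that touch a tag boundary, so the separators between numbers survive.

```python
import re


def minify_xml_whitespace(xml_text):
    """Remove layout whitespace but keep one space between values."""
    # Collapse every run of spaces, tabs and new lines into a single space.
    text = re.sub(r"\s+", " ", xml_text)
    # Drop the spaces that sit right after a tag ends or right before one starts.
    text = re.sub(r">\s+", ">", text)
    text = re.sub(r"\s+<", "<", text)
    return text.strip()


sample = """<internalNodes>
            0 -1 367 -1.2275323271751404e-02
</internalNodes>"""
print(minify_xml_whitespace(sample))
```

After minifying a real cascade this way, it is worth loading the result in OpenCV once to confirm that the loader still accepts it.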
Idea 2 : Remove unnecessary precision
I occasionally tuned into the news about the US Presidential Election 2016. Nate Silver, who is famous for calling many election results, predicted a 24.8% chance of a Trump win. One of my favorite authors, Nassim Nicholas Taleb, countered that prediction with
Quite insulting to probability when someone “estimates” noise with precision: 24.8%, not 25 or 24.
His point is that polling is extremely noisy. Why create a false sense of precision when none exists?
A lot of times, precision is unnecessary. In the OpenCV cascades, there are a lot of floating point numbers written out in full double precision. You can easily shorten them without hurting detection accuracy. E.g. -1.2275323271751404e-02 can be replaced by -1.2275e-02.
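A rough sketch of how this could be done in Python is shown below. The function name `shorten_floats` and the choice of 4 digits after the decimal point are assumptions made for this example, and the pattern only touches numbers written in scientific notation, so integers such as node indices are left alone. Note that Python's formatting rounds the number rather than strictly truncating it.

```python
import re

# Matches numbers in scientific notation, e.g. -1.2275323271751404e-02
SCI_FLOAT = re.compile(r"-?\d+\.\d+e[+-]?\d+", re.IGNORECASE)


def shorten_floats(text, digits=4):
    """Rewrite every scientific-notation number with fewer mantissa digits."""
    def shorten(match):
        # Formatting rounds to the requested number of digits.
        return f"{float(match.group(0)):.{digits}e}"
    return SCI_FLOAT.sub(shorten, text)


print(shorten_floats("0 -1 367 -1.2275323271751404e-02"))
```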
Idea 3 : Binarize the cascades
The final idea is to convert the XML file into a binary form. Of course, this will make the file incompatible with the OpenCV loader, and you will have to write your own loader for this cascade. But if the model size is large, it may be worth the effort.
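I have not implemented this idea, so the following is only a toy illustration of what binarizing could look like, using Python's `struct` module. The layout (a 32-bit count followed by 32-bit floats, little-endian) and the names `pack_values` and `unpack_values` are invented for this example and are not an OpenCV format. As a simplification, integers are stored as floats too.

```python
import struct


def pack_values(values):
    """Pack numbers as a uint32 count followed by float32 values."""
    floats = [float(v) for v in values]
    return struct.pack(f"<I{len(floats)}f", len(floats), *floats)


def unpack_values(blob):
    """Read back the list written by pack_values."""
    (count,) = struct.unpack_from("<I", blob, 0)
    return list(struct.unpack_from(f"<{count}f", blob, 4))


node = [0, -1, 367, -1.2275e-02]
blob = pack_values(node)
print(len(blob), unpack_values(blob))
```

A real binary cascade would also need to store the tree structure, stage thresholds and features, which is exactly the custom loader work mentioned above.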
Minified OpenCV Cascades
I have implemented ideas 1 and 2. You can download the minified cascades by clicking here. Minified cascade filenames have the prefix “mallick_”.
I have listed a few examples of the decrease in file size.
Cascade Name | File size | Minified file size |
---|---|---|
haarcascade_eye.xml | 336K | 180K |
haarcascade_frontalface_default.xml | 912K | 488K |
haarcascade_frontalface_alt_tree.xml | 2.6M | 1.4M |
lbpcascade_frontalface.xml | 52K | 36K |
In the shared repository, I have included a tester script (tester.py) that loads a Haar cascade and its minified version and displays both results on the same frame from your webcam. The blue box denotes the output of the original Haar cascade and the red box denotes the output of the minified cascade. The blue box is deliberately made 4 pixels smaller in width and height so that both boxes stay visible when the detections coincide. Here is the usage:
python tester.py haarcascades/haarcascade_frontalface.xml
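If you want to write a similar comparison yourself, here is a minimal sketch. It is not the tester.py from the repository: it takes both cascade paths as arguments, the helper `shrink_box` and the detection parameters (scale factor 1.1, 5 neighbors) are my assumptions, and shrinking the box around its center is just one way to make it 4 pixels smaller.

```python
import sys

BLUE = (255, 0, 0)  # OpenCV uses BGR channel order
RED = (0, 0, 255)


def shrink_box(x, y, w, h, pixels=4):
    """Return the box made `pixels` smaller in width and height, kept centered."""
    half = pixels // 2
    return x + half, y + half, w - pixels, h - pixels


def compare_on_webcam(original_xml, minified_xml):
    import cv2  # imported here so shrink_box works without OpenCV installed

    original = cv2.CascadeClassifier(original_xml)
    minified = cv2.CascadeClassifier(minified_xml)
    capture = cv2.VideoCapture(0)
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        for box in original.detectMultiScale(gray, 1.1, 5):
            x, y, w, h = shrink_box(*(int(v) for v in box))
            cv2.rectangle(frame, (x, y), (x + w, y + h), BLUE, 2)
        for box in minified.detectMultiScale(gray, 1.1, 5):
            x, y, w, h = (int(v) for v in box)
            cv2.rectangle(frame, (x, y), (x + w, y + h), RED, 2)
        cv2.imshow("original (blue) vs minified (red)", frame)
        if cv2.waitKey(1) & 0xFF == 27:  # Esc quits
            break
    capture.release()
    cv2.destroyAllWindows()


if __name__ == "__main__" and len(sys.argv) == 3:
    compare_on_webcam(sys.argv[1], sys.argv[2])
```

If the minification is harmless, the red box should sit just outside the blue one on every frame.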