In our previous posts, we discussed how to perform Body and Hand pose estimation using the OpenPose library. Recently, as part of our consulting business, we got a chance to try the state-of-the-art pose-estimation system (wrnchAI) built by wrnch and compare its performance with OpenPose. We evaluated the Human Body Pose Estimation systems and report our findings by comparing wrnchAI vis-a-vis OpenPose on the following parameters:
- Accuracy
- Computation Speed
- System Requirements
- Model Size
- Other Features
- Mobile Support
- Tracking Support
- Green Screen Segmentation
- Support for Application Development
- Licensing
- Ease of Setup and Use
1. Accuracy Comparison
We use the COCO Dataset for the performance evaluation and comparison of wrnchAI with OpenPose. We use the following two datasets for evaluation of Accuracy.
COCO Validation Data ( COCO – Val )
It contains 5,000 images.
COCO Test-Dev Data ( COCO – Test-dev )
It contains 20,288 images.
1.1. Evaluation Metrics
COCO uses a metric called Object Keypoint Similarity (OKS). It measures how close the predicted keypoints are to the ground truth. A higher OKS means the predicted keypoints lie closer to the ground-truth keypoints.
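As a rough illustration of how OKS behaves, the sketch below follows the definition published with the COCO keypoint task: each labeled keypoint contributes exp(-d² / (2·s²·k²)), where d is the distance between prediction and ground truth, s² is the object area, and k is a per-keypoint constant. The function name and the sample values are ours and purely illustrative; the real per-keypoint constants come from the COCO evaluation code.

```python
import math

def object_keypoint_similarity(pred, gt, visibility, area, k):
    """Average keypoint similarity over the labeled keypoints.

    pred, gt   : lists of (x, y) coordinates, one pair per keypoint
    visibility : COCO visibility flags (0 means the keypoint is not labeled)
    area       : ground-truth object area in pixels (s^2 in the COCO definition)
    k          : per-keypoint constants (hypothetical values in the example below)
    """
    total, labeled = 0.0, 0
    for (px, py), (gx, gy), v, ki in zip(pred, gt, visibility, k):
        if v <= 0:
            continue  # unlabeled keypoints do not count
        d2 = (px - gx) ** 2 + (py - gy) ** 2
        total += math.exp(-d2 / (2.0 * area * ki ** 2))
        labeled += 1
    return total / labeled if labeled else 0.0

# A perfect prediction scores 1.0; a prediction that is off by 5 pixels scores less.
perfect = object_keypoint_similarity([(3, 4)], [(3, 4)], [2], area=100.0, k=[0.5])
shifted = object_keypoint_similarity([(0, 0)], [(3, 4)], [2], area=100.0, k=[0.5])
```

Because the distance is normalized by the object area, the same pixel error costs more on a small person than on a large one.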
COCO uses mean Average Precision (AP) and Average Recall (AR) as the primary metrics for its yearly challenge. In addition, it reports AP and AR at OKS thresholds of 0.50 and 0.75 (AP_50, AP_75, AR_50, AR_75), as well as AP and AR for medium (AP_medium, AR_medium) and large objects (AP_large, AR_large).
We generate the results and use the COCO evaluation server for a fair evaluation.
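COCO's keypoint results format is one JSON record per detected person, with `image_id`, `category_id`, a flat `keypoints` list of 17 (x, y, v) triples, and a `score`. The helper below is a minimal sketch of that conversion, not the exact script used for this post; the function name and the sample values are hypothetical.

```python
import json

PERSON_CATEGORY_ID = 1      # COCO category id for "person"
NUM_COCO_KEYPOINTS = 17     # COCO body keypoints per person

def to_coco_result(image_id, keypoints, score):
    """Convert one detected person into a COCO keypoint result record.

    keypoints: list of 17 (x, y, v) triples; score: detection confidence.
    """
    if len(keypoints) != NUM_COCO_KEYPOINTS:
        raise ValueError("expected 17 keypoints per person")
    flat = []
    for x, y, v in keypoints:
        flat.extend([float(x), float(y), int(v)])
    return {
        "image_id": int(image_id),
        "category_id": PERSON_CATEGORY_ID,
        "keypoints": flat,
        "score": float(score),
    }

# Hypothetical detection: every keypoint placed at (10, 20).
record = to_coco_result(42, [(10, 20, 1)] * 17, score=0.9)
results_json = json.dumps([record])  # a results file is a JSON list of such records
```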
1.2. Experimental Setup
We fix the shorter side of the image to a constant size ( net_size ) and resize the other side by preserving the aspect ratio. For example, suppose the net_size is kept at 320, and the input image is of height=640 and width=960, i.e. the image has an aspect ratio of 1.5; we resize the height to 320 and correspondingly resize the width to 480 ( maintaining the Aspect Ratio of 1.5 ).
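The resizing rule can be sketched in a few lines of Python. The function name is ours, and the multiple-of-16 check reflects the OpenPose requirement on net_size; treat this as an illustration of the arithmetic rather than the code used in the experiments.

```python
def resize_dims(height, width, net_size):
    """Return (new_height, new_width) with the shorter side equal to net_size."""
    if net_size % 16 != 0:
        raise ValueError("net_size must be a multiple of 16")
    short_side = min(height, width)
    # Scale both sides by net_size / short_side, which preserves the aspect ratio.
    new_height = round(height * net_size / short_side)
    new_width = round(width * net_size / short_side)
    return new_height, new_width

# The example from the text: a 640x960 image with net_size=320.
dims_320 = resize_dims(640, 960, 320)   # (320, 480)
dims_176 = resize_dims(640, 960, 176)   # (176, 264)
```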
For each dataset, we conduct two experiments, one for each of two input resolutions. The net_size must be a multiple of 16 (a requirement for OpenPose). The two net_size values used are 176 and 320.
1.3. Results on Validation Data
1.4. Results on Test-Dev Data
1.5. Observations
We can make the following observations from the above results and performance metrics. We have also given some visual results in support of our observations.
- The overall performance of both wrnchAI and OpenPose is similar for most of the metrics.
- wrnchAI performs better for smaller input images, as suggested by the charts for input_size=176. This can also be inferred from the AP_medium and AR_medium numbers, which indicate how well a model performs for small to medium sized persons. wrnchAI outperforms OpenPose by ~4-10% for small-medium input images.
- OpenPose performs better than wrnchAI for larger image sizes. OpenPose outperforms wrnchAI by ~2-4% for large input images.
- On visual inspection of the results, we found the following:
  - wrnchAI is better at Keypoint Localization than the numbers suggest.
  - One reason the numbers are lower is that wrnchAI tries to predict key points for the occluded parts of the body. This is good for practical purposes but hurts the performance numbers.
  - Another factor behind the lower numbers for wrnchAI is False Positives. OpenPose handles False Positives slightly better than wrnchAI.
Overall, we can say that the accuracy of the two methods is similar, with wrnchAI being better at smaller images and OpenPose marginally better at large images.
2. Computation Speed Comparison
2.1. Experimental Setup
Given below is the system configuration used for evaluating the methods on the basis of computation speed.
Processor : Intel Core i7 6850K – 6 Core
RAM : 32 GB
GPU : NVIDIA GTX 1080 Ti with 11 GB RAM
OS : Ubuntu 16.04 LTS
Programming Language : Python
We perform 2 experiments for evaluating the computation speed for both the models.
2.2. Speed comparison across different input size
We compute the time required to predict the key points for 1000 frames and divide it by 1000 to get the average time per frame; the reciprocal of that value is the speed in Frames per Second (FPS). The experiments are done at 4 different input resolutions: 320×224, 480×320, 720×480, and 960×640.
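A minimal sketch of this kind of timing loop is shown below. `predict` stands in for whichever pose estimator is being measured; it is a placeholder, not an actual wrnchAI or OpenPose call, and the dummy workload exists only to make the example runnable.

```python
import time

def measure_fps(predict, frames):
    """Run predict() on every frame and return the average frames per second."""
    start = time.perf_counter()
    for frame in frames:
        predict(frame)
    elapsed = time.perf_counter() - start
    seconds_per_frame = elapsed / len(frames)
    return 1.0 / seconds_per_frame

# Placeholder workload standing in for a real pose-estimation call.
def dummy_predict(frame):
    return sum(range(1000))

fps = measure_fps(dummy_predict, frames=list(range(1000)))
```

With GPU models, it is worth running a few warm-up frames before starting the timer, since the first calls often include one-time initialization cost.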
Given below is the result of comparison of OpenPose and wrnchAI in terms of FPS.
Observations
It is clear that wrnchAI is much faster than OpenPose. wrnchAI is ~3.5x faster than OpenPose for small images and ~2x faster for medium to large images.
2.3. Speed Comparison for increasing number of persons
The goal of this experiment is to check whether the inference time depends on the number of persons present, i.e. whether the time taken by the model increases as the number of people in the image increases. For this, we took images with different numbers of persons, ranging from 1 to 20, and measured the inference time for each image. Given below is the graph showing the results of the experiment. The input resolution used was 480×320.
From the above figure, it can be seen that OpenPose performance remains constant at ~33 to 34 FPS irrespective of the number of persons in the scene. However, the speed of wrnchAI decreases slightly. Specifically, the speed goes from 74 FPS to 66 FPS as the number of persons increases from 1 to 20, i.e. it decreases by roughly 2 FPS for every 5 additional people. This, however, would not be a critical issue, since the speed itself is very high and a small drop in FPS won't affect the overall performance of the system.
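The "2 FPS per 5 people" figure comes from spreading the 8 FPS drop evenly across the range of 1 to 20 persons. The short calculation below reproduces it and also expresses the same drop as added latency per frame; it assumes the slowdown is linear, which the two endpoints alone cannot confirm.

```python
fps_one_person = 74.0
fps_twenty_persons = 66.0

# Average FPS lost per additional person, assuming a linear trend.
drop_per_person = (fps_one_person - fps_twenty_persons) / (20 - 1)
drop_per_five = 5 * drop_per_person          # roughly 2.1 FPS

# The same change expressed as per-frame latency in milliseconds.
latency_one_ms = 1000.0 / fps_one_person     # about 13.5 ms
latency_twenty_ms = 1000.0 / fps_twenty_persons  # about 15.2 ms
extra_latency_ms = latency_twenty_ms - latency_one_ms
```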
3. System Requirements
wrnchAI
System RAM : At least 2.5 GB
GPU RAM : At least 1 GB
GPU : CUDA enabled
CUDA – 10
TensorRT for Inference
OS : Ubuntu / Windows / Jetson TX2 / iOS 11.0+
OpenPose
System RAM : At least 2.5 GB
GPU RAM : At least 2.5 GB
GPU : CUDA / AMD
CUDA version – 9+
OS : Ubuntu / Windows / Jetson TX2 / MacOS ( CPU Only )
For an input image size of 480×320, the RAM usage was found to be as given below:
| | wrnchAI | OpenPose |
|---|---|---|
| GPU RAM | 996 MB | 2128 MB |
| System RAM | 2400 MB | 1340 MB |
4. Model Size Comparison
There are 3 models provided in both OpenPose and wrnchAI. The models are for key point detection of the following :
- Body
- Hand
- Face
Apart from the above three, wrnchAI also provides a model for 3D pose estimation. OpenPose also provides 3D reconstruction, but that requires use of depth cameras.
Given below is the comparison of model size of wrnchAI and OpenPose
4.1. Observations
- The 2D pose estimation model for wrnchAI is more light-weight than the OpenPose model
- The models for Hand and Face are much smaller for wrnchAI, making the whole suite of 3 models very light-weight. Because of the light-weight models, wrnchAI can be easily ported to Mobile devices.
5. Other Features
We evaluate wrnchAI and OpenPose on some qualitative factors, which cannot be measured but are important nonetheless.
1. Mobile Support
wrnchAI has mobile development support for iOS. OpenPose does not have any support for mobile devices yet.
2. Tracking Support
wrnchAI and OpenPose both have tracking support. However, both are in development phase and not stable yet. OpenPose currently has only single person tracking.
3. Green Screen Segmentation
wrnchAI provides support for green screen background, which might be helpful in many AR / VR applications. OpenPose does not provide the background mask.
4. Support for Application Development
wrnchAI has a mature Unity plugin which can be used for game development. OpenPose added support for Unity in January 2019.
5. Licensing
The OpenPose license states that it cannot be used in the field of sports. This limits the use cases for OpenPose. There are no such restrictions in the wrnchAI license.
6. Ease of Setup and Use
wrnchAI provides Debian packages for Linux and zip files for Windows, which can be easily installed. OpenPose, on the other hand, has to be installed from source, which might be tricky.
7. Documentation and Support
The documentation provided by both wrnchAI and OpenPose is updated regularly. However, there is scope for improvement in the quality of documentation in both cases.
Both OpenPose and wrnchAI have good technical support and issue tracking.
6. Conclusion
- The accuracy and speed comparison shows that wrnchAI offers a very fast Pose Estimation system with negligible loss in accuracy.
- Light-weight models allow wrnchAI to be easily integrated with Mobile Applications.
- wrnchAI is more suitable for application development as it has better support for Unity, iOS, Green Screen among others.
Try out wrnchAI
Interested in using the wrnchAI engine for your projects? You can try out their demo or request a trial version.