The OpenCV library is widely used due to its extensive coverage of computer vision tasks and the ease of integrating it into various projects, including deep learning ones. OpenCV is usually used through its C++ or Python API, but it can also be used with Java. In this article, we will show an example of how to use it through the OpenCV Java API.
We will explore how convolutional neural networks can be included in the pipeline of a Java application using the OpenCV DNN module.
This post will be helpful for people who are familiar with Java and would like to experiment with deep learning.
Here is what we will do in the post:
- Convert the MobileNet classification model trained in PyTorch to ONNX
- Check the model prediction on a simple example
- Construct a Java pipeline for image classification
MobileNet
MobileNets are a family of mobile-first, low-latency, and low-power DNN models. In this post, we use the MobileNetV2 model, which we convert to ONNX and run in the final Java application.
The MobileNetV2 architecture, based on MobileNetV1, was introduced in the paper MobileNetV2: Inverted Residuals and Linear Bottlenecks in 2018. The network is designed for mobile devices and platforms with constrained resources: it reduces the number of operations and the memory consumption without hurting model accuracy, thanks to a novel layer module called the “inverted residual with linear bottleneck”.
The network input is filtered with lightweight depthwise convolutions, and subsequent linear convolutions project the obtained features into a low-dimensional representation.
ONNX Format
Open Neural Network Exchange (ONNX) is an open-source AI project. Its goal is to make neural network models interchangeable between frameworks.
Thus, if you need to use a DNN model in a project built with a framework that differs from the one the model was trained in, you can convert the model to ONNX format and then import it into the target framework.
Examples of such transfers are PyTorch to CoreML conversion or, in our case, PyTorch to org.opencv.dnn.Net conversion.
Code Implementation
In this section, we will learn how to convert PyTorch MobileNetV2 into org.opencv.dnn.Net, exploring both Python and Java examples.
PyTorch Model to ONNX Conversion
We have provided a Python script (Mobilenetv2ToOnnx.py) for converting the model trained in PyTorch to ONNX format. It implements the following pipeline:
- Data loading and preprocessing
- Initialization of the configured PyTorch model
- PyTorch model conversion to ONNX
- Evaluation and prediction comparison of the models – original and converted
Here is what its main function looks like:
# read and process the input image
image = get_preprocessed_image(image_name=args.input_image)
# obtain original model
pytorch_model = models.mobilenet_v2(pretrained=True)
# provide inference of the original PyTorch model
pytorch_model.eval()
pytorch_predictions = pytorch_model(torch.Tensor(image)).detach().numpy()
# obtain OpenCV generated ONNX model
onnx_model_path = get_onnx_model(original_model=pytorch_model, input_image=image)
# check if conversion succeeded
onnx_model = onnx.load(onnx_model_path)
onnx.checker.check_model(onnx_model)
# check onnx model output
compare_pytorch_onnx(
    pytorch_predictions, onnx_model_path, image,
)
# decode classification results
get_predicted_class(pytorch_preds=pytorch_predictions)
To run the script for the model conversion to ONNX format and its subsequent evaluation, execute the line below:
python Mobilenetv2ToOnnx.py
The default input image for inference is images/coffee.jpg.
For more details, check the README.md file in the GitHub repository.
Let’s walk through the code in detail.
1. Data Preparation
It applies the necessary image transformations (resize, center crop, normalization) and shape transformations:
import cv2
import numpy as np
# image transformations from the albumentations library (assumed from the API used below)
from albumentations import CenterCrop, Compose, Normalize, Resize

def get_preprocessed_image(image_name):
    # read image
    original_image = cv2.imread(image_name)
    # convert original image to RGB format
    image = cv2.cvtColor(original_image, cv2.COLOR_BGR2RGB)
    # transform input image:
    # 1. resize the image
    # 2. crop the image
    # 3. normalize: subtract mean and divide by standard deviation
    transform = Compose(
        [
            Resize(height=256, width=256),
            CenterCrop(224, 224),
            Normalize(
                mean=[0.485, 0.456, 0.406],
                std=[0.229, 0.224, 0.225]),
        ],
    )
    image = transform(image=image)["image"]
    # change the order of channels to CHW
    image = image.transpose(2, 0, 1)
    # add a batch dimension: CHW -> NCHW
    return np.expand_dims(image, axis=0)
2. Initializing the Pre-trained PyTorch MobileNetV2 Model
pytorch_model = models.mobilenet_v2(pretrained=True)
3. Converting the PyTorch Model to ONNX
We convert the PyTorch model to pytorch_mobilenet.onnx with the torch.onnx.export function and then read it into an OpenCV Net with cv2.dnn.readNetFromONNX(...):
torch.onnx.export(
    original_model, torch.Tensor(input_image), model_name, verbose=True,
)
converted_onnx_model = cv2.dnn.readNetFromONNX(model_name)
4. Evaluating and Comparing the Predictions
To do so, we:
- configure an ONNX InferenceSession and get the inference results:
# get onnx result
session = onnxruntime.InferenceSession(onnx_model_path)
input_name = session.get_inputs()[0].name
onnx_result = session.run([], {input_name: input_image})
onnx_result = np.squeeze(onnx_result, axis=0)
- compare the predictions obtained from the initial PyTorch model and the ONNX model:
for test_onnx_result, gold_result in zip(onnx_result, original_model_preds):
    np.testing.assert_almost_equal(
        gold_result, test_onnx_result, decimal=3,
    )
Code Execution
The following is the output after executing the Python script:
ONNX model was successfully generated: models/pytorch_mobilenet.onnx
Checking PyTorch model and converted ONNX model outputs ...
PyTorch and ONNX output values are equal!
Predicted class by PyTorch model: cup
OpenCV Java Model Inference
Now that we have obtained the network model (pytorch_mobilenet.onnx) with the help of the Mobilenetv2ToOnnx.py script, we can transfer it into the desired environment; in our case, that is Java. The general transfer and inference pipeline is shown in the code below:
public static void main(String[] args) {
    String imageLocation = "images/coffee.jpg";
    // load the OpenCV native library
    System.loadLibrary(Core.NATIVE_LIBRARY_NAME);
    // read and process the input image
    Mat inputBlob = DnnOpenCV.getPreprocessedImage(imageLocation);
    // read generated ONNX model into org.opencv.dnn.Net object
    Net dnnNet = Dnn.readNetFromONNX(DnnOpenCV.MODEL_PATH);
    System.out.println("DNN from ONNX was successfully loaded!");
    // set OpenCV model input
    dnnNet.setInput(inputBlob);
    // provide inference
    Mat classification = dnnNet.forward();
    // decode classification results
    DnnOpenCV.getPredictedClass(classification);
}
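The DnnOpenCV class referenced above also defines the model path and label file constants, which are not shown in the snippets; a hypothetical sketch (the actual values depend on where the converted model and class list are stored):

// hypothetical constants; adjust the paths to your local setup
public static final String MODEL_PATH = "models/pytorch_mobilenet.onnx";
public static final String IMAGENET_CLASSES = "imagenet_classes.txt";

Note that System.loadLibrary(Core.NATIVE_LIBRARY_NAME) requires the OpenCV native library to be available on java.library.path.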
Data Preprocessing
Let’s begin with data preprocessing. Firstly, we define some constant values significant for the image transformations:
private static final int TARGET_IMG_WIDTH = 224;
private static final int TARGET_IMG_HEIGHT = 224;
private static final double SCALE_FACTOR = 1 / 255.0;
private static Mat imageRead;
Here SCALE_FACTOR will be multiplied with the pixel values to scale them between 0 and 1. imageRead is initialized outside the function scope to make it accessible to the whole class.
The code below reads the image into an org.opencv.core.Mat and resizes it:
// get the image from the internal resource folder
imageRead = Imgcodecs.imread(imagePath);
// this object will store the preprocessed image
Mat image = new Mat();
// resize input image
Imgproc.resize(imageRead, image, new Size(256, 256));
For further image manipulations, we need to convert the Mat object to the CvType.CV_32FC3 floating-point type for the correct representation of the values:
// create empty Mat images for float conversions
Mat imgFloat = new Mat(image.rows(), image.cols(), CvType.CV_32FC3);
// convert input image to float type multiplying by SCALE_FACTOR
image.convertTo(imgFloat, CvType.CV_32FC3, SCALE_FACTOR);
During the conversion process, the image values were multiplied by SCALE_FACTOR. Thus, we now have scaled values in the imgFloat org.opencv.core.Mat of float type.
Center Cropping
The next step is image center cropping:
// crop input image
imgFloat = centerCrop(imgFloat);
The lines below are the core of centerCrop:
int y1 = Math.round((inputImage.rows() - TARGET_IMG_HEIGHT) / 2);
int y2 = Math.round(y1 + TARGET_IMG_HEIGHT);
int x1 = Math.round((inputImage.cols() - TARGET_IMG_WIDTH) / 2);
int x2 = Math.round(x1 + TARGET_IMG_WIDTH);
Rect centerRect = new Rect(x1, y1, (x2 - x1), (y2 - y1));
// apply the computed rectangle to obtain the cropped image
return inputImage.submat(centerRect);
Prepare image for input
The final stage is model input preparation with the Dnn.blobFromImage(...) function, which produces a 4-dimensional NCHW Mat, where H and W are equal to the given Size(TARGET_IMG_WIDTH, TARGET_IMG_HEIGHT) values:
// prepare DNN input
Mat blob = Dnn.blobFromImage(
    imgFloat,
    1.0, /* default scalefactor */
    new Size(TARGET_IMG_WIDTH, TARGET_IMG_HEIGHT), /* target size */
    mean, /* mean */
    true, /* swapRB */
    false /* crop */
);
// divide by std
Core.divide(blob, std, blob);
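In this call, mean and std are assumed to be org.opencv.core.Scalar constants defined alongside the other class constants, holding the same ImageNet statistics used in the Python preprocessing; a sketch:

// assumed definitions, mirroring the ImageNet statistics from the Python script
private static final Scalar mean = new Scalar(0.485, 0.456, 0.406);
private static final Scalar std = new Scalar(0.229, 0.224, 0.225);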
It is worth noting that the Dnn.blobFromImage(...) function supports all the basic image transformations:
- scaling between 0 and 1
- mean subtraction
- BGR to RGB conversion
- central crop.
This set of transformations is perfectly suited for Caffe models. However, in our case, we also need to divide the pixel values by std, which we implement with Core.divide(...). Another important detail is scaling: in the blobFromImage function, the image is scaled after mean subtraction:
images[i] -= mean;
images[i] *= scalefactor;
Hence, we scale the image beforehand with convertTo(imgFloat, CvType.CV_32FC3, SCALE_FACTOR) to fit the PyTorch data preprocessing pipeline, where scaling happens before mean subtraction.
We also have to apply the central crop before blobFromImage. The reason is the initial image resizing to 256 × 256. The OpenCV documentation contains the following lines:
“If [the] crop is true, [the] input image is resized, so one side after resize is equal to [the] corresponding dimension in size, and another one is equal or larger. Then, a crop from the center is performed”
Thus, the cropping applied by blobFromImage differs from the PyTorch preprocessing pipeline.
Read the ONNX model
To load the converted MobileNetV2 model, we use Dnn.readNetFromONNX(DnnOpenCV.MODEL_PATH):
// read generated ONNX model into org.opencv.dnn.Net object
Net dnnNet = Dnn.readNetFromONNX(DnnOpenCV.MODEL_PATH);
System.out.println("DNN from ONNX was successfully loaded");
For the inference process, we pass the prepared inputBlob and call forward():
// set OpenCV model input
dnnNet.setInput(inputBlob);
// provide inference
Mat classification = dnnNet.forward();
Decode the classification results
The code below performs the class name decoding: the classification result holds one score per ImageNet class, and we take the index of the maximum score:
// prepare ArrayList from the file with listed classes
ArrayList<String> imgLabels = new ArrayList<String>();
try {
    imgLabels = getImgLabels(IMAGENET_CLASSES);
} catch (IOException ex) {
    System.out.printf("Could not read %s file:%n", IMAGENET_CLASSES);
    ex.printStackTrace();
}
// obtain max prediction result
Core.MinMaxLocResult mm = Core.minMaxLoc(classificationResult);
double maxValIndex = mm.maxLoc.x;
System.out.println("Predicted Class: " + imgLabels.get((int) maxValIndex));
OpenCV Java Code Execution
When the Java code is executed, it prints the predicted class of the object in the image to the terminal:
DNN from ONNX was successfully loaded!
Predicted Class: cup
Moreover, a window pops up, which displays the input image and the predicted object.
As a result, we get equivalent classification results. The classification logic implemented in Java can be built into an Android application. Refer to this post to learn how that can be done.