Image Quality Assessment : BRISQUE

Photography is the favorite hobby of millions of people around the world. After all, how difficult can it be! In the words of Diane Arbus, a famous American photographer —

“Taking pictures is like tiptoeing into the kitchen late at night and stealing Oreo cookies.”

Taking a photo is easy, but taking a high-quality photo is hard. It requires good composition and lighting. The right lens and superior equipment can make a big difference. But above all, a high-quality photo requires good taste and judgment. You need the eye of an expert.

But is there a mathematical quality measure that captures this human judgment?

The answer is both yes and no!

There are some measures of quality that are easy for an algorithm to capture. For example, we can look at the information captured by the pixels and flag an image as noisy or blurry.

On the other hand, some measures of quality are almost impossible for an algorithm to capture. For example, an algorithm would have a tough time assessing the quality of a picture that requires cultural context.

In this post, we will learn about an algorithm that predicts an image quality score.

Note: This tutorial has been tested on Ubuntu 18.04 and 16.04, with Python 3.6.5, Python 2.7, and OpenCV 3.4.1 and 4.0.0-pre.

What is Image Quality Assessment (IQA)?

Image Quality Assessment (IQA) algorithms take an arbitrary image as input and produce a quality score as output. There are three types of IQAs:

  1. Full-Reference IQA: Here you have a 'clean' reference (non-distorted) image against which to measure the quality of your distorted image. This measure may be used in assessing the quality of an image compression algorithm, where we have access to both the original image and its compressed version (see the sketch after this list).
  2. Reduced-Reference IQA: Here you don't have the full reference image, but only an image carrying some selective information about it (e.g. a watermarked image) against which to compare and measure the quality of the distorted image.
  3. Objective Blind or No-Reference IQA: The only input the algorithm gets is the image whose quality you want to measure. No reference of any kind is used, hence the name.
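
To make the distinction concrete, below is a minimal Full-Reference sketch in Python (our own illustration; the filenames are placeholders). It computes PSNR, a classic Full-Reference metric, by comparing a distorted image against its clean reference pixel by pixel. A No-Reference method like BRISQUE, in contrast, receives only the distorted image.

import cv2
import numpy as np

# Full-Reference IQA needs both images; the filenames are placeholders.
reference = cv2.imread("original.png").astype(np.float64)
distorted = cv2.imread("compressed.png").astype(np.float64)

# PSNR: higher means the distorted image is closer to the reference.
mse = np.mean((reference - distorted) ** 2)
psnr = 10 * np.log10(255.0 ** 2 / mse)
print("PSNR: %.2f dB" % psnr)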

No-Reference IQA

In this post, we will discuss one of the No-Reference IQA Metrics, called Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE). Before we go deeper into the theory, let’s first understand two basic terms:

  1. Distorted Image: As the name suggests, a distorted image is a version of the original image distorted by blur, noise, watermarking, color transformations, geometric transformations, and so on.

Fig. 1 Distortions used in the TID2008 Database

  2. Natural Image: An image captured directly by a camera with no post-processing is a natural image in our context. Here is an example of a natural image and a distorted image.

Fig. 2 Natural Image (left) and Noisy Image (distorted, right)

As you can imagine, it is not always clear-cut whether an image is distorted or natural. For example, when a video is smartly rendered with motion blur, an algorithm may misjudge its quality because of the intentional blur. So one has to use this quality measure in the right context.

    Image Quality Assessment (IQA) Dataset

Fig. 3 TID2008 Image Quality Score scale (0 to 100): the lower the score, the better the subjective quality.

    Quality is a subjective matter. To teach an algorithm about good and bad quality, we need to show the algorithm examples of many images and their quality score.

Who assigns the quality score for these training images? Humans, of course. But we cannot rely on the opinion of just one human, so we collect the opinions of several humans and assign the image a mean score between 0 (best) and 100 (worst). This score is called the Mean Quality Score in academic literature.

Do we need to collect this data ourselves? Fortunately, the TID2008 dataset has been made available for research purposes.

    Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE)

In this section, we will describe the steps of the BRISQUE algorithm for No-Reference IQA. We have modified the C++ code provided by the authors to work with OpenCV 3.4. In addition, we have written code for Python 2 and Python 3.

    Fig. 5 shows the steps involved in calculating BRISQUE.

    Fig. 5 Steps to Calculate Image Quality Score using BRISQUE Model

    Step 1: Extract Natural Scene Statistics (NSS)

    The distribution of pixel intensities of natural images differs from that of distorted images. This difference in distributions is much more pronounced when we normalize pixel intensities and calculate the distribution over these normalized intensities. In particular, after normalization pixel intensities of natural images follow a Gaussian Distribution (Bell Curve) while pixel intensities of unnatural or distorted images do not. The deviation of the distribution from an ideal bell curve is therefore a measure of the amount of distortion in the image.

    We have shown this in the Figure below.

Fig. 4 Left: a natural image with no artificial effects added; its coefficients fit a Gaussian distribution. Right: an artificial image whose coefficients do not fit the distribution well.

Mean Subtracted Contrast Normalization (MSCN)

There are a few different ways to normalize an image. One such normalization is called Mean Subtracted Contrast Normalization (MSCN). The figure below shows how to calculate the MSCN coefficients.

Fig. 6 Steps to calculate MSCN Coefficients

    This can be visualized as:

    Fig. 7 Calculation of MSCN Coefficients
Don't let the math below scare you. The code that follows the math is much simpler and easier to understand!

    To calculate the MSCN Coefficients, the image intensity I(i,j) at pixel (i, j) is transformed to the luminance \widehat{I}(i, j).

(1)   \begin{align*}
\widehat{I}(i, j) = \frac{I(i, j) - \mu(i, j)}{\sigma(i, j) + C}
\end{align*}

where i \in \{1, 2, \ldots, M\}, j \in \{1, 2, \ldots, N\} (M and N are the image height and width respectively), and C is a small constant that prevents division by zero (the code below uses C = 1/255). The functions \mu(i, j) and \sigma(i, j) are the local mean field and the local variance field. The local mean field (\mu) is simply the Gaussian blur of the original image, while the local variance field (\sigma) is the square root of the Gaussian blur of the squared difference between the original image and \mu. In the equations below, W is the Gaussian blur window function.

(2)   \begin{align*}
\mu &= W * I \\
\sigma &= \sqrt{W * (I - \mu)^2}
\end{align*}

We use OpenCV's GaussianBlur function in both C++ and Python to calculate the MSCN coefficients, as shown below:


    C++

Mat im = imread("image_scenery.jpg"); // read image
cvtColor(im, im, COLOR_BGR2GRAY); // convert to grayscale

im.convertTo(im, CV_64FC1, 1.0/255); // convert to double and scale to [0, 1]
Mat mu(im.size(), CV_64FC1, 1);
GaussianBlur(im, mu, Size(7, 7), 1.166); // local mean field (mu)
Mat mu_sq = mu.mul(mu);

// compute sigma (local variance field)
Mat sigma(im.size(), CV_64FC1, 1);
sigma = im.mul(im);
GaussianBlur(sigma, sigma, Size(7, 7), 1.166); // blur the squared image
subtract(sigma, mu_sq, sigma); // sigma = sigma - mu_sq
cv::pow(sigma, double(0.5), sigma); // sigma = sqrt(sigma)
add(sigma, Scalar(1.0/255), sigma); // add C to avoid division by zero
Mat structdis(im.size(), CV_64FC1, 1);
subtract(im, mu, structdis); // structdis = im - mu
divide(structdis, sigma, structdis); // structdis is the MSCN image
    

    Python

import cv2
import numpy as np

im = cv2.imread("image_scenery.jpg", 0) # read as grayscale
im = im.astype(np.float64) # convert to float to avoid uint8 overflow in im * im
blurred = cv2.GaussianBlur(im, (7, 7), 1.166) # local mean field (mu)
blurred_sq = blurred * blurred
sigma = cv2.GaussianBlur(im * im, (7, 7), 1.166)
sigma = np.sqrt(np.abs(sigma - blurred_sq)) # local variance field
sigma = sigma + 1.0/255 # add C so the denominator never becomes zero
structdis = (im - blurred)/sigma # final MSCN(i, j) image
    
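
To see the bell curve from Fig. 4 for yourself, you can plot a histogram of the MSCN coefficients computed above. A minimal sketch, assuming matplotlib is installed:

import matplotlib.pyplot as plt

# Histogram of the MSCN coefficients computed above (structdis); for a
# natural image this should resemble the bell curve shown in Fig. 4.
plt.hist(structdis.ravel(), bins=100, density=True)
plt.title("Distribution of MSCN coefficients")
plt.show()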

Pairwise products for neighborhood relationships

MSCN provides a good normalization for pixel intensities. However, the difference between natural and distorted images is not limited to pixel intensity distributions; it also shows up in the relationships between a pixel and its neighbors.

To capture neighborhood relationships, the authors used pairwise products of the MSCN image with a shifted version of itself. Four orientations are used to compute the pairwise products of the MSCN coefficients: Horizontal (H), Vertical (V), Left-Diagonal (D1), and Right-Diagonal (D2).

(3)   \begin{align*}
H(i, j) &= \widehat{I}(i, j)\widehat{I}(i, j + 1) \\
V(i, j) &= \widehat{I}(i, j)\widehat{I}(i + 1, j) \\
D1(i, j) &= \widehat{I}(i, j)\widehat{I}(i + 1, j + 1) \\
D2(i, j) &= \widehat{I}(i, j)\widehat{I}(i + 1, j - 1)
\end{align*}

Pairwise products can be calculated in C++ and Python as shown below:

    C++:

// declare shifting indices array (H, V, D1, D2)
int shifts[4][2] = {{0, 1}, {1, 0}, {1, 1}, {-1, 1}};

// calculate pairwise products for every shifting index
for(int itr_shift = 0; itr_shift < 4; itr_shift++)
{
    int* reqshift = shifts[itr_shift]; // the required shift index
    // declare shifted image
    Mat shifted_structdis(structdis.size(), CV_64F, 1);
    // BwImage is a helper class that wraps an image and provides easy access to its pixel values
    BwImage OrigArr(structdis);
    BwImage ShiftArr(shifted_structdis);
    // calculate the shifted image for the given orientation
    for(int i = 0; i < structdis.rows; i++)
    {
        for(int j = 0; j < structdis.cols; j++)
        {
            if(i + reqshift[0] >= 0 && i + reqshift[0] < structdis.rows
               && j + reqshift[1] >= 0 && j + reqshift[1] < structdis.cols)
            {
                ShiftArr[i][j] = OrigArr[i + reqshift[0]][j + reqshift[1]];
            }
            else
            {
                ShiftArr[i][j] = 0;
            }
        }
    }
    Mat shifted_new_structdis;
    shifted_new_structdis = ShiftArr.equate(shifted_new_structdis);
    // find the pairwise product
    multiply(structdis, shifted_new_structdis, shifted_new_structdis);
}
    

    Python:

import numpy as np

# indices to calculate pairwise products (H, V, D1, D2)
shifts = [[0, 1], [1, 0], [1, 1], [-1, 1]]

# calculate pairwise components in each orientation
for reqshift in shifts:
    OrigArr = structdis
    ShiftArr = np.zeros(structdis.shape) # shifted image, initialized to zero

    for i in range(structdis.shape[0]):
        for j in range(structdis.shape[1]):
            if (i + reqshift[0] >= 0 and i + reqshift[0] < structdis.shape[0]
                    and j + reqshift[1] >= 0 and j + reqshift[1] < structdis.shape[1]):
                ShiftArr[i, j] = OrigArr[i + reqshift[0], j + reqshift[1]]
            else:
                ShiftArr[i, j] = 0
    

The two for loops can be reduced to a couple of lines using the cv2.warpAffine function, as shown below. This leads to a huge speed-up.

# create affine matrix (to shift the image)
M = np.float32([[1, 0, reqshift[1]], [0, 1, reqshift[0]]])
ShiftArr = cv2.warpAffine(OrigArr, M, (structdis.shape[1], structdis.shape[0]))
    
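
Putting it together, here is a minimal sketch (our own, not the authors' code) that computes all four pairwise product images from structdis using the warpAffine trick:

import cv2
import numpy as np

def pairwise_products(structdis):
    # Compute the H, V, D1, D2 pairwise product images from the MSCN image,
    # using cv2.warpAffine to shift instead of explicit per-pixel loops.
    structdis = structdis.astype(np.float32) # warpAffine works on float32
    h, w = structdis.shape
    shifts = [[0, 1], [1, 0], [1, 1], [-1, 1]] # (row, col) shifts for H, V, D1, D2
    products = []
    for dy, dx in shifts:
        M = np.float32([[1, 0, dx], [0, 1, dy]]) # translation matrix
        shifted = cv2.warpAffine(structdis, M, (w, h)) # out-of-range pixels become 0
        products.append(structdis * shifted)
    return products # [H, V, D1, D2]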

Step 2: Calculate Feature Vectors

    Until now, we have derived 5 images from the original image — 1 MSCN image and 4 pairwise product images to capture neighbor relationships (Horizontal, Vertical, Left Diagonal, Right Diagonal).

Next, we will use these 5 images to calculate a feature vector of size 36×1 (i.e. an array of 36 numbers). Note that the original input image can be of any size (width/height), but the feature vector is always of size 36×1.

The first two elements of the 36×1 feature vector are calculated by fitting the MSCN image to a Generalized Gaussian Distribution (GGD). A GGD has two parameters: one for shape and one for variance.

Next, an Asymmetric Generalized Gaussian Distribution (AGGD) is fit to each of the four pairwise product images. An AGGD is an asymmetric form of the Generalized Gaussian Distribution (GGD). It has four parameters: shape, mean, left variance, and right variance. Since there are 4 pairwise product images, we end up with 16 values.

That gives us the first 18 elements of the feature vector. The image is then downsized to half its original size, and the same process is repeated to obtain 18 more numbers, bringing the total to 36.

    This is summarized in the table below.

Feature Range | Feature Description                        | Procedure
1 - 2         | Shape and Variance                         | GGD fit to MSCN coefficients
3 - 6         | Shape, Mean, Left Variance, Right Variance | AGGD fit to Horizontal Pairwise Products
7 - 10        | Shape, Mean, Left Variance, Right Variance | AGGD fit to Vertical Pairwise Products
11 - 14       | Shape, Mean, Left Variance, Right Variance | AGGD fit to Diagonal (Left) Pairwise Products
15 - 18       | Shape, Mean, Left Variance, Right Variance | AGGD fit to Diagonal (Right) Pairwise Products
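
For intuition, here is a minimal sketch (our own illustration, assuming SciPy is available) of the moment-matching GGD fit behind features 1 - 2; the AGGD fits for the remaining features follow the same idea, with separate left and right variances:

import numpy as np
from scipy.special import gamma

def estimate_ggd(vec):
    # Moment-matching fit of a zero-mean Generalized Gaussian Distribution:
    # pick the shape alpha whose theoretical moment ratio best matches the data.
    gam = np.arange(0.2, 10.0, 0.001) # candidate shape parameters
    r_gam = gamma(1.0/gam) * gamma(3.0/gam) / gamma(2.0/gam)**2 # theoretical ratio
    sigma_sq = np.mean(vec**2) # variance (second moment)
    rho = sigma_sq / np.mean(np.abs(vec))**2 # empirical ratio E[x^2] / E[|x|]^2
    alpha = gam[np.argmin(np.abs(rho - r_gam))] # closest candidate shape
    return alpha, sigma_sq

# Features 1 - 2 of the vector: GGD shape and variance of the MSCN image,
# e.g. shape, variance = estimate_ggd(structdis.ravel())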

    Step 3: Prediction of Image Quality Score

In a typical Machine Learning application, an image is first converted to a feature vector. Then the feature vectors and the outputs (in this case the quality scores) of all images in the training dataset are fed to a learning algorithm like the Support Vector Machine (SVM). We have covered SVMs in this post.

    One can download the data and train an SVM to solve this problem, but in this post, we will simply use the trained model provided by the authors.

We use LIBSVM to predict the final quality score: we first load the trained model and then predict the probability using the support vectors produced by the model. It is important to note that the feature vector is first scaled to the range -1 to 1 before being used for prediction. We share the Python and C++ ways to do this below:

    C++:

// make an svm_node array and push the feature vector values into it
struct svm_node x[37];

// features is the 36-element feature vector, rescaled to (-1, 1)
for(int i = 0; i < 36; ++i) {
  x[i].value = features[i];
  x[i].index = i + 1; // svm_node indexes start from 1
}
x[36].index = -1; // libsvm expects a terminating node with index -1

// load the model from the model file - allmodel
string modelfile = "allmodel";
struct svm_model* model = svm_load_model(modelfile.c_str());

// get the number of classes from the model
int nr_class = svm_get_nr_class(model);
double *prob_estimates = (double *) malloc(nr_class * sizeof(double));

// predict the quality score using the svm_predict_probability function
double qualityscore = svm_predict_probability(model, x, prob_estimates);
    

    Python:

from ctypes import c_double
import svmutil
from svm import gen_svm_nodearray, PRECOMPUTED

# load the model from the allmodel file
model = svmutil.svm_load_model("allmodel")
# create an svm node array from the (rescaled) features list
x, idx = gen_svm_nodearray(x[1:], isKernel=(model.param.kernel_type == PRECOMPUTED))
nr_classifier = 1 # fixed, since the svm type is EPSILON_SVR (regression)
prob_estimates = (c_double * nr_classifier)()

# predict the quality score of the image using the libsvm module
qualityscore = svmutil.libsvm.svm_predict_probability(model, x, prob_estimates)
    
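
The snippets above assume the 36 features have already been rescaled to (-1, 1). A minimal sketch of that min-max rescaling, where lower and upper are hypothetical per-feature bounds obtained from the training data:

import numpy as np

def rescale_features(features, lower, upper):
    # Min-max rescale each feature to [-1, 1] using the per-feature
    # minimum (lower) and maximum (upper) observed on the training set.
    features = np.asarray(features, dtype=np.float64)
    return -1.0 + 2.0 * (features - lower) / (upper - lower)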

Now let's run the BRISQUE metric: the C++ implementation provided by the authors of the paper, and our own Python version. We have made some minor changes to the code so that it works with OpenCV 3.x.y and Python 3.x.

    C++:

1. Fork/Clone the repository: The first step is to clone or fork the repository into your current directory.
2. Execution: We have already compiled the code, and the working executable is in the repository itself. Use the following command to check your code: ./brisquequality "image_path"

    Python:

    # Python 2.7
    python2 brisquequality.py <image_path>
    # Python 3.x
python3 brisquequality.py <image_path>
    

We have run the metric on four types of distortion. Here are the final quality scores for each:

Image              | Quality Score
Original Image     | 26.8286
JPEG2K Compression | 30.7417
Heavy Compression  | 33.0692
Gaussian Noise     | 79.8751
Median Blur        | 72.7044


    References:

    1. No-Reference Image Quality Assessment in the Spatial Domain
    2. Laboratory for Image and Video Engineering
    3. Trained model file for BRISQUE Image Quality Assessment
    4. TID 2008 – A Database for Evaluation of Full-Reference Visual Quality Assessment Metrics
