Potholes on the road can become fatally dangerous when driving at high speed. This is more so when the driver cannot see the pothole from far away and either brakes suddenly or swerves away at high speed. The latter action can be equally dangerous for other drivers as well. But what if we use deep learning and object detection to detect potholes from much farther away than humans can? Such a system is bound to help us. This is exactly what we will do in this blog post. We will use the YOLOv4 object detection model and the Darknet framework to create a pothole detection system.
Before we move further into the blog post, let’s have an overview of the models that we will cover here:
- We will mainly focus on the two well-known models in the Darknet YOLOv4 repository. They are YOLOv4 and YOLOv4-Tiny.
- We will use a pothole detection dataset which is a combination of two open-source datasets available.
- There will be four experiments in total and we will discuss all the details further in the post.
We will get into the details of the models, the experiments that we will carry out, and the dataset in the later sections. The post covers the following topics:
- A Brief Introduction to Darknet and YOLOv4
- The Pothole Dataset
- Training Experiments to Carry Out
- Training YOLOv4 Models on Custom Pothole Dataset
- mAP Comparison
- Running Inference for Pothole Detection using YOLOv4
1. A Brief Introduction to Darknet and YOLOv4
The Darknet project is an open-source object detection framework well known for providing training and inference support for YOLO models. The library is written in C.
The Darknet logo (source)
The Darknet project was started by Joseph Redmon in 2014 with the release of the very first YOLO paper. Shortly after the publication of YOLOv3, it was taken over by Alexey Bochkovskiy who now maintains an active fork of the original repository. He also added support for YOLOv4 models, some of the best object detection models out there.
YOLOv4, YOLOv4-Tiny, and YOLOv4-CSP are some of the well-known and widely used object detection models in the repository. Along with these, Alexey also added some really nice features to the codebase:
- The code now supports mixed-precision training for GPUs with Tensor cores. It can increase the training speed by around 3 times on GPUs which support it.
- Mosaic augmentation was also added during training which greatly improves model accuracy as it learns to detect objects in more difficult images (see section 3.3. Additional Improvements of the YOLOv4 paper for details).
- The code now also supports multi-resolution training. This changes the resolution of the images by a random factor every 10 batches while training the model (in the AlexeyAB fork, from 1/1.4 up to 1.4 times the base resolution). This helps the model learn to detect objects in both smaller and larger images. But it also requires substantially more GPU memory than single-resolution training at the same batch size, because the memory has to accommodate the batches in which the resolution is scaled up to its maximum.
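To get a feel for why multi-resolution training needs more GPU memory, the following minimal sketch estimates the largest input size and how many more pixels it has than the base resolution. The upper scale factor is left as a parameter (both 1.4 and 1.5 are shown), and rounding down to a multiple of 32 is an assumption of this sketch, not a statement of how Darknet rounds internally. The function names are ours.

```python
def largest_side(base, max_scale, stride=32):
    """Largest square input side, rounded down to a multiple of `stride`."""
    return int(base * max_scale) // stride * stride


def pixel_ratio(side, base):
    """How many times more pixels a side x side image has than base x base."""
    return (side / base) ** 2


for base in (416, 608):
    for max_scale in (1.4, 1.5):
        side = largest_side(base, max_scale)
        print(f"base {base}, scale up to {max_scale}: largest side {side}, "
              f"{pixel_ratio(side, base):.2f}x the pixels of the base image")
```

Because the pixel count grows with the square of the side length, even a moderate increase in resolution roughly doubles the amount of activation data per image.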
2. The Pothole Dataset
In this post, we will combine two open-source datasets to obtain a moderately large and varied set of images for training the YOLOv4 models.
We obtain one of the datasets from Roboflow. This dataset contains 665 images in total and it has already been split into 465 training, 133 validation, and 67 test images.
The other dataset that we use is mentioned in this ResearchGate article. Although the authors provide the link to a large dataset, we use a subset of it for our purpose.
We combine the two datasets in a random manner and create a training, validation, and test set. The dataset contains only one class, that is, Pothole.
You don’t need to worry about this phase of dataset processing, as you will get access to the final dataset directly.
The following are a few annotated images from the final dataset.
Ground truth images with annotations from the pothole detection dataset
We will just carry out a minor preprocessing of the dataset, the details of which we will discuss in the coding section.
The dataset that we will use here has the following split: 1265 training images, 401 validation images, and 118 test images.
3. Training Experiments To Carry Out
As discussed before, we will carry out a total of 4 experiments using the YOLOv4 and YOLOv4-Tiny models. The following is a brief outline of those experiments:
- We will start with training the YOLOv4-Tiny model with a fixed resolution of 416×416.
- Then we will carry out dynamic resolution training using the YOLOv4-Tiny model with a base resolution of 416×416. Now, this experiment should give us a higher mAP on the test set compared to the fixed resolution training. We will confirm so when analyzing the results.
- Next, we will train a dynamic resolution model using the YOLOv4 model with a base resolution of 608×608. This again should give us a higher mAP on the test set compared to the tiny models.
- Finally, we will carry out fixed resolution training for the YOLOv4 model with an image resolution of 608×608.
All the details pertaining to the training parameters & hyperparameters, and setting up the configuration file for each experiment will be discussed in their respective sections.
4. Training YOLOv4 Models on the Custom Pothole Dataset
From here onward, we will discuss the coding details of this post. This includes the preprocessing steps to generate the text files for the image paths, the preparation of the configuration files, the creation of the data files, training, and evaluation on the test set.
There are two ways to proceed here. We can either follow the steps for a local system’s terminal and IDE, or run everything in a Jupyter notebook (local, Colab, or any other cloud-based Jupyter environment). A Jupyter notebook along with all the implementation details is already available for download. Here, we follow the steps for developing code in an IDE and executing commands in the terminal. That way, between this post and the notebook, you get to see both workflows. If you are on Windows, it is recommended that you use the provided Jupyter notebook and run it on Colab. The following steps for local execution were carried out on an Ubuntu system. Please note that if you proceed on your local system, training will require more than 10 GB of GPU memory for some experiments.
Download the Dataset
To download the dataset, simply execute the following command in your terminal inside the directory of your choice.
And extract the dataset using the following command.
Inside the dataset directory, you should find the following directory structure.
├── test
│   ├── G0010124.JPG
│   ├── G0010124.txt
│   ...
│   ├── img-98_jpg.rf.667209472947ff4d519f65c6e206a7c3.jpg
│   └── img-98_jpg.rf.667209472947ff4d519f65c6e206a7c3.txt
├── train
│   ├── G0010033.JPG
│   ├── G0010033.txt
│   ...
│   ├── img-9_jpg.rf.de0e0920eee97f99bfa4d5a2ed29d82e.jpg
│   └── img-9_jpg.rf.de0e0920eee97f99bfa4d5a2ed29d82e.txt
├── valid
│   ├── G0028267.JPG
│   ├── G0028267.txt
│   ...
│   ├── img-94_jpg.rf.26ce6c0878886e2b49b0191cf4f952bb.jpg
│   └── img-94_jpg.rf.26ce6c0878886e2b49b0191cf4f952bb.txt
├── PotholeDataset.pdf
├── README.dataset.txt
└── README.roboflow.txt
The train, valid, and test directories contain the images along with the text files which hold the labels. For YOLOv4, the bounding box coordinates need to be in [x_center, y_center, width, height] format, with each value relative to the image size. Other than that, the label in each case is 0 as we have only one class. The next block shows an example of one such text file.
0 0.5497282608695652 0.5119565217391304 0.017934782608695653 0.005072463768115942
0 0.41032608695652173 0.5253623188405797 0.025 0.005797101449275362
0 0.30842391304347827 0.5282608695652173 0.014673913043478261 0.005797101449275362
0 0.1654891304347826 0.5224637681159421 0.027717391304347826 0.005797101449275362
Each line in the text files represents one object in the corresponding image. The first number, which is 0, represents the class. The remaining four floating-point numbers represent the coordinates in the above-mentioned format.
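To make the label format concrete, here is a minimal sketch that converts one label line into pixel coordinates of the box corners. The function name and the 800×400 image size are made up for illustration; they are not part of the dataset or of Darknet.

```python
def yolo_to_pixel_box(line, image_width, image_height):
    """Convert one YOLO label line to (class_id, x_min, y_min, x_max, y_max) in pixels."""
    class_id, x_center, y_center, width, height = line.split()
    # The four coordinates are fractions of the image size, so scale them back.
    x_center = float(x_center) * image_width
    y_center = float(y_center) * image_height
    width = float(width) * image_width
    height = float(height) * image_height
    return (
        int(class_id),
        x_center - width / 2,
        y_center - height / 2,
        x_center + width / 2,
        y_center + height / 2,
    )


# Hypothetical 800x400 image with one centered box.
print(yolo_to_pixel_box("0 0.5 0.5 0.25 0.5", 800, 400))
```

Going the other way (pixels to relative values) is the same arithmetic in reverse, which is what annotation tools do when they export in this format.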
Cloning and Building Darknet
Next, we need to clone and build Darknet. Execute the following command in the terminal.
git clone https://github.com/AlexeyAB/darknet.git
Enter into the darknet directory using:
Note that all the remaining commands will be executed from the darknet directory. So, all paths will be relative to this directory, and the dataset directory should sit one level up from the darknet directory (that is, at ../dataset).
Now, we need to build Darknet. The build process that we follow here expects a GPU to be available in the system along with CUDA and cuDNN installed. Open the Makefile and change the first 7 lines from:
GPU=0
CUDNN=0
CUDNN_HALF=0
OPENCV=0
AVX=0
OPENMP=0
LIBSO=0
to:
GPU=1
CUDNN=1
CUDNN_HALF=1
OPENCV=1
AVX=1
OPENMP=1
LIBSO=1
Basically, we are telling the Makefile that we want to build Darknet with CUDA, cuDNN, and mixed-precision support for the GPU computation part. The AVX and OPENMP flags enable optimized performance on the CPU as well, if the hardware supports them.
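If you would rather script this edit than change the file by hand (for example in a notebook), the following is one possible sketch. The helper name is ours, and it only rewrites lines that start with one of the seven flags followed by =0.

```python
import re

FLAGS = ("GPU", "CUDNN", "CUDNN_HALF", "OPENCV", "AVX", "OPENMP", "LIBSO")


def enable_flags(makefile_text, flags=FLAGS):
    """Return makefile_text with each `FLAG=0` line rewritten to `FLAG=1`."""
    for flag in flags:
        # Match only at the start of a line and require `=0` right after the
        # flag name, so CUDNN does not accidentally match CUDNN_HALF.
        pattern = rf"^{flag}=0\b"
        makefile_text = re.sub(pattern, f"{flag}=1", makefile_text, flags=re.MULTILINE)
    return makefile_text


sample = "GPU=0\nCUDNN=0\nCUDNN_HALF=0\nOPENCV=0\nAVX=0\nOPENMP=0\nLIBSO=0\n"
print(enable_flags(sample))
```

To apply it to the real file, read the Makefile into a string, pass it through enable_flags, and write the result back.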
Now, save the file and run make in the terminal.
While building Darknet, if you face the following error:
opencv.hpp: No such file or directory
Then you need to install OpenCV using the following command and then run make again.
apt install libopencv-dev
The build should complete without issues this time.
With this, we are all set to use Darknet with CUDA (GPU) support now on our local systems.
Preparing the Text Files for Image Paths
For the Darknet YOLOv4 training and testing, we need all the image paths to be in text files. Darknet reads these files to locate the images of each split.
Note: The paths in the text files should be relative to the darknet directory.
Let’s take a look at the code which will make things clearer. The prepare_darknet_image_txt_paths.py contains the code for generating the train.txt, valid.txt, and test.txt files.
import os

# Dataset directories, relative to the darknet directory.
DATA_ROOT_TRAIN = os.path.join('..', 'dataset', 'train')
DATA_ROOT_VALID = os.path.join('..', 'dataset', 'valid')
DATA_ROOT_TEST = os.path.join('..', 'dataset', 'test')

# Write the path of every non-label file in the train directory to train.txt.
train_image_files_names = os.listdir(DATA_ROOT_TRAIN)
with open('train.txt', 'w') as f:
    for file_name in train_image_files_names:
        if '.txt' not in file_name:
            write_name = os.path.join(DATA_ROOT_TRAIN, file_name)
            f.write(write_name + '\n')

# Same for the validation split.
valid_data_files__names = os.listdir(DATA_ROOT_VALID)
with open('valid.txt', 'w') as f:
    for file_name in valid_data_files__names:
        if '.txt' not in file_name:
            write_name = os.path.join(DATA_ROOT_VALID, file_name)
            f.write(write_name + '\n')

# Same for the test split.
test_data_files__names = os.listdir(DATA_ROOT_TEST)
with open('test.txt', 'w') as f:
    for file_name in test_data_files__names:
        if '.txt' not in file_name:
            write_name = os.path.join(DATA_ROOT_TEST, file_name)
            f.write(write_name + '\n')
We simply iterate over the train, valid, and test directories containing the image files and create the text files. The text files will be created in the darknet directory.
Following are a few lines from the train.txt file.
../dataset/train/img-111_jpg.rf.d7e58630e249c45d8c1d564d847dc236.jpg
../dataset/train/G0027797.JPG
../dataset/train/img-52_jpg.rf.f95bed5b5d7b75aca48be59a17751be5.jpg
../dataset/train/img-384_jpg.rf.52f9bf925832084c3778e4d0de4dfb56.jpg
../dataset/train/img-108_jpg.rf.a35e86abc558a98f252bfc10e49fd6d9.jpg
../dataset/train/G0052399.JPG
../dataset/train/G0054293.JPG
../dataset/train/G0031367.JPG
There are two things to observe here:
- The files are not listed in sorted order, because os.listdir returns directory entries in arbitrary order.
- And the image paths are relative to the current directory.
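Before training, it can be worth checking that every listed image actually has a label file next to it. The following is a minimal sketch of such a check, assuming (as in the directory structure above) that each label file has the same name as its image with a .txt extension. The function name is ours.

```python
import os


def missing_labels(list_file):
    """Return the image paths in list_file that have no matching .txt label file."""
    missing = []
    with open(list_file) as f:
        for line in f:
            image_path = line.strip()
            if not image_path:
                continue
            # Replace only the final extension, e.g. ".jpg" or ".JPG", with ".txt".
            label_path = os.path.splitext(image_path)[0] + ".txt"
            if not os.path.isfile(label_path):
                missing.append(image_path)
    return missing


# Run from the darknet directory; files that do not exist yet are skipped.
for list_file in ("train.txt", "valid.txt", "test.txt"):
    if os.path.isfile(list_file):
        print(list_file, "images without labels:", len(missing_labels(list_file)))
```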
We are all set with the dataset preparation part and building Darknet also. Now, let’s move on to the core experimental part, that is, the training of YOLOv4 models using different parameters.
Training YOLOv4-Tiny Model with Fixed Resolution
We will start with training the YOLOv4-Tiny model. We will create the configuration and data files for this. For the configuration, we will change the batch size and number of batches to train for, but leave the other settings to their default values.
Setting Up Model Configuration and Data Files
Inside the cfg directory in the darknet folder, create a copy of the yolov4-tiny-custom.cfg file. Name it as yolov4-tiny-pothole.cfg. From here on, all the configuration settings that we discuss are based on a 16GB Tesla P100 GPU available on Colab. You may adjust the configurations according to your availability, but the experiments and results that we discuss here are based on the settings as per the mentioned hardware.
In the new configuration file, change batch from 64 to 32, set max_batches to 8000, and set steps to 6400,7200. In other words, we will be training the model for 8000 batches with a batch size of 32, and the learning rate will be reduced at batches 6400 and 7200.
Next are the number of filters and classes. In the tiny model configuration file, we can find two [yolo] layers. Change classes in both of them from 80 to 1, as we have only one class. The [convolutional] layer immediately before each [yolo] layer contains a filters parameter. Change it to the value given by (num_classes+5)*3, which is 18 in our case. For the tiny YOLOv4 model, this means changing filters in two [convolutional] layers, one before each [yolo] layer.
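The values above follow simple arithmetic, which the sketch below reproduces. The function name is hypothetical, and placing the learning rate steps at 80% and 90% of max_batches is the pattern that 6400 and 7200 follow for 8000 batches.

```python
def yolo_cfg_values(num_classes, max_batches, anchors_per_layer=3):
    """Return the cfg values that depend on the class count and training length."""
    # Each anchor predicts 4 box coordinates, 1 objectness score,
    # and one score per class.
    filters = (num_classes + 5) * anchors_per_layer
    # Learning rate drops at 80% and 90% of the total number of batches.
    steps = (max_batches * 8 // 10, max_batches * 9 // 10)
    return {
        "classes": num_classes,
        "filters": filters,
        "max_batches": max_batches,
        "steps": steps,
    }


print(yolo_cfg_values(num_classes=1, max_batches=8000))
```

For the default 80-class configuration the same formula gives 255 filters, which is the value you will find in the unmodified file.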
Then we need to create a pothole.names file inside build/darknet/x64/data. This will contain the class names, one per line. As we have only one class, just enter the word pothole in the first line.
Next, we need to create a .data file. We create a separate file for each experiment. Create a pothole_yolov4_tiny.data inside build/darknet/x64/data. This file will contain information about the classes, the dataset paths, and the location to store the trained models. Enter the following information in that file:
classes = 1
train = train.txt
valid = valid.txt
names = build/darknet/x64/data/pothole.names
backup = backup_yolov4_tiny
We specify the number of classes, the training and validation text file paths, the path to class names and the backup folder path. This is the folder where the trained model will be saved. Although we can use the same folder for all experiments, we will create a new folder for each experiment.
Before we move further, make sure to create the backup_yolov4_tiny folder in the darknet directory where the trained models will be saved. Else, the training process will throw an error as the directory is not automatically created.
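Since each experiment needs its own .data file and backup folder, a small helper can create both in one go. This is only a convenience sketch with a function name of our choosing; writing the file by hand works just as well.

```python
import os


def write_data_file(path, train, valid, names, backup, classes=1):
    """Write a Darknet .data file and create the backup directory if it is missing."""
    # Darknet does not create the backup directory itself.
    os.makedirs(backup, exist_ok=True)
    lines = [
        f"classes = {classes}",
        f"train = {train}",
        f"valid = {valid}",
        f"names = {names}",
        f"backup = {backup}",
    ]
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")


# Run from the darknet directory; skipped if the data directory is not there.
data_dir = os.path.join("build", "darknet", "x64", "data")
if os.path.isdir(data_dir):
    write_data_file(
        os.path.join(data_dir, "pothole_yolov4_tiny.data"),
        train="train.txt",
        valid="valid.txt",
        names="build/darknet/x64/data/pothole.names",
        backup="backup_yolov4_tiny",
    )
```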
This completes all the steps that we need to complete before we can start training. For further experiments, this will get easier as we have all configurations in place for the first experiment.
Train the YOLOv4-Tiny Model
To train the model, we will use the already available pretrained tiny model. Download it by executing the following command on the terminal.
Then execute the following command in the terminal inside the darknet directory.
./darknet detector train build/darknet/x64/data/pothole_yolov4_tiny.data cfg/yolov4-tiny-pothole.cfg yolov4-tiny.conv.29
The training will take some time based on the hardware being used. When the training ends, you should get an output similar to the following.
Saving weights to backup_yolov4_tiny/yolov4-tiny-pothole_8000.weights
Saving weights to backup_yolov4_tiny/yolov4-tiny-pothole_last.weights
Saving weights to backup_yolov4_tiny/yolov4-tiny-pothole_final.weights
If you want to train from the beginning, then use flag in the end of training command: -clear
The following figure shows the loss plot throughout the training.
Loss plot for Tiny YOLOv4 fixed resolution training.
By the end of the training, YOLOv4-Tiny with 416×416 fixed resolution is giving around 0.12 loss. This looks low enough for object detection training. But we will get a real idea of its accuracy from the mAP (Mean Average Precision).
We will need another .data file to provide the path for the test image files. Create the pothole_test.data inside build/darknet/x64/data directory with the following content.
classes = 1
train = train.txt
valid = test.txt
names = build/darknet/x64/data/pothole.names
backup = backup_test/
The only things changing here are the path to the valid text file and the backup folder name. We can use this same data file for further mAP tests as well.
As we have the trained model on the disk now, we can execute the following command to calculate the mAP at 0.5 IoU.
./darknet detector map build/darknet/x64/data/pothole_test.data cfg/yolov4-tiny-pothole.cfg backup_yolov4_tiny/yolov4-tiny-pothole_final.weights
The output that we get here is:
IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.400207, or 40.02 %
We get an mAP of 40.02%. This is not bad considering we trained a tiny model on 416×416 resolution images.
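The 0.5 IoU threshold decides whether a predicted box counts as a match for a ground-truth box. The following is a minimal sketch of that overlap measure, not Darknet's own implementation. The function names are ours, and the use of >= at the threshold is a choice made for this sketch.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    intersection = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def is_match(predicted, ground_truth, threshold=0.5):
    """True if the predicted box overlaps the ground truth enough to count as a hit."""
    return iou(predicted, ground_truth) >= threshold


# Two 10x10 boxes, the second shifted right by half its width.
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))
print(is_match((0, 0, 10, 10), (5, 0, 15, 10)))
```

Potholes far from the camera produce very small boxes, where a shift of a few pixels already changes the IoU a lot. This is worth keeping in mind when reading the mAP numbers.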
Training YOLOv4-Tiny Model with Multi-Resolution Images
At the beginning of the post, we discussed that Darknet supports multi-resolution training. In this case, the resolution of the images is changed randomly every 10 batches, within a range around the base resolution that we provide (in the AlexeyAB fork, from 1/1.4 up to 1.4 times the base resolution).
How does this help?
During multi-resolution training, the model will get to see both larger and smaller images. This will help it learn and detect objects in more difficult scenarios. Theoretically, we can say that this should provide us with a higher mAP if we keep every other training parameter the same.
Rather than speculating what would happen, let’s try this out.
Setting Up Model Configuration and Data Files for YOLOv4-Tiny Multi-Resolution Training for Pothole Detection
We need to set up the configuration and data files for the multi-resolution training. Let’s tackle the configuration file first.
Create a yolov4-tiny-multi-res-pothole.cfg inside the cfg directory. Now, almost at the end of each model configuration file, Darknet provides a random parameter. In the tiny model configuration files, it is 0 by default indicating that no random resolution (or multi-resolution) will be used during training. We need to make sure that random=1 is set in the configuration file.
All the other configurations and parameters will be the same as the previous training, that is, fixed resolution YOLOv4-Tiny model training.
Now, create a pothole_yolov4_tiny_multi_res.data file inside the build/darknet/x64/data directory with the following content:
classes = 1
train = train.txt
valid = valid.txt
names = build/darknet/x64/data/pothole.names
backup = backup_yolov4_tiny_multi_res
We just change the backup directory name. Be sure to create the backup_yolov4_tiny_multi_res folder in the darknet directory.
Train the YOLOv4-Tiny Model with Multi-Resolution
To start the training, we simply need to execute the following command from the darknet directory.
./darknet detector train build/darknet/x64/data/pothole_yolov4_tiny_multi_res.data cfg/yolov4-tiny-multi-res-pothole.cfg yolov4-tiny.conv.29
Note that this will take more time to train compared to the previous experiment as the model will also train on larger images in some of the batches.
The following is the loss plot after the training finishes.
By the end of the training, the loss is 0.32, which is higher than in the single-resolution training. This is expected, as the training data becomes more difficult whenever the model trains on smaller images. But at the same time, the model got to see much more varied scenarios, which means that it may have learned better. Let’s check out the mAP.
./darknet detector map build/darknet/x64/data/pothole_test.data cfg/yolov4-tiny-multi-res-pothole.cfg backup_yolov4_tiny_multi_res/yolov4-tiny-multi-res-pothole_final.weights
We get the following output this time.
IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.415005, or 41.50 %
So, we have slightly higher mAP in this case and this is what we expected also.
Training YOLOv4 with Multi-Resolution Images
We are done with the YOLOv4-Tiny model training. Now, we will move forward with the YOLOv4 model training, which is the large model in the family of YOLOv4 models.
One thing to note here is that for the YOLOv4 model, the multi-resolution training takes place by default. So, the random parameter value is 1. For that reason, we will first carry out multi-resolution training and then move to the fixed resolution training.
Configuration and Data File Setup for YOLOv4 Multi-Resolution Training
For the configuration file, a lot of things will stay similar to the YOLOv4-Tiny model. A few parameters will change and let’s discuss them here.
First, we need to create a copy of the yolov4-custom.cfg file in the cfg directory and rename it as yolov4-pothole.cfg. Next, we need to ensure the following parameters:
- For the Colab P100 GPU, batch and subdivisions were both set to 32 (each batch is split into as many mini-batches as the subdivisions value, so only batch/subdivisions images are processed on the GPU at a time). We are using a high subdivisions value so that we can train with a batch size of 32 within the available GPU memory. Training with a smaller batch size results in more unstable training.
- Make sure that the width and height are both set to 608. We will be setting the base resolution to 608×608.
- The max_batches is 8000 and steps is 6400, 7200. These remain the same as in the case of tiny model training.
- There are three [yolo] layers in this configuration file. Make sure to set the classes=1 in all three.
- Also, the [convolutional] layer immediately before each of the three [yolo] layers has a filters parameter. Set filters=18 in all three of these layers.
The random parameter is already set to 1 by default, so, we do not need to make any changes to that to carry out the multi-resolution training.
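The relationship between batch and subdivisions can be written down in one line. The sketch below uses a helper name of our own and, to keep the arithmetic unambiguous, rejects values that do not divide evenly.

```python
def mini_batch_size(batch, subdivisions):
    """Number of images processed on the GPU at once for a given cfg setting."""
    if batch % subdivisions != 0:
        # This sketch only handles even splits.
        raise ValueError("batch should be divisible by subdivisions")
    return batch // subdivisions


# batch=32 with subdivisions=32 puts one image on the GPU at a time,
# while a lower subdivisions value such as 8 puts four.
for subdivisions in (32, 8):
    print(f"batch=32, subdivisions={subdivisions}: "
          f"{mini_batch_size(32, subdivisions)} image(s) per mini-batch")
```

The weights are still updated once per full batch of 32 images, so a higher subdivisions value trades training speed for lower peak memory use.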
Next, coming to the data file. Create a pothole_yolov4.data file in the build/darknet/x64/data directory with the following contents:
classes = 1
train = train.txt
valid = valid.txt
names = build/darknet/x64/data/pothole.names
backup = backup_yolov4
Again, nothing changes here apart from the backup directory name.
Train the YOLOv4 Model with Multi-Resolution
Before starting the training, let’s download the pretrained features using the following command so that we can utilize them.
Next, execute the following command to start the training. Do note that the training may take a long time when using a mid-range GPU.
./darknet detector train build/darknet/x64/data/pothole_yolov4.data cfg/yolov4-pothole.cfg yolov4.conv.137
We get the following loss plot after the training.
Loss plot after training the YOLOv4 multi-resolution model.
Interestingly, the loss here is much higher than the tiny model training. In fact, we can see more fluctuations in the loss graph compared to previous experiments. But looking at the mAP metric will give us a better idea.
./darknet detector map build/darknet/x64/data/pothole_test.data cfg/yolov4-pothole.cfg backup_yolov4/yolov4-pothole_final.weights
IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.653199, or 65.32 %
The large model with multi-resolution training gives more than 65% mAP this time. We will see how this higher mAP affects the detections when running inference.
Training YOLOv4 with Fixed Resolution Images
We have reached the final experiment now. We will train the YOLOv4 model with a fixed resolution of 608×608.
This will actually give us a better understanding of how multi-resolution training can, in some cases, affect the learning process of the model.
Configuration and Data File Setup for YOLOv4 Fixed Resolution Training
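A lot of parameters in the configuration file will remain similar to the above training process. Create a copy of yolov4-pothole.cfg in the cfg directory and name it yolov4-fixed-pothole.cfg, which is the file name that the training command for this experiment uses. Then check the parameters that need some attention: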
- This time also, keep the batch as 32 but reduce the subdivision to 8. We don’t need a higher subdivision this time as we will not be training on images larger than 608×608.
- In the final [yolo] layer, change the random from 1 to 0, that is, random=0. This will turn off multi-resolution training.
- All other parameters remain the same as in the previous training experiment.
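Create the backup_yolov4_fixed folder in the darknet directory, and then create the pothole_yolov4_fixed.data file inside build/darknet/x64/data with the following content:
classes = 1
train = train.txt
valid = valid.txt
names = build/darknet/x64/data/pothole.names
backup = backup_yolov4_fixed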
Train the YOLOv4 Fixed Resolution Model
Finally, we are all set to train the model.
./darknet detector train build/darknet/x64/data/pothole_yolov4_fixed.data cfg/yolov4-fixed-pothole.cfg yolov4.conv.137
The following figure shows the loss plot after training.
Loss plot after training the YOLOv4 fixed resolution model.
The loss is surely lower compared to the multi-resolution training experiment. One reason for this could be that the model got to see images of only one scale and therefore it was a bit easier for it to learn.
Let’s check out the mAP results.
./darknet detector map build/darknet/x64/data/pothole_test.data cfg/yolov4-fixed-pothole.cfg backup_yolov4_fixed/yolov4-fixed-pothole_final.weights
IoU threshold = 50 %, used Area-Under-Curve for each unique Recall
mean average precision (mAP@0.50) = 0.693369, or 69.34 %
This is quite surprising. We are getting an increase of about 4 percentage points in the mAP this time (69.34% versus 65.32%). This means that, on the test set, this model predicts the bounding boxes better than the multi-resolution model.
5. Visual Comparison of mAP for All Models
The following plot shows the mAP comparison at 0.50 IoU threshold for all the runs that we carried out above.
mAP comparison for different models for pothole detection using YOLOv4.
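The same comparison can be made numerically from the four test-set mAP values reported in the experiments above. The sketch below ranks the runs and computes the gaps in percentage points. The run labels and helper names are ours.

```python
# Test-set mAP@0.50 (in %) as reported by the four `detector map` runs.
results = {
    "YOLOv4-Tiny, fixed 416": 40.02,
    "YOLOv4-Tiny, multi-res 416": 41.50,
    "YOLOv4, multi-res 608": 65.32,
    "YOLOv4, fixed 608": 69.34,
}


def ranked(results):
    """Runs sorted from highest to lowest mAP."""
    return sorted(results.items(), key=lambda item: item[1], reverse=True)


def gap(results, run_a, run_b):
    """Difference in mAP between two runs, in percentage points."""
    return round(results[run_a] - results[run_b], 2)


for name, value in ranked(results):
    print(f"{name:<28} {value:6.2f}")

print("YOLOv4 fixed vs multi-res:",
      gap(results, "YOLOv4, fixed 608", "YOLOv4, multi-res 608"))
print("Tiny multi-res vs fixed:",
      gap(results, "YOLOv4-Tiny, multi-res 416", "YOLOv4-Tiny, fixed 416"))
```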
We can clearly see that for the larger model, the fixed resolution model (with 608×608 images) gives a noticeably better mAP. This is odd given that the multi-resolution model has been trained on images at widely varying scales and is expected to give better results. But it is possible that the varying scale makes the dataset harder to learn, so the model may need more batches to get to a similar mAP. We have trained all the models for 8000 batches with a batch size of 32, which is about 200 epochs for this dataset. Most probably, training the multi-resolution model for more epochs would give better results.
But as of now, as the multi-resolution model has seen more varied images, we can expect that it will perform just as well as the fixed resolution model in real-life cases when detecting potholes.
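The figure of about 200 epochs follows directly from the training settings and the size of the training split. A quick sketch of the arithmetic, with a function name of our own:

```python
def approx_epochs(max_batches, batch_size, num_train_images):
    """Approximate number of passes over the training set."""
    return max_batches * batch_size / num_train_images


# 8000 batches of 32 images over the 1265 training images.
print(round(approx_epochs(8000, 32, 1265)))
```

The same function shows how many batches a longer run would need. For example, doubling the number of epochs means doubling max_batches at the same batch size.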
6. Running Inference on Real-Life Pothole Detection Scenarios
Let’s run inference using all 4 trained models. Before moving forward with the inference part, if you intend to run inference on your own videos as well, be sure to have all the folders containing the trained models (the backup directories) in the darknet directory.
We will be using the Python (darknet_video.py) script to run the inference which has been slightly modified to show the FPS on the video frame. The modified script is part of the downloadable code.
Let’s start with the inference using the YOLOv4 Tiny model with fixed resolution.
python darknet_video.py --data_file build/darknet/x64/data/pothole_yolov4_tiny.data --config_file cfg/yolov4-tiny-pothole.cfg --weights backup_yolov4_tiny/yolov4-tiny-pothole_final.weights --input inference_data/video_6.mp4 --out_filename tiny_singleres_vid6.avi --dont_show
For the inference script, we need to provide the following arguments:
--data_file: The same data file used during training, containing the path to the class names file and the number of classes.
--config_file: The path to the model configuration file.
--weights: This flag takes in the path to the model weights.
--input: The input video file on which we want to run the inference.
--out_filename: Resulting video file name.
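When running the same video through several models, it can help to build the command from its parts instead of retyping it. The following sketch assembles the argument list using only the flags shown in the command above. The function name is ours, and actually launching the command (for example with subprocess.run from the darknet directory) is left out.

```python
def inference_command(data_file, config_file, weights, input_video,
                      out_filename, show=False):
    """Build the darknet_video.py command line as a list of arguments."""
    command = [
        "python", "darknet_video.py",
        "--data_file", data_file,
        "--config_file", config_file,
        "--weights", weights,
        "--input", input_video,
        "--out_filename", out_filename,
    ]
    if not show:
        # The commands in this post pass this flag on every run.
        command.append("--dont_show")
    return command


tiny_fixed = inference_command(
    data_file="build/darknet/x64/data/pothole_yolov4_tiny.data",
    config_file="cfg/yolov4-tiny-pothole.cfg",
    weights="backup_yolov4_tiny/yolov4-tiny-pothole_final.weights",
    input_video="inference_data/video_6.mp4",
    out_filename="tiny_singleres_vid6.avi",
)
print(" ".join(tiny_fixed))
```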
It is pretty clear that the model is not performing very well here. It clearly misses a lot of the potholes, and the detections that it does make fluctuate a lot. The limitations of the fixed resolution tiny model can be seen here.
Now, let’s run inference using the YOLOv4 Tiny model trained with multi-resolution images.
python darknet_video.py --data_file build/darknet/x64/data/pothole_yolov4_tiny_multi_res.data --config_file cfg/yolov4-tiny-multi-res-pothole.cfg --weights backup_yolov4_tiny_multi_res/yolov4-tiny-multi-res-pothole_final.weights --input inference_data/video_6.mp4 --out_filename tiny_multires_vid6.avi --dont_show
The results look almost exactly the same as the single resolution tiny model. This is expected as the multi-resolution model has only about a 1.5 percentage point increase in mAP on the test set (41.50% versus 40.02%).
The normal YOLOv4 models had performed quite well on the test set. The following command executes the YOLOv4 multi-resolution model which gave an mAP of around 65% on the test set.
python darknet_video.py --data_file build/darknet/x64/data/pothole_yolov4.data --config_file cfg/yolov4-pothole.cfg --weights backup_yolov4/yolov4-pothole_final.weights --input inference_data/video_6.mp4 --out_filename yolov4_vid6.avi --dont_show
The results are much better here. The model is able to detect potholes that are farther away and with more confidence as well. But we still see some fluctuations in the detections here as well.
The final model that we have is the YOLOv4 model trained with fixed 608×608 resolution images. Let’s try that out.
python darknet_video.py --data_file build/darknet/x64/data/pothole_yolov4_fixed.data --config_file cfg/yolov4-fixed-pothole.cfg --weights backup_yolov4_fixed/yolov4-fixed-pothole_final.weights --input inference_data/video_6.mp4 --out_filename yolov4_fixed_vid6.avi --dont_show
The results are really interesting. If you remember, the fixed resolution model gave the highest mAP of more than 69% on the test dataset. But here, it is detecting fewer potholes compared to the multi-resolution model. It is mostly failing when the potholes are small or farther away. This is most likely because the multi-resolution model learned the features of both smaller and larger potholes during training. This is also a reminder that the metrics that we get on a specific dataset may not always be a direct representation of the results that we get in real-life use cases.
In this post, we covered a lot of ground regarding the YOLOv4 model and the Darknet framework. We started with setting up Darknet on the Ubuntu system with CUDA support. Then we trained multiple YOLOv4 models with different configurations on the Pothole detection dataset. After training, running the inference gave us a pretty good idea that sometimes trying to solve real-world problems with deep learning can be more difficult than it seems. The varied results that we got from different models made it pretty clear.
To get even better results, we may need to try more powerful and better models, or even add more real-life images to the training set. If you try out any of these, be sure to let us know in the comment section.