How to Train YOLOv4 on a Custom Dataset

In this tutorial, we walk through how to train YOLOv4 Darknet for state-of-the-art object detection on your own dataset, with any number of classes.

Train YOLOv4 on a custom dataset with this tutorial on Darknet! (photo credit)

YOLOv5 has arrived

If you're here for the Darknet, stay for the Darknet. Otherwise, consider running the YOLOv5 PyTorch tutorial in Colab. You'll have a very performant, trained YOLOv5 model on your custom data in a matter of minutes.

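We will take the following steps to implement YOLOv4 on our custom data:

  • Configure our GPU environment on Google Colab
  • Install the Darknet YOLOv4 training environment
  • Download our custom dataset for YOLOv4 and set up directories
  • Configure a custom YOLOv4 training config file for Darknet
  • Train our custom YOLOv4 object detector
  • Reload YOLOv4 trained weights and make inference on test images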

Impatient? Jump to our YOLOv4 Colab notebook.

YOLOv4 Darknet Video Tutorial. Subscribe to our YouTube.

Introduction to Training YOLOv4 on a custom dataset

Object detection models continue to get better, increasing in both performance and speed. In the real-time object detection space, YOLOv3 (released April 8, 2018) has been a popular choice, as has EfficientDet (released April 3rd, 2020) by the Google Brain team.

Progress continues with the recent release of YOLOv4 (released April 23rd, 2020), which has been shown to be the new object detection champion by standard metrics on COCO.

YOLOv4 performance from the paper. (Citation)

These general object detection models are proven out on the COCO dataset, which contains a wide range of objects and classes, with the idea that if they perform well on that task, they will generalize well to new datasets.

However, applying the deep learning techniques used in research can be difficult in practice on custom objects. We have been working to make that transition easy and have released similar tutorials in the past including:

This post builds on that prior work and is among the first to help you apply YOLOv4 to a custom dataset, not just the objects included in the COCO dataset.

By using YOLOv4, you are implementing many of the past research contributions in the YOLO family along with a series of new contributions unique to YOLOv4, including WRC, CSP, CmBN, SAT, Mish activation, Mosaic data augmentation, DropBlock regularization, and CIoU loss.

In short, with YOLOv4, you're using a better object detection network architecture and new data augmentation techniques.

In this tutorial, we use the Darknet framework because the ability to train YOLOv4 in TensorFlow, Keras, and PyTorch frameworks is still under construction. (Update: YOLO v4 PyTorch implementation now available.)

If you would like to learn more about the research contributions made by YOLOv4, we recommend reading the following:

What is the Darknet Framework?

It's not TensorFlow, nor is it PyTorch, and it most certainly is not Keras. It is a custom framework written by Joseph Redmon (who, by the way, has a phenomenally fun resume).

While Darknet is not as intuitive to use, it is immensely flexible, and it advances state-of-the-art object detection results.

In this post, we'll be using Darknet to implement YOLOv4. Along the way, we'll demystify the difficulties of getting Darknet set up within Colab. Stay tuned for future posts where we'll implement YOLOv4 in PyTorch, YOLOv4 in TensorFlow, and YOLOv4 in Keras.

Alright, let's get to it! We recommend reading this blog post alongside the Colab notebook.

Configuring our GPU Environment for YOLOv4 on Google Colab

For compute, we are going to use Google Colab. Google Colab is a hosted Jupyter notebook environment for Python that can run on a GPU. It is free to use, with an optional upgrade to a Pro account for $10/month.

You can use this tutorial on your local machine as well, but configurations will be slightly different. Regardless of environment, the important things we will need to train YOLOv4 are the following:

  • GPU with specific GPU drivers installed
  • OpenCV
  • cuDNN configured on top of GPU drivers

For the next steps, open our YOLOv4 Darknet Colab notebook.

Thankfully, Google Colab takes care of the first two for us, so we only need to configure cuDNN.

Colab gives you OpenCV and a GPU by default

Configuring cuDNN for YOLOv4

Google Colab has been updated with cuDNN pre-installed, so this step is no longer needed 🎉

Double checking cuDNN install

Install the Darknet YOLO v4 training environment

Next, we clone our fork of the Darknet YOLO v4 repository. We have made a few minor tweaks to remove print statements and to change the Makefile to play well with Google Colab.

For Google Colab users, we have added a cell that will automatically specify the architecture based on the detected GPU. If you are on a local machine (not Colab), have a look at the Makefile for your machine. You will need to change the following line to fit your GPU based on your GPU's compute capability:

ARCH= -gencode arch=compute_60,code=sm_60
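
If you know your GPU's compute capability, the value for that line follows a fixed pattern. The helper below is a minimal sketch of that pattern; the function name `arch_flag` is our own and is not part of Darknet. It assumes you pass the capability as a string such as "6.0" or "7.5".

```python
def arch_flag(compute_capability: str) -> str:
    """Build the Makefile ARCH value for a single compute capability.

    compute_capability is a string such as "6.0" or "7.5".
    """
    major, minor = compute_capability.split(".")
    code = f"{int(major)}{int(minor)}"
    return f"-gencode arch=compute_{code},code=sm_{code}"


# Reproduces the Makefile line shown above for compute capability 6.0.
print("ARCH=", arch_flag("6.0"))
```

For a GPU with compute capability 7.5, the same helper yields `-gencode arch=compute_75,code=sm_75`.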

Moving along, after we have cloned the repository, we run !make to build Darknet for YOLOv4. If the build succeeds, you will see a number of printouts, and at the bottom a line beginning with:

g++ -std=c++11 -std=c++11 -Iinclude/ -I3rdparty/stb/include -DOPENCV pkg-config --cflags opencv4 2> /dev/null

Note: You may see warning messages following that line; you may safely ignore these warnings.

Finally, we download the newly released convolutional neural network weights used in YOLOv4.

yolov4.conv.137     100%[===================>] 162.16M  64.2MB/s    in 2.5s    

✅ All set.

Download Our Custom Dataset for YOLOv4 and Set Up Directories

To train YOLOv4 on Darknet with our custom dataset, we need to import our dataset in Darknet YOLO format.

To import our images and bounding boxes in the YOLO Darknet format, we'll use Roboflow.
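
For reference, the YOLO Darknet format stores one .txt file per image, with one line per bounding box: the class id followed by the box center x, center y, width, and height, each normalized by the image size. Roboflow handles this conversion for us, but a minimal sketch of the arithmetic looks like the following. The function name and the example box are ours, for illustration only.

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space box (corner coordinates) to a YOLO Darknet label line."""
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# An illustrative box in a 416x416 image.
print(to_yolo_line(0, 104, 104, 312, 208, 416, 416))
# → 0 0.500000 0.375000 0.500000 0.250000
```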

Don't have a dataset? You can also start with one of the free computer vision datasets. The dataset used in this tutorial is Blood Cell Count and Detection (BCCD), which you can fork to add to your Roboflow account.

Using Your Own Custom Data

To export your own data for this tutorial, sign up for Roboflow and make a public workspace, or make a new public workspace in your existing account.

If your data is private, you can upgrade to a paid plan for export to use external training routines like this one or experiment with using Roboflow's internal training solution.

Labeling Data: If your data is unlabeled we recommend using Roboflow Annotate to add your annotations.


Labeling images with Roboflow Annotate is a breeze

To get your data into Roboflow, create a free Roboflow account. Upload your images and their annotations in any format (VOC XML, COCO JSON, TensorFlow Object Detection CSV, etc.).

Once uploaded, select a couple of preprocessing steps. We recommend auto-orient and resize to 416x416 (YOLO expects input dimensions that are multiples of 32).
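
If you would rather pick a different input size, it should still be a multiple of 32. As a small sketch (the helper name is our own), you can snap any target size to the nearest valid value like this:

```python
def nearest_multiple_of_32(size: int) -> int:
    """Round a target side length to the nearest multiple of 32 (minimum 32)."""
    return max(32, (size + 16) // 32 * 32)


for side in (416, 500, 600):
    print(side, "->", nearest_multiple_of_32(side))
```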

The settings I've chosen for my example dataset, BCCD.

Next, click "Generate" to create a version of these images that we will load into Colab. Optionally, provide a name for your version. Once the images are generated, you'll be prompted to create an export. Export your images and annotations in the Darknet format. Be sure to select "show download code."

Export as YOLO Darknet, and "Show Download Code."

Once the download is zipped, we'll be provided a line of code to download our data anywhere we need. Copy this link, and paste it into our Colab notebook where prompted.

Downloading data from Roboflow - it will download in train/valid/test splits and as a combination of images and annotation txt.

If you are working locally and already have your dataset in the right format, you can use the same Roboflow link or simply copy your files into the directories manually.

Then, we run some code to move the image and annotation files into the correct directories for training.
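
If you are setting things up by hand, Darknet needs a few small text files to find your data: a names file with one class per line, a list of image paths for each split, and a .data file that ties them together. The sketch below writes those files. The file names (obj.names, train.txt, valid.txt, obj.data) and image paths are assumptions for illustration, and the notebook's exact layout may differ.

```python
import tempfile
from pathlib import Path


def write_darknet_data_files(data_dir, class_names, train_images, valid_images):
    """Write obj.names, train.txt, valid.txt and obj.data into data_dir."""
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)

    # One class name per line; the line index is the class id used in the labels.
    (data_dir / "obj.names").write_text("\n".join(class_names) + "\n")
    # One image path per line for each split.
    (data_dir / "train.txt").write_text("\n".join(train_images) + "\n")
    (data_dir / "valid.txt").write_text("\n".join(valid_images) + "\n")

    obj_data = (
        f"classes = {len(class_names)}\n"
        f"train = {(data_dir / 'train.txt').as_posix()}\n"
        f"valid = {(data_dir / 'valid.txt').as_posix()}\n"
        f"names = {(data_dir / 'obj.names').as_posix()}\n"
        "backup = backup/\n"
    )
    (data_dir / "obj.data").write_text(obj_data)
    return obj_data


# Demo in a temporary directory so nothing in the working tree is touched.
with tempfile.TemporaryDirectory() as tmp:
    text = write_darknet_data_files(
        Path(tmp) / "data",
        ["Platelets", "RBC", "WBC"],   # BCCD class names
        ["data/obj/img_001.jpg"],      # placeholder image paths
        ["data/obj/img_002.jpg"],
    )
    print(text)
```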

✅ Onward.

Configure a Custom YOLOv4 Training Config File for Darknet

Configuring the training config for YOLOv4 for a custom dataset is tricky, and we handle it automatically for you in this tutorial. We'll set defaults for the learning rate and batch size below, and you should feel free to adjust these to your dataset's needs.

We set up the config by combining a series of chunked config files. We take the following steps according to the YOLOv4 repository:

  • Set batch size to 64 - batch size is the number of images per iteration
  • Set subdivisions to 12 - subdivisions are the number of pieces your batch is broken into to fit in GPU memory
  • Set max_batches to 2000 * number of classes
  • Set steps to 80% and 90% of max_batches
  • Change classes in each of the YOLO layers to your number of classes
  • Change filters in the convolutional layer directly before each YOLO layer to (number of classes + 5) * 3

Most of these you will not need to change. You may want to lower the number of subdivisions to speed up training (fewer subdivisions are faster) or raise it if your GPU does not have enough memory (more subdivisions require less memory).
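
To make the arithmetic in that list concrete, here is a minimal sketch that derives the values from the number of classes. The function name is our own. The filters value uses the (classes + 5) * 3 rule from the Darknet repository's instructions for the convolutional layer before each YOLO layer.

```python
def yolo_config_values(num_classes, batch=64, subdivisions=12):
    """Compute the class-dependent settings for a custom YOLOv4 config."""
    max_batches = 2000 * num_classes
    return {
        "batch": batch,
        "subdivisions": subdivisions,
        "max_batches": max_batches,
        # Learning rate steps at 80% and 90% of max_batches.
        "steps": (max_batches * 8 // 10, max_batches * 9 // 10),
        "classes": num_classes,
        # Filters in the convolutional layer before each YOLO layer.
        "filters": (num_classes + 5) * 3,
    }


# BCCD has three classes.
print(yolo_config_values(3))
```

For the three-class BCCD dataset this gives max_batches of 6000, steps of 4800 and 5400, and 24 filters.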

✅ Good to go!

Train Our Custom YOLOv4 Object Detector

Now that we have set up the environment, we can begin to train our custom YOLOv4 object detector.

When using custom data, consider using Roboflow Train (train up to 3 models for free with a Public account) for a quick check of how your labeling strategy may perform. This can save you time (and money) before training in Colab.

Training Custom YOLOv4 detector... ⏰

Training will print after every iteration. The mAP will be calculated on the validation set and will print every 1000 iterations. (See our post explaining mAP to learn more.)
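
For readers running outside the notebook, the training call is a single Darknet command. The sketch below only assembles that command so you can inspect it before running it. The data and config file paths are placeholders we chose for illustration; `-dont_show` suppresses the display window, and `-map` asks Darknet to compute mAP on the validation set during training.

```python
def darknet_train_command(data_file, cfg_file, pretrained_weights, compute_map=True):
    """Assemble the argument list for a Darknet detector training run."""
    cmd = ["./darknet", "detector", "train",
           data_file, cfg_file, pretrained_weights, "-dont_show"]
    if compute_map:
        cmd.append("-map")
    return cmd


cmd = darknet_train_command(
    "data/obj.data",                    # placeholder path
    "cfg/custom-yolov4-detector.cfg",   # placeholder path
    "yolov4.conv.137",                  # pretrained weights downloaded earlier
)
print(" ".join(cmd))
# To launch it from Python, pass the list to subprocess.run(cmd, check=True).
```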

Note: Training will take approximately six hours for 300 images. Darknet is a research framework and is not optimized for quick training. To shorten training time, try lowering the number of subdivisions and max_batches.

You want to watch the "avg loss" to see if your detector is converging. Choose the weights from the iteration that achieves the best mAP on your validation set.

Training...

Almost there.

Using Our Custom YOLO v4 Detector for Inference

In this section, we will use your trained custom YOLO v4 detector to run inference on test images. During training, the trained weights for our detector are saved every 100 iterations in the ./backup/ directory.

We can reload these weights and run inference on a test image. Remember to use the weights that achieved the highest mAP on your validation set.
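
As with training, inference comes down to one Darknet command. Here is a minimal sketch that assembles it. All file paths, including the weights file name, are placeholders for illustration, so substitute the files from your own run. The `-thresh` flag sets the confidence threshold for reported detections.

```python
def darknet_test_command(data_file, cfg_file, weights_file, image_path, thresh=0.25):
    """Assemble the argument list for running Darknet detection on one image."""
    return ["./darknet", "detector", "test",
            data_file, cfg_file, weights_file, image_path,
            "-thresh", str(thresh), "-dont_show"]


cmd = darknet_test_command(
    "data/obj.data",                                # placeholder path
    "cfg/custom-yolov4-detector.cfg",               # placeholder path
    "backup/custom-yolov4-detector_best.weights",   # placeholder weights file
    "test/example.jpg",                             # placeholder image
)
print(" ".join(cmd))
```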

My YOLOv4 model for cell detection is the best one I have ever trained

✅ There you have it!

You have trained your own YOLO v4 model to make object detections on custom objects. I have personally found that YOLO v4 performs best among the models I have tried for custom object detection tasks.

Saving Model Weights for Future Use

You can save your model weights by moving them from the ./backup/ directory back into your Google Drive. Then you can pick up training from those weights and re-import them for inference.
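
A minimal sketch of that step is below. Note that it copies the weights rather than moving them, so the originals stay in ./backup/. The function name is our own, and the Drive path in the comment assumes you have already mounted Google Drive at /content/drive in Colab.

```python
import shutil
import tempfile
from pathlib import Path


def save_weights(backup_dir, dest_dir):
    """Copy every .weights file from backup_dir into dest_dir; return the copied names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for weights in sorted(Path(backup_dir).glob("*.weights")):
        shutil.copy2(weights, dest / weights.name)
        copied.append(weights.name)
    return copied


# In Colab this might be: save_weights("backup", "/content/drive/MyDrive/yolov4")
# Demo with temporary directories standing in for ./backup/ and Drive.
with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
    (Path(src) / "example_last.weights").write_bytes(b"placeholder")
    print(save_weights(src, dst))
```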

Conclusion to Training YOLOv4 on Custom Data

In this post, we have walked through training YOLOv4 on your custom object detection task. We have covered the following steps to go from zero to 100 with YOLOv4:

  • Configure our GPU environment on Google Colab
  • Install the Darknet YOLO v4 training environment
  • Download our custom dataset for YOLO v4 and set up directories
  • Configure a custom YOLO v4 training config file for Darknet
  • Train our custom YOLO v4 object detector
  • Reload YOLO v4 trained weights and make inference on test images

Please enjoy deploying the state of the art for detecting your custom objects 🚀

Stay tuned for future tutorials such as a YOLO v4 tutorial in PyTorch, YOLO v4 tutorial in TensorFlow, YOLO v4 tutorial in Keras, and comparing YOLO v4 to EfficientDet for object detection.

Next Steps

To get even more out of the YOLOv4 repository, we have written this guide on advanced tactics in YOLOv4. I highly recommend checking that out after you've trained YOLOv4.
