The YOLO family of object detection models grows ever stronger with the introduction of YOLOv5 by Ultralytics. In this post, we will walk through how you can train YOLOv5 to recognize your custom objects for your custom use case.

Man riding a bicycle past a car in a driveway with an umbrella in the background.
YOLOv5 inferencing live on video with COCO weights - let's see how to move to custom YOLOv5 weights!

Many thanks to Ultralytics for putting this repository together - we believe that in combination with clean data management tools, this technology will become easily accessible to any developer wishing to integrate computer vision into their projects.

We use a public blood cell detection dataset, which you can export yourself. You can also use this tutorial on your own custom data.

To train our detector we take the following steps:

  • Install YOLOv5 dependencies
  • Download Custom YOLOv5 Object Detection Data
  • Define YOLOv5 Model Configuration and Architecture
  • Train a custom YOLOv5 Detector
  • Evaluate YOLOv5 performance
  • Visualize YOLOv5 training data
  • Run YOLOv5 Inference on test images
  • Export Saved YOLOv5 Weights for Future Inference

YOLOv5: What's New?

Only two months ago, we were very excited about the introduction of EfficientDet by Google Brain and wrote some blog posts breaking down EfficientDet. We thought this model might eclipse the YOLO family for prominence in the real-time object detection space - we were wrong.

Within three weeks, YOLOv4 was released in the Darknet framework and we wrote some more on breaking down the research in YOLOv4.

Then, a few hours before the writing of this, YOLOv5 was released, and we have found it to be extremely sleek. YOLOv5 is written in the Ultralytics PyTorch framework, which is very intuitive to use and inferences very quickly. In fact, we and many others would often translate YOLOv3 and YOLOv4 Darknet weights to the Ultralytics PyTorch weights in order to inference faster with a lighter library.

Is YOLOv5 more performant than YOLOv4? We'll have more to say about this soon, but we have early guesses on YOLOv5 vs YOLOv4.

Performance of YOLOv5 vs EfficientDet (updated 6/23) (source)

YOLOv4 is notably left out of the evaluation on the YOLOv5 repository. That said, YOLOv5 is certainly easier to use and it is very performant on custom data based on our initial runs.

Scaled-YOLOv4 released

Check out the latest modeling in Scaled-YOLOv4. Scaled-YOLOv4 tops EfficientDet across the curve.

On to training...

We recommend following along concurrently in this YOLOv5 Colab Notebook.

Installing the YOLOv5 Environment

To start off with YOLOv5, we first clone the YOLOv5 repository and install its dependencies. This sets up our programming environment to run object detection training and inference commands.

!git clone https://github.com/ultralytics/yolov5  # clone repo
!pip install -U -r yolov5/requirements.txt  # install dependencies

%cd /content/yolov5

Then, we can take a look at the training environment provided to us for free by Google Colab.

import torch
from IPython.display import Image  # for displaying images
from utils.google_utils import gdrive_download  # for downloading models/datasets

print('torch %s %s' % (torch.__version__, torch.cuda.get_device_properties(0) if torch.cuda.is_available() else 'CPU'))

It is likely that you will receive a Tesla P100 GPU from Google Colab. Here is what I received:

torch 1.5.0+cu101 _CudaDeviceProperties(name='Tesla P100-PCIE-16GB', major=6, minor=0, total_memory=16280MB, multi_processor_count=56)

The GPU will allow us to accelerate training time. Colab is also nice in that it comes preinstalled with torch and CUDA. If you are attempting this tutorial on a local machine, there may be additional steps to take to set up YOLOv5.

Download Custom YOLOv5 Object Detection Data

In this tutorial we will download custom object detection data in YOLOv5 format from Roboflow. In the tutorial, we train YOLOv5 to detect cells in the blood stream with a public blood cell detection dataset. You can follow along with the public blood cell dataset or upload your own dataset.

Quick Note on Labeling Tools

If you have unlabeled images, you will first need to label them. For free open source labeling tools, we recommend the following guides on getting started with LabelImg or getting started with CVAT annotation tools. Try labeling ~50 images to proceed in this tutorial. To improve your model's performance later, you will want to label more.

Once you have labeled data, create a free Roboflow account to move your data into Roboflow, and then you can drag your dataset in using any format (VOC XML, COCO JSON, TensorFlow Object Detection CSV, etc.).

Once uploaded you can choose preprocessing and augmentation steps:

Roboflow Screenshot: BCCD Preprocessing Options
The settings chosen for the BCCD example dataset

Then, click Generate and Download and you will be able to choose YOLOv5 PyTorch format.

Roboflow Screenshot: Download Dialog (COCO, CreateML, Pascal VOC, YOLO Darknet, YOLO v3, Tensorflow, TFRecord)
Select "YOLO v5 PyTorch"

When prompted, be sure to select "Show Code Snippet." This will output a download curl script so you can easily port your data into Colab in the proper format.

curl -L "https://public.roboflow.ai/ds/YOUR-LINK-HERE" > roboflow.zip; unzip roboflow.zip; rm roboflow.zip

Downloading in Colab...

Jupyter Screenshot: Downloading a dataset from Roboflow
Downloading a custom object dataset in YOLOv5 format

The export creates a YOLOv5 .yaml file called data.yaml specifying the location of a YOLOv5 images folder, a YOLOv5 labels folder, and information on our custom classes.
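
For reference, here is roughly what that data.yaml looks like for the BCCD export (a sketch: the exact folder paths depend on where your download unzips, and the class order may differ in your export, but these three names are the dataset's real classes):

train: ../train/images
val: ../valid/images

nc: 3
names: ['Platelets', 'RBC', 'WBC']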

Define YOLOv5 Model Configuration and Architecture

Next we write a model configuration file for our custom object detector. For this tutorial, we chose the smallest, fastest base model of YOLOv5. You have the option to pick from other YOLOv5 models including:

  • YOLOv5s
  • YOLOv5m
  • YOLOv5l
  • YOLOv5x

You can also edit the structure of the network in this step, though rarely will you need to do this. Here is the YOLOv5 model configuration file, which we term custom_yolov5s.yaml:

nc: 3  # number of classes in our dataset
depth_multiple: 0.33  # scales model depth (number of layer repetitions)
width_multiple: 0.50  # scales layer channel widths

anchors:
  - [10,13, 16,30, 33,23]  # anchors for small objects
  - [30,61, 62,45, 59,119]  # medium objects
  - [116,90, 156,198, 373,326]  # large objects

backbone:
  [[-1, 1, Focus, [64, 3]],
   [-1, 1, Conv, [128, 3, 2]],
   [-1, 3, Bottleneck, [128]],
   [-1, 1, Conv, [256, 3, 2]],
   [-1, 9, BottleneckCSP, [256]],
   [-1, 1, Conv, [512, 3, 2]], 
   [-1, 9, BottleneckCSP, [512]],
   [-1, 1, Conv, [1024, 3, 2]],
   [-1, 1, SPP, [1024, [5, 9, 13]]],
   [-1, 6, BottleneckCSP, [1024]],
  ]

head:
  [[-1, 3, BottleneckCSP, [1024, False]],
   [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]],
   [-2, 1, nn.Upsample, [None, 2, "nearest"]],
   [[-1, 6], 1, Concat, [1]],
   [-1, 1, Conv, [512, 1, 1]],
   [-1, 3, BottleneckCSP, [512, False]],
   [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]],
   [-2, 1, nn.Upsample, [None, 2, "nearest"]],
   [[-1, 4], 1, Concat, [1]],
   [-1, 1, Conv, [256, 1, 1]],
   [-1, 3, BottleneckCSP, [256, False]],
   [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]],

   [[], 1, Detect, [nc, anchors]],
  ]

Train a Custom YOLOv5 Detector

With our data.yaml and custom_yolov5s.yaml files ready to go, we are ready to train!

To kick off training, we run the training command with the following options:

  • img: define input image size
  • batch: determine batch size
  • epochs: define the number of training epochs. (Note: 3000+ epochs are common here!)
  • data: set the path to our yaml file
  • cfg: specify our model configuration
  • weights: specify a custom path to weights. (Note: you can download weights from the Ultralytics Google Drive folder)
  • name: result names
  • nosave: only save the final checkpoint
  • cache: cache images for faster training

And run the training command:
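
Here is a representative invocation (a sketch, not necessarily the exact command from our run: the paths assume data.yaml unzipped one level above the yolov5/ directory and the custom config saved to models/; tune batch and epochs to your dataset):

!python train.py --img 416 --batch 16 --epochs 100 --data ../data.yaml --cfg ./models/custom_yolov5s.yaml --weights '' --name yolov5s_results --cache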

Training the custom YOLOv5 Detector. It trains quickly!

During training, you want to be watching the mAP@0.5 to see how your detector is performing - see this post on breaking down mAP.

Evaluate Custom YOLOv5 Detector Performance

Now that we have completed training, we can evaluate how well the training procedure performed by looking at the validation metrics. The training script drops TensorBoard logs in runs/. We visualize those here:
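
In Colab, one way to bring those logs up inline (assuming the default runs/ log directory):

%load_ext tensorboard
%tensorboard --logdir runs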

Tensorboard Screenshot: Training results.
Visualizing tensorboard results on our custom dataset

And if you can't visualize TensorBoard for whatever reason, the results can also be plotted with utils.plot_results, which saves a results.png.
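
A minimal sketch of that fallback, assuming plot_results lives in utils.utils as it did in the repo at the time of writing:

from utils.utils import plot_results  # plots the metrics logged to results.txt
plot_results()  # saves results.png in the working directory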

Neural network statistics (GIoU, Objectness, Classification, Precision, Recall)

I stopped training a little early here. You want to take the trained model weights at the point where the validation mAP reaches its highest.

Visualize YOLOv5 training data

During training, the YOLOv5 training pipeline creates batches of training data with augmentations. We can visualize the training data ground truth as well as the augmented training data.
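
In the notebook, the batch mosaics the training script writes to disk can be displayed directly (the filenames below are assumptions based on our run; adjust them to match yours):

from IPython.display import Image, display

display(Image(filename='/content/yolov5/test_batch0_gt.jpg', width=900))  # ground truth labels drawn on a batch
display(Image(filename='/content/yolov5/train_batch0.jpg', width=900))  # mosaic-augmented training batch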

Ground truth BCCD object detection data.
Our training data ground truth
Predicted values for BCCD dataset from trained machine learning model.
Our training data with automatic YOLOv5 augmentations

Run YOLOv5 Inference on Test Images

Now we take our trained model and run inference on test images. After training completes, the model weights are saved in weights/. For inference, we invoke those weights along with a conf threshold specifying model confidence (requiring higher confidence yields fewer predictions) and an inference source. The source can accept a directory of images, individual images, video files, and a device's webcam port. For source, I have moved our test/*jpg to test_infer/.

!python detect.py --weights weights/last_yolov5s_custom.pt --img 416 --conf 0.4 --source ../test_infer

The inference time is extremely fast. On our Tesla P100, YOLOv5s hits 142 FPS!

Inference on YOLOv5s occurring at 142 FPS (.007s/image)

Finally, we visualize our detector's inferences on test images.
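
A small loop renders them inline (a sketch; inference/output is where detect.py wrote its annotated images in our run):

import glob
from IPython.display import Image, display

for image_path in glob.glob('/content/yolov5/inference/output/*.jpg'):  # annotated images saved by detect.py
    display(Image(filename=image_path))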

YOLOv5 inference on BCCD (RBC, WBC, Platelets)
YOLOv5 inference on test images

Export Saved YOLOv5 Weights for Future Inference

Now that our custom YOLOv5 object detector has been verified, we might want to take the weights out of Colab for use on a live computer vision task. To do so we import a Google Drive module and send them out.

from google.colab import drive
drive.mount('/content/gdrive')  # authorize and mount Google Drive

%cp /content/yolov5/weights/last_yolov5s_custom.pt /content/gdrive/My\ Drive  # copy the trained weights to Drive
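
Later, those saved weights can be pulled straight back into detect.py from Drive (a sketch, assuming the same filename and a freshly mounted Drive):

!python detect.py --weights "/content/gdrive/My Drive/last_yolov5s_custom.pt" --img 416 --conf 0.4 --source ../test_infer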

Conclusion

We hope you enjoyed training your custom YOLOv5 detector!

YOLOv5 is lightweight and extremely easy to use. YOLOv5 trains quickly, inferences quickly, and performs well.

Let's get it out there!