The YOLO family of object detection models grows ever stronger with the introduction of YOLOv5 by Ultralytics. In this post, we will walk through how you can train YOLOv5 to recognize your custom objects for your custom use case.
Many thanks to Ultralytics for putting this repository together - we believe that in combination with clean data management tools, this technology will become easily accessible to any developer wishing to incorporate computer vision into their projects.
We use a public blood cell detection dataset, which you can export yourself. You can also use this tutorial on your own custom data.
To train our detector we take the following steps:
- Install YOLOv5 dependencies
- Download Custom YOLOv5 Object Detection Data
- Define YOLOv5 Model Configuration and Architecture
- Train a custom YOLOv5 Detector
- Evaluate YOLOv5 performance
- Visualize YOLOv5 training data
- Run YOLOv5 Inference on test images
- Export Saved YOLOv5 Weights for Future Inference
YOLOv5: What's New?
Only two months ago, we were very excited about the introduction of EfficientDet by Google Brain and wrote some blog posts breaking down EfficientDet. We thought this model might eclipse the YOLO family for prominence in the realtime object detection space - we were wrong.
Then, just a few hours before this writing, YOLOv5 was released and we have found it to be extremely sleek. YOLOv5 is written in the Ultralytics PyTorch framework, which is very intuitive to use and inferences very fast. In fact, we and many others would often translate YOLOv3 and YOLOv4 Darknet weights to the Ultralytics PyTorch weights in order to inference faster with a lighter library.
Is YOLOv5 more performant than YOLOv4? We'll have more to say about this soon, but we have early guesses on YOLOv5 vs YOLOv4.
YOLOv4 is notably left out of the evaluation on the YOLOv5 repository. That said, YOLOv5 is certainly easier to use and it is very performant on custom data based on our initial runs.
On to training... We recommend following along concurrently in this YOLOv5 Colab Notebook.
Installing the YOLOv5 Environment
To start off with YOLOv5, we first clone the YOLOv5 repository and install dependencies. This will set up our programming environment to be ready to run object detection training and inference commands.
!git clone https://github.com/ultralytics/yolov5  # clone repo
!pip install -U -r yolov5/requirements.txt  # install dependencies
%cd /content/yolov5
Then, we can take a look at our training environment provided to us for free from Google Colab.
import torch
from IPython.display import Image  # for displaying images
from utils.google_utils import gdrive_download  # for downloading models/datasets
print('torch %s %s' % (torch.__version__, torch.cuda.get_device_properties(0) if torch.cuda.is_available() else 'CPU'))
It is likely that you will receive a Tesla P100 GPU from Google Colab. Here is what I received:
torch 1.5.0+cu101 _CudaDeviceProperties(name='Tesla P100-PCIE-16GB', major=6, minor=0, total_memory=16280MB, multi_processor_count=56)
The GPU will allow us to accelerate training time. Colab is also nice in that it comes preinstalled with CUDA. If you are attempting this tutorial locally, there may be additional steps to take to set up YOLOv5.
Download Custom YOLOv5 Object Detection Data
In this tutorial we will download custom object detection data in YOLOv5 format from Roboflow. In the tutorial, we train YOLOv5 to detect cells in the blood stream with a public blood cell detection dataset. You can follow along with the public blood cell dataset or upload your own dataset.
Using Your Own Data
To export your own data for this tutorial, sign up for Roboflow and make a public workspace, or make a new public workspace in your existing account. If your data is private, you can upgrade to a paid plan for export to use external training routines like this one or experiment with using Roboflow's internal training solution.
Quick Note on Labeling Tools
If you have unlabeled images, you will first need to label them. For free open source labeling tools, we recommend Roboflow Annotate or the following guides on getting started with LabelImg or getting started with CVAT annotation tools. Try labeling ~50 images to proceed in this tutorial. To improve your model's performance later, you will want to label more.
Once uploaded you can choose preprocessing and augmentation steps:
Download and you will be able to choose YOLOv5 PyTorch format.
When prompted, be sure to select "Show Code Snippet." This will output a download curl script so you can easily port your data into Colab in the proper format.
curl -L "https://public.roboflow.ai/ds/YOUR-LINK-HERE" > roboflow.zip; unzip roboflow.zip; rm roboflow.zip
Note: you can now also download your data with the Roboflow pip package.
# from roboflow import Roboflow
# rf = Roboflow(api_key="YOUR API KEY HERE")
# project = rf.workspace().project("YOUR PROJECT")
# dataset = project.version("YOUR VERSION").download("yolov5")
Downloading in Colab...
The export creates a YOLOv5 .yaml file called data.yaml specifying the location of a YOLOv5 images folder, a YOLOv5 labels folder, and information on our custom classes.
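For the blood cell dataset, the exported data.yaml looks roughly like this (the exact paths depend on where you unzip the export):

train: ../train/images
val: ../valid/images

nc: 3
names: ['Platelets', 'RBC', 'WBC']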
Define YOLOv5 Model Configuration and Architecture
Next we write a model configuration file for our custom object detector. For this tutorial, we chose the smallest, fastest base model of YOLOv5, YOLOv5s. You have the option to pick from other YOLOv5 models, including YOLOv5m, YOLOv5l, and YOLOv5x.
You can also edit the structure of the network in this step, though rarely will you need to do this. Here is the YOLOv5 model configuration file, which we term custom_yolov5s.yaml:
nc: 3
depth_multiple: 0.33
width_multiple: 0.50

anchors:
  - [10,13, 16,30, 33,23]
  - [30,61, 62,45, 59,119]
  - [116,90, 156,198, 373,326]

backbone:
  # [from, number, module, args]
  [[-1, 1, Focus, [64, 3]],
   [-1, 1, Conv, [128, 3, 2]],
   [-1, 3, Bottleneck, [128]],
   [-1, 1, Conv, [256, 3, 2]],
   [-1, 9, BottleneckCSP, [256]],
   [-1, 1, Conv, [512, 3, 2]],
   [-1, 9, BottleneckCSP, [512]],
   [-1, 1, Conv, [1024, 3, 2]],
   [-1, 1, SPP, [1024, [5, 9, 13]]],
   [-1, 6, BottleneckCSP, [1024]],
  ]

head:
  [[-1, 3, BottleneckCSP, [1024, False]],
   [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]],  # large objects
   [-2, 1, nn.Upsample, [None, 2, "nearest"]],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 1, Conv, [512, 1, 1]],
   [-1, 3, BottleneckCSP, [512, False]],
   [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]],  # medium objects
   [-2, 1, nn.Upsample, [None, 2, "nearest"]],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 1, Conv, [256, 1, 1]],
   [-1, 3, BottleneckCSP, [256, False]],
   [-1, 1, nn.Conv2d, [na * (nc + 5), 1, 1, 0]],  # small objects
   [[], 1, Detect, [nc, anchors]],
  ]
Training Custom YOLOv5 Detector
With our data.yaml and custom_yolov5s.yaml files ready to go, we are ready to train!
To kick off training, we run the training command with the following options:
- img: define input image size
- batch: determine batch size
- epochs: define the number of training epochs. (Note: 3000+ epochs are common here!)
- data: set the path to our yaml file
- cfg: specify our model configuration
- weights: specify a custom path to weights. (Note: you can download weights from the Ultralytics Google Drive folder)
- name: result names
- nosave: only save the final checkpoint
- cache: cache images for faster training
And run the training command:
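For example, a run along these lines (batch size, epoch count, and paths here are illustrative; adjust them to your dataset):

!python train.py --img 416 --batch 16 --epochs 100 --data '../data.yaml' --cfg ./models/custom_yolov5s.yaml --weights '' --name yolov5s_results --cache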
During training, you want to be watching the mAP@0.5 to see how your detector is performing - see this post on breaking down mAP.
Evaluate Custom YOLOv5 Detector Performance
Now that we have completed training, we can evaluate how well the training procedure performed by looking at the validation metrics. The training script will drop TensorBoard logs in runs. We visualize those here:
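In Colab, a minimal sketch for that (assuming the default runs/ log location) is:

# start TensorBoard inside the notebook, pointed at the training logs
%load_ext tensorboard
%tensorboard --logdir runs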
And if you can't visualize TensorBoard for whatever reason, the results can also be plotted with utils.plot_results, which saves a results.png.
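A minimal sketch (the utils import path has moved between YOLOv5 versions, so adjust it if needed):

from utils.utils import plot_results  # import path may differ by YOLOv5 version
plot_results()  # plots training metrics from results.txt and saves results.png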
I stopped training a little early here. You want to take the trained model weights at the point where the validation mAP reaches its highest.
Visualize YOLOv5 training data
During training, the YOLOv5 training pipeline creates batches of training data with augmentations. We can visualize the training data ground truth as well as the augmented training data.
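For example, we can display the batch mosaics the training script writes out (the filenames here assume the defaults of early YOLOv5 releases):

from IPython.display import Image, display

print("GROUND TRUTH TRAINING DATA:")
display(Image(filename='train_batch0.jpg', width=900))  # ground truth batch written by train.py

print("AUGMENTED TRAINING DATA:")
display(Image(filename='train_batch1.jpg', width=900))  # batch with augmentations applied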
Run YOLOv5 Inference on Test Images
Now we take our trained model and run inference on test images. After training has completed, model weights will save in weights/. For inference, we invoke those weights along with a conf flag specifying model confidence (requiring higher confidence yields fewer predictions) and an inference source. The source can accept a directory of images, individual images, video files, and a device's webcam port. For source, I have moved our test set images into a folder called test_infer.
!python detect.py --weights weights/last_yolov5s_custom.pt --img 416 --conf 0.4 --source ../test_infer
The inference time is extremely fast. On our Tesla P100, the YOLOv5s is hitting 142 FPS!!
Finally, we visualize our detector's inferences on test images.
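A minimal sketch for displaying them in the notebook (early YOLOv5 releases saved annotated images to inference/output; newer versions use runs/detect):

import glob
from IPython.display import Image, display

# display each annotated image produced by detect.py
for image_path in glob.glob('inference/output/*.jpg'):
    display(Image(filename=image_path))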
Export Saved YOLOv5 Weights for Future Inference
Now that our custom YOLOv5 object detector has been verified, we might want to take the weights out of Colab for use on a live computer vision task. To do so we import a Google Drive module and send them out.
from google.colab import drive
drive.mount('/content/gdrive')
%cp /content/yolov5/weights/last_yolov5s_custom.pt /content/gdrive/My\ Drive
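Later, you can point detect.py at the exported weights to run inference again; a sketch, assuming the Drive path used above:

# run inference with the exported weights (path assumes the Drive copy above)
!python detect.py --weights "/content/gdrive/My Drive/last_yolov5s_custom.pt" --img 416 --conf 0.4 --source ../test_infer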
We hope you enjoyed training your custom YOLOv5 detector!
YOLOv5 is lightweight and extremely easy to use. YOLOv5 trains quickly, inferences quickly, and performs well.
Let's get it out there!