A guide on how to label your own computer vision dataset using Microsoft VoTT.
In order to train computer vision models, we need to provide our models with supervision in the form of labeled data. As we show more and more labeled data to our model, the model begins to learn the underlying patterns in our labeling decisions. After the training process is complete, we can deploy our object detection model for automatic inference.
Large-scale labeling solutions exist, but they are costly. If you are starting a new computer vision project, you might prefer a "do it yourself" (DIY) labeling approach to assemble the first version of your dataset. As computer vision models get better and better, it may take as few as 10-50 images to get the first version of your model off the ground.
Gather Training Images
Before starting your labeling job, you must first gather a corpus of unlabeled images for your dataset. We recommend narrowing the domain of your dataset as much as possible to ensure successful modeling results. That is, try to control all of the environmental factors you can control to make the coming task easier on your model.
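Before opening the labeling tool, it can help to confirm what is actually in your image folder. The following is a minimal sketch, not part of VoTT: the function name summarize_images and the list of extensions are assumptions made for illustration, so adjust them to the formats you actually collected.

```python
from collections import Counter
from pathlib import Path

# Assumption: these are the image formats in your corpus; edit as needed.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def summarize_images(folder):
    """Count image files directly inside `folder`, grouped by extension."""
    counts = Counter()
    for path in Path(folder).iterdir():
        suffix = path.suffix.lower()  # treat .JPG and .jpg the same
        if path.is_file() and suffix in IMAGE_EXTENSIONS:
            counts[suffix] += 1
    return counts


if __name__ == "__main__":
    counts = summarize_images(".")  # scan the current directory
    print(f"{sum(counts.values())} images found: {dict(counts)}")
```

A quick count like this makes it easy to notice an empty folder or stray non-image files before you point VoTT at the directory.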
Microsoft VoTT supports importing images from your local drive, and naturally Bing Image Search and Azure Blob Storage. In this tutorial, we will import our dataset from a local drive.
Installing VoTT Software
Once you have an unlabeled corpus of images, you are ready to install the VoTT labeling software.
Standing Up VoTT Locally
If you have your data in Azure Blob Storage or you are using Bing Image Search, you can go ahead and use VoTT directly through their website.
If you have your images saved to your local drive, it will be easier to start VoTT on your local machine.
Download VoTT Installer
The easiest way to install VoTT locally is by using the installation packages from each release. Installation packages are listed for VoTT on Mac OSX, VoTT on Linux, and VoTT on Windows.
Navigate to the Assets box, and download the file you need for your operating system.
I am installing VoTT on Mac OSX, so I will drag VoTT over to my Applications folder.
All set ✅
Optional: Compile VoTT from Source
If you want to make tweaks to VoTT, you may want to compile and run the VoTT tool from source.
To start the VoTT tool from source, run the following commands in the directory of your choosing (VoTT will be downloaded to your local machine):
git clone https://github.com/Microsoft/VoTT.git
cd VoTT
npm ci
npm start
You will see a lot of strange printouts as npm is setting up the project.
Again, if you're just getting started, I recommend using the VoTT install packages rather than building from source.
Starting a Project in VoTT
Once you have started VoTT in the location of your choice (locally installed, built from source, or cloud server), you can go ahead and start your labeling project.
Click New Project and fill in the relevant fields:
Source Connection: map this to the folder on your drive that contains the raw image dataset.
Once you have kicked off your project, you will see your images in the tool, ready for labeling.
VoTT Labeling Shortcuts and Tricks
In order to label your dataset quickly, you will want to leverage the shortcuts available in VoTT.
You can draw a box just by clicking and dragging. You can also start a box by typing capital R on the keyboard.
Your class labels will be hot-keyed, so you can just hit the number hotkey to automatically match the box to the correct class.
You can move through images by using the arrow keys.
Ctrl or Cmd + S saves your progress.
Labeling Best Practices
When you are labeling images in VoTT, keep these best practices in mind. Ultimately, you are thinking downstream for your modeling task. Any errant annotations or ambiguities should be resolved through your labeling process.
In general, the following practices should be followed:
1) Label around the entire object
2) Keep bounding boxes tight to the object
3) Label occluded objects by drawing a box around the whole object
4) Label objects that are partially out of frame
5) Beware of choosing class labels that often overlap
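Some of these practices can be spot-checked in code after labeling. The sketch below is an illustration under assumptions, not a VoTT feature: boxes are taken to be (xmin, ymin, xmax, ymax) pixel tuples, and the helper names and the 0.8 overlap threshold are hypothetical. It clips boxes that run past the image edge and flags pairs of differently labeled boxes that overlap heavily, which can hint at ambiguous class definitions.

```python
def clip_box(box, width, height):
    """Clamp a box to the image bounds (for objects partially out of frame)."""
    xmin, ymin, xmax, ymax = box
    return (max(0, xmin), max(0, ymin), min(width, xmax), min(height, ymax))


def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def find_overlapping_classes(labels, threshold=0.8):
    """Return class pairs whose boxes overlap with IoU >= threshold.

    `labels` is a list of (class_name, box) tuples for one image.
    The threshold is an arbitrary starting point, not a recommended value.
    """
    flagged = []
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            (class_a, box_a), (class_b, box_b) = labels[i], labels[j]
            if class_a != class_b and iou(box_a, box_b) >= threshold:
                flagged.append((class_a, class_b))
    return flagged
```

If the same pair of classes is flagged across many images, it may be worth merging the classes or tightening their definitions before training.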
Exporting Data from VoTT
Once your dataset is fully labeled, you can hit Ctrl or Cmd + S to save your progress.
Navigate to the export button on the left side of the tool.
Choose the annotation format you would like to export in under Provider. We recommend outputting Pascal VOC and then loading into Roboflow for dataset conversion to any annotation format. Each model uses a specific object detection annotation format, so you will likely need to convert your VOC XML files to another format.
Save Export Settings.
Finally, to export your dataset, click the export button in the top pane. The dataset will export to the location you provided when you set up the dataset.
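To sanity-check an export, you can read an annotation file back in. The sketch below assumes the exported XML follows the standard Pascal VOC layout (a filename element, a size element with width and height, and one object element per box with name and bndbox children). The sample annotation and the function name parse_voc are made up for illustration, and your exported files may carry additional fields.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in the standard Pascal VOC layout.
SAMPLE_XML = """<annotation>
  <filename>image_001.jpg</filename>
  <size><width>400</width><height>200</height><depth>3</depth></size>
  <object>
    <name>cat</name>
    <bndbox><xmin>100</xmin><ymin>50</ymin><xmax>300</xmax><ymax>150</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc(xml_text):
    """Parse one Pascal VOC annotation into a plain dictionary."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    objects = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        # float() tolerates both integer and fractional pixel coordinates
        box = tuple(
            float(bndbox.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        objects.append((obj.findtext("name"), box))
    return {
        "filename": root.findtext("filename"),
        "width": int(size.findtext("width")),
        "height": int(size.findtext("height")),
        "objects": objects,
    }


if __name__ == "__main__":
    print(parse_voc(SAMPLE_XML))
```

For a file on disk, you could pass the result of reading the file as text into parse_voc.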
Next Steps after Labeling Your Dataset in VoTT
Once you have labeled your dataset in VoTT, we recommend uploading your dataset in Pascal VOC format to Roboflow. From there, you can check the health of your computer vision dataset, manage class labels, export to any dataset format, and use state of the art computer vision models.
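To make "converting between annotation formats" concrete, here is a small sketch of what one such conversion involves. It assumes a Pascal VOC box given as (xmin, ymin, xmax, ymax) in pixels and converts it to the normalized (center x, center y, width, height) form used by YOLO-style text annotations. The function name voc_to_yolo is hypothetical, and this is only the coordinate arithmetic, not a full converter.

```python
def voc_to_yolo(box, img_w, img_h):
    """Convert a pixel (xmin, ymin, xmax, ymax) box to normalized
    (center_x, center_y, width, height), each in the range 0 to 1."""
    xmin, ymin, xmax, ymax = box
    center_x = (xmin + xmax) / 2 / img_w
    center_y = (ymin + ymax) / 2 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return (center_x, center_y, width, height)


if __name__ == "__main__":
    # A 200x100 box centered in a 400x200 image
    print(voc_to_yolo((100, 50, 300, 150), 400, 200))  # (0.5, 0.5, 0.5, 0.5)
```

A real conversion also has to map class names to the integer IDs the target format expects and write one file per image, which is the bookkeeping a dataset tool handles for you.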
With Roboflow, you can generate artificial training data so you can spend less time collecting and labeling, and more time training and deploying your computer vision model. Data augmentation strategies in Roboflow include flipping images, random cropping, creating synthetic computer vision data, and much, much more.
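One detail worth knowing about augmentation for object detection is that the labels have to be transformed along with the pixels. As a minimal sketch (not Roboflow's implementation; the function name hflip_box is hypothetical), here is how a (xmin, ymin, xmax, ymax) box changes when its image is flipped horizontally:

```python
def hflip_box(box, img_w):
    """Mirror a (xmin, ymin, xmax, ymax) box across the vertical center line
    of an image that is `img_w` pixels wide."""
    xmin, ymin, xmax, ymax = box
    # The old right edge becomes the new left edge, and vice versa.
    return (img_w - xmax, ymin, img_w - xmin, ymax)


if __name__ == "__main__":
    box = (10, 20, 110, 80)
    flipped = hflip_box(box, 400)
    print(flipped)                  # (290, 20, 390, 80)
    print(hflip_box(flipped, 400))  # flipping twice restores (10, 20, 110, 80)
```

The vertical coordinates are unchanged, and the box keeps its width, which is a handy check when verifying augmented labels.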