DIY labeling with CVAT
CVAT, the Computer Vision Annotation Tool, is a free, open source image annotator from the OpenCV project. Its easy-to-use interface simplifies labeling computer vision datasets.
In this post, we focus on using CVAT to create object detection annotations on images, although it has many more capabilities, including video annotation, semantic segmentation, and polygon annotation.
CVAT is one of a group of similar DIY labeling tools, which also includes LabelImg. We recommend labeling a batch of images yourself (50+) and training a state-of-the-art model like YOLOv4 to see whether your computer vision task is already solved by current technology.
I will show the steps I used to annotate the public aerial maritime object detection dataset, which was captured from a drone. Although a specific dataset is used, this post is meant as a general guide to labeling an object detection dataset and using labeling tools for object detection. Feel free to use another similar aerial imagery dataset.
CVAT Quickstart
If this is your first time using CVAT, start by launching the CVAT website, which is the quickest way to begin labeling your data.
Once you are on the CVAT website, you will see a page like this:

Launch New CVAT Task
From there, you can launch a new task in CVAT and drag your images in for labeling. You are also prompted to specify the class labels of the objects that you would like to detect. Carefully specify these.
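If you have many classes, it can help to prepare the label list ahead of time rather than typing each one into the form. The sketch below assumes that your CVAT version offers a raw label editor that accepts a JSON list of objects with "name" and "attributes" fields; check this against your own instance before relying on it. The class names are hypothetical placeholders.

```python
import json

# Hypothetical class names; replace them with the objects in your own dataset.
class_names = ["boat", "dock", "car"]


def build_label_spec(names):
    """Return one label entry per class, each with no extra attributes.

    The {"name": ..., "attributes": []} shape is an assumption about what
    CVAT's raw label editor accepts; verify it against your CVAT version.
    """
    return [{"name": name, "attributes": []} for name in names]


label_spec = build_label_spec(class_names)

# Print the JSON so it can be copied into the label editor.
print(json.dumps(label_spec, indent=2))
```

Keeping the class list in a script like this also gives you a single place to record the exact spelling of each label, which matters later when annotations are exported.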
Once your data is uploaded, navigate back to tasks. From there, you will see a task page.

Enter CVAT Labeling Job
CVAT automatically sets up a labeling job when you create the task, and you can create additional jobs to annotate the dataset. Note the task/job hierarchy: jobs belong to a task.
Now you can click into your labeling task and get to work!
When you're in the labeling screen you will see the following.

Drawing Annotations in CVAT
Click "Create Shape" and draw a box around the object you want your detector to detect. On the right-hand side, you will then see the box you just drew, shown with its color. There you can choose among the class labels that you provided when setting up the task.
Exporting Annotations From CVAT
You first want to click "Save". CVAT does not automatically save work.
Then click "Menu". In CVAT you will see the following options:

Then click "Export task dataset" and choose among the available formats: Pascal VOC XML, COCO JSON, YOLO, etc.
Congrats! Now you have a labeled dataset.
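To make the difference between export formats more concrete, here is a minimal sketch of how one bounding box is written in the YOLO text format: a class index followed by the box center, width, and height, each normalized by the image size. Pascal VOC, by contrast, stores pixel corner coordinates. The function names and the sample box are illustrative, not part of CVAT.

```python
def to_yolo(box, image_width, image_height):
    """Convert a pixel box (xmin, ymin, xmax, ymax) to YOLO's
    normalized (x_center, y_center, width, height)."""
    xmin, ymin, xmax, ymax = box
    x_center = (xmin + xmax) / 2 / image_width
    y_center = (ymin + ymax) / 2 / image_height
    width = (xmax - xmin) / image_width
    height = (ymax - ymin) / image_height
    return x_center, y_center, width, height


def yolo_line(class_id, box, image_width, image_height):
    """Format one annotation as a line of a YOLO label file."""
    values = to_yolo(box, image_width, image_height)
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in values)


# Illustrative box in a 1000x800 image, assigned to class index 0.
line = yolo_line(0, (100, 200, 300, 400), 1000, 800)
print(line)  # 0 0.200000 0.375000 0.200000 0.250000
```

If a model reports boxes in the wrong place after training, a mismatch between corner-based and center-based coordinates is one of the first things worth checking.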
CVAT on Local for Serious CVAT
If you are serious about CVAT, you can configure it on local. The CVAT website has these limitations:
- No more than 10 tasks per user
- Uploaded data is limited to 500 MB
On local you will not be subject to these limitations because your machine will be doing the heavy lifting.
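Before deciding between the hosted site and a local install, you may want to check how large your image folder actually is. The sketch below sums the file sizes under a folder and compares the total to the upload limit mentioned above. It assumes the limit means 500 × 1024 × 1024 bytes, which is a guess about how the limit is counted, and it uses a temporary folder with one dummy file purely for demonstration.

```python
import tempfile
from pathlib import Path

# Assumption: the hosted limit is interpreted as 500 * 1024 * 1024 bytes.
HOSTED_UPLOAD_LIMIT_BYTES = 500 * 1024 * 1024


def folder_size_bytes(folder):
    """Total size in bytes of all files under a folder, including subfolders."""
    return sum(p.stat().st_size for p in Path(folder).rglob("*") if p.is_file())


def fits_hosted_limit(folder, limit=HOSTED_UPLOAD_LIMIT_BYTES):
    """Return True if the folder's total size is within the given limit."""
    return folder_size_bytes(folder) <= limit


# Demonstration on a temporary folder holding one 2 KiB dummy file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "image_001.jpg").write_bytes(b"\0" * 2048)
    demo_size = folder_size_bytes(tmp)
    demo_fits = fits_hosted_limit(tmp)

print(demo_size, demo_fits)
```

To use it on real data, pass the path of your own image folder to fits_hosted_limit.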
To launch CVAT on local, first clone the CVAT repository in your terminal window.
git clone https://github.com/opencv/cvat.git
cd cvat
Then, if you don't have Docker, install it. Verify that Docker is successfully installed:
docker version
Now we build CVAT locally and launch it with:
docker-compose build
docker-compose up -d
This will take a while to run, because it is building CVAT's dependencies on your local machine.
Then create a user for your local CVAT service by running a command inside the container:
docker exec -it cvat bash -ic 'python3 ~/manage.py createsuperuser'
Now, open your browser and enter
http://localhost:8080/
This will navigate to your local CVAT!
You can come back later and restart the service. If you are having trouble logging into CVAT, you can rebuild with no-cache:
docker-compose build --no-cache
docker-compose up -d
CVAT Labeling Tips, Tricks, Best Practices
When you're working in CVAT, annotate objects carefully with your downstream model in mind. Follow these labeling best practices as you work through your dataset:
1) Label entirely around the object
2) For occluded objects - label them entirely
3) Generally label objects that are partially out of frame
4) Beware of labeling many boxes that overlap or are entirely contained within each other. This can really confuse your model.
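The fourth practice above can be checked after labeling. The sketch below flags pairs of boxes where one is entirely contained in the other or where the two overlap heavily, measured by intersection over union (IoU). The 0.8 threshold and the sample boxes are assumptions chosen for illustration; a flagged pair is a prompt to review the annotations, not proof of a mistake.

```python
from itertools import combinations


def area(box):
    """Area of a box given as (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = box
    return max(0, xmax - xmin) * max(0, ymax - ymin)


def iou(a, b):
    """Intersection over union of two boxes."""
    overlap = (
        max(a[0], b[0]), max(a[1], b[1]),
        min(a[2], b[2]), min(a[3], b[3]),
    )
    intersection = area(overlap)
    union = area(a) + area(b) - intersection
    return intersection / union if union > 0 else 0.0


def contains(outer, inner):
    """True if inner lies entirely within outer."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def flag_suspicious_pairs(boxes, iou_threshold=0.8):
    """Return (i, j, reason) for contained or heavily overlapping pairs.

    The default threshold of 0.8 is an illustrative assumption.
    """
    flagged = []
    for (i, a), (j, b) in combinations(enumerate(boxes), 2):
        if contains(a, b) or contains(b, a):
            flagged.append((i, j, "contained"))
        elif iou(a, b) >= iou_threshold:
            flagged.append((i, j, "high overlap"))
    return flagged


# Illustrative boxes: box 1 sits inside box 0; boxes 2 and 3 nearly coincide.
boxes = [
    (10, 10, 110, 110),
    (20, 20, 60, 60),
    (200, 200, 300, 300),
    (205, 195, 305, 295),
]
flagged = flag_suspicious_pairs(boxes)
print(flagged)  # [(0, 1, 'contained'), (2, 3, 'high overlap')]
```

Some datasets legitimately contain nested objects, such as a car on a dock, so treat the output as a review list.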
CVAT shortcuts:
- Start your labels list with the most represented class - it will be the default when you draw a box
- Label all objects in each class first - you can focus on them and change all of their labels at once
- Type "N" to draw a new box
CVAT Alternatives
CVAT is just one of many computer vision labeling tools. If you're wondering whether it's right for you, you may want to read our Ultimate Guide to Object Detection or try Roboflow Annotate, which is designed to smooth out many of the rough edges that open source tools like CVAT have.
Looking to Get Started with Annotating Data?
Roboflow provides easy annotation with smart auto-suggested defaults. It's no surprise users annotate faster with Roboflow.
Next Steps after Labeling Your Computer Vision Dataset in CVAT
Once your dataset is labeled in CVAT, it is time to move to the creation of your computer vision model!
Roboflow makes it easy to load in your data (just drag and drop your images and your annotation file from CVAT). You can generate even more data with augmentations such as flipping images, random cropping, and creating synthetic data. If you are interested in using data augmentation to increase the number of training images (and spend less time in CVAT), see our guide on using data augmentation in computer vision.
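One detail worth understanding about augmentation for object detection is that the annotations must be transformed along with the image. As a minimal sketch, here is how a bounding box moves when an image is flipped horizontally. This illustrates the geometry only and is not a description of how Roboflow implements its augmentations; the function name and sample values are hypothetical.

```python
def flip_box_horizontally(box, image_width):
    """Mirror a pixel box (xmin, ymin, xmax, ymax) across the vertical
    center line of an image that is image_width pixels wide."""
    xmin, ymin, xmax, ymax = box
    # The old right edge becomes the new left edge, and vice versa.
    return (image_width - xmax, ymin, image_width - xmin, ymax)


# Illustrative box near the left side of a 1000-pixel-wide image.
original = (100, 200, 300, 400)
flipped = flip_box_horizontally(original, 1000)
print(flipped)  # (700, 200, 900, 400)
```

Flipping twice returns the original box, which is a quick sanity check for any annotation transform you write yourself.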
When you are ready, use Roboflow Train to train a model with one click and quickly test it using our web app or your webcam. Alternatively, you can export your data from Roboflow to any format and start training your own computer vision model. Our posts on How to Train YOLOv4 and How to Train EfficientDet are good starting points. After evaluating your model, you can gauge how much more data you may need to collect and annotate.
