In a computer vision dataset, annotations commonly refer to class labels. In the above image, our class labels include the different colors and shapes of chess pieces. To annotate an image, an annotation file typically defines the annotations specific to that image. This annotation file may or may not contain the class names themselves.
In the case where the annotation file does not specify class labels, a label map is referenced to look up the class name. The label map is the separate source of record for class annotations.
Hands on with the Label Map
It is important to note that not all computer vision dataset formats use the label map. Computer vision datasets that leverage the label map for class labeling include:
Let's take a look at an example annotation of the above image f9a9a175f26d4b26bca3a5338cc1405e.jpg in YOLO Darknet format. The corresponding f9a9a175f26d4b26bca3a5338cc1405e.txt file contains the annotations for objects in the image.
Here, you will notice that the class name is nowhere to be found. Rather, the first entry per line is an integer mapping to the correct class name found in the label map!
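To make the structure of an annotation line concrete, here is a minimal Python sketch that splits one YOLO Darknet line into its five fields: an integer class ID followed by the normalized box center, width, and height. The `YoloBox` type, the `parse_yolo_line` helper, and the sample line are hypothetical illustrations, not values taken from the file above.

```python
from typing import NamedTuple


class YoloBox(NamedTuple):
    """One YOLO Darknet annotation; coordinates are fractions of image size."""
    class_id: int
    x_center: float
    y_center: float
    width: float
    height: float


def parse_yolo_line(line: str) -> YoloBox:
    """Parse a line of the form '<class_id> <x_center> <y_center> <width> <height>'."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    # The first field is the integer class ID; the rest are floats.
    return YoloBox(int(parts[0]), *map(float, parts[1:]))


# Hypothetical annotation line, for illustration only.
box = parse_yolo_line("4 0.5 0.25 0.1 0.2")
print(box.class_id)
```

Note that the parsed result carries only a number for the class; turning that number into a name requires the label map.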
Let's take a look at the label map _darknet.labels.
Each integer above maps to a position in this list, and this is how the dataset expresses class labels in the annotations.
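The lookup itself can be sketched in a few lines of Python. This assumes the label map is a plain-text file with one class name per line, so the class ID is a zero-based index into that list. The class names, the sample annotation line, and the helper names `load_label_map` and `class_name` are hypothetical placeholders, not the contents of the real `_darknet.labels` file.

```python
def load_label_map(text: str) -> list[str]:
    """Return class names in file order, skipping blank lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def class_name(labels: list[str], class_id: int) -> str:
    """Look up a class name by its zero-based position in the label map."""
    if not 0 <= class_id < len(labels):
        raise IndexError(f"class id {class_id} is not in the label map")
    return labels[class_id]


# Hypothetical label map contents, for illustration only.
labels = load_label_map("black-bishop\nblack-king\nwhite-pawn\n")

# Hypothetical annotation line; the first field is the class ID.
annotation_line = "2 0.5 0.5 0.1 0.2"
class_id = int(annotation_line.split()[0])
print(class_name(labels, class_id))
```

In practice you would read the text from the label map file on disk rather than from an inline string; the order of lines in that file is what gives each integer its meaning.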
That is how the label map works in practice!
It is important to note that label maps function slightly differently from format to format. For example, the .pbtxt label map for our dataset in TensorFlow TFRecord format looks like this:
Here the label map is specified in a slightly different fashion: each label is displayed as a small dictionary-like entry. Note also that the integers referencing class names start at 1, not 0!
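As a rough illustration of reading this style of label map, the Python sketch below pulls `id` and `name` pairs out of `item { ... }` blocks with regular expressions. It is a simplification that assumes flat item blocks with double-quoted names; it is not the official protobuf text parser, and the sample entries and the `parse_pbtxt_label_map` helper are hypothetical.

```python
import re

# Simplifying assumption: item blocks are flat (no nested braces).
ITEM_RE = re.compile(r"item\s*\{(.*?)\}", re.DOTALL)
ID_RE = re.compile(r"\bid\s*:\s*(\d+)")
# \b keeps this from matching the tail of 'display_name'.
NAME_RE = re.compile(r'\bname\s*:\s*"(.*?)"')


def parse_pbtxt_label_map(text: str) -> dict[int, str]:
    """Map each integer id to its class name."""
    id_to_name = {}
    for body in ITEM_RE.findall(text):
        id_match = ID_RE.search(body)
        name_match = NAME_RE.search(body)
        if id_match and name_match:
            id_to_name[int(id_match.group(1))] = name_match.group(1)
    return id_to_name


# Hypothetical label map contents, for illustration only.
SAMPLE = '''
item {
  name: "black-bishop",
  id: 1,
  display_name: "black-bishop"
}
item {
  name: "black-king",
  id: 2,
  display_name: "black-king"
}
'''

id_to_name = parse_pbtxt_label_map(SAMPLE)
print(id_to_name[1])
```

Because the IDs here are explicit keys starting at 1, a dictionary is a more natural fit than a list; indexing a zero-based list with these IDs would be off by one.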
We have discussed the role that a label map plays in annotating a computer vision dataset. We also got hands on with some real live label maps to see how the label map functions in practice.