One question comes up again and again when working with computer vision datasets: what is a label map?

In this post, we will demystify the label map by discussing the role it plays in the computer vision annotation process. Then we will get hands-on with some real-life examples of label maps.

Chess Pieces annotated with the guidance of a label map

Common Errors Requiring a Label Map

  • Class names missing
  • Class names show as integers
  • Class names do not match my dataset's class names
  • Class labels not recognized
  • Class labels are numbers

The Role of the Label Map

Computer vision datasets come in all flavors of formats. Roboflow supports the ingestion, conversion, and export of over 30 computer vision formats. While automatic conversion of computer vision datasets is convenient, it is useful to understand the dataset structure so you can work with the data after export.

In a computer vision dataset, it is common to have annotations referring to class labels. In the above image, our class labels include the different colors and shapes of chess pieces. Each image typically has an annotation file that defines the annotations specific to that image. This annotation file may or may not spell out the class name for each annotation.

When the annotation file does not spell out class names, it stores a class ID instead, and a label map is referenced to look up the class name for that ID. The label map is the separate source of record for class names.

Hands on with the Label Map

Not all computer vision dataset formats use a label map. Formats that do rely on one for class labeling include YOLO Darknet and TensorFlow TFRecord, both of which we walk through below.

Let's take a look at an example annotation of the above image f9a9a175f26d4b26bca3a5338cc1405e.jpg in YOLO Darknet format. The corresponding f9a9a175f26d4b26bca3a5338cc1405e.txt file contains the annotations for objects in the image.

1 0.23563218390804597 0.13218390804597702 0.27586206896551724 0.14080459770114942
0 0.09051724137931035 0.28304597701149425 0.1810344827586207 0.10057471264367816
5 0.03879310344827586 0.27873563218390807 0.07758620689655173 0.10344827586206896
5 0.1896551724137931 0.40804597701149425 0.16666666666666666 0.10632183908045977
2 0.1997126436781609 0.5014367816091954 0.1781609195402299 0.10057471264367816
3 0.1221264367816092 0.4942528735632184 0.14367816091954022 0.08908045977011494
3 0.2916666666666667 0.2471264367816092 0.14655172413793102 0.08620689655172414
3 0.5387931034482759 0.4224137931034483 0.15517241379310345 0.08908045977011494
3 0.8204022988505747 0.3620689655172414 0.16091954022988506 0.10632183908045977
3 0.6925287356321839 0.5488505747126436 0.16954022988505746 0.10632183908045977
2 0.8362068965517241 0.7126436781609196 0.22413793103448276 0.12643678160919541
7 0.40948275862068967 0.8951149425287356 0.28160919540229884 0.14367816091954022
11 0.05459770114942529 0.7183908045977011 0.10919540229885058 0.11494252873563218
8 0.860632183908046 0.9425287356321839 0.22701149425287356 0.10919540229885058
10 0.10488505747126436 0.5775862068965517 0.20977011494252873 0.1206896551724138
6 0.10057471264367816 0.7586206896551724 0.1925287356321839 0.10057471264367816
6 0.4209770114942529 0.6724137931034483 0.19540229885057472 0.10344827586206896
9 0.09051724137931035 0.3864942528735632 0.14942528735632185 0.08908045977011494
9 0.11494252873563218 0.6623563218390804 0.15517241379310345 0.09770114942528736
9 0.3175287356321839 0.7514367816091954 0.14942528735632185 0.08908045977011494
9 0.4367816091954023 0.7931034482758621 0.15229885057471265 0.10344827586206896
9 0.5804597701149425 0.7212643678160919 0.16379310344827586 0.10632183908045977
Image annotations that reference a label map

Here, you will notice that the class name is nowhere to be found. Rather, the first entry per line is an integer mapping to the correct class name found in the label map!
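To make the line layout concrete, here is a minimal Python sketch that splits one annotation line into its parts. It assumes the usual YOLO Darknet convention that the four numbers after the class ID are the box center x, center y, width, and height, each normalized to the image size. The names `YoloAnnotation` and `parse_yolo_line` are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class YoloAnnotation(NamedTuple):
    """One line of a YOLO Darknet .txt annotation file (hypothetical helper type)."""
    class_id: int    # integer to look up in the label map
    x_center: float  # box center x, as a fraction of image width (assumed convention)
    y_center: float  # box center y, as a fraction of image height
    width: float     # box width, as a fraction of image width
    height: float    # box height, as a fraction of image height


def parse_yolo_line(line: str) -> YoloAnnotation:
    """Split a whitespace-separated annotation line into typed fields."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    x_center, y_center, width, height = (float(p) for p in parts[1:])
    return YoloAnnotation(int(parts[0]), x_center, y_center, width, height)


# First line of the annotation file shown above.
ann = parse_yolo_line(
    "1 0.23563218390804597 0.13218390804597702 "
    "0.27586206896551724 0.14080459770114942"
)
print(ann.class_id)  # the integer that still needs a label map to become a name
```

Note that nothing in the parsed result says which chess piece this is; only the class ID is available until the label map is consulted.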

Let's take a look at the label map _darknet.labels.

black-bishop
black-king
black-knight
black-pawn
black-queen
black-rook
white-bishop
white-king
white-knight
white-pawn
white-queen
white-rook
The YOLO Darknet label map: a plain list of class names that the annotation integers index into

Each integer in the annotation file is a zero-based position in this list: 0 is black-bishop, 1 is black-king, and so on up to 11 for white-rook. This is how the dataset expresses class labels in the annotations.
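The lookup can be sketched in a few lines of Python. This is an illustration, not Roboflow's or Darknet's own code: the label map is embedded as a string so the example runs on its own, and the helper names `load_label_map` and `class_name` are hypothetical. In practice you would read the text from the _darknet.labels file instead.

```python
# Contents of _darknet.labels, embedded here so the sketch is self-contained.
LABELS_TEXT = """\
black-bishop
black-king
black-knight
black-pawn
black-queen
black-rook
white-bishop
white-king
white-knight
white-pawn
white-queen
white-rook
"""


def load_label_map(text: str) -> list[str]:
    """Return class names in file order; the list index is the class ID."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def class_name(labels: list[str], class_id: int) -> str:
    """Look up a class ID, failing loudly if it is not in the label map."""
    if not 0 <= class_id < len(labels):
        raise KeyError(f"class ID {class_id} is not in the label map")
    return labels[class_id]


labels = load_label_map(LABELS_TEXT)

# Class IDs from the first three annotation lines shown earlier: 1, 0, 5.
names = [class_name(labels, class_id) for class_id in (1, 0, 5)]
print(names)  # ['black-king', 'black-bishop', 'black-rook']
```

The explicit bounds check matters because a negative index would otherwise silently wrap around in a Python list and return the wrong class name.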

That is how the label map works in practice!

It is important to note that label maps work slightly differently from format to format. For example, the .pbtxt label map for our dataset in TensorFlow TFRecord format looks like this:

item {
    name: "black-bishop",
    id: 1,
    display_name: "black-bishop"
}
item {
    name: "black-king",
    id: 2,
    display_name: "black-king"
}
item {
    name: "black-knight",
    id: 3,
    display_name: "black-knight"
}
item {
    name: "black-pawn",
    id: 4,
    display_name: "black-pawn"
}
item {
    name: "black-queen",
    id: 5,
    display_name: "black-queen"
}
item {
    name: "black-rook",
    id: 6,
    display_name: "black-rook"
}
item {
    name: "white-bishop",
    id: 7,
    display_name: "white-bishop"
}
item {
    name: "white-king",
    id: 8,
    display_name: "white-king"
}
item {
    name: "white-knight",
    id: 9,
    display_name: "white-knight"
}
item {
    name: "white-pawn",
    id: 10,
    display_name: "white-pawn"
}
item {
    name: "white-queen",
    id: 11,
    display_name: "white-queen"
}
item {
    name: "white-rook",
    id: 12,
    display_name: "white-rook"
}
Label Map for our dataset in TensorFlow TFRecord Format

Here you can see that the label map is specified in a slightly different fashion: each class is a small dictionary-like item block that states its id explicitly rather than relying on list position. Note also that the integer referencing a class name starts at 1, not 0!
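As a rough sketch of how this format can be read, the Python below extracts an id-to-name dictionary from .pbtxt text with regular expressions. This is only an approximation for simple files like the one above; a real pipeline would more likely use a proper protobuf text-format parser. The names `parse_label_map` and `to_darknet_index` are hypothetical, and the one-step shift between the two formats assumes both exports list the classes in the same order, as they do in this dataset.

```python
import re

# First two items of the .pbtxt label map above, embedded for a runnable example.
PBTXT = """\
item {
    name: "black-bishop",
    id: 1,
    display_name: "black-bishop"
}
item {
    name: "black-king",
    id: 2,
    display_name: "black-king"
}
"""

ITEM_RE = re.compile(r"item\s*\{(.*?)\}", re.DOTALL)
# \b keeps this from matching the "name" inside "display_name".
NAME_RE = re.compile(r'\bname:\s*"([^"]*)"')
ID_RE = re.compile(r"\bid:\s*(\d+)")


def parse_label_map(text: str) -> dict[int, str]:
    """Build {id: name} from simple item blocks (regex sketch, not a full parser)."""
    id_to_name: dict[int, str] = {}
    for body in ITEM_RE.findall(text):
        name_match = NAME_RE.search(body)
        id_match = ID_RE.search(body)
        if name_match is None or id_match is None:
            raise ValueError(f"item block is missing name or id: {body!r}")
        id_to_name[int(id_match.group(1))] = name_match.group(1)
    return id_to_name


def to_darknet_index(tfrecord_id: int) -> int:
    """Shift a 1-based .pbtxt id to a 0-based list index (assumes identical class order)."""
    return tfrecord_id - 1


id_to_name = parse_label_map(PBTXT)
print(id_to_name)
```

Because the ids are stored explicitly, a dictionary keyed by id is a safer structure here than a list, and any code that moves between the two formats has to account for the off-by-one difference.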

Conclusion

We have discussed the role that a label map plays in annotating a computer vision dataset. We also got hands-on with some real-life label maps to see how the label map functions in practice.

Next Steps