If you're wondering this, you're not alone. The annotation group is the category that encompasses all of the classes in your dataset. It answers the question "What kind of things are labeled in this dataset?"

Why do you need an annotation group?

It's not immediately obvious why you need an annotation group in addition to the image and annotation themselves. The short answer is that the same image can be labeled in multiple ways depending on what you're training your model to detect.

One image labeled in two different ways (left: pieces, right: games)

For example, if you're making an augmented reality board game app, you might want to train one model to identify which game the user is pointing their phone at, and then a separate model for the pieces of each game.

How Roboflow leverages this information

At Roboflow, we use your annotation group to do some work behind the scenes. Each image can have multiple unique annotations and span multiple datasets (while only counting against your image usage once). This lets us merge datasets for you, so you can follow labeling best practices and have your outsourced annotators label only one class at a time. We also let you correct annotations for an image across all of the datasets it's in.
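To make the idea concrete, here is a minimal sketch of one image carrying annotations under two annotation groups while still counting as a single image. The record layout, the `group_by_image` helper, and the sample boxes are all hypothetical and only for illustration; this is not Roboflow's actual storage format or API.

```python
from collections import defaultdict

# Hypothetical records, not Roboflow's real format:
# (image_id, annotation_group, class_name, box as (x, y, width, height))
annotations = [
    ("img_001", "pieces", "pawn", (120, 200, 40, 60)),
    ("img_001", "pieces", "queen", (300, 180, 45, 80)),
    ("img_001", "games", "chess", (50, 100, 500, 500)),
]


def group_by_image(records):
    """Index annotations as image_id -> annotation_group -> [(class_name, box)]."""
    index = defaultdict(lambda: defaultdict(list))
    for image_id, group, class_name, box in records:
        index[image_id][group].append((class_name, box))
    return index


index = group_by_image(annotations)

# The image appears once, no matter how many annotation groups label it.
unique_images = len(index)

# The same image is labeled in two different ways.
groups_for_image = sorted(index["img_001"])
```

Keying the annotations by group is what keeps the "pieces" labels and the "games" labels from colliding, even though they describe the same image.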

How do I choose an annotation group?

The easiest way is to fill in the blank: "I labeled all of the _____ in this image."

You want to pick the most specific name that encompasses all of the classes in your dataset. For example, if I'm labeling the different types of chess pieces (e.g., pawn, knight, bishop, rook, queen, king), I would choose "pieces" as my annotation group. If I were labeling game boards (e.g., chess, boggle, scrabble, monopoly, sudoku), I would choose "games" as my annotation group.

Note: if your dataset only has a single class, the annotation group may be the same as the class. For example, in a model finding tennis balls you may label each "ball" and your annotation group could simply be called "balls". A second dataset with the same image may have annotation group "rackets". And if you then merged them you might select "equipment" as the annotation group of the combined dataset.
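The tennis example can be sketched in a few lines of Python. The dataset dictionaries and the `merge_datasets` function below are hypothetical stand-ins used to show the idea; they are not how Roboflow performs merges internally.

```python
def merge_datasets(datasets, merged_group):
    """Combine datasets into one dataset under a broader annotation group.

    Each dataset is assumed (for this sketch) to look like:
    {"group": str, "labels": {image_id: [class_name, ...]}}
    """
    merged = {"group": merged_group, "labels": {}}
    for dataset in datasets:
        for image_id, classes in dataset["labels"].items():
            # Labels for an image shared by several datasets are combined.
            merged["labels"].setdefault(image_id, []).extend(classes)
    return merged


balls = {"group": "balls", "labels": {"court_01": ["ball", "ball"]}}
rackets = {
    "group": "rackets",
    "labels": {"court_01": ["racket"], "court_02": ["racket"]},
}

# "equipment" is the most specific name that covers both balls and rackets.
equipment = merge_datasets([balls, rackets], merged_group="equipment")
```

After the merge, the shared image `court_01` carries both its ball and racket labels, and the combined dataset is described by the single group name "equipment".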

Technically, you could choose something generic like "object" or "thing" as your annotation group and everything would work fine. If you're creating a dataset similar to COCO or ImageNet, this may be OK. But as your dataset library grows, you'll be kicking yourself for not choosing a scalable ontology.