Computer vision is revolutionizing medical diagnosis by helping doctors spot patterns they may not have seen and catch errors they may have overlooked.

Thus, it's unsurprising that one of the more popular "hello world" datasets for object detection is the blood count and cell detection dataset (BCCD). Now two years old, this is a dataset of blood cell photos, originally open sourced by cosmicad and akshaylambda. It contains 364 images across three classes: WBC (white blood cells), RBC (red blood cells), and Platelets.

Upon examining this dataset, however, the Roboflow team discovered there is room for improved labelling.

Here's an original image whose raw labels appear to be comprehensive:

Microscope slide showing annotated blood cells.
Everything is labeled in this original!

Yet here's another original image whose raw labels are clearly missing bounding boxes:

Microscope slide showing annotated platelets.
Only platelets are labeled here!

Now, fair warning: the Roboflow team members are not doctors or domain experts and do not claim to have cell biology expertise. However, in reviewing the original 364 microscope images, we found examples like the one above where labels could be intuitively added.
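One rough way to surface candidates like the platelets-only image is to check which of the three classes have no bounding box at all in a given annotation file. The sketch below assumes the labels are stored as Pascal VOC-style XML (one `object` element per box, with the class in a `name` child); the function name `classes_absent` and the sample annotation are made up for illustration and are not real BCCD data. A flagged image is only a prompt for a human to look, since a class can be legitimately absent from a slide.

```python
import xml.etree.ElementTree as ET

# The three classes named in this post.
EXPECTED_CLASSES = {"WBC", "RBC", "Platelets"}


def classes_absent(voc_xml: str, expected=EXPECTED_CLASSES) -> set:
    """Return the expected classes that have no bounding box in this annotation."""
    root = ET.fromstring(voc_xml)
    present = {obj.findtext("name") for obj in root.iter("object")}
    return set(expected) - present


# Hypothetical annotation resembling the platelets-only example above.
SAMPLE_XML = """<annotation>
  <filename>example.jpg</filename>
  <object><name>Platelets</name></object>
  <object><name>Platelets</name></object>
</annotation>"""

absent = classes_absent(SAMPLE_XML)
print(sorted(absent))  # classes with zero boxes in this image
```

Running a check like this over every annotation file gives a short list of images worth opening by eye, which is far quicker than paging through all 364.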

After reviewing and relabeling, the Roboflow team added 187 labels: 183 RBC, 3 WBC, and 1 Platelets. That dataset is freely available here.
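To tally a per-class breakdown like this yourself, you can count the boxes per class in the original and relabeled annotations and subtract one from the other. This is a minimal sketch, again assuming Pascal VOC-style XML; the helper names `count_labels` and `added_labels` and the two tiny annotations are hypothetical, for illustration only, and are not taken from the dataset.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_labels(voc_xml: str) -> Counter:
    """Count bounding-box labels per class in one Pascal VOC annotation string."""
    root = ET.fromstring(voc_xml)
    return Counter(obj.findtext("name") for obj in root.iter("object"))


def added_labels(original: Counter, relabeled: Counter) -> Counter:
    """Labels present in the relabeled counts beyond the original counts."""
    # Counter subtraction keeps only classes with a positive difference.
    return relabeled - original


# Made-up annotations for a single image, before and after relabeling.
ORIGINAL_XML = """<annotation>
  <object><name>Platelets</name></object>
</annotation>"""
RELABELED_XML = """<annotation>
  <object><name>Platelets</name></object>
  <object><name>RBC</name></object>
  <object><name>RBC</name></object>
</annotation>"""

diff = added_labels(count_labels(ORIGINAL_XML), count_labels(RELABELED_XML))
print(dict(diff))  # {'RBC': 2}
```

Summing these per-image differences across the whole dataset would yield totals in the same shape as the figures above.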

(If you're looking to build an object detection model using this dataset, be sure to check out our computer vision tutorials.)

We'll be running tests on the importance of missing labels shortly. Stay tuned...