Computer vision problems start with finding high-quality image datasets. Fortunately, common image data is becoming easier to access. Datasets like Microsoft's COCO dataset and the Pascal VOC dataset provide standard sets of common objects and a benchmark for measuring the efficacy of state-of-the-art computer vision models (like Scaled-YOLOv4, PP-YOLO, YOLOv4, YOLOv5, and more).
Moreover, the Open Images Dataset, which can be used to create custom datasets, and platforms like Kaggle increasingly provide access to rich image datasets. At Roboflow, we host a wide array of free, publicly available computer vision datasets.
Offering to Host Your Public Datasets
As Roboflow has grown, our users have found that they want to share their datasets with the public to improve computer vision research and contribute to the broader computer vision community.
For example, Roboflow user David Lee created a model that can recognize American Sign Language. David thought other developers should be able to train models to learn sign language, so he annotated his images and open sourced his American Sign Language Letters dataset with a Public Domain license and encouraged Roboflow to make it publicly available.
At Roboflow, we're inspired by work like David's and his desire to make it easier for other developers to improve upon his work. We've even collected and shared our own datasets – things like a drone image dataset, thermal image dataset, chess piece image dataset, dice image dataset, and more. We welcome the opportunity to host your datasets in order to share them with the broader computer vision community.
Hosting data with Roboflow Public Datasets provides a number of advantages. It gives your dataset exposure to the tens of thousands of developers who have built with Roboflow. Roboflow Public Datasets are also automatically indexed by Google Dataset Search. Moreover, datasets on Roboflow Universe are automatically available in every major annotation format: Pascal VOC XML, COCO JSON, TFRecords, and more.
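To give a sense of why having multiple annotation formats available matters, here is a minimal sketch of how the same bounding box differs between two of the formats named above. Pascal VOC XML stores a box as corner coordinates (xmin, ymin, xmax, ymax), while COCO JSON stores it as [x, y, width, height]. The annotation snippet, the function name `voc_to_coco_boxes`, and the `category_name` key are hypothetical and chosen for illustration; real COCO files reference categories by numeric `category_id`, and this is not how Roboflow performs its exports.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical Pascal VOC annotation with one labeled object (illustrative values).
VOC_XML = """
<annotation>
  <filename>example.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>A</name>
    <bndbox><xmin>100</xmin><ymin>120</ymin><xmax>300</xmax><ymax>360</ymax></bndbox>
  </object>
</annotation>
"""


def voc_to_coco_boxes(voc_xml):
    """Convert Pascal VOC corner boxes to COCO-style [x, y, width, height] boxes.

    Simplification: the class name is kept as a string under "category_name";
    a real COCO file would map it to a numeric category_id.
    """
    root = ET.fromstring(voc_xml.strip())
    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (
            int(bndbox.findtext(key)) for key in ("xmin", "ymin", "xmax", "ymax")
        )
        boxes.append(
            {
                "category_name": obj.findtext("name"),
                # COCO stores the top-left corner plus width and height.
                "bbox": [xmin, ymin, xmax - xmin, ymax - ymin],
            }
        )
    return boxes


if __name__ == "__main__":
    print(json.dumps(voc_to_coco_boxes(VOC_XML), indent=2))
```

Even this small conversion is easy to get subtly wrong (for example, by mixing up corner and width/height conventions), which is one reason it helps to download a dataset directly in the format your training pipeline expects.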