Hugging Face computer vision datasets can be imported into Roboflow for additional labeling and/or cloud-hosted training. Starting from a Hugging Face dataset lets you train a model in Roboflow and then deploy it through a Roboflow hosted API endpoint, in your own private cloud, or on edge devices.
In this tutorial, you will learn how to import a Hugging Face dataset into Roboflow so that you can train and deploy a model.
Step 1. Finding Datasets on Hugging Face
Head to Hugging Face and find the dataset suitable for your task. The example dataset used in this tutorial can be found here: https://huggingface.co/datasets/keremberke/license-plate-object-detection
If you want to import annotations along with the images, make sure your dataset includes an XML, JSON, or CSV file containing the labels needed to train the model. Roboflow can convert annotations between formats and supports 40+ different annotation formats.
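Once a dataset is on disk (see Step 3), the check described above can be sketched in a few lines of Python. This is a minimal sketch, not part of Roboflow or Hugging Face tooling: the function name `find_annotation_files` is hypothetical, and the extension list simply mirrors the three file types named above. Annotations packed inside archives such as .zip files will not be found until the archives are extracted.

```python
from pathlib import Path

# Extensions the tutorial names as possible annotation files.
ANNOTATION_EXTENSIONS = {".xml", ".json", ".csv"}


def find_annotation_files(dataset_dir):
    """Return sorted paths under dataset_dir whose extension suggests annotations."""
    root = Path(dataset_dir)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in ANNOTATION_EXTENSIONS
    )


if __name__ == "__main__":
    # Folder name assumed from the example dataset's URL.
    for path in find_annotation_files("license-plate-object-detection"):
        print(path)
```

An empty result means the images would be uploaded without labels and would need to be annotated in Roboflow.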
After finding the right dataset, download the files using git.
Step 2. Download Git
Installing Git differs by operating system. You will also need Git LFS, which Hugging Face repositories use to store large files such as images. To install both:
Mac:
# run these commands in Terminal (requires Homebrew)
brew install git
brew install git-lfs
Linux (Ubuntu/Debian):
# run these commands in a terminal
sudo apt install git
sudo apt install git-lfs
Linux (Fedora):
# run these commands in a terminal
sudo dnf install git
sudo dnf install git-lfs
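To confirm that both tools are available after installation, a small Python check can look them up on the PATH and print their versions. This is a sketch under the assumption that the Git LFS executable is named `git-lfs`, as in the packages above; the helper name `tool_version` is hypothetical.

```python
import shutil
import subprocess


def tool_version(executable, *args):
    """Return what the command prints, or None if the executable is not on PATH."""
    if shutil.which(executable) is None:
        return None
    result = subprocess.run([executable, *args], capture_output=True, text=True)
    # Most tools print their version to stdout; fall back to stderr otherwise.
    return result.stdout.strip() or result.stderr.strip()


if __name__ == "__main__":
    print("git:", tool_version("git", "--version"))
    print("git-lfs:", tool_version("git-lfs", "version"))
```

A value of `None` for either tool means it still needs to be installed or added to the PATH.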
Step 3. Download the Hugging Face Dataset
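With Git and Git LFS installed, we can download the dataset. Run git lfs install once beforehand so that large files such as images are fetched during the clone.
The command has the form git clone <DATASET URL>, where the URL is the address of the dataset page on Hugging Face.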
Using the example dataset, the command will be:
git clone https://huggingface.co/datasets/keremberke/license-plate-object-detection
After running the command, a new folder named after the dataset (here, license-plate-object-detection) appears in the directory where you ran git clone.
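If Git LFS was not active during the clone, large files arrive as small text "pointer" files rather than real images. The sketch below scans a cloned folder for such pointers by looking for the header line that the Git LFS pointer format begins with. The function name `find_lfs_pointers` is hypothetical, and the folder name is assumed from the example URL.

```python
from pathlib import Path

# Every Git LFS pointer file starts with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(repo_dir):
    """Return sorted paths of files in repo_dir that are still Git LFS pointers."""
    root = Path(repo_dir)
    pointers = []
    for path in root.rglob("*"):
        # Skip Git's own metadata folder and anything that is not a regular file.
        if ".git" in path.relative_to(root).parts or not path.is_file():
            continue
        with path.open("rb") as f:
            if f.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX:
                pointers.append(path)
    return sorted(pointers)


if __name__ == "__main__":
    for path in find_lfs_pointers("license-plate-object-detection"):
        print("still a pointer:", path)
```

If any paths are printed, running git lfs pull inside the cloned folder should replace the pointers with the actual files.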
Step 4. Create A Project on Roboflow
Now that we have a dataset, we can create a project on Roboflow. First, sign in to Roboflow or create an account.
Next, create a project by clicking on the button in the top right corner.
Next, pick a project type. The right type depends on your data and task. Because the example dataset is meant for locating license plates in images, we will use an object detection project.
Additionally, if your XML, JSON, or CSV file contains coordinates for each annotation (for example, the corners of a bounding box), the dataset is likely suited for object detection, because it records where each labeled object is located.
After we pick our project type, we can finally upload the files to our project.
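As an illustration of that check, the sketch below counts bounding boxes in a COCO-style JSON file, where each annotation stores a `bbox` of four numbers (`[x, y, width, height]`). It assumes your annotations are in COCO format, which is only one of the many formats a dataset might use; the function name `count_bounding_boxes` is hypothetical.

```python
import json


def count_bounding_boxes(coco_path):
    """Count annotations in a COCO-style JSON file that carry a 4-value bbox."""
    with open(coco_path, encoding="utf-8") as f:
        data = json.load(f)
    annotations = data.get("annotations", [])
    # A missing or empty "bbox" entry is treated as "no box".
    return sum(1 for a in annotations if len(a.get("bbox") or []) == 4)
```

Calling `count_bounding_boxes("path/to/annotations.json")` on your own file (the path here is a placeholder) and getting a count above zero suggests an object detection project is a reasonable choice.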
Conclusion
By following this tutorial, you have learned how to import a Hugging Face dataset into Roboflow using git.
Once uploaded to Roboflow, you can annotate additional objects, add more data from Roboflow Universe, train a model, and deploy the model to any device using Inference.
Cite this Post
Use the following entry to cite this post in your research:
Nathan Yan. (Aug 2, 2024). How to Import Hugging Face Datasets to Roboflow. Roboflow Blog: https://blog.roboflow.com/hugging-face-datasets-roboflow/