Train Test Split Guide and Overview

Published Oct 28, 2020 • 2 min read

To ensure our models generalize well (rather than memorize training data), it is best practice to create a train/test split. Without that rigor, our models can easily overfit to the small subset of examples we've collected. Look no further than Tesla using computer vision to identify stop signs: there is significantly more variation than one would anticipate.

A train, valid, and test split (70/20/10) visualized in Roboflow.

By default, Roboflow prompts users to create train, valid, and test splits at the time of upload to encourage model-building best practices. The default settings split a user's data 70 / 20 / 10: 70 percent of the examples are in the training set, 20 percent are in the validation set, and 10 percent are held out in the testing set.
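For readers who want to see the arithmetic behind a 70 / 20 / 10 split, here is a minimal sketch in plain Python. It is not Roboflow's implementation; the function name `split_dataset`, its parameters, and the fixed seed are assumptions made for illustration only.

```python
import random


def split_dataset(items, train_frac=0.7, valid_frac=0.2, seed=0):
    """Shuffle items and cut them into train, valid, and test lists.

    Hypothetical helper: whatever is left after the train and valid
    portions are taken becomes the test set.
    """
    if train_frac < 0 or valid_frac < 0 or train_frac + valid_frac > 1:
        raise ValueError("train_frac and valid_frac must be >= 0 and sum to at most 1")

    shuffled = list(items)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)

    n_train = round(len(shuffled) * train_frac)
    n_valid = round(len(shuffled) * valid_frac)

    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test


# Illustrative file names, not a real dataset.
image_names = [f"image_{i:03d}.jpg" for i in range(100)]
train, valid, test = split_dataset(image_names)
print(len(train), len(valid), len(test))  # 70 20 10
```

Shuffling before slicing matters: images are often collected in batches, so taking the first 70 percent in upload order could leave the test set with conditions the training set never saw, or the reverse.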

When uploading images, Roboflow prompts a user to create a train, valid, and test split.

However, there may be times when you seek greater control over exactly which images are in your training, validation, or testing set. In fact, Andrej Karpathy of Tesla spends as much time on test set curation as training set curation.

Adjusting splits in Roboflow is simple. When uploading data, a user can select whether the images in the current upload should go in the training, validation, or testing set.

Select if the images should go in the training set, validation set, or testing set.

Once we've added images to one split in our dataset, we can select "Add More Images" to repeat the upload process, except we may select "Validation" or "Testing" for our next batch of uploaded images.

On the right-hand side, we can select "Add More Images" to expand a given image dataset.

As a bonus, if your images are already organized in Train, Valid, and Test folders locally and you drop these folders into Roboflow, Roboflow will automatically detect this folder structure at the time of upload.

Roboflow detects the Train, Valid, and Test folders and suggests that the images be split according to Existing Values.
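If your images are not yet sorted into folders, a short script can arrange them before upload. The sketch below copies files into one folder per split using only the Python standard library. The helper name `copy_into_split_folders` and the lowercase folder names in the usage note are assumptions for illustration; check the Roboflow documentation for the exact folder names it recognizes.

```python
import shutil
from pathlib import Path


def copy_into_split_folders(splits, output_dir):
    """Copy each split's files into output_dir/<split name>/.

    Hypothetical helper: `splits` maps a folder name (for example
    "train") to a list of file paths. Originals are left in place.
    """
    output_dir = Path(output_dir)
    for split_name, files in splits.items():
        split_dir = output_dir / split_name
        split_dir.mkdir(parents=True, exist_ok=True)
        for file_path in files:
            # Keep the original file name inside the split folder.
            shutil.copy2(file_path, split_dir / Path(file_path).name)
    return output_dir
```

A call such as `copy_into_split_folders({"train": train_files, "valid": valid_files, "test": test_files}, "dataset")` would produce `dataset/train`, `dataset/valid`, and `dataset/test`, which can then be dropped into the upload window together. Note that two source files with the same name in one split would overwrite each other, so unique file names are assumed.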

Be sure to refer to the Roboflow documentation for additional tips!

Cite this Post

Use the following entry to cite this post in your research:

Joseph Nelson. (Oct 28, 2020). Train Test Split Guide and Overview. Roboflow Blog: https://blog.roboflow.com/train-test-split-with-roboflow/

Discuss this Post

If you have any questions about this blog post, start a discussion on the Roboflow Forum.

Written by

Joseph Nelson

Roboflow cofounder and CEO. On a mission to transform every industry by democratizing computer vision. Previously founded and sold a machine learning company.
