Dataset Management

Train, Validation, Test Split for Machine Learning

4 Sep 2020 • 6 min read

At Roboflow, we often get asked: what is the train, validation, test split, and why do I need it? The motivation is quite simple: you should separate your data into train, validation, and test splits to prevent your model from overfitting and to evaluate it accurately.

VoTT for Image Annotation and Labeling

27 Jul 2020 • 6 min read

A guide on using VoTT to label your own computer vision dataset.

Why and How to Implement Random Rotate Data Augmentation

24 Jun 2020 • 9 min read

Learn how to apply a random rotate data augmentation to images for use in training computer vision models.

How to Convert Annotations from PASCAL VOC to YOLO Darknet

19 Jun 2020 • 5 min read

A bedrock of computer vision is having labeled data. In object detection [https://blog.roboflow.com/object-detection/] problems, those labels define bounding box positions in a given image. As computer vision rapidly evolves, so, too, do the various file formats available to describe the location of bounding boxes: PASCAL VOC

When to Use Contrast as a Preprocessing Step

15 May 2020 • 4 min read

Adding contrast to images is a simple yet powerful technique to improve our computer vision models. But why? When considering how and why we add contrast to images in computer vision, we must start with the basics. What is contrast? How does contrast preprocessing improve our

Data Augmentation in YOLOv4

13 May 2020 • 7 min read

Learn how data augmentation is used in training YOLOv4 computer vision models.

When Should I Auto-Orient My Images?

8 May 2020 • 2 min read

Learn when you should auto-orient images for use in training computer vision models.

Breaking Down Roboflow's Health Check Dimension Insights

29 Apr 2020 • 3 min read

Roboflow [https://roboflow.ai] improves datasets without any user effort. This includes dropping zero-pixel bounding boxes and cropping out-of-frame bounding boxes so that they align with the edge of the image. Roboflow also notifies users of potential areas requiring attention, like severely underrepresented classes (as was present in the original hard

The Difference Between Missing and Null Annotations

24 Apr 2020 • 4 min read

A discussion of missing versus null annotations [https://blog.roboflow.com/glossary/#:~:text=annotation] and how VOC XML and COCO JSON handle them. Preparing data for computer vision models [https://models.roboflow.com/] is a tedious task. Even assuming training images are appropriately representative for inference, managing annotations quickly becomes

How to Create a Synthetic Dataset for Computer Vision

15 Apr 2020 • 12 min read

The appeals of synthetic data are alluring: you can rapidly generate a vast amount of diverse, perfectly labeled images for very little cost and without ever leaving the comfort of your office. The good news is: it's easy to try! And we're about to show you how.

How to Create a TFRecord File for Computer Vision and Object Detection

6 Apr 2020 • 6 min read

TensorFlow expedites the machine learning process markedly. From abstracting complex linear algebra to including pre-trained models and weights, getting the most out of TensorFlow is a full-time job. However, when it comes to loading data in ways that TensorFlow expects in order to perform as efficiently as it does, every

Introducing Image Preprocessing and Augmentation Previews

30 Mar 2020 • 1 min read

Knowing how an image preprocessing step or augmentation is going to appear before you write the code for it is essential. Is it worth it to figure out the right amount of brightness? Will rotation increase variability appropriately? Roboflow is introducing features to take out the guesswork: preprocessing and augmentation

How Flip Augmentation Improves Model Performance

20 Mar 2020 • 2 min read

Flipping an image (and its annotations) is a deceptively simple technique that can improve model performance in substantial ways. Our models [https://models.roboflow.ai] learn which collections of pixels, and which relationships between those collections, indicate that an object is in frame. But machine learning models (like convolutional

Introducing Bounding Box Level Augmentations

18 Mar 2020 • 3 min read

Having training data that matches the diversity of your task is paramount to the success of your models. At Roboflow, we’re committed to providing you with state-of-the-art techniques that can improve your deep learning model [https://models.roboflow.com]’s performance -- without needing to collect any more data

LabelImg for Labeling Object Detection Data

16 Mar 2020 • 3 min read

Accurately labeled data is essential to successful machine learning, and computer vision is no exception. In this walkthrough, we’ll demonstrate how you can use LabelImg [https://github.com/tzutalin/labelImg] to get started with labeling your own data for object detection models [https://models.roboflow.ai/object-detection]. Label and

The Importance of Blur as an Image Augmentation Technique

13 Mar 2020 • 3 min read

Learn about the efficacy of blur as an image augmentation step in computer vision model training.

Why to Add Noise to Images for Machine Learning

9 Mar 2020 • 3 min read

Learn why adding noise can be effective as an image augmentation in computer vision modeling.

Why and How to Implement Random Crop Data Augmentation

21 Feb 2020 • 4 min read

Learn how to apply a random crop data augmentation to images for use in training computer vision models.

When to Use Grayscale as a Preprocessing Step

5 Feb 2020 • 2 min read

Grayscale allows our models to be more computationally efficient. So when shouldn't we grayscale our images?

You Might Be Resizing Your Images Incorrectly

31 Jan 2020 • 4 min read

Resizing images is a critical preprocessing step in computer vision. Principally, our machine learning models [https://models.roboflow.ai] train faster on smaller images. An input image that is twice as large requires our network to learn from four times as many pixels — and that time adds up. Moreover, many

How to Convert Annotations from PASCAL VOC XML to COCO JSON

29 Jan 2020 • 9 min read

Convert from VOC XML to COCO JSON (or any format!) in four clicks.

What is Image Preprocessing and Augmentation?

26 Jan 2020 • 8 min read

Understanding image preprocessing and augmentation options is essential to making the most of your training data.
