At Roboflow, we talk to dozens of machine learning researchers, engineers, and software engineers, and get this question multiple times a day.

The number of images you need to train a model

Here’s the truth: a model can work with 100 images, 500 images, or with 10,000. It just depends on what you are doing and the level of accuracy required for you to use your model.

Because the number of images to train a model varies widely, we advocate for building a machine learning pipeline. A machine learning (MLops) pipeline that allows you to bring data in from your deployed environment ensures success because you’ll be able to continue improving your model with active learning.

You can start with 100 source images, increase the total number of generated images using pre-processing and augmentation (now maybe you’re at 500 images!), and deploy your model to start capturing more data to feed into your machine vision model.

Dataset augmentation examples


Along with adding more data to your model, new models or updates to existing models will also change the dynamic of how many images you need to train your model. Research at the world’s major tech companies and universities comes out at a rate that is hard to keep up with. Anyone building computer vision models is faced with constantly evolving technology so you’re better off building or using an infrastructure that allows you to plug and play models in your pipeline.

How do you know if you need more images to train your model?

If you run a model and you are consistently getting “low” confidence level scores, the low scoring images need to go back into the pipeline to improve the model. After adding more data, you inevitably get a model that is not only high scoring but also understands all possibilities of the environment and the true artifact of what the model is trying to learn. More largely, this gets at what the MLops pipeline does for you. It ensures that no matter what happens, the model continues to work.

Active learning pipeline in computer vision

Retailers serve as a great example – let’s say a vendor changes their packaging or has a product promotion. If you are using a limited and static model, computer vision may no longer work for inventory management and you’ll likely run into serious problems. By using a MLops pipeline, you’ll be able to deduce the new product and quickly train the model as the new packaging rolls out (and with a very large data set). If you are like me and you LOVE sugary cereal in the morning – it will detect that your Fruitloops or Lucky Charms changed its box but will still recognize the words “Fruitloops” or “Lucky Charms”.

Take the data-centric path to deploying computer vision

As a result, we even think “training” and “models'' are a bit of a commodity – the most important thing is to build a pipeline so you can train whatever model is the latest and greatest. We take the data-centric approach (rather than the model-centric approach) to computer vision and recommend you do the same. For most people using computer vision, it’s easier to adapt, update, and change the data training their model than editing the model itself. We promise you, you don’t need a PhD to use Roboflow Annotate (or CVAT or LabelImg or VOTT or LabelMe or VGG) and your ability to deploy a useful model is within reach.

The next time you are thinking about how to best build a computer vision project and ask yourself “how many images do I need?!?” or “how many more annotations?!” I strongly recommend setting up a pipeline – our python package is a straightforward project to set up.

Happy building!