There are two complementary ways to think about algorithmic bias. The first is the social and ethical side. The second is the technical side: how we detect bias and how we mitigate it.
Today we’re going to dive into the technical side of avoiding bias in computer vision models to give you an introduction to the topic. If you are interested in learning more about the social and ethical side, The Power of Representation in Computer Vision serves as a good introduction.
It’s important to note that bias in machine learning isn't only an ethical or technical question but also a business question, because model errors mean lost business. Imagine a model that incorrectly rejects loan applicants. Regardless of whether the cause was ethical or technical, the business incurs a loss because of it.
What is bias in machine learning?
Bias is the systematic error of a machine learning model: the average gap between its predictions and the true values after training on a training dataset. When bias is present, some factors may be given more or less weight than they should. This results in less accurate predictions and, as a result, poorer model performance.
Another measurement we need to look at when speaking about bias is variance. Variance describes how much the output of a model changes when the data it is trained on, or presented with, changes. One way to observe variance is to compare training error with test error. High variance shows up when training error is close to zero but test error is high, a pattern also known as overfitting.
In an ideal model, both the train error and test error are consistently low, even across different training runs.
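As a rough illustration of reading these two numbers together, here is a minimal sketch. The function name and both thresholds (`acceptable_error` and `max_gap`) are assumptions chosen for the example, not standard values, and you would tune them for your own task.

```python
def diagnose_fit(train_error, test_error, acceptable_error=0.10, max_gap=0.05):
    """Label a training run from its train and test error rates (0.0 to 1.0).

    The thresholds are illustrative assumptions, not standard values.
    """
    if train_error > acceptable_error:
        # The model cannot fit even the data it was trained on.
        return "high bias (underfitting)"
    if test_error - train_error > max_gap:
        # The model fits the training data but does not generalize.
        return "high variance (overfitting)"
    return "good fit"


print(diagnose_fit(0.01, 0.30))  # high variance (overfitting)
print(diagnose_fit(0.25, 0.27))  # high bias (underfitting)
print(diagnose_fit(0.04, 0.06))  # good fit
```

Running the same check across several training runs is one way to see whether low train and test error are consistent or a lucky result.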
How to mitigate machine learning bias
How can we mitigate bias in our models? A key thing we can do is adopt a data-first mentality.
As a practitioner of machine learning and computer vision, it is crucial to remember that no matter how complex a model is, its performance is ultimately contingent on having a comprehensive and representative dataset.
Let's talk through some practical tips you can use to reduce bias in your models and ensure that your dataset serves your model well.
Collect large and representative data
Training a model with a small dataset tends to result in high variance, making overfitting hard to avoid. This is because the number of observations is small relative to the large number of predictors available.
A large enough dataset with representative data will help the model generalize effectively. Representative data means that the training data has similar characteristics to the data the model will see once it is deployed.
For example, say we are detecting dogs and our model will be deployed to detect them both indoors at a clinic and outdoors at a dog park. In this scenario, we would want to have data representing both environments, including all the varieties of dogs found in each environment.
We don't need thousands of images of dogs in dozens of different environments if our model is only going to be deployed at a clinic and a dog park. Instead, it would be better to optimize for taking more photos in clinics and dog parks – photos with different breeds, different sizes of dogs, different parts of clinics and parks – so the model has more information about detecting dogs in the areas where the model will be deployed.
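One way to check representativeness is to compare how often each environment appears in your training set with how often you expect it in deployment. The sketch below assumes you have tagged each image with an environment name. The function name, the tags, and the 10% `tolerance` are all hypothetical choices for illustration.

```python
from collections import Counter


def environment_gaps(train_envs, deploy_envs, tolerance=0.10):
    """Return environments whose share of the data differs between training
    and deployment by more than `tolerance`.

    Both inputs are non-empty lists with one environment tag per image.
    """
    train_counts = Counter(train_envs)
    deploy_counts = Counter(deploy_envs)
    gaps = {}
    for env in set(train_counts) | set(deploy_counts):
        train_share = train_counts[env] / len(train_envs)
        deploy_share = deploy_counts[env] / len(deploy_envs)
        if abs(train_share - deploy_share) > tolerance:
            gaps[env] = (round(train_share, 2), round(deploy_share, 2))
    return gaps


# Hypothetical tags: training data is mostly clinic photos,
# but deployment is split evenly between the clinic and the dog park.
train = ["clinic"] * 90 + ["dog_park"] * 10
deploy = ["clinic"] * 50 + ["dog_park"] * 50
print(environment_gaps(train, deploy))
```

Here both environments are flagged: the clinic is overrepresented (0.9 versus 0.5) and the dog park is underrepresented (0.1 versus 0.5), which suggests taking more photos at the dog park.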
Improving your data with active learning
Active learning accelerates the rate at which a model improves by deliberately choosing which data to pass into the model next.
Let’s say we don’t have enough images of corgis in a dog breeds dataset, and because of that, our model does not perform well when presented with corgis.
With active learning we prioritize getting corgi data into our training set, retraining, and redeploying. By doing this the model is learning from data that was most likely to fool it. Indeed, the heart of active learning is in the name: always working to help your model learn as you find out its strengths and weaknesses.
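A common way to find the data most likely to fool a model is least-confidence sampling: rank new images by the model's top prediction confidence and label the least confident ones first. The sketch below is one such strategy under stated assumptions, not a description of how any particular platform implements active learning. The function name, file names, and confidence scores are made up for the example.

```python
def select_for_labeling(predictions, budget=2):
    """Pick the `budget` images the model is least confident about.

    `predictions` is a list of (image_id, top_confidence) pairs, where
    top_confidence is the model's highest class score for that image.
    """
    ranked = sorted(predictions, key=lambda item: item[1])
    return [image_id for image_id, _ in ranked[:budget]]


# Hypothetical confidences from a deployed dog breed model.
preds = [
    ("lab_01.jpg", 0.97),
    ("corgi_07.jpg", 0.41),
    ("pug_03.jpg", 0.88),
    ("corgi_12.jpg", 0.35),
]
print(select_for_labeling(preds))  # ['corgi_12.jpg', 'corgi_07.jpg']
```

The selected images would then be labeled, added to the training set, and used to retrain and redeploy the model.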
Models are optimized for the variables and parameters that existed at the time they were created. As the world around us changes, a model will encounter environments it was not originally built for. With active learning, we can expose it to new data, retrain it, and redeploy it.
Roboflow has an Upload API where you can programmatically collect and send real world images directly from your application to improve your model.
Perform model error analysis
When you are finished training your model on Roboflow, you are provided with the training results, as well as the ability to view more details and generate visualizations. The visualizations show which images in the validation and testing datasets the model performed poorly on compared to the ground truth.
This information can help to validate proposed labels and to understand whether you need additional or different types of data, such as images from different environments, before you retrain the model. You can also find issues like missing or mislabelled data.
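If you want to do a similar analysis by hand, a per-class error rate is a simple starting point. The sketch below assumes a classification setup with one true label and one predicted label per image. Object detection would also need box matching, which is omitted here. The function name and labels are hypothetical.

```python
from collections import defaultdict


def per_class_error_rate(ground_truth, predicted):
    """Return the fraction of images of each true class that were misclassified."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for truth, pred in zip(ground_truth, predicted):
        totals[truth] += 1
        if pred != truth:
            errors[truth] += 1
    return {label: errors[label] / totals[label] for label in totals}


# Hypothetical validation results.
truth = ["corgi", "corgi", "corgi", "corgi",
         "labrador", "labrador", "labrador", "labrador"]
preds = ["corgi", "labrador", "labrador", "labrador",
         "labrador", "labrador", "labrador", "corgi"]
print(per_class_error_rate(truth, preds))  # {'corgi': 0.75, 'labrador': 0.25}
```

A class with a much higher error rate than the others, like corgi here, is a good candidate for more data or a closer look at its labels.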
Run a health check
The health check feature on Roboflow displays measurements of class balance or imbalance. To continue with our dog example, this could let us know if we have too few corgis and that we need to do some active learning to improve our performance on corgi data. The number of null images is also helpful to know, because null images help our model recognize when the objects we are looking for are not present.
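To get a feel for what such a check measures, here is a minimal sketch that counts annotations per class and counts null images. It is not Roboflow's implementation. The function name, the annotation format (a dict from image name to a list of labels), and the 10% `min_share` threshold are assumptions for the example.

```python
from collections import Counter


def class_balance(annotations, min_share=0.10):
    """Summarize class balance for a dataset.

    `annotations` maps each image name to its list of class labels.
    An empty list marks a null image (no objects present).
    Returns (counts per class, underrepresented classes, number of null images).
    """
    counts = Counter(label for labels in annotations.values() for label in labels)
    total = sum(counts.values())
    underrepresented = sorted(
        label for label, count in counts.items() if count / total < min_share
    )
    null_images = sum(1 for labels in annotations.values() if not labels)
    return counts, underrepresented, null_images


# Hypothetical annotations for six images.
annotations = {
    "a.jpg": ["labrador", "labrador"],
    "b.jpg": ["labrador", "pug"],
    "c.jpg": ["pug", "labrador"],
    "d.jpg": ["corgi"],
    "e.jpg": [],
    "f.jpg": ["labrador", "pug", "labrador", "pug", "labrador"],
}
counts, underrepresented, null_images = class_balance(annotations)
print(underrepresented)  # ['corgi']
print(null_images)       # 1
```

In this example corgis make up 1 of 12 annotations, which falls below the 10% threshold and flags the class for more data collection.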
Removing duplicate images
Having duplicate images in a data set introduces bias because it gives a model opportunities to learn patterns specific to the duplicates. Duplicate images get a disproportionate amount of training time.
Let's say you have 10 images of a corgi but five of them are the same. Your model might be able to identify some corgis but not others. This may be because your model is learning more specific features about the corgi that appears multiple times in your dataset. The model isn't generalizing well enough to corgis; it is instead learning about a specific photo.
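If you manage images yourself, exact duplicates can be found by hashing each file's contents. The sketch below is one simple approach and assumes the images are already loaded as raw bytes, for example with `Path.read_bytes()`. It only catches byte-for-byte copies. Resized or re-encoded copies would need a perceptual hash instead. The function name and file names are hypothetical.

```python
import hashlib


def find_exact_duplicates(images):
    """Find byte-for-byte duplicate images.

    `images` maps each file name to its raw bytes.
    Returns a list of (duplicate_name, original_name) pairs.
    """
    first_seen = {}
    duplicates = []
    for name in sorted(images):
        digest = hashlib.sha256(images[name]).hexdigest()
        if digest in first_seen:
            duplicates.append((name, first_seen[digest]))
        else:
            first_seen[digest] = name
    return duplicates


# Stand-in bytes for three image files; the first and third are identical.
images = {
    "corgi_1.jpg": b"pixels-A",
    "corgi_2.jpg": b"pixels-B",
    "corgi_3.jpg": b"pixels-A",
}
print(find_exact_duplicates(images))  # [('corgi_3.jpg', 'corgi_1.jpg')]
```

Keeping the first copy of each image and dropping the rest stops one photo from getting a disproportionate share of training time.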
Ensuring duplicates do not span train, validation, and test sets
We need to make sure copies of the same image do not end up in more than one of the train, validation, and test splits, since their presence will also bias the evaluation metrics.
Let's return to the corgi example. If the same photo of a corgi appears in your training split and in your validation or test split, your model may report better performance than it will show in production with other corgis. Your validation and test results are skewed because the model correctly identifies an image it has already seen in training. Remember: remove duplicates before you train.
Thankfully, Roboflow automatically removes duplicates during the upload process, so if you’re using Roboflow to upload images you won’t need to worry about this part of the mitigation.
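If you build your splits outside of Roboflow, a quick leakage check is to hash every image and look for hashes shared between splits. The sketch below assumes each split is a dict from file name to raw bytes and, like any content hash, only detects exact copies. The function name and split names are hypothetical.

```python
import hashlib
from itertools import combinations


def find_split_leaks(splits):
    """Count identical images shared between each pair of splits.

    `splits` maps a split name to a dict of {file name: raw bytes}.
    Returns {(split_a, split_b): number_of_shared_images} for leaking pairs.
    """
    hashes = {
        split: {hashlib.sha256(data).hexdigest() for data in files.values()}
        for split, files in splits.items()
    }
    leaks = {}
    for split_a, split_b in combinations(sorted(hashes), 2):
        shared = hashes[split_a] & hashes[split_b]
        if shared:
            leaks[(split_a, split_b)] = len(shared)
    return leaks


# Stand-in bytes: the same corgi photo sits in both train and test.
splits = {
    "train": {"c1.jpg": b"A", "c2.jpg": b"B"},
    "valid": {"c3.jpg": b"C"},
    "test": {"c4.jpg": b"A"},
}
print(find_split_leaks(splits))  # {('test', 'train'): 1}
```

An empty result means no exact copies cross split boundaries. Any reported pair should be resolved before you trust the evaluation metrics.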
Discuss & Share
You can discuss or ask questions about improving your model and how to make it less biased at discuss.roboflow.com. We also highly encourage our readers to add their datasets and models to Roboflow Universe. (We are desperately in need of corgis.)