Introducing an Improved Shear Augmentation
At Roboflow, we are committed to providing an efficient and effective pipeline for computer vision. This means revisiting and revamping old product decisions every once in a while.
Today, we introduce a new and improved shear augmentation. We'll walk through some details on the change, as well as some intuition and results backing up our reasoning.
Why Data Augmentation Matters
The shear augmentation is a part of the suite of data augmentation options in the Roboflow platform.
Data augmentation in computer vision is a technique for generating more training data from a base training set by tweaking the base images programmatically. Recently, state-of-the-art computer vision models have been improved through data augmentation.
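To make "tweaking the base images programmatically" concrete, here is a minimal sketch of one of the simplest augmentations, a horizontal flip that also mirrors the bounding boxes. It is an illustration only, not Roboflow's implementation; the function name `horizontal_flip` and the `[x_min, y_min, x_max, y_max]` pixel box format are assumptions made for this example.

```python
import numpy as np

def horizontal_flip(image, boxes):
    """Flip an image left-to-right and mirror its bounding boxes.

    image: array of shape (H, W) or (H, W, C)
    boxes: array of shape (N, 4) as [x_min, y_min, x_max, y_max] in pixels
           (an assumed format for this sketch)
    """
    width = image.shape[1]
    flipped_image = image[:, ::-1]
    flipped_boxes = boxes.astype(float).copy()
    # The old right edge becomes the new left edge, and vice versa.
    flipped_boxes[:, 0] = width - boxes[:, 2]
    flipped_boxes[:, 2] = width - boxes[:, 0]
    return flipped_image, flipped_boxes

# A tiny 3x4 "image" with one box covering its left-most column.
image = np.arange(12).reshape(3, 4)
boxes = np.array([[0, 0, 1, 2]])
aug_image, aug_boxes = horizontal_flip(image, boxes)
```

Each augmented copy is a new training example whose labels are still correct, which is what lets a small base set stretch further.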
Of course, the exact details behind an augmentation can be debated, explored, and improved - as we will see in the rest of this article.
The New Shear Augmentation
The Old Shear Augmentation
Our previous shear methodology was to shear an image from both corners with some probability in the x and y direction like so:
We noticed that downstream model performance rarely improved with this methodology.
The Intuition
Our intuition was that the old shear augmentation stretched target objects too much, leaving a bounding box that enclosed a lot of space outside the target. Furthermore, shearing from two directions does not correspond directly to perspective changes in the real world.
Shearing from a single corner does, however.
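The loose-box effect can be sketched with a little geometry. The snippet below applies a plain affine shear to the four corners of a box and then takes the axis-aligned box that encloses them. This is a simplified model for illustration, not Roboflow's actual shear code; `shear_box`, `box_area`, and the shear factors are hypothetical names and values chosen for the example.

```python
import numpy as np

def shear_box(box, shear_x=0.0, shear_y=0.0):
    """Shear a box's corners and return the enclosing axis-aligned box.

    box: (x_min, y_min, x_max, y_max)
    The shear maps (x, y) to (x + shear_x * y, y + shear_y * x).
    """
    x_min, y_min, x_max, y_max = box
    corners = np.array(
        [[x_min, y_min], [x_max, y_min], [x_max, y_max], [x_min, y_max]],
        dtype=float,
    )
    shear = np.array([[1.0, shear_x], [shear_y, 1.0]])
    sheared = corners @ shear.T  # apply the shear to every corner
    new_min = sheared.min(axis=0)
    new_max = sheared.max(axis=0)
    return tuple(float(v) for v in (new_min[0], new_min[1], new_max[0], new_max[1]))

def box_area(box):
    return (box[2] - box[0]) * (box[3] - box[1])

original = (10, 10, 30, 30)
one_axis = shear_box(original, shear_x=0.5)
two_axes = shear_box(original, shear_x=0.5, shear_y=0.5)

print(box_area(original), box_area(one_axis), box_area(two_axes))
```

In this toy case the enclosing box grows from 400 to 600 square pixels with a shear on one axis and to 900 with the same shear on both axes, while the sheared object itself covers less area than before. The extra area is background that the label now claims as part of the object.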
The New Shear Augmentation
The new shear augmentation shears probabilistically in the x and y directions from a single corner, like so:
The corner can change at random, simulating real-world perspective changes. Bounding boxes stay tighter after augmentation.
The Results
To verify the improvement, we ran two automatic object detection training jobs on the public BCCD dataset: one sheared with the old augmentation and one with the new.
The results show that the new shear improves performance from 89% mAP to 91% mAP on this test dataset. However, both are lower than the vanilla performance of 95%. This is likely because the blood cell dataset consists of straight-on microscope photos whose perspective rarely changes.
These results show why it is important to run multiple augmentation experiments on your base training set to find which set of augmentations performs best for your object detection task.
As always, happy augmenting!