In this tutorial, we will train the state-of-the-art EfficientNet convolutional neural network to classify images, using a custom dataset and custom classes. To run this tutorial on your own dataset, you only need to change one line of code for your dataset import.
Today, we will train EfficientNet using the Keras framework in Google Colab. We train our classifier to recognize rock, paper, scissors hand gestures, but the tutorial is written generally, so you can use this approach to classify your images into any set of classes, given the right supervision in your dataset.
The Custom Classification Task
Given an image, we want to assign it to one class from a set of potential class labels. Our model will form features from the image, pass these features through a deep neural network, and output a series of probabilities, one per class, giving the likelihood that the image belongs to that class. We take the class with the highest output probability as the model's prediction.
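As a minimal sketch of that last step, the snippet below turns raw network scores into probabilities with a softmax and picks the most likely class. The class names, their order, and the scores are made up for illustration; they are not outputs of the notebook's model.

```python
import math

# Hypothetical label order; in practice this comes from your dataset.
CLASS_NAMES = ["paper", "rock", "scissors"]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(scores, class_names=CLASS_NAMES):
    """Return the class with the highest probability, and that probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]


# Illustrative scores only.
label, confidence = predict_label([0.3, 2.1, -1.0])
print(label, round(confidence, 3))
```

The highest score always maps to the highest probability, so the softmax changes how confident the output looks but not which class wins.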
In our tutorial, we will train a model to classify the rock, paper, scissors hand gestures from the popular game.
Why Use EfficientNet for Classification
EfficientNet is a state-of-the-art convolutional neural network, released as open source by Google Brain. The primary contribution of EfficientNet was to thoroughly test how to efficiently scale the size of convolutional neural networks. For example, one could make a ConvNet larger by increasing the width of its layers, the number of layers (depth), the input image resolution, or a combination of all of those levers.
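To make those levers concrete, here is a small sketch of the compound scaling rule described in the EfficientNet paper, where a single coefficient phi grows depth, width, and resolution together. The constants are the ones reported in the paper for the B0 baseline; the released B1 to B3 models use rounded, hand-adjusted values, so treat this as an illustration of the rule rather than the exact model configurations.

```python
# Coefficients reported in the EfficientNet paper for the B0 baseline.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # depth, width, resolution


def compound_scale(phi):
    """Return (depth, width, resolution) multipliers for coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi


def flops_factor(phi):
    """Approximate growth in FLOPS: depth * width^2 * resolution^2."""
    depth, width, resolution = compound_scale(phi)
    return depth * width ** 2 * resolution ** 2


for phi in range(4):
    d, w, r = compound_scale(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, "
          f"resolution x{r:.2f}, FLOPS x{flops_factor(phi):.2f}")
```

The constants were chosen so that each increment of phi roughly doubles the compute cost, which is why larger variants train noticeably slower.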
EfficientNet forms the backbone of the state-of-the-art object detector EfficientDet. Object detection goes one step further to localize as well as classify objects in an image. If you are searching for localization, I recommend this tutorial on how to train EfficientDet, this YOLOv4 Tutorial, or this YOLOv5 Tutorial.
At the time of writing, EfficientNet is among the most performant convolutional neural networks for classification. Image classifiers are typically benchmarked on ImageNet, an image database organized according to the WordNet hierarchy, containing millions of labeled images.
As you can see in the table, 4 out of the top 5 approaches to the ImageNet task are based on EfficientNet.
As a nice added bonus, the EfficientNet models we use in this tutorial have been pretrained on ImageNet, meaning that they already have a solid understanding of general features used to classify images.
Let's get started!
Import EfficientNet Dependencies
As you work through this tutorial, I recommend opening this Colab Notebook Training EfficientNet in tandem. This notebook is based on the original tutorial by DLology. It has been updated to fix software versioning, and dataset import and creation now flow through Roboflow.
First, make a copy of the notebook so you can edit it, and make sure you select Runtime Type --> GPU to utilize free GPU resources.
The first step we take in the notebook is to select the correct TensorFlow environment; the codebase still runs on tensorflow 1.x. We also check our Keras version; in this pass we are using keras 2.3.1. Then we import some packages and clone the EfficientNet Keras repository.
Import EfficientNet and Choose EfficientNet Model
The biggest contribution of EfficientNet was to study how ConvNets can be efficiently scaled up. In this notebook, you can take advantage of that fact!
In the line from efficientnet import EfficientNetB0 as Net you can choose the size of model you would like to use. Choose from among EfficientNetB0, EfficientNetB1, EfficientNetB2, EfficientNetB3. The larger the model, the better the performance, but watch out: training slows down with larger models, and you may run out of GPU memory on the free Colab GPUs.
Next, before loading the model, we choose the input resolution. We start with 150 x 150 here to conserve GPU memory and to get a feel for the classification script, but it may be useful to scale this up for your task later.
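As a rough sketch of that trade-off, the snippet below compares the pixel count of a 150 x 150 input with the default resolutions of the released models. The resolution table is an assumption taken from the published EfficientNet configurations, not from the notebook, and pixel count is only a proxy for memory use and training time.

```python
# Default input resolutions of the released models (assumption: taken from
# the published EfficientNet configurations, not from the notebook).
NATIVE_RESOLUTION = {
    "EfficientNetB0": 224,
    "EfficientNetB1": 240,
    "EfficientNetB2": 260,
    "EfficientNetB3": 300,
}


def input_shape(side, channels=3):
    """Shape of one RGB input image as (height, width, channels)."""
    return (side, side, channels)


def relative_pixels(side, baseline=150):
    """How many times more pixels a square input has than the baseline."""
    return (side * side) / (baseline * baseline)


print(input_shape(150))
for name, side in NATIVE_RESOLUTION.items():
    print(f"{name}: {side}x{side} has {relative_pixels(side):.2f}x "
          f"the pixels of 150x150")
```

Doubling the side length quadruples the number of pixels per image, so a resolution increase usually needs to be paired with a smaller batch size.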
Creating a Custom EfficientNet Training Dataset
Now, you can import your own data to use transfer learning to teach EfficientNet to classify images into your custom classes.
If you are just following along with the tutorial, we recommend using this public rock, paper, scissors dataset.
If you are bringing your own data, we recommend uploading it to Roboflow for preprocessing and augmentations. First, you can sign up for a free account. Then click Create Dataset and drag and drop your images to upload. Make sure your images are in the following folder structure.
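A minimal sketch for sanity-checking a dataset folder before upload, assuming the common layout of one subfolder per class (the layout that Keras directory-based generators read). The class names and file names are hypothetical, and the demo builds a throwaway folder so it runs anywhere.

```python
import os
import tempfile

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def count_images_per_class(root):
    """Map each class subfolder of `root` to its number of image files."""
    counts = {}
    for entry in sorted(os.listdir(root)):
        class_dir = os.path.join(root, entry)
        if not os.path.isdir(class_dir):
            continue  # ignore stray files next to the class folders
        counts[entry] = sum(
            1 for name in os.listdir(class_dir)
            if name.lower().endswith(IMAGE_EXTENSIONS)
        )
    return counts


# Build a tiny throwaway dataset so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as root:
    for class_name, n_images in [("paper", 2), ("rock", 3), ("scissors", 1)]:
        os.makedirs(os.path.join(root, class_name))
        for i in range(n_images):
            open(os.path.join(root, class_name, f"{i}.jpg"), "w").close()
    demo_counts = count_images_per_class(root)

print(demo_counts)
```

A class with zero or very few images is worth fixing before training, since the model cannot learn a label it rarely sees.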
Then you can choose your dataset version settings.
Download your dataset for import into the Colab notebook with a
Once we have the download curl link, we can replace the one line in the notebook where it says "Your Link Here".
For the purposes of our tutorial, we are only going to train on a small subset of 25 images, so we filter accordingly. Of course, you will want to scale up training on your own dataset. Then we pass our data through a training generator to prepare the data for training.
Importantly, during the creation of our training generator we set the batch_size that our model will use during training. The higher the batch size, the more images you pass through at a time and the faster training will run, but you may max out the memory that the Colab GPU can handle.
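To see how the batch size interacts with dataset size, here is a small sketch of the arithmetic a generator performs when it splits a dataset into batches. The helper names are hypothetical and the batch sizes are illustrative; only the 25-image count comes from the tutorial.

```python
import math


def steps_per_epoch(num_images, batch_size):
    """Number of batches needed to see every image once."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return math.ceil(num_images / batch_size)


def batches(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# 25 images, as in the small subset above; batch sizes are illustrative.
for batch_size in (4, 8, 16):
    print(batch_size, steps_per_epoch(25, batch_size))
```

Note that the final batch is usually smaller than the rest whenever the dataset size is not a multiple of the batch size.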
✅ Dataset ready to train
Creating a Custom EfficientNet Training Job
Next, we set up the infrastructure to run a training job on our dataset. We choose the number of epochs to train for. The more epochs, the better your model is likely to fit your training data, but the longer training will run.
Next, we set up the network so that its final classification layer has one output for each class in our dataset.
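As a back-of-the-envelope sketch, the snippet below counts the trainable parameters of a classification head, assuming the head is a single fully connected layer on top of pooled EfficientNetB0 features. The 1280-feature width is an assumption about the B0 backbone, and the notebook's actual head may include additional layers such as dropout.

```python
def dense_head_params(num_features, num_classes):
    """Trainable parameters of one dense layer: weights plus biases."""
    return num_features * num_classes + num_classes


# Channels after EfficientNetB0's last conv block (assumption).
B0_FEATURES = 1280
classes = ["paper", "rock", "scissors"]  # hypothetical class list

print(dense_head_params(B0_FEATURES, len(classes)))
```

The head is tiny compared with the pretrained backbone, which is why transfer learning on a small dataset is feasible at all.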
Here is a summary of our model's architecture:
Run Custom EfficientNet Training
Now that we have imported our dataset and set up the training job, we are ready to train our custom classification model!
We reference the train_generator and the number of epochs and kick off training.
If you see your training hanging for more than 10 minutes on Epoch 1/50, then you may have run out of GPU memory, though the error is not exposed. You may need to reduce the size of the training set, reduce the batch size, etc.
As training is running, you want to watch your train loss moving downwards and the validation accuracy increasing. Validation accuracy is measured on a portion of your dataset that the model has never seen.
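A minimal sketch of checking those two trends programmatically, assuming a per-epoch history shaped like the dictionary Keras exposes as History.history. The key names depend on the metric names used when compiling the model, and the numbers below are made up for illustration.

```python
def best_epoch(history, metric="val_acc"):
    """Return (1-based epoch, value) where `metric` peaked."""
    values = history[metric]
    index = max(range(len(values)), key=values.__getitem__)
    return index + 1, values[index]


def loss_is_decreasing(history, metric="loss"):
    """True if the loss at the last epoch is lower than at the first."""
    values = history[metric]
    return values[-1] < values[0]


# Made-up numbers shaped like the dict Keras returns in History.history.
example_history = {
    "loss": [1.10, 0.82, 0.61, 0.50, 0.47],
    "val_acc": [0.40, 0.58, 0.71, 0.69, 0.70],
}

print(best_epoch(example_history))
print(loss_is_decreasing(example_history))
```

If the training loss keeps falling while validation accuracy stalls or drops, the model is likely starting to overfit, and training longer will not help.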
Examine EfficientNet Training Results
After training your custom EfficientNet classification model, you will be able to view the graph of your training job.
Fine Tuning Custom EfficientNet Model
Lastly, we can fine tune the last few layers of our network, hopefully to squeeze out some additional performance. This means that most of the network stays frozen, and only the last few layers, which contribute most directly to the class prediction, are updated.
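The freezing logic can be sketched as follows. The Layer class is a stand-in that mimics only the trainable attribute of a Keras layer, and the layer names and the choice of two unfrozen layers are hypothetical; this is not the notebook's exact code.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    """Stand-in for a Keras layer; only the fields the sketch needs."""
    name: str
    trainable: bool = True


def unfreeze_last(layers, n):
    """Freeze every layer, then mark only the last `n` as trainable."""
    for layer in layers:
        layer.trainable = False
    if n > 0:  # guard: layers[-0:] would select the whole list
        for layer in layers[-n:]:
            layer.trainable = True
    return [layer.name for layer in layers if layer.trainable]


# Hypothetical layer names, for illustration only.
network = [Layer(f"block_{i}") for i in range(1, 6)] + [Layer("classifier")]
print(unfreeze_last(network, 2))
```

With a real Keras model, the same loop would run over model.layers, and the model needs to be compiled again afterwards for the change to take effect.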
Use Custom EfficientNet Model for Inference
And now the best part!
We can run a test image through our custom EfficientNet model for test inference. When we pass the image through the classifier, it returns a series of probabilities, one for each of the possible classes. We take the class with the maximum probability as the model's prediction.
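To map the output back to readable labels, here is a sketch that assumes a name-to-index mapping shaped like the class_indices attribute of a Keras directory generator. The mapping and the probabilities are made up for illustration.

```python
def invert_class_indices(class_indices):
    """Turn {'rock': 1, ...} into {1: 'rock', ...}."""
    return {index: name for name, index in class_indices.items()}


def decode_prediction(probabilities, class_indices):
    """Pair each class name with its probability, most likely first."""
    index_to_name = invert_class_indices(class_indices)
    ranked = sorted(enumerate(probabilities),
                    key=lambda pair: pair[1], reverse=True)
    return [(index_to_name[i], p) for i, p in ranked]


# Shaped like train_generator.class_indices; the probabilities are made up.
class_indices = {"paper": 0, "rock": 1, "scissors": 2}
ranking = decode_prediction([0.05, 0.15, 0.80], class_indices)
print(ranking[0])
```

Keeping the full ranking, rather than only the top class, makes it easier to spot images where the model is torn between two labels.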
Running inference on our test image yields:
Saving Custom EfficientNet Model Weights for Posterity
Lastly, we save our classification model's weights and export them for future use in an application.
We also provide an example in the notebook of how to load the model back in for future inference.
Congratulations! Now you know how to train EfficientNet, a state-of-the-art convolutional neural network, on your own dataset for image classification. You are now able to leverage state-of-the-art deep learning technology to adapt artificial intelligence to new domains.
We hope you enjoyed this tutorial! Happy classifying 🧐 And try an object detection tutorial next!