Transfer learning is a machine learning (ML) technique in which knowledge gained while training on one set of problems is reused to solve other, related problems. It’s very similar to the concept of reusing code in computer programming, except instead of sharing code between different applications or software projects, we’re sharing learned knowledge between different ML systems. This can speed up training and reduce the amount of raw data required to train our models.
Transfer learning is most commonly applied in deep learning, a subfield of artificial intelligence (AI). The idea is that you're leveraging what has already been learned from your existing data, from previous projects or datasets, and applying it in new ways.
Teaching Friends to Skateboard 🛹
Imagine you have two friends you're trying to teach how to skateboard. Neither has skateboarded before. Friend A, call them Anna, has snowboarded in the past and even tried a little surfing. Friend B, call them Brian, has never tried any kind of board sport. Which friend do you expect would pick up skateboarding more quickly?
Knowing nothing else about Brian and Anna, we would likely pick Anna. While Anna has never tried skateboarding specifically, she has participated in other board sports, which are related domains. We may expect that the balance she learned in snowboarding would help her pick up skateboarding more quickly. What's more, Anna may not only learn faster; she may ultimately become a more talented skateboarder than Brian altogether.
In other words, we bet on Anna because we think she will transfer her learnings from snowboarding into becoming a successful skateboarder.
Transfer Learning in Machine Learning 🐶
In machine learning, transfer learning isn't so different. At its core, transfer learning means taking what a given model has learned about one domain and applying it to a related problem.
Imagine we're attempting to teach a model to recognize specific dog breeds: labradors, pugs, corgis, etc. Suppose we have two starting points: a machine learning model that already knows how to recognize dogs in a photo (but not specific breeds) and a machine learning model that knows nothing. We would expect the model that already knows what a dog looks like to learn what a specific breed looks like more quickly, and we'd likely be correct.
Remember, in machine learning, training means adjusting a model's weights. Those weights can either start from completely random values or from some prior set of values. When we train a model from scratch (that is, with no prior images in mind), the weights are randomly initialized.
When we use transfer learning, those weights have values that have been learned from the prior domain problem. For this reason, the model may require less adjustment in its weights to learn a new domain. It may even have some embedded knowledge that the model that started from scratch would never learn.
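To make the difference between the two starting points concrete, here is a minimal sketch in plain Python. It is a toy, not a real vision model: the "model" is a two-weight line fit, and the two tasks (`y = 2.0x + 1.0` and `y = 2.2x + 0.9`), the learning rate, and the loss target are all made-up assumptions chosen for illustration. It counts how many gradient steps the new task needs when starting from random weights versus from the weights learned on the prior task.

```python
import random


def make_data(true_w, true_b, n, rng):
    """Generate n noisy samples of y = true_w * x + true_b (a toy 'task')."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        y = true_w * x + true_b + rng.gauss(0.0, 0.01)
        data.append((x, y))
    return data


def mse(w, b, data):
    """Mean squared error of the line (w, b) on the data."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train(w, b, data, lr=0.1, target_loss=0.001, max_steps=10_000):
    """Full-batch gradient descent from (w, b).

    Returns the final weights and the number of steps taken to reach
    target_loss (or max_steps if the target was never reached).
    """
    n = len(data)
    for step in range(max_steps):
        if mse(w, b, data) <= target_loss:
            return w, b, step
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, max_steps


rng = random.Random(0)

# Hypothetical prior task and a related new task (assumed values).
prior_data = make_data(2.0, 1.0, 200, rng)
new_data = make_data(2.2, 0.9, 200, rng)

# Randomly initialized weights, as when training from scratch.
random_w, random_b = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)

# Learn the prior task first; these weights are what we "transfer".
prior_w, prior_b, _ = train(random_w, random_b, prior_data)

# New task from scratch vs. new task starting from the prior weights.
scratch_w, scratch_b, scratch_steps = train(random_w, random_b, new_data)
transfer_w, transfer_b, transfer_steps = train(prior_w, prior_b, new_data)

print("steps from random weights:", scratch_steps)
print("steps from prior weights: ", transfer_steps)
```

Because the prior weights already sit close to a good solution for the related task, the second run should need fewer steps than the first. In a real deep learning setup the same idea applies to millions of weights rather than two.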
How to Use Transfer Learning 🧪
Note that the key to transfer learning is ensuring that the two problems we're working on are similar enough for what a model learned in one setting to apply to the second setting. Now, what defines "similar enough" is an imperfect science. In general, it depends on how narrowly the original model was tuned to the first domain problem.
If the new problem we're trying to learn is a subdomain of the first problem, transfer learning is a good candidate. Take the dog example above: if we're aiming to learn one specific breed and we already have a model that knows what a dog looks like, there's a good chance transfer learning will help significantly.
If two problems have images that are in similar contexts, transfer learning is likely helpful. For example, if we're aiming to learn what a given object looks like from real world photographs and we already have a model trained on the COCO dataset, we could use the weights from that COCO-trained model as the starting point for learning the new domain problem.
If the new problem is an extension of the first problem, transfer learning will likely help. For example, if we trained a model on 5,000 images and then collected an additional 3,000 images, we could use the weights from the first model (5,000 images) as the starting point when training on the new data (3,000 images). We would expect the model to continue to make marginal improvements.
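The last scenario, picking up training where an earlier model left off, can be sketched in a few lines of plain Python. This is an illustrative toy under stated assumptions: the "images" are replaced by noisy samples of a made-up line (`y = 3x - 1`), the checkpoint is a hypothetical `weights.json` file in a temporary directory, and the step count and learning rate are arbitrary. The point is only the workflow: train, save the weights, then load them as the starting point when more data arrives.

```python
import json
import os
import random
import tempfile


def make_data(n, rng):
    """Toy stand-in for a labeled dataset: noisy samples of y = 3x - 1."""
    data = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        data.append((x, 3.0 * x - 1.0 + rng.gauss(0.0, 0.05)))
    return data


def mse(weights, data):
    """Mean squared error of the line described by `weights` on the data."""
    w, b = weights["w"], weights["b"]
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train(weights, data, steps=20, lr=0.1):
    """Run a fixed number of full-batch gradient steps starting from `weights`."""
    w, b = weights["w"], weights["b"]
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return {"w": w, "b": b}


rng = random.Random(42)
first_batch = make_data(5000, rng)  # the original 5,000 examples
new_batch = make_data(3000, rng)    # the 3,000 examples collected later
held_out = make_data(1000, rng)     # separate data used only for evaluation

# Round 1: train from scratch on the first batch and save the weights.
first_weights = train({"w": 0.0, "b": 0.0}, first_batch)
checkpoint_path = os.path.join(tempfile.mkdtemp(), "weights.json")
with open(checkpoint_path, "w") as f:
    json.dump(first_weights, f)

# Round 2: load the saved weights and keep training on the new batch only.
with open(checkpoint_path) as f:
    starting_weights = json.load(f)
second_weights = train(starting_weights, new_batch)

loss_before = mse(first_weights, held_out)
loss_after = mse(second_weights, held_out)
print("held-out loss after round 1:", loss_before)
print("held-out loss after round 2:", loss_after)
```

Here the second round starts from the saved weights rather than from zero, so the work done on the first 5,000 examples is not thrown away. With a real model, the checkpoint would be whatever weight file your training framework produces.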
Conclusion
Transfer learning is a powerful tool that allows you to reduce the training time of your deep learning models. It can be used in a variety of ways, but its main purpose is to help you train a model faster by leveraging existing knowledge from similar tasks.
Transfer learning is often used when there’s too much data and not enough time to manually label it all before training a model (think labeling millions of images). In this case, transfer learning lets you get by with less labeled data by starting from models that are already trained on other datasets with similar properties. For example, if we have an image dataset with cats and dogs in it and want our new CNN classifier to learn how to distinguish between them, we could leverage an existing CNN trained on ImageNet, which contains more than a million labeled images across 1,000 classes, including many dog breeds and several kinds of cats.
Roboflow makes transfer learning in computer vision a breeze. You can train a model on one dataset and then use those weights on a second dataset. Moreover, Roboflow lets you tap into model weights trained on public datasets, so you can start training from the COCO dataset or any dataset available on Roboflow Public.
As always, happy training!
Cite this Post
Use the following entry to cite this post in your research:
Joseph Nelson. (Jan 13, 2021). A Primer on Transfer Learning. Roboflow Blog: https://blog.roboflow.com/a-primer-on-transfer-learning/