Machine learning, the software discipline of mapping inputs to outputs without explicitly programmed relationships, requires substantial computational resources. Traditionally, this has limited where machine learning models can run to very powerful machines. But this is changing.

Computation is required at two core moments in the machine learning development lifecycle: model training and model inference.

Model training is far more resource-intensive than model inference. Training a model requires uncovering complex relationships between inputs and outputs through intensive trial and error. Model inference, on the other hand, only requires making use of previously discovered relationships. Thus, model inference can run on significantly fewer computational resources.
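
To make that asymmetry concrete, here is a minimal sketch in plain Python. It uses a toy one-variable linear model, which is an illustrative assumption and not a model discussed in this post. Training loops over the data thousands of times to discover a weight and a bias, while inference is a single multiply and add that reuses what was learned. The data, learning rate, and epoch count are all made up for the example.

```python
# Toy data generated from y = 2x + 1 (illustrative only).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]


def train(xs, ys, lr=0.05, epochs=2000):
    """Fit y = w*x + b by gradient descent; returns (w, b, update_steps)."""
    w, b = 0.0, 0.0
    steps = 0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
        steps += 1
    return w, b, steps


def infer(w, b, x):
    """Inference reuses the learned parameters: one multiply and one add."""
    return w * x + b


w, b, steps = train(xs, ys)      # thousands of passes over the data
prediction = infer(w, b, 10.0)   # a single cheap evaluation
```

Real models have millions of parameters rather than two, but the shape of the work is the same: many passes over the data to train, one forward pass to predict.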

At the same time, increasingly powerful models are also shrinking in size. (For example, the weights for YOLOv4-tiny are only 23.1MB.) Moreover, hardware is becoming cheaper and faster. (For example, the NVIDIA Jetson now comes in a 'Nano' unit, which includes a 128-core NVIDIA Maxwell GPU and costs less than $100.)
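
As a rough back-of-the-envelope check on what a 23.1MB weights file implies, the sketch below assumes the weights are stored as 32-bit floats (4 bytes each) and that 1MB means 10^6 bytes. Both are assumptions, not facts read from the file itself. It also estimates the footprint if the same parameters were stored in 8 bits, which is a common way to shrink models but not something measured in this post.

```python
def estimate_param_count(file_size_mb, bytes_per_param=4):
    """Approximate parameter count from a weights file size (1 MB = 10**6 bytes)."""
    return round(file_size_mb * 1_000_000 / bytes_per_param)


def estimate_size_mb(param_count, bytes_per_param):
    """Approximate weights size in MB for a given storage width."""
    return param_count * bytes_per_param / 1_000_000


params = estimate_param_count(23.1)          # assumes float32 storage
int8_size_mb = estimate_size_mb(params, 1)   # hypothetical 8-bit storage
```

Under those assumptions the file holds roughly 5.8 million parameters, and 8-bit storage would bring the weights to roughly a quarter of their original size.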

It is the confluence of better and smaller models plus cheaper yet more powerful hardware that gives rise to embedded machine learning.

What is embedded machine learning?

Embedded machine learning is deploying machine learning algorithms to run on microcontrollers and other small, low-power computers. This includes running a neural network on a Raspberry Pi, NVIDIA Jetson, Intel Movidius, or Luxonis OAK. Embedded machine learning is a type of edge computing: running algorithms on end-user computational resources rather than in a central data center (the cloud).

Some computers built for embedded machine learning can be used for computer vision, where you infer information from visual data (video and images).

Advantages of embedded machine learning

Embedded machine learning can offer a few key advantages compared to cloud-based processing:

  • Speed: Without a round-trip to a server for predictions, model inputs and outputs can be provided much more quickly.
  • Connectivity: An internet connection is not required for embedded machine learning. This means you can deploy your model somewhere without having to set up a network (e.g. in a vast field to analyse crops).
  • Privacy: All data processing happens directly on the device where the user is present, meaning the input data stays local.
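
The speed advantage above can be sketched as a simple latency budget. All of the numbers below are hypothetical placeholders, not measurements: real round-trip and inference times depend on your network, model, and hardware. The point is only that a slower on-device model can still respond sooner once the network round trip is removed.

```python
def cloud_latency_ms(network_round_trip_ms, server_inference_ms):
    """Total time per prediction when inputs are sent to a remote server."""
    return network_round_trip_ms + server_inference_ms


def edge_latency_ms(device_inference_ms):
    """Total time per prediction when the model runs on the device itself."""
    return device_inference_ms


def max_fps(latency_ms):
    """Upper bound on sequential predictions per second at a given latency."""
    return 1000.0 / latency_ms


# Hypothetical values chosen only for illustration.
cloud_ms = cloud_latency_ms(network_round_trip_ms=120.0, server_inference_ms=15.0)
edge_ms = edge_latency_ms(device_inference_ms=60.0)
cloud_fps = max_fps(cloud_ms)
edge_fps = max_fps(edge_ms)
```

With these placeholder values the edge device answers in 60 ms against 135 ms for the cloud path, even though its inference step is four times slower than the server's.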

Embedded machine learning constraints

With that said, embedded machine learning also introduces constraints. Namely, models must be smaller, which often results in lower accuracy. Moreover, incorporating active learning, which accelerates model improvement, can be more challenging because inputs for model retraining may arrive late or not at all.
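
One way to cope with delayed retraining inputs is to hold uncertain samples on the device and upload them whenever a connection appears. The sketch below is a hypothetical illustration of that idea, not a Roboflow API: the `RetrainingBuffer` class, the confidence threshold, and the `upload` callback are all invented for the example.

```python
from collections import deque


class RetrainingBuffer:
    """Holds low-confidence samples on the device until they can be uploaded."""

    def __init__(self, confidence_threshold=0.5, max_items=1000):
        self.confidence_threshold = confidence_threshold
        # A bounded deque drops the oldest samples when storage is full.
        self.samples = deque(maxlen=max_items)

    def observe(self, sample_id, confidence):
        """Keep only samples the model was unsure about."""
        if confidence < self.confidence_threshold:
            self.samples.append((sample_id, confidence))

    def flush(self, upload, is_online):
        """Upload buffered samples if a connection exists; return how many were sent."""
        if not is_online:
            return 0
        sent = 0
        while self.samples:
            upload(self.samples.popleft())
            sent += 1
        return sent


buffer = RetrainingBuffer(confidence_threshold=0.5, max_items=3)
for sample_id, confidence in [("a", 0.9), ("b", 0.4), ("c", 0.2), ("d", 0.45), ("e", 0.1)]:
    buffer.observe(sample_id, confidence)

uploaded = []
sent_offline = buffer.flush(uploaded.append, is_online=False)  # nothing leaves the device
sent_online = buffer.flush(uploaded.append, is_online=True)    # buffered samples are sent
```

The bounded buffer reflects the storage limits of small devices: when it fills up, the oldest uncertain samples are discarded, which is one reason retraining data from the edge can be incomplete.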

How to use embedded machine learning in computer vision

If you've determined embedded machine learning is the best option for implementing your use case, the next key steps are:

  1. Collect a dataset
  2. Develop a model
  3. Select hardware appropriate for your task
  4. Deploy to your hardware
  5. Implement a system for continued model improvement

We'll focus the remainder of this post on building and deploying computer vision models, specifically.

Deploying Computer Vision Models to the Edge

In order to deploy a computer vision model to an edge device, that edge device must be set up with the requisite dependencies a given model expects.

For example, if you're using TensorFlow to run on an NVIDIA Jetson, that NVIDIA Jetson must be configured with the correct CUDA drivers to support the version of TensorFlow you're running. The same is true for any other framework: PyTorch, Caffe, Darknet, etc.
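
A quick way to confirm that a device has what a model expects is to check for the framework before trying to load anything. The sketch below is one possible check, assuming a Python environment and, for the GPU part, TensorFlow 2.x. The helper names `find_missing_modules` and `tensorflow_gpu_report` are made up for this example.

```python
import importlib.util


def find_missing_modules(module_names):
    """Return the names from module_names that cannot be imported here."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


def tensorflow_gpu_report():
    """Describe whether TensorFlow is installed and can see a GPU."""
    if importlib.util.find_spec("tensorflow") is None:
        return "tensorflow is not installed"
    # Imported lazily so this file still loads on devices without TensorFlow.
    import tensorflow as tf
    gpus = tf.config.list_physical_devices("GPU")  # assumes TensorFlow 2.x
    return f"tensorflow {tf.__version__}: {len(gpus)} GPU(s) visible"


# Frameworks this hypothetical deployment expects to find on the device.
missing = find_missing_modules(["tensorflow", "torch"])
```

Calling `tensorflow_gpu_report()` on the device then shows whether TensorFlow can see the GPU. If it reports zero GPUs on a Jetson, a driver or CUDA version mismatch is one likely cause worth checking.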

Managing dependencies on a given edge device is often a great place to use Docker. (Note: we've written about how to use GPUs with Docker previously.)
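
As a rough illustration of what running a containerized model looks like, the sketch below assembles a `docker run` command in Python without executing it. The image name and port are hypothetical placeholders. The `--runtime nvidia` option assumes the NVIDIA container runtime is installed on the device, as is typical on Jetson system images; other setups may need a different GPU option.

```python
import shlex


def build_docker_run_command(image, port=None, use_nvidia_runtime=True):
    """Assemble (but do not run) a docker command for a GPU-enabled container."""
    command = ["docker", "run", "--rm"]
    if use_nvidia_runtime:
        # Assumes the NVIDIA container runtime is registered with Docker.
        command += ["--runtime", "nvidia"]
    if port is not None:
        # Publish the same port number on the host and inside the container.
        command += ["-p", f"{port}:{port}"]
    command.append(image)
    return command


# "example/vision-model:latest" is a placeholder, not a real published image.
command = build_docker_run_command("example/vision-model:latest", port=8080)
printable = shlex.join(command)
```

Keeping the command as a list makes it safe to hand to `subprocess.run` later, while `shlex.join` gives a copy-pasteable string for logs or documentation.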

Once environments are set up with the correct drivers, you may need to build your model framework (in the specific version you require) on the edge device. Building a framework like Darknet or TensorFlow can take 14 hours (or longer) in our experience.

Once environments are set up with the correct dependencies and a model framework is built, a model can be deployed to your edge device of interest.

At Roboflow, we've also released Docker containers for running computer vision models on your NVIDIA Jetson.

(See our post: Introducing Roboflow Support for NVIDIA Jetson.)

This takes a lot of the guesswork out of getting configurations correct so that your models run consistently and with high performance.

As always, good luck building!

Frequently Asked Questions

What embedded devices can be used for machine learning?

Machine learning and computer vision inference can run on many embedded devices, from general-purpose computing boards like the Raspberry Pi to purpose-built devices like the NVIDIA Jetson. The Raspberry Pi, Jetson, and Luxonis OAK are commonly used for embedded computer vision projects.

In what locations might an embedded computer vision device be deployed?

In computer vision use cases, an embedded device could be deployed anywhere you need to make inferences. For example, you could deploy an embedded device in a factory to manage the quality of a product, or in a restricted area to ensure all workers are wearing the appropriate safety equipment.