Open Source Computer Vision Deployment with Roboflow Inference
Published Aug 16, 2023 • 5 min read

Today, we are open sourcing the Roboflow Inference Server: our solution for using and deploying computer vision models in production, used to power millions of production model inferences. We are also announcing Roboflow Inference, an opinionated framework for creating standardized APIs around computer vision models.

Roboflow Deploy powers millions of daily inferences across thousands of models for hundreds of customers (including some of the world’s largest companies), and now we’re making the core technology available to the community under a permissive, Apache 2.0 license.

GitHub - roboflow/inference: An opinionated tool for running inference on state-of-the-art computer vision models.

We hope this release accelerates the graduation of cutting-edge computer vision models from the realm of research and academia into the world of real applications powering real businesses.

pip install inference

Roboflow Inference lets you easily get predictions from computer vision models through a simple, standardized interface. It supports a variety of model architectures for tasks like object detection, instance segmentation, single-label classification, and multi-label classification. It works seamlessly with custom models you've trained or deployed with Roboflow, as well as the tens of thousands of fine-tuned models shared by our community.

To install the package on a CPU device, run:

pip install inference

To install the package on a GPU device, run:

pip install inference-gpu
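
Once installed, getting a prediction in Python looks roughly like the sketch below. The get_roboflow_model helper and its import path follow the project README at the time of writing; treat them as assumptions and check the repository for the current API:

import requests  # not required here, but the package uses HTTP under the hood

# A minimal usage sketch. The helper name and import path are assumptions
# based on the project README; check roboflow/inference for the current API.
from inference.models.utils import get_roboflow_model

model = get_roboflow_model(
    model_id="your-project/1",        # placeholder model ID
    api_key="YOUR_ROBOFLOW_API_KEY",  # placeholder API key
)

# infer() accepts an image reference (for example, a URL or a local path).
results = model.infer("https://example.com/image.jpg")
print(results)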

Supported Fine-Tuned Models

Currently, Roboflow Inference has plugins implemented to serve models for the following tasks:

Object Detection

Instance Segmentation

Single-Label Classification

Multi-Label Classification

The next models to be supported will be the Autodistill base models. We'll add more models based on customer and community demand, so if there's a model you'd like to see added, please open an issue (or submit a PR)!

Implementing New Models

Roboflow Inference is designed with extensibility in mind. Adding your own proprietary model is as simple as implementing an infer function that accepts an image and returns a prediction.

We will be publishing documentation on how to add new architectures to inference soon!
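
Until that documentation lands, here is a rough sketch of the pattern. Every name in this example is hypothetical; the actual base classes and registration hooks are defined by the inference package:

import numpy as np

class DummyBoxModel:
    """Stand-in for a proprietary model; returns one fixed detection."""

    def __call__(self, image: np.ndarray) -> list:
        height, width = image.shape[:2]
        return [
            {"x": width / 2, "y": height / 2, "w": width / 4,
             "h": height / 4, "label": "object", "score": 0.9}
        ]

class MyProprietaryModel:
    """Hypothetical wrapper exposing a model through a standard infer()."""

    def __init__(self):
        self.model = DummyBoxModel()  # swap in your real model here

    def infer(self, image: np.ndarray) -> list:
        # Normalize the raw outputs into a standardized prediction format.
        return [
            {
                "x": d["x"], "y": d["y"],
                "width": d["w"], "height": d["h"],
                "class": d["label"], "confidence": d["score"],
            }
            for d in self.model(image)
        ]

predictions = MyProprietaryModel().infer(np.zeros((480, 640, 3), dtype=np.uint8))
print(predictions)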

Foundation Models

Support for generic models like CLIP and SAM is already implemented. These models often complement fine-tuned models (for example, see how Autodistill uses foundation models to train supervised models).

We plan to add other generic models soon for tasks like OCR, pose estimation, captioning, and visual question answering.
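
As an illustration, here is how a CLIP image embedding might be requested from a running Inference Server (introduced in the next section). The /clip/embed_image route and payload shape are assumptions based on the server's documentation, so verify them against the current schema:

import requests

# Illustrative only: the /clip/embed_image route and payload shape are
# assumptions; check the roboflow/inference docs for the current schema.
payload = {
    "api_key": "YOUR_ROBOFLOW_API_KEY",
    "image": {"type": "url", "value": "https://example.com/image.jpg"},
}

res = requests.post("http://localhost:9001/clip/embed_image", json=payload)
print(res.json())  # expected to contain the image embedding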

The Inference Server

The Roboflow Inference Server is an HTTP microservice interface for inference. It supports many different deployment targets via Docker and is optimized to route and serve requests from edge devices or via the cloud in a standardized format. (If you’ve ever used Roboflow’s Hosted API, you’ve already used our Inference Server!)
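
For example, assuming Roboflow's published CPU container image, you can start a local server with Docker (see the repository for GPU and device-specific images and flags):

docker run -it --rm -p 9001:9001 roboflow/roboflow-inference-server-cpu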

Additionally, when you want to go beyond the basic functionality, the inference server has plug-ins that seamlessly integrate with Roboflow’s platform for model management, automated active learning, advanced monitoring, and device administration.

Getting predictions from your model is as simple as sending an HTTP POST request:

import requests

BASE_URL = "http://localhost:9001"

# Placeholder values; replace these with your own model and request settings.
model_id = "your-project/1"
api_key = "YOUR_ROBOFLOW_API_KEY"
confidence = 0.5
overlap = 0.5
image_url = "https://example.com/image.jpg"
max_detections = 10

res = requests.post(
    f"{BASE_URL}/{model_id}?"
    + "&".join(
        [
            f"api_key={api_key}",
            f"confidence={confidence}",
            f"overlap={overlap}",
            f"image={image_url}",
            f"max_detections={max_detections}",
        ]
    )
)

print(res.json())

Where:

  • model_id: The ID of your model on Roboflow. Refer to the Roboflow documentation to find your model ID.
  • api_key: Your Roboflow API key. Learn how to retrieve your Roboflow API key.
  • confidence: The minimum confidence level that must be met for a prediction to be returned.
  • overlap: The maximum overlap (IoU) allowed between predictions before lower-confidence ones are removed during non-maximum suppression.
  • image_url: The URL of the image on which you want to run inference. This can also be a base64 string or a NumPy array.
  • max_detections: The maximum number of detections to return.
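
For an object detection model, res.json() returns a structure roughly like the following, where x and y are the center coordinates of each bounding box (field names reflect Roboflow's standard detection format; the values here are purely illustrative):

{
    "time": 0.21,
    "image": {"width": 1280, "height": 720},
    "predictions": [
        {
            "x": 320.5,
            "y": 240.0,
            "width": 64.0,
            "height": 48.0,
            "confidence": 0.92,
            "class": "helmet"
        }
    ]
}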

For more information on getting started, check out the Inference Quickstart.

Roboflow Managed Inference

While some users choose to self-host the Inference Server for network, privacy, and compliance purposes, Roboflow also offers our Hosted API as a fully turn-key serverless inference solution. It already serves millions of inferences per day, powering rapid prototyping and supporting mission-critical systems in industries ranging from manufacturing to healthcare.

At scale, we also manage dedicated Kubernetes clusters of auto-scaling GPU machines so that our customers don't need to allocate valuable MLOps resources to scaling their computer vision model deployment. We have tuned our deployments to maximize GPU utilization, so our managed solution is often much cheaper than building your own. If you need a VPC deployment inside your own cloud, that's available as well; contact sales for more information about enterprise deployment.

Model Licensing

While Roboflow Inference (and the Roboflow Inference Server) are licensed under a liberal open source license (Apache 2.0), some of the supported models use different licenses (including copyleft licenses such as GPL and AGPL in some cases). For models you train on your own, check that the underlying model's license supports your business use case.

For any model you train using Roboflow Train (and some other models), Roboflow's paid plans include a commercial license for deployment via Roboflow Inference and the Inference Server, so long as you follow your plan's usage limits.

Start Using Roboflow Inference Today

Roboflow Inference is at the heart of what we do at Roboflow: providing powerful technologies with which you can build and deploy computer vision models that solve your business needs. We actively use Roboflow Inference internally, and are committed to improving the server to provide more functionality.

Over the next few weeks and months, we will be working on support for bringing your own models (those not hosted on Roboflow) to Roboflow Inference, device management features so you can monitor whether your servers are running, and more.

Is there a feature you would like to see in Roboflow Inference that we do not currently support? Leave an Issue on the project GitHub and we will evaluate your request.

Because the project is open source, you can extend the Inference Server to meet your needs. Want support for a model we don't currently offer? You can build it into the server and use the same HTTP-based API the server provides for built-in models.

If you would like to help us add new models to the Inference Server, leave an Issue on the project GitHub repository. We will advise if there is already work going on to add a model. If no work has started, you can add a new model from scratch; if a contributor is already adding a model, we can point you to where you can help. Check out the project contribution guidelines for more information.

Cite this Post

Use the following entry to cite this post in your research:

Brad Dwyer, James Gallagher. (Aug 16, 2023). Open Source Computer Vision Deployment with Roboflow Inference. Roboflow Blog: https://blog.roboflow.com/open-source-inference-server/

Discuss this Post

If you have any questions about this blog post, start a discussion on the Roboflow Forum.

Written by

Brad Dwyer
Roboflow cofounder and CTO. Building the computer vision infrastructure for developers. Previously founded Hatchlings and created Product Hunt's AR App of the Year.
James Gallagher
James is a technical writer at Roboflow, with experience writing documentation on how to train and use state-of-the-art computer vision models.