When you deploy a computer vision model, you may want one or more dedicated servers to handle requests to that model. This is ideal for workflows where you process images from a client (e.g. a web application), recorded videos, and more.
We are excited to announce a new tool, Dedicated Deployments, to help you allocate dedicated GPUs and CPUs for use in running models supported in the Roboflow ecosystem.
In this guide, we are going to introduce Dedicated Deployments and how you can provision your own server for use in your vision projects.
Without further ado, let’s get started!
Introducing Roboflow Dedicated Deployments
Dedicated Deployments are servers provisioned exclusively for your use and are configured with Roboflow Inference out of the box. Dedicated Deployments is intended for use while building and testing the logic for your application. In the future, we plan to support production deployments.
Dedicated Deployments are available to all paying users. This includes Starter Plan customers with a paying subscription and Enterprise customers. Starter Plan Trial and Free tier users must upgrade to a paid plan to access the feature.
We are launching Dedicated Deployments with three server types:
- CPU only
- T4 GPU
- L4 GPU
If you choose a server with a GPU type, your Deployment will be automatically configured to use the GPU. This means you do not have to worry about drivers or any of the other frictions associated with setting up a server with GPUs for deep learning workflows.
You can have one Dedicated Deployment per Workspace. This Deployment will run for six hours, after which you need to request a new Deployment. Usage of Dedicated Deployments is currently free and, in the coming weeks, will be priced according to the instance type you choose (CPU, T4, L4).
Provision and Manage Dedicated Deployments (Web Application)
You can provision, manage, and delete Dedicated Deployments in the Roboflow Workflows web application.
Roboflow Workflows is a low-code, web-based application builder for creating computer vision applications.
To create a Dedicated Deployment, first create a Roboflow Workflow. To do so, click on Workflows on the left tab in the Roboflow dashboard, then click "Create Workflow":
Then, click on the "Running on Hosted API" link in the top left corner:
Click Dedicated Deployments to create and see your Dedicated Deployments:
Set a name for your Deployment, then choose whether you need a CPU or GPU.
Then, click "Create Dedicated Deployment".
Your Deployment will be provisioned. It may take anywhere from a few seconds to a few minutes to provision your deployment.
When your Deployment is ready, the status will be updated to Ready. You can then click "Connect" to use your Deployment with your Workflow in the Workflows editor:
See the "Use a Dedicated Deployment" section later in this guide to learn how to use the server URL for deployment outside the Workflows editor.
Provision and Manage Dedicated Deployments (CLI)
You can also create, manage, and delete Dedicated Deployment instances with the Roboflow Command Line Interface (CLI).
To install the Roboflow CLI, run:
pip install roboflow
Then, authenticate with Roboflow using the roboflow login command:
roboflow login
The login command will walk you through an interactive process to authenticate.
Once you have signed in, you can start provisioning and managing your Dedicated Deployments.
Create a Dedicated Deployment
To provision a Dedicated Deployment, run:
roboflow deployment add -m MACHINE_TYPE -n DEPLOYMENT_NAME -t DURATION
Above, replace:
- MACHINE_TYPE: The machine type. Run roboflow deployment machine_type to see the available options.
- DEPLOYMENT_NAME: The deployment name: 3 to 10 lowercase alphanumeric characters (a-z, 0-9), starting with a letter (a-z).
- DURATION: How long the deployment should stay active, in hours. Must be between 0.1 and 6; the default is 3.
A server will be provisioned according to your requirements. This process may take a few minutes.
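If you script provisioning, it can help to check the name and duration rules above before calling the CLI. The following is a minimal sketch, not part of the Roboflow CLI: the helper name build_add_command and the constants are hypothetical, and the rules are copied from the parameter list above.

```python
import re
import shlex

# Rules taken from the parameter list above. All names here are hypothetical.
NAME_PATTERN = re.compile(r"[a-z][a-z0-9]{2,9}")  # 3 to 10 chars, starts with a letter
MIN_HOURS, MAX_HOURS, DEFAULT_HOURS = 0.1, 6, 3


def build_add_command(machine_type, name, duration_hours=DEFAULT_HOURS):
    """Return the argument list for `roboflow deployment add`, or raise ValueError."""
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError(
            f"invalid deployment name {name!r}: expected 3 to 10 lowercase "
            "alphanumeric characters, starting with a letter"
        )
    if not MIN_HOURS <= duration_hours <= MAX_HOURS:
        raise ValueError(
            f"invalid duration {duration_hours}: expected {MIN_HOURS} to {MAX_HOURS} hours"
        )
    return [
        "roboflow", "deployment", "add",
        "-m", machine_type,
        "-n", name,
        "-t", str(duration_hours),
    ]


if __name__ == "__main__":
    # MACHINE_TYPE is a placeholder; list real options with
    # `roboflow deployment machine_type`.
    print(shlex.join(build_add_command("MACHINE_TYPE", "dev1", 2)))
```

The function only builds and validates the command. You could pass the returned list to subprocess.run once you have replaced the placeholder machine type with a real one.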
View Dedicated Deployment Status
To check the status of a Dedicated Deployment, run:
roboflow deployment get -d DEPLOYMENT_ID
Above, replace:
- DEPLOYMENT_ID: The deployment ID (not the deployment name).
This returns several pieces of information about the Deployment, including a status value. A status of ready means the server is ready to use.
This command will also return a deployment-url where your server is hosted. You can use this URL with Roboflow Inference to run models on images.
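Because provisioning can take a few minutes, a script may need to wait for the ready status before sending requests. Below is a minimal polling sketch. It assumes you supply your own fetch_deployment function that returns the deployment information as a dict with a "status" key; how you obtain that dict (for example, by parsing the CLI output) is left open, and the function and parameter names are hypothetical.

```python
import time


def wait_until_ready(fetch_deployment, timeout_s=600.0, poll_interval_s=10.0,
                     sleep=time.sleep):
    """Poll until the deployment reports a ready status, then return its info.

    fetch_deployment: caller-supplied function returning a dict that contains
    a "status" key (an assumption of this sketch, not a documented API).
    Raises TimeoutError if the status is still not ready after timeout_s seconds.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        info = fetch_deployment()
        # Any status other than "ready" is treated as "not ready yet".
        if str(info.get("status", "")).lower() == "ready":
            return info
        if time.monotonic() >= deadline:
            raise TimeoutError("deployment was not ready before the timeout")
        sleep(poll_interval_s)
```

Once the function returns, the deployment URL reported alongside the status can be used as the api_url for Roboflow Inference.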
Delete a Dedicated Deployment
You can delete a Dedicated Deployment at any time. You cannot recover deleted deployments.
To delete a Deployment, run:
roboflow deployment delete -d DEPLOYMENT_ID
Use a Dedicated Deployment
You can use Dedicated Deployments with:
- Models trained on Roboflow
- Models uploaded to Roboflow
- Foundation models supported by Roboflow Inference
- Roboflow Workflows
You can run multiple models concurrently, provided they fit in the RAM available on your instance type. Several Roboflow models can run concurrently on all instance types. Foundation models take up more RAM, so you may only be able to load a few of them, depending on your instance type.
Let’s test a Dedicated Deployment with a model trained on Roboflow.
First, install the Roboflow Python package and the Inference SDK:
pip install roboflow inference-sdk
Then, create a new Python file and add the following code:
# import the inference-sdk
from inference_sdk import InferenceHTTPClient
# initialize the client
CLIENT = InferenceHTTPClient(
    api_url="DEPLOYMENT_URL",
    api_key="API_KEY"
)
# infer on a local image
result = CLIENT.infer("YOUR_IMAGE.jpg", model_id="MODEL_ID")
print(result)
Above, replace:
- DEPLOYMENT_URL with your Dedicated Deployment URL.
- API_KEY with your Roboflow API Key.
- MODEL_ID with your Roboflow model ID and version (e.g. `basketball-players-fy4c2/20`).
This code runs the model on your Dedicated Deployment server and prints the inference results to the console.
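In practice you will usually want to work with the result rather than just print it. The sketch below filters predictions by confidence. It assumes an object detection model whose response is a dict with a "predictions" list, where each item has a "confidence" value; other model types may return a different structure, so inspect your own printed result first. The function name is hypothetical.

```python
def confident_predictions(result, min_confidence=0.5):
    """Return the predictions whose confidence is at least min_confidence.

    Assumes `result` is a dict with a "predictions" list of dicts, each with a
    numeric "confidence" key. Missing keys are treated as empty / zero.
    """
    predictions = result.get("predictions", [])
    return [p for p in predictions if p.get("confidence", 0.0) >= min_confidence]
```

With the snippet above, you could call confident_predictions(result, min_confidence=0.7) to keep only the detections you are willing to act on.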
Use a Dedicated Deployment with Workflows
You can use Dedicated Deployments with computer vision applications built in Roboflow Workflows, a low-code, web-based computer vision application builder.
To use your Dedicated Deployment, click the “Deploy Workflow” button within the Workflow editor and replace the API_URL in the code snippet with your Dedicated Deployment URL:
from inference_sdk import InferenceHTTPClient
client = InferenceHTTPClient(
    api_url="DEPLOYMENT_URL",
    api_key="API_KEY"
)
result = client.run_workflow(
    workspace_name="roboflow-universe-projects",
    workflow_id="detect-common-objects",
    images={
        "image": "YOUR_IMAGE.jpg"
    }
)
Learn more about how to deploy Workflows.
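The snippets in this guide hardcode the deployment URL and API key. Since each Deployment expires and a new one gets a new URL, you may prefer to read both values from the environment. Here is a minimal sketch; the environment variable names and the helper name are hypothetical, chosen for illustration only.

```python
import os

# Hypothetical variable names; pick whatever fits your project.
URL_VAR = "ROBOFLOW_DEPLOYMENT_URL"
KEY_VAR = "ROBOFLOW_API_KEY"


def load_deployment_config(env=os.environ):
    """Read the Dedicated Deployment URL and API key from environment variables.

    Returns a dict with the keyword arguments used by the snippets above
    (api_url, api_key). Raises RuntimeError if a variable is unset or empty.
    """
    missing = [name for name in (URL_VAR, KEY_VAR) if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {"api_url": env[URL_VAR], "api_key": env[KEY_VAR]}
```

You could then create the client with InferenceHTTPClient(**load_deployment_config()), so that switching to a new Deployment only requires updating one environment variable.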
Conclusion
Roboflow Dedicated Deployments are servers preconfigured with Roboflow Inference that you can use to run inference on your vision models, foundation models, and your Roboflow Workflows.
You can provision one Dedicated Deployment with a CPU or a GPU per Workspace. The deployment will be operational for up to six hours (or less, if you selected a shorter duration for the Deployment), after which you can create a new Deployment.
Today, Dedicated Deployments is ideal for testing models, allowing you to craft your application logic without having to configure a server.