Deploy RF-DETR on Edge Devices

Published Apr 10, 2026 • 8 min read

RF-DETR is built to run next to the camera, on a Jetson or small GPU in a factory, rail yard, or robot where the cloud is not an option and decisions happen in milliseconds. It is the first real-time detector to pass 60 mAP on COCO, holds the best accuracy-latency tradeoff measured on both COCO and the harder RF100-VL benchmark, and now covers instance segmentation through RF-DETR-Seg at the same speed profile. This post explains why that combination fits edge constraints and walks through deploying a model to a device with Roboflow Inference, including a box-counting application on a Jetson.

The hardest place to run a vision model is also where most of the work happens: on a device sitting next to a camera, on a factory floor, in a rail yard, on a robot, with limited compute and no room for latency. The cloud is not always an option.

Bandwidth costs add up, network links drop, and a lot of real operations cannot send video offsite for privacy or reliability reasons. When the decision has to be made in milliseconds, the model has to run where the camera is.

That is the problem Roboflow's RF-DETR was designed to solve. RF-DETR is a state-of-the-art real-time object detection model, built with edge deployment in mind from the start.

It is the first real-time detector to cross 60 mAP on Microsoft COCO, and it holds the best accuracy-latency tradeoff of any real-time detection model we have measured, on COCO and on the harder real-world RF100-VL benchmark.

The same architecture now covers instance segmentation through RF-DETR-Seg, so pixel-level masks run on the edge with the same speed profile. Here is why that combination matters on the edge, and how to get a model running on a device.

How RF-DETR Meets The Edge Constraint

“With AI-assisted labeling tools, hosted training, and most importantly highly accurate models like RF-DETR, Roboflow has simplified the process of developing state-of-the-art computer vision applications for our team,” said Shawn Patel, Co-Founder and CTO at Almond.

Almond develops vision-powered robots for manufacturers, where model accuracy is critical. Their RF-DETR model significantly outperformed YOLO, achieving an 81% accuracy score compared to YOLO's 67%.

Edge deployment forces a tradeoff that the cloud lets you avoid. You have a fixed amount of compute, often an NVIDIA Jetson or a small GPU, and you need the model to be both fast enough to keep up with the camera and accurate enough to trust. Most models make you give up one to get the other. A small, fast model misses detections. An accurate model is too heavy to hit frame rate on the device.

RF-DETR closes that gap. The model family runs across sizes built for exactly this constraint, with latency measured on an NVIDIA T4 using TensorRT at FP16 and batch size 1, the kind of setup an edge deployment actually uses:

Model	COCO mAP50:95	RF100-VL mAP50:95	Latency (ms)
RF-DETR Nano	48.4	57.1	2.32
RF-DETR Small	53.0	59.6	3.52
RF-DETR Medium	54.7	60.6	4.52

With a latency of 2.32 ms per frame, RF-DETR Nano runs at well over 100 frames per second on a T4 while staying more accurate than detection models several times its size. That headroom is what edge teams need. It means you can run real-time detection on modest hardware, process multiple camera streams on one device, or leave compute budget for the tracking and counting logic that sits on top of the model.
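To make the headroom concrete, here is a small sketch that converts the table latencies into a throughput ceiling and a rough stream count. The 30 FPS camera rate is an assumption, and the latencies are model-only numbers on a T4, so decoding, preprocessing, and a slower edge GPU will lower what you see in practice.

```python
# Latencies (ms) copied from the table above: NVIDIA T4, TensorRT, FP16, batch size 1.
LATENCY_MS = {"RF-DETR Nano": 2.32, "RF-DETR Small": 3.52, "RF-DETR Medium": 4.52}

CAMERA_FPS = 30  # assumption: a typical camera frame rate, not part of the benchmark


def model_fps(latency_ms: float) -> float:
    """Upper bound on frames per second if the model were the only cost."""
    return 1000.0 / latency_ms


def max_streams(latency_ms: float, camera_fps: float = CAMERA_FPS) -> int:
    """How many camera streams fit if frames are processed one at a time."""
    return int(model_fps(latency_ms) // camera_fps)


for name, ms in LATENCY_MS.items():
    frame_budget_ms = 1000.0 / CAMERA_FPS
    print(
        f"{name}: {model_fps(ms):.0f} FPS ceiling, "
        f"{frame_budget_ms - ms:.1f} ms left per frame at {CAMERA_FPS} FPS, "
        f"up to {max_streams(ms)} streams"
    )
```

Treat the stream counts as ceilings rather than targets: whatever is left of the per-frame budget is what tracking, counting, and video decoding have to share.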

The accuracy itself comes from the architecture. RF-DETR is a detection transformer, which removes the non-maximum suppression step that convolutional detectors run after every prediction. That step adds latency that scales with the number of objects in the frame, and it is often left out of the speed numbers other models report. RF-DETR's latency is the total time to a result, with nothing hidden, so the number you benchmark is the number you get in production.
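For readers who have not seen it spelled out, the sketch below is a minimal greedy non-maximum suppression in plain Python. It is an illustration of the post-processing step that convolutional detectors typically run, not code taken from any specific detector, and the 0.5 IoU threshold is an assumed default.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: List[Box], scores: List[float], iou_threshold: float = 0.5) -> List[int]:
    """Return indices of the boxes kept after greedy non-maximum suppression."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Each candidate is compared against every box kept so far,
        # so the work grows with the number of detections in the frame.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
print(greedy_nms(boxes, scores))  # [0, 2]
```

The inner comparison loop is the part whose cost depends on how crowded the frame is, which is why leaving it out of a benchmark can understate latency on busy scenes.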

How RF-DETR Compares to Other Real-Time Detectors

Here is how RF-DETR lines up against other real-time detectors, at matched model sizes. The numbers below are on Microsoft COCO and the real-world RF100-VL benchmark, with latency measured on an NVIDIA T4 using TensorRT at FP16 and batch size 1, reported as total latency end to end.

Model	COCO mAP@50	COCO mAP@50:95	RF100-VL mAP@50:95	Latency (ms)
RF-DETR Nano	67.6	48.4	57.1	2.32
D-FINE Nano	60.2	42.7	57.7	2.12
LW-DETR Tiny	60.7	42.9	n/a	1.91
YOLO11-N	52.0	37.4	55.3	2.49
RF-DETR Small	72.1	53.0	59.6	3.52
D-FINE Small	67.6	50.7	59.9	3.55
LW-DETR Small	66.8	48.0	58.0	2.62
YOLO11-S	59.7	44.4	56.2	3.16
RF-DETR Medium	73.6	54.7	60.6	4.52
D-FINE Medium	72.6	55.1	60.2	5.68
LW-DETR Medium	72.0	52.6	59.4	4.49
YOLO11-M	64.1	48.6	56.5	5.13

RF-DETR posts the highest COCO mAP@50 at every size tier, and it beats the widely deployed YOLO11 on every accuracy metric at every comparable size. It is also faster than YOLO11 at the Nano and Medium tiers; only YOLO11-S edges out RF-DETR Small on latency (3.16 ms vs. 3.52 ms). The advantage compounds across tiers: RF-DETR Small reaches 53.0 mAP@50:95 at 3.52 ms, higher accuracy than the largest YOLO11 model (not shown in the table) while running more than 3x faster. Against the strongest transformer baselines it holds the best accuracy-latency frontier, which is the property that actually decides whether a model fits on an edge device.
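One way to read an accuracy-latency frontier off the table is to keep only the models that no other model beats on both latency and COCO mAP@50:95. The sketch below does that with the table's numbers; the function name and the choice of mAP@50:95 as the accuracy axis are assumptions made for illustration.

```python
# (model, latency_ms, coco_map_50_95) copied from the comparison table above.
RESULTS = [
    ("RF-DETR Nano", 2.32, 48.4),
    ("D-FINE Nano", 2.12, 42.7),
    ("LW-DETR Tiny", 1.91, 42.9),
    ("YOLO11-N", 2.49, 37.4),
    ("RF-DETR Small", 3.52, 53.0),
    ("D-FINE Small", 3.55, 50.7),
    ("LW-DETR Small", 2.62, 48.0),
    ("YOLO11-S", 3.16, 44.4),
    ("RF-DETR Medium", 4.52, 54.7),
    ("D-FINE Medium", 5.68, 55.1),
    ("LW-DETR Medium", 4.49, 52.6),
    ("YOLO11-M", 5.13, 48.6),
]


def pareto_frontier(results):
    """Names of models that no other model beats on both latency and accuracy."""
    frontier = []
    best_map = float("-inf")
    # Walk from fastest to slowest; a model is on the frontier only if it
    # is more accurate than every faster model seen so far.
    for name, _latency, coco_map in sorted(results, key=lambda r: (r[1], -r[2])):
        if coco_map > best_map:
            frontier.append(name)
            best_map = coco_map
    return frontier


print(pareto_frontier(RESULTS))
# ['LW-DETR Tiny', 'RF-DETR Nano', 'RF-DETR Small', 'RF-DETR Medium', 'D-FINE Medium']
```

On these twelve rows, RF-DETR occupies the three middle points of the frontier, with LW-DETR Tiny at the fastest end and D-FINE Medium at the most accurate end.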

Real-World Domains, Not Just COCO

A benchmark score on COCO tells you how a model does on everyday objects in clean photos. Edge deployments rarely look like that. They look like aerial imagery, industrial parts on a line, instruments in a lab, crops in a field, packages on a conveyor. The model has to transfer to a domain it was not pretrained on, usually from a custom dataset you labeled yourself, and often a small one.

This is what RF100-VL measures: how well a model adapts across 100 real-world datasets drawn from the open projects on Roboflow Universe. RF-DETR posts the best results on RF100-VL of any real-time model, which is the score that predicts how it will behave on your data rather than on a research benchmark. For an edge project, that adaptability is the difference between a model that works in the demo and one that holds up on the floor.

It also means you can train RF-DETR on a custom dataset and expect strong accuracy without an enormous labeling effort. The model was built to transfer well to new domains and to datasets big and small, which is the normal situation for a team deploying to a specific site with its own cameras, lighting, and parts.

Segmentation at the Same Speed with RF-DETR

A lot of edge problems need more than a bounding box. Measuring the area of a defect, separating two parts that overlap, following the exact outline of an irregular object, masking a region for a downstream step: these need a pixel-level mask, not a rectangle. That used to mean accepting slower models, which is a hard tradeoff on a device that is already compute-constrained.

RF-DETR-Seg removes that tradeoff. It is real-time instance segmentation built as an extension of RF-DETR detection, adding a segmentation head on top of the existing pipeline while keeping the same real-time inference characteristics. You get masks instead of boxes without giving up the speed that made the detector usable on the edge. RF-DETR-Seg reaches state-of-the-art accuracy for real-time instance segmentation, outperforming other real-time segmentation models across sizes on the COCO segmentation benchmark, with latency measured the same way: NVIDIA T4, TensorRT, FP16, batch size 1.
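To see why a mask matters for something like defect area, the sketch below compares the pixel area of a mask with the area of its bounding box. The mask is a hand-written toy grid standing in for a model's output, not the format RF-DETR-Seg returns.

```python
from typing import List, Tuple

Mask = List[List[int]]  # rows of 0/1 pixels; a stand-in for a predicted mask


def mask_area(mask: Mask) -> int:
    """Number of pixels that belong to the object."""
    return sum(sum(row) for row in mask)


def bounding_box(mask: Mask) -> Tuple[int, int, int, int]:
    """Tightest (x_min, y_min, x_max, y_max) around the mask, inclusive."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, value in enumerate(row) if value]
    return min(xs), min(ys), max(xs), max(ys)


def box_area(box: Tuple[int, int, int, int]) -> int:
    x_min, y_min, x_max, y_max = box
    return (x_max - x_min + 1) * (y_max - y_min + 1)


# A hypothetical diagonal scratch: thin in reality, but its box is large.
scratch = [
    [1, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
]
print(mask_area(scratch), box_area(bounding_box(scratch)))  # 5 25
```

Here the box overstates the defect by a factor of five, which is the kind of error a pixel-level mask avoids.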

It also ships across the full range of sizes, from Nano to 2XLarge, and the scaling is built for exactly this decision. The smaller checkpoints run at reduced resolution to target low-latency edge deployment, while the larger ones trade resolution and parameters for higher mask quality. On a constrained device you pick a small checkpoint and keep your frame rate; when mask precision matters more than speed, you size up. ONNX and TFLite exports are available, which widens the range of edge hardware you can target beyond Jetson, and every checkpoint is Apache 2.0.

For an edge team, the most practical part is that segmentation is not a separate stack. It is the same model family, the same training package, the same deployment path through Roboflow Inference. A detection deployment that needs to become a segmentation deployment does not get rebuilt; it swaps the checkpoint. You can fine-tune RF-DETR-Seg on a custom dataset in COCO or instance segmentation format, starting from a pretrained checkpoint to shorten training, the same way you would with detection.

RF-DETR is Built to Run on the Hardware You Already Have

RF-DETR is open source under an Apache 2.0 license. That matters for edge work in two practical ways. First, it is commercial-safe: you can ship it in a product or run it across a fleet of devices without a restrictive license forcing your hand. Second, you own the model and can run it anywhere, on your own hardware, in your own facility, with no dependency on a vendor's API staying online.

Deployment runs through Roboflow Inference, an open source inference server that runs vision models on the edge, including offline. You install it on the device, point it at a camera or RTSP stream, and run your model locally. RF-DETR is optimized for the NVIDIA Jetson devices most commonly used for on-device vision, and it compiles to TensorRT to get the latency numbers above.

On top of the model, Roboflow Workflows lets you build the full application, chaining detection together with tracking, line counting, zones, and logic, then deploy the whole pipeline to the edge device as one unit. A common pattern is detection plus object tracking plus a line counter to count items crossing a point on a conveyor, all running on a single Jetson with no cloud round trip. This is the same approach behind production deployments like BNSF's automated rail yard inventory, where vision runs on site to track assets in real time.
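The line-counting pattern can be sketched in a few lines. This is a simplified stand-in for illustration, not the implementation used by Roboflow Workflows: it assumes a tracker has already grouped object centers by track ID, treats the counting line as infinite, and counts each track at most once.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def side_of_line(p: Point, a: Point, b: Point) -> float:
    """Sign of the cross product: which side of the line a->b the point is on."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def count_crossings(tracks: Dict[int, List[Point]], a: Point, b: Point) -> int:
    """Count tracked objects whose center moves from one side of the line to the other."""
    crossed = 0
    for centers in tracks.values():
        sides = [side_of_line(c, a, b) for c in centers]
        # A sign flip between consecutive frames means the center crossed the line.
        if any(s0 * s1 < 0 for s0, s1 in zip(sides, sides[1:])):
            crossed += 1
    return crossed


# Hypothetical tracker output: track ID -> object centers over consecutive frames.
tracks = {
    1: [(80, 50), (95, 50), (110, 50)],   # crosses x = 100
    2: [(20, 120), (40, 120), (60, 120)],  # never reaches the line
    3: [(90, 30), (105, 32)],              # crosses x = 100
}
print(count_crossings(tracks, a=(100, 0), b=(100, 200)))  # 2
```

Counting per track rather than per detection is what keeps an item from being counted again on every frame it stays in view.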

Roboflow’s AI1 deployment device packages onboard AI compute, integrated lighting, and an industrial camera into a plug-and-play system for real-time computer vision in robotics applications.

How to Deploy RF-DETR to an Edge Device

The path from trained model to running device is short:

1. Train an RF-DETR model on your dataset, detection or segmentation, in the cloud on the Roboflow platform or on your own hardware with the open source RF-DETR package. Pick the size that fits your latency budget; Nano and Small are the usual starting points for constrained devices.
2. Build a Workflow that wraps the model with whatever tracking, counting, or zone logic your application needs.
3. Install Roboflow Inference on the edge device. On a Jetson, that means installing Docker, then running pip install inference-cli followed by inference server start.
4. Run the Workflow against your camera or video stream. Inference handles the model locally and returns results in real time.
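Once the server is running, the last step can be sketched from Python with the inference-sdk client. This is a minimal sketch under stated assumptions: the server is reachable on its default port 9001, the Workflow's image input is named "image", and the workspace, Workflow ID, and file names in the usage comment are hypothetical placeholders.

```python
import os

# Assumption: `inference server start` exposes the server on port 9001 of the device.
API_URL = "http://localhost:9001"


def build_workflow_request(image_path: str, workspace: str, workflow_id: str) -> dict:
    """Keyword arguments for InferenceHTTPClient.run_workflow."""
    return {
        "workspace_name": workspace,
        "workflow_id": workflow_id,
        # Assumption: the Workflow's image input is named "image".
        "images": {"image": image_path},
    }


def run_workflow_once(image_path: str, workspace: str, workflow_id: str):
    """Send one frame to the local Inference server and return the Workflow outputs."""
    # Imported here so this module loads without the SDK (pip install inference-sdk).
    from inference_sdk import InferenceHTTPClient

    client = InferenceHTTPClient(
        api_url=API_URL,
        api_key=os.environ.get("ROBOFLOW_API_KEY"),
    )
    return client.run_workflow(**build_workflow_request(image_path, workspace, workflow_id))


# Usage (hypothetical names, requires a running server):
# result = run_workflow_once("frame.jpg", "my-workspace", "box-counter")
```

This single-image call is the simplest way to confirm the device is serving the Workflow before pointing it at a camera or RTSP stream.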

The full walkthrough, including a working box-counting application on a Jetson, is in the RF-DETR Jetson deployment guide. For the model details and the SDK, see the RF-DETR documentation and the research paper on arXiv.

Use RF-DETR For Edge Deployment

Edge deployment is where Vision AI either earns its place in an operation or stalls. RF-DETR is built for that test. It gives you the accuracy of a transformer detector at the latency of a small model, it covers both detection and instance segmentation from one architecture, it adapts to the unusual domains real deployments live in, and it runs on the hardware you already have under an open license you can ship. You can train it on your data this week and have it running on a device, with no cloud dependency and no model lock-in.

Start with the RF-DETR model page, or go straight to training a model and deploying it to your edge device with Roboflow Inference.

Cite this Post

Use the following entry to cite this post in your research:

Contributing Writer. (Apr 10, 2026). Why RF-DETR Is Built for the Edge. Roboflow Blog: https://blog.roboflow.com/rf-detr-for-the-edge/
