How to Build Automated Pallet Accounting at End-of-Line with Roboflow
Published Mar 4, 2026 • 21 min read

Many production lines end the same way: packed pallets roll off the conveyor, land in a staging zone, and disappear into a "visibility gap." Between the moment a pallet is wrapped and the moment it’s picked up, your inventory exists only on a clipboard or in someone’s head.

The result? Inaccurate tallies, disputed shift totals, and zero insight into how long your throughput is actually sitting idle.

In this tutorial, we’re closing that gap. We’ll build an automated Pallet Accounting System that transforms raw video into a timestamped ledger. By the end of this guide, you’ll have a system that records:

  • Pallet Completed: When a packed pallet enters the end-of-line zone (The "In Time").
  • Pallet Collected: When a forklift picks it up and clears the zone (The "Out Time").

No manual counting. No shift-end disputes. Just a camera, an RF-DETR model, and a Roboflow Workflow.


Output of System

How the Automated Pallet Accounting System Works

The logic follows exactly how pallets move on your floor:

  1. A pallet is packed with packages at the packing station.
  2. The packed pallet moves on a conveyor toward the end-of-line staging zone.
  3. The pallet enters the zone; this is the completion event. The system logs its tracker ID and timestamp.
  4. The pallet sits in the zone, waiting for pickup. The system tracks how long it has been there.
  5. A forklift picks up the pallet and drives away. The pallet is no longer detected in the zone; this is the collection event. The system logs the exit timestamp and computes the wait duration.

The key insight is that you don't need to explicitly detect the forklift to measure pickup. You measure the pallet's presence in the staging zone. When a tracked pallet enters the zone, it is "completed." When that same tracked pallet disappears from the zone for longer than a brief timeout, it has been "collected." This is a standard industrial pattern: zone presence as the proxy for both events.

An RF-DETR object detection model identifies pallets in every frame. ByteTrack assigns each pallet a persistent tracker ID so it is counted exactly once. A Time in Zone block monitors which pallets are inside the staging polygon and how long each has been there. The deployment script reads this data, maintains a per-pallet ledger, and computes warehouse metrics.
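The zone-presence logic can be sketched as a tiny state machine, independent of any model or SDK. This is a simplified illustration with hypothetical frame data; the deployment script in Step 4 implements the same idea using real tracker output:

```python
# Minimal sketch of zone-presence accounting. Each frame contributes the
# set of tracker IDs currently inside the zone; a new ID is an ENTER
# (completion) event, and an ID missing for longer than a timeout is an
# EXIT (collection) event.
MISSING_TIMEOUT_SEC = 1.0

def account(frames, fps=10.0):
    """frames: list of sets of tracker IDs visible in the zone, one per frame."""
    active, events = {}, []
    for i, ids in enumerate(frames):
        t = i / fps
        for tid in ids:
            if tid not in active:
                active[tid] = {"in": t, "last_seen": t}   # ENTER: completed
            else:
                active[tid]["last_seen"] = t
        for tid in list(active):
            st = active[tid]
            if tid not in ids and t - st["last_seen"] >= MISSING_TIMEOUT_SEC:
                # EXIT: collected; out time is the last frame it was seen
                events.append({"tracker_id": tid, "in": st["in"], "out": st["last_seen"]})
                del active[tid]
    return events

# Pallet 1 is visible for 5 frames, then absent for 15 frames (1.5 s at 10 fps)
print(account([{1}] * 5 + [set()] * 15))
# one event: pallet 1 entered at 0.0 s, last seen at 0.4 s
```

Note the timeout: a pallet is only finalized after it has been missing for a full second, so a single dropped detection does not create a false collection event.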

The Automated Pallet Accounting System Output

Before diving into the build steps, here is exactly what this system produces, so you know what you are working toward.

1. Annotated Video

The system outputs a video with bounding boxes, in-zone time labels (shown once the pallet enters the zone), and the zone polygon drawn on every frame. This is your visual audit trail: you can scrub through and see exactly when each pallet entered and left the zone.


Video output from Workflow

2. Pallet Event Ledger (pallet_zone_events.json)

Every pallet gets one entry in the ledger with its full lifecycle:

[
  {
    "tracker_id": 1,
    "class": "pallet",
    "in_time": "2026-03-04 05:50:50",
    "out_time": "2026-03-04 05:50:52",
    "in_time_sec": 0.625,
    "out_time_sec": 2.917,
    "duration_sec": 2.292,
    "max_time_in_zone_sec": 2.292,
    "last_confidence": 0.8548049926757812,
    "note": "Removed from zone"
  },
  {
    "tracker_id": 2,
    "class": "pallet",
    "in_time": "2026-03-04 05:50:53",
    "out_time": "2026-03-04 05:50:55",
    "in_time_sec": 3.792,
    "out_time_sec": 5.417,
    "duration_sec": 1.625,
    "max_time_in_zone_sec": 1.625,
    "last_confidence": 0.9335129261016846,
    "note": "Removed from zone"
  }
]

Each entry records when the pallet entered the zone (completion), when it left (forklift pickup), how long it waited, and the model's confidence. This is the core accounting output, one row per pallet, machine-readable, ready to feed into your WMS or shift reports.
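Because the ledger is plain JSON, downstream reporting needs nothing beyond the standard library. A minimal sketch (the entries are abbreviated copies of the sample above; in practice you would load pallet_zone_events.json itself):

```python
import json

# Abbreviated copy of the sample ledger above; in practice, load the file:
# ledger = json.load(open("pallet_zone_events.json"))
ledger = json.loads("""[
  {"tracker_id": 1, "duration_sec": 2.292, "in_time": "2026-03-04 05:50:50"},
  {"tracker_id": 2, "duration_sec": 1.625, "in_time": "2026-03-04 05:50:53"}
]""")

total = len(ledger)
avg_wait = sum(e["duration_sec"] for e in ledger) / total
print(f"{total} pallets completed, avg wait {avg_wait:.3f}s")
```

The same few lines could feed a shift report or a WMS import job; the ledger is the single source of truth for everything downstream.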

3. Warehouse Metrics (warehouse_metrics.json)

The system computes operational KPIs from the ledger:

{
  "total_pallets_completed": 2,
  "avg_pickup_latency_sec": 1.958,
  "max_pickup_latency_sec": 2.292,
  "min_pickup_latency_sec": 1.625,
  "avg_forklift_cycle_time_sec": 2.5,
  "max_forklift_cycle_time_sec": 2.5,
  "min_forklift_cycle_time_sec": 2.5,
  "throughput_pallets_per_hour": 1502.5
}

These metrics tell you at a glance how the line performed, how fast pallets were completed, how quickly forklifts picked them up, and whether service levels were met.
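The throughput number looks surprisingly high for a two-pallet run because it is extrapolated from a very short observation window. Using the sample ledger above, the arithmetic is:

```python
# Throughput = pallets completed / observed window, scaled to one hour.
# From the sample ledger: first in_time_sec = 0.625, last out_time_sec = 5.417.
pallets = 2
window_sec = 5.417 - 0.625                  # ~4.79 s of observed activity
throughput = pallets / (window_sec / 3600)  # pallets per hour
print(round(throughput, 2))                 # 1502.5, matching the report above
```

Over a real shift the window is hours rather than seconds, so the extrapolation settles to a meaningful rate.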

How to Build Automated Pallet Accounting at End-of-Line

Prerequisites:

  • A free Roboflow account
  • Images or video footage of your end-of-line area
  • Python 3.10+
  • inference-sdk installed (pip install inference-sdk)

Step 1: Create a Roboflow Project and Collect Data

Start by creating an Object Detection project in Roboflow. This project will hold your pallet images, annotations, and trained model.

Capture Footage

Mount a camera overlooking the end-of-line staging zone. Record 15–30 minutes of normal operations that include:

  • Packed pallets arriving on the conveyor and entering the staging zone
  • Pallets sitting in the zone, waiting
  • Forklifts approaching, lifting, and driving away with pallets
  • The zone empty (no pallets present)
  • Different shift conditions, lighting changes, different pallet loads

Upload the video to your Roboflow project. Use Roboflow's frame extraction feature to sample frames.

Upload video and extract frames

Bootstrap with a Public Dataset (Optional)

If you want to prototype before capturing site-specific footage, Roboflow Universe has ready-to-use pallet datasets. You can use the dataset from this project as a starting point. This pallet-detection project contains labeled images of pallets in warehouse settings. Fork it into your workspace, then replace it with your own data for production.

Annotate

For this system, you need one class:

  • pallet: any packed pallet visible in the frame

Open your project in Roboflow Annotate and draw tight bounding boxes around every pallet. Guidelines specific to this use case:

  • Label pallets even when packages are stacked on top; the model should learn that a loaded pallet is still a pallet.
  • Label pallets at all stages: on the conveyor, stationary in the zone, being approached by a forklift.
  • Do not label pallets that are already lifted off the ground on forklift forks. Once the forklift has it, it is no longer "in the zone."
  • Include frames where no pallets are present. These negative examples help avoid false detections.

Use Roboflow's AI-assisted labeling tools to speed up annotation. Aim for at least 300-500 annotated images before your first training run.

Dataset annotation

Generate a Dataset Version

Once annotation is complete, generate a dataset version. For our project, the dataset version has the following configuration:

  • Total Images: 470 (379 train / 61 valid / 30 test)
  • Preprocessing: Auto-Orient, Resize (stretch) to 512×512
  • Augmentations: 3 outputs per training example, Horizontal Flip

Click "Generate" to lock in a reproducible snapshot of your data.

Dataset version created for this project

Step 2: Train an RF-DETR Model

For this project, I trained an RF-DETR Small object detection model. RF-DETR is a fast object detection model well suited to real-time applications, and it handles partially visible and overlapping objects well, which matters when pallets are stacked or crowded in the staging zone.

I used the Roboflow autotraining option. This is the fastest path: no GPU setup, no code required. Roboflow Train uses a custom-optimized RF-DETR checkpoint that consistently produces higher mAP scores than the public COCO weights.

  1. Navigate to the dataset version you generated.
  2. Click "Train Model", then select "Custom Training".
  3. Under Select Architecture, choose Roboflow RF-DETR (marked as Recommended). Set Model Size to Small.
  4. Click "Continue", then "Start Training".
Training options in Roboflow

For my generated 470-image dataset, training completed and produced the following results:

  • mAP@50: 99.4%
  • Precision: 96.2%
  • Recall: 100%

The trained model ID is pallet-detection-nlwmv/1, a Roboflow RF-DETR Object Detection (Small) model. These are strong numbers that indicate the model reliably detects pallets across the dataset.

💡
If you want full control, train RF-DETR locally using the rfdetr Python package.

Step 3: Build the Pallet Accounting Workflow

With a trained model, the next step is to build a Roboflow Workflow that connects detection to tracking, zone monitoring, and visualization, so every pallet entering and leaving the zone is accounted for.

Create a New Workflow

  1. In your Roboflow dashboard, click "Workflows" in the left sidebar.
  2. Click "Create Workflow".
  3. Select "Build My Own" to start from a blank canvas.

Here is the complete Workflow I built. You can use this Workflow and explore the settings for each block.


pallet-accounting workflow

Here's a block-by-block explanation:

Block 1: Input

The Workflow starts with an Input block that accepts an image. When running on video, each frame from the video feed is passed in as the input image.

  • Input name: image

Block 2: Object Detection Model

Add an Object Detection Model block.

  • Image: Connect to the input image.
  • Model: Select your trained RF-DETR pallet detection model (e.g., pallet-detection-nlwmv/1).
  • Confidence Threshold: Start at 0.5 (tune later based on results).

This block runs the RF-DETR model on every frame and outputs bounding boxes, class labels, and confidence scores for every detected pallet.

Block 3: Byte Tracker

Add a Byte Tracker block after the detection model.

  • Detections: Connect to the Object Detection Model output.

ByteTrack assigns each detected pallet a unique, persistent tracker_id. Pallet #7 stays pallet #7 from the first frame it appears until it leaves the camera's field of view, even if it is briefly occluded by a forklift or another pallet. This persistent tracking is what makes accurate accounting possible: one pallet = one entry in the ledger, no double-counting.

Block 4: Time in Zone

This is the core accounting block. Add a Time in Zone block.

  • Detections: Connect to the Byte Tracker output.
  • Zone: Define a polygon that covers your end-of-line staging zone, the area where pallets sit after coming off the conveyor and before being picked up by a forklift.

The Time in Zone block tracks:

  • Which tracker IDs are currently inside the polygon
  • How long each tracked pallet has been in the zone (time_in_zone)

This is what we use downstream to determine both events: a pallet appearing in the zone = completed; that pallet disappearing = collected.

Block 5: Bounding Box Visualization

Add a Bounding Box Visualization block.

  • Image: Connect to the input image.
  • Predictions: Connect to the Time in Zone block's timed_detections output.

This draws bounding boxes around every detected pallet in the frame.

Block 6: Polygon Zone Visualization

Add a Polygon Zone Visualization block.

  • Image: Connect to the output of the Bounding Box Visualization block (so the zone overlay is drawn on top of the bounding boxes).

This renders the staging zone polygon on the frame, making it easy to verify the zone is drawn correctly and pallets are being detected inside it.

Block 7: Label Visualization

Add a Label Visualization block.

  • Image: Connect to the output of the Polygon Zone Visualization block.
  • Detections: Connect to the Time in Zone output.

This shows each pallet's in zone time (e.g., In zone: 0.45s).

Configure Outputs

The Workflow needs to return two things:

  1. zone_time, the annotated visualization image from the Label Visualization block (label_visualization.image). This is used to produce the output video.
  2. zone_output, the structured data from the Time in Zone block (time_in_zone.all_properties). This contains the per-pallet tracker IDs, time-in-zone durations, and confidence scores, the data your accounting script reads.

In the Workflow Outputs section, add:

  • zone_time: label_visualization.image
  • zone_output: time_in_zone.all_properties

Step 4: Deploy and Run the System

With the Workflow built, we deploy it using the Roboflow inference-sdk Python package. The script streams video through the Workflow via WebRTC, reads the structured zone data from each frame, maintains a per-pallet accounting ledger, computes warehouse metrics live, and displays the annotated output as a real-time video feed. The architecture of the complete pipeline:

Camera / Video File
       |
Roboflow Workflow (RF-DETR + ByteTrack + Time in Zone)
       |
Zone Event Engine (ENTER / EXIT detection)
       |
Pallet Accounting Ledger (pallet_zone_events.json)
       |
Warehouse Analytics (warehouse_metrics.json + pallet_metrics.json)
       |
Live Operator View (cv2.imshow with real-time playback)

Install Dependencies

pip install -U inference-sdk

The Deployment Script

Here is the complete deployment script. It connects to your Roboflow Workflow, processes video frame by frame, records pallet enter/exit events, updates warehouse metrics on every exit, and displays the annotated video as a live feed.

import cv2
import base64
import numpy as np
import json
import time
from datetime import datetime, timezone, timedelta
from statistics import mean
from inference_sdk import InferenceHTTPClient
from inference_sdk.webrtc import VideoFileSource, StreamConfig, VideoMetadata

Connect to Roboflow

client = InferenceHTTPClient.init(
    api_url="https://serverless.roboflow.com",
    api_key="YOUR_API_KEY"
)

source = VideoFileSource("pallet.mp4", realtime_processing=False)

VIDEO_OUTPUT = "zone_time"

config = StreamConfig(
    stream_output=[],
    data_output=["zone_time", "zone_output"],
    requested_plan="webrtc-gpu-medium",
    requested_region="us",
)

session = client.webrtc.stream(
    source=source,
    workflow="pallet-ac",
    workspace="your-workspace",
    image_input="image",
    config=config
)

InferenceHTTPClient connects to the Roboflow serverless API. VideoFileSource wraps a local video file as the input (for a live camera, use an RTSP URL instead). StreamConfig tells the Workflow which outputs to return per frame: zone_time for the visualization image and zone_output for the structured tracking data. The session streams the video through your pallet-ac Workflow via WebRTC.

Global State

The script maintains in-memory state that tracks every pallet currently in the zone, a finalized event ledger, and configuration for live display and reporting:

active_tracks = {}   # tracker_id -> state dict
events = []          # finalized pallet events (the ledger)
frames = []          # for stitching output video

MISSING_TIMEOUT_SEC = 1.0

run_start_dt = datetime.now(timezone.utc)

# Live display
DISPLAY_LIVE = True
DISPLAY_WINDOW = "Pallet Workflow Live"
THROTTLE_TO_REALTIME = True

_last_video_t = None
_last_wall_t = None

# Output files
EVENT_FILE = "pallet_zone_events.json"
WAREHOUSE_FILE = "warehouse_metrics.json"
PALLET_FILE = "pallet_metrics.json"

MISSING_TIMEOUT_SEC prevents brief detection dropouts (a forklift momentarily blocking the camera) from being misinterpreted as a pallet leaving the zone. Only when a pallet has been undetected for longer than this timeout is it finalized as "collected."

DISPLAY_LIVE enables a real-time OpenCV window that shows the Workflow output as the video plays, like watching a live camera feed. THROTTLE_TO_REALTIME paces the display to match the original video speed rather than processing as fast as possible. Press q at any time to stop the stream.

Helper Functions

def video_time_seconds(metadata: VideoMetadata):
    return float(metadata.pts) * float(metadata.time_base)

def to_datetime_str(sec):
    dt = run_start_dt + timedelta(seconds=sec)
    return dt.strftime("%Y-%m-%d %H:%M:%S")

def parse_zone_predictions(data):
    if "zone_output" not in data:
        return []
    return data["zone_output"].get(
        "timed_detections", {}
    ).get("predictions", [])

parse_zone_predictions reads the structured output from the Time in Zone block. Each prediction represents a pallet currently detected inside the zone polygon, with its tracker_id, confidence, class, and time_in_zone.
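Since parse_zone_predictions is pure dictionary navigation, it is easy to sanity-check against a mocked frame payload before wiring up the stream. The payload below is a hand-built example mirroring the shape described above:

```python
# Self-contained copy of the helper plus a mocked frame payload for testing.
def parse_zone_predictions(data):
    if "zone_output" not in data:
        return []
    return data["zone_output"].get(
        "timed_detections", {}
    ).get("predictions", [])

mock = {
    "zone_output": {
        "timed_detections": {
            "predictions": [
                {"tracker_id": 1, "class": "pallet",
                 "confidence": 0.91, "time_in_zone": 2.3}
            ]
        }
    }
}

assert parse_zone_predictions(mock)[0]["tracker_id"] == 1
assert parse_zone_predictions({}) == []   # no zone data -> empty list
```

The empty-dict fallback matters in practice: early frames may arrive before the Workflow emits zone data, and the callback should treat them as "no pallets" rather than crash.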

Live Video Display

def show_live_frame(frame, t_sec):

    global _last_video_t, _last_wall_t

    if not DISPLAY_LIVE:
        return True

    if THROTTLE_TO_REALTIME:

        now_wall = time.time()

        if _last_video_t is None:
            _last_video_t = t_sec
            _last_wall_t = now_wall
        else:

            dv = t_sec - _last_video_t
            dw = now_wall - _last_wall_t

            sleep_time = max(0, dv - dw)

            if sleep_time > 0:
                time.sleep(sleep_time)

    cv2.imshow(DISPLAY_WINDOW, frame)

    key = cv2.waitKey(1) & 0xFF
    if key == ord("q"):
        return False

    return True

This function paces the frame display so the annotated video plays at real-time speed. The operator sees exactly what the camera sees, with bounding boxes, tracker IDs, and the zone polygon overlaid in real time.

Finalize a Track (Log the Exit Event)

When a pallet leaves the zone, this function closes its record, moves it to the event ledger, and immediately updates all report files:

def finalize_track(tracker_id, out_time):

    st = active_tracks.get(tracker_id)
    if not st:
        return

    duration = round(out_time - st["in_time_sec"], 3)

    event = {
        "tracker_id": tracker_id,
        "class": st.get("class", "pallet"),
        "in_time": st["in_time"],
        "out_time": to_datetime_str(out_time),
        "in_time_sec": round(st["in_time_sec"], 3),
        "out_time_sec": round(out_time, 3),
        "duration_sec": duration,
        "max_time_in_zone_sec": round(
            st.get("max_time_in_zone_sec", 0), 3
        ),
        "last_confidence": st.get("last_confidence"),
        "note": "Removed from zone"
    }

    events.append(event)
    del active_tracks[tracker_id]

    print(f"[EXIT] pallet {tracker_id}  duration={duration}s")

    update_reports()

The key design decision here is that update_reports() is called inside finalize_track(). Every time a pallet exits the zone, all three JSON files (pallet_zone_events.json, warehouse_metrics.json, pallet_metrics.json) are rewritten with the latest data. In a live deployment, this means an external dashboard or WMS integration can read these files at any time and always see current numbers.

The Per-Frame Callback

This is the core of the accounting logic. It runs on every frame:

@session.on_data()
def on_data(data, metadata):

    t_sec = video_time_seconds(metadata)
    now_str = to_datetime_str(t_sec)

    preds = parse_zone_predictions(data)

    present_ids = set()

    for p in preds:

        tid = p.get("tracker_id")

        if tid is None:
            continue

        present_ids.add(tid)

        conf = p.get("confidence")
        cls = p.get("class", "pallet")

        time_in_zone = p.get("time_in_zone")

        if tid not in active_tracks:

            active_tracks[tid] = {
                "tracker_id": tid,
                "class": cls,
                "in_time_sec": t_sec,
                "in_time": now_str,
                "last_seen_sec": t_sec,
                "last_confidence": conf,
                "max_time_in_zone_sec": float(time_in_zone or 0)
            }

            print(f"[ENTER] pallet {tid}")

        else:

            st = active_tracks[tid]

            st["last_seen_sec"] = t_sec
            st["last_confidence"] = conf

            if time_in_zone:
                st["max_time_in_zone_sec"] = max(
                    st["max_time_in_zone_sec"],
                    float(time_in_zone)
                )

    # detect exits

    to_finalize = []

    for tid, st in active_tracks.items():

        missing = t_sec - st["last_seen_sec"]

        if missing >= MISSING_TIMEOUT_SEC and tid not in present_ids:

            to_finalize.append((tid, st["last_seen_sec"]))

    for tid, out_t in to_finalize:
        finalize_track(tid, out_t)

    # live display

    if VIDEO_OUTPUT in data:

        img = cv2.imdecode(
            np.frombuffer(
                base64.b64decode(data[VIDEO_OUTPUT]["value"]),
                np.uint8
            ),
            cv2.IMREAD_COLOR
        )

        keep_running = show_live_frame(img, t_sec)

        if not keep_running:
            session.close()
            return

        frames.append((t_sec, metadata.frame_id, img))

    print(
        f"frame {metadata.frame_id} "
        f"active={len(active_tracks)} "
        f"events={len(events)}"
    )

Here is what happens on every frame:

ENTER events. The script reads every pallet prediction from the Time in Zone output. If a tracker_id appears for the first time, it is logged as a new pallet entering the zone (the completion event). If the tracker_id is already known, the script updates the "last seen" time and maximum zone duration.

EXIT events. The script checks for any active pallet that was not present in this frame. If a pallet has been missing for longer than MISSING_TIMEOUT_SEC (default 1 second), it is finalized as a collection event. The out_time is set to the last frame where the pallet was actually detected, not the frame where the timeout triggered, which gives a more accurate pickup timestamp. On every exit, update_reports() immediately rewrites all JSON files with updated metrics.

Live display. The zone_time output is decoded and shown in an OpenCV window at real-time speed. The operator sees bounding boxes, tracker labels, and the zone polygon overlaid on the live feed.

Run the Stream and Save Results

session.run()

if DISPLAY_LIVE:
    cv2.destroyAllWindows()

for tid in list(active_tracks.keys()):
    finalize_track(tid, active_tracks[tid]["last_seen_sec"])

update_reports()

After the video ends (or the operator presses q), any pallets still in the zone are finalized and the reports are written one last time.

Output of the system

Stitch the Output Video (Optional)

The collected visualization frames are also saved as an MP4 file for offline review:

if frames:

    frames.sort(key=lambda x: x[1])

    span = frames[-1][0] - frames[0][0]
    fps = len(frames) / span if span > 0 else 30.0  # guard against a zero-length span

    h, w = frames[0][2].shape[:2]

    out = cv2.VideoWriter(
        "output.mp4",
        cv2.VideoWriter_fourcc(*"mp4v"),
        fps,
        (w, h)
    )

    for _, _, f in frames:
        out.write(f)

    out.release()

    print("output video saved")

Step 5: Compute Warehouse KPIs

The deployment script already computes and saves metrics on every pallet exit via update_reports(). Here is what each metric function produces and what the numbers mean operationally.

Warehouse Metrics (warehouse_metrics.json)

def compute_warehouse_metrics(events):

    if not events:
        return {}

    events_sorted = sorted(events, key=lambda e: e["in_time_sec"])

    latencies = [e["duration_sec"] for e in events_sorted]

    pickup_times = [e["out_time_sec"] for e in events_sorted]

    cycle_times = [
        pickup_times[i] - pickup_times[i-1]
        for i in range(1, len(pickup_times))
    ]

    start = events_sorted[0]["in_time_sec"]
    end = max(e["out_time_sec"] for e in events_sorted)

    duration = max(0.001, end - start)

    throughput = len(events_sorted) / (duration / 3600)

    return {
        "total_pallets_completed": len(events_sorted),

        "avg_pickup_latency_sec": round(mean(latencies), 3),
        "max_pickup_latency_sec": round(max(latencies), 3),
        "min_pickup_latency_sec": round(min(latencies), 3),

        "avg_forklift_cycle_time_sec": round(mean(cycle_times), 3)
            if cycle_times else None,
        "max_forklift_cycle_time_sec": max(cycle_times)
            if cycle_times else None,
        "min_forklift_cycle_time_sec": min(cycle_times)
            if cycle_times else None,

        "throughput_pallets_per_hour": round(throughput, 2)
    }

This produces metrics like:

  • total_pallets_completed: Total pallets that entered and left the zone
  • avg_pickup_latency_sec: Average time pallets waited before forklift pickup
  • max_pickup_latency_sec: Longest any pallet waited
  • min_pickup_latency_sec: Shortest wait time
  • throughput_pallets_per_hour: Line throughput over the observed period
  • avg_forklift_cycle_time_sec: Average time between consecutive forklift pickups

Reading the Metrics: What to Do with These Numbers

These are not just numbers to log. Each one points to a specific operational decision:

  • avg_pickup_latency_sec: Consistently above 5 minutes? You likely need a second forklift assigned to the end-of-line area, or your current forklift driver is being pulled to other tasks too often.
  • max_pickup_latency_sec: Spiking on certain shifts? Compare across shifts to identify staffing or process gaps. One shift may be understaffed or have a less experienced driver.
  • throughput_pallets_per_hour: Dropping over time? The production line may be slowing down, or pallets are backing up because forklifts cannot keep pace.
  • avg_forklift_cycle_time_sec: Increasing? The forklift is taking longer between pickups. This could mean longer drive distances (rack is filling up far from the line), more congestion in aisles, or driver fatigue toward end of shift.

Pallet Metrics (pallet_metrics.json)

def compute_pallet_metrics(events):

    if not events:
        return {}

    ev = sorted(events, key=lambda x: x["in_time_sec"])

    durations = [e["duration_sec"] for e in ev]

    start = ev[0]["in_time_sec"]
    end = max(e["out_time_sec"] for e in ev)

    window = max(0.001, end - start)

    throughput = len(ev) / (window / 3600)

    return {
        "total_pallet_events": len(ev),
        "unique_pallets": len(set(e["tracker_id"] for e in ev)),
        "avg_dwell_sec": round(mean(durations), 3),
        "max_dwell_sec": round(max(durations), 3),
        "min_dwell_sec": round(min(durations), 3),
        "throughput_pallets_per_hour": round(throughput, 2)
    }

This function focuses on pallet flow rather than forklift performance. avg_dwell_sec is how long pallets typically sit in the zone before pickup. unique_pallets confirms that tracker IDs are not being reassigned (if unique_pallets equals total_pallet_events, each pallet was tracked as a distinct object). throughput_pallets_per_hour gives the overall line speed from the pallet perspective.

Live Report Updates

Both metric functions are called inside update_reports(), which runs every time a pallet exits the zone:

def update_reports():

    with open(EVENT_FILE, "w") as f:
        json.dump(events, f, indent=2)

    wm = compute_warehouse_metrics(events)

    with open(WAREHOUSE_FILE, "w") as f:
        json.dump(wm, f, indent=2)

    pm = compute_pallet_metrics(events)

    with open(PALLET_FILE, "w") as f:
        json.dump(pm, f, indent=2)

    print("\n--- Warehouse Metrics ---")
    print(json.dumps(wm, indent=2))

Because this runs on every exit event, the three JSON files are always current. An external process (a warehouse dashboard, a WMS connector, a shift report generator) can read these files at any time and get up-to-date numbers without waiting for the video to finish.
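An external consumer does not need the SDK at all; it just reads the JSON files on a timer. A minimal sketch of such a reader (the file name comes from the script above; returning an empty dict on a partial read is a defensive assumption, since the writer may be mid-rewrite):

```python
import json
from pathlib import Path

def read_metrics(path="warehouse_metrics.json"):
    """Return the latest warehouse metrics, or {} if the file is missing
    or was caught mid-rewrite by the deployment script."""
    p = Path(path)
    if not p.exists():
        return {}
    try:
        return json.loads(p.read_text())
    except json.JSONDecodeError:
        return {}  # partial write in progress; caller retries on the next poll

# A dashboard would call read_metrics() every few seconds and always see
# the numbers as of the most recent pallet exit.
```

For a more robust handoff, the deployment script could write to a temp file and rename it into place, making each update atomic; the polling reader above works unchanged either way.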

Step 6: Adapting for Live Camera Feeds

The deployment script above processes a video file. For a live production deployment, swap the VideoFileSource input for your camera's RTSP stream URL. The accounting logic remains identical: the script processes each frame as it arrives and maintains the pallet ledger in real time.
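Assuming the source wrapper accepts a network stream URL in place of a local path (check the inference-sdk documentation for your version; the SDK may also provide a dedicated camera source class), the only change from the file-based script is the source definition. The URL below is a placeholder:

```python
from inference_sdk.webrtc import VideoFileSource

# Live camera instead of a file: point the source at the camera stream and
# keep realtime processing enabled, since frames now arrive at camera speed.
# The RTSP URL is a placeholder; substitute your camera's actual endpoint.
source = VideoFileSource(
    "rtsp://user:pass@192.168.1.50:554/stream1",
    realtime_processing=True,
)
```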

For edge deployment on devices like NVIDIA Jetson:

  1. Install Roboflow Inference on the device.
  2. Deploy your Workflow to run locally, no cloud dependency needed.
  3. Connect your camera directly to the device.

RF-DETR runs at 25+ FPS on NVIDIA Jetson devices and 100+ FPS on NVIDIA T4 GPUs, which is more than sufficient for real-time pallet tracking.

See the Roboflow Inference deployment documentation for device-specific setup instructions.

💡
Download full code for this project.

Step 7: Tips for a Reliable Production System

Drawing the Zone Polygon

The zone polygon should tightly cover the area where pallets sit between completion and forklift pickup. Get the coordinates right:

  • Make sure each pallet's full bounding box falls inside the zone.
  • Do not extend the zone onto the conveyor; otherwise pallets get marked "complete" before they have actually arrived in the staging area.
  • Do not extend into the forklift driving lane; forklifts passing by (not picking up) could cause tracking noise.
  • Use a frame from your actual camera feed in the Polygon Zone tool at your exact camera resolution.

Tuning MISSING_TIMEOUT_SEC

This parameter controls how long a pallet can be undetected before it is declared "collected." The right value depends on your environment:

  • Too low (e.g., 0.3s): Brief occlusions (forklift blocks camera view momentarily) will be misread as exits, creating false collection events.
  • Too high (e.g., 5s): The system is slow to recognize that a pallet has been picked up, and the out_time will lag behind reality.
  • The default of 1.0s works well for most setups at 30 FPS. Increase it if your forklifts routinely block the camera for longer than a second.

Handling Tracker ID Reassignment

In rare cases, ByteTrack may lose a pallet's tracker ID and assign a new one, making one physical pallet appear as two events. This can happen when a pallet is occluded for longer than the tracker's buffer. To mitigate this:

  • Increase the track_buffer parameter in the Byte Tracker block settings (default is 30 frames).
  • Ensure consistent lighting and camera angle to minimize detection dropouts.
  • Post-process the ledger to merge events that overlap in time and position.

Confidence Tuning

Start at 0.5 confidence threshold. If the model produces false detections (counting shadows or boxes as pallets), increase it. If it misses real pallets, decrease it. Well-trained pallet models typically perform best between 0.35 and 0.55.

Multiple Pallets in the Zone

The system handles multiple simultaneous pallets naturally. ByteTrack gives each pallet a unique tracker ID. If three pallets are in the zone and a forklift takes one, only that one tracker ID disappears, the other two remain tracked with their own independent ledger entries.

Known Edge Cases and Hardening for Production

The core system, zone presence as a proxy for completion and collection, is a standard industrial pattern and works well in most environments. That said, there are edge cases worth understanding as you move toward a hardened production deployment.

Edge Case 1: Temporary Occlusion Creates a False Exit

A forklift approaching the zone can briefly block the camera's view of the pallet. If the occlusion lasts longer than MISSING_TIMEOUT_SEC, the system will falsely declare the pallet "collected", and then re-enter it as a new pallet when the forklift moves and the pallet becomes visible again.

Mitigation:

  • Tune MISSING_TIMEOUT_SEC based on real occlusion durations at your site. Record a few hours of footage, observe how long forklifts typically block the view, and set the timeout above that duration. A value of 2–3 seconds handles most real-world forklift approaches.
  • Increase the track_buffer parameter in the Byte Tracker block (default 30 frames). A higher buffer lets ByteTrack hold onto a tracker ID longer during detection dropouts.

Edge Case 2: Tracker ID Reassignment Creates Duplicate Events

If a pallet is occluded for longer than the tracker buffer, ByteTrack may drop the ID entirely and assign a new tracker_id when the pallet reappears. This creates two ledger entries for the same physical pallet, one short "false" event and one real one.

Mitigation:

  • Increase track_buffer in the Byte Tracker block to cover the longest expected occlusion at your site.
  • Add post-processing logic to the ledger: merge any two events where the "out time" of one and the "in time" of the next are within a few seconds of each other and the bounding box positions overlap. This catches most reassignment cases.
  • Ensure consistent lighting and minimize camera obstructions to reduce detection dropouts in the first place.
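The ledger merge from the second bullet can be sketched as a post-processing pass. The `merge_reassigned` function, the gap and distance thresholds, and the event schema are all hypothetical choices for illustration (a production version might use box IoU instead of center distance):

```python
MERGE_GAP_SEC = 3.0    # max seconds between a false exit and the re-entry
MERGE_DIST_PX = 50.0   # max center offset to call it the same physical pallet

def merge_reassigned(events):
    """Merge ledger entries split by a ByteTrack ID reassignment.

    Each event is a dict with in_time / out_time (seconds) and a box
    center (x, y) in pixels. Events are assumed sorted by in_time.
    """
    merged = []
    for ev in events:
        prev = merged[-1] if merged else None
        if (prev is not None
                and ev["in_time"] - prev["out_time"] <= MERGE_GAP_SEC
                and abs(ev["center"][0] - prev["center"][0]) <= MERGE_DIST_PX
                and abs(ev["center"][1] - prev["center"][1]) <= MERGE_DIST_PX):
            prev["out_time"] = ev["out_time"]  # same pallet: extend the entry
        else:
            merged.append(dict(ev))
    return merged

events = [
    {"in_time": 0.0,   "out_time": 60.0,  "center": (400, 300)},  # false exit...
    {"in_time": 61.5,  "out_time": 95.0,  "center": (405, 298)},  # ...same pallet, new ID
    {"in_time": 200.0, "out_time": 260.0, "center": (400, 300)},  # genuinely new pallet
]
merged = merge_reassigned(events)
```

The same pass also implements the "reappearance merge" described below under next-level hardening.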

Next-Level Hardening

For facilities that need even higher reliability, consider these additions:

  • Add a second zone or line crossing for the forklift lane. Instead of relying solely on pallet disappearance, detect when a forklift enters the staging area as a confirmation signal. If a pallet disappears and a forklift was present in the zone at the same time, the collection event has stronger evidence.
  • Detect the forklift as a separate class. Train your RF-DETR model on both pallet and forklift classes. Use a Detections Filter block in the Workflow to separate them. This lets you log forklift activity independently and cross-reference it with pallet exits.
  • Add a "reappearance merge" step in post-processing. If a tracker ID exits and a new tracker ID enters within N seconds at a similar position, treat them as the same pallet.

These are iterative improvements. The core zone-based system handles the vast majority of cases correctly, and these hardening steps address the long tail.

Going Further: Rack Monitoring with a Second Camera

The end-of-line system tracks pallets from completion to forklift pickup. But where do those pallets go next? After pickup, the typical paths a pallet can take are:

  • Direct shipment (cross-docking) where the pallet goes straight to an outbound truck without ever hitting a rack
  • Temporary storage in racks where the pallet is placed in a rack slot until it is needed for an order
  • Consolidation with other pallets where multiple pallets are grouped together before shipping
  • Loading dock staging where pallets are queued at the dock waiting for a truck

Each of these paths is a potential monitoring point. In this section, we focus on rack storage since it is the most common destination and the easiest to extend with the same Workflow pattern you already built.

A natural extension of this system is to add a second camera that monitors the rack, tracking when pallets are placed into rack slots and when they are removed.

The Concept

Mount a camera facing the rack so it has a front-facing view of the shelving bays.

Pallets stored in racks

Each rack slot (the individual bay where a pallet sits) can be defined as its own polygon zone. The same Workflow pattern applies:

  • Pallet appears in a rack slot zone → logged as "stored" with a timestamp and rack position (e.g., Row B, Bay 3, Level 2).
  • Pallet disappears from a rack slot zone → logged as "retrieved": a forklift has pulled it for shipping or replenishment.

This gives you full lifecycle visibility: completed → collected → stored → retrieved.

How to Build It

The architecture is the same Workflow you already built (Object Detection Model → Byte Tracker → Time in Zone → Visualizations), with a few adjustments:

Multiple zones instead of one. Define a separate polygon for each rack slot. The Time in Zone block can monitor multiple zones simultaneously. Each zone corresponds to a physical rack position, so you know not just that a pallet was stored, but where.

Zone naming. Name each zone with the rack position (e.g., rack-A-bay-2-level-1). When a pallet enters that zone, the ledger entry includes the position. This makes it straightforward to answer questions like "is bay 3 occupied?" or "how long has that pallet been sitting in A-2-1?"
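With named slot zones, occupancy questions become dictionary lookups. The zone coordinates, the `occupancy` state, and the helper functions below are hypothetical sketches of the bookkeeping a deployment script might keep alongside the Workflow:

```python
from datetime import datetime

# Hypothetical rack-slot zones: one polygon per bay, named by rack position.
# Coordinates are pixel corners of each bay in the rack-facing camera frame.
RACK_ZONES = {
    "rack-A-bay-1-level-1": [(100, 600), (300, 600), (300, 780), (100, 780)],
    "rack-A-bay-2-level-1": [(320, 600), (520, 600), (520, 780), (320, 780)],
    "rack-A-bay-1-level-2": [(100, 400), (300, 400), (300, 580), (100, 580)],
}

# Slot state: zone name -> (tracker_id, timestamp the pallet was stored).
occupancy = {}

def on_slot_enter(zone_name, tracker_id, now):
    occupancy[zone_name] = (tracker_id, now)   # "stored" event

def on_slot_exit(zone_name):
    occupancy.pop(zone_name, None)             # "retrieved" event

def dwell_seconds(zone_name, now):
    """How long has the pallet been sitting in this slot?"""
    _, stored_at = occupancy[zone_name]
    return (now - stored_at).total_seconds()

t0 = datetime(2026, 3, 4, 14, 34, 12)
on_slot_enter("rack-A-bay-2-level-1", 14, t0)
```

"Is bay 2 occupied?" is then `"rack-A-bay-2-level-1" in occupancy`, and dead-stock detection is a scan over `occupancy` for dwell times above a threshold.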

Camera placement. A front-facing camera covering the full rack face works well for racks up to 3–4 levels high. For deep racks (double-deep or drive-in), you may need angled cameras or one camera per aisle. The key requirement is that the camera can see individual bays clearly enough for the model to detect pallets in each slot.

Same model, different context. Your RF-DETR pallet detection model should transfer well to rack images: pallets look similar whether they are on the floor or on a shelf. You may need to add some rack-specific training images (pallets viewed from the front at different shelf heights, partial views where racking beams partially occlude the pallet). A few hundred additional labeled images should be enough to fine-tune.

What This Unlocks

With both cameras running (one at end-of-line, one at the rack), you get a complete pallet journey:

Pallet #14
  Completed (end-of-line zone):     2:14:03 PM
  Collected (forklift pickup):      2:31:47 PM   | waited 17m 44s
  Stored (rack A, bay 2, level 1):  2:34:12 PM   | transit 2m 25s
  Retrieved (rack A, bay 2, level 1): next day 9:15 AM | stored 18h 41m
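The derived durations in that journey are simple timestamp differences. A minimal sketch, using the stdlib and a hypothetical `fmt` helper that renders a timedelta the way the ledger above displays it:

```python
from datetime import datetime

# The four ledger timestamps for pallet #14 from the journey above.
completed = datetime(2026, 3, 4, 14, 14, 3)
collected = datetime(2026, 3, 4, 14, 31, 47)
stored    = datetime(2026, 3, 4, 14, 34, 12)
retrieved = datetime(2026, 3, 5, 9, 15, 0)

def fmt(delta):
    """Render a timedelta as e.g. '17m 44s' or '18h 41m'."""
    s = int(delta.total_seconds())
    if s >= 3600:
        h, rem = divmod(s, 3600)
        return f"{h}h {round(rem / 60)}m"   # round long spans to the minute
    m, sec = divmod(s, 60)
    return f"{m}m {sec}s"

waited  = fmt(collected - completed)   # time at end-of-line before pickup
transit = fmt(stored - collected)      # forklift travel from line to rack
in_rack = fmt(retrieved - stored)      # storage dwell time
```

These three numbers are exactly the "waited", "transit", and "stored" columns in the ledger printout.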

This data answers questions that no clipboard can:

  • How long do pallets sit at end-of-line before pickup? (Forklift efficiency)
  • How long is the transit from line to rack? (Forklift routing)
  • Which rack positions turn over fastest? (Storage optimization)
  • Are there pallets that have been sitting in a rack slot for too long? (Dead stock detection)

The rack monitoring extension uses the exact same Roboflow Workflow pattern, the same RF-DETR model (with additional training data), and the same deployment script structure. The only difference is the number of zones and the physical meaning of enter/exit events.

Pallet Tracking System Conclusion

In this tutorial, we built a complete pallet accounting system that records every pallet from the moment it arrives at the end-of-line zone to the moment a forklift picks it up. To get started, create a free Roboflow account and follow the steps above.

If you want to discuss your specific facility setup, book a call with the Roboflow team.

Cite this Post

Use the following entry to cite this post in your research:

Timothy M. (Mar 4, 2026). How to Build Automated Pallet Accounting at End-of-Line with Roboflow. Roboflow Blog: https://blog.roboflow.com/automated-pallet-accounting/


Written by

Timothy M