Many production lines end the same way: packed pallets roll off the conveyor, land in a staging zone, and disappear into a "visibility gap." Between the moment a pallet is wrapped and the moment it’s picked up, your inventory exists only on a clipboard or in someone’s head.
The result? Inaccurate tallies, disputed shift totals, and zero insight into how long finished pallets actually sit idle.
In this tutorial, we’re closing that gap. We’ll build an automated Pallet Accounting System that transforms raw video into a timestamped ledger. By the end of this guide, you’ll have a system that records:
- Pallet Completed: When a packed pallet enters the end-of-line zone (The "In Time").
- Pallet Collected: When a forklift picks it up and clears the zone (The "Out Time").
No manual counting. No shift-end disputes. Just a camera, an RF-DETR model, and a Roboflow Workflow.
Output of System
How the Automated Pallet Accounting System Works
The logic follows exactly how pallets move on your floor:
- A pallet is packed with packages at the packing station.
- The packed pallet moves on a conveyor toward the end-of-line staging zone.
- The pallet enters the zone: this is the completion event. The system logs its tracker ID and timestamp.
- The pallet sits in the zone, waiting for pickup. The system tracks how long it has been there.
- A forklift picks up the pallet and drives away. The pallet is no longer detected in the zone: this is the collection event. The system logs the exit timestamp and computes the wait duration.
The key insight is that you don't need to detect the forklift explicitly to measure pickup. You measure the pallet's presence in the staging zone. When a tracked pallet enters the zone, it is "completed." When that same tracked pallet disappears from the zone for longer than a brief timeout, it has been "collected." This is a standard industrial pattern: zone presence serves as the proxy for both events.
An RF-DETR object detection model identifies pallets in every frame. ByteTrack assigns each pallet a persistent tracker ID so it is counted exactly once. A Time in Zone block monitors which pallets are inside the staging polygon and how long each has been there. The deployment script reads this data, maintains a per-pallet ledger, and computes warehouse metrics.
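The enter/exit logic described above can be sketched as a small, self-contained state machine. This is an illustrative simplification in plain Python (the tracker IDs, timestamps, and frame data are made up), not the deployment script built later in this guide:

```python
# Illustrative state machine for zone-presence accounting.
# Pure Python; the frame data below is invented for the example.

MISSING_TIMEOUT_SEC = 1.0

def account_pallets(frames, timeout=MISSING_TIMEOUT_SEC):
    """frames: list of (time_sec, set of tracker IDs detected in the zone)."""
    active = {}  # tracker_id -> {"in": entry time, "seen": last detection time}
    ledger = []  # finalized (tracker_id, in_time_sec, out_time_sec) events
    for t, ids in frames:
        for tid in ids:
            if tid not in active:
                active[tid] = {"in": t, "seen": t}  # ENTER: pallet completed
            else:
                active[tid]["seen"] = t
        # EXIT: undetected for longer than the timeout -> collected
        gone = [tid for tid, s in active.items()
                if tid not in ids and t - s["seen"] >= timeout]
        for tid in gone:
            s = active.pop(tid)
            ledger.append((tid, s["in"], s["seen"]))
    for tid, s in active.items():  # flush pallets still in zone at video end
        ledger.append((tid, s["in"], s["seen"]))
    return ledger

frames = [
    (0.0, {1}), (0.5, {1}), (1.0, {1, 2}),            # pallet 1 arrives, then pallet 2
    (1.5, {2}), (2.0, {2}), (2.5, {2}), (3.0, {2}),   # pallet 1 picked up
]
print(account_pallets(frames))  # [(1, 0.0, 1.0), (2, 1.0, 3.0)]
```

The brief timeout is what separates a real pickup from a momentary detection dropout; the production script applies the same idea per frame.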
The Automated Pallet Accounting System Output
Before diving into the build steps, here is exactly what this system produces, so you know what you are working toward.
1. Annotated Video
The system outputs a video with bounding boxes, each pallet's in-zone time once it enters the zone, and the zone polygon drawn on every frame. This is your visual audit trail: you can scrub through and see exactly when each pallet entered and left the zone.
Video output from Workflow
2. Pallet Event Ledger (pallet_zone_events.json)
Every pallet gets one entry in the ledger with its full lifecycle:
[
  {
    "tracker_id": 1,
    "class": "pallet",
    "in_time": "2026-03-04 05:50:50",
    "out_time": "2026-03-04 05:50:52",
    "in_time_sec": 0.625,
    "out_time_sec": 2.917,
    "duration_sec": 2.292,
    "max_time_in_zone_sec": 2.292,
    "last_confidence": 0.8548049926757812,
    "note": "Removed from zone"
  },
  {
    "tracker_id": 2,
    "class": "pallet",
    "in_time": "2026-03-04 05:50:53",
    "out_time": "2026-03-04 05:50:55",
    "in_time_sec": 3.792,
    "out_time_sec": 5.417,
    "duration_sec": 1.625,
    "max_time_in_zone_sec": 1.625,
    "last_confidence": 0.9335129261016846,
    "note": "Removed from zone"
  }
]

Each entry records when the pallet entered the zone (completion), when it left (forklift pickup), how long it waited, and the model's confidence. This is the core accounting output: one row per pallet, machine-readable, ready to feed into your WMS or shift reports.
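To show how downstream tooling might consume the ledger, here is a minimal sketch that computes per-pallet wait and the average pickup latency from the two sample entries above (inlined here for illustration; in practice you would json.load() pallet_zone_events.json):

```python
from statistics import mean

# The two sample ledger entries from above, inlined for illustration.
ledger = [
    {"tracker_id": 1, "duration_sec": 2.292},
    {"tracker_id": 2, "duration_sec": 1.625},
]

for e in ledger:
    print(f"pallet {e['tracker_id']} waited {e['duration_sec']}s before pickup")

avg_latency = round(mean(e["duration_sec"] for e in ledger), 3)
print("avg pickup latency:", avg_latency)  # 1.958
```

That 1.958 matches the avg_pickup_latency_sec value in the warehouse metrics shown next.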
3. Warehouse Metrics (warehouse_metrics.json)
The system computes operational KPIs from the ledger:
{
  "total_pallets_completed": 2,
  "avg_pickup_latency_sec": 1.958,
  "max_pickup_latency_sec": 2.292,
  "min_pickup_latency_sec": 1.625,
  "avg_forklift_cycle_time_sec": 2.5,
  "max_forklift_cycle_time_sec": 2.5,
  "min_forklift_cycle_time_sec": 2.5,
  "throughput_pallets_per_hour": 1502.5
}

These metrics tell you at a glance how the line performed: how fast pallets were completed, how quickly forklifts picked them up, and whether service levels were met.
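The throughput figure can look surprisingly high on a short clip. It is derived by extrapolating the observed window (first entry to last exit) to a full hour, which the arithmetic below reproduces from the sample ledger values:

```python
# How throughput_pallets_per_hour is derived: pallet count over the
# observed window, extrapolated to one hour. Values from the sample ledger.
in_times = [0.625, 3.792]    # in_time_sec of the two sample pallets
out_times = [2.917, 5.417]   # out_time_sec

window_sec = max(out_times) - min(in_times)        # ~4.792 s observed
throughput = len(in_times) / (window_sec / 3600)   # pallets per hour
print(round(throughput, 2))  # 1502.5 (large only because the clip is ~5 s long)
```

Over a full shift the window grows to hours and the number settles to a realistic rate.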
How to Build Automated Pallet Accounting at End-of-Line
Prerequisites:
- A free Roboflow account
- Images or video footage of your end-of-line area
- Python 3.10+
- inference-sdk installed (pip install inference-sdk)
Step 1: Create a Roboflow Project and Collect Data
Start by creating an Object Detection project in Roboflow. This project will hold your pallet images, annotations, and trained model.
Capture Footage
Mount a camera overlooking the end-of-line staging zone. Record 15–30 minutes of normal operations that include:
- Packed pallets arriving on the conveyor and entering the staging zone
- Pallets sitting in the zone, waiting
- Forklifts approaching, lifting, and driving away with pallets
- The zone empty (no pallets present)
- Different shift conditions, lighting changes, different pallet loads
Upload the video to your Roboflow project. Use Roboflow's frame extraction feature to sample frames.

Bootstrap with a Public Dataset (Optional)
If you want to prototype before capturing site-specific footage, Roboflow Universe has ready-to-use pallet datasets. You can use the dataset from this project as a starting point. This pallet-detection project contains labeled images of pallets in warehouse settings. Fork it into your workspace, then replace the data with your own for production.
Annotate
For this system, you need one class:
- pallet: any packed pallet visible in the frame
Open your project in Roboflow Annotate and draw tight bounding boxes around every pallet. Guidelines specific to this use case:
- Label pallets even when packages are stacked on top; the model should learn that a loaded pallet is still a pallet.
- Label pallets at all stages: on the conveyor, stationary in the zone, and being approached by a forklift.
- Do not label pallets that are already lifted off the ground on forklift forks. Once the forklift has it, it is no longer "in the zone."
- Include frames where no pallets are present. These negative examples help avoid false detections.
Use Roboflow's AI-assisted labeling tools to speed up annotation. Aim for at least 300-500 annotated images before your first training run.

Generate a Dataset Version
Once annotation is complete, generate a dataset version. For our project, the dataset version has the following configuration:
- Total Images: 470 (379 train / 61 valid / 30 test)
- Preprocessing: Auto-Orient applied, Resize (stretch) to 512×512
- Augmentations: 3 outputs per training example, Horizontal Flip
Click "Generate" to lock in a reproducible snapshot of your data.

Step 2: Train an RF-DETR Model
For this project, I trained an RF-DETR Small object detection model. RF-DETR is a fast object detection model well suited to real-time applications, and it handles partially visible and overlapping objects well, which matters when pallets are stacked or crowded in the staging zone.
I used the Roboflow autotraining option. This is the fastest path - no GPU setup, no code required. Roboflow Train uses a custom-optimized RF-DETR checkpoint that consistently produces higher mAP scores than the public COCO weights.
- Navigate to the dataset version you generated.
- Click "Train Model", then select "Custom Training".
- Under Select Architecture, choose Roboflow RF-DETR (marked as Recommended). Set Model Size to Small.
- Click "Continue", then "Start Training".

For my generated 470-image dataset, training completed and produced the following results:
- mAP@50: 99.4%
- Precision: 96.2%
- Recall: 100%
The trained model ID is pallet-detection-nlwmv/1, a Roboflow RF-DETR Object Detection (Small) model. These are strong numbers that indicate the model reliably detects pallets across the dataset.
Step 3: Build the Pallet Accounting Workflow
With a trained model, the next step is to build a Roboflow Workflow that connects detection to tracking, zone monitoring, and visualization, so every pallet entering and leaving the zone is accounted for.
Create a New Workflow
- In your Roboflow dashboard, click "Workflows" in the left sidebar.
- Click "Create Workflow".
- Select "Build My Own" to start from a blank canvas.
Here is the complete Workflow I built. You can use this workflow and explore the settings for each block.
Here's a block-by-block explanation:
Block 1: Input
The Workflow starts with an Input block that accepts an image. When running on video, each frame from the video feed is passed in as the input image.
- Input name: image
Block 2: Object Detection Model
Add an Object Detection Model block.
- Image: Connect to the input image.
- Model: Select your trained RF-DETR pallet detection model (e.g., pallet-detection-nlwmv/1).
- Confidence Threshold: Start at 0.5 (tune later based on results).
This block runs the RF-DETR model on every frame and outputs bounding boxes, class labels, and confidence scores for every detected pallet.
Block 3: Byte Tracker
Add a Byte Tracker block after the detection model.
- Detections: Connect to the Object Detection Model output.
ByteTrack assigns each detected pallet a unique, persistent tracker_id. Pallet #7 stays pallet #7 from the first frame it appears until it leaves the camera's field of view, even if it is briefly occluded by a forklift or another pallet. This persistent tracking is what makes accurate accounting possible: one pallet = one entry in the ledger, no double-counting.
Block 4: Time in Zone
This is the core accounting block. Add a Time in Zone block.
- Detections: Connect to the Byte Tracker output.
- Zone: Define a polygon that covers your end-of-line staging zone, the area where pallets sit after coming off the conveyor and before being picked up by a forklift.
The Time in Zone block tracks:
- Which tracker IDs are currently inside the polygon
- How long each tracked pallet has been in the zone (time_in_zone)
This is what we use downstream to determine both events: a pallet appearing in the zone = completed; that pallet disappearing = collected.
Block 5: Bounding Box Visualization
Add a Bounding Box Visualization block.
- Image: Connect to the input image.
- Predictions: Connect to the Time in Zone block's timed_detections output.
This draws bounding boxes around every detected pallet in the frame.
Block 6: Polygon Zone Visualization
Add a Polygon Zone Visualization block.
- Image: Connect to the output of the Bounding Box Visualization block (so the zone overlay is drawn on top of the bounding boxes).
This renders the staging zone polygon on the frame, making it easy to verify the zone is drawn correctly and pallets are being detected inside it.
Block 7: Label Visualization
Add a Label Visualization block.
- Image: Connect to the output of the Polygon Zone Visualization block.
- Detections: Connect to the Time in Zone output.
This shows each pallet's in zone time (e.g., In zone: 0.45s).
Configure Outputs
The Workflow needs to return two things:
- zone_time: the annotated visualization image from the Label Visualization block (label_visualization.image). This is used to produce the output video.
- zone_output: the structured data from the Time in Zone block (time_in_zone.all_properties). This contains the per-pallet tracker IDs, time-in-zone durations, and confidence scores, the data your accounting script reads.
In the Workflow Outputs section, add:
| Output Name | Value |
|---|---|
| zone_time | label_visualization.image |
| zone_output | time_in_zone.all_properties |
Step 4: Deploy and Run the System
With the Workflow built, we deploy it using the Roboflow inference-sdk Python package. The script streams video through the Workflow via WebRTC, reads the structured zone data from each frame, maintains a per-pallet accounting ledger, computes warehouse metrics live, and displays the annotated output as a real-time video feed. The architecture of the complete pipeline:
Camera / Video File
|
Roboflow Workflow (RF-DETR + ByteTrack + Time in Zone)
|
Zone Event Engine (ENTER / EXIT detection)
|
Pallet Accounting Ledger (pallet_zone_events.json)
|
Warehouse Analytics (warehouse_metrics.json + pallet_metrics.json)
|
Live Operator View (cv2.imshow with real-time playback)

Install Dependencies
pip install -U inference-sdk

The Deployment Script
Here is the complete deployment script. It connects to your Roboflow Workflow, processes video frame by frame, records pallet enter/exit events, updates warehouse metrics on every exit, and displays the annotated video as a live feed.
import cv2
import base64
import numpy as np
import json
import time
from datetime import datetime, timezone, timedelta
from statistics import mean
from inference_sdk import InferenceHTTPClient
from inference_sdk.webrtc import VideoFileSource, StreamConfig, VideoMetadata

Connect to Roboflow
client = InferenceHTTPClient.init(
    api_url="https://serverless.roboflow.com",
    api_key="YOUR_API_KEY"
)
source = VideoFileSource("pallet.mp4", realtime_processing=False)
VIDEO_OUTPUT = "zone_time"
config = StreamConfig(
    stream_output=[],
    data_output=["zone_time", "zone_output"],
    requested_plan="webrtc-gpu-medium",
    requested_region="us",
)
session = client.webrtc.stream(
    source=source,
    workflow="pallet-ac",
    workspace="your-workspace",
    image_input="image",
    config=config
)

InferenceHTTPClient connects to the Roboflow serverless API. VideoFileSource wraps a local video file as the input (for a live camera, use an RTSP URL instead). StreamConfig tells the Workflow which outputs to return per frame: zone_time for the visualization image and zone_output for the structured tracking data. The session streams the video through your pallet-ac Workflow via WebRTC.
Global State
The script maintains in-memory state that tracks every pallet currently in the zone, a finalized event ledger, and configuration for live display and reporting:
active_tracks = {} # tracker_id -> state dict
events = [] # finalized pallet events (the ledger)
frames = [] # for stitching output video
MISSING_TIMEOUT_SEC = 1.0
run_start_dt = datetime.now(timezone.utc)
# Live display
DISPLAY_LIVE = True
DISPLAY_WINDOW = "Pallet Workflow Live"
THROTTLE_TO_REALTIME = True
_last_video_t = None
_last_wall_t = None
# Output files
EVENT_FILE = "pallet_zone_events.json"
WAREHOUSE_FILE = "warehouse_metrics.json"
PALLET_FILE = "pallet_metrics.json"

MISSING_TIMEOUT_SEC prevents brief detection dropouts (a forklift momentarily blocking the camera) from being misinterpreted as a pallet leaving the zone. Only when a pallet has been undetected for longer than this timeout is it finalized as "collected."
DISPLAY_LIVE enables a real-time OpenCV window that shows the Workflow output as the video plays, like watching a live camera feed. THROTTLE_TO_REALTIME paces the display to match the original video speed rather than processing as fast as possible. Press q at any time to stop the stream.
Helper Functions
def video_time_seconds(metadata: VideoMetadata):
    return float(metadata.pts) * float(metadata.time_base)

def to_datetime_str(sec):
    dt = run_start_dt + timedelta(seconds=sec)
    return dt.strftime("%Y-%m-%d %H:%M:%S")

def parse_zone_predictions(data):
    if "zone_output" not in data:
        return []
    return data["zone_output"].get(
        "timed_detections", {}
    ).get("predictions", [])

parse_zone_predictions reads the structured output from the Time in Zone block. Each prediction represents a pallet currently detected inside the zone polygon, with its tracker_id, confidence, class, and time_in_zone.
Live Video Display
def show_live_frame(frame, t_sec):
    global _last_video_t, _last_wall_t
    if not DISPLAY_LIVE:
        return True
    if THROTTLE_TO_REALTIME:
        now_wall = time.time()
        if _last_video_t is None:
            _last_video_t = t_sec
            _last_wall_t = now_wall
        else:
            dv = t_sec - _last_video_t
            dw = now_wall - _last_wall_t
            sleep_time = max(0, dv - dw)
            if sleep_time > 0:
                time.sleep(sleep_time)
    cv2.imshow(DISPLAY_WINDOW, frame)
    key = cv2.waitKey(1) & 0xFF
    if key == ord("q"):
        return False
    return True

This function paces the frame display so the annotated video plays at real-time speed. The operator sees exactly what the camera sees, with bounding boxes, tracker IDs, and the zone polygon overlaid in real time.
Finalize a Track (Log the Exit Event)
When a pallet leaves the zone, this function closes its record, moves it to the event ledger, and immediately updates all report files:
def finalize_track(tracker_id, out_time):
    st = active_tracks.get(tracker_id)
    if not st:
        return
    duration = round(out_time - st["in_time_sec"], 3)
    event = {
        "tracker_id": tracker_id,
        "class": st.get("class", "pallet"),
        "in_time": st["in_time"],
        "out_time": to_datetime_str(out_time),
        "in_time_sec": round(st["in_time_sec"], 3),
        "out_time_sec": round(out_time, 3),
        "duration_sec": duration,
        "max_time_in_zone_sec": round(
            st.get("max_time_in_zone_sec", 0), 3
        ),
        "last_confidence": st.get("last_confidence"),
        "note": "Removed from zone"
    }
    events.append(event)
    del active_tracks[tracker_id]
    print(f"[EXIT] pallet {tracker_id} duration={duration}s")
    update_reports()

The key design decision here is that update_reports() is called inside finalize_track(). Every time a pallet exits the zone, all three JSON files (pallet_zone_events.json, warehouse_metrics.json, pallet_metrics.json) are rewritten with the latest data. In a live deployment, this means an external dashboard or WMS integration can read these files at any time and always see current numbers.
The Per-Frame Callback
This is the core of the accounting logic. It runs on every frame:
@session.on_data()
def on_data(data, metadata):
    t_sec = video_time_seconds(metadata)
    now_str = to_datetime_str(t_sec)
    preds = parse_zone_predictions(data)
    present_ids = set()
    for p in preds:
        tid = p.get("tracker_id")
        if tid is None:
            continue
        present_ids.add(tid)
        conf = p.get("confidence")
        cls = p.get("class", "pallet")
        time_in_zone = p.get("time_in_zone")
        if tid not in active_tracks:
            active_tracks[tid] = {
                "tracker_id": tid,
                "class": cls,
                "in_time_sec": t_sec,
                "in_time": now_str,
                "last_seen_sec": t_sec,
                "last_confidence": conf,
                "max_time_in_zone_sec": float(time_in_zone or 0)
            }
            print(f"[ENTER] pallet {tid}")
        else:
            st = active_tracks[tid]
            st["last_seen_sec"] = t_sec
            st["last_confidence"] = conf
            if time_in_zone:
                st["max_time_in_zone_sec"] = max(
                    st["max_time_in_zone_sec"],
                    float(time_in_zone)
                )
    # detect exits
    to_finalize = []
    for tid, st in active_tracks.items():
        missing = t_sec - st["last_seen_sec"]
        if missing >= MISSING_TIMEOUT_SEC and tid not in present_ids:
            to_finalize.append((tid, st["last_seen_sec"]))
    for tid, out_t in to_finalize:
        finalize_track(tid, out_t)
    # live display
    if VIDEO_OUTPUT in data:
        img = cv2.imdecode(
            np.frombuffer(
                base64.b64decode(data[VIDEO_OUTPUT]["value"]),
                np.uint8
            ),
            cv2.IMREAD_COLOR
        )
        keep_running = show_live_frame(img, t_sec)
        if not keep_running:
            session.close()
            return
        frames.append((t_sec, metadata.frame_id, img))
    print(
        f"frame {metadata.frame_id} "
        f"active={len(active_tracks)} "
        f"events={len(events)}"
    )

Here is what happens on every frame:
ENTER events. The script reads every pallet prediction from the Time in Zone output. If a tracker_id appears for the first time, it is logged as a new pallet entering the zone (the completion event). If the tracker_id is already known, the script updates the "last seen" time and maximum zone duration.
EXIT events. The script checks for any active pallet that was not present in this frame. If a pallet has been missing for longer than MISSING_TIMEOUT_SEC (default 1 second), it is finalized as a collection event. The out_time is set to the last frame where the pallet was actually detected, not the frame where the timeout triggered, which gives a more accurate pickup timestamp. On every exit, update_reports() immediately rewrites all JSON files with updated metrics.
Live display. The zone_time output is decoded and shown in an OpenCV window at real-time speed. The operator sees bounding boxes, tracker labels, and the zone polygon overlaid on the live feed.
Run the Stream and Save Results
session.run()

if DISPLAY_LIVE:
    cv2.destroyAllWindows()

for tid in list(active_tracks.keys()):
    finalize_track(tid, active_tracks[tid]["last_seen_sec"])

update_reports()

After the video ends (or the operator presses q), any pallets still in the zone are finalized and the reports are written one last time.

Stitch the Output Video (Optional)
The collected visualization frames are also saved as an MP4 file for offline review:
if frames:
    frames.sort(key=lambda x: x[1])
    fps = len(frames) / (frames[-1][0] - frames[0][0])
    h, w = frames[0][2].shape[:2]
    out = cv2.VideoWriter(
        "output.mp4",
        cv2.VideoWriter_fourcc(*"mp4v"),
        fps,
        (w, h)
    )
    for _, _, f in frames:
        out.write(f)
    out.release()
    print("output video saved")

Step 5: Compute Warehouse KPIs
The deployment script already computes and saves metrics on every pallet exit via update_reports(). Here is what each metric function produces and what the numbers mean operationally.
Warehouse Metrics (warehouse_metrics.json)
def compute_warehouse_metrics(events):
    if not events:
        return {}
    events_sorted = sorted(events, key=lambda e: e["in_time_sec"])
    latencies = [e["duration_sec"] for e in events_sorted]
    pickup_times = [e["out_time_sec"] for e in events_sorted]
    cycle_times = [
        pickup_times[i] - pickup_times[i - 1]
        for i in range(1, len(pickup_times))
    ]
    start = events_sorted[0]["in_time_sec"]
    end = max(e["out_time_sec"] for e in events_sorted)
    duration = max(0.001, end - start)
    throughput = len(events_sorted) / (duration / 3600)
    return {
        "total_pallets_completed": len(events_sorted),
        "avg_pickup_latency_sec": round(mean(latencies), 3),
        "max_pickup_latency_sec": round(max(latencies), 3),
        "min_pickup_latency_sec": round(min(latencies), 3),
        "avg_forklift_cycle_time_sec": round(mean(cycle_times), 3)
        if cycle_times else None,
        "max_forklift_cycle_time_sec": max(cycle_times)
        if cycle_times else None,
        "min_forklift_cycle_time_sec": min(cycle_times)
        if cycle_times else None,
        "throughput_pallets_per_hour": round(throughput, 2)
    }

This produces metrics like:
| Metric | Description |
|---|---|
| total_pallets_completed | Total pallets that entered and left the zone |
| avg_pickup_latency_sec | Average time pallets waited before forklift pickup |
| max_pickup_latency_sec | Longest any pallet waited |
| min_pickup_latency_sec | Shortest wait time |
| throughput_pallets_per_hour | Line throughput over the observed period |
| avg_forklift_cycle_time_sec | Average time between consecutive forklift pickups |
Reading the Metrics: What to Do with These Numbers
These are not just numbers to log. Each one points to a specific operational decision:
- avg_pickup_latency_sec: Consistently above 5 minutes? You likely need a second forklift assigned to the end-of-line area, or your current forklift driver is being pulled to other tasks too often.
- max_pickup_latency_sec: Spiking on certain shifts? Compare across shifts to identify staffing or process gaps. One shift may be understaffed or have a less experienced driver.
- throughput_pallets_per_hour: Dropping over time? The production line may be slowing down, or pallets are backing up because forklifts cannot keep pace.
- avg_forklift_cycle_time_sec: Increasing? The forklift is taking longer between pickups. This could mean longer drive distances (rack is filling up far from the line), more congestion in aisles, or driver fatigue toward end of shift.
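As a sketch of how these numbers can drive action, here is a minimal alerting check. The 300-second threshold and the sample metric values are illustrative assumptions; a real deployment would load warehouse_metrics.json instead of the inline dict:

```python
# Sketch: flag service-level breaches from the warehouse metrics.
# Threshold and sample values are assumptions for illustration.
ALERT_LATENCY_SEC = 300  # example service level: pickup within 5 minutes

metrics = {  # in practice: json.load(open("warehouse_metrics.json"))
    "avg_pickup_latency_sec": 412.0,
    "throughput_pallets_per_hour": 38.5,
}

alerts = []
if metrics["avg_pickup_latency_sec"] > ALERT_LATENCY_SEC:
    alerts.append(
        f"avg pickup latency {metrics['avg_pickup_latency_sec']}s exceeds "
        f"{ALERT_LATENCY_SEC}s; consider a second forklift at end-of-line"
    )
print("\n".join(alerts) if alerts else "all service levels met")
```

The same pattern extends to throughput or cycle-time thresholds per shift.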
Pallet Metrics (pallet_metrics.json)
def compute_pallet_metrics(events):
    if not events:
        return {}
    ev = sorted(events, key=lambda x: x["in_time_sec"])
    durations = [e["duration_sec"] for e in ev]
    start = ev[0]["in_time_sec"]
    end = max(e["out_time_sec"] for e in ev)
    window = max(0.001, end - start)
    throughput = len(ev) / (window / 3600)
    return {
        "total_pallet_events": len(ev),
        "unique_pallets": len(set(e["tracker_id"] for e in ev)),
        "avg_dwell_sec": round(mean(durations), 3),
        "max_dwell_sec": round(max(durations), 3),
        "min_dwell_sec": round(min(durations), 3),
        "throughput_pallets_per_hour": round(throughput, 2)
    }

This function focuses on pallet flow rather than forklift performance. avg_dwell_sec is how long pallets typically sit in the zone before pickup. unique_pallets confirms that tracker IDs are not being reassigned (if unique_pallets equals total_pallet_events, each pallet was tracked as a distinct object). throughput_pallets_per_hour gives the overall line speed from the pallet perspective.
Live Report Updates
Both metric functions are called inside update_reports(), which runs every time a pallet exits the zone:
def update_reports():
    with open(EVENT_FILE, "w") as f:
        json.dump(events, f, indent=2)
    wm = compute_warehouse_metrics(events)
    with open(WAREHOUSE_FILE, "w") as f:
        json.dump(wm, f, indent=2)
    pm = compute_pallet_metrics(events)
    with open(PALLET_FILE, "w") as f:
        json.dump(pm, f, indent=2)
    print("\n--- Warehouse Metrics ---")
    print(json.dumps(wm, indent=2))

Because this runs on every exit event, the three JSON files are always current. An external process (a warehouse dashboard, a WMS connector, a shift report generator) can read these files at any time and get up-to-date numbers without waiting for the video to finish.
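A sketch of such an external consumer, polling the metrics file for changes. This is a hypothetical standalone script; it writes a small demo file so the example is self-contained, where a real dashboard would watch warehouse_metrics.json written by the deployment script:

```python
import json
import os
import time

# Demo file so this sketch runs standalone; a real consumer would
# point at the warehouse_metrics.json the deployment script rewrites.
METRICS_FILE = "warehouse_metrics_demo.json"
with open(METRICS_FILE, "w") as f:
    json.dump({"total_pallets_completed": 2}, f)

last_mtime = None
for _ in range(3):  # a real dashboard would loop forever
    mtime = os.path.getmtime(METRICS_FILE)
    if mtime != last_mtime:  # file was rewritten since last poll
        with open(METRICS_FILE) as f:
            print("metrics updated:", json.load(f))
        last_mtime = mtime
    time.sleep(0.1)

os.remove(METRICS_FILE)  # clean up the demo file
```

Because the script rewrites whole files on each exit event, a simple mtime poll is enough; no message queue or database is required for a first integration.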
Step 6: Adapting for Live Camera Feeds
The deployment script above processes a video file. For a live production deployment, swap VideoFileSource for your camera's RTSP stream URL. The accounting logic remains identical: the script processes each frame as it arrives and maintains the pallet ledger in real time.
For edge deployment on devices like NVIDIA Jetson:
- Install Roboflow Inference on the device.
- Deploy your Workflow to run locally, no cloud dependency needed.
- Connect your camera directly to the device.
RF-DETR runs at 25+ FPS on NVIDIA Jetson devices and 100+ FPS on NVIDIA T4 GPUs, which is more than sufficient for real-time pallet tracking.
See the Roboflow Inference deployment documentation for device-specific setup instructions.
Step 7: Tips for a Reliable Production System
Drawing the Zone Polygon
The zone polygon should tightly cover the area where pallets sit between completion and forklift pickup. Get the coordinates right:
- Make sure the whole pallet, with its bounding box, is covered by the zone.
- Do not extend the zone onto the conveyor; otherwise pallets get marked "complete" before they have actually arrived in the staging area.
- Do not extend into the forklift driving lane; forklifts passing by (not picking up) could cause tracking noise.
- Use a frame from your actual camera feed in the Polygon Zone tool at your exact camera resolution.
Tuning MISSING_TIMEOUT_SEC
This parameter controls how long a pallet can be undetected before it is declared "collected." The right value depends on your environment:
- Too low (e.g., 0.3s): Brief occlusions (a forklift momentarily blocking the camera view) will be misread as exits, creating false collection events.
- Too high (e.g., 5s): The system is slow to recognize that a pallet has been picked up, and the out_time will lag behind reality.
- The default of 1.0s works well for most setups at 30 FPS. Increase it if your forklifts routinely block the camera for longer than a second.
Handling Tracker ID Reassignment
In rare cases, ByteTrack may lose a pallet's tracker ID and assign a new one, making one physical pallet appear as two events. This can happen when a pallet is occluded for longer than the tracker's buffer. To mitigate this:
- Increase the track_buffer parameter in the Byte Tracker block settings (default is 30 frames).
- Ensure consistent lighting and camera angle to minimize detection dropouts.
- Post-process the ledger to merge events that overlap in time and position.
Confidence Tuning
Start at 0.5 confidence threshold. If the model produces false detections (counting shadows or boxes as pallets), increase it. If it misses real pallets, decrease it. Well-trained pallet models typically perform best between 0.35 and 0.55.
Multiple Pallets in the Zone
The system handles multiple simultaneous pallets naturally. ByteTrack gives each pallet a unique tracker ID. If three pallets are in the zone and a forklift takes one, only that one tracker ID disappears; the other two remain tracked with their own independent ledger entries.
Known Edge Cases and Hardening for Production
The core system, zone presence as a proxy for completion and collection, is a standard industrial pattern and works well in most environments. That said, there are edge cases worth understanding as you move toward a hardened production deployment.
Edge Case 1: Temporary Occlusion Creates a False Exit
A forklift approaching the zone can briefly block the camera's view of the pallet. If the occlusion lasts longer than MISSING_TIMEOUT_SEC, the system will falsely declare the pallet "collected", and then re-enter it as a new pallet when the forklift moves and the pallet becomes visible again.
Mitigation:
- Tune MISSING_TIMEOUT_SEC based on real occlusion durations at your site. Record a few hours of footage, observe how long forklifts typically block the view, and set the timeout above that duration. A value of 2–3 seconds handles most real-world forklift approaches.
- Increase the track_buffer parameter in the Byte Tracker block (default 30 frames). A higher buffer lets ByteTrack hold onto a tracker ID longer during detection dropouts.
Edge Case 2: Tracker ID Reassignment Creates Duplicate Events
If a pallet is occluded for longer than the tracker buffer, ByteTrack may drop the ID entirely and assign a new tracker_id when the pallet reappears. This creates two ledger entries for the same physical pallet, one short "false" event and one real one.
Mitigation:
- Increase track_buffer in the Byte Tracker block to cover the longest expected occlusion at your site.
- Add post-processing logic to the ledger: merge any two events where the "out time" of one and the "in time" of the next are within a few seconds of each other and the bounding box positions overlap. This catches most reassignment cases.
- Ensure consistent lighting and minimize camera obstructions to reduce detection dropouts in the first place.
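As an illustration of the merge idea, here is a minimal post-processing sketch. It applies only the time-gap heuristic (the bounding-box overlap check is omitted), and the 3-second gap is an assumed threshold to tune per site:

```python
# Post-processing sketch: merge ledger events likely caused by tracker ID
# reassignment. Time-gap heuristic only; a production version would also
# require the two events' bounding boxes to overlap.
MERGE_GAP_SEC = 3.0  # assumed threshold; tune per site

def merge_reassigned(events, gap=MERGE_GAP_SEC):
    events = sorted(events, key=lambda e: e["in_time_sec"])
    merged = []
    for e in events:
        prev = merged[-1] if merged else None
        if prev is not None and 0 <= e["in_time_sec"] - prev["out_time_sec"] <= gap:
            # Treat as the same physical pallet: extend the earlier event.
            prev["out_time_sec"] = e["out_time_sec"]
            prev["duration_sec"] = round(
                prev["out_time_sec"] - prev["in_time_sec"], 3)
        else:
            merged.append(dict(e))
    return merged

events = [
    {"tracker_id": 5, "in_time_sec": 10.0, "out_time_sec": 14.2, "duration_sec": 4.2},
    {"tracker_id": 9, "in_time_sec": 14.9, "out_time_sec": 30.0, "duration_sec": 15.1},
]
print(merge_reassigned(events))  # one merged event: tracker 5, duration 20.0s
```

Run this over pallet_zone_events.json after a shift, before metrics are computed, so reassignments do not inflate the pallet count.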
Next-Level Hardening
For facilities that need even higher reliability, consider these additions:
- Add a second zone or line crossing for the forklift lane. Instead of relying solely on pallet disappearance, detect when a forklift enters the staging area as a confirmation signal. If a pallet disappears and a forklift was present in the zone at the same time, the collection event has stronger evidence.
- Detect the forklift as a separate class. Train your RF-DETR model on both pallet and forklift classes. Use a Detections Filter block in the Workflow to separate them. This lets you log forklift activity independently and cross-reference it with pallet exits.
- Add a "reappearance merge" step in post-processing. If a tracker ID exits and a new tracker ID enters within N seconds at a similar position, treat them as the same pallet.
These are iterative improvements. The core zone-based system handles the vast majority of cases correctly, and these hardening steps address the long tail.
Going Further: Rack Monitoring with a Second Camera
The end-of-line system tracks pallets from completion to forklift pickup. But where do those pallets go next? After pickup, the typical paths a pallet can take are:
- Direct shipment (cross-docking) where the pallet goes straight to an outbound truck without ever hitting a rack
- Temporary storage in racks where the pallet is placed in a rack slot until it is needed for an order
- Consolidation with other pallets where multiple pallets are grouped together before shipping
- Loading dock staging where pallets are queued at the dock waiting for a truck
Each of these paths is a potential monitoring point. In this section, we focus on rack storage since it is the most common destination and the easiest to extend with the same Workflow pattern you already built.
A natural extension of this system is to add a second camera that monitors the rack, tracking when pallets are placed into rack slots and when they are removed.
The Concept
Mount a camera facing the rack, a front-facing view of the shelving bays.

Each rack slot (the individual bay where a pallet sits) can be defined as its own polygon zone. The same Workflow pattern applies:
- Pallet appears in a rack slot zone → logged as "stored" with a timestamp and rack position (e.g., Row B, Bay 3, Level 2).
- Pallet disappears from a rack slot zone → logged as "retrieved", a forklift has pulled it for shipping or replenishment.
This gives you full lifecycle visibility: completed → collected → stored → retrieved.
How to Build It
The architecture is the same Workflow you already built, Object Detection Model → Byte Tracker → Time in Zone → Visualizations, with a few adjustments:
Multiple zones instead of one. Define a separate polygon for each rack slot. The Time in Zone block can monitor multiple zones simultaneously. Each zone corresponds to a physical rack position, so you know not just that a pallet was stored, but where.
Zone naming. Name each zone with the rack position (e.g., rack-A-bay-2-level-1). When a pallet enters that zone, the ledger entry includes the position. This makes it straightforward to answer questions like "is bay 3 occupied?" or "how long has that pallet been sitting in A-2-1?"
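A minimal sketch of recovering the physical position from such a zone name. parse_rack_zone is a hypothetical helper that assumes the rack-A-bay-2-level-1 convention suggested above:

```python
# Hypothetical helper: recover the physical rack position from a zone name
# following the rack-A-bay-2-level-1 convention.
def parse_rack_zone(name: str) -> dict:
    parts = name.split("-")  # e.g. ["rack", "A", "bay", "2", "level", "1"]
    return {"row": parts[1], "bay": int(parts[3]), "level": int(parts[5])}

print(parse_rack_zone("rack-A-bay-2-level-1"))
# {'row': 'A', 'bay': 2, 'level': 1}
```

With this, the rack ledger can answer occupancy queries ("is bay 3 occupied?") by joining zone names back to physical positions.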
Camera placement. A front-facing camera covering the full rack face works well for racks up to 3–4 levels high. For deep racks (double-deep or drive-in), you may need angled cameras or one camera per aisle. The key requirement is that the camera can see individual bays clearly enough for the model to detect pallets in each slot.
Same model, different context. Your RF-DETR pallet detection model should transfer well to rack images; pallets look similar whether they are on the floor or on a shelf. You may need to add some rack-specific training images (pallets viewed from the front at different shelf heights, partial views where racking beams partially occlude the pallet). A few hundred additional labeled images should be enough to fine-tune.
What This Unlocks
With both cameras running, one at end-of-line and one at the rack, you get a complete pallet journey:
Pallet #14
Completed (end-of-line zone): 2:14:03 PM
Collected (forklift pickup): 2:31:47 PM | waited 17m 44s
Stored (rack A, bay 2, level 1): 2:34:12 PM | transit 2m 25s
Retrieved (rack A, bay 2, level 1): next day 9:15 AM | stored 18h 41m

This data answers questions that no clipboard can:
- How long do pallets sit at end-of-line before pickup? (Forklift efficiency)
- How long is the transit from line to rack? (Forklift routing)
- Which rack positions turn over fastest? (Storage optimization)
- Are there pallets that have been sitting in a rack slot for too long? (Dead stock detection)
The rack monitoring extension uses the exact same Roboflow Workflow pattern, the same RF-DETR model (with additional training data), and the same deployment script structure. The only difference is the number of zones and the physical meaning of enter/exit events.
Pallet Tracking System Conclusion
In this tutorial, we built a complete pallet accounting system that records every pallet from the moment it arrives at the end-of-line zone to the moment a forklift picks it up. To get started, create a free Roboflow account and follow the steps above.
If you want to discuss your specific facility setup, book a call with the Roboflow team.
Cite this Post
Use the following entry to cite this post in your research:
Timothy M. (Mar 4, 2026). How to Build Automated Pallet Accounting at End-of-Line with Roboflow. Roboflow Blog: https://blog.roboflow.com/automated-pallet-accounting/