
Process a Video Stream

Run a Roboflow Workflow against a video file or live RTSP stream.

This recipe covers both video files and live RTSP/HLS streams. The runtime side is documented in depth in the product docs; this page focuses on wiring the pieces together from a developer's perspective.

There are two practical paths:

  • Hosted (Serverless v2) - submit a video file or URL and get back per-frame predictions. Best for batch / one-shot processing.

  • Self-hosted Inference - stream frames into a local Inference server and consume predictions in real time. Best for low-latency or edge use cases.

Path A - Hosted: process a video file

Use the Serverless Video API for hosted batch video processing.

import os
import time
import requests

API_KEY = os.environ["ROBOFLOW_API_KEY"]
WORKSPACE = "my-workspace"
WORKFLOW = "my-detector-workflow"

# 1. Get a signed upload URL.
signed = requests.post(
    "https://api.roboflow.com/video_upload_signed_url/",
    params={"api_key": API_KEY, "file_name": "input.mp4"},
).json()

# 2. Upload the file and stop early if the upload failed.
with open("input.mp4", "rb") as f:
    upload = requests.put(signed["signedUrl"], data=f, headers={"Content-Type": "video/mp4"})
upload.raise_for_status()

# 3. Submit the inference job pointing at the uploaded file.
job = requests.post(
    "https://api.roboflow.com/videoinfer/",
    json={
        "api_key": API_KEY,
        "workspace": WORKSPACE,
        "workflow": WORKFLOW,
        "video_url": signed["fileUrl"],
        "fps": 5,
    },
).json()
job_id = job["jobId"]

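# 4. Poll until the job reaches a terminal state ("complete" or "failed").
while True:
    resp = requests.get(
        "https://api.roboflow.com/videoinfer",
        params={"api_key": API_KEY, "jobId": job_id},  # let requests encode the query string
    )
    resp.raise_for_status()  # surface HTTP errors instead of failing on a missing key
    status = resp.json()
    if status["status"] in ("complete", "failed"):
        break
    time.sleep(5)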

print(status)

The output structure (per-frame predictions, timing) is documented in the Serverless Video API reference.
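The polling loop in step 4 runs until the job reports a terminal status, so a stuck job would block the script indefinitely. A minimal sketch of a bounded variant follows. The helper name `poll_until_done` and its parameters are hypothetical and not part of any Roboflow SDK; the only thing it assumes about the API is the `status` field and the `complete` / `failed` values used above.

```python
import time
from typing import Any, Callable, Dict, Tuple


def poll_until_done(
    fetch_status: Callable[[], Dict[str, Any]],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    terminal: Tuple[str, ...] = ("complete", "failed"),
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Call fetch_status() until it reports a terminal status or time runs out.

    fetch_status is any zero-argument callable returning the decoded JSON
    status payload (hypothetical helper; wrap the GET request from step 4).
    """
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        # A terminal status ends the loop, whether the job succeeded or failed.
        if status.get("status") in terminal:
            return status
        # Give up once the deadline has passed rather than waiting forever.
        if clock() >= deadline:
            raise TimeoutError(f"job still not finished after {timeout_s} seconds")
        sleep(interval_s)
```

With the snippet above, `fetch_status` could be a small function that performs the step 4 GET request and returns `resp.json()`. Injecting `sleep` and `clock` keeps the helper easy to exercise without real waiting.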

Path B - Self-hosted: stream frames to local Inference

For real-time or low-latency use, run Roboflow Inference locally and stream frames into it from Python.

InferencePipeline handles frame decoding, batching, and async predictions. Replace render_boxes with your own callback to do something useful with the predictions (push to a webhook, write to a DB, raise an alert).
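As a rough sketch of what a custom callback could look like, the example below records one small summary per frame and wires it into `InferencePipeline.init_with_workflow`. Several things here are assumptions: the helper names `make_recorder` and `run_pipeline` are hypothetical, the workspace and workflow names are the placeholders from Path A, a single video source is assumed (so each result arrives as one dict), and the keys in that dict depend entirely on the outputs your Workflow defines.

```python
import os
from typing import Any, Callable, Dict


def make_recorder(sink: Callable[[Dict[str, Any]], None]) -> Callable[[Dict[str, Any], Any], None]:
    """Build an on_prediction callback that forwards a per-frame summary to sink.

    sink can be anything callable: list.append, a DB writer, a webhook poster.
    """

    def on_prediction(result: Dict[str, Any], video_frame: Any) -> None:
        sink(
            {
                # frame_id is read defensively in case the frame object lacks it.
                "frame_id": getattr(video_frame, "frame_id", None),
                # Only the output names are recorded; their contents are workflow-specific.
                "outputs": sorted(result.keys()),
            }
        )

    return on_prediction


def run_pipeline(video_reference: str, sink: Callable[[Dict[str, Any]], None]) -> None:
    """Run the workflow on a file path or stream URL until the source ends."""
    # Imported lazily so the callback can be tested without the inference package.
    from inference import InferencePipeline  # requires: pip install inference

    pipeline = InferencePipeline.init_with_workflow(
        api_key=os.environ["ROBOFLOW_API_KEY"],
        workspace_name="my-workspace",        # placeholder, as in Path A
        workflow_id="my-detector-workflow",   # placeholder, as in Path A
        video_reference=video_reference,
        on_prediction=make_recorder(sink),
        max_fps=5,                            # illustrative cap, not a recommendation
    )
    pipeline.start()
    pipeline.join()
```

Calling `run_pipeline("rtsp://camera.local/stream", print)` would print one summary per processed frame; the RTSP URL is a made-up example. Keeping the callback free of inference-specific imports makes it straightforward to unit test with a fake frame object.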

When to use which

|          | Hosted                           | Self-hosted                |
| -------- | -------------------------------- | -------------------------- |
| Latency  | Seconds (per-frame after upload) | Tens of milliseconds       |
| Setup    | Just an API key                  | Docker host or edge device |
| Cost     | Per-frame credits                | Hardware + electricity     |
| Best for | Batch analysis, one-off footage  | Live streams, alerts, edge |

Logging predictions to Vision Events

Either path can log structured predictions to Vision Events for dashboards and alerting. The Vision Events SDK and REST docs cover this logging path in detail.
