Run a Model on an Image

Overview of Roboflow's hosted inference endpoints and where to find detailed reference documentation.

Roboflow exposes inference through several runtimes. The right choice depends on whether you're calling a single model or a Workflow, how much throughput you need, and where the workload runs.

This page is a brief overview. The detailed inference reference lives in the product documentation on the same docs site, and cross-links point to the deeper material.

Inference runtimes

| Runtime | Use when |
| --- | --- |
| Serverless v2 (serverless.roboflow.com) | Default. Hosted, auto-scaling, supports models and Workflows. |
| Dedicated Deployments | You need predictable latency, high throughput, or pinned GPU type. Managed by Roboflow. |
| Roboflow Inference (self-hosted) | On-prem, edge devices, air-gapped environments, or workloads that can't leave your VPC. Open source. |

Calling Serverless v2

Run a model:

curl -F "file=@photo.jpg" \
  "https://serverless.roboflow.com/infer/<workspace>/<project>/<version>?api_key=$ROBOFLOW_API_KEY&confidence=0.5"
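For callers who prefer Python over curl, the same model request can be sketched with the standard library alone. This is a minimal illustration, not an official client: the helper names (`build_model_url`, `encode_multipart_file`, `build_model_request`) are hypothetical, and the URL shape, the `file` form field, and the `api_key` and `confidence` query parameters are copied from the curl command above.

```python
import mimetypes
import os
import urllib.parse
import urllib.request
import uuid

BASE_URL = "https://serverless.roboflow.com"


def build_model_url(workspace, project, version, api_key, confidence=0.5):
    """Build the model inference URL used in the curl example."""
    path = f"/infer/{workspace}/{project}/{version}"
    query = urllib.parse.urlencode({"api_key": api_key, "confidence": confidence})
    return f"{BASE_URL}{path}?{query}"


def encode_multipart_file(field, filename, data, boundary=None):
    """Encode one file as multipart/form-data, roughly what `curl -F` sends."""
    boundary = boundary or uuid.uuid4().hex
    # Guess the MIME type from the file name; fall back to a generic type.
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def build_model_request(image_path, workspace, project, version, api_key,
                        confidence=0.5):
    """Prepare (but do not send) a POST request that uploads a local image."""
    with open(image_path, "rb") as f:
        data = f.read()
    body, content_type = encode_multipart_file(
        "file", os.path.basename(image_path), data
    )
    url = build_model_url(workspace, project, version, api_key, confidence)
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
```

Passing the prepared request to `urllib.request.urlopen` sends it. Separating request construction from sending keeps the URL and body easy to inspect before any network call is made.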

Run a Workflow:

curl -X POST "https://serverless.roboflow.com/infer/workflows/<workspace>/<workflow>" \
  -H "Content-Type: application/json" \
  -d '{
    "api_key": "'"$ROBOFLOW_API_KEY"'",
    "inputs": { "image": { "type": "url", "value": "https://example.com/photo.jpg" } }
  }'
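The Workflow call above maps onto a small JSON POST. The sketch below builds an equivalent request with Python's standard library. The function names are hypothetical, the payload mirrors the curl body field for field, and treating the response as JSON is an assumption that should be checked against the product docs.

```python
import json
import urllib.request

BASE_URL = "https://serverless.roboflow.com"


def build_workflow_request(workspace, workflow, api_key, image_url):
    """Prepare (but do not send) the Workflow POST shown in the curl example."""
    url = f"{BASE_URL}/infer/workflows/{workspace}/{workflow}"
    payload = {
        "api_key": api_key,
        "inputs": {"image": {"type": "url", "value": image_url}},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send a prepared request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

A call such as `send(build_workflow_request("<workspace>", "<workflow>", api_key, "https://example.com/photo.jpg"))` would perform the request. Building the payload with `json.dumps` also avoids the shell-quoting pitfalls of embedding the API key in a hand-written JSON string.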

The full request/response reference, including streaming, batching, and per-task response shapes, is in the product docs deployment section.

Deprecated: Serverless v1

The legacy task-specific endpoints (detect.roboflow.com, classify.roboflow.com, outline.roboflow.com, segment.roboflow.com) are deprecated. They still respond for backwards compatibility, but new code should use serverless.roboflow.com instead.

If you find a snippet pointing to one of these task hosts, treat it as legacy and translate it to the Serverless v2 form above.
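When auditing a codebase for legacy snippets, a small check against the four deprecated hosts can help. The helpers below are a hypothetical sketch: they only flag legacy hosts and do not rewrite URLs, because the mapping from legacy paths to the Serverless v2 path is not described on this page.

```python
import re
from urllib.parse import urlparse

# The four deprecated Serverless v1 task hosts listed above.
LEGACY_HOSTS = frozenset({
    "detect.roboflow.com",
    "classify.roboflow.com",
    "outline.roboflow.com",
    "segment.roboflow.com",
})

LEGACY_HOST_RE = re.compile(
    r"\b(?:detect|classify|outline|segment)\.roboflow\.com\b"
)


def is_legacy_endpoint(url):
    """Return True if the URL's host is one of the deprecated task hosts."""
    host = urlparse(url).hostname  # lowercased by urlparse; None if absent
    return host is not None and host in LEGACY_HOSTS


def find_legacy_hosts(text):
    """Return the sorted, de-duplicated legacy hosts mentioned in a text blob."""
    return sorted(set(LEGACY_HOST_RE.findall(text)))
```

Running `find_legacy_hosts` over source files or notebooks gives a quick list of call sites that still need migrating to serverless.roboflow.com.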

SDK and CLI shortcuts

If you're building a Python integration, the SDK and CLI wrap these calls and handle authentication and JSON parsing for you.

Both ultimately hit the same Serverless v2 endpoint described above.
