Dedicated Deployments

Provision and manage Dedicated Deployment GPU machines via REST.

Dedicated Deployments are managed GPU machines that run your Roboflow models with predictable latency and high throughput. They are managed by a dedicated service hosted at https://roboflow.cloud, separate from the main https://api.roboflow.com REST API.

This page documents the management endpoints (create, get, pause, resume, delete, logs, usage). For inference against a deployment once it's live, see Run a Model on an Image.

The "edge devices" documentation under Deployment Manager is a separate product. Dedicated Deployments are managed GPU machines in Roboflow's cloud; Deployment Manager devices are on-prem hardware running Roboflow Inference.

Base URL: https://roboflow.cloud

api_key is required on every request. GET endpoints take it as a query parameter; POST /add takes it as a field in the JSON body.

List Machine Types

GET /machine_types

curl "https://roboflow.cloud/machine_types?api_key=$ROBOFLOW_API_KEY"

Response

{
  "machine_types": [
    { "name": "gpu-small",  "description": "1× T4, 4 vCPU, 16 GB RAM" },
    { "name": "gpu-medium", "description": "1× L4, 8 vCPU, 32 GB RAM" }
  ]
}
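
As a minimal sketch, the response above can be reduced to the list of names that machine_type accepts when creating a deployment. The helper names here are hypothetical, and the code assumes the response has exactly the shape shown in the example.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://roboflow.cloud"


def machine_types_url(api_key: str) -> str:
    """Build the GET /machine_types URL with api_key as a query parameter."""
    query = urllib.parse.urlencode({"api_key": api_key})
    return f"{BASE_URL}/machine_types?{query}"


def machine_type_names(payload: dict) -> list[str]:
    """Extract the `name` of each entry (assumes the response shape shown above)."""
    return [entry["name"] for entry in payload.get("machine_types", [])]


def fetch_machine_type_names(api_key: str) -> list[str]:
    """Call the endpoint and return the available machine type names."""
    with urllib.request.urlopen(machine_types_url(api_key), timeout=30) as resp:
        return machine_type_names(json.load(resp))
```

Calling fetch_machine_type_names(os.environ["ROBOFLOW_API_KEY"]) performs the same request as the curl command.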

Create a Deployment

POST /add

Body (JSON)

Name                  Type     Description
api_key               string   Workspace API key.
creator_email         string   Email of a workspace member.
deployment_name       string   Unique name within the workspace.
machine_type          string   From /machine_types.
duration              integer  Hours before auto-cleanup.
delete_on_expiration  boolean  true to delete on expiration; false to pause.
inference_version     string   Inference server version. Omit for latest.
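
The body fields above could be assembled into a request roughly as follows. This is a sketch, not an official client: build_create_request is a hypothetical helper, and the Content-Type header is an assumption based on the body being JSON.

```python
import json
import urllib.request

BASE_URL = "https://roboflow.cloud"


def build_create_request(api_key, creator_email, deployment_name, machine_type,
                         duration, delete_on_expiration, inference_version=None):
    """Build (but do not send) the POST /add request."""
    body = {
        "api_key": api_key,
        "creator_email": creator_email,
        "deployment_name": deployment_name,
        "machine_type": machine_type,
        "duration": duration,                          # hours before auto-cleanup
        "delete_on_expiration": delete_on_expiration,  # True deletes, False pauses
    }
    # Leave the field out entirely to get the latest inference server version.
    if inference_version is not None:
        body["inference_version"] = inference_version
    return urllib.request.Request(
        f"{BASE_URL}/add",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen sends the request. Building it separately makes the body easy to inspect before anything is provisioned.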

The deployment provisions asynchronously. Poll GET /get until status == "ready".
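
A polling loop for this could look like the sketch below. It assumes GET /get returns a JSON object with a top-level status field. The helper names, the 10-second interval, and the 15-minute timeout are illustrative choices, not documented values.

```python
import json
import time
import urllib.parse
import urllib.request

BASE_URL = "https://roboflow.cloud"


def get_deployment(api_key, deployment_name):
    """Fetch the deployment record from GET /get."""
    query = urllib.parse.urlencode(
        {"api_key": api_key, "deployment_name": deployment_name}
    )
    with urllib.request.urlopen(f"{BASE_URL}/get?{query}", timeout=30) as resp:
        return json.load(resp)


def wait_until_ready(api_key, deployment_name, interval_s=10.0, timeout_s=900.0,
                     fetch=get_deployment, sleep=time.sleep):
    """Poll until status == "ready"; raise TimeoutError if that takes too long."""
    deadline = time.monotonic() + timeout_s
    while True:
        info = fetch(api_key, deployment_name)
        # Assumption: `status` is a top-level key of the response object.
        if info.get("status") == "ready":
            return info
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{deployment_name} not ready after {timeout_s}s")
        sleep(interval_s)
```

The fetch and sleep parameters exist only so the loop can be exercised without network access or real delays.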

Get a Deployment

GET /get?api_key=...&deployment_name=...

Response (excerpt)

List Deployments

GET /list?api_key=...

Pause / Resume / Delete

POST /pause
POST /resume
POST /delete

The same body shape applies to /resume and /delete.

Logs

GET /get_log?api_key=...&deployment_name=...&from_timestamp=...&to_timestamp=...&max_entries=...

from_timestamp and to_timestamp are ISO-8601 strings. Omit them to fetch the most recent logs up to max_entries.
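
The query string for a log request can be built as in this sketch. log_url is a hypothetical helper, and it is an assumption that the service accepts the offset form produced by datetime.isoformat() (for example 2024-01-01T00:00:00+00:00).

```python
from datetime import datetime, timezone
import urllib.parse

BASE_URL = "https://roboflow.cloud"


def log_url(api_key, deployment_name, from_ts=None, to_ts=None, max_entries=None):
    """Build the GET /get_log URL, leaving out any filter that is not set."""
    params = {"api_key": api_key, "deployment_name": deployment_name}
    if from_ts is not None:
        params["from_timestamp"] = from_ts.isoformat()
    if to_ts is not None:
        params["to_timestamp"] = to_ts.isoformat()
    if max_entries is not None:
        params["max_entries"] = max_entries
    # urlencode percent-encodes ":" and "+", so a "+00:00" offset is not
    # misread as a space by the server.
    return f"{BASE_URL}/get_log?{urllib.parse.urlencode(params)}"
```

For example, log_url(key, "my-deployment", max_entries=50) requests the 50 most recent entries, where "my-deployment" is a placeholder name.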

Usage

Workspace-wide:

GET /usage_workspace?api_key=...&from_timestamp=...&to_timestamp=...

Per-deployment:

GET /usage_deployment?api_key=...&deployment_name=...&from_timestamp=...&to_timestamp=...
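
Both usage endpoints take the same time window, so one helper can cover them. In this sketch usage_url is a hypothetical name, and it is assumed that the timestamps use the same ISO-8601 format as the log endpoint.

```python
from datetime import datetime, timezone
import urllib.parse

BASE_URL = "https://roboflow.cloud"


def usage_url(api_key, from_ts, to_ts, deployment_name=None):
    """Build a usage URL: workspace-wide by default, per-deployment if named."""
    endpoint = "usage_workspace" if deployment_name is None else "usage_deployment"
    params = {"api_key": api_key}
    if deployment_name is not None:
        params["deployment_name"] = deployment_name
    # Assumption: ISO-8601 strings, as documented for /get_log.
    params["from_timestamp"] = from_ts.isoformat()
    params["to_timestamp"] = to_ts.isoformat()
    return f"{BASE_URL}/{endpoint}?{urllib.parse.urlencode(params)}"
```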

SDK and CLI equivalents
