Dedicated Deployments

Run Your Vision Models on Dedicated Servers with Roboflow

What are Dedicated Deployments?

Dedicated Deployments are private cloud servers managed by Roboflow, specifically designed to run your computer vision models. These models can include:

Object detection
Image segmentation
Classification
Keypoint detection
Foundation models like CLIP (if trained on Roboflow)
Roboflow Workflows (low-code vision applications)
...and many others!

Benefits of Dedicated Deployments

Focus on your machine vision business problem and leave the infrastructure to us: Spin up inference-serving infrastructure in a few clicks, without signing up with cloud providers, installing and securing servers, managing TLS certificates, or handling server maintenance, patching, and updates.
Dedicated Resources: Get cloud servers allocated specifically for your use, ensuring consistent performance for your models.
Secure Access: Dedicated Deployments are accessible with your workspace's unique API key and utilize HTTPS for secure communication.
Easy Integration: Each deployment receives a subdomain within roboflow.cloud, simplifying integration with your applications.
Pay-Per-Hour: You're only charged for the duration of the server's existence (billed in 1-minute intervals).
Auto Pause & Resume: Your Dedicated Deployments will automatically pause after a configurable period of inactivity. For dev-cpu or dev-gpu deployment types, this period is fixed at 1 hour. They can be quickly resumed by sending a request with your API key. This feature is designed to help you save on costs.
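
Because a paused deployment resumes when it receives a request, a client may want to tolerate a slow or failed first call. The sketch below is one way to do that, not Roboflow's documented behavior: it assumes the first request after a pause might raise a connection error or time out while the server resumes, which this page does not specify. The helper names `deployment_url` and `call_with_retry`, the retry count, and the delays are all hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def deployment_url(subdomain: str) -> str:
    """Build the HTTPS base URL for a deployment's roboflow.cloud subdomain."""
    return f"https://{subdomain}.roboflow.cloud"


def call_with_retry(
    send: Callable[[], T],
    attempts: int = 5,
    initial_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call send() until it succeeds, doubling the wait after each failure.

    Assumption: a resuming deployment surfaces as an OSError subclass
    (connection refused, timeout, urllib.error.URLError).
    """
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        try:
            return send()
        except OSError:
            if attempt == attempts:
                raise  # out of retries, surface the last error
            sleep(delay)
            delay *= 2
    raise ValueError("attempts must be at least 1")
```

Here `send` would wrap whatever request your application makes against `deployment_url(...)` with your workspace API key. The request path and parameters depend on the Roboflow Inference API and are intentionally not shown.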

Current Limitations

All Dedicated Deployments are currently hosted in US-based data centers, so users in other regions may see higher latency. If you are outside the US, please contact us for a customized solution that reduces network latency.
Dedicated deployments are available to Basic, Growth, Growth-UBP, and Enterprise plan workspaces. See Roboflow plans.

Types of Dedicated Deployments

Roboflow offers four types of Dedicated Deployments: dev-cpu, dev-gpu, prod-cpu, and prod-gpu. The dev-cpu and dev-gpu types are designed for development and testing and are deleted automatically after 3 hours. The prod-cpu and prod-gpu types are persistent and suited to serving large-scale production traffic.

Type: Features
dev-cpu: Ephemeral (automatically deleted after 3 hours); CPU (model inference can be done on the CPU); ideal for testing integrations and prototyping applications.
dev-gpu: Ephemeral (automatically deleted after 3 hours); GPU (for models that need GPU acceleration, like Florence 2); ideal for testing integrations and prototyping applications.
prod-cpu: Persistent (dedicated subdomain <some-name>.roboflow.cloud); CPU (model inference can be done on the CPU); ideal for serving production traffic.
prod-gpu: Persistent (dedicated subdomain <some-name>.roboflow.cloud); GPU (for models that need GPU acceleration, like Florence 2); ideal for serving production traffic.
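
The four types come down to two choices: whether the model needs GPU acceleration, and whether the deployment is for production. As a minimal sketch, the mapping can be written as follows. `pick_deployment_type` is a hypothetical helper for illustration, not part of any Roboflow SDK.

```python
def pick_deployment_type(needs_gpu: bool, production: bool) -> str:
    """Return one of the four deployment type names described above."""
    tier = "prod" if production else "dev"      # persistent vs. ephemeral
    hardware = "gpu" if needs_gpu else "cpu"    # GPU-accelerated vs. CPU-only
    return f"{tier}-{hardware}"


# Example: prototyping with a model that needs a GPU
print(pick_deployment_type(needs_gpu=True, production=False))
```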

Billing Information

The rate for GPU deployments (dev-gpu, prod-gpu) is 1 credit/hour, while the rate for CPU deployments (dev-cpu, prod-cpu) is 0.25 credit/hour.
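
To make the rates concrete, here is a rough estimator that combines them with 1-minute billing intervals. It is a sketch under assumptions: partial minutes are rounded up to the next full minute, and how paused time is counted is left to the caller, since this page specifies neither. The names `CREDITS_PER_HOUR` and `estimate_credits` are hypothetical.

```python
import math

# Hourly rates in credits, taken from the rates stated above.
CREDITS_PER_HOUR = {
    "dev-cpu": 0.25,
    "prod-cpu": 0.25,
    "dev-gpu": 1.0,
    "prod-gpu": 1.0,
}


def estimate_credits(deployment_type: str, billable_seconds: float) -> float:
    """Estimate credits for a deployment billed in 1-minute intervals.

    Assumption: a partial minute is billed as a full minute.
    """
    if billable_seconds < 0:
        raise ValueError("billable_seconds must be non-negative")
    billed_minutes = math.ceil(billable_seconds / 60)
    return CREDITS_PER_HOUR[deployment_type] * billed_minutes / 60


# A prod-gpu deployment billed for one hour, and a dev-cpu one for 90 minutes.
print(estimate_credits("prod-gpu", 3600))
print(estimate_credits("dev-cpu", 90 * 60))
```

Treat the result as an estimate only; your workspace's billing records are authoritative.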

If you prefer to be billed based on the number of requests sent to your Dedicated Deployment server, please contact our sales team.

All dedicated deployment servers will run Roboflow Inference, our open-source inference server. Review the Roboflow Inference documentation to learn more about all of the features available.

