Dedicated Deployments
Run Your Vision Models on Dedicated Servers with Roboflow
What are Dedicated Deployments?
Dedicated Deployments are private cloud servers managed by Roboflow, specifically designed to run your computer vision models. These models can include:
Object detection
Image segmentation
Classification
Keypoint detection
Foundation models like CLIP (if trained on Roboflow)
Roboflow Workflows (low-code vision applications)
...and many others!
Benefits of Dedicated Deployments:
Focus on your machine vision business problem and leave the infrastructure to us: Spin up inference serving infrastructure with a few clicks, without signing up with cloud providers, installing and securing servers, managing TLS certificates, or worrying about server management, patching, and updates.
Dedicated Resources: Get cloud servers allocated specifically for your use, ensuring consistent performance for your models.
Secure Access: Dedicated Deployments are accessible with your workspace's unique API key and utilize HTTPS for secure communication.
Easy Integration: Each deployment receives a subdomain within roboflow.cloud, simplifying integration with your applications.
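As a minimal sketch of what this integration can look like, the snippet below builds the HTTPS endpoint for a deployment and passes it, together with the workspace API key, to the `inference_sdk` client. The deployment name `my-deployment`, the model ID `my-project/1`, the image path, and the `ROBOFLOW_API_KEY` environment variable are placeholders assumed for illustration, and the sketch assumes the `inference-sdk` package is installed.

```python
import os


def deployment_url(name: str) -> str:
    """Return the HTTPS base URL for a deployment subdomain on roboflow.cloud."""
    return f"https://{name}.roboflow.cloud"


def run_inference(name: str, image_path: str, model_id: str):
    """Send one image to the deployment (requires `pip install inference-sdk`)."""
    # Imported lazily so the URL helper works without the SDK installed.
    from inference_sdk import InferenceHTTPClient

    client = InferenceHTTPClient(
        api_url=deployment_url(name),
        # Assumed environment variable holding the workspace API key.
        api_key=os.environ["ROBOFLOW_API_KEY"],
    )
    return client.infer(image_path, model_id=model_id)


if __name__ == "__main__":
    # Hypothetical names: replace with your own deployment, image, and model.
    print(deployment_url("my-deployment"))
    # result = run_inference("my-deployment", "image.jpg", "my-project/1")
```

Reading the API key from the environment keeps it out of source control; any other secret store would work equally well.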
Current Limitations:
All dedicated deployments are currently hosted in US-based data centers, so users in other regions may see higher latency. If you are outside the US, please contact us for a customized solution; we can help you reduce network latency.
Dedicated deployments are available to Basic, Growth, Growth-UBP, and Enterprise plan workspaces. See Roboflow plans.
Choosing the Right Dedicated Deployment
Roboflow offers Dedicated Deployments in two distinct environments: development and production. Both environments allow you to create servers with or without GPUs, tailored to your specific needs.
Development Environment
Ephemeral: Servers in the development environment are short-lived, lasting between 1 and 6 hours. This also means that a development deployment may not receive the same URL the next time you start one.
Ideal for Testing: Perfect for testing integrations and prototyping applications.
Pay-Per-Hour: You're only charged for the duration of the server's existence (billed in 1-minute intervals).
Automatic Deletion: Servers are automatically deleted after their designated duration has passed or when manually deleted.
Production Environment
Persistent: Servers remain active until manually deleted via the Roboflow UI or CLI.
Hybrid Billing: Billing is based on both the server's duration and the number of inferences or workflows processed. See the billing section below for details.
Dedicated Subdomain: You will get a dedicated subdomain, <some-name>.roboflow.cloud, which remains available for the whole life cycle of your dedicated deployment.
Summary of Dedicated Deployment Types
The following table summarizes the available Dedicated Deployment types:
dev-cpu: Development tasks that do not require GPU acceleration, such as workflows without GPU-intensive models like Florence 2.
dev-gpu: Development tasks that benefit from GPU acceleration, such as workflows using models like Florence 2.
prod-cpu: Long-running production workloads that do not require GPU acceleration.
prod-gpu: Long-running production workloads that require GPU acceleration, or applications prioritizing low-latency inference and workflow execution.
Billing Differences Between Development and Production Environments
Rate Definitions
Hourly Rate: A fixed rate determined by the type of dedicated deployment.
Inference Rate: A variable rate depending on the number of inference requests, including model, workflow, and video stream inferences, processed by the dedicated server.
Bill Calculation
Development Environment: Charges are solely based on the Hourly Rate, i.e., 0.5 credit/hour for dev-cpu, 1.0 credit/hour for dev-gpu.
Production Environment: Every hour, the Inference Rate is calculated based on the requests sent to the dedicated deployment. This rate is compared with the Hourly Rate (0.5 credit/hour for prod-cpu, 1.0 credit/hour for prod-gpu), and only the maximum of the two is charged.
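The two billing rules can be sketched in a few lines of Python. The hourly rates come from the text above; the per-hour inference credit values are hypothetical inputs, since how the Inference Rate is derived from request counts is not specified here. Treat this as an illustration of the arithmetic, not as Roboflow's billing implementation.

```python
from typing import Iterable

# Hourly rates in credits, taken from the text above.
HOURLY_RATE = {"dev-cpu": 0.5, "dev-gpu": 1.0, "prod-cpu": 0.5, "prod-gpu": 1.0}


def dev_charge(deployment_type: str, minutes: int) -> float:
    """Development: hourly rate only, billed in 1-minute intervals."""
    return HOURLY_RATE[deployment_type] * minutes / 60


def prod_charge(deployment_type: str, inference_credits_per_hour: Iterable[float]) -> float:
    """Production: for each hour, charge the larger of the hourly and inference rates."""
    hourly = HOURLY_RATE[deployment_type]
    return sum(max(hourly, inference) for inference in inference_credits_per_hour)


if __name__ == "__main__":
    # A dev-gpu server that lived for 90 minutes.
    print(dev_charge("dev-gpu", 90))
    # A prod-cpu server over three hours with hypothetical inference rates.
    print(prod_charge("prod-cpu", [0.2, 0.8, 0.0]))
```

In the production example, the first and third hours fall back to the 0.5-credit Hourly Rate because inference usage was lower, while the second hour is charged at its 0.8-credit Inference Rate, for a total of about 1.8 credits.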
By understanding these distinctions, you can select the optimal Dedicated Deployment configuration to meet your specific use case.
All dedicated deployment servers will run Roboflow Inference, our open-source inference server. Review the Roboflow Inference documentation to learn more about all of the features available.
Provision and Manage Dedicated Deployments (Roboflow App)
A workspace user can provision a dedicated deployment under the Deployments --> Dedicated Deployments tab for a workspace.
Clicking on the New Deployment button brings up the Create a Dedicated Deployment dialog, as shown below:
Each property in the dialog is described in the table below. Fill in the dialog and click the Create Dedicated Deployment button to provision your deployment. Provisioning may take anywhere from a few seconds to a few minutes.
Name
Choose a unique name (5-15 characters) to identify your Dedicated Deployment. This name will also become the subdomain for your deployment endpoint (in the form <some-name>.roboflow.cloud).
Easy to Remember: Pick a name that clearly reflects your deployment's purpose (e.g., "prod-inference", "dev-testing").
Unique within Workspace: If your chosen name is already taken, a short random code will be added to create a unique subdomain.
Tips:
Use lowercase letters, numbers, and hyphens (-) for your name.
Avoid special characters or spaces.
Machine Type
Select whether you need a CPU-only or a GPU dedicated deployment.
Deployment Type
This is the deployment environment - Development (dev) or Production (prod).
Duration
This is the time for which the dedicated deployment remains online.
For development environments, this can range from 1 to 6 hours. Fractional hours are permitted; they are rounded to the nearest minute for billing purposes.
For production, there is no expiration time; the deployment runs until the user explicitly deletes it.
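Before submitting the dialog, the Name and Duration constraints described above can be checked locally. The sketch below is a hypothetical pre-check, not part of any Roboflow tool: it treats the naming tips (lowercase letters, numbers, hyphens) as strict rules, which the service itself may not do, and it uses Python's built-in `round`, whose handling of exact half-minute values may differ from the billing system.

```python
import re

# 5-15 characters drawn from lowercase letters, digits, and hyphens.
NAME_PATTERN = re.compile(r"[a-z0-9-]{5,15}")


def is_valid_name(name: str) -> bool:
    """Check a deployment name against the length and character guidance."""
    return NAME_PATTERN.fullmatch(name) is not None


def billed_minutes(duration_hours: float) -> int:
    """Validate a development duration (1-6 hours) and round it to the nearest minute."""
    if not 1 <= duration_hours <= 6:
        raise ValueError("development deployments last between 1 and 6 hours")
    return round(duration_hours * 60)


if __name__ == "__main__":
    print(is_valid_name("prod-inference"))  # name taken from the examples above
    print(is_valid_name("Dev Testing"))     # uppercase letters and a space
    print(billed_minutes(1.5))
```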
Creating a Dedicated Deployment from within Workflows
Dedicated deployments can be created from within Roboflow Workflows. Roboflow Workflows is a low-code, web-based application builder for creating computer vision applications.
To create a Dedicated Deployment, first create a Roboflow Workflow. To do so, click on Workflows on the left tab in the Roboflow dashboard, then click "Create Workflow":
Then, click on the Running on Hosted API link in the top left corner:
Click Dedicated Deployments to create and see your Dedicated Deployments. The dialog presented here is identical to the one described above:
When your Deployment is ready, the status will be updated to Ready. You can then click Connect to use your deployment with your Workflow in the Workflows editor:
Provision and Manage Dedicated Deployments (Roboflow CLI)
The roboflow deployment command provides a set of subcommands to manage your Roboflow Dedicated Deployments. These deployments allow you to run inference on your computer vision models on dedicated servers.
Subcommands
machine_type: List available machine types for your Dedicated Deployments. Note that the output combines the Deployment Type and Machine Type described above, i.e., dev-cpu, dev-gpu, prod-cpu, prod-gpu.
add: Create a new Dedicated Deployment.
get: Get detailed information about a specific Dedicated Deployment.
list: List all Dedicated Deployments in your workspace.
usage_workspace: Get usage statistics for all Dedicated Deployments in your workspace.
usage_deployment: Get usage statistics for a specific Dedicated Deployment.
delete: Delete a Dedicated Deployment.
log: View logs for a specific Dedicated Deployment.
Subcommand Examples
Create a new deployment
Get deployment information
List all deployments
Get workspace usage
Get deployment usage
Delete a deployment
View deployment logs
Additional Notes
For more detailed information and options for each subcommand, use the --help flag.
Ensure you have the roboflow CLI installed and configured with your API key, as documented here.
Refer to the Roboflow documentation for more specific information about Dedicated Deployments and their usage.
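Because each subcommand accepts the --help flag, its options can also be looked up from a script. The following sketch only assembles and runs `roboflow deployment <subcommand> --help` for the subcommands listed above; the helper names `help_command` and `show_help` are hypothetical, and no other flags or arguments are assumed.

```python
import shutil
import subprocess

# Subcommands of `roboflow deployment`, as listed above.
SUBCOMMANDS = [
    "machine_type", "add", "get", "list",
    "usage_workspace", "usage_deployment", "delete", "log",
]


def help_command(subcommand: str) -> list:
    """Build the argv that prints the help text for one subcommand."""
    if subcommand not in SUBCOMMANDS:
        raise ValueError(f"unknown subcommand: {subcommand}")
    return ["roboflow", "deployment", subcommand, "--help"]


def show_help(subcommand: str) -> str:
    """Run the help command and return its output (requires the roboflow CLI)."""
    if shutil.which("roboflow") is None:
        raise RuntimeError("roboflow CLI not found on PATH")
    result = subprocess.run(
        help_command(subcommand), capture_output=True, text=True, check=False
    )
    return result.stdout


if __name__ == "__main__":
    print(" ".join(help_command("add")))
```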