Dedicated Deployments

Run Your Vision Models on Dedicated Servers with Roboflow

What are Dedicated Deployments?

Dedicated Deployments are private cloud servers managed by Roboflow, specifically designed to run your computer vision models. These models can include:

  • Object detection

  • Image segmentation

  • Classification

  • Keypoint detection

  • Foundation models like CLIP (if trained on Roboflow)

  • Roboflow Workflows (low-code vision applications)

  • ...and many others!

Benefits of Dedicated Deployments:

  • Focus on your machine vision business problem and leave the infrastructure to us: Spin up inference serving infrastructure in a few clicks, without signing up with cloud providers, installing and securing servers, managing TLS certificates, or handling server maintenance, patching, and updates.

  • Dedicated Resources: Get cloud servers allocated specifically for your use, ensuring consistent performance for your models.

  • Secure Access: Dedicated Deployments are accessible with your workspace's unique API key and utilize HTTPS for secure communication.

  • Easy Integration: Each deployment receives a subdomain within roboflow.cloud, simplifying integration with your applications.

Current Limitations:

  • All dedicated deployments are currently hosted in US-based data centers, so users in other regions may see higher latency. If you are outside the US, please contact us for a customized solution; we can help you reduce network latency.

  • Dedicated deployments are available to Basic, Growth, Growth-UBP, and Enterprise plan workspaces. See Roboflow plans.

Choosing the Right Dedicated Deployment

Roboflow offers Dedicated Deployments in two distinct environments: development and production. Both environments allow you to create servers with or without GPUs, tailored to your specific needs.

Development Environment

  • Ephemeral: Servers in the development environment are short-lived, lasting between 1 and 6 hours. As a result, the URL of a development deployment may no longer be available the next time you start a dedicated deployment in the development environment.

  • Ideal for Testing: Perfect for testing integrations and prototyping applications.

  • Pay-Per-Hour: You're only charged for the duration of the server's existence (billed in 1-minute intervals).

  • Automatic Deletion: Servers are automatically deleted after their designated duration has passed or when manually deleted.

Production Environment

  • Persistent: Servers remain active until manually deleted via the Roboflow UI or CLI.

  • Hybrid Billing: Billing is based on both the server's duration and the number of inferences or workflows processed. More details here.

  • Dedicated Subdomain: You get a dedicated subdomain, <some-name>.roboflow.cloud, which remains available for the entire life cycle of your dedicated deployment.

Summary of Dedicated Deployment Types

The following table summarizes the available Dedicated Deployment types:

Type     | When to use this
---------|-----------------
dev-cpu  | Development tasks that do not require GPU acceleration, such as workflows without GPU-intensive models like Florence 2.
dev-gpu  | Development tasks that benefit from GPU acceleration, such as workflows using models like Florence 2.
prod-cpu | Long-running production workloads that do not require GPU acceleration.
prod-gpu | Long-running production workloads that require GPU acceleration, or applications prioritizing low-latency inference and workflow execution.
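
The four type names combine an environment (dev or prod) with a machine class (cpu or gpu). As a minimal sketch, the helper below builds the type name from those two choices. The function name `pick_machine_type` is hypothetical and is not part of any Roboflow library.

```python
def pick_machine_type(environment: str, needs_gpu: bool) -> str:
    """Return one of dev-cpu, dev-gpu, prod-cpu, prod-gpu.

    Hypothetical helper for illustration only; not part of the Roboflow SDK or CLI.
    """
    if environment not in ("dev", "prod"):
        raise ValueError("environment must be 'dev' or 'prod'")
    machine = "gpu" if needs_gpu else "cpu"
    return f"{environment}-{machine}"


print(pick_machine_type("dev", needs_gpu=False))   # dev-cpu
print(pick_machine_type("prod", needs_gpu=True))   # prod-gpu
```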

Billing Differences Between Development and Production Environments

Rate Definitions

  • Hourly Rate: A fixed rate determined by the type of dedicated deployment.

  • Inference Rate: A variable rate depending on the number of inference requests, including model, workflow, and video stream inferences, processed by the dedicated server.

Bill Calculation

  • Development Environment: Charges are solely based on the Hourly Rate, i.e., 0.5 credit/hour for dev-cpu, 1.0 credit/hour for dev-gpu.

  • Production Environment: Every hour, the Inference Rate is calculated based on the requests sent to the dedicated deployment. This rate is compared with the Hourly Rate (0.5 credit/hour for prod-cpu, 1.0 credit/hour for prod-gpu), and only the maximum of the two is charged.
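
The two billing rules can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Roboflow's billing code: the function names are hypothetical, the inference credits for each hour are passed in as input (the per-request price is not given here), and production time is counted in whole hours.

```python
from typing import Sequence

# Hourly rates in credits, as listed in the Bill Calculation section.
HOURLY_RATE = {"dev-cpu": 0.5, "dev-gpu": 1.0, "prod-cpu": 0.5, "prod-gpu": 1.0}


def dev_charge(machine_type: str, minutes_online: int) -> float:
    """Development: hourly rate only, billed in 1-minute intervals."""
    if not machine_type.startswith("dev-"):
        raise ValueError("expected a dev-* machine type")
    return HOURLY_RATE[machine_type] * minutes_online / 60


def prod_charge(machine_type: str, inference_credits_per_hour: Sequence[float]) -> float:
    """Production: for each hour, charge the larger of the hourly rate
    and that hour's inference credits (assumed to be supplied by the caller)."""
    if not machine_type.startswith("prod-"):
        raise ValueError("expected a prod-* machine type")
    rate = HOURLY_RATE[machine_type]
    return sum(max(rate, credits) for credits in inference_credits_per_hour)


print(dev_charge("dev-gpu", 90))                    # 1.5
print(prod_charge("prod-cpu", [0.0, 0.2, 1.25]))    # 2.25
```

In the production example, the first two hours fall below the 0.5 credit hourly rate and are charged at that rate, while the third hour is charged its 1.25 inference credits.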

By understanding these distinctions, you can select the optimal Dedicated Deployment configuration to meet your specific use case.

All dedicated deployment servers will run Roboflow Inference, our open-source inference server. Review the Roboflow Inference documentation to learn more about all of the features available.

Provision and Manage Dedicated Deployments (Roboflow App)

A workspace user can provision a dedicated deployment under the Deployments --> Dedicated Deployments tab for a workspace.

Clicking the New Deployment button opens the Create a Dedicated Deployment dialog.

Each property in the dialog is described in the table below. Fill in the dialog and click the Create Dedicated Deployment button to provision your deployment. Provisioning may take anywhere from a few seconds to a few minutes.

Property
Description

Name

Choose a unique name (5-15 characters) to identify your Dedicated Deployment. This name will also become the subdomain for your deployment endpoint under roboflow.cloud.

  • Easy to Remember: Pick a name that clearly reflects your deployment's purpose (e.g., "prod-inference", "dev-testing").

  • Unique within Workspace: If your chosen name is already taken, a short random code will be added to create a unique subdomain.

Tips:

  • Use lowercase letters, numbers, and hyphens (-) for your name.

  • Avoid special characters or spaces.

Machine Type

Whether a CPU-only or a GPU dedicated deployment is needed.

Deployment Type

The deployment environment: Development (dev) or Production (prod).

Duration

This is the time for which the dedicated deployment remains online.

  • For development environments, this can range from 1 to 6 hours. Fractional hours are permitted and are rounded to the nearest minute for billing purposes.

  • For production, there is no expiration time; the deployment runs until the user explicitly deletes it.
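
Before filling in the dialog, it can help to check a candidate name and duration against the rules in the table above. The sketch below is a local pre-check only, with hypothetical function names. It treats the naming tips (lowercase letters, numbers, hyphens) as hard rules, which is an assumption, and the service may accept or reject names differently.

```python
import re

# 5-15 characters drawn from lowercase letters, digits, and hyphens.
# Assumption: the naming tips are treated here as strict requirements.
NAME_PATTERN = re.compile(r"[a-z0-9-]{5,15}")


def is_valid_name(name: str) -> bool:
    """Return True if the name satisfies the length and character rules."""
    return NAME_PATTERN.fullmatch(name) is not None


def dev_duration_minutes(hours: float) -> int:
    """Convert a development duration (1 to 6 hours, fractions allowed)
    to whole minutes. Exact half-minute ties follow Python's round()."""
    if not 1 <= hours <= 6:
        raise ValueError("development deployments last between 1 and 6 hours")
    return round(hours * 60)


print(is_valid_name("prod-inference"))  # True
print(is_valid_name("My Deployment"))   # False
print(dev_duration_minutes(1.5))        # 90
```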

Creating a Dedicated Deployment from within Workflows

Dedicated deployments can be created from within Roboflow Workflows. Roboflow Workflows is a low-code, web-based application builder for creating computer vision applications.

To create a Dedicated Deployment, first create a Roboflow Workflow. To do so, click on Workflows on the left tab in the Roboflow dashboard, then click "Create Workflow":

Creating a Workflow

Then, click on the Running on Hosted API link in the top left corner:

Changing the backend where the workflow will execute.

Click Dedicated Deployments to create and view your Dedicated Deployments. The dialog presented here is identical to the one described above:

When your Deployment is ready, the status will be updated to Ready. You can then click Connect to use your deployment with your Workflow in the Workflows editor:

Provision and Manage Dedicated Deployments (Roboflow CLI)

The roboflow deployment command provides a set of subcommands to manage your Roboflow Dedicated Deployments. These deployments allow you to run inference on your computer vision models on dedicated servers.

Subcommands

  • machine_type: List available machine types for your Dedicated Deployments. Note that each entry in the output combines the Deployment Type and Machine Type described above, i.e., dev-cpu, dev-gpu, prod-cpu, prod-gpu.

  • add: Create a new Dedicated Deployment.

  • get: Get detailed information about a specific Dedicated Deployment.

  • list: List all Dedicated Deployments in your workspace.

  • usage_workspace: Get usage statistics for all Dedicated Deployments in your workspace.

  • usage_deployment: Get usage statistics for a specific Dedicated Deployment.

  • delete: Delete a Dedicated Deployment.

  • log: View logs for a specific Dedicated Deployment.

Subcommand Examples

  • Create a new deployment

    roboflow deployment add my-deployment -m prod-gpu
  • Get deployment information

    roboflow deployment get my-deployment
  • List all deployments

    roboflow deployment list

  • Get workspace usage

    roboflow deployment usage_workspace
  • Get deployment usage

    roboflow deployment usage_deployment my-deployment
  • Delete a deployment

    roboflow deployment delete my-deployment
  • View deployment logs

    roboflow deployment log my-deployment -t 60 -n 20
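
If you want to script these commands, one option is to build the argument list in Python and hand it to `subprocess`. The sketch below is an assumption-laden illustration: the helper names are hypothetical, it only assembles the subcommands and arguments shown in the examples above, and `run_deployment_command` requires the roboflow CLI to be installed and configured.

```python
import shlex
import subprocess
from typing import List

# Subcommands listed in the Subcommands section.
SUBCOMMANDS = {
    "machine_type", "add", "get", "list",
    "usage_workspace", "usage_deployment", "delete", "log",
}


def deployment_command(subcommand: str, *args: str) -> List[str]:
    """Build the argv list for `roboflow deployment <subcommand> ...`."""
    if subcommand not in SUBCOMMANDS:
        raise ValueError(f"unknown subcommand: {subcommand}")
    return ["roboflow", "deployment", subcommand, *args]


def run_deployment_command(subcommand: str, *args: str) -> str:
    """Run the command and return its standard output.

    Needs the roboflow CLI on PATH; raises CalledProcessError on failure.
    """
    cmd = deployment_command(subcommand, *args)
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


# Print the command lines without executing them.
print(shlex.join(deployment_command("add", "my-deployment", "-m", "prod-gpu")))
print(shlex.join(deployment_command("get", "my-deployment")))
```

Passing the arguments as a list, rather than a single shell string, avoids quoting problems if a deployment name ever comes from user input.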

Additional Notes

  • For more detailed information and options for each subcommand, use the --help flag.

  • Ensure you have the roboflow CLI installed and configured with your API key, as documented here.

  • Refer to the Roboflow documentation for more specific information about Dedicated Deployments and their usage.
