Managed Deployments

Roboflow provides several managed deployment options that run your models on our cloud infrastructure. These options are easy to use and scale with your workload, making them suitable for a wide range of applications.

Serverless API

The Serverless Hosted API allows you to run workflows and models directly on Roboflow's infrastructure through an API that scales automatically with demand. This is the easiest way to deploy your models and get started with inference.

Benefits:

Scalability: The API automatically scales to handle your inference needs, so you don't have to worry about provisioning or managing servers.
Ease of Use: You can access your models through a simple REST API, making it easy to integrate inference into your applications.
No Infrastructure Management: Roboflow handles all the infrastructure, so you can focus on building your applications.
Workflow Support: All your workflows are available as API endpoints on the Serverless Hosted API, so you can run them with a simple HTTP request.

Limitations:

Warmup Requests: When a request needs a model that is not yet loaded on any running server, the first requests may take several seconds longer than usual. Latency improves drastically on subsequent requests, once the model is cached on the running servers.
CPU-based: The Serverless Hosted API runs model inference on CPUs, so you may experience higher latency than with Dedicated Deployments or self-hosted deployments, and you can't use models that require a GPU (a Serverless GPU API is coming soon).
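One way to deal with the warmup behavior described above is to send a few throwaway requests before latency-sensitive traffic starts. The sketch below is a generic client-side helper, not part of any Roboflow SDK: `infer` stands for whatever function you already use to call the API, and the names `timed_call`, `warm_up`, `max_attempts`, and `target_seconds` are hypothetical.

```python
import time


def timed_call(infer, payload):
    """Call infer(payload) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = infer(payload)
    return result, time.perf_counter() - start


def warm_up(infer, payload, max_attempts=5, target_seconds=1.0):
    """Send throwaway requests until one completes within target_seconds.

    infer is a caller-supplied function (hypothetical here) that performs
    one inference request. Returns the latency of each attempt, in order.
    """
    latencies = []
    for _ in range(max_attempts):
        _, elapsed = timed_call(infer, payload)
        latencies.append(elapsed)
        if elapsed <= target_seconds:
            break  # the model appears to be loaded; stop warming up
    return latencies
```

The returned latencies can be logged to check whether the first request really was slower than the ones that followed.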

Workflows

The Serverless Hosted API allows you to run Workflows in the Roboflow cloud. This enables you to build and run complex computer vision applications without managing your own infrastructure.

You can also run workflows on Dedicated Deployments or self-hosted inference servers, which lets you use more powerful GPU-based models and Custom Python Blocks.

You can learn more about how to create, test, and deploy Workflows here.

Model Inference

In addition to workflows, you can also run inference against a specific model using the Serverless Hosted API. You can use any model you have trained on Roboflow, any of the supported foundation models, or a trained model from a project on https://universe.roboflow.com.

Overview for How to Use the Serverless Hosted API:

Obtain your API key from the Roboflow dashboard.
Send a POST request to the API endpoint with your image and model information.
Receive the inference results in JSON format.

See the Serverless Hosted API docs for details and API specifications.
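The three steps above can be sketched with the Python standard library. This is a minimal illustration under stated assumptions, not the official client: the base URL, the `model_id` path layout, the `api_key` query parameter, and the base64-encoded request body are all assumptions to check against the Serverless Hosted API docs. The function names are hypothetical.

```python
import base64
import json
import urllib.parse
import urllib.request

# Assumed base URL; confirm it in the Serverless Hosted API docs.
API_BASE = "https://serverless.roboflow.com"


def build_inference_request(image_bytes, model_id, api_key, base_url=API_BASE):
    """Build (but do not send) a POST request carrying a base64-encoded image.

    model_id is assumed to look like "project-name/version".
    """
    query = urllib.parse.urlencode({"api_key": api_key})
    url = f"{base_url}/{model_id}?{query}"
    return urllib.request.Request(
        url,
        data=base64.b64encode(image_bytes),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def run_inference(image_path, model_id, api_key):
    """Send the image at image_path and return the parsed JSON response."""
    with open(image_path, "rb") as f:
        request = build_inference_request(f.read(), model_id, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Building the request separately from sending it makes it possible to inspect the URL and body before any network call is made.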

Batch Processing

Roboflow Batch Processing is a fully managed solution powered by Workflows that allows you to process large volumes of videos and images without writing code. It offers an easy-to-use UI for quick tasks and a comprehensive API for automating data processing—fitting both small and large workloads.

With configurable processing workflows, real-time monitoring, and event-based notifications, Roboflow Batch Processing helps you manage data processing, track progress, and integrate with other systems.

Benefits:

Scalability: The service automatically scales to your data volume, capable of processing millions of images and thousands of video files efficiently.
Ease of Use: You can use the service in multiple ways—from a simple UI click to executing CLI commands, all the way to building production-grade automations that seamlessly integrate with your system.
No Infrastructure Management: Roboflow handles all the infrastructure and data management, so you can focus on solving your business use-cases.

Limitations:

Asynchronous nature of processing: The Batch Processing service launches processing jobs to run in the background when compute resources are available. While it typically takes only a few minutes to provision the necessary servers, there is no guarantee of an exact start time for the job. As a result, the service is not suitable for real-time processing.
Custom Python Blocks not supported: Because the service runs on Roboflow's infrastructure, it does not currently support executing arbitrary code through Custom Python Blocks.
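Because jobs start and finish in the background, client code has to wait for completion instead of expecting an immediate result. Where event-based notifications are not an option, a polling loop is one alternative. The sketch below is generic and does not call the Batch Processing API: `get_status` is a caller-supplied function, and the state names `"completed"` and `"failed"` are hypothetical placeholders, not Roboflow's actual job states.

```python
import time


def wait_for_job(get_status, poll_seconds=30.0, timeout_seconds=3600.0,
                 sleep=time.sleep):
    """Poll get_status() until it returns a terminal state or time runs out.

    get_status is a caller-supplied function returning the job state as a
    string. The terminal state names here are hypothetical placeholders.
    """
    terminal_states = {"completed", "failed"}
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in terminal_states:
            return status
        # Give up if the next poll would land past the deadline.
        if time.monotonic() + poll_seconds > deadline:
            raise TimeoutError(
                f"job still {status!r} after {timeout_seconds} seconds"
            )
        sleep(poll_seconds)
```

The `sleep` parameter exists only so the loop can be exercised without real delays, for example in unit tests.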

Dedicated Deployments

Dedicated Deployments provide dedicated GPUs and CPUs for running your models. This option offers consistent performance, resource isolation, and enhanced security, making it suitable for demanding applications and production workloads, including those that require custom code execution.

Benefits:

Consistent Performance: Dedicated resources ensure consistent performance for your models.
Resource Isolation: Your models run on isolated resources, preventing interference from other users.
GPU Support: You can run large models that require a GPU, such as SAM2 and CogVLM, on Dedicated Deployments.

Limitations:

Limited to US-Based Data Centers: Currently, Dedicated Deployments are only available in US-based data centers, which may result in higher latency for users in other regions.

Last updated 28 days ago
