Roboflow Managed Deployments Overview
Roboflow provides several managed deployment options that leverage our cloud infrastructure to run your models. These options are easy to use and scale automatically, making them a good fit for a wide range of applications.
Serverless API
The Serverless Hosted API allows you to run workflows and models directly on Roboflow's infrastructure through an infinitely-scalable API. This is the easiest way to deploy your models and get started with inference.
Benefits:
Scalability: The API automatically scales to handle your inference needs, so you don't have to worry about provisioning or managing servers.
Ease of Use: You can access your models through a simple REST API, making it easy to integrate inference into your applications.
No Infrastructure Management: Roboflow handles all the infrastructure, so you can focus on building your applications.
Workflow Support: All of your workflows are available as endpoints on the Serverless API, so you can run a workflow with a simple HTTP request.
Limitations:
Warmup Requests: When a request requires a model that hasn't yet been loaded onto any of the servers, it may see several seconds of added latency. Subsequent requests are much faster because the model is cached on the running servers.
CPU-Based: The Serverless Hosted API runs model inference on CPU; you may experience higher latency than with Dedicated Deployments or self-hosted deployments, and you can't use models that require a GPU (a Serverless GPU API is coming soon).
Workflows
The Serverless Hosted API allows you to run Workflows in the Roboflow cloud. This enables you to build and run complex computer vision applications without managing your own infrastructure.
You can also run Workflows on Dedicated Deployments or self-hosted inference servers, which lets you use more powerful GPU-based models and Custom Python Blocks.
You can learn more about how to create, test, and deploy Workflows here.
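As a minimal sketch, running a workflow on the Serverless API with the inference-sdk Python package might look like the following; the API URL, workspace name, workflow ID, and image path are placeholders you would replace with your own values:

```python
# pip install inference-sdk
from inference_sdk import InferenceHTTPClient

# Point the client at Roboflow's serverless endpoint.
# The API key, workspace, and workflow ID below are placeholders.
client = InferenceHTTPClient(
    api_url="https://serverless.roboflow.com",
    api_key="YOUR_ROBOFLOW_API_KEY",
)

# Run a workflow by workspace and workflow ID against a local image.
result = client.run_workflow(
    workspace_name="your-workspace",
    workflow_id="your-workflow-id",
    images={"image": "path/to/image.jpg"},  # file path, URL, or numpy array
)
print(result)
```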
Model Inference
In addition to Workflows, you can run inference against a specific model using the Serverless Hosted API. You can infer against any model you have trained on Roboflow, any of the supported foundation models, or any project with a trained model on https://universe.roboflow.com
How to use the Serverless Hosted API:
Obtain your API key from the Roboflow dashboard.
Send a POST request to the API endpoint with your image and model information.
Receive the inference results in JSON format.
See the Serverless Hosted API docs for full details and API specifications.
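For illustration, the three steps above might look like this with Python's requests library; the model ID is a placeholder, and the endpoint shown here is an assumption, so consult the API docs above for the authoritative specification:

```python
# pip install requests
import base64
import requests

API_KEY = "YOUR_ROBOFLOW_API_KEY"  # step 1: from the Roboflow dashboard
MODEL_ID = "your-model/1"          # placeholder: project slug / version number

# Step 2: base64-encode the image and POST it with the model information.
with open("image.jpg", "rb") as f:
    img_b64 = base64.b64encode(f.read()).decode("utf-8")

response = requests.post(
    f"https://detect.roboflow.com/{MODEL_ID}",
    params={"api_key": API_KEY},
    data=img_b64,
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)

# Step 3: inference results come back as JSON.
print(response.json())
```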
Dedicated Deployments
Dedicated Deployments provide dedicated GPUs and CPUs for running your models. This option offers consistent performance, resource isolation, and enhanced security, making it suitable for demanding applications and production workloads that need isolated resources or custom code execution.
Benefits:
Consistent Performance: Dedicated resources ensure consistent performance for your models.
Resource Isolation: Your models run on isolated resources, preventing interference from other users.
GPU Support: You can run large models that require a GPU on Dedicated Deployments (e.g. SAM2, CogVLM).
Custom Python Blocks: You can use Custom Python Blocks in your Workflows when deploying them on Dedicated Deployments.
Limitations:
Limited to US-Based Data Centers: Currently, Dedicated Deployments are only available in US-based data centers, which may result in higher latency for users in other regions.
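Targeting a Dedicated Deployment typically means pointing the same client at your deployment's URL instead of the serverless endpoint. As a hedged sketch, assuming the inference-sdk client from the earlier example (the deployment URL and workflow ID below are hypothetical placeholders; use the values shown in your Roboflow dashboard):

```python
from inference_sdk import InferenceHTTPClient

# Same client as the serverless example, but pointed at a Dedicated Deployment.
# The URL below is a hypothetical placeholder for your deployment's address.
client = InferenceHTTPClient(
    api_url="https://your-deployment-name.roboflow.cloud",
    api_key="YOUR_ROBOFLOW_API_KEY",
)

# Workflows with Custom Python Blocks or GPU models run here, not serverless.
result = client.run_workflow(
    workspace_name="your-workspace",
    workflow_id="your-workflow-id",  # placeholder
    images={"image": "path/to/image.jpg"},
)
print(result)
```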