Deployment Overview

Overview of Roboflow Deployment Services

Introduction to Deployment

This section provides a high-level overview of deploying computer vision models and workflows with Roboflow. We'll cover essential concepts and terminology, different deployment options, and how to choose the best approach for your needs.

What is Inference?

In computer vision, inference refers to the process of using a trained model to analyze new images or videos and make predictions. For example, an object detection model might be used to identify and locate objects in a video stream, or a classification model might be used to categorize images based on their content.

Roboflow Inference is an open-source project that provides a powerful and flexible framework for deploying computer vision models and workflows. It is the engine that powers most of Roboflow's managed deployment services. You can also self-host it or use it to deploy your vision workflows to edge devices. Roboflow Inference offers a range of features and capabilities (a minimal usage sketch follows this list), including:

  • Support for various model architectures and tasks, including object detection, classification, instance segmentation, and more.

  • Workflows, which let you build computer vision applications by combining different models, pre-built logic, and external applications, choosing from hundreds of building blocks.

  • Hardware acceleration for optimized performance on different devices, including CPUs, GPUs, and edge devices like NVIDIA Jetson.

  • Multiprocessing for efficient use of resources.

  • Video decoding for seamless processing of video streams.

  • An HTTP interface, APIs, and Docker images to simplify deployment.

  • Integration with Roboflow's hosted deployment options and the Roboflow platform.

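To make this concrete, here is a minimal sketch of running a model locally with the open-source `inference` Python package (installed with `pip install inference`). The model ID, API key, and image path below are placeholders, not values from this page.

```python
# Minimal local inference sketch using the open-source `inference` package.
# The model ID, API key, and image path are placeholders -- substitute your own.
from inference import get_model

# Load a model by its Roboflow model ID (an API key is required for private models).
model = get_model(model_id="your-project/1", api_key="YOUR_ROBOFLOW_API_KEY")

# Run inference on an image (a local path, URL, or numpy array) and inspect the predictions.
results = model.infer("path/to/image.jpg")
print(results)
```
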
What are Workflows?

Workflows enable you to build complex computer vision applications by combining different models, pre-built logic, and external applications. They provide a visual, low-code environment for designing and deploying sophisticated computer vision pipelines.

With Workflows, you can do all of the following (a short invocation sketch follows the list):

  • Chain multiple models together to perform complex tasks.

  • Add custom logic and decision-making to your applications.

  • Integrate with external systems and APIs.

  • Track, count, time, measure, and visualize objects in images and videos.

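As a sketch of how a finished Workflow is invoked from code, the example below uses the `inference-sdk` HTTP client (`pip install inference-sdk`). The workspace name, workflow ID, API key, and image are placeholder values, and the API URL assumes the hosted Serverless API.

```python
# Sketch: running a deployed Workflow over HTTP with the inference-sdk client.
# Workspace name, workflow ID, API key, and image path are placeholders.
from inference_sdk import InferenceHTTPClient

client = InferenceHTTPClient(
    api_url="https://serverless.roboflow.com",  # or the URL of a self-hosted Inference server
    api_key="YOUR_ROBOFLOW_API_KEY",
)

result = client.run_workflow(
    workspace_name="your-workspace",
    workflow_id="your-workflow-id",
    images={"image": "path/to/image.jpg"},  # maps Workflow input names to images
)
print(result)
```
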
Deployment Options

Roboflow offers a variety of deployment options to suit different needs and use cases. These options can be broadly categorized into:

  • Roboflow Managed Deployments: These options leverage Roboflow's cloud infrastructure to run your models, eliminating the need for you to manage your own hardware or software.

  • Self-Hosted Deployments: These options allow you to deploy models on your own hardware, providing greater control over your environment and resources.

The following summarizes the key features, benefits, and limitations of each deployment option (a short client-side sketch follows the comparison):

Serverless API

  • Description: Run workflows and models directly on Roboflow's infrastructure through an infinitely scalable API.

  • Benefits: Scalable, easy to use, no infrastructure management.

  • Limitations: Limited control over resources; potential for higher latency for demanding applications.

Dedicated Deployments

  • Description: Dedicated GPUs and CPUs for running workflows and models.

  • Benefits: Support for GPU models, video streaming, and custom Python blocks.

  • Limitations: Limited to US-based data centers; does not autoscale like the Serverless API.

Self-Hosted Deployments

  • Description: Run Inference on your own hardware.

  • Benefits: Full control over resources and environment; potential for lower latency.

  • Limitations: Requires infrastructure management and expertise.

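To make the comparison concrete, the sketch below shows how the same `inference-sdk` client code can target either a Roboflow-managed endpoint or a self-hosted Inference server just by changing the API URL. The URLs, Docker image name, model ID, and API key are illustrative assumptions; check the Inference documentation for the exact values for your setup.

```python
# Sketch: pointing the same client at different deployment targets.
# Model ID, API key, and URLs are placeholders.
from inference_sdk import InferenceHTTPClient

# Roboflow-managed Serverless API: no infrastructure to run yourself.
serverless_client = InferenceHTTPClient(
    api_url="https://serverless.roboflow.com",
    api_key="YOUR_ROBOFLOW_API_KEY",
)

# Self-hosted Inference server on your own hardware, e.g. started with:
#   docker run -p 9001:9001 roboflow/roboflow-inference-server-cpu
self_hosted_client = InferenceHTTPClient(
    api_url="http://localhost:9001",
    api_key="YOUR_ROBOFLOW_API_KEY",
)

# The same inference call works against either target.
result = serverless_client.infer("path/to/image.jpg", model_id="your-project/1")
print(result)
```

A Dedicated Deployment similarly exposes its own endpoint URL, so switching between options is typically a one-line change.
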
Choosing the Right Deployment Option

There is a great guide on choosing the best deployment method for your use case in the Inference getting started guide: https://inference.roboflow.com/start/getting-started/

The best deployment option for you depends on your specific needs and requirements. Consider the following factors when making your decision:

  • Scalability: If your application needs to handle varying levels of traffic, the serverless API offers excellent scalability.

  • Latency: If you need low latency or video processing, dedicated deployments or self-hosted deployments with powerful hardware might be the best choice.

  • GPUs: If you need to run models that require a GPU (e.g., SAM2, CogVLM), use a Dedicated Deployment with a GPU machine type or a self-hosted deployment on hardware with GPUs available. (A serverless GPU API is coming soon.)

  • Control: Self-hosted deployments provide the most control over your environment and resources.

  • Expertise: Self-hosted deployments require more technical expertise to set up and manage.
