Hosted Inference

Serve your model via an infinitely scalable, battle-tested cloud API.


Each model trained with Roboflow Train is deployed as a custom API you can use to make predictions from any device with an internet connection. Inference runs on the server, so you don't need to worry about your edge device's hardware capabilities.
We automatically scale this API up and down and handle load balancing for you, so your application can absorb sudden spikes in traffic without you paying for GPU time you're not using. Our hosted prediction API has been battle-hardened by demanding production applications (including surviving the famous Hacker News and Reddit "hugs of death" without so much as batting an eye).

Hosted Inference Endpoint

Every model on Roboflow has its own inference endpoint on the Roboflow Inference API. The API is available for object detection, instance segmentation, and classification models.
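As a sketch, calling a hosted endpoint from Python might look like the following. The model ID, version number, and API key are placeholders you would take from your own project, and the `detect.roboflow.com` host is assumed for object detection models; check your project's Deploy tab for the exact URL.

```python
import base64
import json
import urllib.parse
import urllib.request

# All identifiers below are illustrative; substitute your own model ID,
# version number, and private API key from your Roboflow workspace.

def build_inference_url(model_id, version, api_key,
                        base="https://detect.roboflow.com"):
    """Construct the hosted inference endpoint URL for a model version."""
    query = urllib.parse.urlencode({"api_key": api_key})
    return f"{base}/{model_id}/{version}?{query}"

def infer(image_path, model_id, version, api_key):
    """POST a base64-encoded image to the hosted endpoint, return the JSON."""
    with open(image_path, "rb") as f:
        payload = base64.b64encode(f.read())
    request = urllib.request.Request(
        build_inference_url(model_id, version, api_key),
        data=payload,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

The response is a JSON object containing the model's predictions, which you can inspect with, e.g., `infer("image.jpg", "your-model", 1, "YOUR_API_KEY")`.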

Code Snippets

You can find auto-generated deployment code snippets on your project's Deploy tab.