Hosted API (Remote Server)

Leverage your custom trained model for cloud-hosted inference.


Each model trained with Roboflow Train is deployed as a custom API you can use to make predictions from any device that has an internet connection. Inference is done on the server so you don't need to worry about the edge device's hardware capabilities.
We automatically scale this API up and down and perform load balancing for you, so your application can handle sudden spikes in traffic without you having to pay for GPU time you're not using. Our hosted prediction API has been battle-hardened to handle even the most demanding production applications (including surviving the famous Hacker News and Reddit "hugs of death" without so much as batting an eye).

Response Object Format

The hosted API inference route returns a JSON object containing an array of predictions. Each prediction has the following properties:
  • top = the highest-confidence predicted class
  • class = the label of the classification
  • confidence = the model's confidence that the image belongs to the predicted class
  • predictions = collection of all predicted classes and their associated confidence values for the prediction
// an example JSON object
{
  "top": "real-image",
  "confidence": 0.9868,
  "predictions": [
    {
      "class": "real-image",
      "confidence": 0.9868
    },
    {
      "class": "illustration",
      "confidence": 0.0132
    }
  ]
}
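As a sketch, this response shape can be parsed with Python's standard json module (field names and values taken from the example above):

```python
import json

# Example response body, matching the classification response format above
body = '''
{
  "top": "real-image",
  "confidence": 0.9868,
  "predictions": [
    {"class": "real-image", "confidence": 0.9868},
    {"class": "illustration", "confidence": 0.0132}
  ]
}
'''

result = json.loads(body)
print(result["top"])  # highest-confidence class
for pred in result["predictions"]:
    print(pred["class"], pred["confidence"])
```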

The Example Web App

The easiest way to familiarize yourself with the inference endpoint is to visit the Example Web App. To use the Web App, simply input your model, version, and api_key. These will be pre-filled for you after training completes if you click through via the web UI under your version's "Training Results" section.
Then select an image via Choose File. After you have chosen the settings you want, click Run Inference.
On the left side of the screen, you will see example JavaScript code for posting a base64-encoded image to the inference endpoint. Within the form portion of the Web App, you can experiment with changing different API parameters when posting to the API.
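As a rough Python equivalent of the Web App's JavaScript, the sketch below builds (but does not send) a POST request carrying a base64-encoded image, using only the standard library. The endpoint URL and api_key are placeholders — substitute your own values:

```python
import base64
import urllib.parse
import urllib.request

# Placeholder endpoint and key -- substitute your own model, version, and api_key
API_URL = "https://classify.roboflow.com/your-dataset-slug/your-version"
API_KEY = "YOUR_KEY"

def build_inference_request(image_bytes: bytes) -> urllib.request.Request:
    """Build (without sending) the POST the Web App's JavaScript performs."""
    img_str = base64.b64encode(image_bytes)
    url = API_URL + "?" + urllib.parse.urlencode({"api_key": API_KEY})
    return urllib.request.Request(
        url,
        data=img_str,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )

req = build_inference_request(b"\xff\xd8\xff fake-jpeg-bytes")
print(req.full_url)  # endpoint with the api_key query parameter attached
# Send with: urllib.request.urlopen(req)
```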
Using the Inference API

Code Snippets

For your convenience, we've provided code snippets for calling this endpoint in various programming languages. If you need help integrating the inference API into your project, don't hesitate to reach out.
All examples upload to an example dataset with a model-endpoint of your-dataset-slug/your-version. You can easily find your dataset's identifier by looking at the curl command shown in the Roboflow web interface after your model has finished training.
Note: These docs are auto-generated with your API key and version in your Deploy tab within the Roboflow application.

Infer on Local and Hosted Images

To install dependencies, run pip install roboflow.
from roboflow import Roboflow

rf = Roboflow(api_key="API_KEY")
project = rf.workspace().project("MODEL_ENDPOINT")
model = project.version(VERSION).model

# infer on a local image
print(model.predict("your_image.jpg").json())

# visualize your prediction
# model.predict("your_image.jpg").save("prediction.jpg")

# infer on an image hosted elsewhere
# print(model.predict("URL_OF_YOUR_IMAGE", hosted=True).json())

Inference with the Hosted API

import cv2
import base64
import numpy as np
import requests
import time
import json
# Construct the Roboflow Infer URL
# (if running locally, replace the host with your local inference server's URL)
ROBOFLOW_MODEL = "your-dataset-slug/your-version"
ROBOFLOW_API_KEY = "API_KEY"
ROBOFLOW_SIZE = 416

upload_url = "".join([
    "https://classify.roboflow.com/",
    ROBOFLOW_MODEL,
    "?api_key=",
    ROBOFLOW_API_KEY
])

img = cv2.imread("YOUR_IMAGE.jpg")

# Resize (while maintaining the aspect ratio) to improve speed and save bandwidth
height, width, channels = img.shape
scale = ROBOFLOW_SIZE / max(height, width)
img = cv2.resize(img, (round(scale * width), round(scale * height)))

# Encode image to base64 string
retval, buffer = cv2.imencode('.jpg', img)
img_str = base64.b64encode(buffer)

# Get prediction from Roboflow Infer API
resp = requests.post(upload_url, data=img_str, headers={
    "Content-Type": "application/x-www-form-urlencoded"
}, stream=True)
preds = resp.json()
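The resize step above shrinks the image so its longer side equals ROBOFLOW_SIZE while preserving the aspect ratio. A standalone sketch of that arithmetic, using 416 pixels as an illustrative target size:

```python
def scaled_dimensions(height, width, target=416):
    """Scale so the longer side equals `target`, preserving aspect ratio."""
    scale = target / max(height, width)
    return round(scale * width), round(scale * height)

# A 1920x1080 frame shrinks to 416x234 before upload
print(scaled_dimensions(1080, 1920))  # (416, 234)
```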


We're using axios to perform the POST request in this example, so first run npm install axios to install the dependency.

Inferring on a Local Image

const axios = require("axios");
const fs = require("fs");

const image = fs.readFileSync("YOUR_IMAGE.jpg", {
    encoding: "base64"
});

axios({
    method: "POST",
    url: "https://classify.roboflow.com/your-dataset-slug/your-version",
    params: {
        api_key: "YOUR_KEY"
    },
    data: image,
    headers: {
        "Content-Type": "application/x-www-form-urlencoded"
    }
})
    .then(function(response) {
        console.log(response.data);
    })
    .catch(function(error) {
        console.log(error.message);
    });

Uploading a Local Image Using base64

import UIKit

// Load Image and Convert to Base64
let image = UIImage(named: "your-image-path") // path to image to upload ex: image.jpg
let imageData = image?.jpegData(compressionQuality: 1)
let fileContent = imageData?.base64EncodedString()
let postData = fileContent!.data(using: .utf8)

// Initialize Inference Server Request with API_KEY, Model, and Model Version
var request = URLRequest(url: URL(string: "https://classify.roboflow.com/your-dataset-slug/your-version?api_key=API_KEY")!, timeoutInterval: Double.infinity)
request.addValue("application/x-www-form-urlencoded", forHTTPHeaderField: "Content-Type")
request.httpMethod = "POST"
request.httpBody = postData

// Execute Post Request
URLSession.shared.dataTask(with: request, completionHandler: { data, response, error in
    // Parse Response to String
    guard let data = data else {
        print(String(describing: error))
        return
    }
    // Convert Response String to Dictionary
    do {
        let dict = try JSONSerialization.jsonObject(with: data, options: []) as? [String: Any]
        print(dict ?? [:])
    } catch {
        // Print String Response
        print(String(data: data, encoding: .utf8)!)
    }
}).resume()

Accessing Prediction Response Values

All predictions:
for prediction in preds['predictions']:
print(prediction['class'], prediction['confidence'])
Highest confidence prediction:
print(preds['top'])        # Example output (type: str): real-image
print(preds['confidence']) # Example output (type: float): 0.9868
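If you only have the predictions array, the top entry can also be recovered directly with max(). A sketch using the example response values from earlier:

```python
# Example response values, as in the sample JSON earlier in this page
preds = {
    "top": "real-image",
    "confidence": 0.9868,
    "predictions": [
        {"class": "real-image", "confidence": 0.9868},
        {"class": "illustration", "confidence": 0.0132},
    ],
}

# Recover the highest-confidence entry from the predictions array
best = max(preds["predictions"], key=lambda p: p["confidence"])
print(best["class"], best["confidence"])  # real-image 0.9868
```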