Enterprise Deployment

With Roboflow Enterprise you get added features and flexibility to meet even the most stringent security and deployment requirements.

VPC and On-Premise Deployment

If you have a Roboflow Enterprise account with the on-prem add-on, you can deploy our Docker container in your private cloud (or on your own metal) using our Inference Server and (optionally) our License Server Docker containers.

In a common configuration, the Inference Server receives requests from client applications inside the private network (with no Internet connection) and fetches weights from your trained Roboflow models via the License Server (which resides in a DMZ and is granted access through the firewall to the Roboflow API and cloud storage buckets). In this way, you can ensure that sensitive images never leave your private network.

The Inference Server can also use Offline Mode (described below) to store the weights locally for up to 30 days, for added privacy and resilience.

Installing and Using the Inference Server

The inference API is available as a Docker container for 64-bit Intel and AMD machines. To install, simply pull the container:

sudo docker pull roboflow/inference-server:cpu

Then run it:

sudo docker run --net=host roboflow/inference-server:cpu

You can now use the Inference Server as a drop-in replacement for our Hosted Inference API (see those docs for example code snippets in several programming languages). Use the sample code from the Hosted API, but replace https://detect.roboflow.com with http://{INFERENCE-SERVER-IP}:9001 in the API call. For example:

base64 YOUR_IMAGE.jpg | curl -d @- \
"http://{INFERENCE-SERVER-IP}:9001/{MODEL-ID}/{VERSION}?api_key={API_KEY}"
Note: The first call to a model will take a few seconds to download your weights and initialize them; subsequent predictions will be much quicker.
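The same request can be made from application code. Below is a minimal Python sketch of the pattern described above: build the Inference Server URL from the same placeholders used in the curl example, then POST a base64-encoded image as the request body. The helper names (inference_url, encode_image) are illustrative, not part of the Roboflow SDK, and the commented-out request assumes the third-party requests package and a running server.

```python
import base64

def inference_url(server_ip, model_id, version, api_key):
    # Same URL shape as the Hosted API, pointed at your local server
    # on port 9001 instead of https://detect.roboflow.com.
    return (
        f"http://{server_ip}:9001/{model_id}/{version}"
        f"?api_key={api_key}"
    )

def encode_image(path):
    # The API accepts a base64-encoded image as the POST body.
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")

# Example usage (assumes the `requests` package and a running server):
# import requests
# url = inference_url("localhost", "your-model", "1", "YOUR_API_KEY")
# resp = requests.post(
#     url,
#     data=encode_image("YOUR_IMAGE.jpg"),
#     headers={"Content-Type": "application/x-www-form-urlencoded"},
# )
# print(resp.json())
```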

Using the License Server

If you wish to firewall the Roboflow Inference Server from the Internet, you will need to use the Roboflow License Server which acts as a proxy for the Roboflow API and your models' weights.

On a machine with access to https://api.roboflow.com and https://storage.googleapis.com (and port 80 open to the Inference Server running in your private network), pull the License Server Docker container:

sudo docker pull roboflow/license-server

And run it:

sudo docker run --net=host roboflow/license-server

Configure your Inference Server to use this License Server by passing its IP in the LICENSE_SERVER environment variable:

sudo docker run --net=host --env LICENSE_SERVER={LICENSE-SERVER-IP} roboflow/inference-server:cpu

Offline Mode

With the optional Offline Mode add-on to Roboflow Enterprise, you can configure the Roboflow Inference Server to cache weights for up to 30 days. This allows it to run completely air-gapped or in locations where an Internet connection is not readily available.

To enable Offline Mode, you'll need to create and attach a Docker volume to /cache on the Inference Server:

sudo docker volume create roboflow
sudo docker run --net=host --env LICENSE_SERVER={LICENSE-SERVER-IP} --mount source=roboflow,target=/cache roboflow/inference-server:cpu

The weights will be loaded from your Roboflow account over the Internet (via the License Server if you have configured one) and stored safely in the Docker volume for up to 30 days.

Your inference results will contain a new expiration key you can use to determine how long the Inference Server can continue serving predictions before it must renew its lease on the weights via an Internet or License Server connection. Once fewer than 7 days remain on the lease, the Inference Server will attempt to renew it once per hour until a connection to the Roboflow API succeeds.

"predictions": [
"x": 340.9,
"y": 263.6,
"width": 284,
"height": 360,
"class": "example",
"confidence": 0.867
"expiration": {
"value": 29.91251408564815,
"unit": "days"