Deploy your model on CPU on your own infrastructure.

The Inference API is available as a Docker container for 64-bit Intel and AMD machines. It is not compatible with macOS-based devices.

x86 CPU Installation

To install the Roboflow Inference Server on an x86 CPU, run the following command:

docker run -it --rm -p 9001:9001 roboflow/roboflow-inference-server-cpu

If required, run the command with sudo privileges.


ARM CPU Installation

To install the Roboflow Inference Server on an ARMv8 CPU, run the following command. This method is also confirmed to work with edge devices such as the Raspberry Pi.

docker run -it --rm -p 9001:9001 roboflow/roboflow-inference-server-arm-cpu:0.4.4

If required, run the command with sudo privileges.

Use the Inference Server

You can use the Inference Server as a drop-in replacement for our Hosted Inference API (see those docs for example code snippets in several programming languages). Use the sample code from the Hosted API but replace https://detect.roboflow.com with http://{INFERENCE-SERVER-IP}:9001 in the API call. For example:

base64 YOUR_IMAGE.jpg | curl -d @- \
"http://{INFERENCE-SERVER-IP}:9001/YOUR_MODEL/YOUR_VERSION?api_key=YOUR_API_KEY"

Note: The first call to a model will take a few seconds to download your weights and initialize them; subsequent predictions will run faster.
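The same request can be sketched in Python using only the standard library. This is a minimal example, not an official client: the model ID, version, and API key below are placeholders, and the host assumes the server is running locally on port 9001.

```python
import base64
import json
from urllib import parse, request


def build_infer_url(host, model_id, version, api_key):
    # Same path shape as the Hosted Inference API: /<model>/<version>?api_key=...
    return f"{host}/{model_id}/{version}?{parse.urlencode({'api_key': api_key})}"


def infer(image_path, model_id, version, api_key,
          host="http://localhost:9001"):
    """POST a base64-encoded image to the Inference Server and return its JSON response."""
    with open(image_path, "rb") as f:
        payload = base64.b64encode(f.read())
    req = request.Request(
        build_infer_url(host, model_id, version, api_key),
        data=payload,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.load(resp)


# Example usage (placeholders -- replace with your own values):
# predictions = infer("YOUR_IMAGE.jpg", "your-model", 1, "YOUR_API_KEY")
```

Because the image is base64-encoded into the request body, this mirrors the curl pipeline above and works against either the local server or the hosted endpoint by changing `host`.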
