> For the complete documentation index, see [llms.txt](https://docs.roboflow.com/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://docs.roboflow.com/deploy/supported-models/moondream2.md).

# Moondream2

Moondream2 is a compact vision-language model. In Roboflow Inference, it is exposed as an open-vocabulary object detector: pass a class name as the prompt and receive bounding boxes for matching regions.

{% hint style="info" %}
Moondream2 is not available on the Serverless Hosted API. Run it on a [Dedicated Deployment](/deploy/dedicated-deployments.md) or [self-hosted Inference](https://inference.roboflow.com/).
{% endhint %}

## Code sample

{% stepper %}
{% step %}

### Get your API Key

Create a Roboflow account, find your key on the [Roboflow API settings page](https://app.roboflow.com/settings/api), and export it in your shell:

```bash
export ROBOFLOW_API_KEY="your-key-here"
```

{% endstep %}

{% step %}

### Install the dependencies

Install the [Inference SDK](https://inference.roboflow.com/) and [supervision](https://supervision.roboflow.com/), along with OpenCV for decoding and writing images:

```bash
pip install inference-sdk supervision opencv-python
```

{% endstep %}

{% step %}

### Run the model

Set `api_url` to your Dedicated Deployment URL or a local Inference server.

```python
import os
import cv2
import numpy as np
import requests
import supervision as sv
from inference_sdk import InferenceHTTPClient

# Download the example image and decode it into a BGR array.
content = requests.get("https://media.roboflow.com/notebooks/examples/dog.jpeg").content
image = cv2.imdecode(np.frombuffer(content, np.uint8), cv2.IMREAD_COLOR)

# Point the client at your Dedicated Deployment or local Inference server.
client = InferenceHTTPClient(
    api_url="https://your-deployment.roboflow.cloud",
    api_key=os.environ["ROBOFLOW_API_KEY"],
)

# The prompt is the class name to detect.
result = client.infer_lmm(
    image,
    model_id="moondream2",
    prompt="dog",
)

# Convert center-based boxes (x, y, width, height) to corner format (x1, y1, x2, y2).
preds = result["predictions"]
xyxys = [
    [p["x"] - p["width"] / 2, p["y"] - p["height"] / 2,
     p["x"] + p["width"] / 2, p["y"] + p["height"] / 2]
    for p in preds
]
detections = sv.Detections(
    # reshape keeps a valid (0, 4) array when nothing is detected
    xyxy=np.array(xyxys, dtype=float).reshape(-1, 4),
    class_id=np.array([p.get("class_id", 0) for p in preds], dtype=int),
    confidence=np.array([p.get("confidence", 1.0) for p in preds], dtype=float),
    data={"class_name": np.array([p["class"] for p in preds])},
)

# Draw boxes and "<class> <confidence>" labels, then save the result.
labels = [f"{p['class']} {p.get('confidence', 1.0):.2f}" for p in preds]
annotated = sv.BoxAnnotator().annotate(image.copy(), detections)
annotated = sv.LabelAnnotator().annotate(annotated, detections, labels=labels)
cv2.imwrite("dog_annotated.png", annotated)
```

<figure><img src="/files/ohji87Lcp1h6SHAUbaxv" alt=""><figcaption></figcaption></figure>
{% endstep %}
{% endstepper %}
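
The sample above reads each prediction as a center point plus a width and height, then converts it to corner coordinates for supervision. The following is a minimal sketch of that conversion as a standalone helper. The function name `center_to_xyxy` and the sample prediction are illustrative assumptions, not part of the Inference SDK.

```python
import numpy as np


def center_to_xyxy(predictions):
    """Convert center-format boxes (x, y, width, height) to (x1, y1, x2, y2)."""
    boxes = [
        [
            p["x"] - p["width"] / 2,   # left edge
            p["y"] - p["height"] / 2,  # top edge
            p["x"] + p["width"] / 2,   # right edge
            p["y"] + p["height"] / 2,  # bottom edge
        ]
        for p in predictions
    ]
    # reshape keeps the shape at (0, 4) when there are no predictions
    return np.array(boxes, dtype=float).reshape(-1, 4)


# Hypothetical prediction with the same keys the sample above reads.
sample = [{"x": 100.0, "y": 80.0, "width": 40.0, "height": 20.0}]
print(center_to_xyxy(sample))
```

Returning an array of shape `(0, 4)` for an empty list means downstream code can handle the "no matches" case without a special branch.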

## Inference speed

Latency was measured with [Roboflow Inference](https://inference.roboflow.com/) on a single NVIDIA L4 GPU at batch size 1, captioning one image. The output length cannot be fixed for Moondream2, so latency varies with the length of the response.

<table data-search="false"><thead><tr><th>Alias</th><th>Latency (ms)</th></tr></thead><tbody><tr><td><code>moondream2</code></td><td>1669</td></tr></tbody></table>
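
Because latency depends on the response, it can help to time requests against your own deployment. The following is a rough sketch of one way to do that, not the method used to produce the figure above. The helper `measure_latency_ms` is hypothetical, and the `time.sleep` call is a stand-in for a real request to your server.

```python
import statistics
import time


def measure_latency_ms(fn, warmup=1, runs=5):
    """Call fn repeatedly and summarize wall-clock latency in milliseconds."""
    # Warm-up calls are excluded so one-time setup cost does not skew the numbers.
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "runs": runs,
        "min_ms": min(samples),
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
    }


# Stand-in workload (assumption): replace the lambda with a call to your deployment.
stats = measure_latency_ms(lambda: time.sleep(0.01), warmup=1, runs=3)
print(stats)
```

Reporting the median alongside the minimum and maximum gives a sense of how much the response length moves the latency from one request to the next.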

{% hint style="info" %}
Set `api_url` to match your deployment target:

* `http://localhost:9001` for a local [Inference](https://inference.roboflow.com/) server.
* Your [Dedicated Deployment](/deploy/dedicated-deployments.md) URL for a private endpoint.
{% endhint %}
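
One way to switch between these targets without editing code is to read the URL from the environment. This is a small sketch under stated assumptions: the environment variable name `INFERENCE_API_URL` and the helper `resolve_api_url` are hypothetical and are not defined by the Inference SDK.

```python
import os

# Default taken from the local Inference server address listed above.
LOCAL_INFERENCE_URL = "http://localhost:9001"


def resolve_api_url(env=None):
    """Return the configured deployment URL, or the local server if none is set."""
    env = os.environ if env is None else env
    # INFERENCE_API_URL is a hypothetical variable name chosen for this example.
    url = env.get("INFERENCE_API_URL", "").strip()
    # Drop a trailing slash so the value has a consistent form.
    return url.rstrip("/") if url else LOCAL_INFERENCE_URL


print(resolve_api_url({}))
print(resolve_api_url({"INFERENCE_API_URL": "https://your-deployment.roboflow.cloud/"}))
```

The returned value can then be passed as `api_url` when constructing the client.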