OwlV2

Use OwlV2 for one-shot object detection on a Dedicated Deployment or self-hosted Inference

OwlV2 is Google's open-vocabulary object detector. You provide one or more example bounding boxes on a reference image, and OwlV2 detects similar objects in target images without any training.

OwlV2 is not available on the Serverless Hosted API. Run it on a Dedicated Deployment or self-hosted Inference.

Code sample

Install dependencies:

pip install requests supervision opencv-python

The sample below uses a single example box on the input image as the prompt and detects matching objects in the same image. In practice, you typically pass a separate reference image. Set URL to your Dedicated Deployment URL or the address of a local Inference server. Pass your Roboflow API key via the API_KEY environment variable.

import base64
import os
import urllib.request

import cv2
import numpy as np
import requests
import supervision as sv

URL = "https://your-deployment.roboflow.cloud"
IMAGE_URL = "https://media.roboflow.com/notebooks/examples/dog.jpeg"
IMAGE_PATH = "dog.jpeg"
OUTPUT_PATH = "dog_annotated.png"

urllib.request.urlretrieve(IMAGE_URL, IMAGE_PATH)
image = cv2.imread(IMAGE_PATH)
_, buffer = cv2.imencode(".jpg", image)
image_base64 = base64.b64encode(buffer).decode("utf-8")

response = requests.post(
    f"{URL}/owlv2/infer",
    json={
        "api_key": os.getenv("API_KEY"),
        "image": {"type": "base64", "value": image_base64},
        "training_data": [{
            "image": {"type": "base64", "value": image_base64},
            "boxes": [{"x": 360, "y": 800, "w": 500, "h": 500, "cls": "dog"}],
        }],
        "confidence": 0.99,
    },
)
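response.raise_for_status()  # stop here on an HTTP error instead of failing on a missing "predictions" key
preds = response.json()["predictions"]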

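# Each prediction gives a box center (x, y) plus width and height; convert to corner (xyxy) format.
xyxys = [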
    [p["x"] - p["width"] / 2, p["y"] - p["height"] / 2,
     p["x"] + p["width"] / 2, p["y"] + p["height"] / 2]
    for p in preds
]
detections = sv.Detections(
    xyxy=np.array(xyxys, dtype=float).reshape(-1, 4),  # stays (0, 4) when nothing is detected
    class_id=np.array([p.get("class_id", 0) for p in preds], dtype=int),
    confidence=np.array([p["confidence"] for p in preds], dtype=float),
    data={"class_name": np.array([p["class"] for p in preds])},
)
labels = [f"{p['class']} {p['confidence']:.2f}" for p in preds]
annotated = sv.BoxAnnotator().annotate(scene=image.copy(), detections=detections)
annotated = sv.LabelAnnotator().annotate(scene=annotated, detections=detections, labels=labels)
cv2.imwrite(OUTPUT_PATH, annotated)
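
The sample above reuses the target image as its own reference. A minimal sketch of the more typical case, where the reference image differs from the target, might look like the following. The field names are copied from the request above. The helpers `image_field` and `build_owlv2_payload` are hypothetical names introduced here for illustration and are not part of any Roboflow SDK. The base64 strings and the API key are placeholders.

```python
import os


def image_field(image_base64):
    """Wrap a base64 string in the image structure used by the sample request."""
    return {"type": "base64", "value": image_base64}


def build_owlv2_payload(target_b64, reference_b64, boxes, confidence=0.99, api_key=None):
    """Build a request body whose reference image differs from the target image.

    Hypothetical helper: it only assembles the same fields the sample request sends.
    """
    return {
        # Fall back to the API_KEY environment variable, as in the sample above.
        "api_key": api_key if api_key is not None else os.getenv("API_KEY"),
        # Image to run detection on.
        "image": image_field(target_b64),
        # Reference image plus the example boxes drawn on it.
        "training_data": [
            {"image": image_field(reference_b64), "boxes": list(boxes)}
        ],
        "confidence": confidence,
    }


payload = build_owlv2_payload(
    target_b64="<base64 of the target image>",        # placeholder
    reference_b64="<base64 of the reference image>",  # placeholder
    boxes=[{"x": 360, "y": 800, "w": 500, "h": 500, "cls": "dog"}],
    api_key="YOUR_API_KEY",                           # placeholder
)
```

The resulting dictionary can be passed as the `json` argument of `requests.post`, in the same way as the inline dictionary in the sample.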

Set URL to match your deployment target: either your Dedicated Deployment URL or the address of your self-hosted Inference server.

OwlV2 confidence scores are typically very high (above 0.99), so the sample above sets confidence to 0.99. Tune the confidence parameter accordingly.
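
One way to choose a threshold is to request predictions once at a lower confidence and then count how many survive at several candidate cut-offs. The sketch below assumes each prediction is a dictionary with a `confidence` key, as in the sample response handling. `count_by_threshold` is a hypothetical helper, and the scores in `sample_preds` are invented for illustration, not real model output.

```python
def count_by_threshold(preds, thresholds):
    """Return how many predictions meet or exceed each candidate threshold."""
    return {t: sum(1 for p in preds if p["confidence"] >= t) for t in thresholds}


# Invented scores for illustration only; replace with response.json()["predictions"].
sample_preds = [
    {"confidence": 0.9995},
    {"confidence": 0.9971},
    {"confidence": 0.9912},
    {"confidence": 0.9650},
]

counts = count_by_threshold(sample_preds, [0.96, 0.99, 0.995, 0.999])
for threshold, kept in counts.items():
    print(f"confidence >= {threshold}: {kept} prediction(s) kept")
```

Because the scores cluster near 1.0, candidate thresholds are best spaced in the third or fourth decimal place rather than in steps of 0.1.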
