# Two-Stage and CLIP Inference

`Workspace` exposes three convenience methods that combine multiple models or modalities in one call:

* `two_stage()` — run an object-detection model, then run a second model on each crop.
* `two_stage_ocr()` — run an object-detection model, then OCR each crop.
* `clip_compare()` — embed every image in a directory with [CLIP](https://openai.com/research/clip) and rank them against a target image.

These helpers are useful for prototypes (license plates, badge readers, similar-image search) before you build the equivalent in a [Workflow](/developer/python-sdk/manage-workflows.md).

## Two-stage detection + classification

The first stage detects regions of interest. The second stage classifies (or runs another detector on) each cropped region.

```python
import roboflow

rf = roboflow.Roboflow(api_key="YOUR_API_KEY")
ws = rf.workspace()

results = ws.two_stage(
    image="photo.jpg",
    first_stage_model_name="cars-or-trucks",
    first_stage_model_version=2,
    second_stage_model_name="vehicle-make",
    second_stage_model_version=4,
)

for r in results:
    print(r)
```

Each entry in `results` carries the parent detection (bbox, class, confidence) and the second-stage classification.
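The detect-then-crop-then-classify pattern described in this section can be sketched in plain Python. This is an illustration only, not the SDK's implementation: `run_two_stage`, `crop`, `fake_detector`, and `fake_classifier` are hypothetical names, the image is a toy list of pixel rows, and the box format (top-left `x`/`y` plus `width`/`height`) is an assumption that may differ from what the hosted models return.

```python
from typing import Callable, Dict, List

Image = List[List[int]]  # toy grayscale image: rows of pixel values


def crop(image: Image, x: int, y: int, width: int, height: int) -> Image:
    """Return the sub-image whose top-left corner is (x, y)."""
    return [row[x:x + width] for row in image[y:y + height]]


def run_two_stage(
    image: Image,
    detect: Callable[[Image], List[Dict]],
    classify: Callable[[Image], str],
) -> List[Dict]:
    """Run `detect` once, then `classify` on the crop of every detection."""
    results = []
    for detection in detect(image):
        region = crop(
            image,
            detection["x"],
            detection["y"],
            detection["width"],
            detection["height"],
        )
        results.append({"detection": detection, "label": classify(region)})
    return results


# Hypothetical stand-ins for the two models.
def fake_detector(image: Image) -> List[Dict]:
    """Pretend first stage: always reports the same two boxes."""
    return [
        {"x": 0, "y": 0, "width": 2, "height": 2, "class": "car"},
        {"x": 2, "y": 0, "width": 2, "height": 2, "class": "truck"},
    ]


def fake_classifier(region: Image) -> str:
    """Pretend second stage: labels a crop by its mean brightness."""
    pixels = [p for row in region for p in row]
    return "bright" if sum(pixels) / len(pixels) > 127 else "dark"


toy_image = [
    [200, 210, 10, 20],
    [220, 230, 30, 40],
]
two_stage_results = run_two_stage(toy_image, fake_detector, fake_classifier)
for entry in two_stage_results:
    print(entry["detection"]["class"], entry["label"])
```

The second stage is called once per detection, so the cost of the pipeline grows with the number of regions the first stage finds.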

## Two-stage detection + OCR

```python
# Reuses the `ws` workspace object created in the previous example.
results = ws.two_stage_ocr(
    image="photo.jpg",
    first_stage_model_name="license-plates",
    first_stage_model_version=1,
)

for r in results:
    print(r["bbox"], r["text"])
```

The second stage runs Tesseract-style OCR on each crop and returns the recognized text alongside the bounding box.

## CLIP image comparison

`clip_compare()` embeds the supplied target image and every matching image in a directory, computes the cosine similarity between the target and each candidate, and returns the results sorted by similarity.

```python
# Reuses the `ws` workspace object created in the first example.
results = ws.clip_compare(
    dir="./gallery",
    image_ext=".jpg",
    target_image="./query.jpg",
)

for r in results:
    print(r["image"], r["similarity"])
```

### Parameters

* `dir` (str) — directory to scan for candidate images.
* `image_ext` (str, default `".png"`) — file extension to match. Note this includes the leading dot.
* `target_image` (str) — path to the image you're searching for similarity against.
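As a rough illustration of how `dir` and `image_ext` interact, the sketch below selects files in a directory by extension with a glob pattern. The helper `list_candidates` is hypothetical and the pattern-based matching is an assumption; the SDK may select files differently.

```python
import tempfile
from pathlib import Path
from typing import List


def list_candidates(directory: str, image_ext: str = ".png") -> List[str]:
    """Return the sorted file names in `directory` that end with `image_ext`."""
    return sorted(p.name for p in Path(directory).glob("*" + image_ext))


# Build a throwaway gallery so the example is self-contained.
with tempfile.TemporaryDirectory() as gallery:
    for name in ("a.jpg", "b.png", "c.jpg"):
        (Path(gallery) / name).touch()

    jpg_names = list_candidates(gallery, ".jpg")
    default_names = list_candidates(gallery)  # falls back to ".png"

print(jpg_names)
print(default_names)
```

With the default extension, a directory that contains only `.jpg` files would yield no candidates, so pass `image_ext` explicitly when the gallery is not PNG.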

## When to graduate to Workflows

These helpers run on the hosted inference endpoint and make their HTTP calls serially: one per stage or per image. Once you need batching, conditional branching, custom blocks, or a long-running pipeline, build the equivalent as a [Workflow](/developer/python-sdk/manage-workflows.md). The Workflows runtime is purpose-built for chained inference and supports streaming inputs.


