
Per-Image Predictions

Returns paginated per-image prediction records: TP/FP/FN counts, per-image precision, recall, and F1, the image's cluster id and 2D embedding, and the raw confusion entries.

This is the data the per-image predictions panel in the app reads.

GET https://api.roboflow.com/:workspace/model-evals/:evalId/image-predictions
curl "https://api.roboflow.com/my-workspace/model-evals/$EVAL_ID/image-predictions?api_key=$ROBOFLOW_API_KEY&split=test&limit=50"
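
The same request can be sketched in Python using only the standard library. The helper names `build_url` and `fetch_page` are illustrative and not part of any Roboflow SDK. The sketch assumes the API key is passed as the `api_key` query parameter, as in the curl example, and that the body is JSON.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.roboflow.com"


def build_url(workspace, eval_id, api_key, **params):
    """Build the image-predictions URL (illustrative helper, not an SDK call)."""
    path = "/{}/model-evals/{}/image-predictions".format(
        urllib.parse.quote(workspace, safe=""),
        urllib.parse.quote(eval_id, safe=""),
    )
    query = {"api_key": api_key}
    # Drop parameters left as None so the server-side defaults apply.
    query.update({k: v for k, v in params.items() if v is not None})
    return BASE_URL + path + "?" + urllib.parse.urlencode(query)


def fetch_page(workspace, eval_id, api_key, **params):
    """GET one page and decode the JSON body (performs a network call)."""
    url = build_url(workspace, eval_id, api_key, **params)
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)
```

For example, `fetch_page("my-workspace", eval_id, api_key, split="test", limit=50)` would issue the same request as the curl command above.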

Query parameters

Parameter    Type     Description
split        enum     One of train, valid, test, or all. Default all.
confidence   integer  Confidence-threshold percentage in [0, 100] (selects which per-confidence report variant to read).
limit        integer  Page size; default 200, max 1000.
offset       integer  Skip this many records before returning. Default 0.
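
As a rough sketch of how `limit` and `offset` combine, the generator below walks every page until all records have been seen. It assumes `totalImages` in the response is the total number of records for the selected split, which the sample response suggests but this page does not state outright. `fetch_page(offset, limit)` is a hypothetical caller-supplied function that returns one decoded response page.

```python
def iter_image_records(fetch_page, limit=1000):
    """Yield every image record by advancing `offset` page by page.

    `fetch_page(offset, limit)` is a hypothetical caller-supplied function
    returning one decoded response page as a dict.
    """
    offset = 0
    while True:
        page = fetch_page(offset, limit)
        images = page.get("images", [])
        for record in images:
            yield record
        offset += len(images)
        # Stop on an empty page, or once `totalImages` (assumed to be the
        # total record count for the split) has been reached.
        if not images or offset >= page.get("totalImages", 0):
            break
```

The default of 1000 here is the documented maximum page size, chosen to keep the number of requests low.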

Response

{
    "split": "test",
    "confidenceThreshold": 0.2,
    "totalImages": 192,
    "offset": 0,
    "limit": 50,
    "images": [
        {
            "imageId": "1QKLCUsfAzFiCIb6YCJj",
            "imageName": "abc.jpg",
            "split": "test",
            "augmentations": 2,
            "cluster": {
                "id": 4,
                "embedding2D": [7.494518280029297, -5.143994331359863]
            },
            "stats": {
                "truePositives": 2,
                "falsePositives": 7,
                "falseNegatives": 0,
                "precision": 0.222,
                "recall": 1.0,
                "f1": 0.364
            },
            "confusion": [
                [0, 0, 2],
                [2, 0, 7]
            ]
        }
    ]
}
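
The `stats` values in the sample are consistent with the standard definitions of precision, recall, and F1 computed from the TP/FP/FN counts. The sketch below recomputes them as a sanity check. Returning 0.0 when a denominator is zero and rounding to three decimals are assumptions made here, not documented API behavior.

```python
def image_stats(tp, fp, fn):
    """Recompute precision, recall, and F1 from per-image counts."""
    # Zero denominators fall back to 0.0 (an assumption, not documented behavior).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return round(precision, 3), round(recall, 3), round(f1, 3)


# Counts from the sample response: 2 TP, 7 FP, 0 FN.
print(image_stats(2, 7, 0))  # (0.222, 1.0, 0.364)
```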

Notes

  • imageId is the Roboflow source image id - useful for cross-referencing with other Roboflow APIs.

  • confusion entries are [actualClassIdx, predictedClassIdx, count] triples; class indices reference the same array as Confusion Matrix's classes.

  • embedding2D is the UMAP-projected 2D coordinate used in the Vector Analysis plot.

  • Different confidence values return different stats, because the set of predictions changes with the threshold. Only thresholds that the eval pipeline materialized can be queried; any other value returns 404 report_not_found.

  • Pagination cost: each page re-reads the full model_eval_results.json file from storage and slices it server-side. For evals with very large image_results arrays, prefer larger limit values (up to 1000) over many small pages to minimize the per-page fixed cost.
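
To illustrate the confusion triples described in the notes, here is a minimal sketch that maps each `[actualClassIdx, predictedClassIdx, count]` entry to class names. The `classes` list is hypothetical; in practice it would come from the Confusion Matrix response for the same eval.

```python
def decode_confusion(confusion, classes):
    """Map [actualClassIdx, predictedClassIdx, count] triples to class names."""
    return [
        {"actual": classes[actual], "predicted": classes[predicted], "count": count}
        for actual, predicted, count in confusion
    ]


# Hypothetical class list; the real one is the Confusion Matrix `classes` array.
classes = ["helmet", "vest", "background"]
rows = decode_confusion([[0, 0, 2], [2, 0, 7]], classes)
for row in rows:
    print(row)
```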
