Confusion Matrix

Returns the aggregated confusion matrix derived from per-image predictions. Each cell matrix[actual][predicted] is the number of instances whose ground-truth class is at index actual and whose predicted class is at index predicted; both indices refer to positions in the classes array of the response. For semantic segmentation evaluations, values represent pixel counts rather than instance counts.

This is the data the confusion matrix panel in the app reads.

https://api.roboflow.com/:workspace/model-evals/:evalId/confusion-matrix
curl "https://api.roboflow.com/my-workspace/model-evals/$EVAL_ID/confusion-matrix?api_key=$ROBOFLOW_API_KEY&split=test"

Query parameters

Parameter     Type      Description
split         enum      One of train, valid, test, or all. Default: test.
confidence    integer   Confidence threshold as a percentage in [0, 100]. Defaults to the canonical variant of the report (typically 20).
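The request can also be assembled programmatically. The following is a minimal sketch using only the Python standard library; the helper names (build_confusion_matrix_url, fetch_confusion_matrix) and the client-side validation are illustrative assumptions, not part of any official SDK.

```python
import json
from typing import Optional
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE_URL = "https://api.roboflow.com"
VALID_SPLITS = {"train", "valid", "test", "all"}


def build_confusion_matrix_url(
    workspace: str,
    eval_id: str,
    api_key: str,
    split: str = "test",
    confidence: Optional[int] = None,
) -> str:
    """Build the endpoint URL (hypothetical helper, not an official client)."""
    if split not in VALID_SPLITS:
        raise ValueError(f"split must be one of {sorted(VALID_SPLITS)}")
    params = {"api_key": api_key, "split": split}
    if confidence is not None:
        # The documented range is a percentage in [0, 100].
        if not 0 <= confidence <= 100:
            raise ValueError("confidence must be in [0, 100]")
        params["confidence"] = confidence
    path = (
        f"/{quote(workspace, safe='')}/model-evals/"
        f"{quote(eval_id, safe='')}/confusion-matrix"
    )
    return f"{BASE_URL}{path}?{urlencode(params)}"


def fetch_confusion_matrix(url: str) -> dict:
    """Send the GET request and decode the JSON body."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)
```

Omitting confidence leaves the parameter out of the query string, so the server-side default applies.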

Response

{
    "split": "test",
    "confidenceThreshold": 0.2,
    "classes": ["Car-rims", "music-note", "background"],
    "matrix": [
        [20,  0, 0],
        [ 0,  0, 0],
        [80,  0, 0]
    ]
}

In the example above, at confidence threshold 0.2:

  • All 20 instances of Car-rims were correctly classified (matrix[0][0] = 20)

  • The model produced 80 false positives, predicting Car-rims when the actual class was background (matrix[2][0] = 80)

  • The test split has no music-note instances (row 1 is all zeros), and the model never predicted music-note (column 1 is all zeros)
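The reading above can be reproduced mechanically. This sketch derives per-class true positives, false positives, and false negatives from the matrix, assuming rows are actual classes, columns are predicted classes, and a class literally named "background" stands for "no object" as in the example. The function name and the precision/recall fields are illustrative and not part of the API response.

```python
from typing import Dict, List, Optional


def summarize_confusion_matrix(
    classes: List[str],
    matrix: List[List[int]],
    background: str = "background",  # assumed name of the no-object class
) -> Dict[str, Dict[str, Optional[float]]]:
    """Compute TP/FP/FN and precision/recall for every non-background class."""
    summary: Dict[str, Dict[str, Optional[float]]] = {}
    for i, name in enumerate(classes):
        if name == background:
            continue
        tp = matrix[i][i]                          # actual i, predicted i
        fn = sum(matrix[i]) - tp                   # actual i, predicted something else
        fp = sum(row[i] for row in matrix) - tp    # predicted i, actual something else
        summary[name] = {
            "tp": tp,
            "fp": fp,
            "fn": fn,
            # None when the denominator is zero (class never predicted / never present)
            "precision": tp / (tp + fp) if tp + fp else None,
            "recall": tp / (tp + fn) if tp + fn else None,
        }
    return summary


response = {
    "classes": ["Car-rims", "music-note", "background"],
    "matrix": [[20, 0, 0], [0, 0, 0], [80, 0, 0]],
}
for name, stats in summarize_confusion_matrix(
    response["classes"], response["matrix"]
).items():
    print(name, stats)
```

For the example response this gives Car-rims 20 true positives, 80 false positives, and 0 false negatives (precision 0.2, recall 1.0), while music-note has all-zero counts and undefined precision and recall.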

Notes

  • confidence selects which underlying per-confidence variant of the report to aggregate. Different thresholds yield different matrices.

  • split=all aggregates raw counts across train, valid, and test.
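To illustrate what aggregating raw counts means, the sketch below sums per-split matrices cell by cell. It assumes every split reports the same classes in the same order; the helper name is hypothetical, and the train and valid counts are made-up illustration values, not real results.

```python
from typing import List

Matrix = List[List[int]]


def sum_matrices(matrices: List[Matrix]) -> Matrix:
    """Element-wise sum of equally sized square matrices."""
    if not matrices:
        raise ValueError("need at least one matrix")
    size = len(matrices[0])
    total = [[0] * size for _ in range(size)]
    for m in matrices:
        if len(m) != size or any(len(row) != size for row in m):
            raise ValueError("all matrices must share the same square shape")
        for i, row in enumerate(m):
            for j, value in enumerate(row):
                total[i][j] += value
    return total


# Made-up counts for train and valid; test is the example response above.
train = [[5, 1, 0], [0, 3, 0], [2, 0, 0]]
valid = [[1, 0, 0], [0, 0, 1], [0, 0, 0]]
test = [[20, 0, 0], [0, 0, 0], [80, 0, 0]]

combined = sum_matrices([train, valid, test])
print(combined)
```

Because the counts are raw rather than normalized, larger splits contribute proportionally more to the combined matrix.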
