Dataset Health Check

Assessing and improving the quality of your dataset.

Follow the Guided Tutorial

The best way to understand Roboflow's Dataset Health Check is by following the Dataset Health Check Guided Tutorial, which walks through a health check on a public dataset of hard hat construction workers.

Breaking Down the Health Check (Object Detection)

Understanding your dataset's health helps you make informed decisions about preprocessing and augmentation.

The Basics

Images counts the number of images in your dataset, including images with missing or null annotations.

  • An image has a missing annotation when it has no accompanying annotation file.

  • An image has a null annotation when it is annotated but deliberately contains no objects.

See more on the difference between null and missing annotations.

Annotations is the total number of annotated objects (i.e., the number of bounding boxes).

Average Image Size is the average size of the images in your dataset, in megapixels (width times height in pixels, divided by one million).
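To make these three figures concrete, here is a minimal sketch that computes them by hand. It does not use Roboflow's API or reflect how Roboflow calculates them internally. The record layout is an assumption made for illustration: each image is a dict whose hypothetical "boxes" field is None when no annotation file exists (missing), an empty list when the image is annotated as containing no objects (null), and a list of class labels otherwise.

```python
# Hypothetical in-memory records, not a Roboflow export format.
# "boxes" is None -> missing annotation (no annotation file)
# "boxes" is []   -> null annotation (annotated, but no objects)
images = [
    {"width": 1000, "height": 1000, "boxes": ["helmet", "head"]},
    {"width": 2000, "height": 1000, "boxes": []},
    {"width": 1000, "height": 500, "boxes": None},
    {"width": 500, "height": 1000, "boxes": ["helmet"]},
]


def basic_stats(records):
    """Return image count, missing/null counts, box count, and mean megapixels."""
    missing = sum(1 for r in records if r["boxes"] is None)
    null = sum(1 for r in records if r["boxes"] is not None and len(r["boxes"]) == 0)
    # Images with missing or null annotations contribute zero boxes.
    annotations = sum(len(r["boxes"]) for r in records if r["boxes"])
    megapixels = [r["width"] * r["height"] / 1_000_000 for r in records]
    avg_mp = sum(megapixels) / len(megapixels) if megapixels else 0.0
    return {
        "images": len(records),  # includes missing and null images
        "missing": missing,
        "null": null,
        "annotations": annotations,
        "avg_megapixels": avg_mp,
    }


stats = basic_stats(images)
print(stats)
```

Note that the image count includes the missing and null images, while the annotation count only reflects images that actually have boxes.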

Class Balance

Class Balance shows how many annotations each class has and visualizes how balanced or imbalanced the classes are. Imbalanced data can yield poor results, and accuracy is especially misleading here: it can look high even when a model rarely detects the under-represented classes.
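The effect of imbalance on accuracy can be illustrated with a small sketch. The label counts below are made up for illustration, and the example is simplified to per-object classification rather than full object detection: a trivial "model" that always predicts the majority class scores high accuracy while never finding the minority class.

```python
from collections import Counter

# Hypothetical annotation labels: 95 "helmet" boxes and 5 "head" boxes.
labels = ["helmet"] * 95 + ["head"] * 5

# Class balance: how many annotations each class has, and its share of the total.
counts = Counter(labels)
total = sum(counts.values())
shares = {cls: n / total for cls, n in counts.items()}

# A baseline that always predicts the most common class.
majority_class, _ = counts.most_common(1)[0]
predictions = [majority_class] * total

# Accuracy: fraction of predictions that match the true label.
accuracy = sum(p == y for p, y in zip(predictions, labels)) / total

# Recall for the minority class: correctly predicted "head" / all true "head".
head_hits = sum(1 for p, y in zip(predictions, labels) if y == "head" and p == "head")
head_recall = head_hits / counts["head"]

print(counts, shares)
print(f"accuracy={accuracy:.2f}, head recall={head_recall:.2f}")
```

In this toy setup the baseline reaches 95% accuracy while its recall on "head" is zero, which is why per-class metrics are usually more informative than accuracy on imbalanced datasets.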

Advanced Health Check

Roboflow Pro users have access to additional health check features. Get a walkthrough of these additional features here.