Image Preprocessing
Image preprocessing steps to prepare data for models.
Preprocessing should be applied to your training, validation, and testing sets to ensure learning and inference occur on the same image properties. (Note: in computer vision, inference means generating predictions.)
For example, if your model learns on 640x640 images, it should run inference (generate predictions) on images of the same size.
Auto-Orient
Auto-orient strips your images of their EXIF data so that you see images displayed the same way they are stored on disk.
EXIF data determines the orientation of a given image. Applications (like Preview on Mac) use this data to display an image in a specific orientation, even if the orientation of how it is stored on disk differs. See this front page Hacker News discussion on how this may silently ruin your object detection models.
- Roboflow recommends defaulting to leaving this on and checking how your images are being fed to your model at inference.
- If you want to learn more about whether or not you should auto-orient your images, check out our blog.
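If you are preparing images outside Roboflow, the same idea can be reproduced with Pillow's exif_transpose, which bakes the EXIF Orientation tag into the pixel data. A minimal sketch (the file names are hypothetical):

```python
from PIL import Image, ImageOps

def auto_orient(path: str) -> Image.Image:
    """Apply the EXIF Orientation tag to the pixels, then drop the tag."""
    image = Image.open(path)
    # exif_transpose rotates/flips the pixel data according to the EXIF
    # Orientation tag and removes the tag from the returned copy, so the
    # image looks the same whether or not a viewer honors EXIF metadata.
    return ImageOps.exif_transpose(image)

oriented = auto_orient("photo.jpg")
oriented.save("photo_oriented.jpg")
```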
Resize
Resize changes your images' size and, optionally, scale, to a desired set of dimensions. Annotations are adjusted proportionally (except in the case of "fill" below).
Currently, we only support downsizing. We provide some guidance below for which resize option may be best for your use case; a code sketch of these strategies follows the list.
- Stretch to: Stretch your images to a preferred pixel dimension. Annotations are scaled proportionally. Images are square and distorted, but no source image data is lost.
- Fill (with center crop) in: The image is scaled so that it fills your desired output dimensions, then center-cropped. For example, if the source image is 2600x2080 and the resize option is set to 640x640, the output is the central 640x640 of the scaled image. The aspect ratio is maintained (no distortion), but source image data at the edges is lost.
- Fit within: The source image is scaled to fit within the output dimensions while maintaining its aspect ratio. For example, if a source image is 2600x2080 and the resize option is set to 640x640, the longer dimension (2600) is scaled to 640 and the shorter dimension (2080) is scaled to 512 pixels. Aspect ratio and original data are maintained, but the output is not square.
- Fit (reflect edges) in: The source image is scaled to fit within the output dimensions while maintaining its aspect ratio, and any newly created padding is a reflection of the source image. For example, if a source image is 2600x2080 and the resize option is set to 416x416, the longer dimension (2600) is scaled to 416 and the shorter dimension (2080) is scaled to ~332.8 pixels. The remaining pixel area (416 − 332.8, or ~83.2 pixels) is filled with reflected pixels of the source image. Notably, Roboflow also reflects annotations by default. Images are square and padded; aspect ratio and original data are maintained.
- Fit (black edges) in: The source image is scaled to fit within the output dimensions while maintaining its aspect ratio, and any newly created padding is black. For example, if a source image is 2600x2080 and the resize option is set to 416x416, the longer dimension (2600) is scaled to 416 and the shorter dimension (2080) is scaled to ~332.8 pixels. The remaining pixel area (416 − 332.8, or ~83.2 pixels) is black. Images are square and black-padded; aspect ratio and original data are maintained.
- Fit (white edges) in: Identical to the black-edges option, except the padding is white. Images are square and white-padded; aspect ratio and original data are maintained.
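As a rough illustration of how these strategies differ, here is a sketch using Pillow; the helper names are ours (not Roboflow's API), and the annotation-offset bookkeeping for padded modes is elided:

```python
from PIL import Image, ImageOps

def stretch_to(image: Image.Image, size=(640, 640)) -> Image.Image:
    # "Stretch to": distorts the aspect ratio, keeps every source pixel.
    return image.resize(size)

def fit_within(image: Image.Image, size=(640, 640)) -> Image.Image:
    # "Fit within": longer side scaled to the target, aspect ratio kept,
    # output not square (e.g. 2600x2080 -> 640x512).
    result = image.copy()
    result.thumbnail(size)  # resizes in place, preserving aspect ratio
    return result

def fit_black_edges(image: Image.Image, size=(416, 416)) -> Image.Image:
    # "Fit (black edges) in": scale to fit, then pad to square with black
    # (e.g. 2600x2080 -> 416x~333, plus ~83 rows of black padding).
    return ImageOps.pad(image, size, color="black")

def scale_box(box, src_size, dst_size):
    # Annotations scale by the same per-axis factors as the pixels.
    # box = (xmin, ymin, xmax, ymax); padded modes would also add the
    # padding offset, omitted here for brevity.
    sx, sy = dst_size[0] / src_size[0], dst_size[1] / src_size[1]
    return (box[0] * sx, box[1] * sy, box[2] * sx, box[3] * sy)
```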
Grayscale
Grayscale converts an image with RGB channels into an image with a single grayscale channel, which can save you memory. The value of each grayscale pixel is calculated as the weighted sum of the corresponding red, green, and blue pixels: Y = 0.2125 R + 0.7154 G + 0.0721 B.
These weights are used by CRT phosphors as they better represent human perception of red, green and blue than equal weights. (Via Scikit-Image.)
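Because the weights above are the ones scikit-image uses, rgb2gray reproduces this step exactly; a small sketch (the file name is hypothetical):

```python
import numpy as np
from skimage import io
from skimage.color import rgb2gray

rgb = io.imread("example.jpg")   # expects shape (H, W, 3), dtype uint8

# rgb2gray computes Y = 0.2125 R + 0.7154 G + 0.0721 B per pixel,
# returning floats in [0, 1].
gray = rgb2gray(rgb)

# The same weighted sum done by hand:
weights = np.array([0.2125, 0.7154, 0.0721])
gray_manual = (rgb / 255.0) @ weights
assert np.allclose(gray, gray_manual)
```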
Contrast
Contrast enhances an image with low contrast. We've explored whether you want to use contrast as a preprocessing step; the scikit-image equivalents of each option are sketched after the list below.
- Contrast Stretching: the image is rescaled to include all intensities that fall within the 2nd and 98th percentiles. See more.
- Histogram Equalization: “spreads out the most frequent intensity values” in an image. The equalized image has a roughly uniform distribution, where all colors of pixels are approximately equally represented. See more.
- Adaptive Equalization: Contrast Limited Adaptive Histogram Equalization (CLAHE), an algorithm for local contrast enhancement that uses histograms computed over different tile regions of the image. Local details can therefore be enhanced even in regions that are darker or lighter than most of the image. (Via Scikit-Image.)
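Since each description above cites Scikit-Image, the corresponding scikit-image calls make a reasonable reference; a minimal sketch:

```python
import numpy as np
from skimage import exposure

def contrast_stretch(img):
    # Contrast stretching: rescale to the 2nd-98th percentile range.
    p2, p98 = np.percentile(img, (2, 98))
    return exposure.rescale_intensity(img, in_range=(p2, p98))

def hist_equalize(img):
    # Histogram equalization: spreads out the most frequent intensities,
    # yielding a roughly uniform intensity distribution.
    return exposure.equalize_hist(img)

def adaptive_equalize(img, clip_limit=0.03):
    # CLAHE: equalizes histograms computed over local tiles of the image.
    return exposure.equalize_adapthist(img, clip_limit=clip_limit)
```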
These features are available for Public Workspaces and licensed Roboflow Growth and Enterprise accounts. Users on Sandbox Workspaces (for business use, creating initial proofs of concept) who need the advanced features should contact sales for access.
Modify Classes
Modify Classes is a preprocessing tool used to omit specific classes, or remap (rename) classes, when generating a new version of your dataset.

Omitting the "Apple leaf" class.

Remapping (relabeling) 3 classes to "Apple Leaf."
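Conceptually, omitting and remapping amount to a filter and a dictionary lookup over your labels. A sketch with hypothetical annotation records (real formats such as COCO or VOC store more fields, but the idea is the same):

```python
# Hypothetical annotation records.
annotations = [
    {"image": "img1.jpg", "class": "Apple leaf"},
    {"image": "img2.jpg", "class": "Apple rust leaf"},
    {"image": "img3.jpg", "class": "Apple Scab Leaf"},
]

OMIT = {"Apple leaf"}                        # classes to drop entirely
REMAP = {"Apple rust leaf": "Apple Leaf",    # old name -> new name
         "Apple Scab Leaf": "Apple Leaf"}

modified = [
    {**a, "class": REMAP.get(a["class"], a["class"])}
    for a in annotations
    if a["class"] not in OMIT
]
```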
Filter Null
Filter Null affects only images marked as null annotation, or left "unannotated" after applying the Modify Classes tool.

NOTE: Be sure that you have properly annotated ALL images within your dataset, designated appropriate images as null annotation, and/or omitted any unnecessary classes prior to using this tool.
- "Missing Annotations" occur when images are not annotated (leaving images unannotated will cause issues with the performance of your trained dataset, and can result in a failed training). Null annotations should only be applied when there is nothing present within that image that you wish for your model to detect.
Tile
Tiling can help when detecting small objects, especially in situations like aerial imagery and microscopy. The default setting is 2x2 tiling, but you can adjust this as you see fit; a sketch of the operation follows the preview below.

The tiling tool and a preview (depicted in a grid) of the output.
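A sketch of what 2x2 tiling does to the pixel data; the function and file name are hypothetical, and splitting/offsetting the annotations per tile is elided:

```python
from PIL import Image

def tile(image: Image.Image, rows: int = 2, cols: int = 2):
    """Split an image into rows x cols equal tiles (default 2x2).

    Small objects occupy relatively more pixels per tile, which can
    help detectors on aerial or microscopy imagery.
    """
    w, h = image.size
    tw, th = w // cols, h // rows
    return [
        image.crop((c * tw, r * th, (c + 1) * tw, (r + 1) * th))
        for r in range(rows)
        for c in range(cols)
    ]

tiles = tile(Image.open("aerial.jpg"))  # e.g. four 1300x1040 tiles from 2600x2080
```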

Static Crop
Static crop crops the same fixed region, specified as a percentage range of image width and height, out of every image.

The static crop feature, and an example output.
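A sketch of a percentage-based static crop, assuming the region is given as fractional bounds of the image size (the function and its defaults are hypothetical):

```python
from PIL import Image

def static_crop(image: Image.Image,
                x_min=0.1, x_max=0.9,
                y_min=0.1, y_max=0.9) -> Image.Image:
    # Crop the same fixed, percentage-based region out of every image.
    w, h = image.size
    return image.crop((int(w * x_min), int(h * y_min),
                       int(w * x_max), int(h * y_max)))
```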