Training Resolutions by Model Type

Training resolution affects model accuracy, inference speed, and training time. Each model architecture has a default resolution that balances these factors, and Roboflow suggests that default when you select an architecture.
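As a rough illustration of why resolution matters, the number of pixels in a square image grows with the square of its side length. The sketch below compares two of the default resolutions listed on this page. Treating pixel count as a proxy for compute cost is an assumption: actual training time and inference speed also depend on the architecture and hardware. The helper name `pixel_ratio` is hypothetical and not part of any Roboflow API.

```python
def pixel_ratio(side_a: int, side_b: int) -> float:
    """Return how many times more pixels a side_a x side_a image
    contains than a side_b x side_b image."""
    return (side_a * side_a) / (side_b * side_b)


# Compare a 640x640 input with a 384x384 input.
ratio = pixel_ratio(640, 384)
print(f"{ratio:.2f}")  # prints "2.78"
```

A 640×640 image carries about 2.78 times as many pixels as a 384×384 image, which is one reason smaller models are paired with smaller default resolutions.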

The tables below list the default training resolution for each model architecture and size. To override a default, configure the resize preprocessing step when you create a new Dataset Version.

Object Detection

| Model Type | Family & Size | Default Training Resolution |
| --- | --- | --- |
| Object Detection | RF-DETR Nano | 384×384 |
| Object Detection | RF-DETR Small | 512×512 |
| Object Detection | RF-DETR Medium | 576×576 |
| Object Detection | RF-DETR Large | 704×704 |
| Object Detection | RF-DETR X Large | 700×700 |
| Object Detection | RF-DETR 2X Large | 880×880 |
| Object Detection | Roboflow 3.0 - Fast | 640×640 |
| Object Detection | Roboflow 3.0 - Accurate | 640×640 |
| Object Detection | Roboflow 3.0 - Medium | 640×640 |
| Object Detection | Roboflow 3.0 - Large | 640×640 |
| Object Detection | Roboflow 3.0 - Extra Large | 640×640 |
| Object Detection | YOLOv26 (n/s/m/l/x) | 640×640 |
| Object Detection | YOLOv12 (n/s/m/l/x) | 640×640 |
| Object Detection | YOLOv11 (n/s/m/l/x) | 640×640 |
| Object Detection | YOLOv10 (n/s/m/b/l/x) | 640×640 |
| Object Detection | YOLOv9 (s/m/c/e) | 640×640 |
| Object Detection | YOLOv8 (n/s/m/l/x) | 640×640 |
| Object Detection | YOLOv5 (n/s/m/l/x) | 640×640 |
| Object Detection | YOLOv7 (legacy) | 640×640 |
| Object Detection | YOLO-NAS Small | 640×640 |
| Object Detection | YOLO-NAS Medium | 640×640 |
| Object Detection | Roboflow Instant | 1008×1008 |

Instance Segmentation

| Model Type | Family & Size | Default Training Resolution |
| --- | --- | --- |
| Instance Segmentation | RF-DETR Nano | 384×384 |
| Instance Segmentation | RF-DETR Small | 512×512 |
| Instance Segmentation | RF-DETR Medium | 576×576 |
| Instance Segmentation | RF-DETR Large | 704×704 |
| Instance Segmentation | RF-DETR X Large | 700×700 |
| Instance Segmentation | RF-DETR 2X Large | 880×880 |
| Instance Segmentation | Roboflow 3.0 - Fast (Seg) | 640×640 |
| Instance Segmentation | Roboflow 3.0 - Accurate (Seg) | 640×640 |
| Instance Segmentation | Roboflow 3.0 - Medium (Seg) | 640×640 |
| Instance Segmentation | Roboflow 3.0 - Large (Seg) | 640×640 |
| Instance Segmentation | Roboflow 3.0 - Extra Large (Seg) | 640×640 |
| Instance Segmentation | YOLO-seg (v8/10/11/12) | 640×640 |
| Instance Segmentation | SAM3 (Segment Anything 3) | 1008×1008 |
| Instance Segmentation | Semantic segmentation (DeepLabV3+) | ≥ 512×512 |

Classification & Pose

| Model Type | Family & Size | Default Training Resolution |
| --- | --- | --- |
| Classification & Pose | ResNet-18/34/50 | 224×224 |
| Classification & Pose | YOLO-cls (v8/11) | 224×224 |
| Classification & Pose | Vision Transformer (ViT) | 224×224 |
| Classification & Pose | YOLO-pose (keypoints) | 640×640 |

Multimodal/VLM

| Model Type | Family & Size | Default Training Resolution |
| --- | --- | --- |
| Multimodal/VLM | PaliGemma 2 - 3B | 448×448 |
| Multimodal/VLM | PaliGemma 2 - 10B/28B | 448×448 |
| Multimodal/VLM | Florence-2 | 448×448 |
| Multimodal/VLM | Qwen2.5-VL | 448×448 |
| Multimodal/VLM | SmolVLM2 | 384×384 |
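
If you choose resize settings in your own scripts, the defaults above can be kept in a small lookup table. The following is a minimal sketch that copies a few rows from the tables on this page. The dictionary `DEFAULT_SIDE` and the function `default_resolution` are hypothetical names for illustration and are not part of any Roboflow SDK.

```python
# Side length (in pixels) of the default square training resolution,
# copied from a subset of the rows in the tables on this page.
DEFAULT_SIDE = {
    "RF-DETR Nano": 384,
    "RF-DETR Small": 512,
    "RF-DETR Medium": 576,
    "RF-DETR Large": 704,
    "Roboflow 3.0 - Fast": 640,
    "YOLOv8 (n/s/m/l/x)": 640,
    "Roboflow Instant": 1008,
    "ResNet-18/34/50": 224,
    "Florence-2": 448,
    "SmolVLM2": 384,
}


def default_resolution(model: str) -> tuple[int, int]:
    """Return the default (width, height) for a model listed in DEFAULT_SIDE."""
    try:
        side = DEFAULT_SIDE[model]
    except KeyError:
        # Fail with a clear message instead of silently guessing a resolution.
        raise KeyError(f"No default resolution recorded for {model!r}") from None
    return (side, side)


print(default_resolution("RF-DETR Medium"))  # prints "(576, 576)"
```

Raising an error for unknown models is a deliberate choice here: falling back to a guessed resolution would hide typos in the model name.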
