Training Resolutions by Model Type
Training resolution affects model accuracy, inference speed, and training time. Each model architecture has a default resolution that balances these factors, and Roboflow suggests that default when you select a model architecture.
The tables below list the default training resolution for each model architecture and size, grouped by task type. To override a default, configure the resize preprocessing step when you create a new Dataset Version.
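As a rough illustration of overriding the default, the sketch below builds a settings dictionary containing only a resize step. The helper name `resize_settings` is hypothetical, and the dictionary layout (`preprocessing` and `augmentation` keys, a `resize` entry with `width`, `height`, and `format`) is an assumption based on the Roboflow Python package's `generate_version` examples, so check it against the current SDK documentation before relying on it.

```python
def resize_settings(width: int, height: int, mode: str = "Stretch to") -> dict:
    """Build a version-settings dict with a single resize preprocessing step.

    The key names follow the layout assumed for the Roboflow Python package;
    they are not confirmed by this page.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return {
        "preprocessing": {
            "resize": {"width": width, "height": height, "format": mode},
        },
        # Assumed to be required even when no augmentations are applied.
        "augmentation": {},
    }


# Example: train at 704x704 instead of the architecture's default.
settings = resize_settings(704, 704)

# Hypothetical usage (needs the roboflow package, an API key, and a real project):
# from roboflow import Roboflow
# project = Roboflow(api_key="YOUR_API_KEY").workspace().project("your-project-id")
# project.generate_version(settings)
```

The same step can be configured in the web interface when creating the Dataset Version; the code path is only useful if versions are generated programmatically.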
Object Detection
| Model | Default resolution |
| --- | --- |
| RF-DETR Nano | 384×384 |
| RF-DETR Small | 512×512 |
| RF-DETR Medium | 576×576 |
| RF-DETR Large | 704×704 |
| RF-DETR X Large | 700×700 |
| RF-DETR 2X Large | 880×880 |
| Roboflow 3.0 - Fast | 640×640 |
| Roboflow 3.0 - Accurate | 640×640 |
| Roboflow 3.0 - Medium | 640×640 |
| Roboflow 3.0 - Large | 640×640 |
| Roboflow 3.0 - Extra Large | 640×640 |
| YOLOv26 (n/s/m/l/x) | 640×640 |
| YOLOv12 (n/s/m/l/x) | 640×640 |
| YOLOv11 (n/s/m/l/x) | 640×640 |
| YOLOv10 (n/s/m/b/l/x) | 640×640 |
| YOLOv9 (s/m/c/e) | 640×640 |
| YOLOv8 (n/s/m/l/x) | 640×640 |
| YOLOv5 (n/s/m/l/x) | 640×640 |
| YOLOv7 (legacy) | 640×640 |
| YOLO-NAS Small | 640×640 |
| YOLO-NAS Medium | 640×640 |
| Roboflow Instant | 1008×1008 |
Instance Segmentation
| Model | Default resolution |
| --- | --- |
| RF-DETR Nano | 384×384 |
| RF-DETR Small | 512×512 |
| RF-DETR Medium | 576×576 |
| RF-DETR Large | 704×704 |
| RF-DETR X Large | 700×700 |
| RF-DETR 2X Large | 880×880 |
| Roboflow 3.0 - Fast (Seg) | 640×640 |
| Roboflow 3.0 - Accurate (Seg) | 640×640 |
| Roboflow 3.0 - Medium (Seg) | 640×640 |
| Roboflow 3.0 - Large (Seg) | 640×640 |
| Roboflow 3.0 - Extra Large (Seg) | 640×640 |
| YOLO-seg (v8/10/11/12) | 640×640 |
| SAM3 (Segment Anything 3) | 1008×1008 |
| Semantic segmentation (DeepLabV3+) | ≥ 512×512 |
Classification & Pose
| Model | Default resolution |
| --- | --- |
| ResNet-18/34/50 | 224×224 |
| YOLO-cls (v8/11) | 224×224 |
| Vision Transformer (ViT) | 224×224 |
| YOLO-pose (keypoints) | 640×640 |
Multimodal/VLM
| Model | Default resolution |
| --- | --- |
| PaliGemma 2 - 3B | 448×448 |
| PaliGemma 2 - 10B/28B | 448×448 |
| Florence-2 | 448×448 |
| Qwen2.5-VL | 448×448 |
| SmolVLM2 | 384×384 |
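To get a feel for how much these defaults differ, the sketch below stores a few of the values listed above in a lookup table and compares pixel counts between two models. The names `DEFAULT_RESOLUTION`, `default_resolution`, and `relative_pixels` are hypothetical, only a subset of models is included, and pixel count is only a rough proxy for training and inference cost, not a measured figure.

```python
# Side length in pixels of the square default resolution (subset of the tables above).
DEFAULT_RESOLUTION = {
    "RF-DETR Nano": 384,
    "RF-DETR Small": 512,
    "RF-DETR Medium": 576,
    "RF-DETR Large": 704,
    "Roboflow 3.0 - Fast": 640,
    "YOLOv8 (n/s/m/l/x)": 640,
    "Roboflow Instant": 1008,
    "ResNet-18/34/50": 224,
    "SmolVLM2": 384,
}


def default_resolution(model: str) -> tuple[int, int]:
    """Return (width, height) for a model, or raise KeyError if it is not listed."""
    side = DEFAULT_RESOLUTION[model]
    return side, side


def relative_pixels(model: str, baseline: str) -> float:
    """Ratio of pixels per image between two models' default resolutions."""
    w, h = default_resolution(model)
    bw, bh = default_resolution(baseline)
    return (w * h) / (bw * bh)


# RF-DETR Large processes about 3.4 times as many pixels per image as RF-DETR Nano.
ratio = relative_pixels("RF-DETR Large", "RF-DETR Nano")
print(f"{ratio:.2f}")
```

Because the pixel count grows with the square of the side length, a modest increase in resolution can raise per-image work considerably, which is worth weighing before overriding a default.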