Object Detection

Adding object detection datasets.

Uploading Object Detection Datasets

If you are trying to identify objects in images with bounding boxes, you will need an object detection dataset. Object detection datasets require images (or videos) and annotations.

Select your folder(s) of images/videos and annotations.

The easiest way to upload images or videos is to drag and drop the whole folder (including annotations, if you have them) directly into Roboflow. Roboflow supports many annotation formats, so in most cases dropping the files into the upload box (as shown in the .gif below) should work.

If your annotation format is not in the list below, or if the upload produces errors, contact us! We will help add your format to the list of supported formats.

Your upload has succeeded when the progress bar reaches the far right and your images appear with their bounding boxes drawn on them. If you did not include annotations in the upload, you can add them in Roboflow now or later.

Popular Annotation Formats

Roboflow supports a wide array of annotation formats. You can see the full list of supported formats here. Some popular formats include:

PASCAL VOC XML

PASCAL (Pattern Analysis, Statistical Modelling and Computational Learning) is a Network of Excellence funded by the European Union. From 2005 to 2012, PASCAL ran the Visual Object Classes (VOC) challenge, releasing object detection datasets and reporting benchmarks each year. (An aggregated PASCAL VOC dataset is available here.) Each image gets one XML annotation file, with one <object> element per bounding box:

<annotation>
    <folder>train</folder>
    <filename>01.jpg</filename>
    <path>/roboflow/data/train/01.png</path>
    <source>
        <database>Unknown</database>
    </source>
    <size>
        <width>224</width>
        <height>224</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>21</name>
        <pose>Frontal</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <occluded>0</occluded>
        <bndbox>
            <xmin>82</xmin>
            <xmax>172</xmax>
            <ymin>88</ymin>
            <ymax>146</ymax>
        </bndbox>
    </object>
</annotation>
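
As a rough illustration of how this format can be read, the sketch below parses a VOC-style annotation with Python's standard-library xml.etree.ElementTree. The function name parse_voc and the trimmed sample string are made up for this example and are not part of Roboflow.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the annotation above (illustrative only).
SAMPLE_XML = """
<annotation>
    <filename>01.jpg</filename>
    <size><width>224</width><height>224</height><depth>3</depth></size>
    <object>
        <name>21</name>
        <bndbox>
            <xmin>82</xmin><xmax>172</xmax><ymin>88</ymin><ymax>146</ymax>
        </bndbox>
    </object>
</annotation>
"""


def parse_voc(xml_text):
    """Return (filename, boxes) from a PASCAL VOC XML string.

    Hypothetical helper: each box is a dict with the class label and
    pixel corner coordinates.
    """
    root = ET.fromstring(xml_text.strip())
    boxes = []
    # One <object> element per labeled bounding box.
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        boxes.append({
            "label": obj.findtext("name"),
            "xmin": int(bndbox.findtext("xmin")),
            "ymin": int(bndbox.findtext("ymin")),
            "xmax": int(bndbox.findtext("xmax")),
            "ymax": int(bndbox.findtext("ymax")),
        })
    return root.findtext("filename"), boxes


filename, boxes = parse_voc(SAMPLE_XML)
print(filename, boxes)
```

Looking fields up by tag name rather than by position means the order of xmin, xmax, ymin, and ymax inside <bndbox> does not matter.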

COCO JSON

The Common Objects in Context (COCO) dataset was introduced in a 2014 paper from Microsoft. The dataset "contains photos of 91 objects types that would be easily recognizable by a 4 year old." It has a total of 2.5 million labeled instances across 328,000 images. Given the quantity and quality of the open-sourced data, COCO has become a standard dataset for benchmarking new models and demonstrating state-of-the-art performance. (The dataset is available here.)

See an example in our post on how to convert annotations to COCO JSON.
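
For orientation, COCO JSON keeps images, categories, and annotations in separate lists and stores each box as [x, y, width, height], where (x, y) is the top-left corner in pixels. The sketch below builds a minimal COCO-style dictionary from the box in the PASCAL VOC example above. The helper name corners_to_coco_bbox and the id values are assumptions chosen for illustration, and a real COCO file usually carries more fields.

```python
import json


def corners_to_coco_bbox(xmin, ymin, xmax, ymax):
    """Convert corner coordinates to COCO's [x, y, width, height]."""
    return [xmin, ymin, xmax - xmin, ymax - ymin]


# Box taken from the PASCAL VOC example: xmin=82, ymin=88, xmax=172, ymax=146.
bbox = corners_to_coco_bbox(82, 88, 172, 146)

# Minimal COCO-style structure; ids are arbitrary but must be consistent.
coco = {
    "images": [
        {"id": 1, "file_name": "01.jpg", "width": 224, "height": 224},
    ],
    "categories": [
        {"id": 1, "name": "21"},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,       # links to the entry in "images"
            "category_id": 1,    # links to the entry in "categories"
            "bbox": bbox,
            "area": bbox[2] * bbox[3],
            "iscrowd": 0,
        },
    ],
}

print(json.dumps(coco, indent=2))
```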

TensorFlow Object Detection .csv

TensorFlow object detection .csv files contain one bounding box per line.

filename,width,height,class,xmin,ymin,xmax,ymax
image_1.jpg,480,270,queen,173,24,260,137
image_1.jpg,480,270,queen,165,135,253,251
image_2.jpg,960,540,jack,255,96,337,208
image_2.jpg,960,540,jack,261,124,543,370
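
Because each row holds a single box, an image with several objects appears on several rows. The sketch below groups the example rows by filename using Python's standard csv module. It assumes the file starts with a header row naming the eight columns, and the function name boxes_by_image is hypothetical.

```python
import csv
import io
from collections import defaultdict

# Example rows from this section, assuming a header row is present.
CSV_TEXT = """filename,width,height,class,xmin,ymin,xmax,ymax
image_1.jpg,480,270,queen,173,24,260,137
image_1.jpg,480,270,queen,165,135,253,251
image_2.jpg,960,540,jack,255,96,337,208
image_2.jpg,960,540,jack,261,124,543,370
"""


def boxes_by_image(csv_text):
    """Map each filename to a list of (class, (xmin, ymin, xmax, ymax))."""
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        box = tuple(int(row[key]) for key in ("xmin", "ymin", "xmax", "ymax"))
        grouped[row["filename"]].append((row["class"], box))
    return dict(grouped)


grouped = boxes_by_image(CSV_TEXT)
for name, boxes in grouped.items():
    print(name, boxes)
```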

YOLO DarkNet .txt

A YOLO Darknet dataset contains one .txt file for each image, plus a label_map.txt (or labels.txt) that maps each numeric class ID to a class name.
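
In the usual YOLO Darknet layout, each line of an image's .txt file describes one box as "class_id x_center y_center width height", with the four coordinates divided by the image width or height so they fall between 0 and 1. The sketch below converts a pixel-coordinate box into such a line. The function name to_yolo_line, the six-decimal formatting, and the class ID 0 are assumptions made for illustration.

```python
def to_yolo_line(class_id, xmin, ymin, xmax, ymax, img_width, img_height):
    """Format one box as a YOLO Darknet line with normalized coordinates."""
    x_center = (xmin + xmax) / 2 / img_width
    y_center = (ymin + ymax) / 2 / img_height
    width = (xmax - xmin) / img_width
    height = (ymax - ymin) / img_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# First box from the TensorFlow .csv example: a 480x270 image with the
# box (173, 24) to (260, 137). Class ID 0 is an arbitrary choice here.
line = to_yolo_line(0, 173, 24, 260, 137, 480, 270)
print(line)
```

Normalizing by image size keeps the labels valid if the image is later resized without changing its aspect ratio.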

Supported Image Formats

Roboflow supports uploading images in several formats. The most common are JPG, PNG, and BMP.

If your image format is not supported, message us in our support chat or send us a request.

Supported Video Formats

Roboflow Pro supports ingesting video in H.264 format. It will prompt you to select a frame rate for extracting image captures from your video.