Object Detection
Adding object detection datasets.

Uploading Object Detection Datasets

If you are trying to identify objects in images with bounding boxes, you will need an object detection dataset. Object detection datasets require images (or videos) and annotations.
Select your folder(s) of images/videos and annotations.
The easiest way to upload images or videos is to drag the whole folder of images/videos (and annotations, if you have them) directly into Roboflow. Roboflow supports many annotation formats. In many cases, dragging the files and dropping them into the box (as shown in the .gif below) should work.
If your annotation format is not included in the list below, or if you run into errors, contact us. We will help add your annotation format to the list of supported formats.
You will know that your upload is successful when you see the progress bar move all the way to the right and you see images enclosed in bounding boxes. If you did not initially include annotations, you can add them in Roboflow now or later.

Popular Annotation Formats

Roboflow supports a wide array of annotation formats. You can see the full list of supported formats here. Some popular formats include:

PASCAL VOC XML

PASCAL (Pattern Analysis, Statistical Modelling and Computational Learning) is a Network of Excellence funded by the European Union. From 2005 to 2012, PASCAL ran the Visual Object Classes (VOC) Challenge, releasing object detection datasets and reporting benchmarks each year. (An aggregated PASCAL VOC dataset is available here.)
<annotation>
    <folder>train</folder>
    <filename>01.jpg</filename>
    <path>/roboflow/data/train/01.png</path>
    <source>
        <database>Unknown</database>
    </source>
    <size>
        <width>224</width>
        <height>224</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>21</name>
        <pose>Frontal</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <occluded>0</occluded>
        <bndbox>
            <xmin>82</xmin>
            <xmax>172</xmax>
            <ymin>88</ymin>
            <ymax>146</ymax>
        </bndbox>
    </object>
</annotation>
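
To check what a PASCAL VOC file contains before uploading it, the example annotation above can be read with Python's standard library. This is a minimal sketch, not part of Roboflow: the function name `parse_voc` and the returned dictionary keys are illustrative choices, and the XML is a trimmed copy of the example inlined so the snippet runs on its own.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example annotation, inlined so the sketch is self-contained.
VOC_XML = """<annotation>
    <filename>01.jpg</filename>
    <size>
        <width>224</width>
        <height>224</height>
        <depth>3</depth>
    </size>
    <object>
        <name>21</name>
        <bndbox>
            <xmin>82</xmin>
            <xmax>172</xmax>
            <ymin>88</ymin>
            <ymax>146</ymax>
        </bndbox>
    </object>
</annotation>"""


def parse_voc(xml_text):
    """Return (filename, (width, height), boxes) for one VOC annotation.

    `parse_voc` is a hypothetical helper name used only for this example.
    """
    root = ET.fromstring(xml_text)
    size = root.find("size")
    width = int(size.findtext("width"))
    height = int(size.findtext("height"))

    boxes = []
    # Each <object> element holds one labeled bounding box.
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        boxes.append({
            "label": obj.findtext("name"),
            "xmin": int(bndbox.findtext("xmin")),
            "ymin": int(bndbox.findtext("ymin")),
            "xmax": int(bndbox.findtext("xmax")),
            "ymax": int(bndbox.findtext("ymax")),
        })
    return root.findtext("filename"), (width, height), boxes


filename, image_size, boxes = parse_voc(VOC_XML)
print(filename, image_size, boxes)
```

Child elements are looked up by tag name rather than by position, so the order of `xmin`, `xmax`, `ymin`, and `ymax` inside `bndbox` does not matter.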

COCO JSON

The Common Objects in Context (COCO) dataset originated in a 2014 paper published by Microsoft. The dataset "contains photos of 91 objects types that would be easily recognizable by a 4 year old." There are a total of 2.5 million labeled instances across 328,000 images. Given the sheer quantity and quality of the open-sourced data, COCO has become a standard dataset for testing and proving state-of-the-art performance in new models. (The dataset is available here.)
See an example in our post on how to convert annotations to COCO JSON.
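
As a rough illustration of what a COCO JSON file looks like, the sketch below builds one for a single image. It assumes the common COCO detection layout: top-level `images`, `annotations`, and `categories` lists, with each `bbox` stored as `[x, y, width, height]` measured from the top-left corner. The helper names `corners_to_coco_bbox` and `build_coco` are hypothetical, and the sample values are borrowed from the PASCAL VOC example on this page.

```python
import json


def corners_to_coco_bbox(xmin, ymin, xmax, ymax):
    """Convert corner coordinates to a COCO-style [x, y, width, height] box."""
    return [xmin, ymin, xmax - xmin, ymax - ymin]


def build_coco(file_name, width, height, boxes, class_names):
    """Build a minimal COCO-style dict for one image (illustrative helper).

    `boxes` is a list of (label, xmin, ymin, xmax, ymax) tuples.
    """
    categories = [{"id": i, "name": name} for i, name in enumerate(class_names)]
    name_to_id = {c["name"]: c["id"] for c in categories}

    annotations = []
    for ann_id, (label, xmin, ymin, xmax, ymax) in enumerate(boxes):
        bbox = corners_to_coco_bbox(xmin, ymin, xmax, ymax)
        annotations.append({
            "id": ann_id,
            "image_id": 0,  # only one image in this sketch
            "category_id": name_to_id[label],
            "bbox": bbox,
            "area": bbox[2] * bbox[3],
            "iscrowd": 0,
        })

    return {
        "images": [{"id": 0, "file_name": file_name,
                    "width": width, "height": height}],
        "annotations": annotations,
        "categories": categories,
    }


coco = build_coco("01.jpg", 224, 224, [("21", 82, 88, 172, 146)], ["21"])
print(json.dumps(coco, indent=2))
```

The main difference from PASCAL VOC is that COCO stores a width and height instead of the bottom-right corner, so the box above becomes `[82, 88, 90, 58]`.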

TensorFlow Object Detection .csv

TensorFlow object detection .csv files contain one bounding box per line.
filename     width  height  class  xmin  ymin  xmax  ymax
image_1.jpg  480    270     queen  173   24    260   137
image_1.jpg  480    270     queen  165   135   253   251
image_2.jpg  960    540     jack   255   96    337   208
image_2.jpg  960    540     jack   261   124   543   370
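
Because each line holds a single box, an image with several objects appears on several lines. The sketch below groups the example rows above by image using Python's `csv` module. It assumes the file is comma-separated with a header row using the column names shown; the function name `load_boxes` is a hypothetical choice for this example.

```python
import csv
import io
from collections import defaultdict

# The example rows, written out as comma-separated text (an assumed layout).
CSV_TEXT = """filename,width,height,class,xmin,ymin,xmax,ymax
image_1.jpg,480,270,queen,173,24,260,137
image_1.jpg,480,270,queen,165,135,253,251
image_2.jpg,960,540,jack,255,96,337,208
image_2.jpg,960,540,jack,261,124,543,370
"""


def load_boxes(csv_file):
    """Group bounding boxes by image filename (illustrative helper)."""
    boxes_by_image = defaultdict(list)
    for row in csv.DictReader(csv_file):
        boxes_by_image[row["filename"]].append((
            row["class"],
            int(row["xmin"]),
            int(row["ymin"]),
            int(row["xmax"]),
            int(row["ymax"]),
        ))
    return dict(boxes_by_image)


boxes_by_image = load_boxes(io.StringIO(CSV_TEXT))
for name, boxes in boxes_by_image.items():
    print(name, boxes)
```

To read a file on disk instead, pass `open(path, newline="")` to `load_boxes` in place of the `io.StringIO` object.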

YOLO DarkNet .txt

This format uses one .txt file for each image, plus a label_map.txt (or labels.txt) file that maps each numeric class ID to a class name.
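
The sketch below shows how such a per-image .txt file might be turned back into pixel coordinates. It assumes the usual Darknet layout, where each line is `class_id x_center y_center width height` with the four box values normalized to the range 0 to 1, and it assumes the label map lists one class name per line with the line index acting as the class ID. The function name `parse_yolo_labels` and the sample values are made up for illustration.

```python
def parse_yolo_labels(label_text, class_names, img_width, img_height):
    """Convert Darknet-style label lines to (label, xmin, ymin, xmax, ymax).

    Assumes each line is: class_id x_center y_center width height,
    with the box values normalized by the image width and height.
    """
    boxes = []
    for line in label_text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        class_id = int(parts[0])
        x_center, y_center, width, height = (float(v) for v in parts[1:5])

        # Scale the normalized center/size values back to pixel corners.
        xmin = (x_center - width / 2) * img_width
        ymin = (y_center - height / 2) * img_height
        xmax = (x_center + width / 2) * img_width
        ymax = (y_center + height / 2) * img_height
        boxes.append((class_names[class_id],
                      round(xmin), round(ymin), round(xmax), round(ymax)))
    return boxes


# Hypothetical label map contents: line 0 is class 0, line 1 is class 1.
class_names = "queen\njack".splitlines()

# One hypothetical label line for a 640x480 image.
boxes = parse_yolo_labels("0 0.5 0.5 0.25 0.5\n", class_names, 640, 480)
print(boxes)  # [('queen', 240, 120, 400, 360)]
```

Unlike the other formats on this page, the label file does not record the image size, so the image dimensions have to be supplied separately.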

Supported Image Formats

Roboflow supports uploading images in several formats. The most common are JPG, PNG, and BMP.
If you don't see support for your image format, write to us in our support chat or send us a request.

Supported Video Formats

Roboflow Pro supports ingesting video in H.264 format. It will prompt you to select a frame rate for extracting image captures from your video.