Batch Processing Integration and Automation
Automate Batch Processing with APIs, webhooks, and cloud storage integration.
Batch Processing is well-suited for task automation, especially when processes need to run on a recurring basis. This page covers how to integrate Roboflow Batch Processing with external systems using APIs and webhooks.
A typical Batch Processing pipeline consists of:
Data Ingestion — Upload data to Data Staging (ephemeral storage for input and output data).
Processing — Run a Workflow against ingested data, producing CSV/JSONL results. This is usually followed by an export stage that creates compressed archives for convenient extraction.
Data Export — Download results from the output batch via download links.
All CLI commands have equivalent REST API endpoints. Below are the key API interactions.
Video Ingestion
```shell
curl -X POST "https://api.roboflow.com/data-staging/v1/external/{workspace}/batches/{batch_id}/upload/video" \
  -G \
  --data-urlencode "api_key=YOUR_API_KEY" \
  --data-urlencode "fileName=your_video.mp4"
```
The response contains "signedURLDetails" with:
"uploadURL" — the URL to PUT the video
"extensionHeaders" — additional headers to include
Upload the video:
Include all headers from the "extensionHeaders" response field.
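With the signed URL in hand, the upload itself is a plain HTTP PUT. A sketch with placeholder values; the real header names and values must be taken from the "extensionHeaders" field, not from this example:

```shell
# Values in angle brackets come from the signedURLDetails response.
curl -X PUT "<uploadURL>" \
  -H "<extensionHeader name>: <extensionHeader value>" \
  --data-binary @your_video.mp4
```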
Image Ingestion
Single Image Upload
Best for batches up to 5,000 images. Cannot be combined with bulk upload for the same batch.
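The single-image endpoint is not reproduced here. Assuming it mirrors the video endpoint (the `upload/image` path segment is an assumption, not a confirmed route), a request could look like:

```shell
curl -X POST "https://api.roboflow.com/data-staging/v1/external/{workspace}/batches/{batch_id}/upload/image" \
  -G \
  --data-urlencode "api_key=YOUR_API_KEY" \
  --data-urlencode "fileName=your_image.jpg"
```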
Bulk Image Upload
Recommended for batches exceeding 5,000 images. Bundle up to 500 images per *.tar archive.
Pack images into a *.tar archive according to the size and file-count limits returned by the API.
Upload the archive using the signed URL and extension headers from the response.
Bulk-upload batches cannot be mixed with single-image uploads for the same batch.
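The bundling step can be sketched locally. The demo below creates placeholder files so it is self-contained; point the glob at your real images instead:

```shell
# Bundle images into *.tar archives of at most 500 files each (bulk-upload limit).
# Placeholder files are created here purely for illustration.
mkdir -p images parts
for i in $(seq 1 1200); do : > "images/img_$i.jpg"; done

# Split the file list into chunks of 500, then tar each chunk.
ls images/*.jpg | split -l 500 - parts/list_
n=0
for list in parts/list_*; do
  tar -cf "parts/part-$n.tar" -T "$list"
  n=$((n + 1))
done
ls parts/part-*.tar
```

Each resulting archive is then uploaded with the signed URL and extension headers returned by the API, as described above.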
Check Batch Status
Before starting a job, verify that all data has been ingested:
To check shard upload details (paginated):
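Neither endpoint is shown above; the paths and parameters below are assumptions for illustration, patterned on the upload endpoint's URL style:

```shell
# Overall batch status (hypothetical path)
curl -G "https://api.roboflow.com/data-staging/v1/external/{workspace}/batches/{batch_id}" \
  --data-urlencode "api_key=YOUR_API_KEY"

# Shard upload details, paginated (hypothetical path and parameter names)
curl -G "https://api.roboflow.com/data-staging/v1/external/{workspace}/batches/{batch_id}/shards" \
  --data-urlencode "api_key=YOUR_API_KEY" \
  --data-urlencode "pageSize=50"
```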
Monitor Job Status
General job status:
List job stages:
List tasks for a stage (paginated):
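The monitoring endpoints are not reproduced here. As illustration only, with hypothetical `batch-processing` paths modeled on the data-staging URL style:

```shell
# All paths below are assumptions, not confirmed endpoints.

# General job status
curl -G "https://api.roboflow.com/batch-processing/v1/external/{workspace}/jobs/{job_id}" \
  --data-urlencode "api_key=YOUR_API_KEY"

# List job stages
curl -G "https://api.roboflow.com/batch-processing/v1/external/{workspace}/jobs/{job_id}/stages" \
  --data-urlencode "api_key=YOUR_API_KEY"

# List tasks for a stage (paginated)
curl -G "https://api.roboflow.com/batch-processing/v1/external/{workspace}/jobs/{job_id}/stages/{stage_id}/tasks" \
  --data-urlencode "api_key=YOUR_API_KEY" \
  --data-urlencode "pageSize=50"
```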
Export Results
List parts of an output batch:
List download URLs for a part (paginated):
Download a file:
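A hedged sketch of the export flow; the listing paths are hypothetical, while the final download uses whatever signed URL the listing actually returns:

```shell
# List parts of the output batch (hypothetical path)
curl -G "https://api.roboflow.com/data-staging/v1/external/{workspace}/batches/{output_batch}/parts" \
  --data-urlencode "api_key=YOUR_API_KEY"

# List download URLs for a part, paginated (hypothetical path)
curl -G "https://api.roboflow.com/data-staging/v1/external/{workspace}/batches/{output_batch}/parts/{part_name}/urls" \
  --data-urlencode "api_key=YOUR_API_KEY" \
  --data-urlencode "pageSize=50"

# Download one result file via its signed URL
curl -L -o results_part_0.tar.gz "<signed download URL>"
```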
Data Staging Batch Types
Simple batches (type: simple-batch) — Created when ingesting data one item at a time. Best for up to 5,000–10,000 items.
Sharded batches (type: sharded-batch) — Created via bulk ingestion (images only). Designed for millions of data points with automatic sharding.
Multipart batches (type: multipart-batch) — Created internally by the system. A logical grouping of sub-batches managed as one entity.
Webhook Automation
Instead of polling for status, you can use webhooks to get notified when ingestion or processing completes.
Data Ingestion Webhooks
The CLI commands create-batch-of-images and create-batch-of-videos support:
--notifications-url <webhook_url> — Webhook endpoint for notifications.
--notification-category <value> — Filter notifications:
ingest-status (default) — Overall ingestion process status.
files-status — Individual file processing status.
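A hedged invocation sketch. Only `create-batch-of-images` and the two notification flags are confirmed above; the command prefix and the remaining flags are assumptions:

```shell
inference rf-cloud data-staging create-batch-of-images \
  --batch-id my-ingest-batch \
  --data-dir ./images \
  --notifications-url https://example.com/webhooks/roboflow \
  --notification-category files-status
```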
Notifications are delivered via HTTP POST with an Authorization header containing your Roboflow Publishable Key.
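Your receiver should verify that header before trusting a delivery. During development you can simulate a delivery by hand; the payload shape below is illustrative, not the documented schema:

```shell
curl -X POST "https://example.com/webhooks/roboflow" \
  -H "Authorization: YOUR_PUBLISHABLE_KEY" \
  -H "Content-Type: application/json" \
  -d '{"category": "ingest-status", "status": "completed"}'
```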
Ingest Status Notification
File Status Notification
Job Completion Webhooks
Add --notifications-url when starting a job:
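For example (the command name and other flags are assumptions; only `--notifications-url` is confirmed above):

```shell
inference rf-cloud batch-processing process-images-with-workflow \
  --batch-id my-ingest-batch \
  --workflow-id my-workflow \
  --notifications-url https://example.com/webhooks/roboflow
```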
Job Completion Notification
Signed URL Ingestion
For advanced automation, you can ingest data via signed URLs instead of local files:
--data-source references-file — Process files referenced via signed URLs.
--references <path_or_url> — Path to a JSONL file containing file URLs, or a signed URL pointing to such a file.
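The layout of the references file is not reproduced here; a hypothetical JSONL sketch (field names assumed):

```
{"name": "image_001.jpg", "url": "https://storage.example.com/image_001.jpg?X-Amz-Signature=..."}
{"name": "image_002.jpg", "url": "https://storage.example.com/image_002.jpg?X-Amz-Signature=..."}
```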
Cloud Storage Authentication
AWS S3 and S3-Compatible Storage
Credentials are detected automatically from:
AWS credential files (~/.aws/credentials, ~/.aws/config)
IAM roles (EC2, ECS, Lambda)
Named profiles:
S3-compatible services (Cloudflare R2, MinIO, etc.):
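These are the standard AWS SDK environment variables; the signed-URL scripts are assumed to honor them:

```shell
# Named profile
export AWS_PROFILE=my-profile

# Or explicit credentials
export AWS_ACCESS_KEY_ID=YOUR_ACCESS_KEY_ID
export AWS_SECRET_ACCESS_KEY=YOUR_SECRET_ACCESS_KEY

# S3-compatible services (Cloudflare R2, MinIO, etc.): point the SDK at the endpoint
export AWS_ENDPOINT_URL="https://ACCOUNT_ID.r2.cloudflarestorage.com"
```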
Google Cloud Storage
Credentials are detected from:
Service account key file (recommended for automation):
User credentials from gcloud CLI (gcloud auth login)
GCP metadata service (when running on Google Cloud Platform)
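For the service-account route, the standard Application Default Credentials variable applies (assumed honored by the scripts):

```shell
# Point Google client libraries at a service account key file
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account-key.json
```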
Azure Blob Storage
SAS Token (recommended):
Account Key:
Generate a SAS token via Azure CLI:
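Sketches using the standard Azure environment variables and CLI; the scripts are assumed to read these variables, while the SAS generation command is standard Azure CLI:

```shell
# SAS token (recommended)
export AZURE_STORAGE_ACCOUNT=mystorageaccount
export AZURE_STORAGE_SAS_TOKEN="sv=...&sig=..."

# Or account key
export AZURE_STORAGE_KEY="YOUR_ACCOUNT_KEY"

# Generate a read-only SAS token for a container with the Azure CLI
az storage container generate-sas \
  --account-name mystorageaccount \
  --name mycontainer \
  --permissions r \
  --expiry 2030-01-01T00:00Z \
  --output tsv
```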
For advanced use cases, Roboflow provides reference scripts for generating signed-URL files.