# CLI Usage

Installing `inference-cli` gives you the `inference rf-cloud` command, which lets you interact with Batch Processing and Data Staging, the two core components of Roboflow Batch Processing.

## Setup

```bash
pip install inference-cli
export ROBOFLOW_API_KEY="YOUR-API-KEY-GOES-HERE"
```

For cloud storage support:

```bash
pip install 'inference-cli[cloud-storage]'
```

If you need help finding your API key, see our [authentication guide](https://docs.roboflow.com/api-reference/authentication).

## Ingest Data

### Images

```bash
inference rf-cloud data-staging create-batch-of-images \
  --images-dir <your-images-dir-path> \
  --batch-id <your-batch-id>
```

### Videos

```bash
inference rf-cloud data-staging create-batch-of-videos \
  --videos-dir <your-videos-dir-path> \
  --batch-id <your-batch-id>
```

{% hint style="info" %}
**Batch ID format:** Must be lowercase, at most 64 characters, with only letters, digits, hyphens (`-`), and underscores (`_`).
{% endhint %}
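
For example, a concrete ingest call with a conforming batch ID (the directory and ID here are illustrative):

```bash
# "product-photos-2024_v1" is lowercase and uses only letters,
# digits, hyphens, and underscores, so it is a valid batch ID.
inference rf-cloud data-staging create-batch-of-images \
  --images-dir ./product-photos \
  --batch-id product-photos-2024_v1
```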

### Cloud Storage

If your data is already in cloud storage (S3, Google Cloud Storage, or Azure), you can process it directly without downloading files locally.

**For images:**

```bash
inference rf-cloud data-staging create-batch-of-images \
  --data-source cloud-storage \
  --bucket-path <cloud-path> \
  --batch-id <your-batch-id>
```

**For videos:**

```bash
inference rf-cloud data-staging create-batch-of-videos \
  --data-source cloud-storage \
  --bucket-path <cloud-path> \
  --batch-id <your-batch-id>
```

The `--bucket-path` parameter supports:

* **S3**: `s3://bucket-name/path/`
* **Google Cloud Storage**: `gs://bucket-name/path/`
* **Azure Blob Storage**: `az://container-name/path/`

You can include glob patterns to filter files:

* `s3://my-bucket/training-data/**/*.jpg` — All JPG files recursively
* `gs://my-bucket/videos/2024-*/*.mp4` — MP4 files in 2024-\* folders
* `az://container/images/*.png` — PNG files in images folder
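
Putting these pieces together, an ingest of all JPG files under an S3 prefix might look like this (the bucket name and batch ID are examples):

```bash
# Quote the path so the shell does not expand the glob locally.
inference rf-cloud data-staging create-batch-of-images \
  --data-source cloud-storage \
  --bucket-path "s3://my-bucket/training-data/**/*.jpg" \
  --batch-id s3-training-images
```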

{% hint style="info" %}
Your cloud storage credentials are used **only locally** by the CLI to generate presigned URLs. They are **never uploaded** to Roboflow servers.
{% endhint %}

{% hint style="warning" %}
Generated presigned URLs are valid for 24 hours. Ensure your batch processing job completes within this timeframe.
{% endhint %}

For large datasets, the system automatically splits images into chunks of 20,000 files each. Video batches work best when kept under 1,000 files.

### Signed URL Ingestion

For advanced automation, you can ingest data via signed URLs instead of local files:

* `--data-source references-file` — Process files referenced via signed URLs.
* `--references <path_or_url>` — Path to a JSONL file containing file URLs, or a signed URL pointing to such a file.
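
Combining the two flags, an image ingest from a local references file might look like this (the JSONL path is an example):

```bash
inference rf-cloud data-staging create-batch-of-images \
  --data-source references-file \
  --references ./references.jsonl \
  --batch-id <your-batch-id>
```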

**Reference File Format (JSONL):**

```json
{"name": "<unique-file-name-1>", "url": "https://<signed-url>"}
{"name": "<unique-file-name-2>", "url": "https://<signed-url>"}
```

{% hint style="info" %}
Signed URL ingestion is available to Growth Plan and Enterprise customers.
{% endhint %}
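
If you want to build the references file yourself, here is a minimal sketch using the AWS CLI's `aws s3 presign` (the bucket and prefix are examples; `name` values must be unique within the batch, so adjust them if basenames collide). The reference scripts listed under Custom Scripts below cover the same task more robustly.

```bash
# Generate a 24-hour presigned URL for every object under a prefix
# and write one JSONL record per file (sketch; assumes the AWS CLI is
# configured and that object keys contain no whitespace).
BUCKET=my-bucket
PREFIX=training-data/
: > references.jsonl
aws s3 ls "s3://$BUCKET/$PREFIX" --recursive | awk '{print $4}' | while read -r key; do
  url=$(aws s3 presign "s3://$BUCKET/$key" --expires-in 86400)
  printf '{"name": "%s", "url": "%s"}\n' "$(basename "$key")" "$url" >> references.jsonl
done
```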

## Inspect Staged Data

```bash
inference rf-cloud data-staging show-batch-details --batch-id <your-batch-id>
```

## Start a Job

### Process Images

```bash
inference rf-cloud batch-processing process-images-with-workflow \
  --workflow-id <workflow-id> \
  --batch-id <batch-id> \
  --machine-type gpu
```

### Process Videos

```bash
inference rf-cloud batch-processing process-videos-with-workflow \
  --workflow-id <workflow-id> \
  --batch-id <batch-id> \
  --machine-type gpu \
  --max-video-fps <your-desired-fps>
```

{% hint style="info" %}
**Finding your Workflow ID:** Open the Workflow Editor in the Roboflow App, click "Deploy", and find the identifier in the code snippet.
{% endhint %}

{% hint style="info" %}
By default, processing runs on CPU. Use `--machine-type gpu` for Workflows with multiple or large models.
{% endhint %}

## Monitor Job Progress

The start command outputs a **Job ID**. Use it to check status:

```bash
inference rf-cloud batch-processing show-job-details --job-id <your-job-id>
```
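
To poll instead of re-running the command by hand, you can wrap it in `watch` (a sketch; inspect the printed status yourself, since the exact output format may vary by CLI version):

```bash
# Re-run the status check every 30 seconds until the job reaches
# a terminal state, then stop with Ctrl+C.
watch -n 30 "inference rf-cloud batch-processing show-job-details --job-id <your-job-id>"
```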

## Export Results

The job details will include the **output batch ID**. Use it to export results:

```bash
inference rf-cloud data-staging export-batch \
  --target-dir <dir-to-export-result> \
  --batch-id <output-batch-of-a-job>
```
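
The whole pipeline fits in a short script (a sketch; the batch ID and directories are examples, and the output batch ID must be read from `show-job-details` once the job finishes):

```bash
#!/usr/bin/env bash
set -euo pipefail
BATCH_ID=wildlife-photos-2024

# 1. Stage local images.
inference rf-cloud data-staging create-batch-of-images \
  --images-dir ./images --batch-id "$BATCH_ID"

# 2. Start processing on GPU.
inference rf-cloud batch-processing process-images-with-workflow \
  --workflow-id <workflow-id> --batch-id "$BATCH_ID" --machine-type gpu

# 3. Poll show-job-details until the job completes, note the output
#    batch ID it reports, then export the results.
inference rf-cloud data-staging export-batch \
  --target-dir ./results --batch-id <output-batch-of-a-job>
```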

## Webhook Automation

Instead of polling for status, you can use webhooks to get notified when ingestion or processing completes.

### Data Ingestion Webhooks

The CLI commands `create-batch-of-images` and `create-batch-of-videos` support:

* `--notifications-url <webhook_url>` — Webhook endpoint for notifications.
* `--notification-category <value>` — Filter notifications:
  * `ingest-status` (default) — Overall ingestion process status.
  * `files-status` — Individual file processing status.

Notifications are delivered via HTTP POST with an `Authorization` header containing your Roboflow Publishable Key.
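To check that your endpoint accepts such deliveries before going live, you can simulate one with `curl` (a sketch; the `Content-Type` header is an assumption, and the body here is an abbreviated stand-in for the full payloads shown below):

```bash
# Simulated delivery; replace the body with a full example payload.
curl -X POST "$WEBHOOK_URL" \
  -H "Authorization: <your-publishable-key>" \
  -H "Content-Type: application/json" \
  -d '{"type": "roboflow-data-staging-notification-v1", "delivery_attempt": 1}'
```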

#### Ingest Status Notification

```json
{
    "type": "roboflow-data-staging-notification-v1",
    "event_id": "8c20f970-fe10-41e1-9ef2-e057c63c07ff",
    "ingest_id": "8cd48813430f2be70b492db67e07cc86",
    "batch_id": "test-batch-117",
    "shard_id": null,
    "notification": {
        "type": "ingest-status-notification-v1",
        "success": false,
        "error_details": {
            "type": "unsafe-url-detected",
            "reason": "Untrusted domain found: https://example.com/image.png"
        }
    },
    "delivery_attempt": 1
}
```

#### File Status Notification

```json
{
    "type": "roboflow-data-staging-notification-v1",
    "event_id": "8f42708b-aeb7-4b73-9d83-cf18518b6d81",
    "ingest_id": "d5cb69aa-b2d1-4202-a1c1-0231f180bda9",
    "batch_id": "prod-batch-1",
    "shard_id": "0d40fa12-349e-439f-83f8-42b9b7987b33",
    "notification": {
        "type": "ingest-files-status-notification-v1",
        "success": true,
        "ingested_files": [
            "000000494869.jpg",
            "000000186042.jpg"
        ],
        "failed_files": [
            {
                "type": "file-size-limit-exceeded",
                "file_name": "big_image.png",
                "reason": "Max size of single image is 20971520B."
            }
        ],
        "content_truncated": false
    },
    "delivery_attempt": 1
}
```

### Job Completion Webhooks

Add `--notifications-url` when starting a job:

```bash
inference rf-cloud batch-processing process-images-with-workflow \
  --workflow-id <workflow-id> \
  --batch-id <batch-id> \
  --notifications-url <webhook_url>
```

#### Job Completion Notification

```json
{
  "type": "roboflow-batch-job-notification-v1",
  "event_id": "8f42708b-aeb7-4b73-9d83-cf18518b6d81",
  "job_id": "<your-batch-job-id>",
  "job_state": "success | fail",
  "delivery_attempt": 1
}
```

## Cloud Storage Authentication

### AWS S3 and S3-Compatible Storage

Credentials are detected automatically from:

1. **Environment variables:**

```bash
export AWS_ACCESS_KEY_ID=your-access-key-id
export AWS_SECRET_ACCESS_KEY=your-secret-access-key
export AWS_SESSION_TOKEN=your-session-token  # Optional
```

2. **AWS credential files** (`~/.aws/credentials`, `~/.aws/config`)
3. **IAM roles** (EC2, ECS, Lambda)

**Named profiles:**

```bash
export AWS_PROFILE=production
```

**S3-compatible services (Cloudflare R2, MinIO, etc.):**

```bash
export AWS_ENDPOINT_URL=https://account-id.r2.cloudflarestorage.com
export AWS_REGION=auto  # R2 requires region='auto'
export AWS_ACCESS_KEY_ID=your-r2-access-key
export AWS_SECRET_ACCESS_KEY=your-r2-secret-key
```

### Google Cloud Storage

Credentials are detected from:

1. **Service account key file** (recommended for automation):

```bash
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account-key.json
```

2. **User credentials** from gcloud CLI (`gcloud auth login`)
3. **GCP metadata service** (when running on Google Cloud Platform)

### Azure Blob Storage

**SAS Token (recommended):**

```bash
export AZURE_STORAGE_ACCOUNT_NAME=mystorageaccount
export AZURE_STORAGE_SAS_TOKEN="sv=2021-06-08&ss=b&srt=sco&sp=rl&se=2024-12-31"
```

**Account Key:**

```bash
export AZURE_STORAGE_ACCOUNT_NAME=mystorageaccount
export AZURE_STORAGE_ACCOUNT_KEY=your-account-key
```

Generate a SAS token via Azure CLI:

```bash
az storage container generate-sas \
  --account-name mystorageaccount \
  --name my-container \
  --permissions rl \
  --expiry 2024-12-31T23:59:59Z
```
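
To capture the generated token directly into the environment variable used above (a sketch; `--output tsv` strips the quotes the CLI would otherwise print):

```bash
export AZURE_STORAGE_SAS_TOKEN=$(az storage container generate-sas \
  --account-name mystorageaccount \
  --name my-container \
  --permissions rl \
  --expiry 2024-12-31T23:59:59Z \
  --output tsv)
```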

### Custom Scripts

For advanced use cases, the following reference scripts show how to generate signed-URL reference files:

* **AWS S3:** [generateS3SignedUrls.sh](https://raw.githubusercontent.com/roboflow/roboflow-python/main/scripts/generateS3SignedUrls.sh)
* **Google Cloud Storage:** [generateGCSSignedUrls.sh](https://github.com/roboflow/roboflow-python/blob/main/scripts/generateGCSSignedUrls.sh)
* **Azure Blob Storage:** [generateAzureSasUrls.sh](https://raw.githubusercontent.com/roboflow/roboflow-python/main/scripts/generateAzureSasUrls.sh)

## Discover All Options

```bash
inference rf-cloud --help
inference rf-cloud data-staging --help
inference rf-cloud batch-processing --help
```


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available on this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.roboflow.com/deploy/batch-processing/cli-usage.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present on the current page, when you need clarification or additional context, or when you want to retrieve related documentation sections.
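
For example, with `curl` (the question must be URL-encoded; `--get` plus `--data-urlencode` handles that):

```bash
curl --get "https://docs.roboflow.com/deploy/batch-processing/cli-usage.md" \
  --data-urlencode "ask=How do I retry a failed batch processing job?"
```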
