# CLI Usage

Installing `inference-cli` gives you the `inference rf-cloud` command, which lets you interact with Batch Processing and Data Staging, the core components of Roboflow Batch Processing.

## Setup

```bash
pip install inference-cli
export ROBOFLOW_API_KEY="YOUR-API-KEY-GOES-HERE"
```

For cloud storage support:

```bash
pip install 'inference-cli[cloud-storage]'
```

If you need help finding your API key, see our [authentication guide](https://docs.roboflow.com/api-reference/authentication).

## Ingest Data

### Images

```bash
inference rf-cloud data-staging create-batch-of-images \
  --images-dir <your-images-dir-path> \
  --batch-id <your-batch-id>
```

### Videos

```bash
inference rf-cloud data-staging create-batch-of-videos \
  --videos-dir <your-videos-dir-path> \
  --batch-id <your-batch-id>
```

{% hint style="info" %}
**Batch ID format:** Must be lowercase, at most 64 characters, with only letters, digits, hyphens (`-`), and underscores (`_`).
{% endhint %}
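The naming rules above can be checked locally before staging anything. A minimal sketch of such a check, using only the shell (this helper is illustrative and not part of `inference-cli`):

```shell
# Sketch: validate a candidate batch ID against the documented rules
# (lowercase letters, digits, hyphens, underscores; at most 64 chars).
is_valid_batch_id() {
  case "$1" in
    "" ) return 1 ;;                # empty name is not allowed
    *[!a-z0-9_-]* ) return 1 ;;     # any other character is rejected
  esac
  [ "${#1}" -le 64 ]                # enforce the length limit
}

is_valid_batch_id "prod-batch-1" && echo "ok"
```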

### Cloud Storage

If your data is already in cloud storage (S3, Google Cloud Storage, or Azure), you can process it directly without downloading files locally.

**For images:**

```bash
inference rf-cloud data-staging create-batch-of-images \
  --data-source cloud-storage \
  --bucket-path <cloud-path> \
  --batch-id <your-batch-id>
```

**For videos:**

```bash
inference rf-cloud data-staging create-batch-of-videos \
  --data-source cloud-storage \
  --bucket-path <cloud-path> \
  --batch-id <your-batch-id>
```

The `--bucket-path` parameter supports:

* **S3**: `s3://bucket-name/path/`
* **Google Cloud Storage**: `gs://bucket-name/path/`
* **Azure Blob Storage**: `az://container-name/path/`

You can include glob patterns to filter files:

* `s3://my-bucket/training-data/**/*.jpg` — All JPG files recursively
* `gs://my-bucket/videos/2024-*/*.mp4` — MP4 files in 2024-\* folders
* `az://container/images/*.png` — PNG files in images folder
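One way to sanity-check a pattern before pointing it at a real bucket is to mirror the expected layout locally and test the glob in bash. Note that bash's `globstar` matching is only an approximation of what the CLI does against cloud storage, and the directory names below are made up:

```shell
# Sketch: dry-run a glob against a local mirror of the bucket layout.
mkdir -p training-data/run-a training-data/run-b
: > training-data/run-a/cat.jpg
: > training-data/run-b/dog.jpg
: > training-data/run-a/notes.txt   # should not match

shopt -s globstar nullglob          # bash: let ** recurse into subdirs
matches=(training-data/**/*.jpg)
echo "${#matches[@]} files match"   # the .txt file is excluded
```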

{% hint style="info" %}
Your cloud storage credentials are used **only locally** by the CLI to generate presigned URLs. They are **never uploaded** to Roboflow servers.
{% endhint %}

{% hint style="warning" %}
Generated presigned URLs are valid for 24 hours. Ensure your batch processing job completes within this timeframe.
{% endhint %}

For large datasets, the system automatically splits images into chunks of 20,000 files each. Videos work best in batches of fewer than 1,000 files.
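For a video collection larger than that, one option is to pre-split the file list into sub-batches and stage each one separately. A sketch (the sample files are illustrative, and the chunk size is lowered to 2 so the demo stays small):

```shell
# Sketch: split a large video collection into lists of bounded size,
# one list per staged batch.
mkdir -p demo-videos
for i in 1 2 3 4 5; do : > "demo-videos/clip-$i.mp4"; done

find demo-videos -name '*.mp4' | sort > video-list.txt
split -l 2 video-list.txt batch-list-    # use -l 1000 for real datasets

ls batch-list-* | wc -l                  # number of sub-batches created
```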

### Signed URL Ingestion

For advanced automation, you can ingest data via signed URLs instead of local files:

* `--data-source references-file` — Process files referenced via signed URLs.
* `--references <path_or_url>` — Path to a JSONL file containing file URLs, or a signed URL pointing to such a file.

**Reference File Format (JSONL):**

```
{"name": "<unique-file-name-1>", "url": "https://<signed-url>"}
{"name": "<unique-file-name-2>", "url": "https://<signed-url>"}
```
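A references file in this format can be assembled from any list of names and pre-generated signed URLs. A minimal sketch (the names and URLs below are placeholders, not real signed URLs):

```shell
# Sketch: write a JSONL references file; each line is one standalone
# JSON object, and "name" values must be unique within the file.
cat > references.jsonl <<'EOF'
{"name": "image-001.jpg", "url": "https://example.com/signed/image-001.jpg?sig=abc"}
{"name": "image-002.jpg", "url": "https://example.com/signed/image-002.jpg?sig=def"}
EOF

grep -c '"name"' references.jsonl   # one entry per line
```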

{% hint style="info" %}
Signed URL ingestion is available to Growth Plan and Enterprise customers.
{% endhint %}

## Inspect Staged Data

```bash
inference rf-cloud data-staging show-batch-details --batch-id <your-batch-id>
```

## Start a Job

### Process Images

```bash
inference rf-cloud batch-processing process-images-with-workflow \
  --workflow-id <workflow-id> \
  --batch-id <batch-id> \
  --machine-type gpu
```

### Process Videos

```bash
inference rf-cloud batch-processing process-videos-with-workflow \
  --workflow-id <workflow-id> \
  --batch-id <batch-id> \
  --machine-type gpu \
  --max-video-fps <your-desired-fps>
```

{% hint style="info" %}
**Finding your Workflow ID:** Open the Workflow Editor in the Roboflow App, click "Deploy", and find the identifier in the code snippet.
{% endhint %}

{% hint style="info" %}
By default, processing runs on CPU. Use `--machine-type gpu` for Workflows with multiple or large models.
{% endhint %}

## Monitor Job Progress

The start command outputs a **Job ID**. Use it to check status:

```bash
inference rf-cloud batch-processing show-job-details --job-id <your-job-id>
```
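If you script around this command rather than using webhooks, a simple poll loop works. In the sketch below, `check_job` is a stub standing in for the real `show-job-details` call, so the loop logic itself is runnable as written:

```shell
# Sketch: poll until a job reaches a terminal state. check_job is a
# stub; in practice it would run
#   inference rf-cloud batch-processing show-job-details --job-id "$1"
# and extract the job state from the output.
check_job() { echo "success"; }

poll_until_done() {
  while :; do
    state=$(check_job "$1")
    case "$state" in
      success|fail) echo "job $1 finished: $state"; return ;;
    esac
    sleep 30   # avoid hammering the API between checks
  done
}

poll_until_done demo-job
```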

## Export Results

The job details will include the **output batch ID**. Use it to export results:

```bash
inference rf-cloud data-staging export-batch \
  --target-dir <dir-to-export-result> \
  --batch-id <output-batch-of-a-job>
```

## Webhook Automation

Instead of polling for status, you can use webhooks to get notified when ingestion or processing completes.

### Data Ingestion Webhooks

The CLI commands `create-batch-of-images` and `create-batch-of-videos` support:

* `--notifications-url <webhook_url>` — Webhook endpoint for notifications.
* `--notification-category <value>` — Filter notifications:
  * `ingest-status` (default) — Overall ingestion process status.
  * `files-status` — Individual file processing status.

Notifications are delivered via HTTP POST with an `Authorization` header containing your Roboflow Publishable Key.

#### Ingest Status Notification

```json
{
    "type": "roboflow-data-staging-notification-v1",
    "event_id": "8c20f970-fe10-41e1-9ef2-e057c63c07ff",
    "ingest_id": "8cd48813430f2be70b492db67e07cc86",
    "batch_id": "test-batch-117",
    "shard_id": null,
    "notification": {
        "type": "ingest-status-notification-v1",
        "success": false,
        "error_details": {
            "type": "unsafe-url-detected",
            "reason": "Untrusted domain found: https://example.com/image.png"
        }
    },
    "delivery_attempt": 1
}
```

#### File Status Notification

```json
{
    "type": "roboflow-data-staging-notification-v1",
    "event_id": "8f42708b-aeb7-4b73-9d83-cf18518b6d81",
    "ingest_id": "d5cb69aa-b2d1-4202-a1c1-0231f180bda9",
    "batch_id": "prod-batch-1",
    "shard_id": "0d40fa12-349e-439f-83f8-42b9b7987b33",
    "notification": {
        "type": "ingest-files-status-notification-v1",
        "success": true,
        "ingested_files": [
            "000000494869.jpg",
            "000000186042.jpg"
        ],
        "failed_files": [
            {
                "type": "file-size-limit-exceeded",
                "file_name": "big_image.png",
                "reason": "Max size of single image is 20971520B."
            }
        ],
        "content_truncated": false
    },
    "delivery_attempt": 1
}
```

### Job Completion Webhooks

Add `--notifications-url` when starting a job:

```bash
inference rf-cloud batch-processing process-images-with-workflow \
  --workflow-id <workflow-id> \
  --batch-id <batch-id> \
  --notifications-url <webhook_url>
```

#### Job Completion Notification

```json
{
  "type": "roboflow-batch-job-notification-v1",
  "event_id": "8f42708b-aeb7-4b73-9d83-cf18518b6d81",
  "job_id": "<your-batch-job-id>",
  "job_state": "success | fail",
  "delivery_attempt": 1
}
```

## Cloud Storage Authentication

### AWS S3 and S3-Compatible Storage

Credentials are detected automatically from:

1. **Environment variables:**

```bash
export AWS_ACCESS_KEY_ID=your-access-key-id
export AWS_SECRET_ACCESS_KEY=your-secret-access-key
export AWS_SESSION_TOKEN=your-session-token  # Optional
```

2. **AWS credential files** (`~/.aws/credentials`, `~/.aws/config`)
3. **IAM roles** (EC2, ECS, Lambda)

**Named profiles:**

```bash
export AWS_PROFILE=production
```

**S3-compatible services (Cloudflare R2, MinIO, etc.):**

```bash
export AWS_ENDPOINT_URL=https://account-id.r2.cloudflarestorage.com
export AWS_REGION=auto  # R2 requires region='auto'
export AWS_ACCESS_KEY_ID=your-r2-access-key
export AWS_SECRET_ACCESS_KEY=your-r2-secret-key
```

### Google Cloud Storage

Credentials are detected from:

1. **Service account key file** (recommended for automation):

```bash
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account-key.json
```

2. **User credentials** from gcloud CLI (`gcloud auth login`)
3. **GCP metadata service** (when running on Google Cloud Platform)

### Azure Blob Storage

**SAS Token (recommended):**

```bash
export AZURE_STORAGE_ACCOUNT_NAME=mystorageaccount
export AZURE_STORAGE_SAS_TOKEN="sv=2021-06-08&ss=b&srt=sco&sp=rl&se=2024-12-31"
```

**Account Key:**

```bash
export AZURE_STORAGE_ACCOUNT_NAME=mystorageaccount
export AZURE_STORAGE_ACCOUNT_KEY=your-account-key
```

Generate a SAS token via Azure CLI:

```bash
az storage container generate-sas \
  --account-name mystorageaccount \
  --name my-container \
  --permissions rl \
  --expiry 2024-12-31T23:59:59Z
```

### Custom Scripts

For advanced use cases, these reference scripts generate signed-URL files:

* **AWS S3:** [generateS3SignedUrls.sh](https://raw.githubusercontent.com/roboflow/roboflow-python/main/scripts/generateS3SignedUrls.sh)
* **Google Cloud Storage:** [generateGCSSignedUrls.sh](https://github.com/roboflow/roboflow-python/blob/main/scripts/generateGCSSignedUrls.sh)
* **Azure Blob Storage:** [generateAzureSasUrls.sh](https://raw.githubusercontent.com/roboflow/roboflow-python/main/scripts/generateAzureSasUrls.sh)

## Discover All Options

```bash
inference rf-cloud --help
inference rf-cloud data-staging --help
inference rf-cloud batch-processing --help
```
