# Watch a Folder for New Images

A common ingestion pattern is to drop new images into a directory (a synced cloud folder, a camera's NAS export, an SFTP target) and have them automatically appear in a Roboflow project.

This recipe shows three implementations: a Python script using `watchdog` (best when you're already running a Python service), a shell loop that pipes `inotifywait` (or `fswatch` on macOS) into the CLI, and a cron-driven poll (best when you can't run a long-lived process).

## Python: `watchdog` + SDK

```bash
pip install roboflow watchdog
```

```python
import time
from pathlib import Path

import roboflow
from watchdog.events import FileSystemEventHandler
from watchdog.observers import Observer

WATCH_DIR = Path("./incoming")
PROJECT_ID = "my-detector"
BATCH_NAME = "auto-ingest"

rf = roboflow.Roboflow()
project = rf.workspace().project(PROJECT_ID)


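class IngestHandler(FileSystemEventHandler):
    def on_created(self, event):
        if not event.is_directory:
            self._ingest(Path(event.src_path))

    def on_moved(self, event):
        # Sync tools often write to a temporary name and then rename into place,
        # which arrives as a move event rather than a create event.
        if not event.is_directory:
            self._ingest(Path(event.dest_path))

    def _ingest(self, path):
        if path.suffix.lower() not in {".jpg", ".jpeg", ".png"}:
            return
        # Heuristic: wait briefly so the file is fully written before we read it.
        time.sleep(1)
        try:
            project.upload_image(
                image_path=str(path),
                split="train",
                batch_name=BATCH_NAME,
                num_retry_uploads=3,
            )
            print(f"uploaded {path.name}")
        except Exception as e:
            # Log and keep watching; one bad file should not stop the observer.
            print(f"failed {path.name}: {e}")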


observer = Observer()
observer.schedule(IngestHandler(), str(WATCH_DIR), recursive=True)
observer.start()
try:
    while True:
        time.sleep(60)
except KeyboardInterrupt:
    observer.stop()
observer.join()
```

What it does:

* Watches `WATCH_DIR` for new `.jpg`/`.jpeg`/`.png` files.
* Uploads each new file to `PROJECT_ID` under the `auto-ingest` batch.
* Retries transient upload failures.
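
The fixed one-second sleep in the handler is a heuristic, and a large file arriving over a slow link may still be incomplete when it is read. One possible alternative, sketched below, is to poll the file size until it stops changing. `wait_until_stable` and its parameters are hypothetical names for illustration, not part of `watchdog` or the Roboflow SDK.

```python
import os
import time
from pathlib import Path


def wait_until_stable(
    path: Path,
    interval: float = 0.5,
    checks: int = 2,
    timeout: float = 30.0,
) -> bool:
    """Return True once the file is non-empty and its size has been
    unchanged for `checks` consecutive polls; False if `timeout` expires."""
    deadline = time.monotonic() + timeout
    last_size = -1
    stable = 0
    while time.monotonic() < deadline:
        try:
            size = os.path.getsize(path)
        except OSError:
            # The file may have been renamed or removed mid-write.
            size = -1
        if size > 0 and size == last_size:
            stable += 1
            if stable >= checks:
                return True
        else:
            stable = 0
        last_size = size
        time.sleep(interval)
    return False
```

In the handler, a call such as `if not wait_until_stable(path): return` could stand in for `time.sleep(1)`. A stable size is still only a signal; a writer that pauses for longer than `interval * checks` would defeat it.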

To productionize: run it under `systemd` or in a Docker container, and point logs at your observability stack. A local deduplication step is optional: Roboflow already deduplicates server-side by SHA-256 since `roboflow` 1.3.6, so a re-uploaded image is harmless.
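
If you still want a local deduplication step, for example to avoid re-sending bytes over a metered link, one approach is to hash each file and remember the digests you have already uploaded. The sketch below is an assumption about how such a step might look; `file_sha256`, `SeenHashes`, and the state-file layout are hypothetical, and it does not describe how Roboflow deduplicates internally.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large images are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


class SeenHashes:
    """Hypothetical helper: stores uploaded digests in a text file, one per line."""

    def __init__(self, state_file: Path):
        self.state_file = state_file
        self._seen = set()
        if state_file.exists():
            self._seen.update(state_file.read_text().split())

    def is_new(self, path: Path) -> bool:
        return file_sha256(path) not in self._seen

    def mark_uploaded(self, path: Path) -> None:
        # Call this only after the upload succeeded, so failures are retried.
        digest = file_sha256(path)
        if digest not in self._seen:
            self._seen.add(digest)
            with self.state_file.open("a") as f:
                f.write(digest + "\n")
```

A handler could check `seen.is_new(path)` before uploading and call `seen.mark_uploaded(path)` afterwards. Keying on content rather than file name means a renamed copy of the same image is skipped as well.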

## Shell: `inotifywait` + CLI

```bash
# Linux
sudo apt-get install inotify-tools
```

```bash
#!/usr/bin/env bash
set -euo pipefail

WATCH_DIR="./incoming"
PROJECT="my-detector"
BATCH="auto-ingest"

inotifywait -m -e close_write,moved_to --format '%w%f' "$WATCH_DIR" | while read -r FILE; do
  case "$FILE" in
    *.jpg|*.jpeg|*.png|*.JPG|*.JPEG|*.PNG)
      roboflow image upload "$FILE" -p "$PROJECT" --batch "$BATCH" --json \
        || echo "upload failed: $FILE" >&2
      ;;
  esac
done
```

For macOS, swap `inotifywait` for `fswatch`:

```bash
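# -I already runs one command per item, so -n1 is redundant
# (GNU xargs treats the two options as mutually exclusive).
fswatch -0 ./incoming | xargs -0 -I {} \
  roboflow image upload "{}" -p my-detector --batch auto-ingest --json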
```

## Periodic poll (no daemon)

If you can't run a long-lived process, poll on a cron schedule and upload anything that isn't yet recorded in a state file. The CLI's `--json` output and stable exit codes make this easy:

```bash
#!/usr/bin/env bash
set -euo pipefail

STATE=~/.roboflow-ingest-state
touch "$STATE"

# Match the same extensions as the other recipes, case-insensitively.
find ./incoming -type f \( -iname '*.jpg' -o -iname '*.jpeg' -o -iname '*.png' \) | while IFS= read -r FILE; do
  if ! grep -qxF "$FILE" "$STATE"; then
    if roboflow image upload "$FILE" -p my-detector --batch auto-ingest --json --quiet; then
      echo "$FILE" >> "$STATE"
    fi
  fi
done
```

Then add to `crontab`:

```
*/5 * * * * /usr/local/bin/ingest.sh >> /var/log/roboflow-ingest.log 2>&1
```
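
If the rest of your pipeline is in Python, the same state-file polling idea can be sketched without the shell. This is an illustrative version, not an official SDK feature: `poll_once` is a hypothetical name, and the `upload` argument is any callable you supply that returns `True` on success (for instance a thin wrapper around `project.upload_image(...)` from the script above).

```python
from pathlib import Path
from typing import Callable, List

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def poll_once(
    watch_dir: Path,
    state_file: Path,
    upload: Callable[[Path], bool],
) -> List[Path]:
    """Upload every image under watch_dir that is not yet listed in state_file.

    Returns the paths uploaded during this run.
    """
    state_file.touch(exist_ok=True)
    done = set(state_file.read_text().splitlines())
    uploaded = []
    for path in sorted(watch_dir.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        if str(path) in done:
            continue
        if upload(path):
            # Record the path only after a successful upload,
            # so failed files are retried on the next run.
            with state_file.open("a") as f:
                f.write(f"{path}\n")
            uploaded.append(path)
    return uploaded
```

Because the state file is appended to after each success, an interrupted run loses at most the upload that was in flight.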

## Going further

* Combine with [Active Learning](/developer/python-sdk/active-learning.md) to filter out unhelpful frames before they're uploaded.
* Use [Annotation Jobs](/developer/python-sdk/manage-annotation-workflow.md) to auto-assign new uploads to a labeler.
* Tag uploads (`tag_names=...`) so you can search them later; see [Search Images](/developer/command-line-interface/search-images.md).
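
As one way to choose tags for the `tag_names=...` suggestion above, the sketch below derives them from the subfolders a file was dropped into. The convention and the `tags_for` helper are hypothetical; adapt them to however your folders are organized.

```python
from pathlib import Path
from typing import List


def tags_for(path: Path, root: Path) -> List[str]:
    """Hypothetical convention: every folder between `root` and the file
    becomes a lowercase, hyphenated tag."""
    relative = path.relative_to(root)
    # parts[:-1] drops the file name and keeps only the folder names.
    return [part.lower().replace(" ", "-") for part in relative.parts[:-1]]
```

In the Python script, the result could then be passed as `tag_names=tags_for(path, WATCH_DIR)` in the `project.upload_image(...)` call.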


