Video Inference
Run computer vision models across video frames.
Our Hosted Video Inference method requires an internet connection and runs on stored video files. For edge inference on real-time video streams, please refer to our Inference documentation.
The Video Inference API is optimized for asynchronous video processing. It can run any model that Roboflow Inference implements (including foundation models like CLIP, custom fine-tuned models you train with Roboflow, and thousands of models shared by others on Roboflow Universe) and returns predictions for all or a subset of the frames in a recorded video.
Here are the steps you must follow to use the API and retrieve predictions:
Upload a video
Request inference on a model or list of models on the uploaded video
Poll until results are available
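The three steps above can be sketched as a small driver function. This is a minimal sketch and not the official client: `upload`, `request_inference`, and `fetch_results` are hypothetical callables that you would implement against the real API, and the polling interval and timeout defaults are arbitrary assumptions.

```python
import time
from typing import Any, Callable, Optional, Sequence


def run_video_inference(
    video_path: str,
    models: Sequence[str],
    upload: Callable[[str], str],
    request_inference: Callable[[str, Sequence[str]], str],
    fetch_results: Callable[[str], Optional[Any]],
    poll_interval_s: float = 5.0,
    timeout_s: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Upload a video, request inference, then poll until results exist.

    The three callables are hypothetical stand-ins for real API calls:
      upload(path)                      -> a reference to the uploaded video
      request_inference(ref, models)    -> a job identifier
      fetch_results(job_id)             -> None while pending, else predictions
    """
    # Step 1: upload the video.
    video_ref = upload(video_path)

    # Step 2: request inference for one or more models.
    job_id = request_inference(video_ref, models)

    # Step 3: poll until results are available or the timeout expires.
    deadline = time.monotonic() + timeout_s
    while True:
        results = fetch_results(job_id)
        if results is not None:
            return results
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} not finished after {timeout_s}s")
        sleep(poll_interval_s)
```

Passing the API calls in as functions keeps the polling logic independent of any particular endpoint or SDK, and lets you exercise it with fakes before wiring in real requests.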
Because it batches frames to use the GPU efficiently and tolerates higher latency, the Video Inference API can be up to 100x cheaper for processing stored (as opposed to real-time streaming) video than the image-based Roboflow Hosted Inference API.
The API output format is described in its own specification.
You can use the Video Inference API on the following model types:
Task Type | Supported by Hosted API |
---|---|
Classification | ✅ |
Instance Segmentation | ✅ |
Semantic Segmentation | ✅ |

Here are a few example use cases in which you can use the Video Inference API:
Video tagging
Video moderation (e.g. searching for violence or explicit scenes in media)
Finding and tagging brands or products
Extracting text from a video
Scene splitting and categorization
Object counting
Media search indexing
Identifying areas in which contextual ads can be placed in a video
And more.
Video inference currently supports the following video file extensions: mp4, MP4, avi, AVI, mkv, MKV, webm, WEBM.
Roboflow caches each uploaded video for one week so that you can re-run video inference on it without uploading the whole file again. After this one-week period, the video is permanently deleted.
Videos uploaded to Roboflow can never be downloaded. The upload feature exists solely to let the backend process the video for inference.
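The file-extension list and the one-week cache window described above lend themselves to a quick client-side check before uploading. The sketch below is illustrative only: the helper names are hypothetical, the extension set is copied literally from the list in this page (so a mixed-case extension such as `Mp4` is treated as unsupported, which is an assumption), and treating a video as expired at exactly one week is a conservative assumption.

```python
from datetime import datetime, timedelta
from pathlib import Path

# Extensions exactly as listed in this page; mixed-case variants are not listed.
SUPPORTED_EXTENSIONS = {"mp4", "MP4", "avi", "AVI", "mkv", "MKV", "webm", "WEBM"}

# Uploaded videos are cached for one week before being deleted.
CACHE_WINDOW = timedelta(weeks=1)


def is_supported_video(path: str) -> bool:
    """Return True if the file's extension is one of the listed extensions."""
    suffix = Path(path).suffix  # e.g. ".mp4", or "" when there is no extension
    return suffix[1:] in SUPPORTED_EXTENSIONS


def cache_expiry(uploaded_at: datetime) -> datetime:
    """Return the moment one week after the upload time."""
    return uploaded_at + CACHE_WINDOW


def needs_reupload(uploaded_at: datetime, now: datetime) -> bool:
    """Return True once the one-week cache window has elapsed (assumed inclusive)."""
    return now >= cache_expiry(uploaded_at)
```

Recording the upload time locally lets a script decide whether it can re-run inference on the cached video or must upload the file again.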