Llama Vision 3.2 Support in Workflows

Llama Vision 3.2, a multimodal LLM developed by Meta AI, can now be used in Roboflow Workflows.

You can use the model to ask questions about the contents of images and retrieve a text response.

For example, you could use the block to:

  1. Read the text in an image.

  2. Ask questions about the text in an image.

  3. Classify an image according to a specific prompt.

This response can then be returned by your Workflow, or processed further by other blocks (e.g., the Expression block).

Try it in Workflows today.

Note: The Llama Vision 3.2 block is configured to use OpenRouter for inference. You will need an OpenRouter API key to use the Llama Vision 3.2 block.
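To make the setup concrete, here is a minimal sketch of a Workflow definition that wires an image and an OpenRouter API key into the Llama Vision 3.2 block and returns its text response. The block type identifier, parameter names, and output selector shown here are illustrative assumptions, not a verified schema; check the block's documentation in the Workflows editor for the exact fields.

```python
# A sketch of a Workflow definition using the Llama Vision 3.2 block.
# The block type, parameter names, and selectors are assumptions for
# illustration; the real schema may differ.
workflow_definition = {
    "version": "1.0",
    "inputs": [
        {"type": "WorkflowImage", "name": "image"},
        # The OpenRouter API key required by the Llama Vision 3.2 block.
        {"type": "WorkflowParameter", "name": "openrouter_api_key"},
    ],
    "steps": [
        {
            # Assumed block identifier for the Llama Vision 3.2 block.
            "type": "roboflow_core/llama_3_2_vision@v1",
            "name": "llama",
            "images": "$inputs.image",
            # The prompt drives the use case: OCR, Q&A, or classification.
            "prompt": "Read the text in this image.",
            "api_key": "$inputs.openrouter_api_key",
        }
    ],
    "outputs": [
        # Expose the model's text response as a Workflow output.
        {"type": "JsonField", "name": "response", "selector": "$steps.llama.output"}
    ],
}
```

A definition like this could then be executed with your preferred Workflows runner (for example, the hosted API or the `inference` package), passing the image and your OpenRouter key as inputs.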
