> For the complete documentation index, see [llms.txt](https://docs.roboflow.com/llms.txt). Markdown versions of documentation pages are available by appending `.md` to page URLs; this page is available as [Markdown](https://docs.roboflow.com/deploy/supported-models/qwen3-5.md).

# Qwen3.5

Qwen3.5 is Alibaba's vision-language model family. It accepts an image and a text prompt and returns a text response. Two pretrained checkpoints are available:

| Alias | Parameters |
| --- | --- |
| `qwen3_5-0.8b` | 0.8B |
| `qwen3_5-2b` | 2B |

## Accuracy

Headline vision-language benchmarks (non-thinking mode) from the official model cards ([0.8B](https://huggingface.co/Qwen/Qwen3.5-0.8B), [2B](https://huggingface.co/Qwen/Qwen3.5-2B)):


| Benchmark | `qwen3_5-0.8b` | `qwen3_5-2b` |
| --- | --- | --- |
| MMMU | 47.4 | 64.2 |
| MathVista (mini) | 58.6 | 73.9 |
| MMBench (EN v1.1) | 68.0 | 81.3 |

## Inference speed

Latency measured with [Roboflow Inference](https://inference.roboflow.com/) on 1x NVIDIA L4, batch size 1, generating exactly 128 tokens with greedy decoding from a fixed prompt. Latency scales with output length, so use tokens/sec to estimate other lengths.

| Alias | Latency, 128 tokens (ms) | Tokens/sec |
| --- | --- | --- |
| `qwen3_5-0.8b` | 3307 | 39 |
| `qwen3_5-2b` | 3688 | 35 |
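The tokens/sec column can be turned into a rough latency estimate for other output lengths. The sketch below assumes latency grows linearly with the number of generated tokens and treats the table values as constants; the function name `estimate_latency_ms` is illustrative, and real latency will also depend on prompt length, image size, and hardware.

```python
# Throughput (tokens/sec) copied from the table above: 1x NVIDIA L4, batch size 1.
TOKENS_PER_SEC = {"qwen3_5-0.8b": 39, "qwen3_5-2b": 35}


def estimate_latency_ms(alias: str, output_tokens: int) -> float:
    """Rough generation latency in ms, assuming linear scaling with output length."""
    return output_tokens / TOKENS_PER_SEC[alias] * 1000


if __name__ == "__main__":
    for alias in TOKENS_PER_SEC:
        estimate = estimate_latency_ms(alias, 256)
        print(f"{alias}: ~{estimate:.0f} ms for 256 tokens")
```

For 128 tokens this gives roughly 3282 ms and 3657 ms, within about 1% of the measured 3307 ms and 3688 ms; the gap comes from the tokens/sec figures being rounded.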

## Using Qwen 3.5 VL via Workflows

Qwen 3.5 VL is available as a preconfigured [Workflow](/workflows/what-is-workflows.md) on the "Open-Source Models" tab of the Models page. Select "Qwen VL", choose a model variant and prompt, then click "Test API" to fork the Workflow into your Workspace and start running inference.

The Workflow uses the unified `qwen_vlm@v1` block, which supports multiple Qwen VL generations:

| Model | Parameters |
| --- | --- |
| Qwen 3.5 VL 0.8B | 0.8B |
| Qwen 3.5 VL 2B | 2B |
| Qwen 3 VL 2B | 2B |
| Qwen 2.5 VL 7B | 7B |

## Using Qwen 3.5 VL via Inference SDK

{% hint style="info" %}
Direct Inference SDK calls to Qwen3.5 require a [Dedicated Deployment](/deploy/dedicated-deployments.md) or [self-hosted Inference](https://inference.roboflow.com/). For hosted access, use the Workflow path described above.
{% endhint %}

{% stepper %}
{% step %}
### Get your API Key

Create a Roboflow account, find your key on the [Roboflow API settings page](https://app.roboflow.com/settings/api), and make it available to your shell:

```bash
export ROBOFLOW_API_KEY="your-key-here"
```
{% endstep %}

{% step %}
### Install the dependencies

Install the [Inference SDK](https://inference.roboflow.com/):

```bash
pip install inference-sdk
```
{% endstep %}

{% step %}
### Run the model

Set `api_url` to your Dedicated Deployment URL or a local Inference server.

```python
import os

import cv2
import numpy as np
import requests
from inference_sdk import InferenceHTTPClient

# Download the sample image and decode it into a BGR NumPy array.
content = requests.get("https://media.roboflow.com/quickstart/dog.jpeg").content
image = cv2.imdecode(np.frombuffer(content, np.uint8), cv2.IMREAD_COLOR)

# Point the client at your Dedicated Deployment or local Inference server.
client = InferenceHTTPClient(
    api_url="https://your-deployment.roboflow.cloud",
    api_key=os.environ["ROBOFLOW_API_KEY"],
)

# Send the image and prompt to the model.
result = client.infer_lmm(
    image,
    model_id="qwen3_5-2b",
    prompt="Describe this image briefly.",
    max_new_tokens=256,
)

print(result["response"])
```
{% endstep %}
{% endstepper %}

The code above prints the model response to the terminal:

```
A person wearing a white t-shirt and red shorts is carrying a black backpack on their shoulder, with a beagle dog perched on top of it. The scene takes place outdoors in a residential area, with modern apartment buildings in the background and greenery along the sidewalk. The person appears to be walking or standing near a building with large windows.
```

{% hint style="info" %}
Set `api_url` to match your deployment target:

* `http://localhost:9001` for a local [Inference](https://inference.roboflow.com/) server.
* Your [Dedicated Deployment](/deploy/dedicated-deployments.md) URL for a private endpoint.
{% endhint %}

You can train your own Qwen3.5 checkpoint on Roboflow and call it by its per-model `{workspace}/{model-slug}` ID (see [Versions, Trainings, and Models](/train/versions-trainings-and-models.md)).
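One way to switch between the two `api_url` targets without editing code is to read the URL from an environment variable. This is a minimal sketch, not part of the Inference SDK: the variable name `INFERENCE_API_URL` and the helper `resolve_api_url` are hypothetical, chosen only for illustration, and the fallback is the local server address from the hint.

```python
import os

# Default address of a local Inference server, as given in the hint.
LOCAL_INFERENCE_URL = "http://localhost:9001"


def resolve_api_url(env=None):
    """Return the URL to pass as `api_url`.

    Reads the hypothetical INFERENCE_API_URL variable (an assumption of this
    sketch) and falls back to the local Inference server when it is unset.
    """
    env = os.environ if env is None else env
    url = env.get("INFERENCE_API_URL", "").strip()
    # Drop any trailing slash so the value is consistent however it was typed.
    return url.rstrip("/") if url else LOCAL_INFERENCE_URL


if __name__ == "__main__":
    print(resolve_api_url())
```

The returned string can then be passed as `api_url` when constructing `InferenceHTTPClient`, so the same script runs against a local server by default and against a Dedicated Deployment when the variable is set.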