# (Legacy) Serverless Hosted API

{% hint style="info" %}
We **recommend** using V2 of the Serverless Hosted API, which is faster.\
\
[Refer to the Serverless Hosted API V2 documentation to get started with the new API.](https://docs.roboflow.com/deploy/serverless-hosted-api-v2)
{% endhint %}

## Model Support

The following model types are supported by the Serverless Hosted API (v1):

| Task Type                                                                                                        | Supported by Hosted API (v1) |
| ---------------------------------------------------------------------------------------------------------------- | ---------------------------- |
| [Object Detection](https://docs.roboflow.com/deploy/serverless/object-detection)                                 | ✅                            |
| [Classification](https://docs.roboflow.com/deploy/serverless/classification)                                     | ✅                            |
| [Instance Segmentation](https://docs.roboflow.com/deploy/serverless/instance-segmentation)                       | ✅                            |
| [Semantic Segmentation](https://docs.roboflow.com/deploy/serverless/instance-segmentation/semantic-segmentation) | ✅                            |
| [Keypoint Detection](https://docs.roboflow.com/deploy/serverless/keypoint-detection)                             | ✅                            |
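
To call the v1 API for any of these task types, you POST an image to the model's endpoint with your API key. Below is a minimal, standard-library-only sketch. `detect.roboflow.com` is the object-detection endpoint; other task types (e.g. classification) use different hosts, and `my-project`, the version number, and the API key are placeholders:

```python
import base64
import json
from urllib import parse, request

API_BASE = "https://detect.roboflow.com"  # v1 object-detection endpoint

def build_inference_url(model_id: str, version: int, api_key: str) -> str:
    """Construct the v1 Hosted API URL for a given model and version."""
    query = parse.urlencode({"api_key": api_key})
    return f"{API_BASE}/{model_id}/{version}?{query}"

def infer(image_path: str, model_id: str, version: int, api_key: str) -> dict:
    """POST a base64-encoded image and return the parsed JSON predictions."""
    with open(image_path, "rb") as f:
        payload = base64.b64encode(f.read())
    req = request.Request(
        build_inference_url(model_id, version, api_key),
        data=payload,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.load(resp)
```

The response is a JSON object whose exact shape depends on the task type (object detection returns a `predictions` list, for example).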

## Latency comparison (v1 vs v2)

The end-to-end latency of requests sent to the Serverless Hosted API depends on several factors:

1. Model architecture, which determines execution time.
2. Image size and resolution, which affect upload time and model inference time.
3. Network latency and bandwidth, which affect request upload and response download times.
4. Your service subscription and concurrent usage by other users, which can introduce queueing latency.
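
You can measure these effects empirically. Below is a minimal, standard-library sketch of a timing helper that samples end-to-end latency for any callable; wrap your own request function in it to collect statistics like those in the table further down:

```python
import statistics
import time
from typing import Callable

def measure_latency(fn: Callable[[], object], trials: int = 10) -> dict:
    """Call fn repeatedly and report wall-clock latency statistics in ms."""
    samples = []
    for _ in range(trials):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "mean_ms": statistics.mean(samples),
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
    }
```

For network calls, the measured end-to-end figure will include upload, queueing, execution, and download time together; comparing it against the `time` field reported in the API response (where available) isolates the non-execution overhead.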

<figure><img src="https://662926385-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-M6S9nPJhEX9FYH6clfW%2Fuploads%2FqMum7HkzyWoxLnOpbVGx%2Fserverless-img.png?alt=media&#x26;token=e075c90c-32a9-4691-9afc-54ec9831251b" alt=""><figcaption></figcaption></figure>

We show some representative benchmarks of the v1 vs. v2 Serverless Hosted API in the table below, covering both end-to-end latency (E2E) and execution time (Exec). These numbers are for information only; we encourage users to perform their own benchmarks using [our inference benchmark tools](https://inference.roboflow.com/inference_helpers/cli_commands/benchmark/) or custom benchmarks of their own.

<table><thead><tr><th width="176.14410400390625">Model</th><th>V2 (E2E)</th><th>V2 (Exec)</th><th>V1 (E2E)</th><th>V1 (Exec)</th></tr></thead><tbody><tr><td>yolov8x-640</td><td>401 ms</td><td>29 ms</td><td>4084 ms</td><td>821 ms</td></tr><tr><td>yolov8m-640</td><td>757 ms</td><td>21 ms</td><td>572 ms</td><td>265 ms</td></tr><tr><td>yolov8n-640</td><td>384 ms</td><td>17 ms</td><td>312 ms</td><td>63 ms</td></tr><tr><td>yolov8x-1280</td><td>483 ms</td><td>97 ms</td><td>6431 ms</td><td>3032 ms</td></tr><tr><td>yolov8m-1280</td><td>416 ms</td><td>52 ms</td><td>1841 ms</td><td>1006 ms</td></tr><tr><td>yolov8n-1280</td><td>428 ms</td><td>35 ms</td><td>464 ms</td><td>157 ms</td></tr></tbody></table>

We encourage users to run their own benchmarks for their model inferences and workflows to get real metrics for their specific use cases.
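
As a starting point, the `inference` CLI linked above includes an API speed benchmark. The invocation below is illustrative; the model ID is a placeholder, and the available flags may differ between versions, so check `inference benchmark api-speed --help` for your installed release:

```shell
pip install inference-cli

# -m: model ID/version, -h: target host, -a: your API key (all placeholders)
inference benchmark api-speed \
  -m your-project/1 \
  -h https://detect.roboflow.com \
  -a "$ROBOFLOW_API_KEY"
```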

## Limits

The Serverless Hosted API (v1), regardless of the specific task type, accepts requests up to 5MB. This limit applies to the total request payload, including the image file and any attached request data.

{% hint style="info" %}
If a request exceeds the limit, we recommend downsizing any attached images before sending them. This usually does not hurt performance: after images are received on our servers, they are downsized anyway to the input size the model architecture accepts.\
\
Some of our SDKs, such as the Python SDK, automatically downsize images to the model architecture's input size before sending them to the API.
{% endhint %}
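
If you are building the request yourself rather than using an SDK, a hedged sketch of client-side downsizing with Pillow follows. The 640 px target is illustrative; match `max_side` to your model's actual input resolution:

```python
from io import BytesIO

from PIL import Image

def downsize_for_upload(image_path: str, max_side: int = 640) -> bytes:
    """Shrink an image so its longest side is at most max_side; return JPEG bytes.

    640 px is an illustrative default; set max_side to your model's
    actual input resolution.
    """
    img = Image.open(image_path)
    img.thumbnail((max_side, max_side))  # in-place resize, preserves aspect ratio
    buf = BytesIO()
    img.convert("RGB").save(buf, format="JPEG", quality=90)
    return buf.getvalue()
```

Sending these bytes instead of the original file keeps the payload comfortably under the 5MB limit for most photos.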
