Pricing

Serverless Hosted API Pricing

The roboflow.com/credits page states that 1 credit equals 500 seconds of inference time. The more precise formula is:

If the x-remote-processing-time header is set:
   credits = (100ms + x-remote-processing-time) / 500,000ms
Otherwise:
   credits = max(x-processing-time, 100ms) / 500,000ms

Here x-processing-time and x-remote-processing-time are HTTP response headers, reported as floats in seconds, so multiply them by 1000 before applying the formula, which is in milliseconds. See roboflow.com/pricing for credit pricing.
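The formula above can be sketched as a small Python helper. This is a minimal illustration, not an official Roboflow utility: the function name `estimate_credits` and the constant names are hypothetical, and the 0.25 s input at the end is an invented value used only to exercise the function.

```python
MS_PER_CREDIT = 500_000   # 1 credit = 500 seconds = 500,000 ms
MINIMUM_MS = 100          # the 100 ms term in both branches of the formula


def estimate_credits(headers):
    """Estimate credits for one response from its timing headers (in seconds)."""
    # HTTP header names are case-insensitive, so normalise them first.
    normalised = {name.lower(): value for name, value in headers.items()}

    remote_s = normalised.get("x-remote-processing-time")
    if remote_s is not None:
        # Header is set: credits = (100ms + x-remote-processing-time) / 500,000ms
        return (MINIMUM_MS + float(remote_s) * 1000) / MS_PER_CREDIT

    # Otherwise: credits = max(x-processing-time, 100ms) / 500,000ms
    processing_ms = float(normalised["x-processing-time"]) * 1000
    return max(processing_ms, MINIMUM_MS) / MS_PER_CREDIT


# Illustrative input only: a 0.25 s request with no remote processing.
print(estimate_credits({"x-processing-time": "0.25"}))
```

The header values are multiplied by 1000 because they arrive in seconds while the formula is stated in milliseconds.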

Model Inference

In the example below, we run inference on the coco/39 model (RF-DETR Small, 560x560). The response headers include x-processing-time, which is 81 ms here. Because that is below the 100 ms minimum, credits = max(81, 100) / 500,000 = 0.0002 credits per request, or 0.2 credits per 1000 images.

curl -X POST "https://serverless.roboflow.com/coco/39?api_key=API_KEY&image=https://media.roboflow.com/notebooks/examples/dog.jpeg" -I
HTTP/2 200 
content-type: application/json
content-length: 995
x-model-cold-start: false
x-model-id: coco/39
x-processing-time: 0.08100700378417969
x-workspace-id: my-workspace-id
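To reproduce the arithmetic from a saved response, the header block above can be parsed directly. The following is a minimal sketch, assuming the headers were captured as plain text; `parse_headers` is a hypothetical helper written for this example, not part of any Roboflow SDK.

```python
RAW_RESPONSE = """\
HTTP/2 200
content-type: application/json
content-length: 995
x-model-cold-start: false
x-model-id: coco/39
x-processing-time: 0.08100700378417969
x-workspace-id: my-workspace-id
"""


def parse_headers(raw):
    """Turn raw 'name: value' header lines into a dict with lowercase names."""
    headers = {}
    for line in raw.splitlines():
        name, sep, value = line.partition(":")
        if sep:  # skips the status line and blank lines, which have no colon
            headers[name.strip().lower()] = value.strip()
    return headers


headers = parse_headers(RAW_RESPONSE)
processing_ms = float(headers["x-processing-time"]) * 1000  # seconds -> ms
billed_ms = max(processing_ms, 100)  # the 100 ms minimum applies here
credits = billed_ms / 500_000

print(f"measured: {processing_ms:.0f} ms, billed: {billed_ms:.0f} ms")
print(f"credits per request: {credits:.4f}")
print(f"credits per 1000 images: {credits * 1000:.1f}")
```

The measured 81 ms is rounded up to the 100 ms minimum, which gives the 0.0002 credits per request quoted above.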

Cold start

If you run the same request 10 minutes later, the model may have been unloaded and need to be loaded onto the GPU again: a cold start. Model loading can take up to a few seconds, and it depends heavily on the delay between inferences.

curl -X POST "https://serverless.roboflow.com/coco/39?api_key=API_KEY&image=https://media.roboflow.com/notebooks/examples/dog.jpeg" -I
HTTP/2 200 
content-type: application/json
content-length: 995
x-model-cold-start: true
x-model-id: coco/39
x-model-load-details: [{"m": "coco/39", "t": 0.7791134570725262}]
x-model-load-time: 0.5791134570725262
x-processing-time: 1.1060344696044922
x-workspace-id: my-workspace-id

Formula: credits = max(1106, 100) / 500,000 = 0.0022 credits, or 2.2 credits per 1000 (cold start) images.
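The gap between warm and cold requests can be sketched in Python using the two x-processing-time values shown on this page. The `blended_per_1000` helper and the 1% cold-start rate are assumptions added for illustration; the page itself gives no cold-start frequency.

```python
def credits_for(processing_time_s):
    """credits = max(x-processing-time, 100ms) / 500,000ms, input in seconds."""
    return max(processing_time_s * 1000, 100) / 500_000


warm = credits_for(0.08100700378417969)  # x-processing-time of the warm request
cold = credits_for(1.1060344696044922)   # x-processing-time of the cold start


def blended_per_1000(cold_fraction):
    """Credits per 1000 images if `cold_fraction` of requests are cold starts."""
    return 1000 * ((1 - cold_fraction) * warm + cold_fraction * cold)


print(f"warm: {warm * 1000:.2f} credits per 1000 images")
print(f"cold: {cold * 1000:.2f} credits per 1000 images")
# Hypothetical traffic pattern: 1% of requests hit a cold start.
print(f"1% cold starts: {blended_per_1000(0.01):.2f} credits per 1000 images")
```

With these sample timings a cold request costs about eleven times as much as a warm one, so the share of cold starts drives the overall bill.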

Workflow run

For Workflows, we separate model inference from general Workflow processing. This means the Workflow itself executes on (cheaper) CPU-only machines and uses GPU machines only for model inference, which makes processing more cost-effective.

curl --location 'https://serverless.roboflow.com/my-workspace-id/workflows/lpr-workflow' -i \
--header 'Content-Type: application/json' \
--data '{
    "api_key": "API_KEY",
    "inputs": {
        "image": {"type": "url", "value": "https://storage.googleapis.com/com-roboflow-marketing/docs/cars-highway.png"}
    }
}'

HTTP/2 200 
content-type: application/json
content-length: 2277416
x-model-cold-start: false
x-processing-time: 6.334797143936157
x-remote-processing-time: 1.0542614459991455
x-remote-processing-times: [{"m": "vehicle-detection-bz0yu/4", "t": 1.0091230869293213}, {"m": "license-plate-w8chc/1", "t": 0.017786026000976562}, {"m": "license-plate-w8chc/1", "t": 0.01506495475769043}, {"m": "license-plate-w8chc/1", "t": 0.012287378311157227}]
x-workspace-id: my-workspace-id

Formula: credits = (100ms + 1054ms) / 500,000 = 0.0023 credits for processing, plus a small additional amount for the Gemini API call (which depends on the token count; see roboflow.com/credits).
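The Workflow calculation can be checked against the headers above with a short Python sketch. It reads the x-remote-processing-times JSON to show where the remote time was spent. In this sample the per-model entries add up to x-remote-processing-time; that this holds for every response is an assumption, not something the page states. The Gemini API charge is not included.

```python
import json
from collections import defaultdict

# Timing headers copied from the Workflow response above (values in seconds).
headers = {
    "x-processing-time": "6.334797143936157",
    "x-remote-processing-time": "1.0542614459991455",
    "x-remote-processing-times": (
        '[{"m": "vehicle-detection-bz0yu/4", "t": 1.0091230869293213}, '
        '{"m": "license-plate-w8chc/1", "t": 0.017786026000976562}, '
        '{"m": "license-plate-w8chc/1", "t": 0.01506495475769043}, '
        '{"m": "license-plate-w8chc/1", "t": 0.012287378311157227}]'
    ),
}

# Group the per-call timings by model id ("m") and sum their durations ("t").
per_model_s = defaultdict(float)
for entry in json.loads(headers["x-remote-processing-times"]):
    per_model_s[entry["m"]] += entry["t"]

# x-remote-processing-time is set, so the first branch of the formula applies.
remote_s = float(headers["x-remote-processing-time"])
credits = (100 + remote_s * 1000) / 500_000

for model, seconds in per_model_s.items():
    print(f"{model}: {seconds * 1000:.0f} ms")
print(f"sum of entries: {sum(per_model_s.values()) * 1000:.0f} ms")
print(f"credits: {credits:.4f}")
```

Because x-remote-processing-time is present, the 6.33 s x-processing-time of the whole Workflow does not enter the formula; only the roughly 1.05 s of remote model time plus the fixed 100 ms does.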
