SAM3

Use Meta's SAM3 model via the Serverless Hosted API.

We support inference with Meta's Segment Anything Model 3 via the Serverless Hosted API. Two SAM3 endpoints are available: PCS (promptable concept segmentation) and PVS (promptable visual segmentation).

Code samples

PCS code sample

Below is a code sample for SAM3 inference using the PCS endpoint. You must pass your Roboflow API key in the API_KEY environment variable.

import os
import requests
import base64
import cv2
import numpy as np

# From "https://media.roboflow.com/notebooks/examples/dog.jpeg"
image = cv2.imread("./dog.jpeg")

# Encode image as base64
_, buffer = cv2.imencode('.jpg', image)
image_base64 = base64.b64encode(buffer).decode('utf-8')

payload = {
    "image": { "type": "base64", "value": image_base64 },
    "prompts": [
        { "type": "text", "text": "person" },
        { "type": "text", "text": "dog" },
    ],
    "output_prob_thresh": 0.5,
    "format": "polygon",
}

url = "https://serverless.roboflow.com/sam3/concept_segment?api_key=" + os.getenv("API_KEY")
response = requests.post(url, json=payload)
data = response.json()

for key in data:
    print(key)  # Should be prompt_results and time

PVS code sample

See the GitHub Gist for an interactive demo built with OpenCV (the one used in this video).

Endpoints

SAM3 PCS (promptable concept segmentation)


Concept Segmentation (Text Prompts)

Allows you to segment objects using text prompts.

Image Input: The image field accepts either:

  • {"type": "url", "value": "<IMAGE_URL>"} - A publicly accessible image URL

  • {"type": "base64", "value": "<BASE64_DATA>"} - Base64 encoded image data

Prompts: Each prompt in the prompts array should have type: "text" and a text field with the object description.
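The two image input forms and the text prompt shape described above can be sketched with a pair of small helpers. The function names are hypothetical, and treating any non-HTTP string as a local file path is an assumption of this sketch rather than something the API defines.

```python
import base64
from pathlib import Path


def build_image_field(source: str) -> dict:
    """Return the `image` field for a SAM3 request.

    Assumption of this sketch: strings starting with http:// or https://
    are URLs, and anything else is a path to a local image file.
    """
    if source.startswith(("http://", "https://")):
        return {"type": "url", "value": source}
    raw = Path(source).read_bytes()
    return {"type": "base64", "value": base64.b64encode(raw).decode("utf-8")}


def build_text_prompts(labels: list) -> list:
    """Return one text prompt per object description."""
    return [{"type": "text", "text": label} for label in labels]
```

For example, `build_text_prompts(["person", "dog"])` yields the same `prompts` array used in the PCS code sample.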

Query parameters
api_key (string, required)

Your Roboflow API Key. Get one at https://app.roboflow.com/settings/api
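Because `api_key` is a required query parameter, it can help to fail early when the key is missing instead of sending a request that is bound to be rejected. The following is a minimal sketch; `build_endpoint_url` is a hypothetical helper, and the base URL is taken from the PCS code sample.

```python
import os
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://serverless.roboflow.com"


def build_endpoint_url(path: str, api_key: Optional[str] = None) -> str:
    """Build an endpoint URL with the api_key query parameter.

    Falls back to the API_KEY environment variable when no key is passed.
    """
    key = api_key or os.getenv("API_KEY")
    if not key:
        raise ValueError("Set the API_KEY environment variable or pass api_key")
    # urlencode escapes any characters that are not safe in a query string.
    return f"{BASE_URL}{path}?{urlencode({'api_key': key})}"
```

Usage: `build_endpoint_url("/sam3/concept_segment")` reads the key from the environment, which matches how the PCS code sample obtains it.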

Body
format (string, optional)

One of 'polygon', 'rle'

Default: polygon

image_id (string, optional)

Optional ID for caching embeddings.

output_prob_thresh (number, optional)

Score threshold for outputs.

Default: 0.5

model_id (string, optional)

The model ID of SAM3. Use 'sam3/sam3_final' to target the generic base model.

Default: sam3/sam3_final

nms_iou_threshold (number, optional)

IoU threshold for cross-prompt NMS. If not set, NMS is disabled. Must be in [0.0, 1.0] when set.
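The body parameters above can be assembled and checked on the client before sending. This is a sketch under stated assumptions: `build_pcs_payload` is a hypothetical helper, the client-side validation is a convenience and not part of the API, and optional fields are omitted when unset so the server applies its own defaults.

```python
from typing import Optional

PCS_FORMATS = {"polygon", "rle"}


def build_pcs_payload(
    image: dict,
    prompts: list,
    output_prob_thresh: float = 0.5,
    fmt: str = "polygon",
    model_id: str = "sam3/sam3_final",
    image_id: Optional[str] = None,
    nms_iou_threshold: Optional[float] = None,
) -> dict:
    """Build a request body for /sam3/concept_segment."""
    if fmt not in PCS_FORMATS:
        raise ValueError(f"format must be one of {sorted(PCS_FORMATS)}")
    if nms_iou_threshold is not None and not 0.0 <= nms_iou_threshold <= 1.0:
        raise ValueError("nms_iou_threshold must be in [0.0, 1.0]")
    payload = {
        "image": image,
        "prompts": prompts,
        "output_prob_thresh": output_prob_thresh,
        "format": fmt,
        "model_id": model_id,
    }
    # Leaving nms_iou_threshold out keeps cross-prompt NMS disabled.
    if nms_iou_threshold is not None:
        payload["nms_iou_threshold"] = nms_iou_threshold
    if image_id is not None:
        payload["image_id"] = image_id
    return payload
```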

Responses
200

Successful Response

application/json

time (number, required)

The time in seconds it took to produce the segmentation including preprocessing
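The `time` field makes it possible to separate server-side processing from everything else in a request's round trip. The sketch below assumes you measured the round trip yourself (for example with `time.perf_counter()` around the POST call); `split_latency` is a hypothetical helper, and attributing the remainder to network and queueing is an assumption, not something the response states.

```python
def split_latency(round_trip_seconds: float, data: dict) -> dict:
    """Split a measured round trip into server time and the remainder.

    `data` is the parsed JSON response; its `time` field is the
    server-reported processing time in seconds.
    """
    if "time" not in data:
        raise KeyError("response has no 'time' field; check for an error payload")
    server = float(data["time"])
    # Clamp at zero in case of clock jitter on the client side.
    other = max(round_trip_seconds - server, 0.0)
    return {"server_seconds": server, "other_seconds": other}
```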

POST /sam3/concept_segment

SAM3 PVS (promptable visual segmentation)


Interactive Segmentation (SAM 2 Style)

SAM 3 also supports interactive segmentation using points and boxes.

Image Input: The image field accepts either:

  • {"type": "url", "value": "<IMAGE_URL>"} - A publicly accessible image URL

  • {"type": "base64", "value": "<BASE64_DATA>"} - Base64 encoded image data

Note: NumPy arrays are NOT supported on the serverless API. Use URL or base64 encoding only.

Prompts: Supports point-based prompts with positive/negative clicks for interactive segmentation.

Query parameters
api_key (string, required)

Your Roboflow API Key. Get one at https://app.roboflow.com/settings/api

Body

SAM2 visual segmentation request.

image_id (string, optional)

The ID of the image to be segmented used to retrieve cached embeddings. If an embedding is cached, it will be used instead of generating a new embedding. If no embedding is cached, a new embedding will be generated and cached.

Example: image_id

format (string, optional)

The format of the response. Must be one of 'json', 'rle', or 'binary'. If binary, masks are returned as binary numpy arrays. If json, masks are converted to polygons. If rle, masks are converted to RLE format.

Default: json. Example: json

sam2_version_id (string, optional)

The version ID of SAM to be used for this request. Must be one of hiera_tiny, hiera_small, hiera_large, hiera_b_plus

Default: hiera_large. Example: hiera_large

multimask_output (boolean, optional)

If true, the model will return three masks. For ambiguous input prompts (such as a single click), this will often produce better masks than a single prediction.

Default: true. Example: true

save_logits_to_cache (boolean, optional)

If True, saves the low-resolution logits to the cache for potential future use.

Default: false

load_logits_from_cache (boolean, optional)

If True, attempts to load previously cached low-resolution logits for the given image and prompt set.

Default: false
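A request body for the PVS endpoint can be assembled from the parameters above in the same way. This is a sketch, not a definitive client: `build_pvs_payload` and `image_id_for` are hypothetical helpers, deriving `image_id` from a content hash is only one possible convention, and the `prompts` body key is an assumption because the prompt schema is not spelled out in this section.

```python
import hashlib
from typing import Any, Optional

PVS_FORMATS = {"json", "rle", "binary"}
SAM2_VERSIONS = {"hiera_tiny", "hiera_small", "hiera_large", "hiera_b_plus"}


def image_id_for(image_bytes: bytes) -> str:
    """Hypothetical convention: a stable ID so identical images share a cache entry."""
    return hashlib.sha256(image_bytes).hexdigest()[:16]


def build_pvs_payload(
    image: dict,
    prompts: Any = None,
    image_id: Optional[str] = None,
    fmt: str = "json",
    sam2_version_id: str = "hiera_large",
    multimask_output: bool = True,
    save_logits_to_cache: bool = False,
    load_logits_from_cache: bool = False,
) -> dict:
    """Build a request body for /sam3/visual_segment."""
    if fmt not in PVS_FORMATS:
        raise ValueError(f"format must be one of {sorted(PVS_FORMATS)}")
    if sam2_version_id not in SAM2_VERSIONS:
        raise ValueError(f"sam2_version_id must be one of {sorted(SAM2_VERSIONS)}")
    payload = {
        "image": image,
        "format": fmt,
        "sam2_version_id": sam2_version_id,
        "multimask_output": multimask_output,
        "save_logits_to_cache": save_logits_to_cache,
        "load_logits_from_cache": load_logits_from_cache,
    }
    if image_id is not None:
        payload["image_id"] = image_id
    if prompts is not None:
        # ASSUMPTION: the body key is "prompts"; the value is passed through as-is.
        payload["prompts"] = prompts
    return payload
```

Reusing the same `image_id` across requests for one image lets the server use the cached embedding instead of generating a new one, which is the main reason to set it during an interactive session with many clicks.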
Responses
200

Successful Response

application/json

time (number, required)

The time in seconds it took to produce the segmentation including preprocessing

POST /sam3/visual_segment
