SAM3

Use Meta's SAM3 model through the Serverless Hosted API

We support inference with Meta's Segment Anything Model 3 through the Serverless Hosted API. Two SAM3 endpoints are available: PCS (promptable concept segmentation) and PVS (promptable visual segmentation).

Code Samples

PCS Code Sample

Below is a code sample that uses the PCS endpoint for SAM3 inference. Pass your Roboflow API key through the API_KEY environment variable.

import os
import requests
import base64
import cv2
import numpy as np

# Image downloaded from "https://media.roboflow.com/notebooks/examples/dog.jpeg"
image = cv2.imread("./dog.jpeg")

# Encode the image as base64
_, buffer = cv2.imencode('.jpg', image)
image_base64 = base64.b64encode(buffer).decode('utf-8')

payload = {
    "image": { "type": "base64", "value": image_base64 },
    "prompts": [
        { "type": "text", "text": "person" },
        { "type": "text", "text": "dog" },
    ],
    "output_prob_thresh": 0.5,
    "format": "polygon",
}

url = "https://serverless.roboflow.com/sam3/concept_segment?api_key=" + os.getenv("API_KEY")
response = requests.post(url, json=payload)
data = response.json()

for key in data:
    print(key)  # should list the prompt results and time

PVS Code Sample

See the GitHub Gist for an interactive OpenCV demo, which was also used in the accompanying video.

Endpoints

SAM3 PCS (promptable concept segmentation)


Concept Segmentation (Text Prompts)

Allows you to segment objects using text prompts.

Image Input: The image field accepts either:

  • {"type": "url", "value": "<IMAGE_URL>"} - A publicly accessible image URL

  • {"type": "base64", "value": "<BASE64_DATA>"} - Base64 encoded image data

Prompts: Each prompt in the prompts array should have type: "text" and a text field with the object description.
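The image and prompt fields described above can be assembled with a small helper. This is a minimal sketch, not part of any Roboflow SDK: the name `build_pcs_payload` is hypothetical, the empty-list check is a client-side assumption, and only the field names come from this page.

```python
from typing import Any, Dict, List


def build_pcs_payload(image_url: str, labels: List[str]) -> Dict[str, Any]:
    """Build a concept-segmentation request body from an image URL and text labels.

    Hypothetical helper for illustration; only the field names come from the docs.
    """
    # Assumption: reject an empty prompt list before sending anything.
    if not labels:
        raise ValueError("at least one text prompt is required")
    return {
        "image": {"type": "url", "value": image_url},
        "prompts": [{"type": "text", "text": label} for label in labels],
    }


payload = build_pcs_payload(
    "https://media.roboflow.com/notebooks/examples/dog.jpeg",
    ["person", "dog"],
)
```

Using the URL form avoids downloading and re-encoding the image locally, provided the URL is publicly accessible.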

Query parameters
api_key (string, required)

Your Roboflow API Key. Get one at https://app.roboflow.com/settings/api

Body
format (string, optional)

One of 'polygon', 'rle'

Default: polygon

image_id (string, optional)

Optional ID for caching embeddings.
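One way to choose a stable `image_id` is to hash the image bytes, so that the same image always maps to the same ID. The sketch below works under that assumption: `image_id_for` is a hypothetical helper, and nothing on this page requires IDs to be hashes.

```python
import hashlib


def image_id_for(image_bytes: bytes) -> str:
    """Derive a deterministic ID from image content (hypothetical helper)."""
    # SHA-256 gives a 64-character hex string that is identical for identical bytes.
    return hashlib.sha256(image_bytes).hexdigest()
```

The resulting string can be sent as the `image_id` field of the request body.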

output_prob_thresh (number, optional)

Score threshold for outputs.

Default: 0.5

model_id (string, optional)

The model ID of SAM3. Use 'sam3/sam3_final' to target the generic base model.

Default: sam3/sam3_final

nms_iou_threshold (number, optional)

IoU threshold for cross-prompt NMS. If not set, NMS is disabled. Must be in [0.0, 1.0] when set.
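The optional body fields can be checked client-side before a request is sent. The sketch below is one assumed way to do that: `pcs_options` is a hypothetical helper, and the server remains the authority on what it accepts.

```python
from typing import Any, Dict, Optional

ALLOWED_FORMATS = {"polygon", "rle"}


def pcs_options(
    fmt: str = "polygon",
    output_prob_thresh: float = 0.5,
    nms_iou_threshold: Optional[float] = None,
) -> Dict[str, Any]:
    """Return the optional body fields, checked against the documented constraints."""
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {sorted(ALLOWED_FORMATS)}")
    options: Dict[str, Any] = {
        "format": fmt,
        "output_prob_thresh": output_prob_thresh,
    }
    # Leaving nms_iou_threshold out of the body keeps cross-prompt NMS disabled.
    if nms_iou_threshold is not None:
        if not 0.0 <= nms_iou_threshold <= 1.0:
            raise ValueError("nms_iou_threshold must be in [0.0, 1.0]")
        options["nms_iou_threshold"] = nms_iou_threshold
    return options
```

The returned dictionary can be merged into a request body, for example with `payload.update(pcs_options(nms_iou_threshold=0.9))`.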

Responses
200

Successful Response

application/json
time (number, required)

The time in seconds it took to produce the segmentation including preprocessing

POST /sam3/concept_segment

SAM3 PVS (promptable visual segmentation)


Interactive Segmentation (SAM 2 Style)

SAM 3 also supports interactive segmentation using points and boxes.

Image Input: The image field accepts either:

  • {"type": "url", "value": "<IMAGE_URL>"} - A publicly accessible image URL

  • {"type": "base64", "value": "<BASE64_DATA>"} - Base64 encoded image data

Note: NumPy arrays are NOT supported on the serverless API. Use URL or base64 encoding only.
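Because NumPy arrays are not accepted, an in-memory image has to be serialized to an encoded format (such as JPEG or PNG) and then base64-encoded. The following sketch shows the base64 step using only the standard library; both helper names are hypothetical.

```python
import base64
from typing import Dict


def image_field_from_bytes(encoded_image: bytes) -> Dict[str, str]:
    """Wrap already-encoded image bytes (e.g. JPEG or PNG) as a base64 image field."""
    return {
        "type": "base64",
        "value": base64.b64encode(encoded_image).decode("utf-8"),
    }


def image_field_from_file(path: str) -> Dict[str, str]:
    """Read an image file from disk and wrap it as a base64 image field."""
    with open(path, "rb") as f:
        return image_field_from_bytes(f.read())
```

For an array loaded with OpenCV, the bytes can come from `cv2.imencode('.jpg', image)`, as in the PCS code sample, by passing `buffer.tobytes()` to `image_field_from_bytes`.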

Prompts: Support point-based prompts with positive/negative clicks for interactive segmentation.

Query parameters
api_key (string, required)

Your Roboflow API Key. Get one at https://app.roboflow.com/settings/api

Body

SAM2 visual segmentation request.

image_id (string, optional)

The ID of the image to be segmented used to retrieve cached embeddings. If an embedding is cached, it will be used instead of generating a new embedding. If no embedding is cached, a new embedding will be generated and cached.

Example: image_id

format (string, optional)

The format of the response. Must be one of 'json', 'rle', or 'binary'. If binary, masks are returned as binary numpy arrays. If json, masks are converted to polygons. If rle, masks are converted to RLE format.

Default: json
Example: json

sam2_version_id (string, optional)

The version ID of SAM to be used for this request. Must be one of hiera_tiny, hiera_small, hiera_large, hiera_b_plus

Default: hiera_large
Example: hiera_large

multimask_output (boolean, optional)

If true, the model will return three masks. For ambiguous input prompts (such as a single click), this will often produce better masks than a single prediction.

Default: true
Example: true

save_logits_to_cache (boolean, optional)

If True, saves the low-resolution logits to the cache for potential future use.

Default: false

load_logits_from_cache (boolean, optional)

If True, attempts to load previously cached low-resolution logits for the given image and prompt set.

Default: false
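The optional PVS body fields and their documented defaults can be collected in one place. This is a sketch under stated assumptions: `pvs_options` is a hypothetical helper, it validates only the enumerated values listed above, and it deliberately leaves out `image` and `prompts`, whose exact shape for point and box prompts is not specified on this page.

```python
from typing import Any, Dict

PVS_FORMATS = {"json", "rle", "binary"}
SAM2_VERSIONS = {"hiera_tiny", "hiera_small", "hiera_large", "hiera_b_plus"}


def pvs_options(
    fmt: str = "json",
    sam2_version_id: str = "hiera_large",
    multimask_output: bool = True,
    save_logits_to_cache: bool = False,
    load_logits_from_cache: bool = False,
) -> Dict[str, Any]:
    """Return the optional PVS body fields, using the documented defaults."""
    if fmt not in PVS_FORMATS:
        raise ValueError(f"format must be one of {sorted(PVS_FORMATS)}")
    if sam2_version_id not in SAM2_VERSIONS:
        raise ValueError(f"sam2_version_id must be one of {sorted(SAM2_VERSIONS)}")
    return {
        "format": fmt,
        "sam2_version_id": sam2_version_id,
        "multimask_output": multimask_output,
        "save_logits_to_cache": save_logits_to_cache,
        "load_logits_from_cache": load_logits_from_cache,
    }
```

Calling `pvs_options(fmt="rle", multimask_output=False)` returns a dictionary that can be merged into the request body alongside the image and prompt fields.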
Responses
200

Successful Response

application/json
time (number, required)

The time in seconds it took to produce the segmentation including preprocessing

POST /sam3/visual_segment
