
# SAM3

Run inference with Meta's [Segment Anything Model 3](https://github.com/facebookresearch/sam3) through the [Serverless Hosted API](/roboflow/roboflow-ko/deploy/serverless-hosted-api-v2.md). SAM3 is served through two endpoints, listed below.

{% hint style="info" %}
Training SAM3 models on Roboflow is available on paid [plans](/roboflow/roboflow-ko/billing/plans.md) with [usage-based billing](/roboflow/roboflow-ko/billing/credits.md).
{% endhint %}

* [Promptable Concept Segmentation](#concept-segmentation-pcs) (**PCS**) segments every instance of a concept in an image. The concept is described by a text prompt, exemplar boxes, or both.
* [Promptable Visual Segmentation](#visual-segmentation-pvs) (**PVS**) interactively segments one object per request from points or a box, in the style of SAM2.

Use this table to choose an endpoint:

| You have                                               | You want                          | Use                     |
| ------------------------------------------------------ | --------------------------------- | ----------------------- |
| A text description (e.g. "person")                     | Masks for every matching instance | `/sam3/concept_segment` |
| A box around one example object                        | Masks for every similar instance  | `/sam3/concept_segment` |
| Text plus exemplar boxes to include or exclude objects | Masks for every matching instance | `/sam3/concept_segment` |
| A click or box on one specific object                  | A mask for that object only       | `/sam3/visual_segment`  |

Pass your [API key](https://app.roboflow.com/settings/api) as the `api_key` query parameter on every request.

## Concept Segmentation (PCS)

`POST https://serverless.roboflow.com/sam3/concept_segment`

Each entry in `prompts` describes one concept. The response contains one `prompt_results` entry per prompt, and each entry holds every instance found for that prompt. A request accepts up to 16 prompts.
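If you have more than 16 concepts, one option is to split them across several requests. The sketch below batches prompts on the client side. `chunk_prompts`, `MAX_PROMPTS`, and the placeholder class names are illustrative and not part of the API; only the limit of 16 comes from this page.

```python
MAX_PROMPTS = 16  # per-request prompt limit stated on this page


def chunk_prompts(prompts: list[dict], size: int = MAX_PROMPTS) -> list[list[dict]]:
    """Split a prompt list into batches of at most `size` prompts (hypothetical helper)."""
    return [prompts[i:i + size] for i in range(0, len(prompts), size)]


# Placeholder concept names, for illustration only
labels = [f"class_{i}" for i in range(40)]
batches = chunk_prompts([{"type": "text", "text": label} for label in labels])
print([len(batch) for batch in batches])  # [16, 16, 8]
```

Each batch would then be sent as the `prompts` array of its own request. Because every request is independent, you would need to track which batch a result came from in order to map it back to the original list.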

### Text prompts

```python
import os
import requests

payload = {
    "image": {"type": "url", "value": "https://media.roboflow.com/inference/people-walking.jpg"},
    "prompts": [
        {"type": "text", "text": "person"},
        {"type": "text", "text": "backpack"},
    ],
    "output_prob_thresh": 0.5,
    "format": "polygon",  # or "rle"
}

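response = requests.post(
    "https://serverless.roboflow.com/sam3/concept_segment",
    params={"api_key": os.getenv("API_KEY")},
    json=payload,
)
response.raise_for_status()  # surface HTTP errors (e.g. 422 validation errors) early

# One entry per prompt; each entry lists every instance found for that prompt
for prompt_result in response.json()["prompt_results"]:
    print(prompt_result["echo"], len(prompt_result["predictions"]), "instances")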
```

The image can also be sent inline as `{"type": "base64", "value": "<BASE64_IMAGE>"}`.
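As a minimal sketch of the inline form, the helper below reads a local file and wraps it in the `image` object. The function name `encode_image` and the file path are assumptions made for illustration; they are not part of the Roboflow API.

```python
import base64
from pathlib import Path


def encode_image(path: str) -> dict:
    """Build the `image` field for a local file (hypothetical helper)."""
    raw = Path(path).read_bytes()
    # b64encode returns bytes; decode to str so the payload is JSON-serializable
    return {"type": "base64", "value": base64.b64encode(raw).decode("ascii")}
```

You could then set `payload["image"] = encode_image("people-walking.jpg")` before posting the request.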

### Exemplar box prompts

Instead of text, you can prompt with an exemplar: a box around one example object. The model finds every instance that matches the exemplar, not just the object inside the box.

```python
payload = {
    "image": {"type": "url", "value": "https://media.roboflow.com/inference/people-walking.jpg"},
    "prompts": [
        {
            "type": "visual",
            "boxes": [{"x": 1409, "y": 705, "width": 112, "height": 183}],
            "box_labels": [1],
        }
    ],
    "output_prob_thresh": 0.5,
    "format": "polygon",
}
```

Boxes use absolute pixel coordinates. Two formats are accepted:

* `{"x": ..., "y": ..., "width": ..., "height": ...}`, where `x`, `y` is the top-left corner
* `{"x0": ..., "y0": ..., "x1": ..., "y1": ...}`, for explicit corners

`box_labels` is required whenever `boxes` is set and must contain one entry per box: `1` marks a positive exemplar (find objects like this), and `0` marks a negative exemplar (exclude objects like this).
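The two box formats and the `box_labels` rule can be sketched as a small client-side helper. Both function names are hypothetical. The conversion assumes the two formats describe the same rectangle, so that `x1 = x + width` and `y1 = y + height`; the validation only mirrors the rule stated above and is not the server's own check.

```python
def xywh_to_xyxy(box: dict) -> dict:
    """Convert a top-left XYWH box to the explicit-corner form (assumed equivalence)."""
    return {
        "x0": box["x"],
        "y0": box["y"],
        "x1": box["x"] + box["width"],
        "y1": box["y"] + box["height"],
    }


def visual_prompt(boxes: list[dict], box_labels: list[int]) -> dict:
    """Assemble a visual prompt, checking labels before the request is sent."""
    if len(boxes) != len(box_labels):
        raise ValueError("box_labels needs exactly one entry per box")
    if any(label not in (0, 1) for label in box_labels):
        raise ValueError("labels must be 1 (positive) or 0 (negative)")
    return {"type": "visual", "boxes": boxes, "box_labels": box_labels}


box = {"x": 1409, "y": 705, "width": 112, "height": 183}
print(xywh_to_xyxy(box))  # {'x0': 1409, 'y0': 705, 'x1': 1521, 'y1': 888}
prompt = visual_prompt([box], [1])
```

Catching a length mismatch locally avoids a round trip that would otherwise end in a validation error.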

### Combining text and exemplar prompts

A single prompt can carry both text and exemplar boxes. This is useful for narrowing a text concept with a visual exemplar, or for excluding lookalikes with a negative exemplar:

```python
payload = {
    "image": {"type": "url", "value": "https://media.roboflow.com/inference/people-walking.jpg"},
    "prompts": [
        {
            "type": "visual",
            "text": "person",
            "boxes": [
                {"x": 1409, "y": 705, "width": 112, "height": 183},
                {"x": 1216, "y": 496, "width": 124, "height": 184},
            ],
            "box_labels": [1, 0],
        }
    ],
    "output_prob_thresh": 0.5,
    "format": "polygon",
}
```

Here the model segments people that match the first (positive) exemplar while suppressing instances similar to the second (negative) exemplar.
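When `format` is `polygon`, each prediction's `masks` field is an array of polygons, and each polygon is an array of `[x, y]` points. As a rough sketch of post-processing, the helpers below compute a bounding box and an area for one polygon using the shoelace formula. `polygon_bbox` and `polygon_area` are illustrative names, and the sample polygon is made up rather than real model output.

```python
def polygon_bbox(polygon: list[list[float]]) -> tuple:
    """Return (x_min, y_min, x_max, y_max) for a list of [x, y] points."""
    xs = [point[0] for point in polygon]
    ys = [point[1] for point in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def polygon_area(polygon: list[list[float]]) -> float:
    """Shoelace formula; works for either winding order."""
    total = 0.0
    # Pair each vertex with the next one, wrapping around to the first
    for (x0, y0), (x1, y1) in zip(polygon, polygon[1:] + polygon[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


sample = [[0, 0], [4, 0], [4, 3], [0, 3]]  # made-up 4x3 rectangle
print(polygon_bbox(sample), polygon_area(sample))  # (0, 0, 4, 3) 12.0
```

Applied to a response, you would loop over `prompt_results`, then over each entry's `predictions`, then over each prediction's `masks`.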

## Visual Segmentation (PVS)

`POST https://serverless.roboflow.com/sam3/visual_segment`

PVS segments one specific object identified by a click or a box. Use it for interactive, human-in-the-loop mask refinement; use PCS when you want every instance of a concept.

```python
import os
import requests

payload = {
    "image": {"type": "url", "value": "https://media.roboflow.com/inference/people-walking.jpg"},
    "prompts": {
        "prompts": [
            {
                "points": [{"x": 1465, "y": 796, "positive": True}],
                "box": {"x": 1465, "y": 796, "width": 112, "height": 183},
            }
        ]
    },
    "multimask_output": False,
    "format": "json",
}

response = requests.post(
    "https://serverless.roboflow.com/sam3/visual_segment",
    params={"api_key": os.getenv("API_KEY")},
    json=payload,
)
prediction = response.json()["predictions"][0]
print(prediction["confidence"], len(prediction["masks"]), "polygons")
```

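A prompt can contain `points`, a `box`, or both:

* `points` are absolute pixel coordinates. `"positive": true` includes the clicked region; `false` excludes it. Add more points to refine the mask.
* `box` uses center-based coordinates: `x`, `y` is the box center. This differs from PCS boxes, which are anchored at the top-left corner.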
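Because the two endpoints anchor boxes differently, a box taken from a PCS prompt has to be shifted before it is reused in PVS. The sketch below shows the arithmetic; `top_left_to_center` is a hypothetical helper name.

```python
def top_left_to_center(box: dict) -> dict:
    """Convert a top-left XYWH box (PCS style) to a center-based box (PVS style)."""
    return {
        "x": box["x"] + box["width"] / 2,
        "y": box["y"] + box["height"] / 2,
        "width": box["width"],
        "height": box["height"],
    }


pcs_box = {"x": 1409, "y": 705, "width": 112, "height": 183}
print(top_left_to_center(pcs_box))
# {'x': 1465.0, 'y': 796.5, 'width': 112, 'height': 183}
```

The result lines up with the PVS example above, which uses a center of (1465, 796) for the same person.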

The response contains the single highest-confidence mask for the prompt. `multimask_output` controls how many mask candidates the model generates internally (3 when `True`), but the response always contains only the best candidate.

{% hint style="warning" %}
Send one prompt per request. If you send several prompts in a single PVS request, only one prediction is currently returned.
{% endhint %}
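If you already have a payload with several prompts, one way to follow this advice is to fan it out into single-prompt payloads on the client. The sketch below does only that; `split_pvs_payload` is a hypothetical helper, and the example image URL is a placeholder.

```python
import copy


def split_pvs_payload(payload: dict) -> list[dict]:
    """Return one payload per prompt, leaving the input payload untouched."""
    singles = []
    for prompt in payload["prompts"]["prompts"]:
        single = copy.deepcopy(payload)
        single["prompts"]["prompts"] = [copy.deepcopy(prompt)]
        singles.append(single)
    return singles


multi = {
    "image": {"type": "url", "value": "https://example.com/image.jpg"},  # placeholder URL
    "prompts": {
        "prompts": [
            {"points": [{"x": 10, "y": 20, "positive": True}]},
            {"points": [{"x": 30, "y": 40, "positive": True}]},
        ]
    },
    "format": "json",
}
requests_to_send = split_pvs_payload(multi)
print(len(requests_to_send))  # 2
```

Each resulting payload can then be posted to `/sam3/visual_segment` on its own, yielding one prediction per object.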

For an interactive demo built with OpenCV, see this [GitHub Gist](https://gist.github.com/Erol444/4cbc33c6ac52d83c63f6f9d86ca8a7a4), which was used in this video:

{% embed url="<https://www.youtube.com/watch?v=01xrBzqHZ6c>" %}

## Endpoints

## SAM3 PCS (promptable concept segmentation)

> **Concept Segmentation (Text Prompts)**
>
> Allows you to segment objects using text prompts.
>
> **Image Input**: The `image` field accepts either:
>
> - `{"type": "url", "value": "<IMAGE_URL>"}` - A publicly accessible image URL
> - `{"type": "base64", "value": "<BASE64_DATA>"}` - Base64 encoded image data
>
> **Prompts**: Each prompt in the `prompts` array should have `type: "text"` and a `text` field with the object description.

```json
{"openapi":"3.1.0","info":{"title":"Roboflow SAM3 API","version":"0.64.4"},"servers":[{"url":"https://serverless.roboflow.com"}],"paths":{"/sam3/concept_segment":{"post":{"summary":"SAM3 PCS (promptable concept segmentation)","description":"**Concept Segmentation (Text Prompts)**\n\nAllows you to segment objects using text prompts.\n\n**Image Input**: The `image` field accepts either:\n- `{\"type\": \"url\", \"value\": \"<IMAGE_URL>\"}` - A publicly accessible image URL\n- `{\"type\": \"base64\", \"value\": \"<BASE64_DATA>\"}` - Base64 encoded image data\n\n **Prompts**: Each prompt in the `prompts` array should have `type: \"text\"` and a `text` field with the object description.","operationId":"sam3_segment_image_sam3_concept_segment_post","parameters":[{"name":"api_key","in":"query","required":true,"schema":{"type":"string","title":"API Key"},"description":"Your Roboflow API Key. Get one at https://app.roboflow.com/settings/api"}],"requestBody":{"required":true,"content":{"application/json":{"schema":{"$ref":"#/components/schemas/Sam3SegmentationRequest"}}}},"responses":{"200":{"description":"Successful Response","content":{"application/json":{"schema":{"$ref":"#/components/schemas/Sam3SegmentationResponse"}}}},"422":{"description":"Validation Error","content":{"application/json":{"schema":{"$ref":"#/components/schemas/HTTPValidationError"}}}}}}}},"components":{"schemas":{"Sam3SegmentationRequest":{"properties":{"image":{"$ref":"#/components/schemas/InferenceRequestImage","description":"The image to be segmented."},"prompts":{"items":{"$ref":"#/components/schemas/Sam3Prompt"},"type":"array","minItems":1,"title":"Prompts","description":"List of prompts (text and/or visual)"},"format":{"type":"string","title":"Format","description":"One of 'polygon', 'rle'","default":"polygon"},"image_id":{"type":"string","title":"Image Id","description":"Optional ID for caching embeddings."},"output_prob_thresh":{"type":"number","title":"Output Prob Thresh","description":"Score 
threshold for outputs.","default":0.5},"model_id":{"type":"string","title":"Model Id","description":"The model ID of SAM3. Use 'sam3/sam3_final' to target the generic base model.","default":"sam3/sam3_final"},"nms_iou_threshold":{"type":"number","title":"Nms Iou Threshold","description":"IoU threshold for cross-prompt NMS. If not set, NMS is disabled. Must be in [0.0, 1.0] when set."}},"type":"object","required":["image","prompts"],"title":"Sam3SegmentationRequest"},"InferenceRequestImage":{"properties":{"type":{"type":"string","title":"Type","description":"The type of image data provided, one of `url`, `base64`"},"value":{"type":"string","title":"Value","description":"Image data corresponding to the image type, if type = 'url' then value is a string containing the url of an image, else if type = 'base64' then value is a string containing base64 encoded image data."}},"type":"object","required":["type"],"title":"InferenceRequestImage","description":"Image data for inference request.\n\nAttributes:\n    type (str): The type of image data provided, one of 'url', 'base64', or 'numpy'.\n    value (Optional[Any]): Image data corresponding to the image type."},"Sam3Prompt":{"properties":{"type":{"type":"string","title":"Type","description":"Hint: `text` or `visual`"},"text":{"type":"string","title":"Text","description":"Text prompt describing the object to segment"},"output_prob_thresh":{"type":"number","title":"Output Prob Thresh","description":"Score threshold for this prompt's outputs. 
Overrides request-level threshold if set."},"boxes":{"items":{"anyOf":[{"$ref":"#/components/schemas/Box"},{"$ref":"#/components/schemas/BoxXYXY"}]},"type":"array","title":"Boxes","description":"Absolute pixel boxes as either XYWH or XYXY entries"},"box_labels":{"items":{"anyOf":[{"type":"integer"},{"type":"boolean"}]},"type":"array","title":"Box Labels","description":"List of 0/1 or booleans for boxes"}},"type":"object","required":["type"],"title":"Sam3Prompt","description":"Unified prompt that can contain text and/or geometry. Absolute pixel coordinates are used for boxes."},"Sam3SegmentationResponse":{"properties":{"prompt_results":{"items":{"$ref":"#/components/schemas/Sam3PromptResult"},"type":"array","title":"Prompt Results","description":"Results for each prompt in the request"},"time":{"type":"number","title":"Time","description":"The time in seconds it took to produce the segmentation including preprocessing"}},"type":"object","required":["prompt_results","time"],"title":"Sam3SegmentationResponse"},"Sam3PromptResult":{"properties":{"prompt_index":{"type":"integer","title":"Prompt Index","description":"Index of the prompt this result corresponds to"},"echo":{"$ref":"#/components/schemas/Sam3PromptEcho","description":"Echo of the original prompt for reference"},"predictions":{"items":{"$ref":"#/components/schemas/Sam3SegmentationPrediction"},"type":"array","title":"Predictions","description":"Segmentation predictions for this prompt"}},"type":"object","required":["prompt_index","predictions"],"title":"Sam3PromptResult"},"Sam3PromptEcho":{"properties":{"prompt_index":{"type":"integer","title":"Prompt Index"},"type":{"type":"string","title":"Type","description":"The prompt type (`text` or `visual`)"},"text":{"type":"string","title":"Text","description":"The text prompt if type is `text`"},"num_boxes":{"type":"integer","title":"Num Boxes","description":"Number of bounding boxes in the 
prompt"}},"type":"object","title":"Sam3PromptEcho"},"Sam3SegmentationPrediction":{"properties":{"format":{"type":"string","title":"Format","description":"The format of the mask data, either `polygon` or `rle`"},"confidence":{"type":"number","title":"Confidence","description":"Confidence score for this prediction"},"masks":{"items":{"items":{"items":{"type":"number"},"type":"array","minItems":2,"maxItems":2},"type":"array"},"type":"array","title":"Masks","description":"Array of polygons, each polygon is an array of [x, y] coordinate points"}},"type":"object","required":["format","confidence","masks"],"title":"Sam3SegmentationPrediction"},"HTTPValidationError":{"properties":{"detail":{"items":{"$ref":"#/components/schemas/ValidationError"},"type":"array","title":"Detail"}},"type":"object","title":"HTTPValidationError"},"ValidationError":{"properties":{"loc":{"items":{"anyOf":[{"type":"string"},{"type":"integer"}]},"type":"array","title":"Location"},"msg":{"type":"string","title":"Message"},"type":{"type":"string","title":"Error Type"}},"type":"object","required":["loc","msg","type"],"title":"ValidationError"}}}}
```

## SAM3 PVS (promptable visual segmentation)

> **Interactive Segmentation (SAM 2 Style)**
>
> SAM 3 also supports interactive segmentation using points and boxes.
>
> **Image Input**: The `image` field accepts either:
>
> - `{"type": "url", "value": "<IMAGE_URL>"}` - A publicly accessible image URL
> - `{"type": "base64", "value": "<BASE64_DATA>"}` - Base64 encoded image data
>
> > **Note**: NumPy arrays are NOT supported on the serverless API. Use URL or base64 encoding only.
>
> **Prompts**: Support point-based prompts with positive/negative clicks for interactive segmentation.

```json
{"openapi":"3.1.0","info":{"title":"Roboflow SAM3 API","version":"0.64.4"},"servers":[{"url":"https://serverless.roboflow.com"}],"paths":{"/sam3/visual_segment":{"post":{"summary":"SAM3 PVS (promptable visual segmentation)","description":"**Interactive Segmentation (SAM 2 Style)**\n\nSAM 3 also supports interactive segmentation using points and boxes.\n\n**Image Input**: The `image` field accepts either:\n- `{\"type\": \"url\", \"value\": \"<IMAGE_URL>\"}` - A publicly accessible image URL\n- `{\"type\": \"base64\", \"value\": \"<BASE64_DATA>\"}` - Base64 encoded image data\n\n> **Note**: NumPy arrays are NOT supported on the serverless API. Use URL or base64 encoding only.\n\n**Prompts**: Support point-based prompts with positive/negative clicks for interactive segmentation.","operationId":"sam3_visual_segment_sam3_visual_segment_post","parameters":[{"name":"api_key","in":"query","required":true,"schema":{"type":"string","title":"API Key"},"description":"Your Roboflow API Key. Get one at https://app.roboflow.com/settings/api"}],"requestBody":{"required":true,"content":{"application/json":{"schema":{"$ref":"#/components/schemas/Sam2SegmentationRequest"}}}},"responses":{"200":{"description":"Successful Response","content":{"application/json":{"schema":{"$ref":"#/components/schemas/Sam2SegmentationResponse"}}}},"422":{"description":"Validation Error","content":{"application/json":{"schema":{"$ref":"#/components/schemas/HTTPValidationError"}}}}}}}},"components":{"schemas":{"Sam2SegmentationRequest":{"properties":{"image":{"$ref":"#/components/schemas/InferenceRequestImage","description":"The image to be segmented."},"image_id":{"type":"string","title":"Image Id","description":"The ID of the image to be segmented used to retrieve cached embeddings. If an embedding is cached, it will be used instead of generating a new embedding. 
If no embedding is cached, a new embedding will be generated and cached."},"prompts":{"$ref":"#/components/schemas/Sam2PromptSet","description":"A list of prompts for masks to predict. Each prompt can include a bounding box and / or a set of postive or negative points."},"format":{"type":"string","title":"Format","description":"The format of the response. Must be one of 'json', 'rle', or 'binary'. If binary, masks are returned as binary numpy arrays. If json, masks are converted to polygons. If rle, masks are converted to RLE format.","default":"json"},"sam2_version_id":{"type":"string","title":"Sam2 Version Id","description":"The version ID of SAM to be used for this request. Must be one of hiera_tiny, hiera_small, hiera_large, hiera_b_plus","default":"hiera_large"},"multimask_output":{"type":"boolean","title":"Multimask Output","description":"If true, the model will return three masks. For ambiguous input prompts (such as a single click), this will often produce better masks than a single prediction.","default":true},"save_logits_to_cache":{"type":"boolean","title":"Save Logits To Cache","description":"If True, saves the low-resolution logits to the cache for potential future use.","default":false},"load_logits_from_cache":{"type":"boolean","title":"Load Logits From Cache","description":"If True, attempts to load previously cached low-resolution logits for the given image and prompt set.","default":false}},"type":"object","required":["image"],"title":"Sam2SegmentationRequest","description":"SAM2 visual segmentation request."},"InferenceRequestImage":{"properties":{"type":{"type":"string","title":"Type","description":"The type of image data provided, one of `url`, `base64`"},"value":{"type":"string","title":"Value","description":"Image data corresponding to the image type, if type = 'url' then value is a string containing the url of an image, else if type = 'base64' then value is a string containing base64 encoded image 
data."}},"type":"object","required":["type"],"title":"InferenceRequestImage","description":"Image data for inference request.\n\nAttributes:\n    type (str): The type of image data provided, one of 'url', 'base64', or 'numpy'.\n    value (Optional[Any]): Image data corresponding to the image type."},"Sam2SegmentationResponse":{"properties":{"prompt_results":{"items":{"$ref":"#/components/schemas/Sam2PromptResult"},"type":"array","title":"Prompt Results","description":"Results for each prompt in the request"},"time":{"type":"number","title":"Time","description":"The time in seconds it took to produce the segmentation including preprocessing"}},"type":"object","required":["prompt_results","time"],"title":"Sam2SegmentationResponse"},"Sam2PromptResult":{"properties":{"prompt_index":{"type":"integer","title":"Prompt Index","description":"Index of the prompt this result corresponds to"},"predictions":{"items":{"$ref":"#/components/schemas/Sam2SegmentationPrediction"},"type":"array","title":"Predictions","description":"Segmentation predictions for this prompt"}},"type":"object","required":["prompt_index","predictions"],"title":"Sam2PromptResult"},"HTTPValidationError":{"properties":{"detail":{"items":{"$ref":"#/components/schemas/ValidationError"},"type":"array","title":"Detail"}},"type":"object","title":"HTTPValidationError"},"ValidationError":{"properties":{"loc":{"items":{"anyOf":[{"type":"string"},{"type":"integer"}]},"type":"array","title":"Location"},"msg":{"type":"string","title":"Message"},"type":{"type":"string","title":"Error Type"}},"type":"object","required":["loc","msg","type"],"title":"ValidationError"}}}}
```


