---
license: apache-2.0
pipeline_tag: image-segmentation
library_name: onnx
tags:
  - onnxruntime
  - onnx
  - segment-anything
  - segment-anything-2
  - image-segmentation
  - edge-ai
  - anylabeling
authors:
  - Viet-Anh Nguyen
---

# Segment Anything 2 (SAM 2) — ONNX Models

ONNX exports of Meta's [SAM 2](https://github.com/facebookresearch/sam2) image-segmentation backbones, packaged for direct use with [`onnxruntime`](https://onnxruntime.ai) and [AnyLabeling](https://github.com/vietanhdev/anylabeling).

## Why this repo exists

The SAM 2 paper reports higher image-segmentation accuracy than the original SAM at roughly 6x the speed, but the official release ships only PyTorch checkpoints. ONNX gives you a portable, dependency-light format, and ONNX Runtime has bindings for Python, C++, and JavaScript and runs on many embedded targets. These exports are the ones AnyLabeling consumes for its smart-labeling features.

## Variants

Each backbone is provided in two equivalent forms — pick whichever fits your loader:

- **Raw ONNX pair**: `<backbone>.encoder.onnx` + `<backbone>.decoder.onnx`
- **Bundled zip**: `<backbone>.zip` containing both files

| Backbone | Encoder size | Decoder size | Bundle |
|---|---|---|---|
| `sam2_hiera_tiny` | 128 MB | 19.7 MB | `sam2_hiera_tiny.zip` (148 MB) |
| `sam2_hiera_small` | 155 MB | 19.7 MB | `sam2_hiera_small.zip` (175 MB) |
| `sam2_hiera_base_plus` | 324 MB | 19.7 MB | `sam2_hiera_base_plus.zip` (344 MB) |
| `sam2_hiera_large` | 848 MB | 19.7 MB | `sam2_hiera_large.zip` (868 MB) |

`zip_models.py` is the helper script used to produce the bundled zips from the encoder/decoder pairs.
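
If your loader wants the raw pair but you downloaded a bundle, it can be unpacked with the standard library. The sketch below is illustrative only: `extract_bundle` is a hypothetical helper, not part of this repo or of `zip_models.py`, and it assumes each archive holds exactly one `*.encoder.onnx` and one `*.decoder.onnx` file (at any folder depth), which the internal layout of the published zips may or may not match.

```python
import zipfile
from pathlib import Path


def extract_bundle(zip_path, out_dir):
    """Extract a bundled zip and return (encoder_path, decoder_path).

    Assumption: the archive contains exactly one *.encoder.onnx and one
    *.decoder.onnx entry. Any other layout raises an error.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    with zipfile.ZipFile(zip_path) as zf:
        names = zf.namelist()
        encoders = [n for n in names if n.endswith(".encoder.onnx")]
        decoders = [n for n in names if n.endswith(".decoder.onnx")]
        if len(encoders) != 1 or len(decoders) != 1:
            raise ValueError(
                f"expected one encoder and one decoder in {zip_path}, "
                f"found {len(encoders)} and {len(decoders)}"
            )
        zf.extractall(out_dir)

    return out_dir / encoders[0], out_dir / decoders[0]


# Example usage (requires network access and huggingface_hub):
# from huggingface_hub import hf_hub_download
# bundle = hf_hub_download(
#     repo_id="vietanhdev/segment-anything-2-onnx-models",
#     filename="sam2_hiera_tiny.zip",
# )
# encoder_path, decoder_path = extract_bundle(bundle, "models/sam2_hiera_tiny")
```

Reading the entry names from the archive itself, rather than globbing the output folder, keeps the result correct even when several bundles are extracted into the same directory.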

## Quick start

```bash
pip install huggingface_hub onnxruntime
```

```python
from huggingface_hub import hf_hub_download
import onnxruntime as ort

repo = "vietanhdev/segment-anything-2-onnx-models"
encoder = hf_hub_download(repo_id=repo, filename="sam2_hiera_tiny.encoder.onnx")
decoder = hf_hub_download(repo_id=repo, filename="sam2_hiera_tiny.decoder.onnx")

enc = ort.InferenceSession(encoder, providers=["CPUExecutionProvider"])
dec = ort.InferenceSession(decoder, providers=["CPUExecutionProvider"])

# Inspect expected inputs:
print("Encoder:", [(i.name, i.shape, i.type) for i in enc.get_inputs()])
print("Decoder:", [(i.name, i.shape, i.type) for i in dec.get_inputs()])
```
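
The snippet above pins `CPUExecutionProvider`. If your `onnxruntime` build includes a hardware provider, you may prefer to try it first and fall back to CPU. A minimal sketch follows; `pick_providers` is a hypothetical helper and the preference order is an assumption, not something this repo prescribes. Whether these particular exports run well on a given accelerator is untested here.

```python
def pick_providers(
    available,
    preferred=(
        "CUDAExecutionProvider",
        "CoreMLExecutionProvider",
        "CPUExecutionProvider",
    ),
):
    """Return the preferred providers that are available, in order.

    CPUExecutionProvider is always kept as the last-resort fallback.
    """
    chosen = [p for p in preferred if p in available]
    if "CPUExecutionProvider" not in chosen:
        chosen.append("CPUExecutionProvider")
    return chosen


# Example usage with the sessions from the quick start:
# import onnxruntime as ort
# providers = pick_providers(ort.get_available_providers())
# enc = ort.InferenceSession(encoder, providers=providers)
# dec = ort.InferenceSession(decoder, providers=providers)
```

Passing an ordered list lets ONNX Runtime try each provider in turn, so the same script can run on a CPU-only machine without changes.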

For the full image → mask pipeline (encoder + decoder + prompt handling), see how AnyLabeling wires it: <https://github.com/vietanhdev/anylabeling>
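
One part of prompt handling is mapping click coordinates from the original image into the encoder's input resolution. The sketch below assumes the exports follow upstream SAM 2, which resizes images directly to a square 1024x1024 input without preserving aspect ratio; confirm the actual size from the encoder input shape printed in the quick start. `scale_points` is a hypothetical helper, not AnyLabeling's implementation.

```python
def scale_points(points, orig_size, input_size=(1024, 1024)):
    """Map (x, y) points from original-image pixels to model-input pixels.

    points:     iterable of (x, y) in the original image
    orig_size:  (height, width) of the original image
    input_size: (height, width) the encoder expects (assumed 1024x1024)
    """
    orig_h, orig_w = orig_size
    in_h, in_w = input_size
    # x and y are scaled independently because the resize is assumed
    # to stretch the image rather than pad it.
    return [(x * in_w / orig_w, y * in_h / orig_h) for x, y in points]


# A click at the centre of a 640x480 image lands at the centre of the input.
centre = scale_points([(320, 240)], orig_size=(480, 640))
print(centre)
```

If an export instead pads the image to preserve aspect ratio, both axes would need the same scale factor, so treat this as a starting point to check against the real pipeline.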

## Use with AnyLabeling

These models drop into AnyLabeling's auto-labeling backend without conversion. See the [AnyLabeling docs](https://github.com/vietanhdev/anylabeling) for the model-config wiring.

## Source weights

Original SAM 2 weights and license: <https://github.com/facebookresearch/sam2>

This repo redistributes the same weights in ONNX format. The license is unchanged from upstream (Apache 2.0).

## Citation

```bibtex
@misc{nguyen2026sam2_onnx,
  author = {Nguyen, Viet-Anh and {Neural Research Lab}},
  title  = {SAM 2 ONNX Models},
  year   = {2026},
  url    = {https://huggingface.co/vietanhdev/segment-anything-2-onnx-models}
}
```

For the underlying model, cite Meta's SAM 2 paper:

```bibtex
@article{ravi2024sam2,
  title   = {SAM 2: Segment Anything in Images and Videos},
  author  = {Ravi, Nikhila and Gabeur, Valentin and Hu, Yuan-Ting and Hu, Ronghang and Ryali, Chaitanya and Ma, Tengyu and Khedr, Haitham and R{\"a}dle, Roman and Rolland, Chloe and Gustafson, Laura and Mintun, Eric and Pan, Junting and Alwala, Kalyan Vasudev and Carion, Nicolas and Wu, Chao-Yuan and Girshick, Ross and Doll{\'a}r, Piotr and Feichtenhofer, Christoph},
  journal = {arXiv preprint arXiv:2408.00714},
  year    = {2024}
}
```

## Acknowledgments

Thanks to Meta AI Research for releasing SAM 2. This repo packages their work for edge inference.