---
license: apache-2.0
pipeline_tag: image-segmentation
library_name: onnx
tags:
- onnxruntime
- onnx
- segment-anything
- segment-anything-2
- image-segmentation
- edge-ai
- anylabeling
authors:
- Viet-Anh Nguyen
---
# Segment Anything 2 (SAM 2) — ONNX Models
ONNX exports of Meta's [SAM 2](https://github.com/facebookresearch/sam2) image-segmentation backbones, packaged for direct use with [`onnxruntime`](https://onnxruntime.ai) and [AnyLabeling](https://github.com/vietanhdev/anylabeling).
## Why this repo exists
Meta reports that SAM 2 is both faster and more accurate than the original SAM on image segmentation, but the official release ships only PyTorch checkpoints. Exporting to ONNX lets you run the models with ONNX Runtime, a portable, dependency-light runtime with bindings for Python, C++, JavaScript, and many embedded targets. These exports are the ones AnyLabeling consumes for its smart-labeling features.
## Variants
Each backbone is provided in two equivalent forms — pick whichever fits your loader:
- **Raw ONNX pair**: `<backbone>.encoder.onnx` + `<backbone>.decoder.onnx`
- **Bundled zip**: `<backbone>.zip` containing both files
| Backbone | Encoder size | Decoder size | Bundle |
|---|---|---|---|
| `sam2_hiera_tiny` | 128 MB | 19.7 MB | `sam2_hiera_tiny.zip` (148 MB) |
| `sam2_hiera_small` | 155 MB | 19.7 MB | `sam2_hiera_small.zip` (175 MB) |
| `sam2_hiera_base_plus` | 324 MB | 19.7 MB | `sam2_hiera_base_plus.zip` (344 MB) |
| `sam2_hiera_large` | 848 MB | 19.7 MB | `sam2_hiera_large.zip` (868 MB) |
`zip_models.py` is the helper script used to produce the bundled zips from the encoder/decoder pairs.
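If you download a bundled zip rather than the raw pair, the two files need to be unpacked before `onnxruntime` can load them. The following is a minimal sketch, assuming the archive members end in `.encoder.onnx` and `.decoder.onnx` as the naming above suggests. `extract_pair` is a hypothetical helper written for illustration; it is not part of this repo or of `zip_models.py`.

```python
import zipfile
from pathlib import Path


def extract_pair(zip_path, out_dir):
    """Extract the encoder/decoder ONNX files from a bundle and return their paths.

    Assumption: members are identified by their `.encoder.onnx` /
    `.decoder.onnx` suffix; adjust if the archive layout differs.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    found = {}
    with zipfile.ZipFile(zip_path) as zf:
        for member in zf.namelist():
            for role in ("encoder", "decoder"):
                if member.endswith(f".{role}.onnx"):
                    # ZipFile.extract returns the path of the extracted file.
                    found[role] = Path(zf.extract(member, out_dir))
    missing = {"encoder", "decoder"} - set(found)
    if missing:
        raise FileNotFoundError(f"bundle is missing: {sorted(missing)}")
    return found["encoder"], found["decoder"]
```

To use it, pass the local path of a downloaded bundle (for example the path returned by `hf_hub_download` for `sam2_hiera_tiny.zip`) as `zip_path`, then hand the two returned paths to `onnxruntime.InferenceSession`.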
## Quick start
```bash
pip install huggingface_hub onnxruntime
```
```python
from huggingface_hub import hf_hub_download
import onnxruntime as ort
repo = "vietanhdev/segment-anything-2-onnx-models"
encoder = hf_hub_download(repo_id=repo, filename="sam2_hiera_tiny.encoder.onnx")
decoder = hf_hub_download(repo_id=repo, filename="sam2_hiera_tiny.decoder.onnx")
enc = ort.InferenceSession(encoder, providers=["CPUExecutionProvider"])
dec = ort.InferenceSession(decoder, providers=["CPUExecutionProvider"])
# Inspect expected inputs:
print("Encoder:", [(i.name, i.shape, i.type) for i in enc.get_inputs()])
print("Decoder:", [(i.name, i.shape, i.type) for i in dec.get_inputs()])
```
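Once you know the input names, shapes, and types, a quick smoke test is to run a session on all-zero tensors and look at the output shapes. The sketch below builds such a feed from the metadata returned by `get_inputs()`. `zero_feed` is a hypothetical helper, and replacing every dynamic axis with the same placeholder size is an assumption that may not satisfy a model whose dynamic axes must agree with each other in some other way.

```python
import numpy as np

# Map onnxruntime type strings to NumPy dtypes (common cases only).
_ORT_TO_NUMPY = {
    "tensor(float)": np.float32,
    "tensor(float16)": np.float16,
    "tensor(double)": np.float64,
    "tensor(int32)": np.int32,
    "tensor(int64)": np.int64,
    "tensor(bool)": np.bool_,
}


def zero_feed(inputs, dynamic_dim=1):
    """Build an all-zeros feed dict from session.get_inputs() metadata."""
    feed = {}
    for arg in inputs:
        # Dynamic axes appear as strings or None; substitute a placeholder size.
        shape = [d if isinstance(d, int) else dynamic_dim for d in arg.shape]
        feed[arg.name] = np.zeros(shape, dtype=_ORT_TO_NUMPY[arg.type])
    return feed
```

With the sessions from the quick start, `enc.run(None, zero_feed(enc.get_inputs()))` returns the encoder outputs as a list of arrays, whose shapes you can compare against the decoder's expected inputs.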
For the full image → mask pipeline (encoder + decoder + prompt handling), see how AnyLabeling wires it: <https://github.com/vietanhdev/anylabeling>
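As a starting point for the pre-processing half of that pipeline, here is a minimal sketch. It assumes these exports expect the same input as upstream SAM 2: a square resize to 1024×1024 without preserving aspect ratio, ImageNet mean/std normalization, and NCHW layout. Those assumptions are not confirmed by this model card, so check them against the encoder's reported input shape and AnyLabeling's code. `preprocess` and `scale_points` are hypothetical helper names.

```python
import numpy as np

# ImageNet statistics, as used by the upstream SAM 2 image transform.
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def preprocess(image_rgb, size=1024):
    """HxWx3 uint8 RGB -> 1x3xSIZExSIZE float32 (aspect ratio not preserved)."""
    h, w = image_rgb.shape[:2]
    # Nearest-neighbour resize via index arrays; swap in cv2 or PIL for
    # better interpolation quality.
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    resized = image_rgb[rows][:, cols].astype(np.float32) / 255.0
    normalized = (resized - MEAN) / STD
    # HWC -> CHW, then add the batch dimension.
    return np.ascontiguousarray(normalized.transpose(2, 0, 1)[None])


def scale_points(points_xy, orig_hw, size=1024):
    """Map (x, y) prompts from original-image pixels to the resized grid."""
    h, w = orig_hw
    pts = np.asarray(points_xy, dtype=np.float32)
    return pts * np.array([size / w, size / h], dtype=np.float32)
```

Because the resize is not aspect-preserving under this assumption, x and y prompt coordinates are scaled by different factors, which is why `scale_points` takes the original height and width separately.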
## Use with AnyLabeling
These models drop into AnyLabeling's auto-labeling backend without conversion. See the [AnyLabeling docs](https://github.com/vietanhdev/anylabeling) for the model-config wiring.
## Source weights
Original SAM 2 weights and license: <https://github.com/facebookresearch/sam2>
This repo redistributes the same weights in ONNX format. The license is unchanged from upstream (Apache 2.0).
## Citation
```bibtex
@misc{nguyen2026sam2_onnx,
author = {Nguyen, Viet-Anh and {Neural Research Lab}},
title = {SAM 2 ONNX Models},
year = {2026},
url = {https://huggingface.co/vietanhdev/segment-anything-2-onnx-models}
}
```
For the underlying model, cite Meta's SAM 2 paper:
```bibtex
@article{ravi2024sam2,
title = {SAM 2: Segment Anything in Images and Videos},
author = {Ravi, Nikhila and Gabeur, Valentin and Hu, Yuan-Ting and Hu, Ronghang and Ryali, Chaitanya and Ma, Tengyu and Khedr, Haitham and R{\"a}dle, Roman and Rolland, Chloe and Gustafson, Laura and Mintun, Eric and Pan, Junting and Alwala, Kalyan Vasudev and Carion, Nicolas and Wu, Chao-Yuan and Girshick, Ross and Doll{\'a}r, Piotr and Feichtenhofer, Christoph},
journal = {arXiv preprint arXiv:2408.00714},
year = {2024}
}
```
## Acknowledgments
Thanks to Meta AI Research for releasing SAM 2. This repo packages their work for edge inference.