---
license: apache-2.0
pipeline_tag: image-segmentation
library_name: onnx
tags:
- onnxruntime
- onnx
- segment-anything
- segment-anything-2
- image-segmentation
- edge-ai
- anylabeling
authors:
- Viet-Anh Nguyen
---

# Segment Anything 2 (SAM 2): ONNX Models

ONNX exports of Meta's [SAM 2](https://github.com/facebookresearch/sam2) image-segmentation backbones, packaged for direct use with [`onnxruntime`](https://onnxruntime.ai) and [AnyLabeling](https://github.com/vietanhdev/anylabeling).

## Why this repo exists

Meta reports SAM 2 as both faster and more accurate than the original SAM on image segmentation, but the official release ships only PyTorch checkpoints. ONNX gives you a portable, dependency-light runtime: ONNX Runtime has bindings for Python, C++, and JavaScript and runs on many embedded targets. These exports are the ones AnyLabeling consumes for its smart-labeling features.

## Variants

Each backbone is provided in two equivalent forms. Pick whichever fits your loader:

- **Raw ONNX pair**: `<backbone>.encoder.onnx` + `<backbone>.decoder.onnx`
- **Bundled zip**: `<backbone>.zip` containing both files

| Backbone | Encoder size | Decoder size | Bundled zip (size) |
|---|---|---|---|
| `sam2_hiera_tiny` | 128 MB | 19.7 MB | `sam2_hiera_tiny.zip` (148 MB) |
| `sam2_hiera_small` | 155 MB | 19.7 MB | `sam2_hiera_small.zip` (175 MB) |
| `sam2_hiera_base_plus` | 324 MB | 19.7 MB | `sam2_hiera_base_plus.zip` (344 MB) |
| `sam2_hiera_large` | 848 MB | 19.7 MB | `sam2_hiera_large.zip` (868 MB) |

`zip_models.py` is the helper script used to produce the bundled zips from the encoder/decoder pairs.
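
If you prefer the bundled form, the sketch below shows one way to fetch a `<backbone>.zip` and recover the encoder/decoder pair. The helper names `extract_bundle` and `download_bundle` are hypothetical, and the internal layout of the archive is an assumption: the files are located by their `.encoder.onnx` / `.decoder.onnx` suffixes instead of by a fixed path.

```python
import zipfile
from pathlib import Path


def extract_bundle(zip_path, out_dir):
    """Extract a <backbone>.zip and return (encoder_path, decoder_path).

    Assumption: the archive holds exactly one *.encoder.onnx and one
    *.decoder.onnx somewhere inside it, so both are found by suffix.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(out_dir)
        names = zf.namelist()

    def find(suffix):
        matches = [n for n in names if n.endswith(suffix)]
        if len(matches) != 1:
            raise FileNotFoundError(
                f"expected exactly one '*{suffix}' in {zip_path}, found {len(matches)}"
            )
        return out_dir / matches[0]

    return find(".encoder.onnx"), find(".decoder.onnx")


def download_bundle(backbone="sam2_hiera_tiny", out_dir="models"):
    """Download <backbone>.zip from this repo and extract it (needs network access)."""
    # Imported lazily so extract_bundle stays usable without huggingface_hub.
    from huggingface_hub import hf_hub_download

    zip_path = hf_hub_download(
        repo_id="vietanhdev/segment-anything-2-onnx-models",
        filename=f"{backbone}.zip",
    )
    return extract_bundle(zip_path, out_dir)
```

Calling `download_bundle("sam2_hiera_small")` would return two paths that can be passed straight to `onnxruntime.InferenceSession`. The raw pair avoids the extraction step; the zip is convenient when a loader expects a single artifact per model.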

## Quick start

```bash
pip install huggingface_hub onnxruntime
```

```python
from huggingface_hub import hf_hub_download
import onnxruntime as ort

repo = "vietanhdev/segment-anything-2-onnx-models"

# Download the encoder/decoder pair; files are cached locally after the first call.
encoder = hf_hub_download(repo_id=repo, filename="sam2_hiera_tiny.encoder.onnx")
decoder = hf_hub_download(repo_id=repo, filename="sam2_hiera_tiny.decoder.onnx")

# Create one CPU session per model.
enc = ort.InferenceSession(encoder, providers=["CPUExecutionProvider"])
dec = ort.InferenceSession(decoder, providers=["CPUExecutionProvider"])

# Inspect the expected input names, shapes, and dtypes:
print("Encoder:", [(i.name, i.shape, i.type) for i in enc.get_inputs()])
print("Decoder:", [(i.name, i.shape, i.type) for i in dec.get_inputs()])
```
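
The quick start pins both sessions to `CPUExecutionProvider`. If a GPU build of ONNX Runtime is installed (for example the `onnxruntime-gpu` package), a small helper can prefer an accelerator and fall back to CPU. This is a minimal sketch: `pick_providers` is a hypothetical helper, and the preference order is an assumption to adjust for your hardware.

```python
def pick_providers(
    available,
    preferred=(
        "CUDAExecutionProvider",
        "CoreMLExecutionProvider",
        "CPUExecutionProvider",
    ),
):
    """Return the preferred providers that are actually available, in order.

    CPUExecutionProvider is appended as a last resort so that session
    creation always has a fallback.
    """
    chosen = [p for p in preferred if p in available]
    if "CPUExecutionProvider" not in chosen:
        chosen.append("CPUExecutionProvider")
    return chosen


# Usage with onnxruntime (not executed here):
#   import onnxruntime as ort
#   providers = pick_providers(ort.get_available_providers())
#   enc = ort.InferenceSession(encoder, providers=providers)
#   dec = ort.InferenceSession(decoder, providers=providers)
```

Whether these particular exports run correctly on every execution provider has not been verified here, so compare the output against the CPU result before relying on an accelerator.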

For the full image → mask pipeline (encoder, decoder, and prompt handling), see how AnyLabeling wires these models together: <https://github.com/vietanhdev/anylabeling>
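
One piece of prompt handling is easy to get wrong: the encoder sees a resized image, so click coordinates given in original-image pixels have to be mapped to the encoder's input resolution before they reach the decoder. The sketch below assumes the image is stretched to the input size without padding and uses 1024x1024 as an assumed default. Read the real size from `enc.get_inputs()[0].shape`, and check AnyLabeling's code for the exact convention these exports expect. `scale_points` is a hypothetical helper name.

```python
def scale_points(points, orig_size, input_size=(1024, 1024)):
    """Map (x, y) prompt points from original-image pixels to encoder-input pixels.

    orig_size and input_size are (height, width) tuples.
    Assumption: the image is stretched to input_size with no padding,
    so x and y are scaled independently.
    """
    orig_h, orig_w = orig_size
    in_h, in_w = input_size
    return [(x * in_w / orig_w, y * in_h / orig_h) for x, y in points]


# A click at the centre of a 1920x1080 frame lands at the centre of the
# 1024x1024 encoder input:
print(scale_points([(960, 540)], orig_size=(1080, 1920)))  # [(512.0, 512.0)]
```

The same ratios, inverted, map a predicted mask back to the original resolution.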

## Use with AnyLabeling

These models drop into AnyLabeling's auto-labeling backend without conversion. See the [AnyLabeling docs](https://github.com/vietanhdev/anylabeling) for the model-config wiring.

## Source weights

Original SAM 2 weights and license: <https://github.com/facebookresearch/sam2>

This repo redistributes the same weights in ONNX format. The license is unchanged from upstream (Apache 2.0).

## Citation

```bibtex
@misc{nguyen2026sam2_onnx,
  author = {Nguyen, Viet-Anh and {Neural Research Lab}},
  title  = {SAM 2 ONNX Models},
  year   = {2026},
  url    = {https://huggingface.co/vietanhdev/segment-anything-2-onnx-models}
}
```

For the underlying model, cite Meta's SAM 2 paper:

```bibtex
@article{ravi2024sam2,
  title   = {SAM 2: Segment Anything in Images and Videos},
  author  = {Ravi, Nikhila and Gabeur, Valentin and Hu, Yuan-Ting and Hu, Ronghang and Ryali, Chaitanya and Ma, Tengyu and Khedr, Haitham and R{\"a}dle, Roman and Rolland, Chloe and Gustafson, Laura and Mintun, Eric and Pan, Junting and Alwala, Kalyan Vasudev and Carion, Nicolas and Wu, Chao-Yuan and Girshick, Ross and Doll{\'a}r, Piotr and Feichtenhofer, Christoph},
  journal = {arXiv preprint arXiv:2408.00714},
  year    = {2024}
}
```

## Acknowledgments

Thanks to Meta AI Research for releasing SAM 2. This repo packages their work for edge inference.