Web-Petals SmolLM2-1.7B ONNX Layers
SmolLM2-1.7B-Instruct, split into one ONNX file per transformer layer (plus separate embedding and LM-head files) for distributed P2P inference.
Files
- embeds.onnx - Token embedding table (~384 MB)
- layer_0.onnx ... layer_23.onnx - 24 transformer blocks (~256 MB each)
- lm_head.onnx - Final LayerNorm + LM head (~384 MB)
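As a rough illustration of how these files could be divided among peers, the sketch below lists them in forward-pass order and splits the 24 layer files into contiguous chunks. The helper names `pipeline_files` and `assign_layers` are hypothetical, and the even contiguous split is an assumption; this card does not describe how Web-Petals actually assigns layers.

```python
NUM_LAYERS = 24  # layer_0.onnx ... layer_23.onnx


def pipeline_files(num_layers: int = NUM_LAYERS) -> list[str]:
    """Return all ONNX files in the order a forward pass would visit them."""
    layers = [f"layer_{i}.onnx" for i in range(num_layers)]
    return ["embeds.onnx", *layers, "lm_head.onnx"]


def assign_layers(num_peers: int, num_layers: int = NUM_LAYERS) -> list[list[str]]:
    """Split the layer files into contiguous, near-equal chunks, one per peer.

    Hypothetical scheme: earlier peers take one extra layer when the
    layer count does not divide evenly.
    """
    if not 1 <= num_peers <= num_layers:
        raise ValueError("num_peers must be between 1 and num_layers")
    base, extra = divmod(num_layers, num_peers)
    chunks: list[list[str]] = []
    start = 0
    for peer in range(num_peers):
        size = base + (1 if peer < extra else 0)
        chunks.append([f"layer_{i}.onnx" for i in range(start, start + size)])
        start += size
    return chunks


if __name__ == "__main__":
    for peer, files in enumerate(assign_layers(5)):
        print(f"peer {peer}: {files[0]} .. {files[-1]} ({len(files)} files)")
```

Contiguous chunks keep the number of network hops per token equal to the number of peers, since each peer only hands its output to the next one in the chain.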
Architecture
- Hidden size: 2048
- Layers: 24
- Attention heads: 32
- FFN size: 8192
- Vocab size: 49152
- Total: ~6.75 GB (FP32), i.e. 384 MB + 24 × 256 MB + 384 MB
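The file sizes listed on this card can be reproduced from these numbers. The sketch below is a back-of-the-envelope check, not an inspection of the ONNX files: it assumes a Llama-style block (four bias-free attention projections and a gated MLP with three weight matrices), ignores the small normalization weights, and treats MB/GB as binary units (MiB/GiB).

```python
HIDDEN = 2048
LAYERS = 24
FFN = 8192
VOCAB = 49152
BYTES_PER_PARAM = 4  # FP32
MIB = 1024 ** 2

# Embedding table and LM head are each a VOCAB x HIDDEN matrix.
embed_params = VOCAB * HIDDEN

# Assumption: Q, K, V and output projections, each HIDDEN x HIDDEN, no biases.
attn_params = 4 * HIDDEN * HIDDEN

# Assumption: gated MLP with gate, up and down projections.
ffn_params = 3 * HIDDEN * FFN

layer_params = attn_params + ffn_params  # norm weights ignored

embed_mib = embed_params * BYTES_PER_PARAM / MIB
layer_mib = layer_params * BYTES_PER_PARAM / MIB
total_gib = (2 * embed_mib + LAYERS * layer_mib) / 1024

if __name__ == "__main__":
    print(f"embeds / lm_head: {embed_mib:.0f} MiB each")
    print(f"per layer:        {layer_mib:.0f} MiB")
    print(f"total:            {total_gib:.2f} GiB")
```

Under those assumptions the arithmetic gives 384 MiB for the embedding table and the LM head, 256 MiB per layer, and 6.75 GiB overall, matching the figures above.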
License
Apache 2.0 (same as SmolLM2)