powerpudu/Web-Petals-SmolLM2-1.7B

Web-Petals SmolLM2-1.7B ONNX Layers

SmolLM2-1.7B-Instruct, split into one ONNX file per transformer layer (plus separate embedding and LM head files) for distributed peer-to-peer (P2P) inference.

Files

embeds.onnx - Token embedding table (~384 MB)
layer_0.onnx ... layer_23.onnx - 24 transformer blocks, one per file (~256 MB each)
lm_head.onnx - Final norm (RMSNorm in SmolLM2) + LM head (~384 MB)
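
Because each block is its own file, a swarm can hand every peer a contiguous run of layers. The sketch below is only an illustration of that idea: `pipeline_files` and `assign_layers` are hypothetical helpers, not part of this repository, and the near-equal split is an assumed policy rather than the scheme Web-Petals actually uses.

```python
from typing import List

NUM_LAYERS = 24  # layer_0.onnx ... layer_23.onnx, per the file list above


def pipeline_files() -> List[str]:
    """Return all file names in the order activations would flow through them."""
    layers = [f"layer_{i}.onnx" for i in range(NUM_LAYERS)]
    return ["embeds.onnx", *layers, "lm_head.onnx"]


def assign_layers(num_peers: int) -> List[List[str]]:
    """Split the 24 layer files into contiguous, near-equal chunks, one per peer.

    Hypothetical policy: the first (24 % num_peers) peers get one extra layer.
    """
    if not 1 <= num_peers <= NUM_LAYERS:
        raise ValueError(f"num_peers must be between 1 and {NUM_LAYERS}")
    base, extra = divmod(NUM_LAYERS, num_peers)
    chunks: List[List[str]] = []
    start = 0
    for peer in range(num_peers):
        size = base + (1 if peer < extra else 0)
        chunks.append([f"layer_{i}.onnx" for i in range(start, start + size)])
        start += size
    return chunks


if __name__ == "__main__":
    # Example: five peers sharing the 24 blocks.
    for peer, files in enumerate(assign_layers(5)):
        print(f"peer {peer}: {files[0]} -> {files[-1]} ({len(files)} layers)")
```

Keeping each peer's layers contiguous means hidden states cross the network only once per peer boundary, which is the usual reason for this kind of split.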

Architecture

Hidden size: 2048
Layers: 24
Attention heads: 32
FFN size: 8192
Vocab size: 49152
Total size on disk: ~6.75 GB (FP32 weights, all 26 files)
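
The file sizes follow from the dimensions above. The sketch below reproduces them under stated assumptions: four square attention projections (Q, K, V, O) per block, a gated MLP with three hidden-by-FFN matrices, no bias terms, and two norm vectors per block. ONNX graph overhead is ignored, and the listed figures match when MB and GB are read as binary units (MiB, GiB).

```python
HIDDEN = 2048
LAYERS = 24
FFN = 8192
VOCAB = 49152
BYTES_PER_PARAM = 4  # FP32

MIB = 1024 ** 2
GIB = 1024 ** 3

# embeds.onnx: one row of HIDDEN floats per vocabulary entry.
embed_params = VOCAB * HIDDEN

# One transformer block (assumed layout, see the lead-in above).
attn_params = 4 * HIDDEN * HIDDEN   # Q, K, V, O projections, all HIDDEN x HIDDEN
mlp_params = 3 * HIDDEN * FFN       # gate, up, and down projections
norm_params = 2 * HIDDEN            # pre-attention and pre-MLP norm weights
layer_params = attn_params + mlp_params + norm_params

# lm_head.onnx: final norm weights plus a HIDDEN x VOCAB output projection.
head_params = HIDDEN + HIDDEN * VOCAB

embed_mib = embed_params * BYTES_PER_PARAM / MIB
layer_mib = layer_params * BYTES_PER_PARAM / MIB
head_mib = head_params * BYTES_PER_PARAM / MIB

total_params = embed_params + LAYERS * layer_params + head_params
total_gib = total_params * BYTES_PER_PARAM / GIB

if __name__ == "__main__":
    print(f"embeds.onnx : {embed_mib:.1f} MiB")
    print(f"layer_N.onnx: {layer_mib:.1f} MiB each")
    print(f"lm_head.onnx: {head_mib:.1f} MiB")
    print(f"total       : {total_gib:.2f} GiB")
```

Note that the total counts the embedding-sized matrix twice, once in `embeds.onnx` and once in `lm_head.onnx`, which is why the on-disk size exceeds what 1.7B FP32 parameters alone would occupy.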

License

Apache 2.0 (same as SmolLM2)
