Web-Petals SmolLM2-1.7B ONNX Layers

SmolLM2-1.7B-Instruct, split into one ONNX file per transformer layer for distributed peer-to-peer (P2P) inference.

Files

  • embeds.onnx - Token embedding table (~384 MB)
  • layer_0.onnx ... layer_23.onnx - 24 transformer blocks (~256 MB each)
  • lm_head.onnx - Final norm (RMSNorm) + LM head (~384 MB)
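
As a rough illustration of how these files might be spread across peers, the sketch below lists them in forward-pass order and splits the 24 layer files into contiguous, near-equal ranges. The helper names `pipeline_files` and `assign_layers` are hypothetical, and the contiguous-range policy is an assumption; this README does not describe how Web-Petals actually schedules layers.

```python
def pipeline_files(num_layers: int = 24) -> list[str]:
    """Return the ONNX files in the order a forward pass would visit them."""
    layers = [f"layer_{i}.onnx" for i in range(num_layers)]
    return ["embeds.onnx", *layers, "lm_head.onnx"]


def assign_layers(num_peers: int, num_layers: int = 24) -> list[list[str]]:
    """Split the layer files into contiguous, near-equal ranges, one per peer.

    Hypothetical policy: the first (num_layers % num_peers) peers get one
    extra layer each, so range sizes differ by at most one.
    """
    if not 1 <= num_peers <= num_layers:
        raise ValueError("num_peers must be between 1 and num_layers")
    base, extra = divmod(num_layers, num_peers)
    assignments, start = [], 0
    for peer in range(num_peers):
        count = base + (1 if peer < extra else 0)
        assignments.append([f"layer_{i}.onnx" for i in range(start, start + count)])
        start += count
    return assignments


if __name__ == "__main__":
    for peer, files in enumerate(assign_layers(5)):
        print(f"peer {peer}: {files[0]} .. {files[-1]} ({len(files)} files)")
```

With five peers this yields four ranges of five layers and one of four. embeds.onnx and lm_head.onnx are left out of the split on purpose, since which peer hosts them is a separate decision.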

Architecture

  • Hidden size: 2048
  • Layers: 24
  • Attention heads: 32
  • FFN size: 8192
  • Vocab size: 49152
  • Total size: ~6.75 GB (FP32, all 26 files)
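
The file sizes can be roughly reproduced from these numbers. The sketch below is a back-of-the-envelope check, not the export script: it assumes a LLaMA-style block with four bias-free hidden-by-hidden attention projections (Q, K, V, O) and a gated FFN with three projections (gate, up, down), ignores the small norm weights, and reads MB and GB as binary units (MiB, GiB).

```python
HIDDEN = 2048
LAYERS = 24
FFN = 8192
VOCAB = 49152
BYTES_PER_PARAM = 4  # FP32
MIB = 1024 ** 2

# Assumption: Q, K, V and O projections are each HIDDEN x HIDDEN, no biases.
attn_params = 4 * HIDDEN * HIDDEN
# Assumption: gated FFN with gate, up and down projections, no biases.
ffn_params = 3 * HIDDEN * FFN

layer_mib = (attn_params + ffn_params) * BYTES_PER_PARAM / MIB
embed_mib = VOCAB * HIDDEN * BYTES_PER_PARAM / MIB
# embeds.onnx and lm_head.onnx each hold one VOCAB x HIDDEN matrix.
total_gib = (2 * embed_mib + LAYERS * layer_mib) / 1024

print(f"per layer:  {layer_mib:.0f} MiB")   # 256 MiB
print(f"embeddings: {embed_mib:.0f} MiB")   # 384 MiB
print(f"total:      {total_gib:.2f} GiB")   # 6.75 GiB
```

Under these assumptions the arithmetic matches the listed figures exactly, which also suggests that the embedding matrix is stored twice, once in embeds.onnx and once in lm_head.onnx.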

License

Apache 2.0 (same as SmolLM2)
