Pangu Ultra MoE: How to Train Your Big MoE on Ascend NPUs • Paper 2505.04519 • Published May 7, 2025
Qwen3 Collection • Qwen's new Qwen3 models, in Unsloth Dynamic 2.0, GGUF, 4-bit and 16-bit Safetensor formats. Includes 128K Context Length variants. • 70 items