Jackrong/Qwen3.5-4B-Claude-4.6-Opus-Reasoning-Distilled • Text Generation • 5B • Updated 6 days ago • 1.38k • 9
Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled • Text Generation • 28B • Updated 5 days ago • 53.2k • 518
Article A Review on the Evolvement of Load Balancing Strategy in MoE LLMs: Pitfalls and Lessons Feb 4, 2025 • 31
LLMs Can Easily Learn to Reason from Demonstrations Structure, not content, is what matters! Paper • 2502.07374 • Published Feb 11, 2025 • 40