DFlare: Scaling Up Draft Capacity for Block Diffusion Speculative Decoding • Paper • 2606.02091 • Published 17 days ago • 1
A Comprehensive Survey on Long Context Language Modeling • Paper • 2503.17407 • Published Mar 20, 2025 • 49
MPO: Boosting LLM Agents with Meta Plan Optimization • Paper • 2503.02682 • Published Mar 4, 2025 • 29
The Ultra-Scale Playbook 🌌 • Running • 3.89k • The ultimate guide to training LLMs on large GPU clusters