Revisiting On-Policy Distillation: Empirical Failure Modes and Simple Fixes — Paper • 2603.25562 • Published 7 days ago
The Ultra-Scale Playbook 🌌 — Space • The ultimate guide to training LLMs on large GPU clusters
MemEvolve: Meta-Evolution of Agent Memory Systems — Paper • 2512.18746 • Published Dec 21, 2025
Scaling Latent Reasoning via Looped Language Models — Paper • 2510.25741 • Published Oct 29, 2025
The Smol Training Playbook 📚 — Space • The secrets to building world-class LLMs
⚡ nano-vLLM: Lightweight, Low-Latency LLM Inference from Scratch — Article • Published Jun 28, 2025
Inference-Time Scaling for Generalist Reward Modeling — Paper • 2504.02495 • Published Apr 3, 2025