Running 28 Weight-Space Geometry of Offline Reasoning Training 🧠 28 Interactive weight-space geometry of six reasoning losses
GradMem: Learning to Write Context into Memory with Test-Time Gradient Descent Paper • 2603.13875 • Published Mar 14 • 36
Running on CPU Upgrade 263 The Synthetic Data Playbook: Generating Trillions of the Finest Tokens 📝 263 Visualize synthetic‑data experiments as an interactive bookshelf
SWE-rebench V2: Language-Agnostic SWE Task Collection at Scale Paper • 2602.23866 • Published Feb 27 • 91
Article Architectural Choices in China's Open-Source AI Ecosystem: Building Beyond DeepSeek huggingface • Jan 27 • 45
Article From Zero to GPU: A Guide to Building and Scaling Production-Ready CUDA Kernels drbh, danieldk • Aug 18, 2025 • 104
unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF Text Generation • 31B • Updated Jan 30 • 245k • 766
Running 3.91k The Ultra-Scale Playbook 🌌 3.91k The ultimate guide to training LLM on large GPU Clusters