nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B-BF16 Text Generation • 561B • Updated 1 day ago • 49.8k • 154
Article Task-Seeded Synthetic Q&A Generation for Nemotron Pretraining nvidia • 3 days ago • 12
Article Harness, Scaffold, and the AI Agent Terms Worth Getting Right sergiopaniego, ariG23498 • 14 days ago • 101
Running on CPU Upgrade • Featured • ML Intern • 394 • Ask ML questions and receive instant helpful answers
The ATOM Report: Measuring the Open Language Model Ecosystem Paper • 2604.07190 • Published Apr 8 • 5
Running • Featured • Distilling 100B+ Models 40x Faster with TRL • 85 • TRL distillation for 100B+ teachers, 40x faster
Article Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries aminediroHF, qgallouedec, kashif, lewtun, edbeeching, albertvillanova, nouamanetazi, lvwerra, sergiopaniego • Mar 10 • 160
Article Ulysses Sequence Parallelism: Training with Million-Token Contexts kashif, stas • Mar 9 • 30
Running on CPU Upgrade • The Synthetic Data Playbook: Generating Trillions of the Finest Tokens • 245 • Explore synthetic data benchmarks via an interactive bookshelf