Zone of Proximal Policy Optimization: Teacher in Prompts, Not Gradients Paper • 2606.18216 • Published 7 days ago • 57
Agent Explorative Policy Optimization for Multimodal Agentic Reasoning Paper • 2605.28774 • Published 27 days ago • 93
Article Towards Speed-of-Light Text Generation with Nemotron-Labs Diffusion Language Models nvidia • about 1 month ago • 34
Nemotron-Labs-Diffusion Collection A Tri-Mode Language Model Family Unifying Autoregressive, Diffusion, and Self-Speculation Decoding • 7 items • Updated 11 days ago • 49