ProRL Agent: Rollout-as-a-Service for RL Training of Multi-Turn LLM Agents Paper • 2603.18815 • Published 5 days ago • 10
Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text Paper • 2601.22975 • Published Jan 30 • 110