TACO: Tool-Augmented Credit Optimization for Agentic Tool Use • Paper • 2606.30251 • Published 4 days ago • 17
Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR • Paper • 2508.14029 • Published Aug 19, 2025 • 119