Efficient RLVR Training via Weighted Mutual Information Data Selection Paper • 2603.01907 • Published 5 days ago • 14
Thinking in Frames: How Visual Context and Test-Time Scaling Empower Video Reasoning Paper • 2601.21037 • Published Jan 28 • 15
Agentic Policy Optimization via Instruction-Policy Co-Evolution Paper • 2512.01945 • Published Dec 1, 2025 • 4
A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems Paper • 2508.07407 • Published Aug 10, 2025 • 98