FORT-Searcher: Synthesizing Shortcut-Resistant Search Tasks for Training Deep Search Agents Paper • 2606.12087 • Published 4 days ago • 71
ClawGym: A Scalable Framework for Building Effective Claw Agents Paper • 2604.26904 • Published Apr 29 • 51
Challenging the Boundaries of Reasoning: An Olympiad-Level Math Benchmark for Large Language Models Paper • 2503.21380 • Published Mar 27, 2025 • 38