CaveAgent: Transforming LLMs into Stateful Runtime Operators • Paper • 2601.01569 • Published Jan 4 • 20
Online Causal Kalman Filtering for Stable and Effective Policy Optimization • Paper • 2602.10609 • Published 24 days ago • 17
Dr. MAS: Stable Reinforcement Learning for Multi-Agent LLM Systems • Paper • 2602.08847 • Published 26 days ago • 26
AgentOCR: Reimagining Agent History via Optical Self-Compression • Paper • 2601.04786 • Published Jan 8 • 30
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey • Paper • 2509.02547 • Published Sep 2, 2025 • 233
TimeMaster: Training Time-Series Multimodal LLMs to Reason via Reinforcement Learning • Paper • 2506.13705 • Published Jun 16, 2025 • 2
verl-agent • Collection • Open-source models trained via GiGPO and verl-agent • 3 items • Updated 5 days ago • 2
Group-in-Group Policy Optimization for LLM Agent Training • Paper • 2505.10978 • Published May 16, 2025 • 20
Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models • Paper • 2505.10554 • Published May 15, 2025 • 120