FinToolBench: Evaluating LLM Agents for Real-World Financial Tool Use Paper • 2603.08262 • Published 11 days ago • 34
MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild Paper • 2603.17187 • Published 3 days ago • 106
OpenSeeker: Democratizing Frontier Search Agents by Fully Open-Sourcing Training Data Paper • 2603.15594 • Published 4 days ago • 137
MEMO: Memory-Augmented Model Context Optimization for Robust Multi-Turn Multi-Agent LLM Games Paper • 2603.09022 • Published 11 days ago • 21
BenchPreS: A Benchmark for Context-Aware Personalized Preference Selectivity of Persistent-Memory LLMs Paper • 2603.16557 • Published 3 days ago • 17
Understanding Reasoning in LLMs through Strategic Information Allocation under Uncertainty Paper • 2603.15500 • Published 4 days ago • 11
Code-A1: Adversarial Evolving of Code LLM and Test LLM via Reinforcement Learning Paper • 2603.15611 • Published 4 days ago • 10
One-Eval: An Agentic System for Automated and Traceable LLM Evaluation Paper • 2603.09821 • Published 10 days ago • 10
Supervised Fine-Tuning versus Reinforcement Learning: A Study of Post-Training Methods for Large Language Models Paper • 2603.13985 • Published 6 days ago • 9
Steve-Evolving: Open-World Embodied Self-Evolution via Fine-Grained Diagnosis and Dual-Track Knowledge Distillation Paper • 2603.13131 • Published 7 days ago • 6
Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections Paper • 2603.12180 • Published 8 days ago • 62
GradMem: Learning to Write Context into Memory with Test-Time Gradient Descent Paper • 2603.13875 • Published 6 days ago • 27