AgenticSTS: A Bounded-Memory Testbed for Long-Horizon LLM Agents Paper • 2607.02255 • Published 2 days ago • 39 • 2
ELDR: Expert-Locality-Aware Decode Routing for PD-Disaggregated MoE Serving Paper • 2607.00466 • Published 3 days ago • 23 • 3
Managing Procedural Memory in LLM Agents: Control, Adaptation, and Evaluation Paper • 2606.23127 • Published 12 days ago • 20 • 3
Agentic Abstention: Do Agents Know When to Stop Instead of Act? Paper • 2606.28733 • Published 7 days ago • 140 • 9
Thinking While Speaking: Inference-Time Knowledge Transfer for Responsive and Intelligent Conversational Voice Agents Paper • 2511.07397 • Published 11 days ago • 11 • 3
When Does Combining Language Models Help? A Co-Failure Ceiling on Routing, Voting, and Mixture-of-Agents Across 67 Frontier Models Paper • 2606.27288 • Published 9 days ago • 4 • 3
Why Multi-Step Tool-Use Reinforcement Learning Collapses and How Supervisory Signals Fix It Paper • 2606.26027 • Published 10 days ago • 18 • 3
The Verification Horizon: No Silver Bullet for Coding Agent Rewards Paper • 2606.26300 • Published 10 days ago • 47 • 4
Constraint Tax in Open-Weight LLMs: An Empirical Study of Tool Calling Suppression Under Structured Output Constraints Paper • 2606.25605 • Published 10 days ago • 3 • 3
Escaping the Self-Confirmation Trap: An Execute-Distill-Verify Paradigm for Agentic Experience Learning Paper • 2606.24428 • Published 11 days ago • 52 • 3
Deeper is Not Always Better: Mitigating the Alignment Tax via Confident Layer Decoding Paper • 2606.21906 • Published 14 days ago • 24 • 13
Sleeping Agents FEST-Style Few-Shot RL for Reasoning 🧠 Solve math problems with step‑by‑step reasoning