AgenticSTS: A Bounded-Memory Testbed for Long-Horizon LLM Agents • Paper • 2607.02255 • Published 1 day ago • 38
Seed2.0 Model Card: Towards Intelligence Frontier for Real-World Complexity • Paper • 2607.00248 • Published 4 days ago • 22
ACE-Ego-0: Unifying Egocentric Human and Robotic Data for VLA Pretraining • Paper • 2606.17200 • Published 19 days ago • 54
LoopCoder-v2: Only Loop Once for Efficient Test-Time Computation Scaling • Paper • 2606.18023 • Published 18 days ago • 209
Hy-Embodied-0.5-VLA: From Vision-Language-Action Models to a Real-World Robot Learning Stack • Paper • 2606.14409 • Published 22 days ago • 15
WBench • Collection • WBench: A Comprehensive Multi-turn Benchmark for Interactive Video World Model Evaluation • 4 items • Updated Jun 1 • 4
YoCausal: How Far is Video Generation from World Model? A Causality Perspective • Paper • 2605.30346 • Published May 28 • 55
VitaBench 2.0: Evaluating Personalized and Proactive Agents in Long-Term User Interactions • Paper • 2605.27141 • Published May 26 • 20
EMO: Earth Mover Distance Optimization for Auto-Regressive Language Modeling • Paper • 2310.04691 • Published Oct 7, 2023 • 3
LocateAnything: Fast and High-Quality Vision-Language Grounding with Parallel Box Decoding • Paper • 2605.27365 • Published May 26 • 145
WBench: A Comprehensive Multi-turn Benchmark for Interactive Video World Model Evaluation • Paper • 2605.25874 • Published May 25 • 103
MiniCPM-o 4.5: Towards Real-Time Full-Duplex Omni-Modal Interaction • Paper • 2604.27393 • Published Apr 30 • 81
HumanNet: Scaling Human-centric Video Learning to One Million Hours • Paper • 2605.06747 • Published May 7 • 55
A Benchmark for Interactive World Models with a Unified Action Generation Framework • Paper • 2605.03941 • Published May 5 • 5