Foresight: Failure Detection for Long-Horizon Robotic Manipulation with Action-Conditioned World Model Latents Paper • 2606.23085 • Published 7 days ago • 14
Escaping the Self-Confirmation Trap: An Execute-Distill-Verify Paradigm for Agentic Experience Learning Paper • 2606.24428 • Published 6 days ago • 51
Qwen-AgentWorld: Language World Models for General Agents Paper • 2606.24597 • Published 6 days ago • 136
AOHP: An Open-Source OS-Level Agent Harness for Personalized, Efficient and Secure Interaction Paper • 2606.23449 • Published 7 days ago • 30
Go-with-the-Track: Video Compositing and Motion Control with Point Tracking Paper • 2606.20891 • Published 11 days ago • 3
DataClaw0: Agentic Tailoring Multimodal Data from Raw Streams Paper • 2606.21337 • Published 10 days ago • 73
Grouped Query Experts: Mixture-of-Experts on GQA Self-Attention Paper • 2606.20945 • Published 11 days ago • 75
PAIWorld: A 3D-Consistent World Foundation Model for Robotic Manipulation Paper • 2606.18375 • Published 13 days ago • 11
MaineCoon: Pursuing A Real-Time Audio-Visual Social World Model Paper • 2606.17800 • Published 13 days ago • 13
OmniDirector: General Multi-Shot Camera Cloning without Cross-Paired Data Paper • 2606.13432 • Published 18 days ago • 112
Avatar V: Scaling Video-Reference Avatar Video Generation Paper • 2606.13872 • Published 18 days ago • 9
Qwen-RobotWorld Technical Report: Unifying Embodied World Modeling through Language-Conditioned Video Generation Paper • 2606.17030 • Published 14 days ago • 30
PermaVid: Consistent Video Generation Across Edits via Disentangled Context Memory Paper • 2606.16449 • Published 14 days ago • 5