WBench: A Comprehensive Multi-turn Benchmark for Interactive Video World Model Evaluation Paper • 2605.25874 • Published 22 days ago • 102
SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-unify Architecture Paper • 2605.12500 • Published May 12 • 191
OpenMobile: Building Open Mobile Agents with Task and Trajectory Synthesis Paper • 2604.15093 • Published Apr 16 • 30
Running Agents 1 Mobile Agent Trajectory Viewer 📱 1 Explore mobile app interaction trajectories with visual UI
On Data Engineering for Scaling LLM Terminal Capabilities Paper • 2602.21193 • Published Feb 24 • 103
OdysseyArena: Benchmarking Large Language Models For Long-Horizon, Active and Inductive Interactions Paper • 2602.05843 • Published Feb 5 • 61
TIDE: Trajectory-based Diagnostic Evaluation of Test-Time Improvement in LLM Agents Paper • 2602.02196 • Published Feb 2 • 35
OS-Symphony: A Holistic Framework for Robust and Generalist Computer-Using Agent Paper • 2601.07779 • Published Jan 12 • 28
SenseNova-MARS: Empowering Multimodal Agentic Reasoning and Search via Reinforcement Learning Paper • 2512.24330 • Published Dec 30, 2025 • 36
PaCo-RL: Advancing Reinforcement Learning for Consistent Image Generation with Pairwise Reward Modeling Paper • 2512.04784 • Published Dec 2, 2025 • 25
OS-Sentinel: Towards Safety-Enhanced Mobile GUI Agents via Hybrid Validation in Realistic Workflows Paper • 2510.24411 • Published Oct 28, 2025 • 73