X-Stream: Exploring MLLMs as Multiplexers for Multi-Stream Understanding Paper • 2606.02482 • Published 6 days ago • 34
SEFE: Superficial and Essential Forgetting Eliminator for Multimodal Continual Instruction Tuning Paper • 2505.02486 • Published May 5, 2025
KG-RAG: Enhancing GUI Agent Decision-Making via Knowledge Graph-Driven Retrieval-Augmented Generation Paper • 2509.00366 • Published Aug 30, 2025
CoVe: Training Interactive Tool-Use Agents via Constraint-Guided Verification Paper • 2603.01940 • Published Mar 2 • 24
PhoStream: Benchmarking Real-World Streaming for Omnimodal Assistants in Mobile Scenarios Paper • 2601.22575 • Published Jan 30 • 1
UI-KOBE: Knowledge-Oriented Behavior Exploration for Lightweight Graph-Guided GUI Agents Paper • 2605.29534 • Published 10 days ago • 15
OmniInteract: Benchmarking Real-World Streaming Interaction for Real-Time Omnimodal Assistants Paper • 2605.26485 • Published 12 days ago • 3
AURA: Always-On Understanding and Real-Time Assistance via Video Streams Paper • 2604.04184 • Published Apr 5 • 51