Why Far Looks Up: Probing Spatial Representation in Vision-Language Models Paper • 2605.30161 • Published 12 days ago • 60
Personalize-then-Store: Benchmarking and Learning Personalized Memory for Long-horizon Agents Paper • 2605.25535 • Published 15 days ago • 41
M3DocRAG: Multi-modal Retrieval is What You Need for Multi-page Multi-document Understanding Paper • 2411.04952 • Published Nov 7, 2024 • 29