M^3Eval: Multi-Modal Memory Evaluation through Cognitively-Grounded Video Tasks • Paper • 2606.05008 • Published 5 days ago • 26
SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding • Paper • 2401.09340 • Published Jan 17, 2024 • 21