Mindstorms in Natural Language-Based Societies of Mind Paper • 2305.17066 • Published May 26, 2023 • 3
ColorMAE: Exploring data-independent masking strategies in Masked AutoEncoders Paper • 2407.13036 • Published Jul 17, 2024 • 4
Mixture of States: Routing Token-Level Dynamics for Multimodal Generation Paper • 2511.12207 • Published Nov 15, 2025 • 10
End-to-End Temporal Action Detection with 1B Parameters Across 1000 Frames Paper • 2311.17241 • Published Nov 28, 2023
Re^2TAL: Rewiring Pretrained Video Backbones for Reversible Temporal Action Localization Paper • 2211.14053 • Published Nov 25, 2022
CAPTAIN: Semantic Feature Injection for Memorization Mitigation in Text-to-Image Diffusion Models Paper • 2512.10655 • Published Dec 11, 2025 • 10
BOLT: Boost Large Vision-Language Model Without Training for Long-form Video Understanding Paper • 2503.21483 • Published Mar 27, 2025 • 1
OpenTAD: A Unified Framework and Comprehensive Study of Temporal Action Detection Paper • 2502.20361 • Published Feb 27, 2025 • 1