Towards Universal Video MLLMs with Attribute-Structured and Quality-Verified Instructions Paper • 2602.13013 • Published 4 days ago • 7
Towards Universal Video MLLMs with Attribute-Structured and Quality-Verified Instructions Paper • 2602.13013 • Published 4 days ago • 7
MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization Paper • 2510.08540 • Published Oct 9, 2025 • 109
TempSamp-R1: Effective Temporal Sampling with Reinforcement Fine-Tuning for Video LLMs Paper • 2509.18056 • Published Sep 22, 2025 • 27
TempSamp-R1: Effective Temporal Sampling with Reinforcement Fine-Tuning for Video LLMs Paper • 2509.18056 • Published Sep 22, 2025 • 27
TempSamp-R1: Effective Temporal Sampling with Reinforcement Fine-Tuning for Video LLMs Paper • 2509.18056 • Published Sep 22, 2025 • 27 • 3
Cascade-CLIP: Cascaded Vision-Language Embeddings Alignment for Zero-Shot Semantic Segmentation Paper • 2406.00670 • Published Jun 2, 2024
Unbiased Region-Language Alignment for Open-Vocabulary Dense Prediction Paper • 2412.06244 • Published Dec 9, 2024
A Glimpse to Compress: Dynamic Visual Token Pruning for Large Vision-Language Models Paper • 2508.01548 • Published Aug 3, 2025 • 14
Revisiting Efficient Semantic Segmentation: Learning Offsets for Better Spatial and Class Feature Alignment Paper • 2508.08811 • Published Aug 12, 2025 • 2
A Glimpse to Compress: Dynamic Visual Token Pruning for Large Vision-Language Models Paper • 2508.01548 • Published Aug 3, 2025 • 14