Submitted by akhaliq 28 Video-LLaVA: Learning United Visual Representation by Alignment Before Projection · 6 authors 3.49k 4
Submitted by akhaliq 25 Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning · 10 authors 3
Submitted by akhaliq 19 Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2 · 11 authors 5
Submitted by akhaliq 8 VideoCon: Robust Video-Language Alignment via Contrast Captions · 5 authors 58
Submitted by akhaliq 6 UnifiedVisionGPT: Streamlining Vision-Oriented AI through Generalized Multimodal Framework · 9 authors 52