Submitted by haotiz · 54 · MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning · 23 authors · 3
Submitted by Lemoncoke · 28 · Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models · 8 authors · 41 · 2
Submitted by gmongaras · 16 · Cottention: Linear Transformers With Cosine Attention · Southern Methodist University AI · 20 · 5
Submitted by SiyuanH · 14 · UniAff: A Unified Representation of Affordances for Tool Usage and Articulation with Vision-Language Models · 12 authors · 3 · 4
Submitted by liruiw · 13 · Scaling Proprioceptive-Visual Learning with Heterogeneous Pre-trained Transformers · 4 authors · 15 · 2
Submitted by hyungjoochae · 10 · Coffee-Gym: An Environment for Evaluating and Improving Natural Language Feedback on Erroneous Code · 10 authors · 3
Submitted by ShuoChen99 · 8 · Visual Question Decomposition on Multimodal Large Language Models · 8 authors · 2
Submitted by zhangxulong · 1 · IDEAW: Robust Neural Audio Watermarking with Invertible Dual-Embedding · 4 authors · 30 · 2