- DA-DPO: Cost-efficient Difficulty-aware Preference Optimization for Reducing MLLM Hallucinations
  Paper • 2601.00623 • Published
- TeaRAG: A Token-Efficient Agentic Retrieval-Augmented Generation Framework
  Paper • 2511.05385 • Published
- Pre-DPO: Improving Data Utilization in Direct Preference Optimization Using a Guiding Reference Model
  Paper • 2504.15843 • Published • 16
- VerIPO: Cultivating Long Reasoning in Video-LLMs via Verifier-Guided Iterative Policy Optimization
  Paper • 2505.19000 • Published • 42
Xiaoge Shen
huez
AI & ML interests
None yet
Recent Activity
Updated a collection 15 days ago: mlp
Organizations
None yet