CMR Scaling Law: Predicting Critical Mixture Ratios for Continual Pre-training of Language Models Paper • 2407.17467 • Published Jul 24, 2024
SynthDoc: Bilingual Documents Synthesis for Visual Document Understanding Paper • 2408.14764 • Published Aug 27, 2024
Balancing Speciality and Versatility: a Coarse to Fine Framework for Supervised Fine-tuning Large Language Model Paper • 2404.10306 • Published Apr 16, 2024 • 1
Cultivating Helpful, Personalized, and Creative AI Tutors: A Framework for Pedagogical Alignment using Reinforcement Learning Paper • 2507.20335 • Published Jul 27, 2025
MathSmith: Towards Extremely Hard Mathematical Reasoning by Forging Synthetic Problems with a Reinforced Policy Paper • 2508.05592 • Published Aug 7, 2025 • 6
ReSURE: Regularizing Supervision Unreliability for Multi-turn Dialogue Fine-tuning Paper • 2508.19996 • Published Aug 27, 2025
RelayFormer: A Unified Local-Global Attention Framework for Scalable Image and Video Manipulation Localization Paper • 2508.09459 • Published Aug 13, 2025 • 2
LexSemBridge: Fine-Grained Dense Representation Enhancement through Token-Aware Embedding Augmentation Paper • 2508.17858 • Published Aug 25, 2025 • 10