Learn Hard Problems During RL with Reference Guided Fine-tuning Paper • 2603.01223 • Published 3 days ago • 12
Unlocking Agentic RL Training for GPT-OSS: A Practical Retrospective Article • Jan 27 • 62
Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models Paper • 2410.02740 • Published Oct 3, 2024 • 53