OPUS: Towards Efficient and Principled Data Selection in Large Language Model Pre-training in Every Iteration Paper • 2602.05400 • Published 15 days ago • 318
Outcome Accuracy is Not Enough: Aligning the Reasoning Process of Reward Models Paper • 2602.04649 • Published 15 days ago • 12
Weak-Driven Learning: How Weak Agents make Strong Agents Stronger Paper • 2602.08222 • Published 11 days ago • 256
Saffron-1: Towards an Inference Scaling Paradigm for LLM Safety Assurance Paper • 2506.06444 • Published Jun 6, 2025 • 73