VeriEvol: Scaling Multimodal Mathematical Reasoning via Verifiable Evol-Instruct Paper • 2606.23543 • Published 5 days ago • 6
STARE: Surprisal-Guided Token-Level Advantage Reweighting for Policy Entropy Stability Paper • 2606.19236 • Published 10 days ago • 13
Beyond Length Scaling: Synergizing Breadth and Depth for Generative Reward Models Paper • 2603.01571 • Published Mar 2 • 34
RubricBench: Aligning Model-Generated Rubrics with Human Standards Paper • 2603.01562 • Published Mar 2 • 64
WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct Paper • 2308.09583 • Published Aug 18, 2023 • 8
WizardLM: Empowering Large Language Models to Follow Complex Instructions Paper • 2304.12244 • Published Apr 24, 2023 • 14
WizardCoder: Empowering Code Large Language Models with Evol-Instruct Paper • 2306.08568 • Published Jun 14, 2023 • 34