DPO
RLLab/allenai-Dolci-Instruct-DPO-Length-Filtered Viewer • Updated 20 days ago • 146k • 39
RLLab/olmo-3-7b-it-sft Text Generation • 7B • Updated Dec 18, 2025 • 1.24k
allenai/Dolci-Instruct-SFT-No-Tools Viewer • Updated Jan 5 • 1.92M • 202 • 4
RLLab/gemma-3-4b-text-sft Text Generation • 4B • Updated 21 days ago • 96

RL-Dataset
open-r1/DAPO-Math-17k-Processed Viewer • Updated Nov 10, 2025 • 34.8k • 5.36k • 62
DigitalLearningGmbH/MATH-lighteval Viewer • Updated Jan 15, 2025 • 25k • 28.3k • 64
POLARIS-Project/Polaris-Dataset-53K Viewer • Updated Jun 18, 2025 • 53.3k • 680 • 34
RLLab/math-rl Viewer • Updated Nov 25, 2025 • 57.5k • 9