ADRA-RL/s1_deepseek-r1_lexical_unique_trio_penalty_1.25_seed42 Viewer • Updated 15 days ago • 128 • 13
ADRA-RL/qwen2.5-7b-instrct_s1_deepseek-r1_distillation_original Text Generation • 1.0B • Updated 15 days ago • 22
ADRA-RL/qwen2.5-7b-instrct_s1_gemini-r1_distillation_original Text Generation • 2B • Updated 15 days ago • 15
ADRA-RL/qwen2.5-7b-instrct_lora_adra_s1_deepseek-r1_original_lexical_unique_trio_s140 Updated 15 days ago