agurung/flawed-fictions-qwen3-4b-lengthpenalty-litereason Reinforcement Learning • 4B • Updated Mar 10 • 9
agurung/flawed-fictions-qwen25-7b-lengthpenalty-litereason Reinforcement Learning • 8B • Updated Feb 22 • 19
agurung/v4_savebestearly_sft_qwen7B_25percent_lr_1e3_bptt_offset Text Generation • 8B • Updated Feb 5 • 2