Replace reward plots with combined training_dynamics screenshot d38ba17 verified Supreeth committed 20 days ago