Self-Fulfilling (Mis)alignment: Post-Trained Models (Collection)

A selection of models that have undergone Direct Preference Optimization (DPO). We also share the earlier instruction checkpoints. We recommend using the DPO models.

22 items • Updated Jan 16