Cephalo: Harnessing Heterogeneous GPU Clusters for Training Transformer Models Paper • 2411.01075 • Published Nov 1, 2024