Grokking: Generalization Beyond Overfitting on Small Algorithmic Datasets Paper • 2201.02177 • Published Jan 6, 2022 • 6