Datasets and pre-trained models for "Exploiting Vocabulary Frequency Imbalance in Language Model Pre-training" (NeurIPS 2025)
Woojin Chung
gartland
models 61
gartland/finewebedu-196K-30B
0.4B • Updated
gartland/finewebedu-98K-30B
0.2B • Updated
gartland/finewebedu-49K-30B
0.2B • Updated
gartland/finewebedu-24K-30B
0.1B • Updated
gartland/finewebedu-196K-450M-seed42
Text Generation • 1.0B • Updated
gartland/finewebedu-98K-450M-seed42
Text Generation • 0.7B • Updated
gartland/finewebedu-49K-450M-seed42
Text Generation • 0.6B • Updated • 7
gartland/finewebedu-24K-450M-seed42
Text Generation • 0.5B • Updated
gartland/finewebedu-49K-lr1.2e-3-seed42
Text Generation • 0.2B • Updated
gartland/finewebedu-49K-lr2.4e-3-seed42
Text Generation • 0.2B • Updated
datasets 33
gartland/finewebedu-49K-tokenized-30B
Viewer • Updated • 14.9M • 106
gartland/finewebedu-196K-tokenized-30B
Viewer • Updated • 14.4M • 131
gartland/finewebedu-98K-tokenized-30B
Viewer • Updated • 14.5M • 77
gartland/finewebedu-24K-tokenized-30B
Viewer • Updated • 15.4M • 89
gartland/finewebedu-superbpe-t160K
Viewer • Updated • 2.75M • 68
gartland/finewebedu-superbpe-t80K
Viewer • Updated • 2.64M • 45
gartland/finewebedu-superbpe-t180K
Viewer • Updated • 2.85M • 167
gartland/finewebedu-superbpe
Viewer • Updated • 3.63M • 25
gartland/finewebedu-30B
Viewer • Updated • 38.9M • 110
gartland/openwebtext-cc-24K
Viewer • Updated • 9.15k • 5