Recent Activity
textcleanlm/essentialweb-1.0-10B-clean-content
• Updated • 9.32M • 25
textcleanlm/essentialweb-1.0-10B-raw-content
• Updated • 9.32M • 42
textcleanlm/essentialweb-1.0-sample-10B
• Updated • 9.32M • 68
• Updated • 2.98M • 21
textcleanlm/med-domain-5b
• Updated • 4.07M • 23
textcleanlm/med-domain-data-sample1
• Updated • 814k • 10
textcleanlm/med-domain-data-sample
• Updated • 8.1k • 10
textcleanlm/fineweb-sample-10BT
• Updated • 14.9M • 39
textcleanlm/training-data-2
• Updated • 66.3k • 37
textcleanlm/textclean-10B
• Updated • 9.77M • 148
textcleanlm/textclean-2B-raw-cleaned
• Updated • 1.95M • 23
textcleanlm/textclean-2B-raw-sample
• Updated • 100 • 6
textcleanlm/textclean-2B-raw
• Updated • 1.97M • 8
textcleanlm/textclean-sft
• Updated • 894k • 6
• Updated • 91.7k • 4
textcleanlm/textclean-200M
• Updated • 581k • 5
textcleanlm/100M-raw-webtext-to-denoised-text
• Updated • 179k • 24
textcleanlm/annotation_example
• Updated • 1.82k • 15
• Updated • 1.82k • 15
textcleanlm/textclean-20M
• Updated • 18.3k • 22
textcleanlm/textclean-corpus-10M-deepseek-ablation
• Updated • 18.1k • 4
textcleanlm/textclean-corpus-1M-variant-ablation-research
• Updated • 1.82k • 14
textcleanlm/textclean-corpus-1M-old
• Updated • 1.82k • 15 • 1
textcleanlm/textclean-corpus-1M-o4-mini
• Updated • 1.82k • 15