Traditional Chinese corpus collection for LLM training (pre-training, instruction-tuning, and RLHF/alignment).
Oscar, Li
liswei
AI & ML interests
Multimodal Deep Learning, Natural Language Processing, Efficient Fine-Tuning
Models (8)
liswei/emojilm-0.6b-GGUF
0.6B • Updated • 19
liswei/emojilm-0.6b
0.6B • Updated • 2
liswei/Taiwan-ELM
liswei/Taiwan-ELM-1_1B-Instruct
Text Generation • 1B • Updated • 5 • 1
liswei/Taiwan-ELM-270M-Instruct
Text Generation • 0.3B • Updated • 12 • 1
liswei/Taiwan-ELM-1_1B
Text Generation • 1B • Updated • 3 • 1
liswei/Taiwan-ELM-270M
Text Generation • 0.3B • Updated • 23 • 2
liswei/EmojiLMSeq2SeqLoRA
0.6B • Updated • 3
Datasets (10)
liswei/Taiwan-Text-Excellence-2B
Viewer • Updated • 1.78M • 21 • 20
liswei/PromptPair-TW
Viewer • Updated • 119k • 12 • 2
liswei/news-collection-zhtw
Viewer • Updated • 592k • 118 • 3
liswei/wikinews-zhtw-dedup
Viewer • Updated • 8.37k • 14
liswei/wikipedia-zhtw-dedup
Viewer • Updated • 1.18M • 31 • 3
liswei/common-crawl-zhtw
Viewer • Updated • 2.71M • 65 • 6
liswei/coct-en-zhtw-dedup
Viewer • Updated • 217k • 6 • 2
liswei/c4-zhtw
Viewer • Updated • 4.86M • 35 • 3
liswei/rm-static-zhTW
Viewer • Updated • 81.4k • 18 • 30
liswei/NTU-Tree
Viewer • Updated • 478 • 55 • 4