AI & ML interests
None defined yet.
Recent Activity
Articles
cesear64 updated a dataset 4 days ago
cesear64 updated a model 12 days ago
Article
Scaling Zero-Resource Vocabulary: A Data Pipeline for Sango
MEYNG
cesear64 updated a model 18 days ago
cesear64 published a model 19 days ago
cesear64 published a model 20 days ago
Post
Just published: how we built production Sango (Central African Republic) translation without fine-tuning, a parallel corpus, or training compute.
The method, vocabulary-augmented prompting with a 581-entry native-speaker-verified lexicon, generalizes to any of the ~2,000 African languages at the same data-poverty level. The recipe, dataset, and code template are all included.
Blog: https://huggingface.co/blog/MEYNG/sangoai
Dataset: MEYNG/sango-vocabulary
We would especially value feedback from anyone working on other low-resource African languages. Ewondo, Lingala, and Wolof are next on our roadmap.
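For readers unfamiliar with the idea, here is a minimal sketch of what vocabulary-augmented prompting could look like, assuming the lexicon is available as a plain English-to-Sango mapping. The function names, the prompt wording, and the placeholder lexicon entries are all hypothetical illustrations and are not taken from the blog post or the dataset; the actual recipe may select and format entries differently.

```python
import re
from typing import Dict, List, Tuple


def select_entries(text: str, lexicon: Dict[str, str], limit: int = 20) -> List[Tuple[str, str]]:
    """Return (english, sango) pairs whose English headword occurs in the text."""
    # Naive whole-word matching; a real pipeline may need stemming or phrase matching.
    words = set(re.findall(r"[a-z']+", text.lower()))
    hits = [(en, sg) for en, sg in lexicon.items() if en.lower() in words]
    return sorted(hits)[:limit]


def build_prompt(text: str, lexicon: Dict[str, str]) -> str:
    """Assemble a translation prompt that embeds the matching lexicon entries."""
    glossary = "\n".join(f"- {en} = {sg}" for en, sg in select_entries(text, lexicon))
    return (
        "Translate the English text into Sango.\n"
        "Use these verified vocabulary entries where they apply:\n"
        f"{glossary}\n\n"
        f"Text: {text}"
    )


# Placeholder entries only: these strings stand in for real Sango terms,
# which should come from the native-speaker-verified lexicon.
demo_lexicon = {
    "water": "<SANGO:water>",
    "food": "<SANGO:food>",
    "good": "<SANGO:good>",
}

prompt = build_prompt("The water is good.", demo_lexicon)
print(prompt)
```

The resulting string would then be sent to a general-purpose language model. Only the entries relevant to the input are included, which keeps the prompt short even as the lexicon grows.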
Article
Vocabulary-Augmented Prompting for Sango: Production African Language AI Without a Parallel Corpus
MEYNG
cesear64 updated a Space 2 months ago
cesear64 published a Space 2 months ago
cesear64 published a dataset 2 months ago