CDLI Collection
This is a collection of models used for the CDLI ASR challenge for atypical speech in Uganda, covering Ugandan English and Luganda.
import nemo.collections.asr as nemo_asr
asr_model = nemo_asr.models.ASRModel.from_pretrained("KasuleTrevor/cdli-parakeet-11b-en-finetune")
transcriptions = asr_model.transcribe(["file.wav"])
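The return type of `transcribe` has varied between NeMo releases, so downstream code may want to coerce it to plain strings. The helper below is a minimal sketch under that assumption: `extract_texts` is a hypothetical name, not part of NeMo, and it only assumes that each returned item is either a string or an object exposing a `.text` attribute, optionally wrapped in a `(best, all)` tuple.

```python
def extract_texts(transcriptions):
    """Coerce the output of transcribe() to a list of plain strings.

    Assumption: items are either str or objects with a `.text` attribute,
    and some releases wrap the result in a (best_hypotheses, all_hypotheses)
    tuple for transducer models.
    """
    # Unwrap the (best, all) tuple form if present.
    if isinstance(transcriptions, tuple):
        transcriptions = transcriptions[0]

    texts = []
    for item in transcriptions:
        # Keep strings as-is; otherwise read the hypothesis text.
        texts.append(item if isinstance(item, str) else item.text)
    return texts
```

With the snippet above, `extract_texts(transcriptions)` would yield one string per input file regardless of which form was returned.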
This repository contains a NeMo ASR model fine-tuned from
nvidia/parakeet-tdt-1.1b on the gated
cdli/ugandan_english_nonstandard_speech_v1.0 dataset.
This card documents the stronger 1.1B recovery run using a lower learning
rate (5e-5) after the earlier 1e-4 run plateaued early.
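To give a feel for how a peak learning rate of 5e-5 behaves under cosine annealing (the scheduler name listed in this run's configuration), here is a minimal, framework-free sketch. It is a generic cosine schedule with linear warmup, not NeMo's own implementation, and the values `warmup_steps=100`, `max_steps=20000`, and `min_lr=0.0` are assumptions chosen for illustration rather than confirmed settings of this run.

```python
import math


def cosine_lr(step, peak_lr=5e-5, warmup_steps=100, max_steps=20000, min_lr=0.0):
    """Learning rate at a given step: linear warmup, then cosine decay.

    All default values except peak_lr are illustrative assumptions.
    """
    if step < warmup_steps:
        # Linear ramp that reaches peak_lr on the last warmup step.
        return peak_lr * (step + 1) / warmup_steps

    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = min(1.0, (step - warmup_steps) / max(1, max_steps - warmup_steps))
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for s in (0, 99, 100, 10050, 20000):
        print(s, cosine_lr(s))
```

Swapping `peak_lr=1e-4` into the same function shows the shape of the earlier run's schedule under the same assumptions; only the scale changes, not the curve.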
Base model: nvidia/parakeet-tdt-1.1b
Dataset: cdli/ugandan_english_nonstandard_speech_v1.0
License: cc-by-sa-4.0
Dataset split sizes: 5176, 638, 1017

The evaluation artifacts in this run contain 1016 scored rows.
Training configuration: run directory /jupyter_kernel/parakeet_cdli_en_5e5; base model nvidia/parakeet-tdt-1.1b; durations 40.0 s, 30.0 s, 0.2 s; 48832; learning rate 5e-5; 1e-3; 100; scheduler CosineAnnealing; 20000; 10

Evaluation was run on the held-out test split using both raw and normalized transcript comparison.
Evaluation metrics: 31.57%, 15.09%, 21.20%, 12.56%, 20.70%, 12.58%
EN-PARAKEET-TDT-F1

tdt-1-1b.nemo: exported NeMo checkpoint
checkpoints/: intermediate training checkpoints
test_predictions.csv
test_predictions.jsonl
test_predictions_scored.csv
test_predictions_scored.jsonl
test_predictions_grouped_analysis.csv

The 5e-5 run improved substantially over the earlier 1.1B 1e-4 run.
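The difference between raw and normalized transcript comparison can be illustrated with a small word error rate (WER) sketch. The card does not specify its normalization rules, so the `normalize` function below (lowercasing and stripping punctuation other than apostrophes) is an assumption, and `wer` and `edit_distance` are illustrative helpers rather than the scoring code used for this run.

```python
import re


def normalize(text):
    """Assumed normalization: lowercase, drop punctuation, collapse spaces."""
    text = text.lower()
    text = re.sub(r"[^\w\s']", " ", text)
    return " ".join(text.split())


def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(refs, hyps, normalized=False):
    """Corpus-level WER: total word errors / total reference words."""
    errors = words = 0
    for ref, hyp in zip(refs, hyps):
        if normalized:
            ref, hyp = normalize(ref), normalize(hyp)
        ref_words, hyp_words = ref.split(), hyp.split()
        errors += edit_distance(ref_words, hyp_words)
        words += len(ref_words)
    return errors / words if words else 0.0


if __name__ == "__main__":
    refs = ["Hello, world!"]
    hyps = ["hello world"]
    print(wer(refs, hyps))                   # 1.0: case and punctuation count as errors
    print(wer(refs, hyps, normalized=True))  # 0.0: identical after normalization
```

Character error rate can be computed the same way by passing `list(ref)` and `list(hyp)` to `edit_distance` instead of word lists.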
# Gated model: log in with an HF token that has gated access permission
hf auth login