To train, fine-tune, or play with the model, you will need to install NVIDIA NeMo.
For inference, just run:
pip install "nemo_toolkit[all]"
The model is available for use in the NeMo toolkit, and can be used as a pre-trained checkpoint for inference or for fine-tuning on another dataset.
import nemo.collections.asr as nemo_asr
asr_model = nemo_asr.models.ASRModel.from_pretrained("DigitalUmuganda/Mbaza-ASR-Afrivoice-660h")
asr_model.transcribe(['<audio_sample>'])
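The return type of `transcribe` has varied across NeMo releases: depending on the version and model class, each result may be a plain string or an object that carries the transcript in a `text` attribute. Below is a minimal sketch for normalizing either form to plain strings. The helper name `to_text` is hypothetical and not part of NeMo, and the `text` attribute is an assumption about the returned objects.

```python
def to_text(results):
    """Return a list of plain strings from a list of transcription results.

    Hypothetical helper (not part of NeMo). Assumes each item is either a
    string or an object exposing the transcript in a `text` attribute.
    """
    texts = []
    for item in results:
        if isinstance(item, str):
            # Result is already a plain transcript string.
            texts.append(item)
        else:
            # Fall back to str(item) if there is no `text` attribute.
            texts.append(getattr(item, "text", str(item)))
    return texts
```

With the model loaded as above, this could be used as `texts = to_text(asr_model.transcribe(["file.wav"]))`, where `file.wav` is a placeholder path.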
To transcribe a directory of audio files, use the NeMo example script:
python [NEMO_GIT_FOLDER]/examples/asr/transcribe_speech.py pretrained_name="DigitalUmuganda/Mbaza-ASR-Afrivoice-660h" audio_dir="<DIRECTORY CONTAINING AUDIO FILES>"
This model accepts 16 kHz (16,000 Hz) mono-channel audio (WAV files) as input.
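Before transcribing, it can help to confirm that a file meets this input requirement. The sketch below uses only Python's standard `wave` module, which reads PCM WAV files and may raise `wave.Error` on other encodings. The function name `is_supported_wav` and the constant names are hypothetical.

```python
import wave

EXPECTED_RATE_HZ = 16000  # 16 kHz sample rate
EXPECTED_CHANNELS = 1     # mono

def is_supported_wav(path):
    """Return True if the PCM WAV file at `path` is 16 kHz mono."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == EXPECTED_RATE_HZ
            and wav.getnchannels() == EXPECTED_CHANNELS
        )
```

Files that fail the check can be converted beforehand, for example with `ffmpeg -i input.mp3 -ar 16000 -ac 1 output.wav` (file names here are placeholders).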