base model encoder choice
#8
by sugintama - opened
For the base model's encoder choice: is an encoder with a different FastConformer-like structure, such as nvidia/parakeet-tdt-0.6b-v2, compatible with this SALM, or must it strictly be an encoder that is paired with a transformer decoder in its base model?
For Canary-Qwen-2.5B specifically, it has to be the encoder architecture and parameters in this checkpoint, otherwise it won’t work.
But if you want to train your own SALM with NeMo, you can of course use any pretrained ASR model: just change "pretrained_asr" to a different model name or a .nemo checkpoint path.
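To make the "pretrained_asr" change concrete, here is a minimal sketch in plain Python. It assumes the training config can be treated as a nested dictionary with "pretrained_asr" under a "model" section. That layout, the placeholder default value, and the helper name `swap_asr_encoder` are illustrative assumptions, not NeMo's actual API; check the SALM config shipped with your NeMo version for the real structure.

```python
import copy

# Hypothetical, trimmed-down stand-in for a SALM training config.
# Only the "pretrained_asr" key name comes from the thread above;
# the "model" nesting and the default value are assumptions.
base_config = {
    "model": {
        "pretrained_asr": "nvidia/canary-1b-flash",  # placeholder default
    }
}


def swap_asr_encoder(config: dict, pretrained_asr: str) -> dict:
    """Return a copy of `config` that points at a different ASR model.

    `pretrained_asr` may be a model name or a path to a .nemo checkpoint.
    The input config is left unmodified.
    """
    new_config = copy.deepcopy(config)
    new_config["model"]["pretrained_asr"] = pretrained_asr
    return new_config


# Point a new SALM training run at the Parakeet model from the question.
custom_config = swap_asr_encoder(base_config, "nvidia/parakeet-tdt-0.6b-v2")
print(custom_config["model"]["pretrained_asr"])
```

This only applies to training a new SALM from scratch. The released Canary-Qwen-2.5B checkpoint still requires its own encoder architecture and parameters.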