How to use fredchu/breeze-asr-26-mlx-fp16 with MLX:
# Download the model from the Hub
pip install "huggingface_hub[hf_xet]"
huggingface-cli download --local-dir breeze-asr-26-mlx-fp16 fredchu/breeze-asr-26-mlx-fp16
Breeze-ASR-26 - MLX (fp16)
Apple MLX port of MediaTek-Research/Breeze-ASR-26, the Taiwanese Hokkien (Taigi) ASR model from MediaTek Research's MR Breeze 3 series.
Runs on Apple Silicon Macs (M1/M2/M3/M4) via the mlx-whisper package, with the same API as the mlx-community/whisper-* checkpoints.
Files
| file | size |
|---|---|
| weights.safetensors | ~3.1 GB |
| config.json | <1 KB |
Usage
import mlx_whisper

result = mlx_whisper.transcribe(
    "audio.wav",
    path_or_hf_repo="fredchu/breeze-asr-26-mlx-fp16",
    language="zh",
)
print(result["text"])
CLI:
mlx_whisper audio.wav --model fredchu/breeze-asr-26-mlx-fp16 --language zh
Performance (M1 Max, mlx-whisper)
| sample | duration | speed |
|---|---|---|
| Mandarin financial speech (60s) | 60.0s | 8.72× real-time |
| Taiwanese Hokkien sample (25s) | 25.0s | 6.28× real-time |
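The "× real-time" figures in the table are most naturally read as audio duration divided by wall-clock processing time. A minimal sketch of that arithmetic follows, assuming this is how the figures were derived; the helper names `speed_factor` and `expected_elapsed` are hypothetical and not part of mlx-whisper.

```python
def speed_factor(audio_seconds: float, elapsed_seconds: float) -> float:
    """Return how many times faster than real time a transcription ran."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return audio_seconds / elapsed_seconds


def expected_elapsed(audio_seconds: float, factor: float) -> float:
    """Invert speed_factor: estimated wall-clock seconds for a given clip."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return audio_seconds / factor


# Figures taken from the table: 60.0 s at 8.72x and 25.0 s at 6.28x.
for duration, factor in [(60.0, 8.72), (25.0, 6.28)]:
    estimate = expected_elapsed(duration, factor)
    print(f"{duration:.0f}s clip -> about {estimate:.1f}s of processing")
```

To measure your own hardware, you could time a call to `mlx_whisper.transcribe` with `time.perf_counter()` and pass the clip length and elapsed time to `speed_factor`.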
When to use this vs. the 4-bit variant
A companion fredchu/breeze-asr-26-mlx-4bit (877 MB, palette quantized) is also available.
In our 2026-04-30 evaluation on a real Taigi sample (a Mandarin-speaking creator using the Taigi word pio-pôa), the 4-bit variant transcribed the word correctly while the fp16 variant produced a different, incorrect pair of characters. This is a counterintuitive result, possibly because palette quantization re-calibrates outlier weights in a way that helps generalize to underrepresented Taigi tokens. It rests on a single sample, so verify on your own audio if you have a Taigi-heavy use case.
For pure Mandarin or read-speech benchmarks, fp16 should remain the safer choice.
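One way to check the two variants on your own audio is to score each transcript against a hand-made reference using character error rate (CER). The sketch below is a plain Levenshtein-based CER, not the evaluation script used for this card. The function names and the placeholder strings are illustrative assumptions; in practice the hypotheses would come from `mlx_whisper.transcribe(...)["text"]`, run once per variant on the same file.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    ref, hyp = ref.replace(" ", ""), hyp.replace(" ", "")
    if not ref:
        raise ValueError("reference transcript is empty")
    return edit_distance(ref, hyp) / len(ref)


# Placeholder strings only; substitute your reference and model outputs.
reference = "abcdef"
outputs = {"variant_a": "abxdef", "variant_b": "abcdef"}
for name, text in outputs.items():
    print(name, f"CER={cer(reference, text):.3f}")
```

Because the model outputs Chinese characters, scoring per character rather than per word avoids needing a word segmenter.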
Limitations (inherited from base model)
- Outputs Mandarin Chinese characters, not Taigi orthography (台語正字 / 台羅)
- Trained on ~10,000 hours of synthetic Taigi speech, so expect a distribution gap with real spontaneous speech
- English brand names and proper nouns are aggressively transliterated: in our Mandarin test, Hello was rendered as Chinese characters, Austin became Alsted, and Netflix became Nathalie followed by Chinese characters. ASR-25 (MediaTek-Research/Breeze-ASR-25) handles these correctly. Do not use this model for content with frequent English code-switching.
- All segments come back as one ~30-second block regardless of audio content (a model training behaviour, not a framework setting). Post-process if you need finer subtitle granularity.
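For the ~30-second segment limitation, one possible post-processing step is to split each segment's text into shorter cues and interpolate timestamps by character count. This is a rough sketch under stated assumptions. It assumes each entry of `result["segments"]` carries `start`, `end`, and `text` keys as in standard Whisper output, that speaking rate is constant within a segment, and that the text contains punctuation to break on (if not, it falls back to fixed-length chunks). The `split_segment` helper and the sample sentence are hypothetical.

```python
import re


def split_segment(segment: dict, max_chars: int = 16) -> list:
    """Split one long segment into shorter cues with interpolated times."""
    text = segment["text"].strip()
    start, end = segment["start"], segment["end"]
    if not text:
        return []
    # Break after common sentence punctuation, then cap each cue's length.
    pieces = [p.strip() for p in re.split(r"(?<=[，。！？!?])", text) if p.strip()]
    chunks = []
    for piece in pieces:
        chunks.extend(piece[i:i + max_chars] for i in range(0, len(piece), max_chars))
    total = sum(len(c) for c in chunks)
    cues, consumed = [], 0
    for chunk in chunks:
        cue_start = start + (end - start) * consumed / total
        consumed += len(chunk)
        cue_end = start + (end - start) * consumed / total
        cues.append({"start": cue_start, "end": cue_end, "text": chunk})
    return cues


# Placeholder segment; real ones would come from result["segments"].
example = {"start": 0.0, "end": 28.0, "text": "今天天氣很好，我們出去走走。"}
for cue in split_segment(example):
    print(f'{cue["start"]:.1f}-{cue["end"]:.1f} {cue["text"]}')
```

The interpolated timestamps are only approximate. If you need accurate cue boundaries, a forced-alignment tool or voice-activity-based chunking before transcription is likely a better fit.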
Conversion
Built with a custom wrapper around mlx-examples/whisper/convert.py that adds sharded-safetensors loader support (the source repo ships weights as 5 GB + 1 GB shards, which the upstream converter doesn't handle).
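The general idea behind a sharded-safetensors loader can be sketched as follows. This is not the author's conversion script. It assumes the source repo follows the common Hugging Face layout, where a `model.safetensors.index.json` file maps each tensor name to its shard file under a `weight_map` key. The `load_sharded` function and its `load_shard` callback are hypothetical names; the callback stands in for whatever single-file loader you use (for example `safetensors.torch.load_file`).

```python
import json
from pathlib import Path
from typing import Any, Callable, Dict


def load_sharded(
    model_dir: str,
    load_shard: Callable[[str], Dict[str, Any]],
    index_name: str = "model.safetensors.index.json",  # assumed file name
) -> Dict[str, Any]:
    """Merge all shards listed in the index into one name -> tensor dict."""
    index = json.loads((Path(model_dir) / index_name).read_text())
    weight_map = index["weight_map"]  # tensor name -> shard file name
    merged: Dict[str, Any] = {}
    # Load each shard file once, keeping only tensors the index assigns to it.
    for shard_file in sorted(set(weight_map.values())):
        tensors = load_shard(str(Path(model_dir) / shard_file))
        for name, tensor in tensors.items():
            if weight_map.get(name) == shard_file:
                merged[name] = tensor
    missing = set(weight_map) - set(merged)
    if missing:
        raise KeyError(f"tensors listed in index but not found: {sorted(missing)}")
    return merged
```

The merged dictionary can then be handed to the rest of a converter as if it had come from a single weights file. Passing the loader in as a callback keeps the sketch independent of any particular tensor library.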
The conversion script is open source on GitHub; search for convert_breeze_asr_26.py in For_Claude/scripts/asr-eval/.
License
Apache 2.0, inherited from the base model.
Acknowledgments
- MediaTek Research for the original Breeze-ASR-26 weights
- Apple MLX team for the framework and conversion tooling