whisper-large-v2-cantonese-mlx-4bit

This is the 4-bit quantized version of doggy8088/whisper-large-v2-cantonese-mlx. It was re-quantized from the original MLX fp16 checkpoint and is intended to further reduce memory usage on Apple Silicon.

Quantization settings

  • bits: 4
  • group size: 64
  • quantization mode: MLX affine weight-only quantization
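
To make the settings above concrete, here is a minimal pure-Python sketch of group-wise affine quantization. It is an illustration only, not MLX's actual kernel: the function names are hypothetical, and the assumption that each group stores one 16-bit scale and one 16-bit bias is mine, not something this model card states.

```python
def quantize_group(weights, bits=4):
    """Affine-quantize one group of floats to integer codes in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1               # 15 distinct steps for 4 bits
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / levels or 1.0      # fall back to 1.0 for a constant group
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo                # lo acts as the per-group bias


def dequantize_group(codes, scale, bias):
    """Reconstruct approximate floats: w ~ code * scale + bias."""
    return [c * scale + bias for c in codes]


def effective_bits_per_weight(bits=4, group_size=64, meta_bits=16):
    """Storage cost per weight, assuming one scale and one bias per group."""
    return bits + 2 * meta_bits / group_size


# One group of 64 evenly spaced weights, matching group size 64.
group = [i / 63 - 0.5 for i in range(64)]
codes, scale, bias = quantize_group(group)
restored = dequantize_group(codes, scale, bias)
worst_error = max(abs(a - b) for a, b in zip(group, restored))
```

Under these assumptions the rounding error per weight is at most half a quantization step (`scale / 2`), and `effective_bits_per_weight()` evaluates to 4.5, so the quantized weights would take a little over a quarter of the fp16 footprint.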

Usage

pip install -U mlx-whisper

CLI:

mlx_whisper audio.wav --model doggy8088/whisper-large-v2-cantonese-mlx-4bit --language zh

Python:

import mlx_whisper

result = mlx_whisper.transcribe(
    "audio.wav",
    path_or_hf_repo="doggy8088/whisper-large-v2-cantonese-mlx-4bit",
    language="zh",
)
print(result["text"])

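Notes

  • This is a quantized model, so its speed and accuracy may differ slightly from the fp16 version.
  • Use language="zh", or omit the option and let the model detect the language automatically; do not specify yue.
  • For the highest accuracy, use the original fp16 model instead: doggy8088/whisper-large-v2-cantonese-mlx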
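
The language advice above can be enforced with a small wrapper. This is a sketch, not part of mlx-whisper: `transcribe_kwargs` and `transcribe` are hypothetical helper names, and only the `mlx_whisper.transcribe` call with `path_or_hf_repo` and `language` comes from the usage shown in this card.

```python
MODEL_REPO = "doggy8088/whisper-large-v2-cantonese-mlx-4bit"


def transcribe_kwargs(language=None):
    """Build keyword arguments for mlx_whisper.transcribe (hypothetical helper)."""
    if language == "yue":
        # The card advises against "yue" for this model.
        raise ValueError('use language="zh" or omit it; do not pass "yue"')
    kwargs = {"path_or_hf_repo": MODEL_REPO}
    if language is not None:
        kwargs["language"] = language      # omitted means automatic detection
    return kwargs


def transcribe(audio_path, language=None):
    """Transcribe a file, importing mlx_whisper only when actually called."""
    import mlx_whisper  # needs Apple Silicon and `pip install -U mlx-whisper`
    return mlx_whisper.transcribe(audio_path, **transcribe_kwargs(language))
```

Calling `transcribe("audio.wav")` leaves the language to automatic detection, while `transcribe("audio.wav", language="zh")` matches the explicit example in the usage section.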