Qwen3-ForcedAligner-0.6B — MLX 8-bit

8-bit quantized version of Qwen/Qwen3-ForcedAligner-0.6B for Apple Silicon inference via MLX.

Predicts word-level timestamps for audio+text pairs in a single non-autoregressive forward pass.

Model Details

| Detail | Value |
|---|---|
| Audio encoder | 24 layers, 1024 dim, 16 heads, float16 |
| Text decoder | 28 layers, 1024 hidden, 16 Q / 8 KV heads, 8-bit quantized (group_size=64) |
| Classify head | Linear(1024, 5000), float16 |
| Timestamp resolution | 80 ms per class (5000 classes = 400 s max) |
| Total size | ~1.4 GB |
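The timestamp arithmetic in the table can be sketched in a few lines of Swift. This is only an illustration, not the library's decoding code: it assumes that each timestamp is decoded by taking the argmax over the 5000 outputs of the classify head and that class k corresponds to k × 80 ms. The helpers `argmax` and `seconds(forClass:)` are hypothetical names introduced here.

```swift
// Values taken from the Model Details table.
let classCount = 5000            // output size of the classify head
let millisecondsPerClass = 80    // timestamp resolution

/// Index of the largest logit (hypothetical helper; assumes argmax decoding).
func argmax(_ logits: [Float]) -> Int {
    precondition(!logits.isEmpty, "logits must not be empty")
    var best = 0
    for i in 1..<logits.count where logits[i] > logits[best] {
        best = i
    }
    return best
}

/// Converts a class index to seconds, assuming class k marks k * 80 ms.
func seconds(forClass index: Int) -> Double {
    precondition((0..<classCount).contains(index), "class index out of range")
    return Double(index * millisecondsPerClass) / 1000.0
}

// Toy logits with a single peak at class 125.
var logits = [Float](repeating: 0, count: classCount)
logits[125] = 1
let timestamp = seconds(forClass: argmax(logits))   // 125 * 80 ms = 10 s
```

Working in integer milliseconds before dividing keeps the conversion exact for every class index, and 5000 classes × 80 ms gives the 400 s ceiling stated in the table.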

Usage

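// Load the 8-bit model by its Hugging Face model ID.
let aligner = try await Qwen3ForcedAligner.fromPretrained(
    modelId: "aufklarer/Qwen3-ForcedAligner-0.6B-8bit"
)
// Align the transcript to the audio; sampleRate is the rate of `samples` in Hz.
let aligned = aligner.align(
    audio: samples, text: "Hello world", sampleRate: 24000
)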
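Because the timestamp classes cover at most 400 s (5000 classes × 80 ms), it may be worth checking a clip's duration before calling `align`. The following is a minimal sketch under stated assumptions: the sample buffer is mono, and the helpers `durationSeconds` and `fitsAlignerRange` are hypothetical names, not part of the library. Whether the library itself splits or rejects longer audio is not stated here.

```swift
// Limit implied by the Model Details table: 5000 classes * 80 ms = 400 s.
let maxAlignableSeconds = 400.0

/// Duration in seconds of a mono sample buffer (hypothetical helper).
func durationSeconds(sampleCount: Int, sampleRate: Int) -> Double {
    precondition(sampleRate > 0, "sampleRate must be positive")
    return Double(sampleCount) / Double(sampleRate)
}

/// True when the clip fits inside the representable timestamp range.
func fitsAlignerRange(sampleCount: Int, sampleRate: Int) -> Bool {
    return durationSeconds(sampleCount: sampleCount, sampleRate: sampleRate)
        <= maxAlignableSeconds
}

// A 30 s clip at 24 kHz fits; a 10 minute clip does not.
let shortClip = fitsAlignerRange(sampleCount: 30 * 24_000, sampleRate: 24_000)
let longClip = fitsAlignerRange(sampleCount: 600 * 24_000, sampleRate: 24_000)
```

In an app, a clip that fails this check could be split into shorter segments and aligned piece by piece, with each segment's start offset added back to its timestamps.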

Variants

| Variant | Size | Model ID |
|---|---|---|
| 4-bit | ~979 MB | aufklarer/Qwen3-ForcedAligner-0.6B-4bit |
| 8-bit | ~1.4 GB | aufklarer/Qwen3-ForcedAligner-0.6B-8bit |
| bf16 | ~1.8 GB | aufklarer/Qwen3-ForcedAligner-0.6B-bf16 |
| CoreML INT4 | ~630 MB | aufklarer/Qwen3-ForcedAligner-0.6B-CoreML-INT4 |
| CoreML INT8 | ~1.0 GB | aufklarer/Qwen3-ForcedAligner-0.6B-CoreML-INT8 |

