# Qwen3-Coder-30B-A3B-Instruct_MXFP4
This checkpoint is a variant of Qwen3-Coder-30B-A3B-Instruct in which the expert weights have been quantized to the MXFP4 format, in the same way as gpt-oss-20b and gpt-oss-120b.
The weights were quantized with the `downcast_to_mxfp` function from triton-kernels.
The checkpoint may incur a small drop in accuracy, but is roughly 68% smaller than the original BF16 checkpoint.
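The idea behind MXFP4 can be illustrated with a small self-contained sketch of block quantization in the style of the OCP Microscaling (MX) format: blocks of 32 values share one power-of-two scale, and each value is stored as a 4-bit E2M1 float. This is a hypothetical illustration in NumPy, not the actual `downcast_to_mxfp` implementation from triton-kernels:

```python
import numpy as np

# The non-negative magnitudes representable in E2M1 (FP4)
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_mxfp4(w: np.ndarray, block: int = 32):
    """Quantize a 1-D weight vector into per-block FP4 values + shared scales."""
    w = w.reshape(-1, block)
    amax = np.abs(w).max(axis=1, keepdims=True)
    # Shared power-of-two scale per block, as in the OCP MX spec:
    # exponent = floor(log2(amax)) - emax_elem, with emax_elem = 2 for E2M1
    exp = np.floor(np.log2(np.maximum(amax, 1e-30))) - 2
    scale = 2.0 ** exp
    scaled = w / scale
    # Round each scaled element to the nearest representable FP4 magnitude
    idx = np.abs(np.abs(scaled)[..., None] - FP4_GRID).argmin(axis=-1)
    q = np.sign(scaled) * FP4_GRID[idx]
    return q, scale

def dequantize_mxfp4(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    """Reconstruct approximate weights from FP4 values and block scales."""
    return (q * scale).reshape(-1)
```

In the real checkpoint each FP4 value occupies 4 bits and each 32-element block adds one 8-bit scale, i.e. about 4.25 bits per expert weight instead of 16.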
## Accuracy Comparison
| Model | GSM8K (strict-match) | GSM8K (flexible-extract) |
|---|---|---|
| Qwen3-Coder-30B-A3B-Instruct (BF16) | 90.67% ± 0.80% | 89.92% ± 0.83% |
| Qwen3-Coder-30B-A3B-Instruct_MXFP4 | 89.76% ± 0.83% | 88.70% ± 0.87% |
## Checkpoint Size
| Model | Size | Reduction |
|---|---|---|
| Qwen3-Coder-30B-A3B-Instruct (BF16) | 57 GB | - |
| Qwen3-Coder-30B-A3B-Instruct_MXFP4 | 18 GB | 68% smaller |
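The figures above are consistent with a back-of-the-envelope estimate (assuming only the expert weights are quantized, which is why the overall reduction falls short of the per-weight ratio):

```python
# MXFP4 stores 4-bit values plus one 8-bit scale per 32-element block
bf16_bits = 16
mxfp4_bits = 4 + 8 / 32  # = 4.25 bits per quantized weight

# Per-weight reduction on the quantized (expert) weights: ~73%
per_weight_reduction = 1 - mxfp4_bits / bf16_bits

# Observed whole-checkpoint reduction from the table: 57 GB -> 18 GB, ~68%
observed_reduction = (57 - 18) / 57
```

The gap between ~73% and ~68% reflects the parts of the model (attention, embeddings, norms) that remain in BF16.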