Neooooo/qf-integration-test
QuantForge Metadata
- Base model:
Qwen/Qwen3-30B-A3B - Quantization scheme:
nvfp4 - Calibration dataset:
HuggingFaceH4/ultrachat_200k - Calibration samples:
32 - Max sequence length:
512 - Ignored layers:
lm_head, re:.*\.mlp\.gate$, re:.*\.mlp\.router$
Accuracy (BF16 vs NVFP4)
| Task | Metric | BF16 | NVFP4 | Recovery |
|---|---|---|---|---|
| arc_challenge | acc,none | 0.4000 | 0.3000 | 0.750 |
| hellaswag | acc,none | 0.4000 | 0.4000 | 1.000 |
Aggregate macro recovery: 0.875
Note: Scores estimated from subset.
Performance
Performance benchmark unavailable: evaluate.skip_perf=true
Usage (vLLM)
vllm serve Neooooo/qf-integration-test
- Downloads last month
- 20