caiovicentino1/Qwen3.5-9B-EOQ-Dynamic-BitPacked

8-bit precision


4.95 GB

1 contributor

History: 7 commits


fix: add tokenizer_class Qwen2TokenizerFast

067e0a1 verified about 2 months ago

| File | Size | Last commit | Updated |
| --- | --- | --- | --- |
| .gitattributes | 1.57 kB | Upload folder using huggingface_hub | about 2 months ago |
| README.md | 6.33 kB | Add torchao inference results: 43 tok/s, 6.3 GB VRAM, 95% FP16 speed | about 2 months ago |
| chat_template.jinja | 7.76 kB | Upload folder using huggingface_hub | about 2 months ago |
| config.json | 2.78 kB | Upload folder using huggingface_hub | about 2 months ago |
| eoq_loader.py | 5.35 kB | Upgrade loader: GPU dequant (4s vs 437s, 100x faster) | about 2 months ago |
| eoq_metadata.json | 89.1 kB | Upload folder using huggingface_hub | about 2 months ago |
| model-00001.safetensors | 4.93 GB (xet) | Upload folder using huggingface_hub | about 2 months ago |
| model.safetensors.index.json | 51.2 kB | Upload folder using huggingface_hub | about 2 months ago |
| tokenizer.json | 20 MB (xet) | Upload folder using huggingface_hub | about 2 months ago |
| tokenizer_config.json | 1.1 kB | fix: add tokenizer_class Qwen2TokenizerFast | about 2 months ago |