caiovicentino1/Qwen3.5-9B-EOQ-Dynamic-BitPacked
Safetensors
qwen3_5
eoq
quantized
entropy-coding
compressed
qwen3.5
dynamic
bitpacked
awq
torchao
8-bit precision
License: apache-2.0
Branch: main
Repository size: 4.95 GB
1 contributor
History: 7 commits
Latest commit: 067e0a1 (verified) by caiovicentino1, about 2 months ago
    fix: add tokenizer_class Qwen2TokenizerFast
File                          Scan  Size           Last commit                                                            Updated
.gitattributes                Safe  1.57 kB        Upload folder using huggingface_hub                                    about 2 months ago
README.md                     Safe  6.33 kB        Add torchao inference results: 43 tok/s, 6.3 GB VRAM, 95% FP16 speed   about 2 months ago
chat_template.jinja           Safe  7.76 kB        Upload folder using huggingface_hub                                    about 2 months ago
config.json                   Safe  2.78 kB        Upload folder using huggingface_hub                                    about 2 months ago
eoq_loader.py                 Safe  5.35 kB        Upgrade loader: GPU dequant (4s vs 437s, 100x faster)                  about 2 months ago
eoq_metadata.json             Safe  89.1 kB        Upload folder using huggingface_hub                                    about 2 months ago
model-00001.safetensors       Safe  4.93 GB (xet)  Upload folder using huggingface_hub                                    about 2 months ago
model.safetensors.index.json  Safe  51.2 kB        Upload folder using huggingface_hub                                    about 2 months ago
tokenizer.json                Safe  20 MB (xet)    Upload folder using huggingface_hub                                    about 2 months ago
tokenizer_config.json         Safe  1.1 kB         fix: add tokenizer_class Qwen2TokenizerFast                            about 2 months ago