Block Diffusion for Flash Speculative Decoding
AI & ML interests
Efficient AI
Papers
DFlash: Block Diffusion for Flash Speculative Decoding
ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference
Pairwise Rotation Quantization for Efficient Reasoning LLM Inference
ParoQuant: Pairwise Rotation Quantization for Efficient Reasoning LLM Inference
Paper • 2511.10645 • Published • 8
z-lab/Qwen3.5-27B-PARO
Image-Text-to-Text • 6B • Updated • 1.15k • 8
z-lab/Qwen3.5-9B-PARO
Image-Text-to-Text • 3B • Updated • 39.9k • 40
z-lab/Qwen3.5-4B-PARO
Image-Text-to-Text • 1B • Updated • 9.3k • 13
models 29
z-lab/Qwen3.5-27B-DFlash
Text Generation • 4B • Updated • 231 • 5
z-lab/Qwen3.5-0.8B-PARO
Image-Text-to-Text • 0.4B • Updated • 760 • 1
z-lab/Llama-2-7b-hf-PARO
Text Generation • 1B • Updated • 283 • 1
z-lab/DeepSeek-R1-Distill-Llama-8B-PARO
Text Generation • 1B • Updated • 363 • 1
z-lab/Qwen3.5-27B-PARO
Image-Text-to-Text • 6B • Updated • 1.15k • 8
z-lab/Qwen3.5-9B-PARO
Image-Text-to-Text • 3B • Updated • 39.9k • 40
z-lab/Qwen3.5-4B-PARO
Image-Text-to-Text • 1B • Updated • 9.3k • 13
z-lab/Qwen3.5-2B-PARO
Image-Text-to-Text • 1B • Updated • 343 • 2
z-lab/Qwen3-14B-PARO
Text Generation • 2B • Updated • 515 • 2
z-lab/Qwen3-8B-PARO
Text Generation • 1B • Updated • 1.06k • 1
datasets 0
None public yet