Bonsai-Image 4B (binary)

The weights were first converted from the diffusers layout to the ComfyUI/sd.cpp layout using f2_from_diffusers.py. Then (hopefully) all binary tensors were selectively quantized to q1_0 using sd.cpp:

sd-cli -M convert -m bonsai-image-unpacked.safetensors -o bonsai_image_4b-q1_0.gguf --tensor-type-rules "^single_blocks.[0-9]+.linear1.weight=q1_0,^single_blocks.[0-9]+.linear2.weight=q1_0,^double_blocks.[0-9]+.img_attn.qkv.weight=q1_0,^double_blocks.[0-9]+.img_attn.proj.weight=q1_0,^double_blocks.[0-9]+.txt_attn.qkv.weight=q1_0,^double_blocks.[0-9]+.txt_attn.proj.weight=q1_0,^double_blocks.[0-9]+.img_mlp.[0-9].weight=q1_0,^double_blocks.[0-9]+.txt_mlp.[0-9].weight=q1_0"
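
To see which tensors the rule string selects, here is a minimal Python sketch that parses the same comma-separated `regex=type` pairs and checks a few tensor names against them. The matching semantics (regex search, first matching rule wins) are an assumption about how sd.cpp applies the rules, and the example tensor names are hypothetical names that merely follow the patterns.

```python
import re

# Rule string copied from the convert command above:
# comma-separated "regex=type" pairs.
RULES = (
    "^single_blocks.[0-9]+.linear1.weight=q1_0,"
    "^single_blocks.[0-9]+.linear2.weight=q1_0,"
    "^double_blocks.[0-9]+.img_attn.qkv.weight=q1_0,"
    "^double_blocks.[0-9]+.img_attn.proj.weight=q1_0,"
    "^double_blocks.[0-9]+.txt_attn.qkv.weight=q1_0,"
    "^double_blocks.[0-9]+.txt_attn.proj.weight=q1_0,"
    "^double_blocks.[0-9]+.img_mlp.[0-9].weight=q1_0,"
    "^double_blocks.[0-9]+.txt_mlp.[0-9].weight=q1_0"
)


def parse_rules(rule_string):
    """Split 'pattern=type,pattern=type' into (compiled_regex, type) pairs."""
    pairs = []
    for item in rule_string.split(","):
        pattern, _, tensor_type = item.rpartition("=")
        pairs.append((re.compile(pattern), tensor_type))
    return pairs


def type_for(name, rules, default=None):
    """Return the type of the first rule whose regex matches the name.

    Assumption: regex search with first-match-wins; sd.cpp may differ.
    """
    for regex, tensor_type in rules:
        if regex.search(name):
            return tensor_type
    return default


rules = parse_rules(RULES)

# Hypothetical tensor names, chosen only to exercise the patterns.
examples = [
    "single_blocks.3.linear1.weight",
    "double_blocks.0.img_mlp.2.weight",
    "double_stream_modulation_img.lin.weight",
]
for name in examples:
    print(name, "->", type_for(name, rules, default="(unchanged)"))
```

Note that the `.` characters in the rules are unescaped, so each one matches any character, not only a literal dot. That does not change the result for names that use dots as separators, but `\.` would be stricter.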

Use it the same way as FLUX.2 klein 4B (distilled).


I also uploaded an even smaller file in which the next 2 largest tensors, previously kept in bf16, were quantized to q8_0 with negligible quality loss.

sd-cli -M convert -m models/bonsai_image_4b-q1_0.gguf -o models/bonsai_image_4b-mod_q8_0-q1_0.gguf --tensor-type-rules "^double_stream_modulation_img.lin.weight=q8_0,^double_stream_modulation_txt.lin.weight=q8_0"

Comparison images: base (q1_0+bf16) and mod_q8_0 (q1_0+q8_0+bf16).
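
As a rough illustration of why moving a tensor from bf16 to q8_0 shrinks the file, the sketch below compares the storage cost per weight. It relies on the ggml q8_0 layout of 32 int8 values plus one fp16 scale per block (8.5 bits per weight). The element count is a hypothetical placeholder, not the real shape of the modulation tensors.

```python
# Bits per weight for the two storage formats involved.
BF16_BITS = 16.0
# ggml q8_0: blocks of 32 int8 values plus one fp16 scale,
# i.e. 34 bytes per 32 weights.
Q8_0_BITS = (32 * 8 + 16) / 32  # 8.5


def tensor_bytes(n_elements, bits_per_weight):
    """Storage size in bytes for a tensor with n_elements weights."""
    return n_elements * bits_per_weight / 8


def saving_bytes(n_elements):
    """Bytes saved by storing the tensor as q8_0 instead of bf16."""
    return tensor_bytes(n_elements, BF16_BITS) - tensor_bytes(n_elements, Q8_0_BITS)


# Hypothetical element count, for illustration only.
n = 3072 * 18432
print(f"bf16: {tensor_bytes(n, BF16_BITS) / 2**20:.1f} MiB")
print(f"q8_0: {tensor_bytes(n, Q8_0_BITS) / 2**20:.1f} MiB")
print(f"saved: {saving_bytes(n) / 2**20:.1f} MiB per tensor")
```

Whatever the real shapes are, each re-quantized tensor ends up at 8.5/16, roughly 53%, of its bf16 size.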

Example sd.cpp command that spends extra compute to get the most out of the model:

sd-cli --diffusion-model models/bonsai_image_4b-q1_0.gguf --llm models/Qwen3-4B-UD-IQ3_XXS.gguf --steps 6 --vae models/flux2/full_encoder_small_decoder.safetensors --offload-to-cpu --cfg-scale 1.0 --fa -p "a lovely cat" -H 1024 -W 1024 --vae-tiling --vae-tile-size 64 --sampling-method dpm++2s_a -o bonsai-q1_0_base.png
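
To render the same prompt with both uploaded files for a side-by-side comparison, the command above can be generated from a small script. This is only a sketch: it reuses the flags exactly as they appear in the command above, assumes `sd-cli` is on `PATH`, and the second output filename is a hypothetical choice. It prints the commands instead of running them.

```python
import shlex


def build_command(diffusion_model, output, prompt="a lovely cat"):
    """Build the sd-cli argument list using the flags from the example command."""
    return [
        "sd-cli",
        "--diffusion-model", diffusion_model,
        "--llm", "models/Qwen3-4B-UD-IQ3_XXS.gguf",
        "--steps", "6",
        "--vae", "models/flux2/full_encoder_small_decoder.safetensors",
        "--offload-to-cpu",
        "--cfg-scale", "1.0",
        "--fa",
        "-p", prompt,
        "-H", "1024",
        "-W", "1024",
        "--vae-tiling",
        "--vae-tile-size", "64",
        "--sampling-method", "dpm++2s_a",
        "-o", output,
    ]


# Model paths are taken from the commands above;
# the second output name is hypothetical.
variants = {
    "models/bonsai_image_4b-q1_0.gguf": "bonsai-q1_0_base.png",
    "models/bonsai_image_4b-mod_q8_0-q1_0.gguf": "bonsai-q1_0_mod.png",
}

for model_path, out_path in variants.items():
    # shlex.join quotes the prompt so the line can be pasted into a shell.
    print(shlex.join(build_command(model_path, out_path)))
```

To actually run a variant, the list can be passed to `subprocess.run(cmd, check=True)`.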