Anyone found any obvious differences compared to using the stock model?
I know a few people have downloaded a few versions of the text encoders to carry out some basic testing, but I have not seen any extensive test results. Usually I see some slight changes to the output images, but no conclusive results.
Is there anyone else who has found any interesting results to share?
not really, just tried it with a gguf clip loader node in comfyui and did not get much of a difference in my results.
less likely to cover up nudity or shy away from things. if you're doing normal gens the whole point is to keep it similar to the original.
longer explanation:
So the way this works is that you can use ANY Qwen3-4B as the TE. No actual text is generated; the model only produces embeddings. I have tried several de-censored Qwens, RP models, whatever I could find. They can make some interesting stuff, but the DiT isn't trained on them, so you might get worse prompt following, aberrations like extra limbs, or straight-up artifacts. The Z-Image authors state in an issue that they trained the Qwen along with the image portion; hopefully they're not bullshitting, I haven't hashed the weights to check. So using random Qwens has the downside of missing that joint training.
What we don't want is the model generating a "refusal" embedding because of the prompt. That's how your women end up clothed, putting their arms over their naughty bits, etc. Abliteration, while not perfect, deletes this. So my suggestion is to try a few different Qwens and see which one you like best; they're pretty small.
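For anyone unfamiliar with what abliteration actually does: the usual technique is to find a direction in activation space associated with refusals and project it out of the hidden states. This is not the exact heretic implementation, just a minimal numpy sketch of the core idea; the dimensions and values here are illustrative toys.

```python
import numpy as np

def ablate_direction(hidden, refusal_dir):
    """Remove the component along `refusal_dir` from every activation.

    Each row of `hidden` is projected onto the unit-normalized refusal
    direction and that component is subtracted, so the model can no
    longer express anything along that axis.
    """
    r = refusal_dir / np.linalg.norm(refusal_dir)
    return hidden - np.outer(hidden @ r, r)

# toy example: 4 token activations in an 8-dim space (hypothetical values)
rng = np.random.default_rng(0)
hidden = rng.normal(size=(4, 8))
refusal = rng.normal(size=8)
cleaned = ablate_direction(hidden, refusal)

# after ablation the activations have ~zero component along the refusal axis
print(np.abs(cleaned @ (refusal / np.linalg.norm(refusal))).max())
```

In the real thing the refusal direction is estimated from the model itself (e.g. by contrasting activations on harmless vs. refused prompts), and the projection is baked into the weights rather than applied at runtime.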
Full model would be nice for some (to quant to different types, doing scaled and so on). Thanks for doing this. Your model should align better than most with Z-Image, given they trained it with it. I mean, if they did.
Full model is: https://huggingface.co/Lockout/qwen3-4b-heretic-zimage/tree/main/qwen-4b-zimage-heretic
I should also upload the re-run, it has way lower KLD.
how do you get a single .safetensors file? I do:
cat model-00001-of-00002.safetensors model-00002-of-00002.safetensors > qwen_3_4b-hereticV2-zimage.safetensors
but it throws an error (both the GGUF and the original sharded model work fine)
@CamiloMM Personally, I use the python safetensors library to do the merging. The following code snippet should work:
from safetensors.torch import load_file, save_file

# load each shard into its own state dict
state_dict1 = load_file("model-00001-of-00002.safetensors")
state_dict2 = load_file("model-00002-of-00002.safetensors")

# dict union (Python 3.9+); the shards hold disjoint tensor names,
# so there are no key collisions to worry about
merged_state_dict = state_dict1 | state_dict2

save_file(merged_state_dict, "qwen_3_4b-hereticV2-zimage.safetensors")
Thanks, that worked!
I thought those files were just one big file cut at regular intervals, because Gemini told me that lol. Looking into it, it seems each shard is a complete safetensors file, just holding some of the layers. I have no idea why HuggingFace chose that as opposed to just cutting the bytes up normally.
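You can actually see this from the safetensors format itself: every file starts with an 8-byte little-endian length followed by a JSON header listing that file's tensors, which is also why raw `cat` fails (the second shard's header ends up in the middle of the data). A small stdlib-only sketch, assuming the shard filenames from above:

```python
import json
import os
import struct

def read_safetensors_header(path):
    """Read the JSON header of a .safetensors file.

    Per the safetensors format: the first 8 bytes are a little-endian
    u64 giving the header size, followed by that many bytes of JSON
    mapping tensor names to dtype/shape/offsets.
    """
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len))

# each shard has its own complete header, proving they are standalone
# files holding a subset of tensors rather than a byte-split blob
for shard in ("model-00001-of-00002.safetensors",
              "model-00002-of-00002.safetensors"):
    if not os.path.exists(shard):
        continue  # skip if the shards aren't in the working directory
    header = read_safetensors_header(shard)
    tensor_names = [k for k in header if k != "__metadata__"]
    print(shard, "->", len(tensor_names), "tensors")
```

The per-shard headers are presumably the point: each file can be downloaded, verified, and memory-mapped independently.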