Anyone found any obvious differences compared to using the stock model?
I know a few people have downloaded a few versions of the text encoders to carry out some basic testing, but I have not seen any extensive test results. Usually I see some slight changes to the output images, but no conclusive results.
Is there anyone else who has found any interesting results to share?
not really, just tried it with a gguf clip loader node in comfyui and did not get much of a difference in my results.
less likely to cover up nudity or shy away from things. if you're doing normal gens the whole point is to keep it similar to the original.
longer explanation:
So the way this works is that you can use ANY Qwen3-4B as the TE. No actual text is generated; the model only produces embeddings. I have tried several de-censored Qwens, RP models, whatever I could find. They can make some interesting stuff, but the DiT isn't trained on them, so you might get worse prompt following, aberrations like extra limbs, or straight-up artifacts. The Z-Image authors state in an issue that they trained the Qwen along with the image portion; hopefully they're not bullshitting, I haven't hashed the weights to check. So using random Qwens has the downside of missing that joint training.
What we don't want is the model generating a "refusal" embedding because of the prompt. That's how your women end up clothed, putting their arms over their naughty bits, etc. Abliteration, while not perfect, deletes this. So my suggestion is to try a few different Qwens and see which one you like best; they're pretty small.
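For anyone unfamiliar with what abliteration actually does: the usual technique is to find a direction in activation space associated with refusals and project it out of the hidden states. This is not the exact heretic implementation, just a minimal numpy sketch of the core idea; the dimensions and values here are illustrative toys.

```python
import numpy as np

def ablate_direction(hidden, refusal_dir):
    """Remove the component along `refusal_dir` from every activation.

    Each row of `hidden` is projected onto the unit-normalized refusal
    direction and that component is subtracted, so the model can no
    longer express anything along that axis.
    """
    r = refusal_dir / np.linalg.norm(refusal_dir)
    return hidden - np.outer(hidden @ r, r)

# toy example: 4 token activations in an 8-dim space (hypothetical values)
rng = np.random.default_rng(0)
hidden = rng.normal(size=(4, 8))
refusal = rng.normal(size=8)
cleaned = ablate_direction(hidden, refusal)

# after ablation the activations have ~zero component along the refusal axis
print(np.abs(cleaned @ (refusal / np.linalg.norm(refusal))).max())
```

In the real thing the refusal direction is estimated from the model itself (e.g. by contrasting activations on harmless vs. refused prompts), and the projection is baked into the weights rather than applied at runtime.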
Full model would be nice for some (to quant to different types, doing scaled and so on). Thanks for doing this. Your model should align better than most with Z-Image, given they trained it with it. I mean, if they did.
Full model is: https://huggingface.co/Lockout/qwen3-4b-heretic-zimage/tree/main/qwen-4b-zimage-heretic
I should also upload the re-run, it has way lower KLD.
how do you get a single .safetensors file? I do:
cat model-00001-of-00002.safetensors model-00002-of-00002.safetensors > qwen_3_4b-hereticV2-zimage.safetensors
but it throws an error (both the GGUF and the original sharded model work fine)
@CamiloMM Personally, I use the python safetensors library to do the merging. The following code snippet should work:
from safetensors.torch import load_file, save_file

# load each shard into its own state dict
state_dict1 = load_file("model-00001-of-00002.safetensors")
state_dict2 = load_file("model-00002-of-00002.safetensors")

# dict union (Python 3.9+); the shards hold disjoint tensor names,
# so there are no key collisions to worry about
merged_state_dict = state_dict1 | state_dict2

save_file(merged_state_dict, "qwen_3_4b-hereticV2-zimage.safetensors")
Thanks, that worked!
I thought those files were just one big file cut at regular intervals, because Gemini told me that lol. Looking into it, it seems each shard is a complete safetensors file, just holding some of the layers. I have no idea why HuggingFace chose that as opposed to just cutting the bytes up normally.
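You can actually see this from the safetensors format itself: every file starts with an 8-byte little-endian length followed by a JSON header listing that file's tensors, which is also why raw `cat` fails (the second shard's header ends up in the middle of the data). A small stdlib-only sketch, assuming the shard filenames from above:

```python
import json
import os
import struct

def read_safetensors_header(path):
    """Read the JSON header of a .safetensors file.

    Per the safetensors format: the first 8 bytes are a little-endian
    u64 giving the header size, followed by that many bytes of JSON
    mapping tensor names to dtype/shape/offsets.
    """
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len))

# each shard has its own complete header, proving they are standalone
# files holding a subset of tensors rather than a byte-split blob
for shard in ("model-00001-of-00002.safetensors",
              "model-00002-of-00002.safetensors"):
    if not os.path.exists(shard):
        continue  # skip if the shards aren't in the working directory
    header = read_safetensors_header(shard)
    tensor_names = [k for k in header if k != "__metadata__"]
    print(shard, "->", len(tensor_names), "tensors")
```

The per-shard headers are presumably the point: each file can be downloaded, verified, and memory-mapped independently.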