	---
	license: apache-2.0
	---

	# IterComp (ICLR 2025)

	Official repository for the paper: [IterComp](https://arxiv.org/abs/2410.07171).
	<p align="left">
	<a href='https://arxiv.org/abs/2410.07171'>
	<img src='https://img.shields.io/badge/Arxiv-2410.07171-A42C25?style=flat&logo=arXiv&logoColor=A42C25'></a>
	<a href='https://github.com/YangLing0818/IterComp'>
	<img src='https://img.shields.io/badge/GitHub-Code-black?style=flat&logo=github&logoColor=white'></a>
	</p>

	<img src="./itercomp.png" style="zoom:50%;" />

	## News🔥🔥🔥

	[2025.02] We have open-sourced three composition-aware reward models in our [HuggingFace Repo](https://huggingface.co/comin/IterComp/tree/main/reward_models). They can be used for preference learning and as new image generation evaluators.

	[2025.02] We have enhanced IterComp-RPG with LLMs that have strong reasoning capabilities, including [DeepSeek-R1](https://github.com/deepseek-ai/DeepSeek-R1), [OpenAI o3-mini](https://openai.com/index/openai-o3-mini/), and [OpenAI o1](https://openai.com/index/learning-to-reason-with-llms/), to achieve outstanding compositional image generation under complex prompts.

	[2025.01] IterComp has been accepted to ICLR 2025!

	[2024.10] Checkpoints of the base diffusion model are publicly available on [HuggingFace Repo](https://huggingface.co/comin/IterComp).

	[2024.10] The main code of IterComp has been released.

	## Introduction

	IterComp is one of the new state-of-the-art compositional generation methods. In this repository, we release the model trained from [SDXL Base 1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0).

	## Text-to-Image Usage

	```python
	from diffusers import DiffusionPipeline
	import torch

	pipe = DiffusionPipeline.from_pretrained("comin/IterComp", torch_dtype=torch.float16, use_safetensors=True)
	pipe.to("cuda")
	# if using torch < 2.0
	# pipe.enable_xformers_memory_efficient_attention()

	prompt = "An astronaut riding a green horse"
	image = pipe(prompt=prompt).images[0]
	image.save("output.png")
	```
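
For reproducible outputs, the snippet above can be wrapped so that sampling parameters and the random seed are set explicitly. The following is a minimal sketch, not part of the official code: `build_generation_kwargs` and `generate` are hypothetical helper names, and the defaults (50 steps, guidance scale 5.0, 1024x1024) are assumed from the standard SDXL pipeline in `diffusers` rather than values recommended by the IterComp authors.

```python
from typing import Any, Dict, Optional


def build_generation_kwargs(
    prompt: str,
    negative_prompt: Optional[str] = None,
    steps: int = 50,
    guidance_scale: float = 5.0,
    height: int = 1024,
    width: int = 1024,
) -> Dict[str, Any]:
    """Collect keyword arguments for an SDXL pipeline call (hypothetical helper)."""
    # The SDXL pipeline rejects sizes that are not multiples of 8.
    if height % 8 != 0 or width % 8 != 0:
        raise ValueError("height and width must be multiples of 8")
    kwargs: Dict[str, Any] = {
        "prompt": prompt,
        "num_inference_steps": steps,
        "guidance_scale": guidance_scale,
        "height": height,
        "width": width,
    }
    if negative_prompt:
        kwargs["negative_prompt"] = negative_prompt
    return kwargs


def generate(prompt: str, seed: int = 0, out_path: str = "output.png", **overrides: Any) -> None:
    """Load IterComp, sample one image with a fixed seed, and save it (hypothetical helper)."""
    # Heavy imports stay local so the helper above can be used without a GPU.
    import torch
    from diffusers import DiffusionPipeline

    pipe = DiffusionPipeline.from_pretrained(
        "comin/IterComp", torch_dtype=torch.float16, use_safetensors=True
    )
    pipe.to("cuda")
    # A seeded generator makes repeated runs use the same initial noise.
    generator = torch.Generator(device="cuda").manual_seed(seed)
    image = pipe(generator=generator, **build_generation_kwargs(prompt, **overrides)).images[0]
    image.save(out_path)
```

Calling `generate("An astronaut riding a green horse", seed=42, steps=30)` would then write `output.png`. Running it again with the same seed and settings on the same hardware and library versions should give a matching image.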

	IterComp can serve as a powerful backbone for various compositional generation methods, such as [RPG](https://github.com/YangLing0818/RPG-DiffusionMaster) and [Omost](https://github.com/lllyasviel/Omost). We recommend integrating IterComp into these approaches to achieve more advanced compositional generation results.

	## Citation

	```
	@article{zhang2024itercomp,
	title={IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation},
	author={Zhang, Xinchen and Yang, Ling and Li, Guohao and Cai, Yaqi and Xie, Jiake and Tang, Yong and Yang, Yujiu and Wang, Mengdi and Cui, Bin},
	journal={arXiv preprint arXiv:2410.07171},
	year={2024}
	}
	```
