---
license: openrail++
base_model: "terminusresearch/pixart-900m-1024-ft-v0.6"
tags:
  - pixart_sigma
  - pixart_sigma-diffusers
  - text-to-image
  - image-to-image
  - diffusers
  - simpletuner
  - not-for-all-audiences
  - lora
  - controlnet
  - template:sd-lora
  - standard
pipeline_tag: text-to-image
inference: true
widget:
- text: 'A photo-realistic image of a cat'
  parameters:
    negative_prompt: 'ugly, cropped, blurry, low-quality, mediocre average'
  output:
    url: ./assets/image_0_0.png
---
# pixart-controlnet-lora-test

This is a ControlNet PEFT LoRA derived from [terminusresearch/pixart-900m-1024-ft-v0.6](https://huggingface.co/terminusresearch/pixart-900m-1024-ft-v0.6).

The main validation prompt used during training was:
```
A photo-realistic image of a cat
```

## Validation settings
- CFG: `4.0`
- CFG Rescale: `0.0`
- Steps: `16`
- Sampler: `ddim`
- Seed: `42`
- Resolution: `1024x1024`

Note: The validation settings are not necessarily the same as the [training settings](#training-settings).

You can find some example images in the following gallery:

<Gallery />

The text encoder **was not** trained.
You may reuse the base model text encoder for inference.
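
To make the validation settings concrete, here is a minimal sketch of what 16 sampling steps with `trailing` timestep spacing and a CFG scale of 4.0 amount to numerically. It is an illustration only, not SimpleTuner's or Diffusers' actual code: the function names are hypothetical, a 1000-step training schedule is assumed (the card does not state it), and predictions are plain Python lists rather than tensors.

```python
def trailing_timesteps(num_inference_steps, num_train_timesteps=1000):
    """Evenly spaced timesteps that start at the last training timestep.

    num_train_timesteps=1000 is an assumption; the card does not state it.
    """
    step_ratio = num_train_timesteps / num_inference_steps
    return [
        round(num_train_timesteps - i * step_ratio) - 1
        for i in range(num_inference_steps)
    ]


def apply_cfg(uncond, cond, guidance_scale):
    """Classifier-free guidance: move from the unconditional prediction
    towards the conditional one by a factor of guidance_scale."""
    return [u + guidance_scale * (c - u) for u, c in zip(uncond, cond)]


timesteps = trailing_timesteps(16)          # Steps: 16
guided = apply_cfg([1.0, 2.0], [1.0, 3.0], guidance_scale=4.0)  # CFG: 4.0
print(timesteps[0], timesteps[-1], guided)
```

With a CFG rescale of `0.0`, as listed above, no rescaling is applied to the guided prediction, so the sketch leaves that step out.
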
## Training settings
- Training epochs: 24
- Training steps: 150
- Learning rate: 0.0001
- Learning rate schedule: constant
- Warmup steps: 500
- Max grad value: 0.01
- Effective batch size: 1
- Micro-batch size: 1
- Gradient accumulation steps: 1
- Number of GPUs: 1
- Gradient checkpointing: False
- Prediction type: epsilon (extra parameters=['training_scheduler_timestep_spacing=trailing', 'inference_scheduler_timestep_spacing=trailing', 'controlnet_enabled'])
- Optimizer: adamw_bf16
- Trainable parameter precision: Pure BF16
- Base model precision: `no_change`
- Caption dropout probability: 0.0%
- LoRA Rank: 64
- LoRA Alpha: 64.0
- LoRA Dropout: 0.1
- LoRA initialisation style: default
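
A few of the values above follow from one another. The sketch below recomputes them from the listed settings; the helper names are hypothetical, and the LoRA scale assumes the conventional `alpha / rank` scaling, which the card does not spell out.

```python
def effective_batch_size(micro_batch, grad_accum_steps, num_gpus):
    """Samples contributing to one optimizer step."""
    return micro_batch * grad_accum_steps * num_gpus


def lora_scale(alpha, rank):
    """Multiplier on the low-rank update, assuming alpha / rank scaling."""
    return alpha / rank


def dataset_passes(steps, batch_size, num_images):
    """How many times the optimizer has walked through the dataset."""
    return steps * batch_size / num_images


batch = effective_batch_size(micro_batch=1, grad_accum_steps=1, num_gpus=1)
print(batch)                          # effective batch size
print(lora_scale(64.0, 64))           # LoRA Alpha / LoRA Rank
print(dataset_passes(150, batch, 6))  # 150 steps over 6 images
```

The last figure comes out at about 25 passes over the 6-image dataset, close to the 24 epochs reported; how the trainer counts epochs is not stated in the card.
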
## Datasets
### antelope-data-1024
- Repeats: 0
- Total number of images: 6
- Total number of aspect buckets: 1
- Resolution: 1.048576 megapixels
- Cropped: True
- Crop style: center
- Crop aspect: square
- Used for regularisation data: No
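
The dataset resolution of 1.048576 megapixels corresponds to a 1024x1024 image, and the crop settings describe a centered square crop. The following is a rough sketch of that geometry, not SimpleTuner's actual preprocessing (which may also resize); both function names are hypothetical.

```python
def megapixels(width, height):
    """Pixel area expressed in millions of pixels."""
    return width * height / 1_000_000


def center_square_crop_box(width, height):
    """Box (left, upper, right, lower) of the largest centered square,
    in the same order PIL's Image.crop expects."""
    side = min(width, height)
    left = (width - side) // 2
    upper = (height - side) // 2
    return (left, upper, left + side, upper + side)


print(megapixels(1024, 1024))
print(center_square_crop_box(1536, 1024))
```

A landscape 1536x1024 source would lose 256 pixels from each side under this crop, which is worth keeping in mind when preparing control images with off-center subjects.
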
## Inference
```python
import torch
from PIL import Image
from diffusers import PixArtSigmaControlNetPipeline
# If you're not in the SimpleTuner environment, this import will fail.
from helpers.models.pixart.controlnet import PixArtSigmaControlNetAdapterModel

# Load base model
base_model_id = "terminusresearch/pixart-900m-1024-ft-v0.6"
controlnet_id = "bghira/pixart-controlnet-lora-test"

# Load ControlNet adapter
controlnet = PixArtSigmaControlNetAdapterModel.from_pretrained(
    f"{controlnet_id}/controlnet"
)

# Create pipeline
pipeline = PixArtSigmaControlNetPipeline.from_pretrained(
    base_model_id,
    controlnet=controlnet,
    torch_dtype=torch.bfloat16,
)
# Pick the device once so the pipeline and the generator agree on it.
device = 'cuda' if torch.cuda.is_available() else 'cpu'
pipeline.to(device)

# Load your control image
control_image = Image.open("path/to/control/image.png")

# Generate
prompt = "A photo-realistic image of a cat"
image = pipeline(
    prompt=prompt,
    image=control_image,
    num_inference_steps=16,
    guidance_scale=4.0,
    generator=torch.Generator(device=device).manual_seed(42),
    controlnet_conditioning_scale=1.0,
).images[0]
image.save("output.png")
```