---
license: cc-by-4.0
tags:
- text-to-image
- image-to-image
- stable-diffusion
- lora
- diffusers
- medical-imaging
- retina
- chest-xray
- synthetic-data
- privacy-preserving
- controlled-generation
base_model: CompVis/stable-diffusion-v1-4
widget:
- text: "retina fundoscopy right eye dilated age=45 gender=male bp systolic=120"
- text: "retina fundoscopy left eye dilated age=70 gender=female bp systolic=165"
- text: "retina fundoscopy right eye dilated age=30 gender=female bp systolic=110"
- text: "chest xray view=pa age=55 gender=female demonstrating no finding"
- text: "chest xray view=pa age=40 gender=male demonstrating cardiomegaly"
- text: "chest xray view=pa age=65 gender=female demonstrating pleural effusion"
---

# SynthMed LoRAs: Privacy-Preserving Medical Image Generation

Two LoRA adapters for [CompVis/stable-diffusion-v1-4](https://huggingface.co/CompVis/stable-diffusion-v1-4) that generate synthetic medical images conditioned on patient metadata.

These models support two generation modes:

- **Text-to-image (T2I)**: generate a novel synthetic image from a prompt alone.
- **Image-to-image (I2I)**: transform a real image into a synthetic counterpart while preserving clinical signals. The `strength` parameter trades off re-identification risk against biomarker fidelity, enabling tunable de-identification.

| Adapter | Domain | Conditioned on | Weight file |
|---------|--------|---------------|-------------|
| `hpp-retina` | Retinal fundus photography | Age, sex, systolic BP | `hpp-retina/lora_weights.safetensors` |
| `cxr` | Chest X-ray (PA view) | Age, sex, pathology findings | `cxr/lora_weights.safetensors` |

---

## About

Patient privacy constraints limit the sharing of clinical imaging datasets.
These LoRAs were developed as part of a study evaluating image-conditioned diffusion as a practical de-identification approach: given a real image, I2I diffusion produces a synthetic counterpart that preserves downstream-relevant clinical signals while reducing re-identification risk.

Key findings from the paper:

- **I2I outperforms T2I** on pixel/perceptual fidelity and biomarker agreement at all conditioning strengths.
- **Privacy–utility tradeoff** (retinal): biomarker agreement (hemoglobin, Pearson *r*) drops from 0.83 at strength 0.1 to 0.32 at strength 1.0, while top-1 re-identification rate falls from 100% to ~2%.
- **Cross-cohort transfer**: pretraining on I2I synthetic retinal images performs comparably to real-image pretraining when transferring to UK Biobank, and surpasses it in the smallest fine-tuning regimes.
- **CXR caveat**: chest X-rays remain substantially re-identifiable even at high I2I strengths.

---

## Quick Start

### Text-to-image (T2I)

```python
import torch
from diffusers import DDIMScheduler, StableDiffusionPipeline

# Load the base model in half precision; the safety checker is disabled.
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
    safety_checker=None,
)
pipe.scheduler = DDIMScheduler.from_pretrained(
    "CompVis/stable-diffusion-v1-4", subfolder="scheduler"
)

# Attach the retinal adapter, then move the pipeline to the GPU.
pipe.load_lora_weights("doronys/synthmed-loras", weight_name="hpp-retina/lora_weights.safetensors")
pipe = pipe.to("cuda")

image = pipe(
    "retina fundoscopy right eye dilated age=50 gender=male bp systolic=130",
    num_inference_steps=50,
).images[0]
image.save("retina_t2i.png")
```

### Image-to-image (I2I): tunable de-identification

```python
from PIL import Image
import torch
from diffusers import DDIMScheduler, StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
    safety_checker=None,
)
pipe.scheduler = DDIMScheduler.from_pretrained(
    "CompVis/stable-diffusion-v1-4", subfolder="scheduler"
)
pipe.load_lora_weights("doronys/synthmed-loras", weight_name="hpp-retina/lora_weights.safetensors")
pipe = pipe.to("cuda")

# Convert to 3-channel RGB so grayscale or RGBA files are handled, then resize
# to the training resolution.
source = Image.open("patient_retina.png").convert("RGB").resize((512, 512))

# strength: values near 0.0 → near-identity, 1.0 → effectively unconstrained T2I
image = pipe(
    prompt="retina fundoscopy right eye dilated age=50 gender=male bp systolic=130",
    image=source,
    strength=0.5,
    num_inference_steps=50,
).images[0]
image.save("retina_i2i.png")
```

Switch to the CXR adapter by changing `weight_name="cxr/lora_weights.safetensors"` and using a CXR prompt.

---

## Prompt Format

Labels are appended to a domain-specific prefix as `key=value` tokens.

**Retinal fundus (`hpp-retina`)**

```
retina fundoscopy {laterality} eye dilated age={age} gender={male|female} bp systolic={sbp}
```

Examples:

- `retina fundoscopy right eye dilated age=45 gender=male bp systolic=120`
- `retina fundoscopy left eye dilated age=72 gender=female bp systolic=155`

**Chest X-ray (`cxr`)**

```
chest xray view=pa age={age} gender={male|female} demonstrating {finding}
```

Common findings: `no finding`, `cardiomegaly`, `pleural effusion`, `infiltration`, `atelectasis`, `pneumonia`, `edema`, `pneumothorax`, `consolidation`, `effusion`

Examples:

- `chest xray view=pa age=55 gender=female demonstrating no finding`
- `chest xray view=pa age=63 gender=male demonstrating cardiomegaly`

---

## Image Examples

### Retinal Fundus: T2I vs I2I (strength=0.5)

Each pair corresponds to the same real source image and prompt. T2I generates freely from the text alone; I2I also conditions on the real image, preserving anatomical structure.
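The prompts used in these examples follow the templates from the Prompt Format section. A minimal sketch of helpers that assemble such prompts from metadata is shown here; the function names `retina_prompt` and `cxr_prompt` and the input validation are hypothetical conveniences, not part of the released code.

```python
def retina_prompt(laterality: str, age: int, gender: str, sbp: int) -> str:
    """Build a prompt following the hpp-retina template (hypothetical helper)."""
    if laterality not in ("left", "right"):
        raise ValueError("laterality must be 'left' or 'right'")
    if gender not in ("male", "female"):
        raise ValueError("gender must be 'male' or 'female'")
    return (
        f"retina fundoscopy {laterality} eye dilated "
        f"age={age} gender={gender} bp systolic={sbp}"
    )


def cxr_prompt(age: int, gender: str, finding: str = "no finding") -> str:
    """Build a prompt following the cxr template (hypothetical helper)."""
    if gender not in ("male", "female"):
        raise ValueError("gender must be 'male' or 'female'")
    return f"chest xray view=pa age={age} gender={gender} demonstrating {finding}"


if __name__ == "__main__":
    print(retina_prompt("right", 42, "female", 104))
    print(cxr_prompt(55, "female"))
```

Keeping prompt construction in one place avoids small deviations from the template (for example a missing `bp systolic=` token), which may matter because the adapters were conditioned on this exact token layout.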
**`retina fundoscopy right eye dilated age=42 gender=female bp systolic=104`**

| T2I | I2I |
|-----|-----|
| ![](images/retina_t2i_f42_bp104.png) | ![](images/retina_i2i_f42_bp104.png) |

**`retina fundoscopy right eye dilated age=67 gender=female bp systolic=123`**

| T2I | I2I |
|-----|-----|
| ![](images/retina_t2i_f67_bp123.png) | ![](images/retina_i2i_f67_bp123.png) |

**`retina fundoscopy right eye dilated age=48 gender=male bp systolic=127`**

| T2I | I2I |
|-----|-----|
| ![](images/retina_t2i_m48_bp127.png) | ![](images/retina_i2i_m48_bp127.png) |

### Chest X-ray: T2I

| | | |
|-|-|-|
| ![](images/cxr_t2i_f58_no_finding.png) | ![](images/cxr_t2i_f73_cardiomegaly.png) | ![](images/cxr_t2i_m74_effusion.png) |
| `age=58 female, no finding` | `age=73 female, cardiomegaly` | `age=74 male, effusion` |
| ![](images/cxr_t2i_m47_fibrosis.png) | ![](images/cxr_t2i_m40_atelectasis.png) | ![](images/cxr_t2i_m48_nodule.png) |
| `age=47 male, fibrosis, infiltration` | `age=40 male, atelectasis, effusion` | `age=48 male, atelectasis, nodule` |

---

## Training Details

Both adapters were fine-tuned from `CompVis/stable-diffusion-v1-4` using LoRA on attention layers (`to_k`, `to_q`, `to_v`, `to_out.0`).

| Parameter | Value |
|-----------|-------|
| LoRA rank | 64 |
| LoRA alpha | 32 |
| Resolution | 512 × 512 |
| Training steps | 20,000 |
| Learning rate | 1e-4 (constant) |
| Batch size | 12 |
| Mixed precision | fp16 |

**hpp-retina** was trained on retinal fundus images from the [Human Phenotype Project (10K)](https://www.nature.com/articles/s41591-025-03790-9).

**cxr** was trained on the [NIH ChestX-ray8](https://nihcc.app.box.com/v/ChestXray-NIHCC) dataset.
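A few derived quantities follow from the training hyperparameters. The sketch below is arithmetic only and rests on two assumptions not stated in the source: that the LoRA update is scaled by `alpha / rank` (the common convention), and that the batch size of 12 is the effective batch size. The 320-wide projection is an illustrative layer size, not a count for the full adapter.

```python
# Values copied from the training details; everything derived is an estimate.
RANK = 64
ALPHA = 32
TRAINING_STEPS = 20_000
BATCH_SIZE = 12

# Assumption: the low-rank update B @ A is multiplied by alpha / rank.
lora_scale = ALPHA / RANK

# Assumption: 12 is the effective batch size (no extra accumulation or devices).
samples_seen = TRAINING_STEPS * BATCH_SIZE


def lora_param_count(d_in: int, d_out: int, rank: int = RANK) -> int:
    """Trainable parameters for one adapted linear layer.

    A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * (d_in + d_out)


if __name__ == "__main__":
    print(f"LoRA scale: {lora_scale}")
    print(f"Samples seen during training: {samples_seen:,}")
    # Illustrative example: one 320 -> 320 attention projection.
    print(f"Params for a 320x320 projection: {lora_param_count(320, 320):,}")
```

Under these assumptions the update is applied at half strength (scale 0.5) and each adapter saw 240,000 training samples, counting repeats.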
---

## Citation

If you use these models in your research, please cite:

```bibtex
@article{synthmed2026,
  title  = {Privacy-Preserving Synthetic Medical Images via Image Conditioned Diffusion Models},
  author = {Yaya-Stupp, Doron and Lutsker, Guy and Spiegel, Or and Segal, Eran},
  year   = {2026},
}
```

---

## License

[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/): free to use, share, and adapt with attribution.