---
license: cc-by-4.0
tags:
  - text-to-image
  - image-to-image
  - stable-diffusion
  - lora
  - diffusers
  - medical-imaging
  - retina
  - chest-xray
  - synthetic-data
  - privacy-preserving
  - controlled-generation
base_model: CompVis/stable-diffusion-v1-4
widget:
  - text: "retina fundoscopy right eye dilated age=45 gender=male bp systolic=120"
  - text: "retina fundoscopy left eye dilated age=70 gender=female bp systolic=165"
  - text: "retina fundoscopy right eye dilated age=30 gender=female bp systolic=110"
  - text: "chest xray view=pa age=55 gender=female demonstrating no finding"
  - text: "chest xray view=pa age=40 gender=male demonstrating cardiomegaly"
  - text: "chest xray view=pa age=65 gender=female demonstrating pleural effusion"
---

# SynthMed LoRAs — Privacy-Preserving Medical Image Generation

Two LoRA adapters for [CompVis/stable-diffusion-v1-4](https://huggingface.co/CompVis/stable-diffusion-v1-4) that generate synthetic medical images conditioned on patient metadata.

These models support two generation modes:

- **Text-to-image (T2I)** — generate a novel synthetic image from a prompt alone.
- **Image-to-image (I2I)** — transform a real image into a synthetic counterpart while preserving clinical signals. The `strength` parameter trades off re-identification risk against biomarker fidelity, enabling tunable de-identification.

| Adapter | Domain | Conditioned on | Weight file |
|---------|--------|---------------|-------------|
| `hpp-retina` | Retinal fundus photography | Age, sex, systolic BP | `hpp-retina/lora_weights.safetensors` |
| `cxr` | Chest X-ray (PA view) | Age, sex, pathology findings | `cxr/lora_weights.safetensors` |

---

## About

Patient privacy constraints limit the sharing of clinical imaging datasets. These LoRAs were developed as part of a study evaluating image-conditioned diffusion as a practical de-identification approach: given a real image, I2I diffusion produces a synthetic counterpart that preserves downstream-relevant clinical signals while reducing re-identification risk.

Key findings from the paper:

- **I2I outperforms T2I** on pixel/perceptual fidelity and biomarker agreement at all conditioning strengths.
- **Privacy–utility tradeoff** (retinal): biomarker agreement (hemoglobin, Pearson *r*) drops from 0.83 at strength 0.1 to 0.32 at strength 1.0, while top-1 re-identification rate falls from 100% to ~2%.
- **Cross-cohort transfer**: pretraining on I2I synthetic retinal images performs comparably to real-image pretraining when transferring to UK Biobank, and surpasses it in the smallest fine-tuning regimes.
- **CXR caveat**: chest X-rays remain substantially re-identifiable even at high I2I strengths.
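
The two metrics quoted above can be illustrated with a small sketch. This card does not describe the paper's exact evaluation protocol, so the code assumes a common setup: each image is represented by an embedding vector, a synthetic image counts as re-identified when its nearest real embedding (by cosine similarity) is its own source, and biomarker agreement is the Pearson correlation between values predicted from real and synthetic images. The function names and toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def top1_reid_rate(synthetic, real):
    """Fraction of synthetic embeddings whose nearest real embedding is their source.

    Assumes synthetic[i] was generated from real[i].
    """
    hits = 0
    for i, s in enumerate(synthetic):
        nearest = max(range(len(real)), key=lambda j: cosine(s, real[j]))
        hits += int(nearest == i)
    return hits / len(synthetic)


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Toy embeddings (made up): the third synthetic vector drifts toward real[1].
real = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
synthetic = [[0.9, 0.1], [0.1, 0.9], [0.2, 1.0]]
rate = top1_reid_rate(synthetic, real)

# Toy biomarker values (made up) predicted from real and synthetic images.
agreement = pearson_r([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```

In this reading, a lower re-identification rate means better privacy and a higher correlation means better preserved clinical signal.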

---

## Quick Start

### Text-to-image (T2I)

```python
import torch
from diffusers import DDIMScheduler, StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
    safety_checker=None,
)
pipe.scheduler = DDIMScheduler.from_pretrained(
    "CompVis/stable-diffusion-v1-4", subfolder="scheduler"
)
pipe.load_lora_weights("doronys/synthmed-loras", weight_name="hpp-retina/lora_weights.safetensors")
pipe = pipe.to("cuda")

image = pipe(
    "retina fundoscopy right eye dilated age=50 gender=male bp systolic=130",
    num_inference_steps=50,
).images[0]
image.save("retina_t2i.png")
```

### Image-to-image (I2I) — tunable de-identification

```python
from PIL import Image
import torch
from diffusers import DDIMScheduler, StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
    safety_checker=None,
)
pipe.scheduler = DDIMScheduler.from_pretrained(
    "CompVis/stable-diffusion-v1-4", subfolder="scheduler"
)
pipe.load_lora_weights("doronys/synthmed-loras", weight_name="hpp-retina/lora_weights.safetensors")
pipe = pipe.to("cuda")

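# Load the source as 3-channel RGB at the training resolution
source = Image.open("patient_retina.png").convert("RGB").resize((512, 512))

# strength near 0.0 stays close to the source; 1.0 fully noises it (equivalent to T2I)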
image = pipe(
    prompt="retina fundoscopy right eye dilated age=50 gender=male bp systolic=130",
    image=source,
    strength=0.5,
    num_inference_steps=50,
).images[0]
image.save("retina_i2i.png")
```

To use the CXR adapter instead, set `weight_name="cxr/lora_weights.safetensors"` and pass a CXR prompt.
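
To make the effect of `strength` concrete, the sketch below shows how it appears to map to the number of denoising steps actually run, assuming the truncation rule used by the diffusers img2img pipeline (`int(num_inference_steps * strength)`). It also includes a small sweep helper for comparing outputs across strengths. Both `effective_steps` and `sweep_strengths` are hypothetical helpers, not part of diffusers or this repository.

```python
def effective_steps(num_inference_steps: int, strength: float) -> int:
    """Denoising steps actually executed for a given strength (assumed rule)."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0, 1]")
    return min(int(num_inference_steps * strength), num_inference_steps)


def sweep_strengths(pipe, source, prompt, strengths=(0.1, 0.3, 0.5, 0.7, 1.0), steps=50):
    """Run the same source image and prompt at several strengths.

    `pipe` is an img2img pipeline set up as in the I2I example above.
    Returns a dict mapping each strength to its generated image.
    """
    results = {}
    for s in strengths:
        results[s] = pipe(
            prompt=prompt,
            image=source,
            strength=s,
            num_inference_steps=steps,
        ).images[0]
    return results


# With 50 scheduler steps, strength=0.5 runs half of them.
half = effective_steps(50, 0.5)
```

Under this rule, very small strengths can leave too few steps to change the image meaningfully, so raising `num_inference_steps` may help when working near the low end of the range.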

---

## Prompt Format

Each prompt starts with a domain-specific prefix, followed by the metadata labels, mostly written as `key=value` tokens.

**Retinal fundus (`hpp-retina`)**
```
retina fundoscopy {laterality} eye dilated age={age} gender={male|female} bp systolic={sbp}
```

Examples:
- `retina fundoscopy right eye dilated age=45 gender=male bp systolic=120`
- `retina fundoscopy left eye dilated age=72 gender=female bp systolic=155`
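
Building the retinal prompt programmatically avoids formatting slips. The helper below is a minimal sketch that follows the template above; the function name `retina_prompt` and the restriction of laterality to `left`/`right` are assumptions based on the examples shown.

```python
def retina_prompt(laterality: str, age: int, gender: str, systolic_bp: int) -> str:
    """Format a prompt for the hpp-retina adapter from patient metadata."""
    if laterality not in ("left", "right"):
        raise ValueError("laterality must be 'left' or 'right'")
    if gender not in ("male", "female"):
        raise ValueError("gender must be 'male' or 'female'")
    return (
        f"retina fundoscopy {laterality} eye dilated "
        f"age={age} gender={gender} bp systolic={systolic_bp}"
    )


prompt = retina_prompt("right", 45, "male", 120)
```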

**Chest X-ray (`cxr`)**
```
chest xray view=pa age={age} gender={male|female} demonstrating {finding}
```

Common findings: `no finding`, `cardiomegaly`, `pleural effusion`, `infiltration`, `atelectasis`, `pneumonia`, `edema`, `pneumothorax`, `consolidation`, `effusion`

Examples:
- `chest xray view=pa age=55 gender=female demonstrating no finding`
- `chest xray view=pa age=63 gender=male demonstrating cardiomegaly`
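
The CXR template can be filled from tabular metadata in the same way. This is a minimal sketch for single-finding prompts only; the helper name `cxr_prompt` and the record field names are hypothetical, and the records themselves are made-up values.

```python
def cxr_prompt(age: int, gender: str, finding: str = "no finding") -> str:
    """Format a single-finding prompt for the cxr adapter."""
    if gender not in ("male", "female"):
        raise ValueError("gender must be 'male' or 'female'")
    return f"chest xray view=pa age={age} gender={gender} demonstrating {finding}"


# Hypothetical metadata records, e.g. rows read from a CSV file.
records = [
    {"age": 55, "gender": "female", "finding": "no finding"},
    {"age": 63, "gender": "male", "finding": "cardiomegaly"},
]
prompts = [cxr_prompt(r["age"], r["gender"], r["finding"]) for r in records]
```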

---

## Image Examples

### Retinal Fundus — T2I vs I2I (strength=0.5)

Each pair is generated from the same prompt. T2I generates freely from the text alone; I2I also conditions on a real source image, preserving its anatomical structure.

**`retina fundoscopy right eye dilated age=42 gender=female bp systolic=104`**

| T2I | I2I |
|-----|-----|
| ![](images/retina_t2i_f42_bp104.png) | ![](images/retina_i2i_f42_bp104.png) |

**`retina fundoscopy right eye dilated age=67 gender=female bp systolic=123`**

| T2I | I2I |
|-----|-----|
| ![](images/retina_t2i_f67_bp123.png) | ![](images/retina_i2i_f67_bp123.png) |

**`retina fundoscopy right eye dilated age=48 gender=male bp systolic=127`**

| T2I | I2I |
|-----|-----|
| ![](images/retina_t2i_m48_bp127.png) | ![](images/retina_i2i_m48_bp127.png) |

### Chest X-ray — T2I

| | | |
|-|-|-|
| ![](images/cxr_t2i_f58_no_finding.png) | ![](images/cxr_t2i_f73_cardiomegaly.png) | ![](images/cxr_t2i_m74_effusion.png) |
| `age=58 female, no finding` | `age=73 female, cardiomegaly` | `age=74 male, effusion` |
| ![](images/cxr_t2i_m47_fibrosis.png) | ![](images/cxr_t2i_m40_atelectasis.png) | ![](images/cxr_t2i_m48_nodule.png) |
| `age=47 male, fibrosis, infiltration` | `age=40 male, atelectasis, effusion` | `age=48 male, atelectasis, nodule` |

---

## Training Details

Both adapters were fine-tuned from `CompVis/stable-diffusion-v1-4` using LoRA on attention layers (`to_k`, `to_q`, `to_v`, `to_out.0`).

| Parameter | Value |
|-----------|-------|
| LoRA rank | 64 |
| LoRA alpha | 32 |
| Resolution | 512 × 512 |
| Training steps | 20,000 |
| Learning rate | 1e-4 (constant) |
| Batch size | 12 |
| Mixed precision | fp16 |

**hpp-retina** was trained on retinal fundus images from the [Human Phenotype Project (10K)](https://www.nature.com/articles/s41591-025-03790-9).  
**cxr** was trained on the [NIH ChestX-ray8](https://nihcc.app.box.com/v/ChestXray-NIHCC) dataset.
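
As a rough illustration of what rank 64 and alpha 32 imply, the sketch below computes the LoRA update scale and the number of trainable parameters added to one linear projection. It assumes the conventional `alpha / rank` scaling and a low-rank update made of two matrices (`d_in × rank` and `rank × d_out`). The 320-dimensional layer is an illustrative choice, not a figure taken from the training setup.

```python
def lora_scale(alpha: float, rank: int) -> float:
    """Multiplier applied to the low-rank update (conventional alpha / rank)."""
    return alpha / rank


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for one adapted linear layer: A (d_in x r) plus B (r x d_out)."""
    return rank * (d_in + d_out)


scale = lora_scale(alpha=32, rank=64)

# Illustrative 320 -> 320 attention projection (dimension assumed for the example).
added = lora_param_count(320, 320, rank=64)
frozen = 320 * 320  # weights of the original projection, bias ignored
```

With these settings the update is scaled by 0.5. For this example layer the adapter adds 40,960 trainable parameters next to 102,400 frozen weights.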

---

## Citation

If you use these models in your research, please cite:

```bibtex
@article{synthmed2026,
  title   = {Privacy-Preserving Synthetic Medical Images via Image Conditioned Diffusion Models},
  author  = {Yaya-Stupp, Doron and Lutsker, Guy and Spiegel, Or and Segal, Eran},
  year    = {2026},
}
```

---

## License

[CC BY 4.0](https://creativecommons.org/licenses/by/4.0/) — free to use, share, and adapt with attribution.