Diffusion-DDIM: DDIM Sampler on CIFAR-10
An extension of the DDPM-CIFAR10 model that adds a Denoising Diffusion Implicit Models (DDIM) sampler for fast, deterministic inference. The underlying U-Net is trained identically to the DDPM variant; only the sampling procedure differs.
Model Description
DDIM (Song et al., 2020) reformulates the reverse diffusion process as a non-Markovian chain, allowing the model to sample in far fewer steps than standard DDPM while maintaining high image quality. The same noise-predicting U-Net trained with DDPM can be used directly with the DDIM sampler without retraining.
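For reference, the DDIM update from Song et al. (2020) is written below in DDPM notation. Here $\bar\alpha_t = \prod_{s \le t} (1 - \beta_s)$, $\epsilon_\theta$ is the U-Net's noise prediction, and $t_{\text{prev}} < t$ is the next timestep in the chosen sub-sequence. The symbols follow the papers and are not necessarily the variable names used in diffusion.py.

```latex
\hat{x}_0 = \frac{x_t - \sqrt{1 - \bar\alpha_t}\,\epsilon_\theta(x_t, t)}{\sqrt{\bar\alpha_t}},
\qquad
x_{t_{\text{prev}}} = \sqrt{\bar\alpha_{t_{\text{prev}}}}\,\hat{x}_0
  + \sqrt{1 - \bar\alpha_{t_{\text{prev}}} - \sigma_t^2}\,\epsilon_\theta(x_t, t)
  + \sigma_t z,
\quad z \sim \mathcal{N}(0, I)
```

Setting $\sigma_t = 0$ removes the only source of randomness after the initial noise $x_T$, which is what makes the sampler deterministic.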
DDIM vs DDPM
| Property | DDPM | DDIM |
|---|---|---|
| Sampling process | Markovian (stochastic) | Non-Markovian (deterministic) |
| Required steps | ~1000 | 50-200 (configurable) |
| Speed | Slow | Fast |
| Determinism | Stochastic | Deterministic (given same seed) |
| Retraining needed | N/A | No (same model weights) |
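The reduced step count comes from visiting only a subset of the training timesteps. A minimal sketch of one common choice, an evenly spaced sub-sequence, is shown here. The helper name `ddim_timesteps` is hypothetical, and the sampler in diffusion.py may select its steps differently.

```python
def ddim_timesteps(train_steps: int, sample_steps: int) -> list:
    """Evenly spaced subset of the training timesteps, in descending order."""
    if not 1 <= sample_steps <= train_steps:
        raise ValueError("sample_steps must be in [1, train_steps]")
    stride = train_steps // sample_steps
    # Ascending indices 0, stride, 2*stride, ...; sampling walks them in reverse.
    steps = list(range(0, stride * sample_steps, stride))
    return steps[::-1]


schedule = ddim_timesteps(train_steps=1000, sample_steps=50)
print(schedule[:3], schedule[-3:], len(schedule))
# [980, 960, 940] [40, 20, 0] 50
```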
Architecture
Identical U-Net to Diffusion-CIFAR10: ResBlocks with GroupNorm, self-attention at the bottleneck, sinusoidal time embeddings, strided Conv2d downsampling, nearest-neighbour upsampling.
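As an illustration of the sinusoidal time embeddings mentioned above, here is a framework-free sketch. It assumes the Transformer-style formulation with base 10000 and interleaved sin/cos pairs. The function name is hypothetical, and model.py may order or scale the components differently.

```python
import math


def sinusoidal_embedding(t: int, dim: int) -> list:
    """Embed timestep t as sin/cos pairs at geometrically spaced frequencies."""
    if dim % 2 != 0:
        raise ValueError("dim must be even")
    half = dim // 2
    emb = []
    for i in range(half):
        # Frequencies decay from 1 down towards 1/10000 (assumed base).
        freq = math.exp(-math.log(10000.0) * i / half)
        emb.extend([math.sin(t * freq), math.cos(t * freq)])
    return emb


vec = sinusoidal_embedding(t=500, dim=128)  # one 128-dimensional vector per timestep
```

In a U-Net of this kind, a vector like this is typically passed through a small MLP and added to the feature maps inside each ResBlock.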
Diffusion Process
- Trainer: GaussianDiffusionTrainer, standard DDPM noise-prediction training
- Sampler: GaussianDiffusionSampler with DDIM update rule
- Schedule: Linear beta schedule
- Loss: MSE between predicted and actual noise (epsilon parameterization)
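The pieces above can be sketched on a single scalar "pixel" without any framework. This is an illustrative sketch, not the code in diffusion.py: the function names are hypothetical, and the schedule endpoints 1e-4 and 0.02 over 1000 steps are assumed here for the example.

```python
import math


def linear_alpha_bars(beta_1: float, beta_T: float, T: int) -> list:
    """Cumulative products of (1 - beta_t) for a linear beta schedule (T >= 2)."""
    alpha_bars, prod = [], 1.0
    for t in range(T):
        beta = beta_1 + (beta_T - beta_1) * t / (T - 1)
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars


def q_sample(x0: float, eps: float, a_bar: float) -> float:
    """Forward process: x_t = sqrt(a_bar) * x0 + sqrt(1 - a_bar) * eps."""
    return math.sqrt(a_bar) * x0 + math.sqrt(1.0 - a_bar) * eps


def ddim_step(x_t: float, eps_pred: float, a_bar_t: float, a_bar_prev: float) -> float:
    """Deterministic DDIM update (sigma = 0) from timestep t to an earlier one."""
    x0_pred = (x_t - math.sqrt(1.0 - a_bar_t) * eps_pred) / math.sqrt(a_bar_t)
    return math.sqrt(a_bar_prev) * x0_pred + math.sqrt(1.0 - a_bar_prev) * eps_pred


alpha_bars = linear_alpha_bars(1e-4, 0.02, 1000)
x0, eps = 0.5, -1.2  # one clean value and one noise draw
x_800 = q_sample(x0, eps, alpha_bars[800])
# Stand in for the network with a perfect noise prediction (eps itself).
x_400 = ddim_step(x_800, eps, alpha_bars[800], alpha_bars[400])
```

With a perfect noise prediction, the 400-step jump lands on the same value the forward process would give at t=400. This is why a network trained with the MSE noise-prediction loss can be reused with larger strides.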
Training Details
| Parameter | Value |
|---|---|
| Dataset | CIFAR-10 (50,000 images, 32x32 RGB) |
| Epochs | 200 (checkpoint: ckpt_199.pth) |
| Optimizer | AdamW |
| LR Schedule | Cosine annealing + linear warmup |
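The shape of a cosine-annealing schedule with linear warmup can be sketched as a plain function of the epoch. The peak learning rate and warmup length below are hypothetical placeholders, since this README does not state them. The GradualWarmupScheduler in scheduler.py may also ramp differently.

```python
import math


def lr_at(epoch: int, peak_lr: float, warmup_epochs: int, total_epochs: int) -> float:
    """Linear warmup to peak_lr, then cosine annealing towards zero."""
    if epoch < warmup_epochs:
        return peak_lr * (epoch + 1) / warmup_epochs
    progress = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


# Hypothetical values: peak LR 1e-4, 20 warmup epochs, 200 epochs in total.
lrs = [lr_at(e, peak_lr=1e-4, warmup_epochs=20, total_epochs=200) for e in range(200)]
```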
Sample Images
Generated samples at different DDIM step counts are included in SampledImgs/:
| File | DDIM steps |
|---|---|
| SampledNoGuidenceImgs_400.png | 400 steps |
| SampledNoGuidenceImgs_600.png | 600 steps |
Repository Contents
| File | Description |
|---|---|
| model.py | U-Net architecture |
| diffusion.py | GaussianDiffusionTrainer + DDIM GaussianDiffusionSampler |
| train.py | Training loop |
| scheduler.py | GradualWarmupScheduler |
| main.py | Entry point |
| Checkpoints/ckpt_199.pth | Model checkpoint (epoch 199) |
| SampledImgs/ | Generated samples at various step counts |
How to Use
```python
import torch

from model import UNet
from diffusion import GaussianDiffusionSampler

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Build the U-Net and load the epoch-199 checkpoint.
model = UNet(T=1000, ch=128, ch_mult=[1,2,2,2], attn=[1], num_res_blocks=2, dropout=0.1)
model.load_state_dict(torch.load("Checkpoints/ckpt_199.pth", map_location=device))
model.eval().to(device)

# DDIM sampler: set T to the desired number of inference steps
sampler = GaussianDiffusionSampler(beta_1=1e-4, beta_t=0.02, model=model, T=200).to(device)

with torch.no_grad():
    x_T = torch.randn(16, 3, 32, 32, device=device)  # initial Gaussian noise, 16 images
    samples = sampler(x_T)
```
References
- Ho et al. (2020). Denoising Diffusion Probabilistic Models
- Song et al. (2020). Denoising Diffusion Implicit Models
License
MIT