wan2.2-ti2v-5b-diffusers-8bit

This repository contains mixed q8/BF16 MLX-Gen saved weights for Wan-AI/Wan2.2-TI2V-5B-Diffusers. It is designed for local Apple Silicon inference with mlx-gen.

It uses the mflux/MLX saved-weight layout with MLX quantization tensors. It is not a Diffusers or Transformers from_pretrained() checkpoint.

Source Model

Original model: Wan-AI/Wan2.2-TI2V-5B-Diffusers.

This quantized derivative follows the Apache 2.0 license of the source model.

Quantization

This is a mixed q8/BF16 checkpoint:

q8 for quantizable Wan transformer attention and feed-forward linears.
BF16 for the Wan VAE.
BF16 for Wan transformer condition_embedder.* and proj_out.
BF16 for the UMT5 text encoder, norms, convolutions, and other non-quantizable parameters. Scheduler metadata and tokenizer files are not quantized.

The upstream TI2V-5B source snapshot is not uniformly 16-bit on disk: the transformer and VAE safetensors are FP32, while the UMT5 text encoder is BF16. MLX-Gen loads Wan transformer/VAE weights at BF16 runtime precision.
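The layer split listed above can be sketched as a small name-based rule. This is only an illustration under stated assumptions: the function name, the component labels, and the `attn`/`ffn` path markers are hypothetical and are not taken from the checkpoint or from mlx-gen; only `condition_embedder.*` and `proj_out` come from the list above.

```python
# Hypothetical sketch of the mixed q8/BF16 split described above.
# Component labels and the "attn"/"ffn" markers are assumptions for
# illustration; they are not the checkpoint's verified key names.

Q8_MARKERS = ("attn", "ffn")  # assumed markers for attention / feed-forward paths


def target_precision(component: str, path: str, is_linear: bool) -> str:
    """Return "q8" or "bf16" for one module of the pipeline."""
    # The VAE and the UMT5 text encoder stay in BF16.
    if component != "transformer":
        return "bf16"
    # Norms, convolutions, and other non-linear modules are not quantized.
    if not is_linear:
        return "bf16"
    # Transformer layers explicitly kept in BF16.
    if path.startswith("condition_embedder."):
        return "bf16"
    if path == "proj_out" or path.startswith("proj_out."):
        return "bf16"
    # Attention and feed-forward linears are quantized to 8 bits.
    if any(marker in path for marker in Q8_MARKERS):
        return "q8"
    # Anything unrecognized falls back to BF16 in this sketch.
    return "bf16"


if __name__ == "__main__":
    for name in ("blocks.0.attn1.to_q", "blocks.0.ffn.net.0.proj", "proj_out"):
        print(name, target_precision("transformer", name, True))
```

In MLX, a rule of this kind could in principle be supplied as the `class_predicate` argument of `mlx.nn.quantize`; whether mlx-gen applies the split that way is not stated here.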

Measurements

Measured on 2026-06-04 with mlx-gen 0.18.10 on an Apple M5 Max with 128 GiB unified memory.

Validation profile: 1280x704, 17 frames, 20 denoising steps, guidance 5, 24 fps, seed 321, explicit empty negative prompt.

Layout	Storage	Logical Model	Full-Process Physical Peak	Max RSS	MLX Peak	Total Time	Output
Upstream source snapshot	31.9 GiB	10.6 GiB	102.7 GiB	13.7 GiB	58.5 GiB	216.2 s	base-source.mp4
Prepared BF16 package	21.2 GiB	10.6 GiB	102.6 GiB	14.5 GiB	58.5 GiB	261.6 s	prepared-bf16.mp4
This mixed q8/BF16 package	16.9 GiB	6.3 GiB	103.7 GiB	13.8 GiB	54.2 GiB	243.4 s	mixed-q8-bf16.mp4

This package reduces storage, logical model bytes, active MLX model bytes, and MLX allocator peak in the validation profile. It did not reduce full-process physical peak memory in this profile because transient video-generation allocations dominated the run.
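As a worked check of the reductions described above, the relative changes can be recomputed from the table. The snippet below is a minimal sketch: the numbers are copied from the measurement rows, while the dictionary keys and function names are made up for the example.

```python
# Values in GiB, copied from the measurements table above.
# Keys and function names are illustrative, not part of mlx-gen.
MEASUREMENTS = {
    "upstream": {"storage": 31.9, "logical": 10.6, "phys_peak": 102.7, "mlx_peak": 58.5},
    "bf16": {"storage": 21.2, "logical": 10.6, "phys_peak": 102.6, "mlx_peak": 58.5},
    "q8_bf16": {"storage": 16.9, "logical": 6.3, "phys_peak": 103.7, "mlx_peak": 54.2},
}


def reduction_pct(baseline: float, value: float) -> float:
    """Percentage reduction of value relative to baseline (negative means an increase)."""
    return 100.0 * (baseline - value) / baseline


def savings(baseline_name: str, candidate_name: str) -> dict:
    """Per-metric percentage reduction of one layout relative to another."""
    base = MEASUREMENTS[baseline_name]
    cand = MEASUREMENTS[candidate_name]
    return {metric: reduction_pct(base[metric], cand[metric]) for metric in base}


if __name__ == "__main__":
    for metric, pct in savings("bf16", "q8_bf16").items():
        print(f"{metric}: {pct:+.1f}%")
```

Against the prepared BF16 package, storage, logical model size, and MLX peak each drop by about 4.3 GiB, while the full-process physical peak is about 1% higher.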

The upstream source snapshot and the prepared BF16 package produced byte-identical decoded MP4 frames. The mixed q8/BF16 package remained visually close to both, with a mean per-frame MAE of 1.66 against the source/BF16 output.
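For readers unfamiliar with the metric, a mean per-frame MAE can be computed roughly as follows. This is a sketch, not the validation script used for the numbers above: it assumes each decoded frame is available as raw 8-bit pixel bytes of equal length, and the function names are hypothetical. The pixel scale behind the reported 1.66 is not stated in this card.

```python
from statistics import fmean


def frame_mae(frame_a: bytes, frame_b: bytes) -> float:
    """Mean absolute difference between two frames of raw 8-bit pixel values."""
    if len(frame_a) != len(frame_b):
        raise ValueError("frames must have the same number of pixel values")
    return fmean(abs(x - y) for x, y in zip(frame_a, frame_b))


def mean_frame_mae(frames_a: list, frames_b: list) -> float:
    """Average of the per-frame MAE over two equally long frame sequences."""
    if len(frames_a) != len(frames_b):
        raise ValueError("videos must have the same number of frames")
    return fmean(frame_mae(a, b) for a, b in zip(frames_a, frames_b))


if __name__ == "__main__":
    # Two tiny synthetic "videos" with two 3-value frames each.
    reference = [bytes([0, 10, 20]), bytes([255, 255, 255])]
    candidate = [bytes([2, 10, 19]), bytes([255, 250, 255])]
    print(mean_frame_mae(reference, candidate))
```

Byte-identical videos give exactly 0.0 under this definition, which matches the source versus prepared BF16 comparison.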

Storage is the Hugging Face repository total. Logical Model is the loaded Wan transformer plus VAE tensor footprint measured from MLX arrays. Full-Process Physical Peak is Darwin phys_footprint sampled from model initialization through MP4 save and health validation.


Usage

python -m pip install -U mlx-gen

mlxgen download --model AbstractFramework/wan2.2-ti2v-5b-diffusers-8bit

mlxgen generate \
  --model AbstractFramework/wan2.2-ti2v-5b-diffusers-8bit \
  --prompt "A short cinematic video of a glowing orange glass sphere floating above calm teal water, soft reflections, gentle camera movement" \
  --negative-prompt "" \
  --width 1280 \
  --height 704 \
  --frames 17 \
  --steps 20 \
  --guidance 5 \
  --fps 24 \
  --seed 321 \
  --output video.mp4
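When generation is driven from a script, the same command can be assembled programmatically. The sketch below only builds the argument list from the flags shown above and passes it to `subprocess`; the function names and the settings dictionary are illustrative and are not part of mlx-gen.

```python
import subprocess

MODEL = "AbstractFramework/wan2.2-ti2v-5b-diffusers-8bit"

# Settings from the validation profile above; keys map directly to CLI flags.
VALIDATION_PROFILE = {
    "width": 1280,
    "height": 704,
    "frames": 17,
    "steps": 20,
    "guidance": 5,
    "fps": 24,
    "seed": 321,
}


def build_generate_command(prompt: str, output: str, negative_prompt: str = "",
                           settings: dict = VALIDATION_PROFILE) -> list:
    """Build the argv list for `mlxgen generate`, using only flags shown above."""
    cmd = ["mlxgen", "generate", "--model", MODEL,
           "--prompt", prompt,
           "--negative-prompt", negative_prompt]  # explicit empty string, as in the profile
    for key, value in settings.items():
        cmd += [f"--{key}", str(value)]
    cmd += ["--output", output]
    return cmd


def run_generate(prompt: str, output: str) -> None:
    """Run the CLI; requires mlx-gen to be installed on Apple Silicon."""
    subprocess.run(build_generate_command(prompt, output), check=True)


def clip_seconds(settings: dict = VALIDATION_PROFILE) -> float:
    """Clip duration implied by the frame count and frame rate."""
    return settings["frames"] / settings["fps"]


if __name__ == "__main__":
    print(" ".join(build_generate_command("A glowing orange glass sphere", "video.mp4")))
    print(f"{clip_seconds():.2f} s of video")
```

Passing the arguments as a list avoids shell quoting issues with long prompts. With the validation profile, 17 frames at 24 fps correspond to roughly 0.7 s of video.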

TI2V-5B also supports first-frame image-to-video in MLX-Gen when one input image is supplied.

Attribution

MLX-Gen is based on mflux by Filip Strand and the original mflux contributors.

Quantized and contributed by @lpalbou.
