LTX-2.3-22B — 24GB (MLX)

Mixed-precision quantized version of Lightricks/LTX-2.3 optimised by baa.ai using a proprietary Black Sheep AI method.

Per-tensor bit-width allocation via advanced sensitivity analysis. Runs natively on Apple Silicon via ltx-2-mlx.

Metrics

Metric	Value
Transformer size	19.8 GB
Total model size	~28 GB
Average bits (transformer)	7.9
Quantized layers	1,157 / 1,760

Bit distribution:

Bits	Layers
5	3
6	75
8	1,079

Requirements

Apple Silicon (M1 or later), macOS 13+
~26 GB unified memory
Python 3.10+

pip install mlx flask ltx-pipelines-mlx ltx-core-mlx

Usage — Web App

git clone https://huggingface.co/baa-ai/LTX-2.3-22B-RAM-24GB-MLX
cd LTX-2.3-22B-RAM-24GB-MLX
python webapp.py

# Side-by-side comparison with the 12GB version:
git clone https://huggingface.co/baa-ai/LTX-2.3-22B-RAM-12GB-MLX ../LTX-2.3-22B-RAM-12GB-MLX
python webapp.py --compare-dir ../LTX-2.3-22B-RAM-12GB-MLX

Open http://localhost:7860 — enter a prompt, set resolution and duration, and hit Generate. The live log streams generation progress; the video plays in the browser when done.

Usage — CLI

python generate.py \
    --model-dir /path/to/LTX-2.3-22B-RAM-24GB-MLX \
    --prompt "A serene mountain lake at sunrise, mist over calm water, pine trees reflected" \
    --height 480 --width 704 --num-frames 65 \
    --output output.mp4

Frame count must follow 32k + 1 (33, 65, 97, 129 …). Duration in seconds = frames ÷ fps.

Model Comparison

Model	Transformer	Avg bpw	Peak RAM
LTX-2.3-22B-RAM-12GB-MLX	9.3 GB	4.6	~14 GB
LTX-2.3-22B-RAM-24GB-MLX	19.8 GB	7.9	~26 GB
dgrauet/ltx-2.3-mlx-q8	19.2 GB	8.0 uniform	~26 GB

Quantized by baa.ai

Black Sheep AI Products

Shepherd — Private AI deployment platform that shrinks frontier models by 50-60% through RAM compression, enabling enterprises to run sophisticated AI on single GPU instances or Apple Silicon hardware. Deploy in your VPC with zero data leaving your infrastructure. Includes CI/CD pipeline integration, fleet deployment across Apple Silicon clusters, air-gapped and sovereign deployment support, and multi-format export (MLX, GGUF). Annual cloud costs from ~$2,700 — or run on a Mac Studio for electricity only.

Watchman — Capability audit and governance platform for compressed AI models. Know exactly what your quantized model can do before it goes live. Watchman predicts which capabilities survive compression in minutes — replacing weeks of benchmarking. Includes compliance-ready reporting for regulated industries, quality valley warnings for counterproductive memory allocations, instant regression diagnosis tracing issues to specific tensors, and 22 adversarial security probes scanning for injection, leakage, hallucination, and code vulnerabilities.

Learn more at baa.ai — Sovereign AI.