Mixed-precision MLX quantizations of LTX-2.3 for Apple Silicon using the RAM MCKP allocator. 12 GB and 24 GB variants.
AI & ML interests: Model Quantization
Mixed-precision GGUF quantizations of moonshotai/Kimi-K2.6 from the RAM pipeline (per-tensor bit allocation via sensitivity probing).
Mixed-precision MLX builds of deepseek-ai/DeepSeek-V3.2 for Apple Silicon.
RAM optimised Gemma 4 models by baa.ai
MINT & SWAN quantized versions of MiniMax-M2.5 (MLX & GGUF)
SWAN quantized versions of Llama 3.1 and 3.3 70B Instruct (MLX)
MINT & SWAN quantized versions of Qwen3 models (MLX)
MINT quantized versions of Qwen3.5-122B-A10B at multiple budget targets (MLX & GGUF)
Mixed-precision MLX builds of Qwen/Qwen3.6-27B at the predicted local and global operating points.
Mixed-precision MLX builds of Qwen/Qwen3.6-35B-A3B for Apple Silicon. Size points: 19 GB, 25 GB.
RAM quantized versions of MiniMaxAI/MiniMax-M2.7 for Apple Silicon. Size points: 91 GB, 100 GB, 111 GB, 116 GB, 120 GB.
- baa-ai/MiniMax-M2.7-RAM-100GB-MLX
  Text Generation • 229B • Updated • 611 • 5
- baa-ai/MiniMax-M2.7-RAM-120GB-MLX
  Text Generation • 229B • Updated • 304 • 3
- baa-ai/MiniMax-M2.7-RAM-116GB-MLX
  Text Generation • 229B • Updated • 129 • 2
- baa-ai/MiniMax-M2.7-RAM-111GB-MLX
  Text Generation • 229B • Updated • 80 • 1
MINT quantized Nemotron-3-Super-120B — hybrid Mamba-MoE-Attention (MLX & GGUF)
baa.ai quantized versions of GLM models
MINT & SWAN quantized versions of Llama 4 Scout and Maverick (MLX & GGUF)
Mixed-precision MLX builds of Qwen/Qwen3.5-35B-A3B for Apple Silicon, quantized by baa.ai. Size points: 12.5-21, 25, 29, 31 GB.
MINT & SWAN quantized versions of Qwen3.5-397B-A17B (MLX & GGUF)
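Several of the collections above mention per-tensor bit allocation and an MCKP allocator. The sketch below illustrates the general idea only, assuming MCKP stands for the multiple-choice knapsack problem: every tensor gets exactly one bit-width, and the total estimated error is minimised under a size budget. It is not the RAM pipeline itself. The function name `allocate_bits`, the tensor names, the size units, and the error scores are all hypothetical, and real sensitivity scores would come from probing the model.

```python
from typing import Dict, List, Tuple

# One candidate per bit-width: (bits, size_units, estimated_error).
# All numbers in this example are made up for illustration.
Option = Tuple[int, int, float]


def allocate_bits(
    tensors: Dict[str, List[Option]], budget: int
) -> Tuple[float, Dict[str, int]]:
    """Choose exactly one option per tensor, minimising total error
    while keeping the summed size within `budget` (integer units)."""
    inf = float("inf")
    # best_err[s]: lowest error reachable with total size exactly s.
    best_err = [0.0] + [inf] * budget
    best_pick: List[Dict[str, int] | None] = [{}] + [None] * budget

    for name, options in tensors.items():
        new_err = [inf] * (budget + 1)
        new_pick: List[Dict[str, int] | None] = [None] * (budget + 1)
        for used in range(budget + 1):
            if best_err[used] == inf:
                continue  # no valid assignment reaches this size
            for bits, size, err in options:
                total = used + size
                if total > budget:
                    continue
                candidate = best_err[used] + err
                if candidate < new_err[total]:
                    new_err[total] = candidate
                    new_pick[total] = {**best_pick[used], name: bits}
        best_err, best_pick = new_err, new_pick

    best_size = min(range(budget + 1), key=lambda s: best_err[s])
    if best_err[best_size] == inf:
        raise ValueError("budget too small for any assignment")
    return best_err[best_size], best_pick[best_size]


# Hypothetical tensors with 2-, 4-, and 8-bit candidates.
tensors = {
    "attn.q_proj": [(2, 2, 0.90), (4, 4, 0.20), (8, 8, 0.01)],
    "mlp.up_proj": [(2, 4, 0.30), (4, 8, 0.05), (8, 16, 0.00)],
    "embed": [(2, 1, 0.80), (4, 2, 0.40), (8, 4, 0.02)],
}
error, picks = allocate_bits(tensors, budget=16)
print(picks, round(error, 2))
```

The dynamic program is exact but scales with the budget times the number of candidates, so a practical allocator over thousands of tensors would likely use coarser size units or an approximate solver.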