AI & ML interests
Model Quantization
🐑 baa.ai
Smaller. Smarter. Sovereign.
Making frontier models run anywhere
We publish high-quality mixed-precision quantized models for Apple Silicon (MLX) and cross-platform use (GGUF). Our models use a proprietary optimisation method that delivers superior quality at your target memory budget.
🤗 Models
All models are published as MLX (Apple Silicon) and GGUF (cross-platform) formats.
| Model Family | Sizes Available | Format |
|---|---|---|
| Qwen3.5-397B | 220GB, 224GB | MLX, GGUF |
| Qwen3.5-122B | 52GB, 128GB, 154GB | MLX, GGUF |
| Qwen3.5-35B | 15–51GB (8 variants) | MLX, GGUF |
| Llama-4-Maverick (402B) | 407GB | MLX, GGUF |
| Llama-4-Scout (109B) | 117GB | MLX, GGUF |
| Llama-3.1/3.3-70B | 47GB | MLX |
| MiniMax-M2.5 (229B) | 179GB | MLX, GGUF |
| Nemotron-120B | — | MLX, GGUF |
| GLM-4.7-Flash | 16GB | MLX |
| Qwen3-30B/8B | 16GB, 6GB | MLX |
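The on-disk sizes above follow from simple arithmetic: a model stored at *b* bits per weight needs roughly `parameters × b / 8` bytes, plus a small overhead for tensors kept at higher precision. A minimal sketch of that estimate (the ~5% overhead figure and the example bit-width are our assumptions, not published numbers):

```python
def quantized_size_gb(params_billion: float, bits_per_weight: float,
                      overhead: float = 1.05) -> float:
    """Approximate on-disk size of a quantized model in GB.

    `overhead` accounts for embeddings, norms, and quantization
    scales/zero-points stored at higher precision (assumed ~5%).
    """
    return params_billion * bits_per_weight / 8 * overhead

# e.g. a 122B model at roughly 3.2 bits/weight lands near the 52GB variant
print(f"{quantized_size_gb(122, 3.2):.0f} GB")
```

The same formula, run in reverse, is how you pick a variant for a given machine: divide your free RAM (in GB, minus headroom for the KV cache and OS) by the parameter count in billions and multiply by 8 to get the bits per weight you can afford.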