Content

This model area links to models and tools around Mägic. The development milestones are called Skipper (T3) and Mate (M8).

The Mägic project is a Proto Open Source project (OpenSoars) that does NOT publish its conversion code but applies the benefits preferably to OSI models and some select Open Weights models, depending on community feedback. The goal is to strengthen the True Open Source model family while giving everyone the choice to run efficient, high-quality inference on-device.

Converted GGUF models will be provided for:

  • all OSI compliant language models (Olmo, Apertus, Smol, ...)
  • select Open Weights language models (Mistral, Granite, ...)

Inference software based on llama.cpp will be provided as open source under the Apache 2.0 license.

  • initial versions target 2bpw at fp16 quality with a 4x speedup: a single RTX 4090 will be able to serve a 70b model as fast as an H200 does today.
  • T3 experiments showed that 1.4bpw and a 10x speedup are possible. That is Mägic.
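
The memory side of the 70b claim can be checked with simple arithmetic. The sketch below is only an illustration: it counts weight storage alone (no KV cache, activations, or file metadata), assumes 1 GB = 1e9 bytes and exactly 70e9 parameters, and uses a hypothetical helper name `weight_footprint_gb`. It says nothing about the speedup figures, which depend on the inference kernels.

```python
def weight_footprint_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes), weights only."""
    return n_params * bits_per_weight / 8 / 1e9


PARAMS_70B = 70e9        # assumption: "70b" taken as exactly 70e9 parameters
RTX_4090_VRAM_GB = 24    # VRAM of a single RTX 4090

# Compare fp16 (16 bpw) with the 2bpw and 1.4bpw targets named above.
for bpw in (16, 2, 1.4):
    size = weight_footprint_gb(PARAMS_70B, bpw)
    fits = size < RTX_4090_VRAM_GB
    print(f"{bpw:>4} bpw: {size:7.2f} GB, fits in {RTX_4090_VRAM_GB} GB: {fits}")
```

Under these assumptions the weights take 140 GB at fp16, 17.5 GB at 2bpw, and 12.25 GB at 1.4bpw, so only the compressed variants leave headroom on a 24 GB card.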

Mägic Technology

Mägic splits the work into two parts:

  • secret code for lossless model conversion
    • The secret transformation is based on prior work I did from 2001 to 2005 on video codec quantization. Back then I experimented with higher-dimensional transformations (JAVC) as well as adaptive resolution (HeiDi). Those transformations and ideas fit very well into the age of neural network quantization.
  • public code for efficient model inference (Apache 2.0)
    • binaries and source code
    • future extensions will debut as Mägic+ and trickle down into Mägic over time

Mägic+

Mägic+ will be the paid tier and will offer improved compression, performance, and model support.

Demo Spaces

  • Regular compression
  • T3 OSI compression
    • TOM@zero: demo of next-generation 2bpw compression (Skipper aka T3) with high-quality open source models (OSI)
      • Olmo 3.1 32b
  • T3 Open Weights compression
    • Granite4extreme: Granite 4 small hybrid 32b compressed to below 9GB at fp16 quality
  • Mägic compression
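
The Granite figure above can be turned into an effective bits-per-weight number. This is a rough sketch under stated assumptions: "32b" is taken as exactly 32e9 parameters, 1 GB = 1e9 bytes, the 9 GB bound is treated as the whole file size, and `effective_bpw` is a hypothetical helper name.

```python
def effective_bpw(file_size_gb: float, n_params: float) -> float:
    """Average bits stored per parameter for a file of the given size."""
    return file_size_gb * 1e9 * 8 / n_params


PARAMS_32B = 32e9                   # assumption: "32b" as exactly 32e9 parameters
FP16_SIZE_GB = PARAMS_32B * 2 / 1e9  # 2 bytes per weight at fp16

upper_bound_bpw = effective_bpw(9, PARAMS_32B)
print(f"fp16 size: {FP16_SIZE_GB:.0f} GB")
print(f"below 9 GB means below {upper_bound_bpw} bpw on average")
print(f"size reduction vs fp16: more than {FP16_SIZE_GB / 9:.1f}x")
```

With these assumptions, a file below 9 GB corresponds to fewer than 2.25 bits per weight on average, compared with 64 GB at fp16.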

SmartQuant

SmartQuant is not Mägic. It provides a baseline with improved default compression.
