granite-4.1-8b-mxfp8-mlx

Brainwaves

Benchmark scores (accuracy, higher is better): arc = ARC-Challenge, arc/e = ARC-Easy, boolq = BoolQ, hswag = HellaSwag, obkqa = OpenBookQA, piqa = PIQA, wino = WinoGrande.

         arc    arc/e  boolq  hswag  obkqa  piqa   wino
mxfp8    0.486  0.666  0.875  0.636  0.450  0.766  0.631

Other models in this size class, for comparison

gemma-4-E4B-it

         arc    arc/e  boolq  hswag  obkqa  piqa   wino
bf16     0.490  0.674  0.793  0.612  0.416  0.756  0.669
mxfp8    0.480  0.656  0.797  0.608  0.400  0.755  0.665
mxfp4    0.455  0.607  0.851  0.585  0.402  0.744  0.651

Qwen3.5-9B

         arc    arc/e  boolq  hswag  obkqa  piqa   wino
mxfp8    0.417  0.458  0.623  0.634  0.338  0.737  0.639
mxfp4    0.419  0.472  0.622  0.634  0.352  0.739  0.644
q8-hi    0.413  0.455  0.622  0.642  0.346  0.746  0.654
q8       0.418  0.455  0.622  0.643  0.342  0.748  0.659

This model, granite-4.1-8b-mxfp8-mlx, was converted to MLX format from ibm-granite/granite-4.1-8b using mlx-lm version 0.31.3.
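For reference, a conversion along these lines can be reproduced with the mlx_lm.convert command line tool. The flags below are a hedged sketch rather than the exact invocation used for this repo: --q-bits 8 stands in for whichever option selects the MXFP8 format in your installed mlx-lm release (check mlx_lm.convert --help).

# Sketch of the conversion step (assumed quantization flags)
mlx_lm.convert \
    --hf-path ibm-granite/granite-4.1-8b \
    --mlx-path granite-4.1-8b-mxfp8-mlx \
    -q --q-bits 8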

Use with mlx

pip install mlx-lm

from mlx_lm import load, generate

# Load the quantized model and tokenizer from the Hub (or a local path)
model, tokenizer = load("nightmedia/granite-4.1-8b-mxfp8-mlx")

prompt = "hello"

# Apply the model's chat template when one is defined
if tokenizer.chat_template is not None:
    messages = [{"role": "user", "content": prompt}]
    prompt = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_dict=False,
    )

response = generate(model, tokenizer, prompt=prompt, verbose=True)
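
The same generation can also be run from the shell with the mlx_lm.generate command installed alongside mlx-lm; the prompt here is only an example.

mlx_lm.generate --model nightmedia/granite-4.1-8b-mxfp8-mlx --prompt "hello"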