Mesh LLM

gemma-4-26B-A4B-it-UD-Q4_K_M

Distributed GGUF inference package for Mesh LLM

GGUF layer package for running gemma-4-26B-A4B-it-UD-Q4_K_M across a local Mesh LLM cluster.

This package is derived from unsloth/gemma-4-26B-A4B-it-GGUF. It splits the original GGUF file into per-layer artifacts so the layers can be distributed across machines for inference.

Highlights

| Run locally | Pool multiple machines | OpenAI-compatible | Package variant |
| --- | --- | --- | --- |
| Private inference on your hardware | Split layers across peers | Serve /v1/chat/completions locally | UD-Q4_K_M layer package |

Model Overview

| Property | Value |
| --- | --- |
| Source model | unsloth/gemma-4-26B-A4B-it-GGUF |
| Model id | unsloth/gemma-4-26B-A4B-it-GGUF:UD-Q4_K_M |
| Family | Gemma |
| Parameter scale | 26B-A4B |
| Quantization | UD-Q4_K_M |
| Layer count | 30 |
| Activation width | 2816 |
| Package size | 16.3 GB |
| Source file | gemma-4-26B-A4B-it-UD-Q4_K_M.gguf |
| Package repo | meshllm/gemma-4-26B-A4B-it-UD-Q4_K_M-layers |

Recommended Use

  • Local and private inference with Mesh LLM.
  • Multi-machine serving when the full GGUF is too large for one host.
  • OpenAI-compatible chat/completions workflows through Mesh LLM's local API.

For upstream architecture details, chat template guidance, sampling recommendations, license terms, and benchmark notes, see the source model card: unsloth/gemma-4-26B-A4B-it-GGUF.
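
To get a feel for multi-machine serving, the sketch below divides the 30 transformer layers into contiguous ranges, one per peer. This is only an illustration: the helper name `layer_range` is hypothetical, and the even split is an assumption, not a description of how Mesh LLM actually assigns layers.

```shell
# Hypothetical helper, not part of mesh-llm.
# Usage: layer_range TOTAL_LAYERS PEER_COUNT PEER_INDEX
# Prints "START END" (0-based, inclusive) for the given peer, assuming an
# even contiguous split where the first peers absorb any remainder.
layer_range() {
  total=$1
  peers=$2
  idx=$3
  base=$((total / peers))    # layers every peer gets
  extra=$((total % peers))   # peers that get one additional layer
  if [ "$idx" -lt "$extra" ]; then
    start=$((idx * (base + 1)))
    count=$((base + 1))
  else
    start=$((extra * (base + 1) + (idx - extra) * base))
    count=$base
  fi
  echo "$start $((start + count - 1))"
}
```

Under this assumption, four peers would hold layers 0-7, 8-15, 16-22, and 23-29. With 15.5 GB of layer artifacts over 30 layers, that averages roughly 0.5 GB per layer, so an eight-layer peer would need on the order of 4 GB for layer weights alone; individual layers may differ in size.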

Quickstart

# Run this on each machine that should contribute memory/compute.
mesh-llm serve --model "meshllm/gemma-4-26B-A4B-it-UD-Q4_K_M-layers" --split
# Check the mesh and discover the OpenAI-compatible model name.
curl -s http://localhost:3131/api/status
curl -s http://localhost:3131/v1/models
# Send an OpenAI-compatible chat request.
curl -s http://localhost:3131/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "unsloth/gemma-4-26B-A4B-it-GGUF:UD-Q4_K_M",
    "messages": [{"role": "user", "content": "Write a tiny hello-world function in Rust."}],
    "max_tokens": 128
  }'
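
If you send several prompts, the chat request above can be wrapped in small shell functions. This is a minimal sketch: the function names and the `MESH_URL` variable are made up for illustration, and the endpoint and port are the ones used in the Quickstart.

```shell
# Assumed default; matches the port used in the Quickstart.
MESH_URL="${MESH_URL:-http://localhost:3131}"

# Escape backslashes and double quotes so a string can sit inside a JSON
# string literal. Newlines and control characters are NOT handled.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build the request body. Usage: chat_payload MODEL PROMPT [MAX_TOKENS]
chat_payload() {
  printf '{"model":"%s","messages":[{"role":"user","content":"%s"}],"max_tokens":%d}' \
    "$(json_escape "$1")" "$(json_escape "$2")" "${3:-128}"
}

# POST the body to the local chat completions endpoint.
# Usage: chat MODEL PROMPT [MAX_TOKENS]
chat() {
  curl -s "$MESH_URL/v1/chat/completions" \
    -H "Content-Type: application/json" \
    -d "$(chat_payload "$@")"
}
```

For example, `chat "unsloth/gemma-4-26B-A4B-it-GGUF:UD-Q4_K_M" "Write a tiny hello-world function in Rust." 128` sends the same request as the Quickstart. For prompts containing newlines, a real JSON encoder would be a safer choice than the `sed` escaping used here.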

Package Variant

| Property | Value |
| --- | --- |
| Format | layer-package |
| Canonical source ref | unsloth/gemma-4-26B-A4B-it-GGUF@main/gemma-4-26B-A4B-it-UD-Q4_K_M.gguf |
| Source revision | main |
| Source SHA-256 | 34c746b1d50ab813e29cd46c4796e3f43c741901a582f93a67b55b9fc9687b35 |
| Skippy ABI | 0.1.22 |
| Package manifest SHA-256 | 23c1e32d371c3f50aa393e046894d1f893f22d99f0ccee91552be17cf9c88b65 |

What Is Included

| Artifact | Path | Contents | SHA-256 |
| --- | --- | --- | --- |
| Manifest | model-package.json | Package schema, source identity, checksums | 23c1e32d371c3f50aa393e046894d1f893f22d99f0ccee91552be17cf9c88b65 |
| Metadata | shared/metadata.gguf | 1 tensor, 15.1 MB | 4ea990cfa36597971790aa1dcc4658cd8c946685b3d16dbaac4f685efb382002 |
| Embeddings | shared/embeddings.gguf | 2 tensors, 763.1 MB | 6eb39617e6999290dd4c338cdcadf98ea42f948812fb4aeb0eab50735c7e0dd3 |
| Output head | shared/output.gguf | 2 tensors, 15.1 MB | 88f7262cffae57f67ffb7b8db8fa9c04439d43a3af7d53fa3df81f0c6511e522 |
| Transformer layers | layers/layer-*.gguf | 30 layer artifacts, 685 tensors, 15.5 GB | see model-package.json |
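
The published checksums can be compared against a downloaded copy with standard tools. The sketch below assumes `$PACKAGE_DIR` points at a local copy of the package. The function names are illustrative, and this is a plain digest comparison, not a replacement for the package validator.

```shell
# Print the SHA-256 hex digest of a file, using whichever tool is installed
# (sha256sum on most Linux systems, shasum on macOS).
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | awk '{print $1}'
  else
    shasum -a 256 "$1" | awk '{print $1}'
  fi
}

# Return 0 if the file's digest matches the expected value, non-zero otherwise.
# Usage: verify_sha256 FILE EXPECTED_HEX
verify_sha256() {
  actual=$(sha256_of "$1")
  if [ "$actual" = "$2" ]; then
    echo "OK   $1"
  else
    echo "FAIL $1 (got: $actual)" >&2
    return 1
  fi
}

# Example calls (uncomment once PACKAGE_DIR is set to your local copy):
# verify_sha256 "$PACKAGE_DIR/model-package.json" \
#   23c1e32d371c3f50aa393e046894d1f893f22d99f0ccee91552be17cf9c88b65
# verify_sha256 "$PACKAGE_DIR/shared/metadata.gguf" \
#   4ea990cfa36597971790aa1dcc4658cd8c946685b3d16dbaac4f685efb382002
```

Per the table above, checksums for the individual layer files are recorded in model-package.json.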

Validation

Generated by the Mesh LLM HF Jobs splitter from mesh-llm ref main and validated before upload:

skippy-model-package validate-package "/source/gemma-4-26B-A4B-it-UD-Q4_K_M.gguf" "$PACKAGE_DIR"
