Theprint MoE 10B 0126

theprint-10B-MoE-A3B (GGUF)

A Mixture of Experts model built on Llama 3.2 3B, combining four specialized fine-tunes with a general-purpose base model. The combined experts total roughly 10B parameters, with about 3B active per token (the "A3B" in the name).

Architecture

  • Base model: theprint/GeneralChat-Llama3.2-3B
  • Gate mode: Hidden
  • Dtype: bfloat16
  • Experts: 4

Experts

  • LLM-Data-Science-Llama3.2-3B: Machine learning, neural networks, fine-tuning, pre-training
  • CreativeWriter-Llama3.2-3B: Fiction writing, story structure, scene development, plot analysis
  • Llama-3.2-3B-VanRossum: Python programming, debugging, algorithm implementation
  • CogBeTh-Llama3.2-3B: Mental health support, anxiety, stress management, self-care
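
A merge like this can be expressed as a mergekit-moe configuration along these lines. This is a sketch, not the author's published config: the positive prompts are illustrative, and the theprint/ prefixes on the expert repos are assumed.

```yaml
base_model: theprint/GeneralChat-Llama3.2-3B
gate_mode: hidden        # route on hidden-state similarity to the positive prompts
dtype: bfloat16
experts:
  - source_model: theprint/LLM-Data-Science-Llama3.2-3B   # assumed repo path
    positive_prompts:
      - "fine-tuning a neural network"
  - source_model: theprint/CreativeWriter-Llama3.2-3B
    positive_prompts:
      - "write a short story"
  - source_model: theprint/Llama-3.2-3B-VanRossum
    positive_prompts:
      - "debug this Python function"
  - source_model: theprint/CogBeTh-Llama3.2-3B
    positive_prompts:
      - "managing stress and anxiety"
```

A config like this is built with mergekit-moe, e.g. `mergekit-moe config.yaml ./output-model`.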

How It Works

The model uses a hidden gate mechanism to route each token to the most relevant expert(s) based on the hidden-state representation of the input. Each expert was fine-tuned for its domain before being merged into this MoE architecture using mergekit.
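
The gating step above can be sketched in a few lines. This is an illustrative toy, not mergekit's or llama.cpp's actual implementation: each expert gets a score from a dot product between the token's hidden state and a learned gate vector, the scores are softmaxed, and the top-k experts are kept with renormalized weights.

```python
import math

EXPERTS = ["LLM-Data-Science", "CreativeWriter", "VanRossum", "CogBeTh"]

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def route(hidden, gate_weights, top_k=2):
    """Score each expert from a hidden-state vector and keep the top_k.

    hidden: the token's hidden state (list of floats)
    gate_weights: one learned weight vector per expert; score = dot(hidden, w)
    Returns (expert_index, renormalized_weight) pairs, weights summing to 1.
    """
    scores = [sum(h * w for h, w in zip(hidden, wv)) for wv in gate_weights]
    probs = softmax(scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]
```

A prompt whose hidden state aligns with one expert's gate vector sends most of its weight to that expert; the experts' outputs are then combined using these weights.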

Usage

Compatible with any GGUF-capable inference setup that supports Llama 3.2 (llama.cpp and its derivatives). No special configuration is required; expert routing happens automatically.
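
A minimal sketch of running the model with llama-cpp-python. The GGUF filename below is hypothetical (use whichever quantization you downloaded), and the prompt-formatting helper is shown only for clarity: most runtimes apply the GGUF's embedded chat template automatically.

```python
def format_llama3_chat(system: str, user: str) -> str:
    """Build a Llama 3.2 chat prompt by hand, following the Llama 3
    instruct template (header tokens around each role, <|eot_id|> turns)."""
    return (
        "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
    )

if __name__ == "__main__":
    # Requires: pip install llama-cpp-python, plus a downloaded GGUF file.
    # The path below is a placeholder for whichever quant you fetched.
    from llama_cpp import Llama

    llm = Llama(model_path="theprint-10B-MoE-A3B-0126.Q4_K_M.gguf", n_ctx=4096)
    prompt = format_llama3_chat(
        "You are a helpful assistant.",
        "Explain what a Mixture of Experts model is.",
    )
    out = llm(prompt, max_tokens=256, stop=["<|eot_id|>"])
    print(out["choices"][0]["text"])
```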

Model Details

  • Repository: theprint/theprint-10B-MoE-A3B-0126-GGUF
  • Format: GGUF
  • Model size: 10B params
  • Architecture: llama
  • Quantizations available: 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit, 16-bit
