Use with the llama-cpp-python library
# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="OpenYourMind/OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-GGUF",
	filename="OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-I-Quality.gguf",  # pick any tier from the Files list below
)
llm.create_chat_completion(
	messages = [
		{
			"role": "user",
			"content": [
				{
					"type": "text",
					"text": "Describe this image in one sentence."
				},
				{
					"type": "image_url",
					"image_url": {
						"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
					}
				}
			]
		}
	]
)
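The create_chat_completion call returns an OpenAI-style completion dict. A minimal sketch of pulling out the reply text, using a placeholder response of the same shape (the content string and token counts are invented for illustration; actual image input also requires a llama-cpp-python build with multimodal support):

```python
# Placeholder response mirroring the OpenAI-style dict that
# create_chat_completion returns (values are illustrative, not real output).
sample_response = {
    "id": "chatcmpl-123",
    "object": "chat.completion",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "A copper statue stands on an island in New York Bay.",
            },
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 1024, "completion_tokens": 14, "total_tokens": 1038},
}

# Extract the assistant's reply text the same way for a real response:
reply = sample_response["choices"][0]["message"]["content"]
print(reply)
```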

OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-GGUF

I do this work independently and release it for free. Donations are welcome and go toward compute for more and larger abliterations.

Bitcoin: bc1qsvfduzj9fjs9fugpc52yver3f2g8fp7xjxecdv

Community discussion: https://discord.gg/rhUZY5GEZr

Overview

This repository contains APEX GGUF quantizations of OpenYourMind/OpenYourMind-Qwen3.6-35B-A3B-kuato-DPO-abliterated-uncensored.

The source model is an abliterated and DPO-retrained version of Qwen/Qwen3.6-35B-A3B. After ablation and DPO, the original Qwen3.6 vision layers were re-added to retain multimodal functionality. These GGUF files preserve that vision support through the included multimodal projector.

Five APEX tiers are included:

  • APEX Quality: baseline tensor policy, quantized without an imatrix.
  • APEX I-Quality: same tensor policy as Quality, quantized with a diverse imatrix calibration set.
  • APEX I-Balanced: two-tier Q6_K/Q5_K expert gradient with imatrix.
  • APEX I-Compact: Q4_K/Q3_K compact expert layout with imatrix.
  • APEX Mini: smallest included APEX tier, using IQ2_S middle experts with imatrix.

The filenames follow the upstream APEX naming style used by mudler/apex-quant.
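For a rough sense of the size/quality tradeoff between tiers, file size scales with the average bits-per-weight of the quant mix. A back-of-envelope sketch, where the bpw figures are approximate k-quant averages assumed for illustration, not exact values for these APEX mixes:

```python
# Back-of-envelope file sizes for a 35B-parameter model at a given average
# bits-per-weight (bpw). The bpw figures are rough assumptions for
# illustration, not measured values for these specific APEX tiers.
PARAMS = 35e9  # total parameter count

def approx_size_gib(bpw: float) -> float:
    """Approximate GGUF size in GiB at the given average bits per weight."""
    return PARAMS * bpw / 8 / 2**30

for tier, bpw in [
    ("Quality / I-Quality (Q6_K-heavy)", 6.6),
    ("I-Balanced (Q6_K/Q5_K)", 5.6),
    ("I-Compact (Q4_K/Q3_K)", 4.0),
    ("Mini (IQ2_S-heavy)", 2.8),
]:
    print(f"{tier}: ~{approx_size_gib(bpw):.1f} GiB")
```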

Model Details

| Attribute | Value |
| --- | --- |
| Base model | OpenYourMind/OpenYourMind-Qwen3.6-35B-A3B-kuato-DPO-abliterated-uncensored |
| Original base | Qwen/Qwen3.6-35B-A3B |
| Method | Refusal ablation plus DPO retraining, then GGUF quantization |
| Quantization | APEX Quality, APEX I-Quality, APEX I-Balanced, APEX I-Compact, APEX Mini |
| Format | GGUF |
| Runtime | llama.cpp |
| Architecture | Qwen3.6 MoE vision-language model |
| Vision support | Yes, through the included BF16 multimodal projector |
| Projector | mmproj-OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-BF16.gguf |

Files

| File | Description |
| --- | --- |
| OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-I-Quality.gguf | APEX I-Quality GGUF quant with imatrix metadata |
| OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-Quality.gguf | APEX Quality GGUF quant without imatrix |
| OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-I-Balanced.gguf | APEX I-Balanced GGUF quant with imatrix metadata |
| OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-I-Compact.gguf | APEX I-Compact GGUF quant with imatrix metadata |
| OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-Mini.gguf | APEX Mini GGUF quant with imatrix metadata |
| mmproj-OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-BF16.gguf | BF16 vision projector required for multimodal/image inputs |
| OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-I-Quality.tensor-types.txt | Tensor-type policy used for the I-Quality quant |
| OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-Quality.tensor-types.txt | Tensor-type policy used for the Quality quant |
| OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-I-Balanced.tensor-types.txt | Tensor-type policy used for the I-Balanced quant |
| OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-I-Compact.tensor-types.txt | Tensor-type policy used for the I-Compact quant |
| OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-Mini.tensor-types.txt | Tensor-type policy used for the Mini quant |

The FP16 safetensors source model is published separately:

https://huggingface.co/OpenYourMind/OpenYourMind-Qwen3.6-35B-A3B-kuato-DPO-abliterated-uncensored

The earlier non-APEX GGUF release is published separately:

https://huggingface.co/OpenYourMind/OpenYourMind-Qwen3.6-35B-A3B-kuato-DPO-abliterated-uncensored-GGUF

llama.cpp

Download the I-Quality quant:

huggingface-cli download OpenYourMind/OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-GGUF \
  OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-I-Quality.gguf \
  mmproj-OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-BF16.gguf \
  --local-dir ./model

Swap the GGUF filename for APEX-I-Balanced, APEX-I-Compact, APEX-Mini, or APEX-Quality if you want a different size/quality tradeoff. The same mmproj file is used for all tiers.

Run server mode:

llama-server \
  -m ./model/OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-I-Quality.gguf \
  --mmproj ./model/mmproj-OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-BF16.gguf \
  --host 0.0.0.0 \
  --port 8080 \
  -ngl 99 \
  --jinja
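Once running, llama-server exposes an OpenAI-compatible /v1/chat/completions endpoint. A minimal sketch of building a text-plus-image request body in Python (the temperature value and curl invocation in the comment are illustrative):

```python
import json

# Request body for llama-server's OpenAI-compatible /v1/chat/completions
# endpoint. POST it with any HTTP client, e.g.:
#   curl http://localhost:8080/v1/chat/completions \
#     -H "Content-Type: application/json" -d @body.json
payload = {
    "messages": [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in one sentence."},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
                    },
                },
            ],
        }
    ],
    "temperature": 0.7,
}

body = json.dumps(payload)
print(body[:72])
```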

Run interactive text chat:

llama-cli \
  -m ./model/OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-I-Quality.gguf \
  --conversation \
  -ngl 99

For image input, load the mmproj file together with the GGUF model.
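As an illustration, recent llama.cpp builds ship a multimodal CLI (llama-mtmd-cli) that takes the model, projector, and an image together; flag names can differ between versions, so check llama-mtmd-cli --help (the image path here is a placeholder):

```shell
llama-mtmd-cli \
  -m ./model/OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-APEX-I-Quality.gguf \
  --mmproj ./model/mmproj-OpenYourMind-Qwen3.6-35B-A3B-abliterated-uncensored-BF16.gguf \
  --image ./statue.jpg \
  -p "Describe this image in one sentence."
```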

Quantization Notes

All GGUFs use APEX tensor layouts adapted conservatively for this MoE vision architecture:

  • APEX Quality and APEX I-Quality: Q6_K edge routed experts, Q5_K near-edge routed experts, IQ4_XS middle routed experts, Q8_0 shared experts, Q6_K attention/SSM projections.
  • APEX I-Balanced: Q6_K edge routed experts, Q5_K middle routed experts, Q8_0 shared experts, Q6_K attention/SSM projections.
  • APEX I-Compact: Q4_K edge routed experts, Q3_K middle routed experts, Q6_K shared experts, Q4_K attention/SSM projections.
  • APEX Mini: Q3_K edge routed experts, IQ2_S middle routed experts, Q5_K/Q4_K shared experts, Q4_K/Q3_K attention/SSM projections.
  • Router, norms, embeddings, output, state tensors, conv tensors, and bias tensors: preserved as BF16/F32 where appropriate.
  • Vision projector: kept separate as BF16.

The I-Quality, I-Balanced, I-Compact, and Mini files were quantized with an imatrix generated from a Hugging Face-hosted calibration corpus containing chat, code, multilingual, terminal, and agentic examples.
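For reference, imatrix quants of this kind are typically produced with llama.cpp's own tools along these lines (a sketch, not the author's exact commands; the calibration.txt and F16 input paths are placeholders):

```shell
# 1. Generate the importance matrix from a calibration text file
llama-imatrix -m model-F16.gguf -f calibration.txt -o imatrix.dat

# 2. Quantize using the imatrix (quant type shown is an example)
llama-quantize --imatrix imatrix.dat model-F16.gguf model-Q4_K_M.gguf Q4_K_M
```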

Notes

Use is the responsibility of the user. Make sure your usage complies with applicable laws, platform rules, and deployment requirements.

Downloads last month: 3,853
Format: GGUF
Model size: 35B params
Architecture: qwen35moe