Logos Auditor β€” Gemma 2 9B (ARBITER)

The primary epistemological safety model from "The Instrument Trap: Why Identity-as-Authority Breaks AI Safety Systems" (DOI: 10.5281/zenodo.18716474).

This is the ARBITER β€” the 9B model that serves as the gold-standard epistemological firewall in the ALEPH architecture. It achieves 97.3% behavioral pass rate on the 300-case stratified benchmark.

Key Results

Metric Value 95% CI
Behavioral Pass 97.3% [94.8, 98.6]
External Fabrication 0.0% [0.00%, 0.03%]
Collapse Rate 0.7%* β€”
Attack Resistance (ADVERSARIAL) 98.7% β€”

*Both collapses are evaluator false positives (Unicode homoglyph detection).

Architecture

  • Base: google/gemma-2-9b-it (Google Gemma 2)
  • Fine-tuning: QLoRA (r=64, Ξ±=16)
  • Training data: 509 curated epistemological examples
  • Format: GGUF (quantized, ~6.2 GB)

What This Model Does

Logos is NOT a chatbot. It is a claim classifier β€” an epistemological firewall that determines whether an AI agent should act on a given claim.

Classifications: ILLICIT_GAP, LICIT_GAP, CORRECTION, MYSTERY, BAPTISM_PROTOCOL, ADVERSARIAL, KENOTIC_LIMITATION

Usage (Ollama)

# Download the GGUF file, then:
cat > Modelfile.logos-9b << 'EOF'
FROM logos-auditor-gemma2-9b.gguf
TEMPLATE "<start_of_turn>user
{{ .Prompt }}<end_of_turn>
<start_of_turn>model
"
PARAMETER stop "<start_of_turn>user"
PARAMETER stop "<end_of_turn>"
PARAMETER temperature 0.1
PARAMETER num_ctx 4096
PARAMETER num_predict 512
PARAMETER repeat_penalty 1.5
EOF

ollama create logos-auditor-9b -f Modelfile.logos-9b
ollama run logos-auditor-9b "Homeopathy can cure cancer"

Related Models

Paper

Rodriguez, R. (2026). "The Instrument Trap: Why Identity-as-Authority Breaks AI Safety Systems." Zenodo. DOI: 10.5281/zenodo.18716474

Benchmark

LumenSyntax/instrument-trap-benchmark β€” 14,950 test cases

Downloads last month
243
GGUF
Model size
9B params
Architecture
gemma2
Hardware compatibility
Log In to add your hardware

We're not able to determine the quantization variants.

Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Model tree for LumenSyntax/logos-auditor-gemma2-9b

Base model

google/gemma-2-9b
Quantized
(154)
this model

Dataset used to train LumenSyntax/logos-auditor-gemma2-9b