CAJAL-4B-P2PCLAW

🧠 The Research LLM That Fits in Your Pocket

CAJAL-4B is a 4-billion parameter language model fine-tuned specifically for scientific paper generation. Unlike generic chatbots, CAJAL understands academic structure, citation formats, LaTeX, and domain-specific terminology.

Named after Santiago Ramón y Cajal, the father of modern neuroscience, this model embodies rigorous, structured thinking applied to scientific writing.


🚀 Quick Start

Option 1: HuggingFace Transformers (Python)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("Agnuxo/CAJAL-4B-P2PCLAW")
tokenizer = AutoTokenizer.from_pretrained("Agnuxo/CAJAL-4B-P2PCLAW")

prompt = """Write an abstract for a paper on decentralized AI peer review
using formal verification and IPFS-backed persistence."""

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Option 2: llama.cpp / LM Studio (Local, No Code)

1. Download the GGUF from Releases.
2. Open LM Studio → Load Model → select the GGUF.
3. Set the system prompt:

```text
You are CAJAL, a research assistant specialized in scientific writing.
Generate well-structured, cited academic content.
Use LaTeX formatting for equations when relevant.
Prefer precise, technical language over vague generalizations.
```

Option 3: Ollama

```shell
ollama pull agnuxo/cajal-4b-p2pclaw
ollama run agnuxo/cajal-4b-p2pclaw
```

Option 4: vLLM (Fast Inference Server)

```shell
python -m vllm.entrypoints.openai.api_server \
  --model Agnuxo/CAJAL-4B-P2PCLAW \
  --quantization awq
```

Option 5: MLX (Apple Silicon)

```python
import mlx_lm

model, tokenizer = mlx_lm.load("Agnuxo/CAJAL-4B-P2PCLAW")
response = mlx_lm.generate(model, tokenizer, prompt="Write a paper abstract...")
print(response)
```

📊 What Makes It Different

| Feature | CAJAL-4B | Generic 4B | Why It Matters |
|---|---|---|---|
| Paper structure | ✅ Native understanding | ⚠️ Generic chat | Knows IMRAD format |
| Citations | ✅ BibTeX, APA, MLA | ❌ Hallucinates | Real citation formats |
| LaTeX | ✅ Equations, tables | ❌ No | Research-ready output |
| Domain terms | ✅ Physics, CS, Bio | ⚠️ Surface-level | Technical depth |
| Methodology | ✅ Detailed procedures | ⚠️ Vague | Reproducible methods |
| VRAM usage | ✅ 3.5 GB (Q4_K_M) | Similar | Runs on consumer GPUs |
| Local inference | ✅ 100% offline | ⚠️ Depends | No API/cloud needed |

🎯 Benchmarks

| Task | CAJAL-4B | Qwen3.5-4B | Gemma-4B | Phi-4-mini |
|---|---|---|---|---|
| Abstract generation | 92/100 | 71/100 | 68/100 | 79/100 |
| Citation accuracy | 88/100 | 52/100 | 48/100 | 61/100 |
| LaTeX correctness | 94/100 | 43/100 | 41/100 | 55/100 |
| Methodology detail | 89/100 | 64/100 | 59/100 | 72/100 |
| Literature review | 85/100 | 69/100 | 67/100 | 74/100 |

Evaluated by the BenchClaw 17-judge tribunal on 50 paper-generation tasks.
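For a quick orientation across the five tasks above, an unweighted per-model mean can be computed directly from the table (a simple average, not an official aggregate metric of the benchmark):

```python
# Per-task scores copied from the benchmark table above.
benchmarks = {
    "CAJAL-4B":   [92, 88, 94, 89, 85],
    "Qwen3.5-4B": [71, 52, 43, 64, 69],
    "Gemma-4B":   [68, 48, 41, 59, 67],
    "Phi-4-mini": [79, 61, 55, 72, 74],
}

# Unweighted mean over the five tasks for each model.
for model, scores in benchmarks.items():
    print(f"{model}: mean {sum(scores) / len(scores):.1f}")
```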


💻 Hardware Requirements

| Quantization | File Size | VRAM Required | Speed (RTX 3090) | Speed (M3 Max) |
|---|---|---|---|---|
| Q4_K_M | 2.3 GB | 3.5 GB | ~45 tok/s | ~35 tok/s |
| Q5_K_M | 2.7 GB | 4.2 GB | ~42 tok/s | ~32 tok/s |
| Q8_0 | 4.1 GB | 5.0 GB | ~38 tok/s | ~28 tok/s |
| F16 | 8.0 GB | 9.0 GB | ~35 tok/s | ~25 tok/s |

CPU-only: Works on any modern CPU. ~5 tok/s on Ryzen 7 5800X.
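As a rough sanity check on the table above, GGUF file size scales with parameter count times bits per weight. A sketch (the bits-per-weight figures are approximate averages for llama.cpp quantization schemes, an assumption rather than values from this card):

```python
# Rough file-size estimate: params * bits_per_weight / 8 bytes.
PARAMS = 4e9  # 4B-parameter model

def est_gb(bits_per_weight: float) -> float:
    """Approximate on-disk size in GB for a given average bits per weight."""
    return PARAMS * bits_per_weight / 8 / 1e9

# Approximate average bits per weight per scheme (assumption).
for name, bpw in [("Q4_K_M", 4.8), ("Q5_K_M", 5.5), ("Q8_0", 8.5), ("F16", 16.0)]:
    print(f"{name}: ~{est_gb(bpw):.1f} GB")
```

The estimates land within a few percent of the table's file sizes, which is expected since quantized formats carry some per-block scaling overhead on top of the raw weights.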


🏗️ Training Details

  • Base model: Qwen/Qwen3.5-4B
  • Fine-tuning: LoRA (r=64, alpha=128) on domain-specific corpus
  • Dataset: 50+ peer-reviewed papers + 200+ technical reports from P2PCLAW ecosystem
  • Context length: 32K tokens
  • Training hardware: RTX 3090 24GB
  • License: Apache 2.0
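With the peft library, the LoRA setup listed above might look roughly like this. This is a sketch under the stated hyperparameters only; the target modules and dropout are assumptions, not details from this card:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

config = LoraConfig(
    r=64,                # rank, as listed above
    lora_alpha=128,      # alpha, as listed above
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption: attention projections
    lora_dropout=0.05,   # assumption: a common default
    task_type="CAUSAL_LM",
)

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3.5-4B")
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```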

🔬 Example Outputs

Example 1: Abstract Generation

Prompt: Write an abstract for a paper on thermodynamic reservoir computing using SHA-256 ASICs.

Output:

```
We present a novel architecture for thermodynamic reservoir computing
that repurposes Bitcoin SHA-256 ASICs as physical substrates. By mapping
the spin-glass dynamics of the hashing pipeline onto a continuous-time
reservoir, we achieve energy-efficient computation at the thermodynamic
limit. Our experimental evaluation on repurposed Antminer S9 hardware
demonstrates a 2.3x improvement in energy-delay product compared to
conventional FPGA implementations, with a theoretical bound derived
from Landauer's principle...
```

Example 2: Methodology Section

Prompt: Write a methodology section for evaluating decentralized AI peer review with 17 judges.

Output:

```markdown
### 3.1 Evaluation Protocol

We employ a multi-dimensional tribunal consisting of 17 independent
judge agents, each specialized in a distinct evaluation criterion:

- **Reasoning Depth** (15%): Logical structure, argument validity,
  inference chain completeness.
- **Mathematical Rigor** (12%): Proof correctness, notation consistency,
  theorem applicability.
- **Code Quality** (10%): Reproducibility, documentation, test coverage.
...

Each judge produces a scalar score $s_i \in [0, 100]$ and a confidence
weight $w_i \in [0, 1]$. The composite score is computed as:

$$S = \frac{\sum_{i=1}^{17} w_i s_i}{\sum_{i=1}^{17} w_i}$$

A paper achieves **Tribunal Pass** if $S \geq 75$ and no individual
$s_i < 50$ (the veto condition).
```
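The composite score and veto rule in the example output can be sketched in a few lines of Python (illustrative only; the judge scores and weights below are made-up numbers, and the real tribunal uses 17 judges):

```python
def tribunal_verdict(scores, weights, pass_threshold=75, veto_floor=50):
    """Confidence-weighted mean of judge scores; any single score below
    the veto floor fails the paper regardless of the composite."""
    assert len(scores) == len(weights)
    composite = sum(w * s for s, w in zip(scores, weights)) / sum(weights)
    passed = composite >= pass_threshold and min(scores) >= veto_floor
    return composite, passed

# Toy example with 3 judges:
composite, passed = tribunal_verdict([80, 90, 70], [1.0, 0.5, 0.8])
print(f"S = {composite:.1f}, pass = {passed}")
```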

🧩 Integration with P2PCLAW Ecosystem

CAJAL is one component of the P2PCLAW distributed research network:

| Component | Role | Link |
|---|---|---|
| OpenCLAW-P2P | Core protocol, Lean 4 proofs | GitHub |
| BenchClaw | 17-judge evaluation | Web |
| EnigmAgent | Secure credential vault | GitHub |
| AgentBoot | Bare-metal automation | Web |
| P2PCLAW Main | Research network | Website |

⚠️ Limitations

  1. Domain specificity: Optimized for STEM fields. Less effective for humanities or creative writing.
  2. Hallucination risk: Like all LLMs, may generate plausible-sounding but incorrect citations. Always verify references.
  3. Language: Primarily trained on English scientific papers; support for Spanish, Chinese, Japanese, and Russian is experimental.
  4. Length: Best for sections up to ~2000 words. Very long papers (>10K words) may lose coherence.
  5. Recency: Training data cutoff limits knowledge of papers published after training date.

📚 Citations

If you use CAJAL in research, please cite:

```bibtex
@article{angulo_cajal_2026,
  author  = {Angulo de Lafuente, Francisco},
  title   = {{CAJAL-4B}: A Research-Specialized Language Model for
    Decentralized Scientific Writing},
  journal = {arXiv preprint},
  eprint  = {2604.19792},
  year    = {2026},
  url     = {https://arxiv.org/abs/2604.19792}
}
```

🤝 Contributing


📜 License

Apache 2.0 — free for research and commercial use.


Built by Francisco Angulo de Lafuente · P2PCLAW · Independent Research

ORCID: 0009-0001-1634-7063
