CaiTI GGUF Bundle (Base + Task Adapters)

This repository contains the GGUF deployment artifacts for CaiTI: a single GGUF base model plus task-specific GGUF LoRA adapters.

Base model

  • Llama-3.2-3B-Instruct-Q4_K_M.gguf
  • Intended runtime: llama.cpp / llama-cpp-python

Task adapters

Use one adapter per task:

  • adapters/task1_response_analyzer.gguf -> Response Analyzer (37-dimension + score)
  • adapters/task2_general_response.gguf -> General Response (Yes/No/Maybe/Question/Stop)
  • adapters/task3_rv_reasoner.gguf -> RV Reasoner (0/1)
  • adapters/task4_cbt_stage1.gguf -> CBT Stage 1
  • adapters/task4_cbt_stage2.gguf -> CBT Stage 2
  • adapters/task4_cbt_stage3.gguf -> CBT Stage 3

Runtime recommendation

  • Keep a single base GGUF model loaded.
  • Route each request to the adapter for its task.
  • Use deterministic decoding (for example, temperature=0.0) for classification/reasoner tasks.
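
The routing step can be sketched as a small lookup table. This is a minimal illustration, not part of the published bundle: the adapter paths come from the list above, but the task keys, the route helper, the choice of which tasks count as classification/reasoner, and the 0.7 fallback temperature are all assumptions. How the selected adapter is actually applied (reloading the model, or a runtime-specific adapter switch) depends on the runtime and is not shown here.

```python
# Adapter paths are taken from the list above; the task keys are hypothetical.
ADAPTERS = {
    "response_analyzer": "adapters/task1_response_analyzer.gguf",
    "general_response": "adapters/task2_general_response.gguf",
    "rv_reasoner": "adapters/task3_rv_reasoner.gguf",
    "cbt_stage1": "adapters/task4_cbt_stage1.gguf",
    "cbt_stage2": "adapters/task4_cbt_stage2.gguf",
    "cbt_stage3": "adapters/task4_cbt_stage3.gguf",
}

# Assumption: tasks 2 and 3 emit a fixed label set, so they are treated as
# classification/reasoner tasks and decoded deterministically.
DETERMINISTIC_TASKS = {"general_response", "rv_reasoner"}


def route(task):
    """Return the adapter path and decoding temperature for a task key."""
    if task not in ADAPTERS:
        raise KeyError(f"unknown task: {task!r}")
    # 0.7 is an illustrative default for free-text tasks, not a tuned value.
    temperature = 0.0 if task in DETERMINISTIC_TASKS else 0.7
    return {"lora_path": ADAPTERS[task], "temperature": temperature}


if __name__ == "__main__":
    print(route("rv_reasoner"))
```

The returned lora_path and temperature can then be passed to whatever loading and generation calls the runtime provides.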

Example (llama-cpp-python)

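from llama_cpp import Llama

# Load the base model together with the adapter for one task (here: RV Reasoner).
llm = Llama(
    model_path='Llama-3.2-3B-Instruct-Q4_K_M.gguf',
    lora_path='adapters/task3_rv_reasoner.gguf',
    n_ctx=2048,       # context window, in tokens
    n_gpu_layers=0,   # CPU only; raise this to offload layers to a GPU
)
# temperature=0.0 keeps decoding deterministic for the 0/1 decision.
out = llm('DECISION: 0/1 only\n...input...', max_tokens=8, temperature=0.0)
print(out['choices'][0]['text'])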
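
Because the classification adapters are expected to emit a label from a small fixed set, it can help to validate the generated text before acting on it. The helper below is a hedged sketch: parse_label and the two label tuples are hypothetical names, the label sets are copied from the adapter list above, and the assumption that the label appears as the first word of the output is not confirmed by this repository.

```python
import re

# Label sets copied from the adapter list; the tuple names are hypothetical.
RV_LABELS = ("0", "1")
GENERAL_LABELS = ("Yes", "No", "Maybe", "Question", "Stop")


def parse_label(text, allowed):
    """Return the allowed label matching the first word of text, or None."""
    # Take the first run of word characters, skipping leading whitespace.
    match = re.match(r"\s*(\w+)", text)
    if match is None:
        return None
    token = match.group(1).lower()
    # Compare case-insensitively but return the canonical spelling.
    for label in allowed:
        if label.lower() == token:
            return label
    return None


if __name__ == "__main__":
    print(parse_label(" 1\n", RV_LABELS))
    print(parse_label("maybe.", GENERAL_LABELS))
```

A None result signals that the output did not start with a recognized label, so the caller can retry or fall back rather than guess.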