
gemma-4-21b-reap-harness-ready

This is a fine-tuned version of 0xSero/gemma-4-21b-a4b-it-REAP, trained on Claude conversations that include tool use.

Attribution & Licenses

Base Model

This model is based on:

  • Gemma 4 by Google DeepMind
  • 0xSero/gemma-4-21b-a4b-it-REAP - A specialized fine-tune of Gemma 4

Gemma 4 is licensed under the Gemma License: https://ai.google.dev/gemma/terms

Training Data

  • Dataset: Private Claude conversations (agent-dataset-unsloth)
  • Source: Conversations generated using Anthropic's Claude (Claude Code)
  • License: Private dataset - not for redistribution

Training Framework

This model was fine-tuned using:

  • Transformers by Hugging Face (Apache 2.0)
  • PEFT (Parameter-Efficient Fine-Tuning) by Hugging Face (Apache 2.0)
  • bitsandbytes for 4-bit quantization (MIT)
  • Unsloth for optimized training (Apache 2.0)

Developer

Fine-tuned by: Austin Dixson
Training Date: April 2025
Status: Active development - iteration 1/10

Training Details

  • Base Model: 0xSero/gemma-4-21b-a4b-it-REAP
  • Training Steps: 325/1500 (22% complete)
  • Loss: ~2.708
  • Dataset: Private Claude conversations (agent-dataset-unsloth)
  • Training Method: LoRA (Low-Rank Adaptation); a configuration sketch follows this list
    • Rank (r): 16
    • Alpha: 16
    • Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
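
The LoRA settings above map onto a PEFT LoraConfig roughly as sketched below; values not listed in this card (dropout, bias handling) are assumptions.

from peft import LoraConfig

# Minimal sketch of the adapter configuration described above.
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    lora_dropout=0.0,   # assumption: dropout is not stated in the card
    bias="none",        # assumption
    task_type="CAUSAL_LM",
)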

Capabilities

This model has been fine-tuned for:

  • One-shot coding - Writing code from single examples
  • Tool-driven agent loops - Using tools autonomously
  • Function calling - OpenAI-style function calling (a schema sketch follows this list)
  • Autonomous research - Self-directed problem solving
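
For the OpenAI-style function calling above, a tool is described with a JSON schema that is exposed to the model, which then replies with a structured call. The exact prompt format used in training is not documented here, so the schema below and the way it is injected into the prompt are assumptions, a minimal sketch only.

import json

# Hypothetical tool definition in the OpenAI function-calling schema.
get_weather = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

# Assumption: the tool list is surfaced to the model via the system prompt
# and the model answers with a JSON object naming the tool and its arguments.
system_prompt = (
    "You may call the following tools. To invoke one, reply with a JSON "
    'object of the form {"name": ..., "arguments": {...}}.\n'
    + json.dumps([get_weather], indent=2)
)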

Tools Integrated

  • divideandconquer
  • PinchBench
  • WildClawBench
  • hotAsianIntern

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

# Load the base model in 4-bit (NF4, matching the training quantization)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "0xSero/gemma-4-21b-a4b-it-REAP",
    device_map="auto",
    torch_dtype=torch.float16,
    quantization_config=bnb_config,
)

# Load LoRA adapters
model = PeftModel.from_pretrained(base_model, "austindixson/gemma-4-21b-reap-harness-ready")
tokenizer = AutoTokenizer.from_pretrained("austindixson/gemma-4-21b-reap-harness-ready")

# Use the model (instruction-tuned Gemma models expect the chat template)
messages = [{"role": "user", "content": "How do I create a REST API in Python?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
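
If a standalone (adapter-free) checkpoint is needed, the LoRA weights can be merged into the base model. A minimal sketch, assuming the base is reloaded in float16 first (merging into 4-bit quantized weights is not supported); the output directory name is arbitrary.

# Reload the base in half precision, attach the adapters, merge, and save.
base_fp16 = AutoModelForCausalLM.from_pretrained(
    "0xSero/gemma-4-21b-a4b-it-REAP",
    torch_dtype=torch.float16,
    device_map="auto",
)
merged = PeftModel.from_pretrained(
    base_fp16, "austindixson/gemma-4-21b-reap-harness-ready"
).merge_and_unload()
merged.save_pretrained("gemma-4-21b-reap-merged")
tokenizer.save_pretrained("gemma-4-21b-reap-merged")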

Training Configuration

  • Max Sequence Length: 2048 tokens
  • Batch Size: 2 per device × 4 gradient accumulation steps = effective batch of 8
  • Learning Rate: 2e-4
  • Quantization: 4-bit (NF4 quantization)
  • Optimizer: AdamW 8-bit
  • Scheduler: cosine with 10 warmup steps
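
As a rough reference, the hyperparameters above map onto Hugging Face TrainingArguments as sketched below. The actual run used Unsloth's optimized trainer, values not listed in the card (output directory, logging cadence) are placeholders, and the 2048-token max sequence length is configured on the trainer/tokenizer side rather than here.

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="outputs",                # placeholder
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,       # 2 per device x 4 = 8 effective
    learning_rate=2e-4,
    max_steps=1500,
    warmup_steps=10,
    lr_scheduler_type="cosine",
    optim="adamw_bnb_8bit",              # AdamW 8-bit via bitsandbytes
    fp16=True,
    logging_steps=25,                    # placeholder
)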

Hardware

Trained on H100 GPU (80GB HBM3) with 4-bit quantization for memory efficiency.

Iteration Plan

This model is part of a 10x iteration workflow:

  1. Train → Benchmark → Auto-research → Prune → Deploy
  2. Current status: First iteration checkpoint (step 325)

License

This model inherits the license from the base Gemma 4 model. See the Gemma License for usage terms.


Acknowledgments

  • Google DeepMind for creating the Gemma 4 model
  • 0xSero for the REAP fine-tune of Gemma 4
  • Anthropic for Claude (Claude Code) used to generate training data
  • Hugging Face for the Transformers, PEFT, and bitsandbytes libraries
  • Unsloth for the optimized training framework