Ayrton-1

A Formula 1 domain-specialist language model. Fine-tune of google/gemma-3-12b-it for factual F1 question answering across the 1950–2025 seasons, built for local inference on Apple Silicon via MLX.

Named after Ayrton Senna.

What it does

Answers factual F1 questions: race results, championship standings, driver/constructor history (1950–2025), and session-level telemetry and strategy (2018–2025 only, due to data availability).

It is not a chatbot, a general assistant, or a live-data source. It is a frozen knowledge model.

Evaluation

Holdout                                           Hybrid accuracy
2025 test   (ab_sample256_gold.jsonl)             0.957
2024 valid  (ab_valid2024_sample256_gold.jsonl)   0.945

Hybrid accuracy = value-level match with light paraphrase tolerance. Evaluation harness: scripts/eval_hybrid.py in the source repo.
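
As a rough illustration of what "value-level match with light paraphrase tolerance" means in practice, here is a minimal sketch; the normalization rules and function names are illustrative, not the actual logic in scripts/eval_hybrid.py.

import re

def normalize(text: str) -> str:
    # Lowercase, drop most punctuation (keep "." and ":" for lap times), collapse whitespace.
    text = re.sub(r"[^\w\s.:]", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()

def hybrid_match(prediction: str, gold: str) -> bool:
    pred, ref = normalize(prediction), normalize(gold)
    # Value-level match: exact after normalization, or the gold value
    # appears verbatim inside a longer paraphrased answer.
    if pred == ref or ref in pred:
        return True
    # Numeric values (points totals, lap times) must match exactly.
    pred_nums = re.findall(r"\d+(?:[.:]\d+)*", pred)
    gold_nums = re.findall(r"\d+(?:[.:]\d+)*", ref)
    return bool(gold_nums) and gold_nums == pred_nums

# Example: gold "Ayrton Senna" matches "Ayrton Senna won the 1988 title."
# Hybrid accuracy over a holdout file is the mean of hybrid_match per example.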

Quick start

MLX (Apple Silicon, recommended)

pip install mlx-lm

from mlx_lm import load, generate

model, tokenizer = load("machina-sports/ayrton-1")
messages = [{"role": "user", "content": "Who won the 1988 F1 World Championship?"}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
print(generate(model, tokenizer, prompt=prompt, max_tokens=128))

Transformers (CUDA / CPU)

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("machina-sports/ayrton-1", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("machina-sports/ayrton-1")
messages = [{"role": "user", "content": "Who won the 1988 F1 World Championship?"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(inputs, max_new_tokens=128)[0], skip_special_tokens=True))

Intended use

  • Factual F1 QA (race results, standings, driver history, lap stats, session strategy).
  • English only.
  • Historical reasoning only; the model has no access to live data.

Out of scope

  • Pre-2018 session-level telemetry. FastF1 coverage starts in 2018; the model is trained to refuse these requests explicitly with a scope-boundary answer ("Detailed session-level telemetry/strategy coverage is 2018–2025."). A client-side guard sketch follows this list.
  • Seasons after 2025. The training data ends with the 2025 season.
  • General-purpose chat, code, and math. Performance on non-F1 tasks has not been evaluated and has likely regressed relative to the base Gemma model.
  • Real-time information (driver transfers, current race weekends, standings during a live season).
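
When routing questions programmatically, a thin client-side guard can keep requests inside the coverage window described above rather than relying only on the model's own refusals. The year heuristic, keyword list, and function name below are illustrative, not part of the released tooling.

import re

TELEMETRY_HINTS = ("telemetry", "stint", "tyre", "pit stop", "sector", "lap time")

def in_scope(question: str) -> bool:
    # Rough coverage check mirroring the scope rules on this card.
    years = [int(y) for y in re.findall(r"\b(19[5-9]\d|20[0-2]\d)\b", question)]
    if not years:
        return True   # no explicit year: let the model decide
    if max(years) > 2025:
        return False  # training data ends with the 2025 season
    needs_telemetry = any(h in question.lower() for h in TELEMETRY_HINTS)
    if needs_telemetry and min(years) < 2018:
        return False  # session-level detail only exists for 2018-2025
    return True

print(in_scope("What was Verstappen's medium stint length at Suzuka in 2023?"))  # True
print(in_scope("Show Fangio's sector times from the 1955 Monaco GP"))            # False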

Training

  • Base model: google/gemma-3-12b-it
  • Method: Iterative LoRA fine-tuning via mlx_lm.lora, merging each version's adapter into the base before training the next generation. This model is the final fused merge after v8 → v9 → v10a–h; a command-line sketch of one train-and-fuse cycle appears at the end of this section.
  • LoRA config: rank 16, learning rate 2.5e-6, max sequence length 2048, batch size 1.
  • Hardware: Apple Silicon (MLX).
  • Final stage: v10h-p2 — 180 iterations, resumed from v10h-p1 checkpoint 200.
  • Training data: machina-sports/ayrton-1-qa-v2 — temporally split, 2024 as validation, 2025 as test.
  • Distillation: Gemini 3 Flash as teacher for style/fluency transfer on the train split, with strict factual filters (numeric match, hedge rejection, template-level overlap thresholds); a filter sketch follows this list.
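
As an illustration of the factual filters described in the distillation step, here is a minimal sketch; the hedge list, regexes, and 0.5 overlap threshold are illustrative placeholders, not the pipeline's actual values.

import re

HEDGE_MARKERS = ("probably", "i think", "might have", "not sure", "roughly")

def numbers(text: str) -> list[str]:
    # Extract integers, decimals, and lap-time-like values (e.g. 1:23.456).
    return re.findall(r"\d+(?::\d+)?(?:\.\d+)?", text)

def keep_distilled_answer(reference: str, rewritten: str) -> bool:
    # Numeric match: every number in the reference answer must survive the rewrite.
    if not set(numbers(reference)) <= set(numbers(rewritten)):
        return False
    # Hedge rejection: drop rewrites that soften facts into guesses.
    if any(h in rewritten.lower() for h in HEDGE_MARKERS):
        return False
    # Template-level overlap: require enough lexical overlap with the reference.
    ref_tokens = set(reference.lower().split())
    new_tokens = set(rewritten.lower().split())
    return len(ref_tokens & new_tokens) / max(len(ref_tokens), 1) >= 0.5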

Full pipeline, data-mix recipes, and eval harness in the source repository.
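
One generation of the train-then-fuse cycle can be approximated with the stock mlx_lm command-line tools. The paths, iteration count, and directory names below are illustrative; the actual per-version recipes live in the source repository.

import subprocess

BASE = "google/gemma-3-12b-it"       # or the previous generation's fused model
DATA = "data/ayrton-1-qa-v2"         # illustrative path to the train/valid/test split
ADAPTERS = "adapters/next-version"
FUSED = "models/ayrton-1-next"

# 1. Train a LoRA adapter on top of the current base.
subprocess.run([
    "mlx_lm.lora", "--model", BASE, "--train",
    "--data", DATA, "--adapter-path", ADAPTERS,
    "--batch-size", "1", "--iters", "180",
], check=True)

# 2. Fuse the adapter into the base weights so the next generation
#    trains on a merged model rather than stacked adapters.
subprocess.run([
    "mlx_lm.fuse", "--model", BASE,
    "--adapter-path", ADAPTERS, "--save-path", FUSED,
], check=True)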

Limitations & known biases

  • Refusal calibration on pre-2018 telemetry is imperfect — the model usually refuses, but occasional hallucinations on early FastF1-style questions still occur.
  • Coverage skew toward the modern era. 2018–2025 has richer training signal (session-level detail) than 1950–2017, so answer quality is more uniform in the modern period.
  • Jolpica data gap: constructorStandings for the 1954 season is empty at the source and is not patched.
  • Language: trained and evaluated in English only.
  • Numeric precision: lap-time and telemetry values are reproduced to the precision seen in FastF1; rounding differences vs other sources are expected.

Data sources & attribution

  • Jolpica-F1 — historical backbone (1950–2025): races, results, standings, driver/constructor metadata.
  • FastF1 — session-level telemetry and strategy (2018–2025).
  • OpenF1 — disabled in this release (access instability, marginal gain).

All three upstream sources retain their own licenses; this model is a derivative work built on the compiled QA dataset.

License

  • Model weights: Released under the Gemma Terms of Use. Commercial use permitted subject to those terms.
  • Training data: CC BY-NC 4.0 — see the dataset card.
  • Upstream attributions: Jolpica-F1, FastF1 retain their own licenses.

Source code

Full training, data-build, and eval pipeline: https://github.com/machinasports/ayrton-1

Citation

@misc{ayrton1,
  title  = {Ayrton-1: A Formula 1 Domain-Specialist Fine-Tune of Gemma 3 12B},
  author = {Machina Sports},
  year   = {2026},
  howpublished = {\url{https://huggingface.co/machina-sports/ayrton-1}}
}