Ayrton-1

A Formula 1 domain-specialist language model. Fine-tune of google/gemma-3-12b-it for factual F1 question answering across the 1950–2025 seasons, built for local inference on Apple Silicon via MLX.

Named after Ayrton Senna.

What it does

Answers factual F1 questions: race results, championship standings, driver/constructor history (1950–2025), and session-level telemetry and strategy (2018–2025 only, due to data availability).

It is not a chatbot, a general assistant, or a live-data source. It is a frozen knowledge model.

Evaluation

Holdout                                           Hybrid accuracy
2025 test   (ab_sample256_gold.jsonl)             0.957
2024 valid  (ab_valid2024_sample256_gold.jsonl)   0.945

Hybrid accuracy = value-level match with light paraphrase tolerance. Evaluation harness: scripts/eval_hybrid.py in the source repo.
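
As a rough illustration of what "value-level match with light paraphrase tolerance" means in practice, here is a minimal sketch; the normalization rules and function names are illustrative, not the actual logic in scripts/eval_hybrid.py.

import re

def normalize(text: str) -> str:
    # Lowercase, drop most punctuation (keep "." and ":" for lap times), collapse whitespace.
    text = re.sub(r"[^\w\s.:]", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()

def hybrid_match(prediction: str, gold: str) -> bool:
    pred, ref = normalize(prediction), normalize(gold)
    # Value-level match: exact after normalization, or the gold value
    # appears verbatim inside a longer paraphrased answer.
    if pred == ref or ref in pred:
        return True
    # Numeric values (points totals, lap times) must match exactly.
    pred_nums = re.findall(r"\d+(?:[.:]\d+)*", pred)
    gold_nums = re.findall(r"\d+(?:[.:]\d+)*", ref)
    return bool(gold_nums) and gold_nums == pred_nums

# Example: gold "Ayrton Senna" matches "Ayrton Senna won the 1988 title."
# Hybrid accuracy over a holdout file is the mean of hybrid_match per example.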

Quick start

MLX (Apple Silicon, recommended)

pip install mlx-lm

from mlx_lm import load, generate

model, tokenizer = load("machina-sports/ayrton-1")
messages = [{"role": "user", "content": "Who won the 1988 F1 World Championship?"}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
print(generate(model, tokenizer, prompt=prompt, max_tokens=128))

Transformers (CUDA / CPU)

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("machina-sports/ayrton-1", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("machina-sports/ayrton-1")
messages = [{"role": "user", "content": "Who won the 1988 F1 World Championship?"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(inputs, max_new_tokens=128)[0], skip_special_tokens=True))

Intended use

  • Factual F1 QA (race results, standings, driver history, lap stats, session strategy).
  • English only.
  • Historical reasoning only; the model has no access to live data.

Out of scope

  • Pre-2018 session-level telemetry. FastF1 coverage starts in 2018; the model is trained to refuse these requests explicitly with a scope-boundary answer ("Detailed session-level telemetry/strategy coverage is 2018–2025."). A client-side guard sketch follows this list.
  • Seasons after 2025. The training data ends with the 2025 season.
  • General-purpose chat, code, and math. Performance on non-F1 tasks has not been evaluated and has likely regressed relative to the base Gemma model.
  • Real-time information (driver transfers, current race weekends, standings during a live season).
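
When routing questions programmatically, a thin client-side guard can keep requests inside the coverage window described above rather than relying only on the model's own refusals. The year heuristic, keyword list, and function name below are illustrative, not part of the released tooling.

import re

TELEMETRY_HINTS = ("telemetry", "stint", "tyre", "pit stop", "sector", "lap time")

def in_scope(question: str) -> bool:
    # Rough coverage check mirroring the scope rules on this card.
    years = [int(y) for y in re.findall(r"\b(19[5-9]\d|20[0-2]\d)\b", question)]
    if not years:
        return True   # no explicit year: let the model decide
    if max(years) > 2025:
        return False  # training data ends with the 2025 season
    needs_telemetry = any(h in question.lower() for h in TELEMETRY_HINTS)
    if needs_telemetry and min(years) < 2018:
        return False  # session-level detail only exists for 2018-2025
    return True

print(in_scope("What was Verstappen's medium stint length at Suzuka in 2023?"))  # True
print(in_scope("Show Fangio's sector times from the 1955 Monaco GP"))            # False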

Training

  • Base model: google/gemma-3-12b-it
  • Method: Iterative LoRA fine-tuning via mlx_lm.lora, merging each version's adapter into the base before training the next generation. This model is the final fused merge after v8 → v9 → v10a–h; a command-line sketch of one train-and-fuse cycle appears at the end of this section.
  • LoRA config: rank 16, learning rate 2.5e-6, max sequence length 2048, batch size 1.
  • Hardware: Apple Silicon (MLX).
  • Final stage: v10h-p2 — 180 iterations, resumed from v10h-p1 checkpoint 200.
  • Training data: machina-sports/ayrton-1-qa-v2 — temporally split, 2024 as validation, 2025 as test.
  • Distillation: Gemini 3 Flash as teacher for style/fluency transfer on the train split, with strict factual filters (numeric match, hedge rejection, template-level overlap thresholds); a filter sketch follows this list.
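
As an illustration of the factual filters described in the distillation step, here is a minimal sketch; the hedge list, regexes, and 0.5 overlap threshold are illustrative placeholders, not the pipeline's actual values.

import re

HEDGE_MARKERS = ("probably", "i think", "might have", "not sure", "roughly")

def numbers(text: str) -> list[str]:
    # Extract integers, decimals, and lap-time-like values (e.g. 1:23.456).
    return re.findall(r"\d+(?::\d+)?(?:\.\d+)?", text)

def keep_distilled_answer(reference: str, rewritten: str) -> bool:
    # Numeric match: every number in the reference answer must survive the rewrite.
    if not set(numbers(reference)) <= set(numbers(rewritten)):
        return False
    # Hedge rejection: drop rewrites that soften facts into guesses.
    if any(h in rewritten.lower() for h in HEDGE_MARKERS):
        return False
    # Template-level overlap: require enough lexical overlap with the reference.
    ref_tokens = set(reference.lower().split())
    new_tokens = set(rewritten.lower().split())
    return len(ref_tokens & new_tokens) / max(len(ref_tokens), 1) >= 0.5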

Full pipeline, data-mix recipes, and eval harness in the source repository.
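
One generation of the train-then-fuse cycle can be approximated with the stock mlx_lm command-line tools. The paths, iteration count, and directory names below are illustrative; the actual per-version recipes live in the source repository.

import subprocess

BASE = "google/gemma-3-12b-it"       # or the previous generation's fused model
DATA = "data/ayrton-1-qa-v2"         # illustrative path to the train/valid/test split
ADAPTERS = "adapters/next-version"
FUSED = "models/ayrton-1-next"

# 1. Train a LoRA adapter on top of the current base.
subprocess.run([
    "mlx_lm.lora", "--model", BASE, "--train",
    "--data", DATA, "--adapter-path", ADAPTERS,
    "--batch-size", "1", "--iters", "180",
], check=True)

# 2. Fuse the adapter into the base weights so the next generation
#    trains on a merged model rather than stacked adapters.
subprocess.run([
    "mlx_lm.fuse", "--model", BASE,
    "--adapter-path", ADAPTERS, "--save-path", FUSED,
], check=True)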

Limitations & known biases

  • Refusal calibration on pre-2018 telemetry is imperfect — the model usually refuses, but occasional hallucinations on early FastF1-style questions still occur.
  • Coverage skew toward the modern era. 2018–2025 has richer training signal (session-level detail) than 1950–2017, so answer quality is more uniform in the modern period.
  • Jolpica data gap: constructorStandings for the 1954 season is empty at the source and is not patched.
  • Language: trained and evaluated in English only.
  • Numeric precision: lap-time and telemetry values are reproduced to the precision seen in FastF1; rounding differences vs other sources are expected.

Data sources & attribution

  • Jolpica-F1 — historical backbone (1950–2025): races, results, standings, driver/constructor metadata.
  • FastF1 — session-level telemetry and strategy (2018–2025).
  • OpenF1 — disabled in this release (access instability, marginal gain).

All three upstream sources retain their own licenses; this model is a derivative work built on the compiled QA dataset.

License

  • Model weights: Released under the Gemma Terms of Use. Commercial use permitted subject to those terms.
  • Training data: CC BY-NC 4.0 — see the dataset card.
  • Upstream attributions: Jolpica-F1, FastF1 retain their own licenses.

Source code

Full training, data-build, and eval pipeline: https://github.com/machinasports/ayrton-1

Citation

@misc{ayrton1,
  title  = {Ayrton-1: A Formula 1 Domain-Specialist Fine-Tune of Gemma 3 12B},
  author = {Machina Sports},
  year   = {2026},
  howpublished = {\url{https://huggingface.co/machina-sports/ayrton-1}}
}