# Ayrton-1

A Formula 1 domain-specialist language model: a fine-tune of `google/gemma-3-12b-it` for factual F1 question answering across the 1950–2025 seasons, built for local inference on Apple Silicon via MLX. Named after Ayrton Senna.
## What it does
Answers factual F1 questions: race results, championship standings, driver/constructor history (1950–2025), and session-level telemetry and strategy (2018–2025 only, due to data availability).
It is not a chatbot, a general assistant, or a live-data source. It is a frozen knowledge model.
## Evaluation

| Holdout | Hybrid accuracy |
|---|---|
| 2025 test (`ab_sample256_gold.jsonl`) | 0.957 |
| 2024 valid (`ab_valid2024_sample256_gold.jsonl`) | 0.945 |

Hybrid accuracy = value-level match with light paraphrase tolerance. Evaluation harness: `scripts/eval_hybrid.py` in the source repo.
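The exact scoring logic lives in `scripts/eval_hybrid.py`; purely as an illustration of what "value-level match with light paraphrase tolerance" means, a minimal sketch (our assumption, not the harness itself) could look like:

```python
import re

def normalize(text: str) -> str:
    """Lowercase, strip punctuation (keeping decimal points), collapse whitespace."""
    cleaned = re.sub(r"[^\w\s.]", " ", text.lower())
    return re.sub(r"\s+", " ", cleaned).strip()

def hybrid_match(prediction: str, gold: str) -> bool:
    """Count a prediction as correct if every gold value token
    (names, numbers) appears in the normalized prediction,
    regardless of how the sentence around it is phrased."""
    pred_tokens = set(normalize(prediction).split())
    gold_tokens = normalize(gold).split()
    return all(tok in pred_tokens for tok in gold_tokens)

# A paraphrased answer still matches the gold value:
print(hybrid_match("Ayrton Senna won the 1988 title", "Ayrton Senna"))  # True
print(hybrid_match("Alain Prost", "Ayrton Senna"))                      # False
```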
## Quick start

### MLX (Apple Silicon, recommended)

```shell
pip install mlx-lm
```

```python
from mlx_lm import load, generate

model, tokenizer = load("machina-sports/ayrton-1")
messages = [{"role": "user", "content": "Who won the 1988 F1 World Championship?"}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
print(generate(model, tokenizer, prompt=prompt, max_tokens=128))
```
### Transformers (CUDA / CPU)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("machina-sports/ayrton-1", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("machina-sports/ayrton-1")
messages = [{"role": "user", "content": "Who won the 1988 F1 World Championship?"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(inputs, max_new_tokens=128)[0], skip_special_tokens=True))
```
## Intended use
- Factual F1 QA (race results, standings, driver history, lap stats, session strategy).
- English only.
- Historical reasoning — the model has no live data.
## Out of scope
- Pre-2018 session-level telemetry. FastF1 coverage starts in 2018; the model is trained to explicitly refuse these with a scope boundary ("Detailed session-level telemetry/strategy coverage is 2018–2025.").
- Seasons after 2025. The training data ends with the 2025 season.
- General-purpose chat, code, math. Performance on non-F1 tasks is not evaluated and likely regressed vs the base Gemma.
- Real-time information (driver transfers, current race weekends, standings during a live season).
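The coverage boundaries above can be expressed as a simple guard. This is an illustrative sketch for callers wrapping the model, not code shipped with the release; the `needs_session_telemetry` flag and the year extraction are our own hypothetical helpers:

```python
import re

TELEMETRY_COVERAGE = range(2018, 2026)  # FastF1-backed session data: 2018-2025
HISTORY_COVERAGE = range(1950, 2026)    # Jolpica-backed results/standings: 1950-2025

def in_scope(question: str, needs_session_telemetry: bool) -> bool:
    """Return True if every season mentioned in the question falls
    inside the model's frozen coverage window for that question type."""
    years = [int(y) for y in re.findall(r"\b(19[5-9]\d|20[0-2]\d)\b", question)]
    coverage = TELEMETRY_COVERAGE if needs_session_telemetry else HISTORY_COVERAGE
    return all(y in coverage for y in years)

print(in_scope("Who won the 1988 championship?", needs_session_telemetry=False))       # True
print(in_scope("What was the 2015 tyre strategy at Monza?", needs_session_telemetry=True))  # False
```

Requests that fail the telemetry check are exactly the cases the model is trained to refuse with its scope-boundary message.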
## Training

- Base model: `google/gemma-3-12b-it`
- Method: iterative LoRA fine-tuning via `mlx_lm.lora`, merging each version's adapter into the base before training the next generation. This model is the final fused merge after v8 → v9 → v10a–h.
- LoRA config: rank 16, learning rate 2.5e-6, max sequence length 2048, batch size 1.
- Hardware: Apple Silicon (MLX).
- Final stage: `v10h-p2` — 180 iterations, resumed from `v10h-p1` checkpoint 200.
- Training data: `machina-sports/ayrton-1-qa-v2` — temporally split, with 2024 as validation and 2025 as test.
- Distillation: Gemini 3 Flash as teacher for style/fluency transfer on the train split, with strict factual filters (numeric match, hedge rejection, template-level overlap thresholds).
Full pipeline, data-mix recipes, and eval harness in the source repository.
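The numeric-match filter used during distillation could be sketched as follows. The real filters live in the source repository; this is only our assumption about their general shape:

```python
import re

def numbers(text: str) -> set[str]:
    """Extract numeric tokens: integers, decimals, and lap times like 1:23.456."""
    return set(re.findall(r"\d+(?::\d+)?(?:\.\d+)?", text))

def passes_numeric_filter(teacher_answer: str, gold_answer: str) -> bool:
    """Reject a teacher rewrite that drops or invents any number
    relative to the gold answer, keeping only faithful paraphrases."""
    return numbers(teacher_answer) == numbers(gold_answer)

print(passes_numeric_filter("Senna took the title with 90 points.",
                            "90 points for Senna."))  # True: same numbers
print(passes_numeric_filter("Senna took the title with 94 points.",
                            "90 points for Senna."))  # False: number mismatch
```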
## Limitations & known biases
- Refusal calibration on pre-2018 telemetry is imperfect — the model usually refuses, but occasional hallucinations on early FastF1-style questions still occur.
- Coverage skew toward the modern era. 2018–2025 has richer training signal (session-level detail) than 1950–2017, so answer quality is more uniform in the modern period.
- Jolpica data gap: `constructorStandings` for the 1954 season is empty at the source and is not patched.
- Language: trained and evaluated in English only.
- Numeric precision: lap-time and telemetry values are reproduced to the precision seen in FastF1; rounding differences vs other sources are expected.
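For instance, FastF1 exposes lap times as timedeltas with millisecond precision; a sketch of rendering them at that source precision (our illustration, not the pipeline's actual formatting code):

```python
from datetime import timedelta

def format_lap(t: timedelta) -> str:
    """Render a lap time as M:SS.mmm, the millisecond precision FastF1 reports.
    Sources that round to tenths or hundredths will differ in the last digits."""
    total_ms = round(t.total_seconds() * 1000)
    minutes, rem = divmod(total_ms, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{minutes}:{seconds:02d}.{millis:03d}"

print(format_lap(timedelta(minutes=1, seconds=23, milliseconds=456)))  # 1:23.456
```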
## Data sources & attribution
- Jolpica-F1 — historical backbone (1950–2025): races, results, standings, driver/constructor metadata.
- FastF1 — session-level telemetry and strategy (2018–2025).
- OpenF1 — disabled in this release (access instability, marginal gain).
All three upstream sources retain their own licenses; this model is a derivative work built on the compiled QA dataset.
## License
- Model weights: Released under the Gemma Terms of Use. Commercial use permitted subject to those terms.
- Training data: CC BY-NC 4.0 — see the dataset card.
- Upstream attributions: Jolpica-F1, FastF1 retain their own licenses.
## Source code
Full training, data-build, and eval pipeline: https://github.com/machinasports/ayrton-1
## Citation

```bibtex
@misc{ayrton1,
  title        = {Ayrton-1: A Formula 1 Domain-Specialist Fine-Tune of Gemma 3 12B},
  author       = {Machina Sports},
  year         = {2026},
  howpublished = {\url{https://huggingface.co/machina-sports/ayrton-1}}
}
```