CHEETAH-350M-Merged-FP16

CHEETAH-350M-Merged-FP16 is a merged instruction-tuned model based on LiquidAI/LFM2-350M.

A LoRA adapter was trained on HuggingFaceTB/smol-smoltalk and then merged into the base model weights, producing a standalone Transformers model that loads without the adapter.

🐆 Fast, small, cheap, and instruction-following focused.

Model Details

| Field | Value |
| --- | --- |
| Model family | CHEETAH |
| Model name | CHEETAH-350M-Merged-FP16 |
| Base model | LiquidAI/LFM2-350M |
| Training dataset | HuggingFaceTB/smol-smoltalk |
| Fine-tuning type | LoRA SFT |
| Final format | Merged FP16 Transformers model |
| Training platform | Modal |
| GPU | NVIDIA L4 |
| Selected checkpoint | Step 750 |
| License | lfm1.0 |
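
As a rough sizing aid, FP16 stores each weight in 2 bytes, so the weight footprint can be estimated from the parameter count. The sketch below assumes a nominal 350 million parameters (read off the model name; the exact count may differ) and ignores activations, inference caches, and framework overhead. The function name is hypothetical.

```python
def fp16_weight_bytes(num_params: int) -> int:
    """Return the bytes needed to store num_params weights in FP16."""
    bytes_per_param = 2  # FP16 uses 16 bits per value
    return num_params * bytes_per_param


# Assumption: nominal parameter count taken from the "350M" in the model name.
nominal_params = 350_000_000
weight_gib = fp16_weight_bytes(nominal_params) / 1024**3
print(f"Estimated FP16 weights: {weight_gib:.2f} GiB")
```

Under that assumption the weights alone come to well under 1 GiB, which is consistent with the model's positioning for local inference.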

Training Summary

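The LoRA adapter was trained on a 1,000-step schedule and training was stopped at step 750, once that checkpoint had been saved. The last evaluation ran at step 700, so the reported eval loss and perplexity were measured 50 steps before the selected checkpoint.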

| Metric | Value |
| --- | --- |
| Selected step | 750 |
| Last evaluated step | 700 |
| Eval loss | 1.3082 |
| Eval perplexity | 3.70 |
| Tokens seen at checkpoint | 9,711,906 |
| Training time | 32.8 minutes |
| Speed near end | ~5,000 tok/s |
| GPU | NVIDIA L4 |
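
The figures in this table can be cross-checked with a few lines of arithmetic. The sketch below assumes the eval loss is a mean per-token cross-entropy in nats, in which case perplexity is its exponential, and it derives the average throughput over the whole run from the token count and training time.

```python
import math

eval_loss = 1.3082        # eval loss at step 700
tokens_seen = 9_711_906   # tokens seen at step 750
elapsed_min = 32.8        # training time in minutes

# Perplexity is exp(loss) when the loss is mean cross-entropy in nats.
perplexity = math.exp(eval_loss)

# Average throughput over the full run, in tokens per second.
avg_tok_per_s = tokens_seen / (elapsed_min * 60)

print(f"perplexity = {perplexity:.2f}")
print(f"average throughput = {avg_tok_per_s:.0f} tok/s")
```

Both results line up with the reported values: the perplexity rounds to 3.70, and the run-wide average throughput sits just under the ~5,000 tok/s seen near the end.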

Final Training Log

[2026-05-30 19:03:52] step=700/1000 loss=17.2614 lr=4.36e-05 tokens_seen=9,070,297 tok/s=5116.6 elapsed_min=30.6
[2026-05-30 19:03:53] eval_loss=1.3082 eval_ppl=3.70
[2026-05-30 19:04:18] step=710/1000 loss=16.9876 lr=4.10e-05 tokens_seen=9,195,110 tok/s=4728.6 elapsed_min=31.1
[2026-05-30 19:04:44] step=720/1000 loss=16.4713 lr=3.84e-05 tokens_seen=9,324,489 tok/s=5017.9 elapsed_min=31.5
[2026-05-30 19:05:10] step=730/1000 loss=16.7246 lr=3.59e-05 tokens_seen=9,457,294 tok/s=5178.9 elapsed_min=31.9
[2026-05-30 19:05:36] step=740/1000 loss=16.3293 lr=3.34e-05 tokens_seen=9,580,098 tok/s=4697.8 elapsed_min=32.4
[2026-05-30 19:06:02] step=750/1000 loss=16.4356 lr=3.10e-05 tokens_seen=9,711,906 tok/s=5018.9 elapsed_min=32.8
[2026-05-30 19:06:06] Saved checkpoint: /outputs/CHEETAH-350M-LoRA-L4/checkpoints/step-750

Note: the training loss shown in the log (roughly 16 to 17) is inflated by the way loss is logged under gradient accumulation, so it is not comparable to the evaluation loss. Use evaluation loss and perplexity to judge the selected checkpoint.
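
One plausible reading of the log, which the source does not confirm, is that the logger sums the loss over the 16 gradient-accumulation micro-steps instead of averaging it. The sketch below parses a step line from the log and divides the logged loss by the accumulation factor under that assumption. The regular expression and the `parse_step_line` helper are illustrative, not part of the training code.

```python
import re

# Matches the step, logged loss, and token count in a training-step log line.
LOG_LINE = re.compile(
    r"step=(?P<step>\d+)/(?P<total>\d+)\s+loss=(?P<loss>[\d.]+)"
    r".*?tokens_seen=(?P<tokens>[\d,]+)"
)


def parse_step_line(line: str, grad_accum: int = 16) -> dict:
    """Parse one step line and rescale its loss by the accumulation factor."""
    match = LOG_LINE.search(line)
    if match is None:
        raise ValueError("not a training-step log line")
    logged_loss = float(match["loss"])
    return {
        "step": int(match["step"]),
        "logged_loss": logged_loss,
        # Assumption: the logged value is a sum over grad_accum micro-steps.
        "loss_per_micro_step": logged_loss / grad_accum,
        "tokens_seen": int(match["tokens"].replace(",", "")),
    }


line = (
    "[2026-05-30 19:06:02] step=750/1000 loss=16.4356 lr=3.10e-05 "
    "tokens_seen=9,711,906 tok/s=5018.9 elapsed_min=32.8"
)
parsed = parse_step_line(line)
print(parsed)
```

If the assumption holds, the rescaled training loss at step 750 is about 1.03, which is on the same scale as the evaluation loss of 1.3082.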

Intended Use

This model is intended for:

  • Lightweight instruction following
  • Small assistant experiments
  • Fast local or cloud inference
  • Educational fine-tuning experiments
  • CHEETAH model family development

Not Intended For

This model is not intended for:

  • High-stakes medical, legal, or financial advice
  • Safety-critical automation
  • Private-data processing without review
  • Production deployment without evaluation

Usage

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Tralalabs/CHEETAH-350M-Merged-FP16"

# Load the tokenizer and the merged FP16 weights.
tokenizer = AutoTokenizer.from_pretrained(
    model_id,
    trust_remote_code=True,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)

messages = [
    {
        "role": "system",
        "content": "You are CHEETAH, a fast, clear, helpful assistant.",
    },
    {
        "role": "user",
        "content": "Explain why cheetahs are fast in 3 short bullets.",
    },
]

# Render the conversation with the model's chat template and
# append the marker that starts the assistant turn.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample with a low temperature for focused, mostly deterministic answers.
with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=160,
        do_sample=True,
        temperature=0.35,
        top_p=0.9,
        repetition_penalty=1.05,
        pad_token_id=tokenizer.eos_token_id,
    )

# Decode the full sequence: the prompt turns followed by the reply.
print(tokenizer.decode(output[0], skip_special_tokens=True))
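
The decoded text from this snippet contains the prompt turns as well as the reply, because `generate` returns the prompt tokens followed by the new tokens. To keep only the reply, slice off the prompt before decoding. The sketch below shows the slicing on plain Python lists that stand in for tensors; `generated_ids` is a hypothetical helper name.

```python
def generated_ids(output_ids, prompt_length):
    """Return only the tokens produced after the prompt."""
    return output_ids[prompt_length:]


# Toy example: plain lists stand in for token-id tensors.
prompt_ids = [11, 12, 13, 14]
output_ids = prompt_ids + [21, 22, 23]

reply_ids = generated_ids(output_ids, len(prompt_ids))
print(reply_ids)  # [21, 22, 23]
```

Applied to the usage snippet, the same idea would read `tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)`.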

Training Data

The model was fine-tuned on:

  • HuggingFaceTB/smol-smoltalk

This dataset is a subset of SmolTalk adapted for models smaller than 1B parameters.

Training Configuration

| Setting | Value |
| --- | --- |
| Base model | LiquidAI/LFM2-350M |
| Dataset | HuggingFaceTB/smol-smoltalk |
| Rows | 16,000 |
| Max sequence length | 2048 |
| LoRA rank | 16 |
| LoRA alpha | 32 |
| LoRA dropout | 0.05 |
| Learning rate | 2e-4 |
| Gradient accumulation | 16 |
| Selected checkpoint | Step 750 |
| Final tokens seen | 9,711,906 |
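
To make the rank and alpha settings concrete, the sketch below shows what merging a LoRA adapter into a base weight amounts to under the standard LoRA formulation: the merged weight is the base weight plus the low-rank product B·A scaled by alpha / rank. This is an illustration on toy matrices in pure Python, assuming the standard scaling (not a variant such as rsLoRA). The source does not state which tooling performed the merge, and all function names here are hypothetical.

```python
def lora_scale(rank: int, alpha: float) -> float:
    """Standard LoRA scaling factor applied to the low-rank update."""
    return alpha / rank


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_b, lora_a, scale):
    """Return weight + scale * (lora_b @ lora_a)."""
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Scaling from the settings in this card: alpha 32, rank 16.
scale = lora_scale(rank=16, alpha=32)

# Toy matrices for readability: the inner dimension here is 1,
# whereas the adapter described in this card uses rank 16.
weight = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[0.5], [0.25]]  # shape (2, 1)
lora_a = [[1.0, 2.0]]     # shape (1, 2)

merged = merge_lora(weight, lora_b, lora_a, scale)
print(scale)   # 2.0
print(merged)  # [[2.0, 2.0], [0.5, 2.0]]
```

After a merge of this kind the adapter matrices are no longer needed at inference time, which is why the merged release loads as an ordinary Transformers model.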

Limitations

CHEETAH-350M-Merged-FP16 is a small 350M-class model. It may:

  • Hallucinate facts
  • Struggle with long reasoning chains
  • Give weak answers on niche knowledge
  • Misread complex instructions
  • Need careful prompting for best results

For factual or current information, verify outputs with trusted sources.

License

This model is released under lfm1.0, matching the license of the base model LiquidAI/LFM2-350M.

The training dataset HuggingFaceTB/smol-smoltalk is licensed under Apache-2.0.

Citation

Base model:

LiquidAI/LFM2-350M

Dataset:

HuggingFaceTB/smol-smoltalk

Model Family

This model belongs to the CHEETAH family:

CHEETAH-[SIZE]-LoRA
CHEETAH-[SIZE]-Merged

This release:

CHEETAH-350M-Merged-FP16

Example Output

Prompt:

system
You are CHEETAH, a fast, clear, helpful assistant.

user
Explain why cheetahs are fast in 3 short bullets.

Model output:

assistant
1. Cheetahs have a unique body structure that allows them to run at incredible speeds.
2. Their long legs and lightweight build enable them to accelerate quickly.
3. Cheetahs have a specialized tail that acts as a counterbalance during high-speed runs.