calculator_model_test

This model is a fine-tuned version of an unspecified base model, trained on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0681

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
  • mixed_precision_training: Native AMP
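
The linear schedule listed above decays the learning rate from its initial value to zero over the run. The following is a minimal sketch of that decay rule, assuming zero warmup steps (none are listed) and taking the total step count of 240 from the training-results table below:

```python
# Sketch of the linear LR decay implied by the hyperparameters above.
# Assumptions: no warmup (none is listed); 240 total optimizer steps
# (40 epochs x 6 steps/epoch, per the results table).

LEARNING_RATE = 0.001
TOTAL_STEPS = 240

def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS):
    """Linearly decay the learning rate from base_lr to 0 over total_steps."""
    remaining = max(0.0, 1.0 - step / total_steps)
    return base_lr * remaining

print(linear_lr(0))    # starts at the configured learning_rate, 0.001
print(linear_lr(120))  # halfway through training: 0.0005
print(linear_lr(240))  # reaches 0.0 at the final step
```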

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.0932        | 1.0   | 6    | 2.3284          |
| 2.0708        | 2.0   | 12   | 1.7597          |
| 1.6188        | 3.0   | 18   | 1.4260          |
| 1.2984        | 4.0   | 24   | 1.1225          |
| 1.0571        | 5.0   | 30   | 0.9371          |
| 0.9006        | 6.0   | 36   | 0.8191          |
| 0.8187        | 7.0   | 42   | 0.7260          |
| 0.7281        | 8.0   | 48   | 0.6762          |
| 0.6804        | 9.0   | 54   | 0.6348          |
| 0.6251        | 10.0  | 60   | 0.5609          |
| 0.5701        | 11.0  | 66   | 0.5090          |
| 0.5325        | 12.0  | 72   | 0.4674          |
| 0.4973        | 13.0  | 78   | 0.4407          |
| 0.4619        | 14.0  | 84   | 0.4090          |
| 0.4408        | 15.0  | 90   | 0.3996          |
| 0.4311        | 16.0  | 96   | 0.4260          |
| 0.4237        | 17.0  | 102  | 0.3490          |
| 0.3734        | 18.0  | 108  | 0.3225          |
| 0.3387        | 19.0  | 114  | 0.2895          |
| 0.3111        | 20.0  | 120  | 0.2506          |
| 0.2790        | 21.0  | 126  | 0.2317          |
| 0.2652        | 22.0  | 132  | 0.2102          |
| 0.2521        | 23.0  | 138  | 0.1889          |
| 0.2293        | 24.0  | 144  | 0.1697          |
| 0.2031        | 25.0  | 150  | 0.1413          |
| 0.1844        | 26.0  | 156  | 0.1269          |
| 0.1856        | 27.0  | 162  | 0.1358          |
| 0.1787        | 28.0  | 168  | 0.1104          |
| 0.1549        | 29.0  | 174  | 0.1175          |
| 0.1644        | 30.0  | 180  | 0.1034          |
| 0.1406        | 31.0  | 186  | 0.0931          |
| 0.1379        | 32.0  | 192  | 0.0888          |
| 0.1299        | 33.0  | 198  | 0.0916          |
| 0.1265        | 34.0  | 204  | 0.0809          |
| 0.1216        | 35.0  | 210  | 0.0747          |
| 0.1160        | 36.0  | 216  | 0.0725          |
| 0.1106        | 37.0  | 222  | 0.0707          |
| 0.1092        | 38.0  | 228  | 0.0686          |
| 0.1076        | 39.0  | 234  | 0.0689          |
| 0.1113        | 40.0  | 240  | 0.0681          |
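
The step counts in the table also bound the size of the training set. This is a back-of-the-envelope check, assuming the last batch of each epoch may be partial and that no gradient accumulation was used (none is listed):

```python
# Sketch: infer the approximate training-set size from the results table.
# 240 total steps over 40 epochs gives 6 optimizer steps per epoch; with
# train_batch_size=512, the training set holds between 5*512 + 1 and
# 6*512 examples (assuming no gradient accumulation).

TOTAL_STEPS = 240
NUM_EPOCHS = 40
BATCH_SIZE = 512

steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS           # 6
upper_bound = steps_per_epoch * BATCH_SIZE            # 3072
lower_bound = (steps_per_epoch - 1) * BATCH_SIZE + 1  # 2561

print(f"{steps_per_epoch} steps/epoch -> {lower_bound}..{upper_bound} examples")
```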

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Model details

  • Model size: 7.8M params
  • Tensor type: F32 (Safetensors)