calculator_model_test

This model is a fine-tuned version of an unspecified base model, trained on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1069
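
For intuition, the reported loss can be converted to a perplexity. This is only a sketch: the card does not state the loss function, so the code assumes a mean per-token cross-entropy measured in nats, which is an assumption rather than something the card confirms.

```python
import math

eval_loss = 0.1069  # final evaluation loss reported in this card

# Assumption (not stated in the card): the loss is a mean per-token
# cross-entropy in nats, so perplexity = exp(loss).
perplexity = math.exp(eval_loss)
print(f"{perplexity:.4f}")  # prints 1.1128
```

Under that assumption, a perplexity close to 1 would mean the model is nearly certain about each target token on the evaluation set.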

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 70
  • mixed_precision_training: Native AMP
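
As a rough illustration of what the linear schedule implies for the learning rate, the sketch below computes its value at a given optimizer step. It assumes zero warmup steps (the card lists none) and 420 total steps (the last step in the training results). `linear_lr` is a hypothetical helper written for this example, not a Transformers function.

```python
def linear_lr(step: int, base_lr: float = 3e-4, total_steps: int = 420,
              warmup_steps: int = 0) -> float:
    """Learning rate under linear warmup followed by linear decay to zero.

    Assumptions: warmup_steps defaults to 0 because the card lists no warmup;
    total_steps is taken from the final row of the training results.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 210, 420):
    print(step, linear_lr(step))
```

With these assumptions the rate starts at 0.0003, is halved at step 210 (epoch 35), and reaches zero at step 420.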

Training results

Training Loss  Epoch  Step  Validation Loss
3.1895         1.0    6     2.6427
2.5023         2.0    12    2.3194
2.2164         3.0    18    2.0818
1.9817         4.0    24    1.8459
1.7401         5.0    30    1.5845
1.4949         6.0    36    1.3665
1.3165         7.0    42    1.2362
1.1900         8.0    48    1.1268
1.0863         9.0    54    1.0420
1.0146         10.0   60    0.9612
0.9334         11.0   66    0.8996
0.8609         12.0   72    0.8165
0.7963         13.0   78    0.7590
0.7421         14.0   84    0.7111
0.7035         15.0   90    0.6801
0.6712         16.0   96    0.6426
0.6280         17.0   102   0.6165
0.6134         18.0   108   0.6010
0.5910         19.0   114   0.5739
0.5714         20.0   120   0.5586
0.5493         21.0   126   0.5467
0.5482         22.0   132   0.5332
0.5189         23.0   138   0.5065
0.5028         24.0   144   0.4866
0.4837         25.0   150   0.4671
0.4711         26.0   156   0.4461
0.4482         27.0   162   0.4298
0.4262         28.0   168   0.4064
0.4098         29.0   174   0.3933
0.3945         30.0   180   0.3733
0.3810         31.0   186   0.3547
0.3614         32.0   192   0.3353
0.3432         33.0   198   0.3179
0.3322         34.0   204   0.3003
0.3181         35.0   210   0.2997
0.3149         36.0   216   0.2767
0.2864         37.0   222   0.2600
0.2819         38.0   228   0.2418
0.2663         39.0   234   0.2296
0.2591         40.0   240   0.2176
0.2535         41.0   246   0.2106
0.2314         42.0   252   0.1970
0.2277         43.0   258   0.1849
0.2231         44.0   264   0.1754
0.2109         45.0   270   0.1750
0.1956         46.0   276   0.1640
0.1933         47.0   282   0.1597
0.1826         48.0   288   0.1556
0.1831         49.0   294   0.1550
0.1776         50.0   300   0.1436
0.1671         51.0   306   0.1417
0.1773         52.0   312   0.1349
0.1617         53.0   318   0.1311
0.1600         54.0   324   0.1333
0.1557         55.0   330   0.1274
0.1548         56.0   336   0.1231
0.1486         57.0   342   0.1217
0.1484         58.0   348   0.1201
0.1434         59.0   354   0.1169
0.1375         60.0   360   0.1179
0.1418         61.0   366   0.1142
0.1377         62.0   372   0.1119
0.1422         63.0   378   0.1115
0.1315         64.0   384   0.1104
0.1362         65.0   390   0.1106
0.1380         66.0   396   0.1084
0.1326         67.0   402   0.1072
0.1311         68.0   408   0.1078
0.1396         69.0   414   0.1073
0.1340         70.0   420   0.1069
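
The step counts in these results, combined with the batch size, put a rough bound on the training-set size. The sketch below works through that arithmetic. It assumes a single device, no gradient accumulation, and that the last partial batch of each epoch is kept. None of this is stated in the card, so treat the bounds as an estimate.

```python
import math

train_batch_size = 512  # from the hyperparameter list
total_steps = 420       # last step in the training results
num_epochs = 70

steps_per_epoch = total_steps // num_epochs

# Assumption: one device, no gradient accumulation, last partial batch kept,
# so steps_per_epoch == ceil(num_examples / train_batch_size).
min_examples = (steps_per_epoch - 1) * train_batch_size + 1
max_examples = steps_per_epoch * train_batch_size

# Sanity check: both bounds give the same number of steps per epoch.
assert math.ceil(min_examples / train_batch_size) == steps_per_epoch
assert math.ceil(max_examples / train_batch_size) == steps_per_epoch

print(steps_per_epoch, min_examples, max_examples)  # prints: 6 2561 3072
```

Under these assumptions the training set would contain between 2,561 and 3,072 examples.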

Framework versions

  • Transformers 5.0.0
  • PyTorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2

Safetensors

  • Model size: 7.8M params
  • Tensor type: F32