calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1067

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
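
The linear schedule listed above can be sketched in plain Python. This is a hypothetical reimplementation for illustration only (it assumes no warmup steps, which the card does not specify); `linear_lr` is a name introduced here, not from the training code. The total of 240 optimizer steps follows from the results table (6 steps per epoch × 40 epochs).

```python
# Sketch of a linear LR schedule matching the hyperparameters above
# (assumption: no warmup; LR decays linearly from 1e-3 to 0).
BASE_LR = 1e-3      # learning_rate from the card
TOTAL_STEPS = 240   # 6 steps/epoch x 40 epochs, per the results table

def linear_lr(step: int) -> float:
    """Learning rate after `step` completed optimizer steps."""
    return BASE_LR * max(0.0, (TOTAL_STEPS - step) / TOTAL_STEPS)

print(linear_lr(0))    # start of training -> 0.001
print(linear_lr(120))  # halfway -> 0.0005
print(linear_lr(240))  # end of training -> 0.0
```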

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.9472        | 1.0   | 6    | 2.2628          |
| 2.0366        | 2.0   | 12   | 1.7642          |
| 1.5645        | 3.0   | 18   | 1.3854          |
| 1.2543        | 4.0   | 24   | 1.1356          |
| 1.0465        | 5.0   | 30   | 1.0000          |
| 0.9275        | 6.0   | 36   | 0.8717          |
| 0.8314        | 7.0   | 42   | 0.7647          |
| 0.7834        | 8.0   | 48   | 0.7636          |
| 0.7211        | 9.0   | 54   | 0.6759          |
| 0.6548        | 10.0  | 60   | 0.6257          |
| 0.6005        | 11.0  | 66   | 0.6371          |
| 0.6133        | 12.0  | 72   | 0.6054          |
| 0.5681        | 13.0  | 78   | 0.5610          |
| 0.5576        | 14.0  | 84   | 0.5441          |
| 0.5279        | 15.0  | 90   | 0.4975          |
| 0.4981        | 16.0  | 96   | 0.4773          |
| 0.4508        | 17.0  | 102  | 0.4559          |
| 0.4491        | 18.0  | 108  | 0.4150          |
| 0.4122        | 19.0  | 114  | 0.4092          |
| 0.3865        | 20.0  | 120  | 0.3901          |
| 0.3667        | 21.0  | 126  | 0.3332          |
| 0.3287        | 22.0  | 132  | 0.3279          |
| 0.3177        | 23.0  | 138  | 0.2912          |
| 0.2865        | 24.0  | 144  | 0.2619          |
| 0.2658        | 25.0  | 150  | 0.2492          |
| 0.2669        | 26.0  | 156  | 0.2297          |
| 0.2491        | 27.0  | 162  | 0.2012          |
| 0.2140        | 28.0  | 168  | 0.1816          |
| 0.2039        | 29.0  | 174  | 0.1664          |
| 0.1917        | 30.0  | 180  | 0.1593          |
| 0.1792        | 31.0  | 186  | 0.1406          |
| 0.1558        | 32.0  | 192  | 0.1353          |
| 0.1507        | 33.0  | 198  | 0.1392          |
| 0.1538        | 34.0  | 204  | 0.1245          |
| 0.1550        | 35.0  | 210  | 0.1156          |
| 0.1298        | 36.0  | 216  | 0.1170          |
| 0.1377        | 37.0  | 222  | 0.1142          |
| 0.1428        | 38.0  | 228  | 0.1121          |
| 0.1332        | 39.0  | 234  | 0.1076          |
| 0.1255        | 40.0  | 240  | 0.1067          |
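
A couple of figures can be derived from the table and hyperparameters above. This is a back-of-the-envelope check introduced here, not part of the original card; the dataset-size figure is an upper bound inferred from 6 optimizer steps per epoch at batch size 512.

```python
# Derived figures from the results table and hyperparameters above.
first_val, last_val = 2.2628, 0.1067  # validation loss at epochs 1 and 40
reduction = 1 - last_val / first_val
print(f"validation loss reduced by {reduction:.1%}")  # -> 95.3%

steps_per_epoch = 6
train_batch_size = 512
# Upper bound on training-set size (the last batch may be partial).
max_train_examples = steps_per_epoch * train_batch_size
print(max_train_examples)  # -> 3072
```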

Framework versions

  • Transformers 5.0.0
  • PyTorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2