calculator_model_test

This model is a fine-tuned version of an unnamed base model (neither the base model nor the dataset is recorded in this card). It achieves the following results on the evaluation set:

  • Loss: 0.0010

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 80
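The step counts in the results table can be cross-checked against these hyperparameters: the final recorded step (1040) over 80 epochs gives 13 optimizer steps per epoch, which at a train batch size of 512 bounds the training-set size even though the card does not state it. A minimal sketch of that arithmetic (assuming no gradient accumulation):

```python
# Values taken from the hyperparameters and training-results sections.
train_batch_size = 512
num_epochs = 80
final_step = 1040  # step logged at epoch 80.0

# Optimizer steps per epoch, assuming one step per batch.
steps_per_epoch = final_step // num_epochs  # 1040 / 80 = 13

# Since steps_per_epoch = ceil(n_examples / batch_size), the training-set
# size is only bounded, not determined, by the card:
max_examples = steps_per_epoch * train_batch_size            # 6656
min_examples = (steps_per_epoch - 1) * train_batch_size + 1  # 6145

print(steps_per_epoch, min_examples, max_examples)  # 13 6145 6656
```

So the training set contains somewhere between 6145 and 6656 examples under these assumptions.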

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.4893 | 1.0 | 13 | 1.6643 |
| 1.2679 | 2.0 | 26 | 0.8866 |
| 0.8178 | 3.0 | 39 | 0.7130 |
| 0.6528 | 4.0 | 52 | 0.5806 |
| 0.5426 | 5.0 | 65 | 0.5071 |
| 0.5094 | 6.0 | 78 | 0.4677 |
| 0.4514 | 7.0 | 91 | 0.3869 |
| 0.3829 | 8.0 | 104 | 0.3313 |
| 0.3274 | 9.0 | 117 | 0.2774 |
| 0.2904 | 10.0 | 130 | 0.2338 |
| 0.2507 | 11.0 | 143 | 0.1997 |
| 0.2193 | 12.0 | 156 | 0.1843 |
| 0.2027 | 13.0 | 169 | 0.1613 |
| 0.1787 | 14.0 | 182 | 0.1374 |
| 0.1596 | 15.0 | 195 | 0.1373 |
| 0.1485 | 16.0 | 208 | 0.1148 |
| 0.1350 | 17.0 | 221 | 0.1013 |
| 0.1191 | 18.0 | 234 | 0.0988 |
| 0.1056 | 19.0 | 247 | 0.0762 |
| 0.0920 | 20.0 | 260 | 0.0643 |
| 0.0819 | 21.0 | 273 | 0.0607 |
| 0.0733 | 22.0 | 286 | 0.0524 |
| 0.0633 | 23.0 | 299 | 0.0388 |
| 0.0531 | 24.0 | 312 | 0.0313 |
| 0.0461 | 25.0 | 325 | 0.0283 |
| 0.0406 | 26.0 | 338 | 0.0226 |
| 0.0315 | 27.0 | 351 | 0.0156 |
| 0.0245 | 28.0 | 364 | 0.0138 |
| 0.0203 | 29.0 | 377 | 0.0112 |
| 0.0165 | 30.0 | 390 | 0.0095 |
| 0.0145 | 31.0 | 403 | 0.0081 |
| 0.0127 | 32.0 | 416 | 0.0070 |
| 0.0105 | 33.0 | 429 | 0.0062 |
| 0.0098 | 34.0 | 442 | 0.0056 |
| 0.0087 | 35.0 | 455 | 0.0047 |
| 0.0079 | 36.0 | 468 | 0.0044 |
| 0.0067 | 37.0 | 481 | 0.0041 |
| 0.0065 | 38.0 | 494 | 0.0039 |
| 0.0060 | 39.0 | 507 | 0.0034 |
| 0.0053 | 40.0 | 520 | 0.0031 |
| 0.0049 | 41.0 | 533 | 0.0029 |
| 0.0046 | 42.0 | 546 | 0.0027 |
| 0.0042 | 43.0 | 559 | 0.0026 |
| 0.0039 | 44.0 | 572 | 0.0022 |
| 0.0041 | 45.0 | 585 | 0.0025 |
| 0.0037 | 46.0 | 598 | 0.0021 |
| 0.0036 | 47.0 | 611 | 0.0022 |
| 0.0035 | 48.0 | 624 | 0.0019 |
| 0.0031 | 49.0 | 637 | 0.0019 |
| 0.0032 | 50.0 | 650 | 0.0019 |
| 0.0029 | 51.0 | 663 | 0.0017 |
| 0.0026 | 52.0 | 676 | 0.0016 |
| 0.0024 | 53.0 | 689 | 0.0015 |
| 0.0026 | 54.0 | 702 | 0.0015 |
| 0.0023 | 55.0 | 715 | 0.0016 |
| 0.0024 | 56.0 | 728 | 0.0014 |
| 0.0023 | 57.0 | 741 | 0.0014 |
| 0.0023 | 58.0 | 754 | 0.0013 |
| 0.0021 | 59.0 | 767 | 0.0013 |
| 0.0020 | 60.0 | 780 | 0.0011 |
| 0.0019 | 61.0 | 793 | 0.0012 |
| 0.0020 | 62.0 | 806 | 0.0012 |
| 0.0018 | 63.0 | 819 | 0.0011 |
| 0.0017 | 64.0 | 832 | 0.0011 |
| 0.0017 | 65.0 | 845 | 0.0011 |
| 0.0018 | 66.0 | 858 | 0.0010 |
| 0.0016 | 67.0 | 871 | 0.0010 |
| 0.0017 | 68.0 | 884 | 0.0010 |
| 0.0016 | 69.0 | 897 | 0.0010 |
| 0.0014 | 70.0 | 910 | 0.0010 |
| 0.0016 | 71.0 | 923 | 0.0010 |
| 0.0016 | 72.0 | 936 | 0.0011 |
| 0.0015 | 73.0 | 949 | 0.0010 |
| 0.0015 | 74.0 | 962 | 0.0011 |
| 0.0014 | 75.0 | 975 | 0.0010 |
| 0.0014 | 76.0 | 988 | 0.0010 |
| 0.0014 | 77.0 | 1001 | 0.0010 |
| 0.0013 | 78.0 | 1014 | 0.0010 |
| 0.0014 | 79.0 | 1027 | 0.0010 |
| 0.0014 | 80.0 | 1040 | 0.0010 |

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Model details

  • Format: Safetensors
  • Model size: 7.8M params
  • Tensor type: F32