calculator_model_test

This model is a fine-tuned version of zzox531/calculator_model_test on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0566

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 100
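The hyperparameters above, together with the results table below, let one back out the approximate training-set size: 5 optimizer steps per epoch at a train batch size of 512 implies roughly 2,560 training examples, assuming one optimizer step per batch (the card does not mention gradient accumulation). A minimal sketch:

```python
# Back-of-the-envelope check of the training setup described above.
# Assumes no gradient accumulation (one optimizer step per batch).
train_batch_size = 512
steps_per_epoch = 5   # from the results table: step 5 at epoch 1.0
num_epochs = 100

total_steps = steps_per_epoch * num_epochs            # 500, matching the last table row
approx_train_examples = steps_per_epoch * train_batch_size

print(total_steps)
print(approx_train_examples)
```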

Training results

Training Loss Epoch Step Validation Loss
0.1807 1.0 5 0.1524
0.1720 2.0 10 0.1506
0.1684 3.0 15 0.1488
0.1650 4.0 20 0.1413
0.1600 5.0 25 0.1404
0.1596 6.0 30 0.1395
0.1540 7.0 35 0.1335
0.1535 8.0 40 0.1309
0.1494 9.0 45 0.1302
0.1469 10.0 50 0.1270
0.1438 11.0 55 0.1246
0.1441 12.0 60 0.1232
0.1406 13.0 65 0.1215
0.1368 14.0 70 0.1161
0.1340 15.0 75 0.1152
0.1309 16.0 80 0.1125
0.1293 17.0 85 0.1114
0.1266 18.0 90 0.1087
0.1259 19.0 95 0.1065
0.1229 20.0 100 0.1084
0.1215 21.0 105 0.1039
0.1193 22.0 110 0.1020
0.1180 23.0 115 0.1019
0.1147 24.0 120 0.1017
0.1163 25.0 125 0.0958
0.1137 26.0 130 0.0959
0.1094 27.0 135 0.0936
0.1099 28.0 140 0.0919
0.1095 29.0 145 0.0913
0.1076 30.0 150 0.0901
0.1060 31.0 155 0.0875
0.1057 32.0 160 0.0878
0.1054 33.0 165 0.0876
0.1050 34.0 170 0.0886
0.1032 35.0 175 0.0841
0.1018 36.0 180 0.0843
0.1040 37.0 185 0.0816
0.0998 38.0 190 0.0847
0.0988 39.0 195 0.0805
0.1018 40.0 200 0.0806
0.0992 41.0 205 0.0800
0.0998 42.0 210 0.0780
0.0970 43.0 215 0.0766
0.0948 44.0 220 0.0770
0.0931 45.0 225 0.0745
0.0934 46.0 230 0.0768
0.0924 47.0 235 0.0754
0.0906 48.0 240 0.0734
0.0919 49.0 245 0.0754
0.0897 50.0 250 0.0712
0.0889 51.0 255 0.0721
0.0882 52.0 260 0.0697
0.0881 53.0 265 0.0718
0.0859 54.0 270 0.0690
0.0864 55.0 275 0.0700
0.0844 56.0 280 0.0681
0.0829 57.0 285 0.0686
0.0855 58.0 290 0.0668
0.0817 59.0 295 0.0671
0.0821 60.0 300 0.0668
0.0812 61.0 305 0.0653
0.0800 62.0 310 0.0648
0.0797 63.0 315 0.0650
0.0794 64.0 320 0.0655
0.0792 65.0 325 0.0643
0.0796 66.0 330 0.0641
0.0770 67.0 335 0.0631
0.0796 68.0 340 0.0635
0.0784 69.0 345 0.0626
0.0767 70.0 350 0.0616
0.0768 71.0 355 0.0615
0.0738 72.0 360 0.0609
0.0745 73.0 365 0.0617
0.0784 74.0 370 0.0614
0.0764 75.0 375 0.0613
0.0744 76.0 380 0.0618
0.0750 77.0 385 0.0602
0.0754 78.0 390 0.0599
0.0729 79.0 395 0.0602
0.0727 80.0 400 0.0589
0.0732 81.0 405 0.0597
0.0756 82.0 410 0.0588
0.0722 83.0 415 0.0580
0.0703 84.0 420 0.0592
0.0739 85.0 425 0.0582
0.0705 86.0 430 0.0579
0.0716 87.0 435 0.0584
0.0703 88.0 440 0.0582
0.0707 89.0 445 0.0576
0.0707 90.0 450 0.0574
0.0717 91.0 455 0.0574
0.0712 92.0 460 0.0572
0.0707 93.0 465 0.0572
0.0691 94.0 470 0.0570
0.0694 95.0 475 0.0567
0.0690 96.0 480 0.0567
0.0703 97.0 485 0.0567
0.0698 98.0 490 0.0566
0.0695 99.0 495 0.0566
0.0696 100.0 500 0.0566
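The trajectory above can be summarized in one line; the two values below are transcribed from the first and last rows of the table:

```python
# Summarize the validation-loss trajectory from the results table above.
first_val_loss = 0.1524   # epoch 1
final_val_loss = 0.0566   # epoch 100

reduction = 1 - final_val_loss / first_val_loss
print(f"Validation loss fell by {reduction:.1%} over 100 epochs")
```

Validation loss decreases almost monotonically throughout, suggesting the model had not yet overfit at 100 epochs.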

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Model details

  • Downloads last month: 59
  • Model size: 7.8M parameters (Safetensors, F32)