calculator_model_test

This model is a fine-tuned version of ReadHegel/calculator_model_test (the card's base-model field points back to this repository, so the true base checkpoint is unclear), trained on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0006
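
A minimal inference sketch is shown below. The checkpoint's architecture is not documented in this card; the sketch assumes a standard sequence-to-sequence model that maps an arithmetic expression to its result (as the repository name suggests), and the example input and generation settings are illustrative, not taken from the original training setup.

```python
# Minimal loading/inference sketch. Assumes a seq2seq checkpoint;
# adjust the Auto class if the actual architecture differs.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "ReadHegel/calculator_model_test"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Illustrative input; the expected input format is not documented.
inputs = tokenizer("21+21", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```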

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (mirrored in the configuration sketch after the list):

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 250
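
For reference, here is a hedged TrainingArguments sketch that mirrors the values above; it assumes the standard Hugging Face Trainer API, and output_dir plus the surrounding Trainer/model/dataset wiring are placeholders, not details from the original run.

```python
# Reconstruction of the listed hyperparameters; not the original script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="calculator_model_test",  # placeholder path
    learning_rate=1e-3,
    per_device_train_batch_size=512,
    per_device_eval_batch_size=512,
    seed=42,
    optim="adamw_torch_fused",           # AdamW, fused torch implementation
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=250,
    eval_strategy="epoch",               # the log reports validation loss once per epoch
)
```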

Training results

Training Loss Epoch Step Validation Loss
0.6830 1.0 6 0.3474
0.2905 2.0 12 0.1705
0.1633 3.0 18 0.1282
0.1346 4.0 24 0.1065
0.1074 5.0 30 0.0834
0.1010 6.0 36 0.0615
0.1016 7.0 42 0.0634
0.0770 8.0 48 0.0620
0.0729 9.0 54 0.0903
0.0853 10.0 60 0.0888
0.1005 11.0 66 0.1555
0.1295 12.0 72 0.1269
0.1342 13.0 78 0.0783
0.0937 14.0 84 0.0535
0.0596 15.0 90 0.0485
0.0538 16.0 96 0.0271
0.0479 17.0 102 0.0285
0.0511 18.0 108 0.0340
0.0576 19.0 114 0.0345
0.0502 20.0 120 0.0395
0.0462 21.0 126 0.0349
0.0474 22.0 132 0.0225
0.0341 23.0 138 0.0262
0.0328 24.0 144 0.0279
0.0378 25.0 150 0.0220
0.0421 26.0 156 0.0329
0.0383 27.0 162 0.0240
0.0371 28.0 168 0.0276
0.0297 29.0 174 0.0191
0.0283 30.0 180 0.0153
0.0350 31.0 186 0.0285
0.0348 32.0 192 0.0288
0.0389 33.0 198 0.0555
0.0441 34.0 204 0.0325
0.0454 35.0 210 0.0309
0.0403 36.0 216 0.0220
0.0374 37.0 222 0.0230
0.0395 38.0 228 0.0336
0.0417 39.0 234 0.0222
0.0382 40.0 240 0.0365
0.0341 41.0 246 0.0172
0.0323 42.0 252 0.0187
0.0364 43.0 258 0.0184
0.0388 44.0 264 0.0177
0.0324 45.0 270 0.0125
0.0270 46.0 276 0.0138
0.0270 47.0 282 0.0169
0.0255 48.0 288 0.0147
0.0286 49.0 294 0.0207
0.0255 50.0 300 0.0156
0.0176 51.0 306 0.0191
0.0223 52.0 312 0.0166
0.0174 53.0 318 0.0087
0.0130 54.0 324 0.0078
0.0116 55.0 330 0.0070
0.0097 56.0 336 0.0054
0.0098 57.0 342 0.0048
0.0074 58.0 348 0.0063
0.0081 59.0 354 0.0044
0.0070 60.0 360 0.0042
0.0068 61.0 366 0.0048
0.0065 62.0 372 0.0060
0.0058 63.0 378 0.0037
0.0058 64.0 384 0.0036
0.0049 65.0 390 0.0031
0.0048 66.0 396 0.0034
0.0041 67.0 402 0.0032
0.0041 68.0 408 0.0027
0.0042 69.0 414 0.0029
0.0036 70.0 420 0.0022
0.0039 71.0 426 0.0018
0.0032 72.0 432 0.0031
0.0039 73.0 438 0.0023
0.0050 74.0 444 0.0029
0.0091 75.0 450 0.0039
0.0069 76.0 456 0.0032
0.0080 77.0 462 0.0049
0.0103 78.0 468 0.0066
0.0109 79.0 474 0.0098
0.0082 80.0 480 0.0057
0.0110 81.0 486 0.0051
0.0109 82.0 492 0.0088
0.0175 83.0 498 0.0069
0.0161 84.0 504 0.0120
0.0178 85.0 510 0.0073
0.0188 86.0 516 0.0124
0.0138 87.0 522 0.0041
0.0079 88.0 528 0.0047
0.0089 89.0 534 0.0052
0.0133 90.0 540 0.0319
0.0211 91.0 546 0.0221
0.0208 92.0 552 0.0059
0.0126 93.0 558 0.0101
0.0136 94.0 564 0.0087
0.0103 95.0 570 0.0056
0.0106 96.0 576 0.0053
0.0145 97.0 582 0.0080
0.0125 98.0 588 0.0057
0.0119 99.0 594 0.0066
0.0087 100.0 600 0.0062
0.0088 101.0 606 0.0059
0.0086 102.0 612 0.0045
0.0069 103.0 618 0.0042
0.0098 104.0 624 0.0052
0.0075 105.0 630 0.0039
0.0071 106.0 636 0.0033
0.0107 107.0 642 0.0051
0.0074 108.0 648 0.0054
0.0077 109.0 654 0.0047
0.0120 110.0 660 0.0103
0.0087 111.0 666 0.0031
0.0076 112.0 672 0.0046
0.0061 113.0 678 0.0049
0.0056 114.0 684 0.0031
0.0046 115.0 690 0.0025
0.0044 116.0 696 0.0021
0.0026 117.0 702 0.0019
0.0032 118.0 708 0.0017
0.0062 119.0 714 0.0087
0.0172 120.0 720 0.0040
0.0089 121.0 726 0.0058
0.0091 122.0 732 0.0053
0.0150 123.0 738 0.0026
0.0091 124.0 744 0.0046
0.0113 125.0 750 0.0044
0.0080 126.0 756 0.0046
0.0135 127.0 762 0.0097
0.0224 128.0 768 0.0081
0.0214 129.0 774 0.0107
0.0159 130.0 780 0.0122
0.0179 131.0 786 0.0045
0.0148 132.0 792 0.0074
0.0139 133.0 798 0.0030
0.0072 134.0 804 0.0052
0.0065 135.0 810 0.0029
0.0037 136.0 816 0.0021
0.0031 137.0 822 0.0016
0.0027 138.0 828 0.0015
0.0020 139.0 834 0.0014
0.0028 140.0 840 0.0014
0.0017 141.0 846 0.0014
0.0021 142.0 852 0.0013
0.0020 143.0 858 0.0014
0.0016 144.0 864 0.0013
0.0021 145.0 870 0.0013
0.0015 146.0 876 0.0012
0.0015 147.0 882 0.0011
0.0014 148.0 888 0.0012
0.0014 149.0 894 0.0012
0.0014 150.0 900 0.0012
0.0012 151.0 906 0.0011
0.0018 152.0 912 0.0010
0.0015 153.0 918 0.0009
0.0017 154.0 924 0.0010
0.0024 155.0 930 0.0034
0.0018 156.0 936 0.0013
0.0013 157.0 942 0.0010
0.0012 158.0 948 0.0010
0.0010 159.0 954 0.0010
0.0009 160.0 960 0.0009
0.0008 161.0 966 0.0009
0.0009 162.0 972 0.0009
0.0008 163.0 978 0.0009
0.0007 164.0 984 0.0008
0.0009 165.0 990 0.0008
0.0008 166.0 996 0.0008
0.0009 167.0 1002 0.0009
0.0009 168.0 1008 0.0009
0.0008 169.0 1014 0.0009
0.0012 170.0 1020 0.0009
0.0008 171.0 1026 0.0009
0.0008 172.0 1032 0.0009
0.0007 173.0 1038 0.0008
0.0007 174.0 1044 0.0007
0.0006 175.0 1050 0.0007
0.0007 176.0 1056 0.0007
0.0005 177.0 1062 0.0007
0.0006 178.0 1068 0.0007
0.0007 179.0 1074 0.0007
0.0005 180.0 1080 0.0007
0.0006 181.0 1086 0.0007
0.0005 182.0 1092 0.0006
0.0005 183.0 1098 0.0006
0.0005 184.0 1104 0.0006
0.0007 185.0 1110 0.0006
0.0005 186.0 1116 0.0007
0.0006 187.0 1122 0.0007
0.0005 188.0 1128 0.0007
0.0005 189.0 1134 0.0007
0.0006 190.0 1140 0.0007
0.0006 191.0 1146 0.0007
0.0005 192.0 1152 0.0007
0.0004 193.0 1158 0.0007
0.0004 194.0 1164 0.0007
0.0006 195.0 1170 0.0007
0.0006 196.0 1176 0.0007
0.0004 197.0 1182 0.0007
0.0006 198.0 1188 0.0007
0.0006 199.0 1194 0.0007
0.0005 200.0 1200 0.0007
0.0006 201.0 1206 0.0007
0.0005 202.0 1212 0.0006
0.0005 203.0 1218 0.0006
0.0006 204.0 1224 0.0007
0.0004 205.0 1230 0.0007
0.0004 206.0 1236 0.0007
0.0006 207.0 1242 0.0007
0.0004 208.0 1248 0.0007
0.0005 209.0 1254 0.0006
0.0004 210.0 1260 0.0006
0.0004 211.0 1266 0.0006
0.0004 212.0 1272 0.0006
0.0004 213.0 1278 0.0006
0.0005 214.0 1284 0.0006
0.0005 215.0 1290 0.0006
0.0004 216.0 1296 0.0006
0.0005 217.0 1302 0.0006
0.0005 218.0 1308 0.0006
0.0005 219.0 1314 0.0006
0.0004 220.0 1320 0.0006
0.0004 221.0 1326 0.0006
0.0004 222.0 1332 0.0006
0.0005 223.0 1338 0.0006
0.0004 224.0 1344 0.0005
0.0004 225.0 1350 0.0006
0.0003 226.0 1356 0.0006
0.0004 227.0 1362 0.0006
0.0004 228.0 1368 0.0006
0.0004 229.0 1374 0.0006
0.0005 230.0 1380 0.0006
0.0005 231.0 1386 0.0006
0.0004 232.0 1392 0.0006
0.0005 233.0 1398 0.0006
0.0004 234.0 1404 0.0006
0.0004 235.0 1410 0.0006
0.0005 236.0 1416 0.0006
0.0004 237.0 1422 0.0006
0.0004 238.0 1428 0.0006
0.0003 239.0 1434 0.0006
0.0004 240.0 1440 0.0006
0.0006 241.0 1446 0.0006
0.0003 242.0 1452 0.0006
0.0003 243.0 1458 0.0006
0.0005 244.0 1464 0.0006
0.0003 245.0 1470 0.0006
0.0004 246.0 1476 0.0006
0.0004 247.0 1482 0.0006
0.0005 248.0 1488 0.0006
0.0006 249.0 1494 0.0006
0.0004 250.0 1500 0.0006
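
The table shows 6 optimizer steps per epoch; at a batch size of 512, that puts the training split at no more than 6 × 512 = 3,072 examples. Since both losses span three orders of magnitude, a log-scale plot is the easiest way to eyeball convergence; the sketch below assumes the table has been exported to a CSV named training_log.csv with columns train_loss, epoch, and val_loss (a hypothetical file, not part of this repository).

```python
# Plot the training curves from a hypothetical CSV export of the table above.
import matplotlib.pyplot as plt
import pandas as pd

log = pd.read_csv("training_log.csv")  # assumed export; not shipped with the model
plt.plot(log["epoch"], log["train_loss"], label="training loss")
plt.plot(log["epoch"], log["val_loss"], label="validation loss")
plt.yscale("log")  # losses fall from ~0.68 to ~0.0006
plt.xlabel("epoch")
plt.ylabel("loss")
plt.legend()
plt.show()
```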

Framework versions

  • Transformers 5.0.0
  • PyTorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2

Model size

  • 7.8M parameters
  • Tensor type: F32 (safetensors)