calculator_model_test

This model is a fine-tuned version of an unspecified base model, trained on an unspecified dataset (the card records neither). It achieves the following results on the evaluation set:

  • Loss: 0.5945
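
If the reported loss is a mean per-token cross-entropy in nats (an assumption; the card does not state the loss function), it can be converted to a perplexity for easier interpretation. A minimal sketch:

```python
import math

eval_loss = 0.5945  # final evaluation loss reported on the card

# Assumption: the loss is a mean per-token cross-entropy measured in nats.
# Under that assumption, perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity = {perplexity:.3f}")
```

If the loss is something else (for example a regression loss), this conversion does not apply.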

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
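
The hyperparameters above map roughly onto keyword arguments of `transformers.TrainingArguments`. The following is a hedged reconstruction, not the author's actual training script: `output_dir` is an assumption (taken from the card title), `build_training_arguments` is a hypothetical helper, and the batch sizes are assumed to be per-device values on a single device.

```python
# Hypothetical reconstruction of the training configuration listed on the card.
# Keys follow the keyword names of transformers.TrainingArguments.
TRAINING_KWARGS = {
    "output_dir": "calculator_model_test",  # assumption: matches the card title
    "learning_rate": 1e-3,
    "per_device_train_batch_size": 512,  # card: train_batch_size (assumes one device)
    "per_device_eval_batch_size": 512,   # card: eval_batch_size
    "seed": 42,
    "optim": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 40,
}


def build_training_arguments(**overrides):
    """Build TrainingArguments from the kwargs above (needs transformers and torch)."""
    # Imported lazily so the plain dict stays usable without the library installed.
    from transformers import TrainingArguments

    return TrainingArguments(**{**TRAINING_KWARGS, **overrides})
```

Settings the card does not list (warmup, weight decay, evaluation strategy, and so on) are left at the library defaults here, which may differ from what was actually used.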

Training results

Training Loss | Epoch | Step | Validation Loss
3.4886        | 1.0   | 5    | 2.9077
2.6030        | 2.0   | 10   | 2.1443
1.9775        | 3.0   | 15   | 1.7360
1.6903        | 4.0   | 20   | 1.5977
1.5862        | 5.0   | 25   | 1.5620
1.5460        | 6.0   | 30   | 1.5337
1.5075        | 7.0   | 35   | 1.4873
1.4838        | 8.0   | 40   | 1.4564
1.4496        | 9.0   | 45   | 1.4431
1.4041        | 10.0  | 50   | 1.3911
1.3950        | 11.0  | 55   | 1.3940
1.3421        | 12.0  | 60   | 1.3259
1.3013        | 13.0  | 65   | 1.2902
1.2550        | 14.0  | 70   | 1.2079
1.1784        | 15.0  | 75   | 1.1146
1.1135        | 16.0  | 80   | 1.0629
1.0583        | 17.0  | 85   | 1.0104
1.0198        | 18.0  | 90   | 0.9646
0.9765        | 19.0  | 95   | 0.9315
0.9386        | 20.0  | 100  | 0.8994
0.9116        | 21.0  | 105  | 0.8690
0.8871        | 22.0  | 110  | 0.8332
0.8702        | 23.0  | 115  | 0.8674
0.8441        | 24.0  | 120  | 0.7941
0.8180        | 25.0  | 125  | 0.7898
0.8022        | 26.0  | 130  | 0.7690
0.7880        | 27.0  | 135  | 0.7386
0.7739        | 28.0  | 140  | 0.7302
0.7618        | 29.0  | 145  | 0.7090
0.7451        | 30.0  | 150  | 0.7043
0.7340        | 31.0  | 155  | 0.6951
0.7240        | 32.0  | 160  | 0.6731
0.7076        | 33.0  | 165  | 0.6516
0.6951        | 34.0  | 170  | 0.6469
0.6846        | 35.0  | 175  | 0.6319
0.6737        | 36.0  | 180  | 0.6170
0.6601        | 37.0  | 185  | 0.6103
0.6558        | 38.0  | 190  | 0.6016
0.6509        | 39.0  | 195  | 0.5963
0.6435        | 40.0  | 200  | 0.5945
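
The step counts in the results above allow a little arithmetic. With 200 optimizer steps over 40 epochs there are 5 steps per epoch, which at a batch size of 512 bounds the training set size. The sketch below works this out and also evaluates a linear learning-rate decay at a few epoch boundaries. It assumes a single device, no gradient accumulation, a final partial batch that is kept, and zero warmup steps; none of these are stated on the card, and `linear_lr` is an illustrative helper rather than the library's scheduler.

```python
LEARNING_RATE = 1e-3
BATCH_SIZE = 512
NUM_EPOCHS = 40
TOTAL_STEPS = 200  # last Step value in the results above

steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS

# ceil(n / BATCH_SIZE) == steps_per_epoch holds for n in this closed range.
min_examples = (steps_per_epoch - 1) * BATCH_SIZE + 1
max_examples = steps_per_epoch * BATCH_SIZE


def linear_lr(step, warmup_steps=0):
    """Linear warmup followed by linear decay to zero at TOTAL_STEPS."""
    if step < warmup_steps:
        return LEARNING_RATE * step / max(1, warmup_steps)
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - warmup_steps)


print(f"steps per epoch: {steps_per_epoch}")
print(f"training examples: between {min_examples} and {max_examples}")
for epoch in (0, 10, 20, 30, 40):
    print(f"epoch {epoch:2d}: lr = {linear_lr(epoch * steps_per_epoch):.6f}")
```

Under these assumptions the training set holds between 2049 and 2560 examples, and the learning rate has fallen to half its initial value by step 100 (epoch 20).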

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cpu
  • Datasets 4.0.0
  • Tokenizers 0.22.2
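
To check whether a local environment matches the versions listed above, something like the following could be used. It is a minimal sketch using only the standard library; the distribution names (`torch` for Pytorch, and so on) are the usual PyPI names, and `compare_versions` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions as listed on the card; the "+cpu" build tag is kept as written.
CARD_VERSIONS = {
    "transformers": "5.0.0",
    "torch": "2.10.0+cpu",
    "datasets": "4.0.0",
    "tokenizers": "0.22.2",
}


def compare_versions(expected=CARD_VERSIONS):
    """Map each package to (expected, installed or None, exact match)."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    for package, (wanted, installed, ok) in compare_versions().items():
        status = "ok" if ok else "differs"
        print(f"{package}: card={wanted} installed={installed} [{status}]")
```

The comparison is an exact string match, so a different build tag (for example a CUDA build of torch) is reported as a difference even when the release number is the same.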

Model size

  • Parameters: 7.8M
  • Tensor type: F32
  • Format: Safetensors