calculator_model_test

This model is a fine-tuned version of an unspecified base model on the None dataset. It achieves the following results on the evaluation set:

  • Loss: 1.5025

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
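As a rough sketch, the hyperparameters above map onto a Hugging Face `TrainingArguments` configuration like the following. This is a reconstruction, not the author's actual script; the `output_dir` name and the per-epoch evaluation strategy are assumptions (the latter inferred from the one-validation-loss-per-epoch table below).

```python
from transformers import TrainingArguments

# Sketch of the configuration implied by the hyperparameter list.
# output_dir is an assumption, not taken from the source.
args = TrainingArguments(
    output_dir="calculator_model_test",
    learning_rate=1e-3,
    per_device_train_batch_size=512,
    per_device_eval_batch_size=512,
    seed=42,
    optim="adamw_torch_fused",   # fused AdamW, as reported
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=40,
    eval_strategy="epoch",       # assumption: table reports one validation loss per epoch
)
```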

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.4213        | 1.0   | 6    | 2.7742          |
| 2.3873        | 2.0   | 12   | 1.9848          |
| 1.8855        | 3.0   | 18   | 1.6983          |
| 1.6818        | 4.0   | 24   | 1.5908          |
| 1.5942        | 5.0   | 30   | 1.5682          |
| 1.5581        | 6.0   | 36   | 1.5332          |
| 1.5573        | 7.0   | 42   | 1.5375          |
| 1.5423        | 8.0   | 48   | 1.5241          |
| 1.5435        | 9.0   | 54   | 1.5207          |
| 1.5402        | 10.0  | 60   | 1.5169          |
| 1.5330        | 11.0  | 66   | 1.5100          |
| 1.5252        | 12.0  | 72   | 1.5091          |
| 1.5234        | 13.0  | 78   | 1.5080          |
| 1.5284        | 14.0  | 84   | 1.5142          |
| 1.5221        | 15.0  | 90   | 1.5212          |
| 1.5305        | 16.0  | 96   | 1.5103          |
| 1.5199        | 17.0  | 102  | 1.5069          |
| 1.5174        | 18.0  | 108  | 1.5121          |
| 1.5104        | 19.0  | 114  | 1.5072          |
| 1.5197        | 20.0  | 120  | 1.5043          |
| 1.5101        | 21.0  | 126  | 1.5044          |
| 1.5145        | 22.0  | 132  | 1.5040          |
| 1.5226        | 23.0  | 138  | 1.5058          |
| 1.5299        | 24.0  | 144  | 1.5069          |
| 1.5290        | 25.0  | 150  | 1.5061          |
| 1.5070        | 26.0  | 156  | 1.5067          |
| 1.5077        | 27.0  | 162  | 1.5066          |
| 1.5240        | 28.0  | 168  | 1.5030          |
| 1.5344        | 29.0  | 174  | 1.5010          |
| 1.5148        | 30.0  | 180  | 1.5009          |
| 1.5182        | 31.0  | 186  | 1.5029          |
| 1.5158        | 32.0  | 192  | 1.5058          |
| 1.5093        | 33.0  | 198  | 1.5057          |
| 1.5149        | 34.0  | 204  | 1.5037          |
| 1.5098        | 35.0  | 210  | 1.5022          |
| 1.5095        | 36.0  | 216  | 1.5020          |
| 1.5085        | 37.0  | 222  | 1.5021          |
| 1.5091        | 38.0  | 228  | 1.5020          |
| 1.5020        | 39.0  | 234  | 1.5025          |
| 1.5103        | 40.0  | 240  | 1.5025          |
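The run above can be sanity-checked with a few lines of plain Python: 240 total steps over 40 epochs gives 6 optimizer steps per epoch, which at a train batch size of 512 implies at most 6 × 512 = 3072 training examples, and the best validation loss in the table is 1.5009 (epoch 30), slightly below the final 1.5025.

```python
# Sanity checks on the reported run; values are copied from the table above.
total_steps = 240
num_epochs = 40
train_batch_size = 512

steps_per_epoch = total_steps // num_epochs        # 6 steps per epoch
max_train_examples = steps_per_epoch * train_batch_size  # at most 3072 examples

# (epoch -> validation loss) for the best and final epochs.
val_losses = {29: 1.5010, 30: 1.5009, 38: 1.5020, 39: 1.5025, 40: 1.5025}
best_epoch = min(val_losses, key=val_losses.get)

print(steps_per_epoch)                      # 6
print(max_train_examples)                   # 3072
print(best_epoch, val_losses[best_epoch])   # 30 1.5009
```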

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Model size: 7.8M params (Safetensors, F32)