calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2019

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
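As an illustration of the `linear` scheduler named above, the following sketch computes the learning rate at a given optimizer step. It assumes no warmup and 240 total steps (6 steps per epoch × 40 epochs, matching the Step column in the results table); the function name and constants are hypothetical, not part of the training code.

```python
TOTAL_STEPS = 240   # 6 optimizer steps/epoch x 40 epochs (assumed, per the results table)
BASE_LR = 1e-3      # learning_rate hyperparameter above

def linear_lr(step: int, base_lr: float = BASE_LR, total_steps: int = TOTAL_STEPS) -> float:
    """Linearly decay the learning rate from base_lr to 0 over total_steps (no warmup)."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps

# linear_lr(0)   -> 0.001   (full learning rate at the start)
# linear_lr(120) -> 0.0005  (half the rate at the halfway point)
# linear_lr(240) -> 0.0     (decayed to zero at the final step)
```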

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.8957        | 1.0   | 6    | 2.2381          |
| 2.0146        | 2.0   | 12   | 1.7705          |
| 1.6197        | 3.0   | 18   | 1.4439          |
| 1.2979        | 4.0   | 24   | 1.2227          |
| 1.1103        | 5.0   | 30   | 1.0364          |
| 0.9847        | 6.0   | 36   | 0.9059          |
| 0.8452        | 7.0   | 42   | 0.8222          |
| 0.7736        | 8.0   | 48   | 0.7130          |
| 0.7047        | 9.0   | 54   | 0.7146          |
| 0.6887        | 10.0  | 60   | 0.6280          |
| 0.6119        | 11.0  | 66   | 0.5824          |
| 0.5802        | 12.0  | 72   | 0.5667          |
| 0.5312        | 13.0  | 78   | 0.5148          |
| 0.5067        | 14.0  | 84   | 0.4975          |
| 0.4947        | 15.0  | 90   | 0.4650          |
| 0.4716        | 16.0  | 96   | 0.4645          |
| 0.4669        | 17.0  | 102  | 0.4134          |
| 0.4348        | 18.0  | 108  | 0.4056          |
| 0.4105        | 19.0  | 114  | 0.4091          |
| 0.4158        | 20.0  | 120  | 0.3917          |
| 0.3862        | 21.0  | 126  | 0.3704          |
| 0.3753        | 22.0  | 132  | 0.3581          |
| 0.3624        | 23.0  | 138  | 0.3444          |
| 0.3557        | 24.0  | 144  | 0.3281          |
| 0.3335        | 25.0  | 150  | 0.3180          |
| 0.3240        | 26.0  | 156  | 0.3008          |
| 0.3181        | 27.0  | 162  | 0.2858          |
| 0.3069        | 28.0  | 168  | 0.2820          |
| 0.2978        | 29.0  | 174  | 0.2660          |
| 0.2866        | 30.0  | 180  | 0.2594          |
| 0.2688        | 31.0  | 186  | 0.2464          |
| 0.2642        | 32.0  | 192  | 0.2372          |
| 0.2520        | 33.0  | 198  | 0.2307          |
| 0.2484        | 34.0  | 204  | 0.2216          |
| 0.2392        | 35.0  | 210  | 0.2195          |
| 0.2458        | 36.0  | 216  | 0.2128          |
| 0.2513        | 37.0  | 222  | 0.2108          |
| 0.2287        | 38.0  | 228  | 0.2072          |
| 0.2331        | 39.0  | 234  | 0.2034          |
| 0.2344        | 40.0  | 240  | 0.2019          |
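Validation loss falls from 2.2381 after epoch 1 to 0.2019 after epoch 40. The relative reduction is simple arithmetic on the two table values:

```python
# Validation loss at epoch 1 and epoch 40, taken from the results table.
first, last = 2.2381, 0.2019
reduction = (first - last) / first
print(f"{reduction:.1%}")  # prints 91.0%
```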

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Model details

  • Model size: 7.8M params (Safetensors)
  • Tensor type: F32