calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset (neither is recorded in this card). It achieves the following results on the evaluation set:

  • Loss: 0.7120

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
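
The hyperparameters above map onto a `transformers.TrainingArguments` configuration roughly as follows. This is a sketch, not the card author's actual script: the `output_dir` is assumed, and the base model and dataset are not recorded in the card. The betas and epsilon listed above are the `TrainingArguments` defaults, so they are not set explicitly.

```python
# Hypothetical reconstruction of the training configuration from the
# hyperparameters listed in this card (config fragment, not a full script).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="calculator_model_test",  # assumed; not stated in the card
    learning_rate=1e-3,
    per_device_train_batch_size=512,
    per_device_eval_batch_size=512,
    seed=42,
    optim="adamw_torch_fused",       # AdamW, fused torch implementation
    lr_scheduler_type="linear",
    num_train_epochs=40,
)
```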

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 3.3426        | 1.0   | 6    | 2.7993          |
| 2.4209        | 2.0   | 12   | 1.9973          |
| 1.8734        | 3.0   | 18   | 1.7473          |
| 1.6692        | 4.0   | 24   | 1.6342          |
| 1.5799        | 5.0   | 30   | 1.6449          |
| 1.5671        | 6.0   | 36   | 1.5829          |
| 1.5089        | 7.0   | 42   | 1.5246          |
| 1.4829        | 8.0   | 48   | 1.5022          |
| 1.4327        | 9.0   | 54   | 1.4243          |
| 1.4399        | 10.0  | 60   | 1.3937          |
| 1.3700        | 11.0  | 66   | 1.3579          |
| 1.3430        | 12.0  | 72   | 1.2976          |
| 1.2724        | 13.0  | 78   | 1.2658          |
| 1.2386        | 14.0  | 84   | 1.1594          |
| 1.2057        | 15.0  | 90   | 1.2266          |
| 1.2069        | 16.0  | 96   | 1.3501          |
| 1.2408        | 17.0  | 102  | 1.1047          |
| 1.1625        | 18.0  | 108  | 1.1621          |
| 1.1029        | 19.0  | 114  | 1.1712          |
| 1.1209        | 20.0  | 120  | 1.0636          |
| 1.0304        | 21.0  | 126  | 0.9785          |
| 0.9679        | 22.0  | 132  | 0.9535          |
| 0.9591        | 23.0  | 138  | 0.8968          |
| 0.9017        | 24.0  | 144  | 0.8817          |
| 0.8773        | 25.0  | 150  | 0.9545          |
| 0.9173        | 26.0  | 156  | 1.0227          |
| 0.9503        | 27.0  | 162  | 0.8290          |
| 0.8785        | 28.0  | 168  | 0.8701          |
| 0.8594        | 29.0  | 174  | 0.8212          |
| 0.8462        | 30.0  | 180  | 0.8228          |
| 0.8191        | 31.0  | 186  | 0.8144          |
| 0.8301        | 32.0  | 192  | 0.7736          |
| 0.7794        | 33.0  | 198  | 0.7820          |
| 0.7795        | 34.0  | 204  | 0.7523          |
| 0.7806        | 35.0  | 210  | 0.7386          |
| 0.7463        | 36.0  | 216  | 0.7327          |
| 0.7594        | 37.0  | 222  | 0.7222          |
| 0.7774        | 38.0  | 228  | 0.7165          |
| 0.7488        | 39.0  | 234  | 0.7132          |
| 0.7370        | 40.0  | 240  | 0.7120          |

Framework versions

  • Transformers 5.0.0
  • PyTorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Model size

  • 7.8M parameters (F32, Safetensors format)