calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5538

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
  • mixed_precision_training: Native AMP
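
For reference, the sketch below shows how these hyperparameters would map onto Hugging Face TrainingArguments. This is not the original training script; the output directory and the evaluation/logging strategies are assumptions, and the model, tokenizer, and dataset are not documented in this card.

```python
# Hedged sketch: the listed hyperparameters expressed as TrainingArguments.
# output_dir and the eval/logging strategies are assumptions, not taken from the card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="calculator_model_test",   # assumed output/repo name
    learning_rate=1e-3,
    per_device_train_batch_size=512,
    per_device_eval_batch_size=512,
    seed=42,
    optim="adamw_torch_fused",            # AdamW with betas=(0.9, 0.999), eps=1e-08 (library defaults)
    lr_scheduler_type="linear",
    num_train_epochs=40,
    fp16=True,                            # native AMP mixed-precision training
    eval_strategy="epoch",                # assumed; the results table logs one evaluation per epoch
    logging_strategy="epoch",
)
```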

Training results

Training Loss | Epoch | Step | Validation Loss
------------- | ----- | ---- | ---------------
3.4511 | 1.0 | 5 | 2.8467
2.5181 | 2.0 | 10 | 2.1116
1.9303 | 3.0 | 15 | 1.7151
1.6577 | 4.0 | 20 | 1.5789
1.5651 | 5.0 | 25 | 1.5754
1.5127 | 6.0 | 30 | 1.5309
1.4740 | 7.0 | 35 | 1.4478
1.4122 | 8.0 | 40 | 1.4028
1.3682 | 9.0 | 45 | 1.3378
1.3127 | 10.0 | 50 | 1.2629
1.2188 | 11.0 | 55 | 1.1926
1.1528 | 12.0 | 60 | 1.1110
1.0918 | 13.0 | 65 | 1.0502
1.0482 | 14.0 | 70 | 1.0344
1.0041 | 15.0 | 75 | 0.9841
0.9944 | 16.0 | 80 | 0.9972
0.9652 | 17.0 | 85 | 0.9387
0.9474 | 18.0 | 90 | 0.9364
0.9464 | 19.0 | 95 | 0.8833
0.8842 | 20.0 | 100 | 0.8297
0.8439 | 21.0 | 105 | 0.8420
0.8259 | 22.0 | 110 | 0.8106
0.8101 | 23.0 | 115 | 0.7762
0.7815 | 24.0 | 120 | 0.7527
0.7651 | 25.0 | 125 | 0.7202
0.7371 | 26.0 | 130 | 0.7016
0.7205 | 27.0 | 135 | 0.6782
0.7045 | 28.0 | 140 | 0.6595
0.6867 | 29.0 | 145 | 0.6433
0.6672 | 30.0 | 150 | 0.6401
0.6627 | 31.0 | 155 | 0.6267
0.6433 | 32.0 | 160 | 0.6054
0.6396 | 33.0 | 165 | 0.6003
0.6211 | 34.0 | 170 | 0.5905
0.6128 | 35.0 | 175 | 0.5826
0.6134 | 36.0 | 180 | 0.5728
0.6001 | 37.0 | 185 | 0.5680
0.5967 | 38.0 | 190 | 0.5584
0.5884 | 39.0 | 195 | 0.5570
0.5887 | 40.0 | 200 | 0.5538

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cpu
  • Datasets 4.0.0
  • Tokenizers 0.22.2
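
The listed versions can be checked against a local environment with a quick snippet (a convenience sketch, not part of the original training setup):

```python
# Convenience sketch: confirm the local environment matches the versions listed above.
import datasets
import tokenizers
import torch
import transformers

print("Transformers:", transformers.__version__)  # card lists 5.0.0
print("PyTorch:", torch.__version__)              # card lists 2.10.0+cpu
print("Datasets:", datasets.__version__)          # card lists 4.0.0
print("Tokenizers:", tokenizers.__version__)      # card lists 0.22.2
```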

Model size

  • 7.8M parameters (Safetensors, F32 tensors)