calculator_model_test

This model is a fine-tuned version of an unspecified base model, trained on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2172
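Assuming the reported loss is a mean per-token cross-entropy (the usual case for Transformers seq2seq and language models; the card does not say), it can be converted to a perplexity for a rough sense of scale:

```python
import math

eval_loss = 0.2172  # final validation loss reported above

# Perplexity = exp(cross-entropy); only meaningful if the loss
# really is a per-token cross-entropy, which is an assumption here.
perplexity = math.exp(eval_loss)
print(f"perplexity ≈ {perplexity:.4f}")  # ≈ 1.2426
```

A perplexity near 1.24 would mean the model is, on average, close to certain about each output token, which is plausible for a narrow synthetic task like calculator-style arithmetic.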

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
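A minimal sketch of the linear learning-rate schedule implied by these settings. This is an assumption about the exact shape: it mirrors the common transformers-style linear schedule (optional linear warmup, then linear decay to zero), with `total_steps=240` taken from the final step in the training results and warmup assumed to be zero since none is listed:

```python
def linear_lr(step, base_lr=0.001, total_steps=240, warmup_steps=0):
    """Linear warmup followed by linear decay to zero (assumed schedule)."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

print(linear_lr(0))    # 0.001  (full learning rate at the start, no warmup assumed)
print(linear_lr(120))  # 0.0005 (halfway through training)
print(linear_lr(240))  # 0.0    (decayed to zero at the final step)
```

Under this schedule the later epochs train at a much smaller learning rate, which is consistent with the slow, steady loss improvements in the last ten epochs of the results table.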

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.0547 | 1.0 | 6 | 2.3235 |
| 2.0645 | 2.0 | 12 | 1.8699 |
| 1.6975 | 3.0 | 18 | 1.5582 |
| 1.4058 | 4.0 | 24 | 1.2875 |
| 1.1930 | 5.0 | 30 | 1.0846 |
| 1.0571 | 6.0 | 36 | 1.0523 |
| 0.9713 | 7.0 | 42 | 0.9177 |
| 0.8768 | 8.0 | 48 | 0.9016 |
| 0.8688 | 9.0 | 54 | 0.8003 |
| 0.7712 | 10.0 | 60 | 0.7546 |
| 0.7255 | 11.0 | 66 | 0.6924 |
| 0.6674 | 12.0 | 72 | 0.6362 |
| 0.6104 | 13.0 | 78 | 0.6404 |
| 0.5723 | 14.0 | 84 | 0.5483 |
| 0.5251 | 15.0 | 90 | 0.5152 |
| 0.5100 | 16.0 | 96 | 0.4806 |
| 0.4442 | 17.0 | 102 | 0.4510 |
| 0.4643 | 18.0 | 108 | 0.4321 |
| 0.4391 | 19.0 | 114 | 0.4565 |
| 0.4395 | 20.0 | 120 | 0.4186 |
| 0.4202 | 21.0 | 126 | 0.3869 |
| 0.4038 | 22.0 | 132 | 0.3577 |
| 0.3632 | 23.0 | 138 | 0.3645 |
| 0.3682 | 24.0 | 144 | 0.3557 |
| 0.3688 | 25.0 | 150 | 0.3535 |
| 0.3548 | 26.0 | 156 | 0.3524 |
| 0.3401 | 27.0 | 162 | 0.3203 |
| 0.3440 | 28.0 | 168 | 0.3018 |
| 0.3059 | 29.0 | 174 | 0.2777 |
| 0.2963 | 30.0 | 180 | 0.2761 |
| 0.2960 | 31.0 | 186 | 0.2661 |
| 0.2681 | 32.0 | 192 | 0.2570 |
| 0.2578 | 33.0 | 198 | 0.2479 |
| 0.2549 | 34.0 | 204 | 0.2413 |
| 0.2702 | 35.0 | 210 | 0.2329 |
| 0.2488 | 36.0 | 216 | 0.2267 |
| 0.2550 | 37.0 | 222 | 0.2219 |
| 0.2324 | 38.0 | 228 | 0.2206 |
| 0.2294 | 39.0 | 234 | 0.2190 |
| 0.2242 | 40.0 | 240 | 0.2172 |
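The step counts in the table imply the size of the training run. With 6 optimizer steps per epoch at a batch size of 512, the training set can contain at most 6 × 512 examples (assuming a single device and no gradient accumulation, neither of which the card states):

```python
steps_per_epoch = 6       # step count increases by 6 each epoch in the table
train_batch_size = 512    # from the hyperparameters
num_epochs = 40

total_steps = steps_per_epoch * num_epochs
max_train_examples = steps_per_epoch * train_batch_size

print(total_steps)         # 240, matching the final step in the table
print(max_train_examples)  # 3072, an upper bound on the training-set size
```

So this appears to be a small synthetic dataset (at most ~3,072 examples), consistent with a calculator-style toy task.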

Framework versions

  • Transformers 5.0.0
  • PyTorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Model details

  • Model size: 7.8M params (Safetensors)
  • Tensor type: F32