calculator_model_test

This model is a fine-tuned version of an unspecified base model, trained on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2387

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.0501        | 1.0   | 6    | 2.2966          |
| 2.0486        | 2.0   | 12   | 1.7533          |
| 1.5929        | 3.0   | 18   | 1.3909          |
| 1.2724        | 4.0   | 24   | 1.1387          |
| 1.0906        | 5.0   | 30   | 0.9902          |
| 0.9501        | 6.0   | 36   | 0.8982          |
| 0.8489        | 7.0   | 42   | 0.8078          |
| 0.7684        | 8.0   | 48   | 0.7145          |
| 0.6872        | 9.0   | 54   | 0.6570          |
| 0.6574        | 10.0  | 60   | 0.6683          |
| 0.6364        | 11.0  | 66   | 0.6542          |
| 0.6217        | 12.0  | 72   | 0.5741          |
| 0.561         | 13.0  | 78   | 0.5594          |
| 0.548         | 14.0  | 84   | 0.5491          |
| 0.5447        | 15.0  | 90   | 0.5075          |
| 0.5114        | 16.0  | 96   | 0.5275          |
| 0.4897        | 17.0  | 102  | 0.4606          |
| 0.4736        | 18.0  | 108  | 0.4505          |
| 0.4536        | 19.0  | 114  | 0.4420          |
| 0.4438        | 20.0  | 120  | 0.4338          |
| 0.4178        | 21.0  | 126  | 0.4388          |
| 0.439         | 22.0  | 132  | 0.4336          |
| 0.4315        | 23.0  | 138  | 0.3953          |
| 0.4006        | 24.0  | 144  | 0.3763          |
| 0.3923        | 25.0  | 150  | 0.3776          |
| 0.3785        | 26.0  | 156  | 0.3616          |
| 0.38          | 27.0  | 162  | 0.3504          |
| 0.3546        | 28.0  | 168  | 0.3411          |
| 0.3507        | 29.0  | 174  | 0.3395          |
| 0.3518        | 30.0  | 180  | 0.3256          |
| 0.3324        | 31.0  | 186  | 0.3084          |
| 0.3161        | 32.0  | 192  | 0.2991          |
| 0.3043        | 33.0  | 198  | 0.2820          |
| 0.2977        | 34.0  | 204  | 0.2706          |
| 0.2911        | 35.0  | 210  | 0.2644          |
| 0.2843        | 36.0  | 216  | 0.2583          |
| 0.2662        | 37.0  | 222  | 0.2491          |
| 0.2737        | 38.0  | 228  | 0.2438          |
| 0.2599        | 39.0  | 234  | 0.2397          |
| 0.2647        | 40.0  | 240  | 0.2387          |

Framework versions

  • Transformers 4.57.2
  • Pytorch 2.9.1+cu128
  • Datasets 4.5.0
  • Tokenizers 0.22.1
Model size: 7.81M parameters (Safetensors, F32)