calculator_model_test

This model is a fine-tuned model; the base model and the training dataset are not recorded in this card. It achieves the following results on the evaluation set:

  • Loss: 1.3362
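If this loss is a mean token-level cross-entropy in nats (an assumption, since the card does not state the loss type), it can be converted to perplexity by exponentiating it:

```python
import math

# Assuming the evaluation loss is mean token-level cross-entropy (in nats),
# perplexity is its exponential.
eval_loss = 1.3362
perplexity = math.exp(eval_loss)
print(f"perplexity = {perplexity:.2f}")  # ≈ 3.80
```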

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
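The hyperparameters above map onto the Transformers `TrainingArguments` roughly as follows. This is a sketch, not the actual training script: `output_dir` is a placeholder, and details such as warmup or weight decay are not recorded in the card.

```python
from transformers import TrainingArguments

# Sketch of the recorded hyperparameters as TrainingArguments.
# output_dir is a placeholder; other settings use library defaults.
args = TrainingArguments(
    output_dir="calculator_model_test",
    learning_rate=1e-3,
    per_device_train_batch_size=512,
    per_device_eval_batch_size=512,
    seed=42,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=40,
)
```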

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.3627        | 1.0   | 6    | 2.7265          |
| 2.3833        | 2.0   | 12   | 1.9531          |
| 1.8417        | 3.0   | 18   | 1.6770          |
| 1.6578        | 4.0   | 24   | 1.5824          |
| 1.5674        | 5.0   | 30   | 1.5478          |
| 1.5750        | 6.0   | 36   | 1.5474          |
| 1.5333        | 7.0   | 42   | 1.7200          |
| 1.6069        | 8.0   | 48   | 1.5513          |
| 1.5526        | 9.0   | 54   | 1.5275          |
| 1.5118        | 10.0  | 60   | 1.5277          |
| 1.5246        | 11.0  | 66   | 1.5190          |
| 1.5645        | 12.0  | 72   | 1.5177          |
| 1.5285        | 13.0  | 78   | 1.5219          |
| 1.5391        | 14.0  | 84   | 1.5125          |
| 1.5124        | 15.0  | 90   | 1.5222          |
| 1.5302        | 16.0  | 96   | 1.5181          |
| 1.5253        | 17.0  | 102  | 1.5064          |
| 1.4981        | 18.0  | 108  | 1.5100          |
| 1.5007        | 19.0  | 114  | 1.4890          |
| 1.4692        | 20.0  | 120  | 1.4828          |
| 1.4616        | 21.0  | 126  | 1.4810          |
| 1.4744        | 22.0  | 132  | 1.4636          |
| 1.4671        | 23.0  | 138  | 1.4516          |
| 1.4594        | 24.0  | 144  | 1.4417          |
| 1.4802        | 25.0  | 150  | 1.4573          |
| 1.4573        | 26.0  | 156  | 1.4384          |
| 1.4171        | 27.0  | 162  | 1.4459          |
| 1.4506        | 28.0  | 168  | 1.4285          |
| 1.4360        | 29.0  | 174  | 1.4135          |
| 1.4537        | 30.0  | 180  | 1.4091          |
| 1.4133        | 31.0  | 186  | 1.4245          |
| 1.4442        | 32.0  | 192  | 1.3852          |
| 1.3666        | 33.0  | 198  | 1.3804          |
| 1.3887        | 34.0  | 204  | 1.3633          |
| 1.3842        | 35.0  | 210  | 1.3585          |
| 1.3670        | 36.0  | 216  | 1.3495          |
| 1.3610        | 37.0  | 222  | 1.3469          |
| 1.3501        | 38.0  | 228  | 1.3452          |
| 1.3474        | 39.0  | 234  | 1.3374          |
| 1.3398        | 40.0  | 240  | 1.3362          |
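The logged steps also bound the training-set size: 240 optimizer steps over 40 epochs is 6 steps per epoch, which at a batch size of 512 means at most about 3,072 training examples (fewer if the last batch of each epoch is partial):

```python
# Back-of-the-envelope check on the training-set size from the logged numbers.
total_steps = 240
num_epochs = 40
train_batch_size = 512

steps_per_epoch = total_steps // num_epochs              # 6
max_train_examples = steps_per_epoch * train_batch_size  # 3072 (upper bound)
print(steps_per_epoch, max_train_examples)
```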

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Model size

  • 7.8M parameters (F32, Safetensors format)