calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5538

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
  • mixed_precision_training: Native AMP
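
For reference, the sketch below shows how these hyperparameters would map onto Hugging Face TrainingArguments. This is not the original training script; the output directory and the evaluation/logging strategies are assumptions, and the model, tokenizer, and dataset are not documented in this card.

```python
# Hedged sketch: the listed hyperparameters expressed as TrainingArguments.
# output_dir and the eval/logging strategies are assumptions, not taken from the card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="calculator_model_test",   # assumed output/repo name
    learning_rate=1e-3,
    per_device_train_batch_size=512,
    per_device_eval_batch_size=512,
    seed=42,
    optim="adamw_torch_fused",            # AdamW with betas=(0.9, 0.999), eps=1e-08 (library defaults)
    lr_scheduler_type="linear",
    num_train_epochs=40,
    fp16=True,                            # native AMP mixed-precision training
    eval_strategy="epoch",                # assumed; the results table logs one evaluation per epoch
    logging_strategy="epoch",
)
```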

Training results

Training Loss | Epoch | Step | Validation Loss
------------- | ----- | ---- | ---------------
3.4511 | 1.0 | 5 | 2.8467
2.5181 | 2.0 | 10 | 2.1116
1.9303 | 3.0 | 15 | 1.7151
1.6577 | 4.0 | 20 | 1.5789
1.5651 | 5.0 | 25 | 1.5754
1.5127 | 6.0 | 30 | 1.5309
1.4740 | 7.0 | 35 | 1.4478
1.4122 | 8.0 | 40 | 1.4028
1.3682 | 9.0 | 45 | 1.3378
1.3127 | 10.0 | 50 | 1.2629
1.2188 | 11.0 | 55 | 1.1926
1.1528 | 12.0 | 60 | 1.1110
1.0918 | 13.0 | 65 | 1.0502
1.0482 | 14.0 | 70 | 1.0344
1.0041 | 15.0 | 75 | 0.9841
0.9944 | 16.0 | 80 | 0.9972
0.9652 | 17.0 | 85 | 0.9387
0.9474 | 18.0 | 90 | 0.9364
0.9464 | 19.0 | 95 | 0.8833
0.8842 | 20.0 | 100 | 0.8297
0.8439 | 21.0 | 105 | 0.8420
0.8259 | 22.0 | 110 | 0.8106
0.8101 | 23.0 | 115 | 0.7762
0.7815 | 24.0 | 120 | 0.7527
0.7651 | 25.0 | 125 | 0.7202
0.7371 | 26.0 | 130 | 0.7016
0.7205 | 27.0 | 135 | 0.6782
0.7045 | 28.0 | 140 | 0.6595
0.6867 | 29.0 | 145 | 0.6433
0.6672 | 30.0 | 150 | 0.6401
0.6627 | 31.0 | 155 | 0.6267
0.6433 | 32.0 | 160 | 0.6054
0.6396 | 33.0 | 165 | 0.6003
0.6211 | 34.0 | 170 | 0.5905
0.6128 | 35.0 | 175 | 0.5826
0.6134 | 36.0 | 180 | 0.5728
0.6001 | 37.0 | 185 | 0.5680
0.5967 | 38.0 | 190 | 0.5584
0.5884 | 39.0 | 195 | 0.5570
0.5887 | 40.0 | 200 | 0.5538

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cpu
  • Datasets 4.0.0
  • Tokenizers 0.22.2
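
The listed versions can be checked against a local environment with a quick snippet (a convenience sketch, not part of the original training setup):

```python
# Convenience sketch: confirm the local environment matches the versions listed above.
import datasets
import tokenizers
import torch
import transformers

print("Transformers:", transformers.__version__)  # card lists 5.0.0
print("PyTorch:", torch.__version__)              # card lists 2.10.0+cpu
print("Datasets:", datasets.__version__)          # card lists 4.0.0
print("Tokenizers:", tokenizers.__version__)      # card lists 0.22.2
```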

Model size

  • 7.8M parameters (Safetensors, F32 tensors)