calculator_model_test

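This model is a fine-tuned version of an unspecified base model on an unspecified dataset (the card records the dataset as None). It achieves the following results on the evaluation set:

Loss: 0.5945 (validation loss at the final step, epoch 40)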

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 512
eval_batch_size: 512
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 40
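
To make the schedule concrete, here is a minimal sketch of how a linear scheduler would decay the learning rate over this run. It assumes zero warmup steps (the card lists no warmup setting) and takes the total of 200 optimizer steps from the training log. The helper names `linear_lr` and `train_size_bounds` are hypothetical and are not part of any library. The second helper estimates the training-set size implied by 5 steps per epoch at batch size 512, assuming a single device, no gradient accumulation, and a kept final partial batch.

```python
BASE_LR = 0.001      # learning_rate from the list above
TOTAL_STEPS = 200    # last step in the training log (40 epochs x 5 steps)
WARMUP_STEPS = 0     # assumption: the card does not list a warmup setting


def linear_lr(step, base_lr=BASE_LR, total=TOTAL_STEPS, warmup=WARMUP_STEPS):
    """Learning rate at optimizer step `step` (0-based) under linear decay."""
    if step < warmup:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup)
    # Linear decay from base_lr down to 0 at the final step.
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


def train_size_bounds(steps_per_epoch, batch_size):
    """Smallest and largest dataset size giving that many steps per epoch.

    Assumes one device, no gradient accumulation, and that the final
    partial batch is kept rather than dropped.
    """
    return (steps_per_epoch - 1) * batch_size + 1, steps_per_epoch * batch_size


if __name__ == "__main__":
    for step in (0, 50, 100, 150, 200):
        print(step, linear_lr(step))
    print(train_size_bounds(5, 512))
```

Under these assumptions the rate falls to half its initial value at step 100 (epoch 20) and reaches zero at step 200, and the training set would hold between 2049 and 2560 examples.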

Training Loss	Epoch	Step	Validation Loss
3.4886	1.0	5	2.9077
2.6030	2.0	10	2.1443
1.9775	3.0	15	1.7360
1.6903	4.0	20	1.5977
1.5862	5.0	25	1.5620
1.5460	6.0	30	1.5337
1.5075	7.0	35	1.4873
1.4838	8.0	40	1.4564
1.4496	9.0	45	1.4431
1.4041	10.0	50	1.3911
1.3950	11.0	55	1.3940
1.3421	12.0	60	1.3259
1.3013	13.0	65	1.2902
1.2550	14.0	70	1.2079
1.1784	15.0	75	1.1146
1.1135	16.0	80	1.0629
1.0583	17.0	85	1.0104
1.0198	18.0	90	0.9646
0.9765	19.0	95	0.9315
0.9386	20.0	100	0.8994
0.9116	21.0	105	0.8690
0.8871	22.0	110	0.8332
0.8702	23.0	115	0.8674
0.8441	24.0	120	0.7941
0.8180	25.0	125	0.7898
0.8022	26.0	130	0.7690
0.7880	27.0	135	0.7386
0.7739	28.0	140	0.7302
0.7618	29.0	145	0.7090
0.7451	30.0	150	0.7043
0.7340	31.0	155	0.6951
0.7240	32.0	160	0.6731
0.7076	33.0	165	0.6516
0.6951	34.0	170	0.6469
0.6846	35.0	175	0.6319
0.6737	36.0	180	0.6170
0.6601	37.0	185	0.6103
0.6558	38.0	190	0.6016
0.6509	39.0	195	0.5963
0.6435	40.0	200	0.5945
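
The log above can be checked programmatically. The sketch below copies the Validation Loss column into a list (one value per epoch) and reports the epochs where validation loss rose relative to the previous epoch, along with the epoch with the lowest loss. The function names are illustrative only.

```python
# Validation losses copied from the table above, one per epoch (1..40).
VAL_LOSS = [
    2.9077, 2.1443, 1.7360, 1.5977, 1.5620, 1.5337, 1.4873, 1.4564,
    1.4431, 1.3911, 1.3940, 1.3259, 1.2902, 1.2079, 1.1146, 1.0629,
    1.0104, 0.9646, 0.9315, 0.8994, 0.8690, 0.8332, 0.8674, 0.7941,
    0.7898, 0.7690, 0.7386, 0.7302, 0.7090, 0.7043, 0.6951, 0.6731,
    0.6516, 0.6469, 0.6319, 0.6170, 0.6103, 0.6016, 0.5963, 0.5945,
]


def epochs_with_regression(losses):
    """Return 1-based epochs whose loss is higher than the previous epoch's."""
    return [i + 1 for i in range(1, len(losses)) if losses[i] > losses[i - 1]]


def best_epoch(losses):
    """Return (epoch, loss) for the lowest validation loss."""
    idx = min(range(len(losses)), key=losses.__getitem__)
    return idx + 1, losses[idx]


if __name__ == "__main__":
    print(epochs_with_regression(VAL_LOSS))  # [11, 23]
    print(best_epoch(VAL_LOSS))              # (40, 0.5945)
```

Validation loss increases only at epochs 11 and 23 and is still falling at epoch 40, which suggests the run had not fully converged when training stopped.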

Safetensors

Model size: 7.8M params
Tensor type: F32
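
As a rough check, the reported parameter count and tensor type imply the size of the raw weights. This is a back-of-the-envelope sketch: 7.8M is a rounded figure, and the actual safetensors file also carries a small header, so treat the result as approximate.

```python
PARAMS = 7.8e6        # "7.8M params" as reported above (rounded)
BYTES_PER_PARAM = 4   # F32 means 32-bit floats, i.e. 4 bytes each


def weight_megabytes(params=PARAMS, bytes_per_param=BYTES_PER_PARAM):
    """Approximate size of the raw weights in megabytes (1 MB = 10**6 bytes)."""
    return params * bytes_per_param / 1e6


if __name__ == "__main__":
    print(f"{weight_megabytes():.1f} MB")
```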