calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1067

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
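
The linear schedule listed above can be sketched in plain Python. This is a hypothetical reimplementation for illustration only (it assumes no warmup steps, which the card does not specify); `linear_lr` is a name introduced here, not from the training code. The total of 240 optimizer steps follows from the results table (6 steps per epoch × 40 epochs).

```python
# Sketch of a linear LR schedule matching the hyperparameters above
# (assumption: no warmup; LR decays linearly from 1e-3 to 0).
BASE_LR = 1e-3      # learning_rate from the card
TOTAL_STEPS = 240   # 6 steps/epoch x 40 epochs, per the results table

def linear_lr(step: int) -> float:
    """Learning rate after `step` completed optimizer steps."""
    return BASE_LR * max(0.0, (TOTAL_STEPS - step) / TOTAL_STEPS)

print(linear_lr(0))    # start of training -> 0.001
print(linear_lr(120))  # halfway -> 0.0005
print(linear_lr(240))  # end of training -> 0.0
```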

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.9472        | 1.0   | 6    | 2.2628          |
| 2.0366        | 2.0   | 12   | 1.7642          |
| 1.5645        | 3.0   | 18   | 1.3854          |
| 1.2543        | 4.0   | 24   | 1.1356          |
| 1.0465        | 5.0   | 30   | 1.0000          |
| 0.9275        | 6.0   | 36   | 0.8717          |
| 0.8314        | 7.0   | 42   | 0.7647          |
| 0.7834        | 8.0   | 48   | 0.7636          |
| 0.7211        | 9.0   | 54   | 0.6759          |
| 0.6548        | 10.0  | 60   | 0.6257          |
| 0.6005        | 11.0  | 66   | 0.6371          |
| 0.6133        | 12.0  | 72   | 0.6054          |
| 0.5681        | 13.0  | 78   | 0.5610          |
| 0.5576        | 14.0  | 84   | 0.5441          |
| 0.5279        | 15.0  | 90   | 0.4975          |
| 0.4981        | 16.0  | 96   | 0.4773          |
| 0.4508        | 17.0  | 102  | 0.4559          |
| 0.4491        | 18.0  | 108  | 0.4150          |
| 0.4122        | 19.0  | 114  | 0.4092          |
| 0.3865        | 20.0  | 120  | 0.3901          |
| 0.3667        | 21.0  | 126  | 0.3332          |
| 0.3287        | 22.0  | 132  | 0.3279          |
| 0.3177        | 23.0  | 138  | 0.2912          |
| 0.2865        | 24.0  | 144  | 0.2619          |
| 0.2658        | 25.0  | 150  | 0.2492          |
| 0.2669        | 26.0  | 156  | 0.2297          |
| 0.2491        | 27.0  | 162  | 0.2012          |
| 0.2140        | 28.0  | 168  | 0.1816          |
| 0.2039        | 29.0  | 174  | 0.1664          |
| 0.1917        | 30.0  | 180  | 0.1593          |
| 0.1792        | 31.0  | 186  | 0.1406          |
| 0.1558        | 32.0  | 192  | 0.1353          |
| 0.1507        | 33.0  | 198  | 0.1392          |
| 0.1538        | 34.0  | 204  | 0.1245          |
| 0.1550        | 35.0  | 210  | 0.1156          |
| 0.1298        | 36.0  | 216  | 0.1170          |
| 0.1377        | 37.0  | 222  | 0.1142          |
| 0.1428        | 38.0  | 228  | 0.1121          |
| 0.1332        | 39.0  | 234  | 0.1076          |
| 0.1255        | 40.0  | 240  | 0.1067          |
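
A couple of figures can be derived from the table and hyperparameters above. This is a back-of-the-envelope check introduced here, not part of the original card; the dataset-size figure is an upper bound inferred from 6 optimizer steps per epoch at batch size 512.

```python
# Derived figures from the results table and hyperparameters above.
first_val, last_val = 2.2628, 0.1067  # validation loss at epochs 1 and 40
reduction = 1 - last_val / first_val
print(f"validation loss reduced by {reduction:.1%}")  # -> 95.3%

steps_per_epoch = 6
train_batch_size = 512
# Upper bound on training-set size (the last batch may be partial).
max_train_examples = steps_per_epoch * train_batch_size
print(max_train_examples)  # -> 3072
```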

Framework versions

  • Transformers 5.0.0
  • PyTorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2