calculator_model_test

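This model is a fine-tuned version of an unspecified base model; the card does not name the training dataset. No separate evaluation results are listed; the training log below ends with a validation loss of 0.6660 at epoch 40.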

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 512
eval_batch_size: 512
seed: 42
optimizer: fused AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 40
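
As a rough illustration of what the linear scheduler does with these settings, the sketch below computes the learning rate at a given optimizer step. It assumes no warmup (the card lists none) and a decay to zero at step 240, the last step in the training log. The names `linear_lr`, `BASE_LR`, `TOTAL_STEPS`, and `WARMUP_STEPS` are hypothetical and not taken from the training code.

```python
BASE_LR = 0.001      # learning_rate from the card
TOTAL_STEPS = 240    # last step in the training log (40 epochs x 6 steps)
WARMUP_STEPS = 0     # assumption: the card does not list a warmup setting


def linear_lr(step: int) -> float:
    """Learning rate at `step`: linear warmup (if any), then linear decay to zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    # Print the rate at the start, at each quarter of the run, and at the end.
    for step in (0, 60, 120, 180, 240):
        print(f"step {step:3d}: lr = {linear_lr(step):.6f}")
```

Under these assumptions the rate falls by the same amount every step, so halfway through training (step 120) it is half the initial value.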

Training Loss	Epoch	Step	Validation Loss
3.3723	1.0	6	2.7100
2.3610	2.0	12	1.9674
1.8441	3.0	18	1.6414
1.6440	4.0	24	1.7133
1.6275	5.0	30	1.5587
1.5577	6.0	36	1.4983
1.4918	7.0	42	1.4335
1.4306	8.0	48	1.3870
1.3583	9.0	54	1.5533
1.4316	10.0	60	1.3992
1.3427	11.0	66	1.3620
1.3047	12.0	72	1.2491
1.2281	13.0	78	1.1471
1.1542	14.0	84	1.1002
1.1274	15.0	90	1.3248
1.1864	16.0	96	1.2169
1.1328	17.0	102	1.1488
1.0901	18.0	108	1.0341
1.0214	19.0	114	0.9890
1.0060	20.0	120	1.1041
1.0463	21.0	126	1.0832
1.0114	22.0	132	0.9414
0.9680	23.0	138	1.0066
0.9854	24.0	144	0.9178
0.9139	25.0	150	0.8783
0.9392	26.0	156	0.8476
0.8653	27.0	162	0.8081
0.8270	28.0	168	0.8145
0.8342	29.0	174	0.7931
0.8104	30.0	180	0.7706
0.8007	31.0	186	0.7405
0.7581	32.0	192	0.7223
0.7534	33.0	198	0.7220
0.7468	34.0	204	0.7195
0.7329	35.0	210	0.7129
0.7140	36.0	216	0.6833
0.7118	37.0	222	0.6774
0.7359	38.0	228	0.6700
0.7210	39.0	234	0.6725
0.6968	40.0	240	0.6660
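
The log can be summarized with a few lines of Python. The sketch below copies the validation-loss column from the table above, finds the epoch with the lowest validation loss, and lists the epochs where the loss rose relative to the previous epoch. It also bounds the training-set size implied by 6 steps per epoch at batch size 512, assuming a single device, no gradient accumulation, and a retained final partial batch; none of these are stated in the card. All variable names are illustrative.

```python
# Validation loss per epoch, copied from the table above (epochs 1 to 40).
VAL_LOSS = [
    2.7100, 1.9674, 1.6414, 1.7133, 1.5587, 1.4983, 1.4335, 1.3870, 1.5533, 1.3992,
    1.3620, 1.2491, 1.1471, 1.1002, 1.3248, 1.2169, 1.1488, 1.0341, 0.9890, 1.1041,
    1.0832, 0.9414, 1.0066, 0.9178, 0.8783, 0.8476, 0.8081, 0.8145, 0.7931, 0.7706,
    0.7405, 0.7223, 0.7220, 0.7195, 0.7129, 0.6833, 0.6774, 0.6700, 0.6725, 0.6660,
]

# Epoch (1-based) with the lowest validation loss.
best_epoch = min(range(len(VAL_LOSS)), key=VAL_LOSS.__getitem__) + 1

# Epochs whose validation loss is higher than the previous epoch's.
regressions = [
    i + 1 for i in range(1, len(VAL_LOSS)) if VAL_LOSS[i] > VAL_LOSS[i - 1]
]

# Training-set size implied by the step count (assumptions: one device,
# no gradient accumulation, last partial batch kept).
STEPS_PER_EPOCH = 6   # the Step column advances by 6 per epoch
BATCH_SIZE = 512      # train_batch_size from the card
min_examples = (STEPS_PER_EPOCH - 1) * BATCH_SIZE + 1
max_examples = STEPS_PER_EPOCH * BATCH_SIZE

if __name__ == "__main__":
    print("best epoch:", best_epoch, "loss:", VAL_LOSS[best_epoch - 1])
    print("epochs with rising validation loss:", regressions)
    print("implied training examples:", min_examples, "to", max_examples)
```

The validation loss is still at its minimum in the final epoch, so the log by itself gives no sign that training had plateaued by epoch 40.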

Model size: 7.8M params (Safetensors, tensor type F32)
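
The reported size (7.8M params in F32) gives a quick estimate of memory use. The sketch below is back-of-the-envelope arithmetic only: the parameter count is rounded in the card, and the optimizer-state figure assumes AdamW keeps two F32 moment tensors per parameter. Gradients and activations are not counted.

```python
PARAMS = 7.8e6         # "7.8M params" as reported; rounded, so results are approximate
BYTES_PER_PARAM = 4    # F32 is a 32-bit (4-byte) float

weights_bytes = PARAMS * BYTES_PER_PARAM

# Assumption: AdamW stores a first- and second-moment estimate per parameter,
# each in F32, so its state is about twice the size of the weights.
adamw_state_bytes = 2 * weights_bytes

if __name__ == "__main__":
    print(f"weights: {weights_bytes / 1e6:.1f} MB")
    print(f"weights + AdamW state: {(weights_bytes + adamw_state_bytes) / 1e6:.1f} MB")
```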
