calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset (neither is recorded in this card). It achieves the following results on the evaluation set:

  • Loss: 0.7120

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
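
The hyperparameters above map onto a `transformers.TrainingArguments` configuration roughly as follows. This is a sketch, not the card author's actual script: the `output_dir` is assumed, and the base model and dataset are not recorded in the card. The betas and epsilon listed above are the `TrainingArguments` defaults, so they are not set explicitly.

```python
# Hypothetical reconstruction of the training configuration from the
# hyperparameters listed in this card (config fragment, not a full script).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="calculator_model_test",  # assumed; not stated in the card
    learning_rate=1e-3,
    per_device_train_batch_size=512,
    per_device_eval_batch_size=512,
    seed=42,
    optim="adamw_torch_fused",       # AdamW, fused torch implementation
    lr_scheduler_type="linear",
    num_train_epochs=40,
)
```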

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 3.3426        | 1.0   | 6    | 2.7993          |
| 2.4209        | 2.0   | 12   | 1.9973          |
| 1.8734        | 3.0   | 18   | 1.7473          |
| 1.6692        | 4.0   | 24   | 1.6342          |
| 1.5799        | 5.0   | 30   | 1.6449          |
| 1.5671        | 6.0   | 36   | 1.5829          |
| 1.5089        | 7.0   | 42   | 1.5246          |
| 1.4829        | 8.0   | 48   | 1.5022          |
| 1.4327        | 9.0   | 54   | 1.4243          |
| 1.4399        | 10.0  | 60   | 1.3937          |
| 1.3700        | 11.0  | 66   | 1.3579          |
| 1.3430        | 12.0  | 72   | 1.2976          |
| 1.2724        | 13.0  | 78   | 1.2658          |
| 1.2386        | 14.0  | 84   | 1.1594          |
| 1.2057        | 15.0  | 90   | 1.2266          |
| 1.2069        | 16.0  | 96   | 1.3501          |
| 1.2408        | 17.0  | 102  | 1.1047          |
| 1.1625        | 18.0  | 108  | 1.1621          |
| 1.1029        | 19.0  | 114  | 1.1712          |
| 1.1209        | 20.0  | 120  | 1.0636          |
| 1.0304        | 21.0  | 126  | 0.9785          |
| 0.9679        | 22.0  | 132  | 0.9535          |
| 0.9591        | 23.0  | 138  | 0.8968          |
| 0.9017        | 24.0  | 144  | 0.8817          |
| 0.8773        | 25.0  | 150  | 0.9545          |
| 0.9173        | 26.0  | 156  | 1.0227          |
| 0.9503        | 27.0  | 162  | 0.8290          |
| 0.8785        | 28.0  | 168  | 0.8701          |
| 0.8594        | 29.0  | 174  | 0.8212          |
| 0.8462        | 30.0  | 180  | 0.8228          |
| 0.8191        | 31.0  | 186  | 0.8144          |
| 0.8301        | 32.0  | 192  | 0.7736          |
| 0.7794        | 33.0  | 198  | 0.7820          |
| 0.7795        | 34.0  | 204  | 0.7523          |
| 0.7806        | 35.0  | 210  | 0.7386          |
| 0.7463        | 36.0  | 216  | 0.7327          |
| 0.7594        | 37.0  | 222  | 0.7222          |
| 0.7774        | 38.0  | 228  | 0.7165          |
| 0.7488        | 39.0  | 234  | 0.7132          |
| 0.7370        | 40.0  | 240  | 0.7120          |

Framework versions

  • Transformers 5.0.0
  • PyTorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Model size

  • 7.8M parameters (F32, Safetensors format)