calculator_model_test

This model is a fine-tuned version of an unspecified base model, trained on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2172
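Assuming the reported loss is a mean per-token cross-entropy (the usual case for Transformers seq2seq and language models; the card does not say), it can be converted to a perplexity for a rough sense of scale:

```python
import math

eval_loss = 0.2172  # final validation loss reported above

# Perplexity = exp(cross-entropy); only meaningful if the loss
# really is a per-token cross-entropy, which is an assumption here.
perplexity = math.exp(eval_loss)
print(f"perplexity ≈ {perplexity:.4f}")  # ≈ 1.2426
```

A perplexity near 1.24 would mean the model is, on average, close to certain about each output token, which is plausible for a narrow synthetic task like calculator-style arithmetic.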

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
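A minimal sketch of the linear learning-rate schedule implied by these settings. This is an assumption about the exact shape: it mirrors the common transformers-style linear schedule (optional linear warmup, then linear decay to zero), with `total_steps=240` taken from the final step in the training results and warmup assumed to be zero since none is listed:

```python
def linear_lr(step, base_lr=0.001, total_steps=240, warmup_steps=0):
    """Linear warmup followed by linear decay to zero (assumed schedule)."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

print(linear_lr(0))    # 0.001  (full learning rate at the start, no warmup assumed)
print(linear_lr(120))  # 0.0005 (halfway through training)
print(linear_lr(240))  # 0.0    (decayed to zero at the final step)
```

Under this schedule the later epochs train at a much smaller learning rate, which is consistent with the slow, steady loss improvements in the last ten epochs of the results table.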

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.0547 | 1.0 | 6 | 2.3235 |
| 2.0645 | 2.0 | 12 | 1.8699 |
| 1.6975 | 3.0 | 18 | 1.5582 |
| 1.4058 | 4.0 | 24 | 1.2875 |
| 1.1930 | 5.0 | 30 | 1.0846 |
| 1.0571 | 6.0 | 36 | 1.0523 |
| 0.9713 | 7.0 | 42 | 0.9177 |
| 0.8768 | 8.0 | 48 | 0.9016 |
| 0.8688 | 9.0 | 54 | 0.8003 |
| 0.7712 | 10.0 | 60 | 0.7546 |
| 0.7255 | 11.0 | 66 | 0.6924 |
| 0.6674 | 12.0 | 72 | 0.6362 |
| 0.6104 | 13.0 | 78 | 0.6404 |
| 0.5723 | 14.0 | 84 | 0.5483 |
| 0.5251 | 15.0 | 90 | 0.5152 |
| 0.5100 | 16.0 | 96 | 0.4806 |
| 0.4442 | 17.0 | 102 | 0.4510 |
| 0.4643 | 18.0 | 108 | 0.4321 |
| 0.4391 | 19.0 | 114 | 0.4565 |
| 0.4395 | 20.0 | 120 | 0.4186 |
| 0.4202 | 21.0 | 126 | 0.3869 |
| 0.4038 | 22.0 | 132 | 0.3577 |
| 0.3632 | 23.0 | 138 | 0.3645 |
| 0.3682 | 24.0 | 144 | 0.3557 |
| 0.3688 | 25.0 | 150 | 0.3535 |
| 0.3548 | 26.0 | 156 | 0.3524 |
| 0.3401 | 27.0 | 162 | 0.3203 |
| 0.3440 | 28.0 | 168 | 0.3018 |
| 0.3059 | 29.0 | 174 | 0.2777 |
| 0.2963 | 30.0 | 180 | 0.2761 |
| 0.2960 | 31.0 | 186 | 0.2661 |
| 0.2681 | 32.0 | 192 | 0.2570 |
| 0.2578 | 33.0 | 198 | 0.2479 |
| 0.2549 | 34.0 | 204 | 0.2413 |
| 0.2702 | 35.0 | 210 | 0.2329 |
| 0.2488 | 36.0 | 216 | 0.2267 |
| 0.2550 | 37.0 | 222 | 0.2219 |
| 0.2324 | 38.0 | 228 | 0.2206 |
| 0.2294 | 39.0 | 234 | 0.2190 |
| 0.2242 | 40.0 | 240 | 0.2172 |
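The step counts in the table imply the size of the training run. With 6 optimizer steps per epoch at a batch size of 512, the training set can contain at most 6 × 512 examples (assuming a single device and no gradient accumulation, neither of which the card states):

```python
steps_per_epoch = 6       # step count increases by 6 each epoch in the table
train_batch_size = 512    # from the hyperparameters
num_epochs = 40

total_steps = steps_per_epoch * num_epochs
max_train_examples = steps_per_epoch * train_batch_size

print(total_steps)         # 240, matching the final step in the table
print(max_train_examples)  # 3072, an upper bound on the training-set size
```

So this appears to be a small synthetic dataset (at most ~3,072 examples), consistent with a calculator-style toy task.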

Framework versions

  • Transformers 5.0.0
  • PyTorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Model details

  • Model size: 7.8M params (Safetensors)
  • Tensor type: F32