calculator_model_test

This model is a fine-tuned version of a base model that the card does not name, trained on a dataset recorded as None. On the evaluation set it reaches a validation loss of 0.0811 after 40 epochs (step 240).

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 512
eval_batch_size: 512
seed: 42
optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 40
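
To make the lr_scheduler_type: linear setting concrete, here is a minimal sketch of a linearly decaying learning rate. It assumes zero warmup steps (the card lists none) and a total of 240 optimizer steps, taken from the last row of the training log. The function name linear_lr and its parameters are illustrative and are not part of any library.

```python
def linear_lr(step: int,
              base_lr: float = 1e-3,
              total_steps: int = 240,
              warmup_steps: int = 0) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay.

    Assumption: warmup_steps defaults to 0 because the card does not list any warmup.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the rate at the start, midpoint, and end of training.
    for s in (0, 120, 240):
        print(s, linear_lr(s))
```

Under these assumptions the rate starts at 0.001, is halved by step 120 (epoch 20), and reaches zero at step 240.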

Training Loss	Epoch	Step	Validation Loss
2.9776	1.0	6	2.2338
2.0137	2.0	12	1.7343
1.5786	3.0	18	1.3472
1.2457	4.0	24	1.0998
1.0472	5.0	30	1.0031
0.9384	6.0	36	0.8653
0.8173	7.0	42	0.7479
0.7478	8.0	48	0.7076
0.6777	9.0	54	0.6401
0.6439	10.0	60	0.5758
0.5733	11.0	66	0.5338
0.5459	12.0	72	0.5289
0.5389	13.0	78	0.4816
0.4900	14.0	84	0.4525
0.4634	15.0	90	0.4408
0.4486	16.0	96	0.4086
0.4222	17.0	102	0.3999
0.4134	18.0	108	0.3542
0.3664	19.0	114	0.3412
0.3552	20.0	120	0.3342
0.3430	21.0	126	0.2933
0.3136	22.0	132	0.2505
0.2884	23.0	138	0.2319
0.2694	24.0	144	0.2140
0.2444	25.0	150	0.1982
0.2260	26.0	156	0.1673
0.2047	27.0	162	0.1553
0.2054	28.0	168	0.1375
0.1868	29.0	174	0.1267
0.1679	30.0	180	0.1165
0.1555	31.0	186	0.1120
0.1533	32.0	192	0.1032
0.1422	33.0	198	0.0982
0.1332	34.0	204	0.0978
0.1335	35.0	210	0.0920
0.1256	36.0	216	0.0900
0.1311	37.0	222	0.0879
0.1185	38.0	228	0.0830
0.1171	39.0	234	0.0816
0.1154	40.0	240	0.0811
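
The Step column advances by 6 per epoch, which together with train_batch_size: 512 bounds the size of the training set. The sketch below works through that arithmetic. It assumes a single device, no gradient accumulation, and that a final partial batch is kept rather than dropped; none of these are stated in the card. The helper train_set_size_bounds is a hypothetical name used only for illustration.

```python
from typing import Tuple


def train_set_size_bounds(steps_per_epoch: int, batch_size: int) -> Tuple[int, int]:
    """Smallest and largest example counts that give this many steps per epoch.

    Assumes steps_per_epoch == ceil(num_examples / batch_size).
    """
    smallest = (steps_per_epoch - 1) * batch_size + 1
    largest = steps_per_epoch * batch_size
    return smallest, largest


# Values taken from the card: 240 total steps over 40 epochs, batch size 512.
steps_per_epoch = 240 // 40
low, high = train_set_size_bounds(steps_per_epoch, 512)

if __name__ == "__main__":
    print(f"{steps_per_epoch} steps/epoch -> between {low} and {high} training examples")
```

If the assumptions hold, the training set contains between 2561 and 3072 examples. Multi-device training or gradient accumulation would scale these bounds accordingly.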

Safetensors

Model size: 7.8M params
Tensor type: F32
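
The parameter count and tensor type give a rough estimate of the size of the stored weights. The sketch below assumes exactly 4 bytes per F32 parameter and treats 7.8M as 7,800,000, although the card's figure is rounded. It ignores file headers and any non-parameter buffers, so the real file size will differ slightly.

```python
PARAM_COUNT = 7_800_000   # "7.8M params" from the card; a rounded figure
BYTES_PER_F32 = 4         # a 32-bit float occupies 4 bytes


def weight_bytes(param_count: int, bytes_per_param: int) -> int:
    """Approximate raw size of the weights in bytes."""
    return param_count * bytes_per_param


size_bytes = weight_bytes(PARAM_COUNT, BYTES_PER_F32)

if __name__ == "__main__":
    print(f"{size_bytes / 1e6:.1f} MB ({size_bytes / 2**20:.1f} MiB)")
```

This puts the weights at roughly 31 MB, small enough to load comfortably on a CPU.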
