calculator_model_test

This model is a fine-tuned version of an unspecified base model, trained on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0681

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
  • mixed_precision_training: Native AMP
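
The linear schedule listed above decays the learning rate from its initial value to zero over the run. The following is a minimal sketch of that decay rule, assuming zero warmup steps (none are listed) and taking the total step count of 240 from the training-results table below:

```python
# Sketch of the linear LR decay implied by the hyperparameters above.
# Assumptions: no warmup (none is listed); 240 total optimizer steps
# (40 epochs x 6 steps/epoch, per the results table).

LEARNING_RATE = 0.001
TOTAL_STEPS = 240

def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS):
    """Linearly decay the learning rate from base_lr to 0 over total_steps."""
    remaining = max(0.0, 1.0 - step / total_steps)
    return base_lr * remaining

print(linear_lr(0))    # starts at the configured learning_rate, 0.001
print(linear_lr(120))  # halfway through training: 0.0005
print(linear_lr(240))  # reaches 0.0 at the final step
```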

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.0932        | 1.0   | 6    | 2.3284          |
| 2.0708        | 2.0   | 12   | 1.7597          |
| 1.6188        | 3.0   | 18   | 1.4260          |
| 1.2984        | 4.0   | 24   | 1.1225          |
| 1.0571        | 5.0   | 30   | 0.9371          |
| 0.9006        | 6.0   | 36   | 0.8191          |
| 0.8187        | 7.0   | 42   | 0.7260          |
| 0.7281        | 8.0   | 48   | 0.6762          |
| 0.6804        | 9.0   | 54   | 0.6348          |
| 0.6251        | 10.0  | 60   | 0.5609          |
| 0.5701        | 11.0  | 66   | 0.5090          |
| 0.5325        | 12.0  | 72   | 0.4674          |
| 0.4973        | 13.0  | 78   | 0.4407          |
| 0.4619        | 14.0  | 84   | 0.4090          |
| 0.4408        | 15.0  | 90   | 0.3996          |
| 0.4311        | 16.0  | 96   | 0.4260          |
| 0.4237        | 17.0  | 102  | 0.3490          |
| 0.3734        | 18.0  | 108  | 0.3225          |
| 0.3387        | 19.0  | 114  | 0.2895          |
| 0.3111        | 20.0  | 120  | 0.2506          |
| 0.2790        | 21.0  | 126  | 0.2317          |
| 0.2652        | 22.0  | 132  | 0.2102          |
| 0.2521        | 23.0  | 138  | 0.1889          |
| 0.2293        | 24.0  | 144  | 0.1697          |
| 0.2031        | 25.0  | 150  | 0.1413          |
| 0.1844        | 26.0  | 156  | 0.1269          |
| 0.1856        | 27.0  | 162  | 0.1358          |
| 0.1787        | 28.0  | 168  | 0.1104          |
| 0.1549        | 29.0  | 174  | 0.1175          |
| 0.1644        | 30.0  | 180  | 0.1034          |
| 0.1406        | 31.0  | 186  | 0.0931          |
| 0.1379        | 32.0  | 192  | 0.0888          |
| 0.1299        | 33.0  | 198  | 0.0916          |
| 0.1265        | 34.0  | 204  | 0.0809          |
| 0.1216        | 35.0  | 210  | 0.0747          |
| 0.1160        | 36.0  | 216  | 0.0725          |
| 0.1106        | 37.0  | 222  | 0.0707          |
| 0.1092        | 38.0  | 228  | 0.0686          |
| 0.1076        | 39.0  | 234  | 0.0689          |
| 0.1113        | 40.0  | 240  | 0.0681          |
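
The step counts in the table also bound the size of the training set. This is a back-of-the-envelope check, assuming the last batch of each epoch may be partial and that no gradient accumulation was used (none is listed):

```python
# Sketch: infer the approximate training-set size from the results table.
# 240 total steps over 40 epochs gives 6 optimizer steps per epoch; with
# train_batch_size=512, the training set holds between 5*512 + 1 and
# 6*512 examples (assuming no gradient accumulation).

TOTAL_STEPS = 240
NUM_EPOCHS = 40
BATCH_SIZE = 512

steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS           # 6
upper_bound = steps_per_epoch * BATCH_SIZE            # 3072
lower_bound = (steps_per_epoch - 1) * BATCH_SIZE + 1  # 2561

print(f"{steps_per_epoch} steps/epoch -> {lower_bound}..{upper_bound} examples")
```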

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Model details

  • Model size: 7.8M params
  • Tensor type: F32 (Safetensors)