calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2019

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
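As an illustration of the `linear` scheduler named above, the following sketch computes the learning rate at a given optimizer step. It assumes no warmup and 240 total steps (6 steps per epoch × 40 epochs, matching the Step column in the results table); the function name and constants are hypothetical, not part of the training code.

```python
TOTAL_STEPS = 240   # 6 optimizer steps/epoch x 40 epochs (assumed, per the results table)
BASE_LR = 1e-3      # learning_rate hyperparameter above

def linear_lr(step: int, base_lr: float = BASE_LR, total_steps: int = TOTAL_STEPS) -> float:
    """Linearly decay the learning rate from base_lr to 0 over total_steps (no warmup)."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps

# linear_lr(0)   -> 0.001   (full learning rate at the start)
# linear_lr(120) -> 0.0005  (half the rate at the halfway point)
# linear_lr(240) -> 0.0     (decayed to zero at the final step)
```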

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.8957        | 1.0   | 6    | 2.2381          |
| 2.0146        | 2.0   | 12   | 1.7705          |
| 1.6197        | 3.0   | 18   | 1.4439          |
| 1.2979        | 4.0   | 24   | 1.2227          |
| 1.1103        | 5.0   | 30   | 1.0364          |
| 0.9847        | 6.0   | 36   | 0.9059          |
| 0.8452        | 7.0   | 42   | 0.8222          |
| 0.7736        | 8.0   | 48   | 0.7130          |
| 0.7047        | 9.0   | 54   | 0.7146          |
| 0.6887        | 10.0  | 60   | 0.6280          |
| 0.6119        | 11.0  | 66   | 0.5824          |
| 0.5802        | 12.0  | 72   | 0.5667          |
| 0.5312        | 13.0  | 78   | 0.5148          |
| 0.5067        | 14.0  | 84   | 0.4975          |
| 0.4947        | 15.0  | 90   | 0.4650          |
| 0.4716        | 16.0  | 96   | 0.4645          |
| 0.4669        | 17.0  | 102  | 0.4134          |
| 0.4348        | 18.0  | 108  | 0.4056          |
| 0.4105        | 19.0  | 114  | 0.4091          |
| 0.4158        | 20.0  | 120  | 0.3917          |
| 0.3862        | 21.0  | 126  | 0.3704          |
| 0.3753        | 22.0  | 132  | 0.3581          |
| 0.3624        | 23.0  | 138  | 0.3444          |
| 0.3557        | 24.0  | 144  | 0.3281          |
| 0.3335        | 25.0  | 150  | 0.3180          |
| 0.3240        | 26.0  | 156  | 0.3008          |
| 0.3181        | 27.0  | 162  | 0.2858          |
| 0.3069        | 28.0  | 168  | 0.2820          |
| 0.2978        | 29.0  | 174  | 0.2660          |
| 0.2866        | 30.0  | 180  | 0.2594          |
| 0.2688        | 31.0  | 186  | 0.2464          |
| 0.2642        | 32.0  | 192  | 0.2372          |
| 0.2520        | 33.0  | 198  | 0.2307          |
| 0.2484        | 34.0  | 204  | 0.2216          |
| 0.2392        | 35.0  | 210  | 0.2195          |
| 0.2458        | 36.0  | 216  | 0.2128          |
| 0.2513        | 37.0  | 222  | 0.2108          |
| 0.2287        | 38.0  | 228  | 0.2072          |
| 0.2331        | 39.0  | 234  | 0.2034          |
| 0.2344        | 40.0  | 240  | 0.2019          |
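Validation loss falls from 2.2381 after epoch 1 to 0.2019 after epoch 40. The relative reduction is simple arithmetic on the two table values:

```python
# Validation loss at epoch 1 and epoch 40, taken from the results table.
first, last = 2.2381, 0.2019
reduction = (first - last) / first
print(f"{reduction:.1%}")  # prints 91.0%
```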

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Model details

  • Model size: 7.8M params (Safetensors)
  • Tensor type: F32