calculator_model_test

This model is a fine-tuned model; the base model and the training dataset are not recorded in this card. It achieves the following results on the evaluation set:

  • Loss: 1.3362
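If this loss is a mean token-level cross-entropy in nats (an assumption, since the card does not state the loss type), it can be converted to perplexity by exponentiating it:

```python
import math

# Assuming the evaluation loss is mean token-level cross-entropy (in nats),
# perplexity is its exponential.
eval_loss = 1.3362
perplexity = math.exp(eval_loss)
print(f"perplexity = {perplexity:.2f}")  # ≈ 3.80
```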

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
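The hyperparameters above map onto the Transformers `TrainingArguments` roughly as follows. This is a sketch, not the actual training script: `output_dir` is a placeholder, and details such as warmup or weight decay are not recorded in the card.

```python
from transformers import TrainingArguments

# Sketch of the recorded hyperparameters as TrainingArguments.
# output_dir is a placeholder; other settings use library defaults.
args = TrainingArguments(
    output_dir="calculator_model_test",
    learning_rate=1e-3,
    per_device_train_batch_size=512,
    per_device_eval_batch_size=512,
    seed=42,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=40,
)
```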

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.3627        | 1.0   | 6    | 2.7265          |
| 2.3833        | 2.0   | 12   | 1.9531          |
| 1.8417        | 3.0   | 18   | 1.6770          |
| 1.6578        | 4.0   | 24   | 1.5824          |
| 1.5674        | 5.0   | 30   | 1.5478          |
| 1.5750        | 6.0   | 36   | 1.5474          |
| 1.5333        | 7.0   | 42   | 1.7200          |
| 1.6069        | 8.0   | 48   | 1.5513          |
| 1.5526        | 9.0   | 54   | 1.5275          |
| 1.5118        | 10.0  | 60   | 1.5277          |
| 1.5246        | 11.0  | 66   | 1.5190          |
| 1.5645        | 12.0  | 72   | 1.5177          |
| 1.5285        | 13.0  | 78   | 1.5219          |
| 1.5391        | 14.0  | 84   | 1.5125          |
| 1.5124        | 15.0  | 90   | 1.5222          |
| 1.5302        | 16.0  | 96   | 1.5181          |
| 1.5253        | 17.0  | 102  | 1.5064          |
| 1.4981        | 18.0  | 108  | 1.5100          |
| 1.5007        | 19.0  | 114  | 1.4890          |
| 1.4692        | 20.0  | 120  | 1.4828          |
| 1.4616        | 21.0  | 126  | 1.4810          |
| 1.4744        | 22.0  | 132  | 1.4636          |
| 1.4671        | 23.0  | 138  | 1.4516          |
| 1.4594        | 24.0  | 144  | 1.4417          |
| 1.4802        | 25.0  | 150  | 1.4573          |
| 1.4573        | 26.0  | 156  | 1.4384          |
| 1.4171        | 27.0  | 162  | 1.4459          |
| 1.4506        | 28.0  | 168  | 1.4285          |
| 1.4360        | 29.0  | 174  | 1.4135          |
| 1.4537        | 30.0  | 180  | 1.4091          |
| 1.4133        | 31.0  | 186  | 1.4245          |
| 1.4442        | 32.0  | 192  | 1.3852          |
| 1.3666        | 33.0  | 198  | 1.3804          |
| 1.3887        | 34.0  | 204  | 1.3633          |
| 1.3842        | 35.0  | 210  | 1.3585          |
| 1.3670        | 36.0  | 216  | 1.3495          |
| 1.3610        | 37.0  | 222  | 1.3469          |
| 1.3501        | 38.0  | 228  | 1.3452          |
| 1.3474        | 39.0  | 234  | 1.3374          |
| 1.3398        | 40.0  | 240  | 1.3362          |
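The logged steps also bound the training-set size: 240 optimizer steps over 40 epochs is 6 steps per epoch, which at a batch size of 512 means at most about 3,072 training examples (fewer if the last batch of each epoch is partial):

```python
# Back-of-the-envelope check on the training-set size from the logged numbers.
total_steps = 240
num_epochs = 40
train_batch_size = 512

steps_per_epoch = total_steps // num_epochs              # 6
max_train_examples = steps_per_epoch * train_batch_size  # 3072 (upper bound)
print(steps_per_epoch, max_train_examples)
```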

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Model size

  • 7.8M parameters (F32, Safetensors format)