calculator_model_test

This model is a fine-tuned version of an unspecified base model on the None dataset. It achieves the following results on the evaluation set:

  • Loss: 1.5025

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
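As a rough sketch, the hyperparameters above map onto a Hugging Face `TrainingArguments` configuration like the following. This is a reconstruction, not the author's actual script; the `output_dir` name and the per-epoch evaluation strategy are assumptions (the latter inferred from the one-validation-loss-per-epoch table below).

```python
from transformers import TrainingArguments

# Sketch of the configuration implied by the hyperparameter list.
# output_dir is an assumption, not taken from the source.
args = TrainingArguments(
    output_dir="calculator_model_test",
    learning_rate=1e-3,
    per_device_train_batch_size=512,
    per_device_eval_batch_size=512,
    seed=42,
    optim="adamw_torch_fused",   # fused AdamW, as reported
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=40,
    eval_strategy="epoch",       # assumption: table reports one validation loss per epoch
)
```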

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.4213        | 1.0   | 6    | 2.7742          |
| 2.3873        | 2.0   | 12   | 1.9848          |
| 1.8855        | 3.0   | 18   | 1.6983          |
| 1.6818        | 4.0   | 24   | 1.5908          |
| 1.5942        | 5.0   | 30   | 1.5682          |
| 1.5581        | 6.0   | 36   | 1.5332          |
| 1.5573        | 7.0   | 42   | 1.5375          |
| 1.5423        | 8.0   | 48   | 1.5241          |
| 1.5435        | 9.0   | 54   | 1.5207          |
| 1.5402        | 10.0  | 60   | 1.5169          |
| 1.5330        | 11.0  | 66   | 1.5100          |
| 1.5252        | 12.0  | 72   | 1.5091          |
| 1.5234        | 13.0  | 78   | 1.5080          |
| 1.5284        | 14.0  | 84   | 1.5142          |
| 1.5221        | 15.0  | 90   | 1.5212          |
| 1.5305        | 16.0  | 96   | 1.5103          |
| 1.5199        | 17.0  | 102  | 1.5069          |
| 1.5174        | 18.0  | 108  | 1.5121          |
| 1.5104        | 19.0  | 114  | 1.5072          |
| 1.5197        | 20.0  | 120  | 1.5043          |
| 1.5101        | 21.0  | 126  | 1.5044          |
| 1.5145        | 22.0  | 132  | 1.5040          |
| 1.5226        | 23.0  | 138  | 1.5058          |
| 1.5299        | 24.0  | 144  | 1.5069          |
| 1.5290        | 25.0  | 150  | 1.5061          |
| 1.5070        | 26.0  | 156  | 1.5067          |
| 1.5077        | 27.0  | 162  | 1.5066          |
| 1.5240        | 28.0  | 168  | 1.5030          |
| 1.5344        | 29.0  | 174  | 1.5010          |
| 1.5148        | 30.0  | 180  | 1.5009          |
| 1.5182        | 31.0  | 186  | 1.5029          |
| 1.5158        | 32.0  | 192  | 1.5058          |
| 1.5093        | 33.0  | 198  | 1.5057          |
| 1.5149        | 34.0  | 204  | 1.5037          |
| 1.5098        | 35.0  | 210  | 1.5022          |
| 1.5095        | 36.0  | 216  | 1.5020          |
| 1.5085        | 37.0  | 222  | 1.5021          |
| 1.5091        | 38.0  | 228  | 1.5020          |
| 1.5020        | 39.0  | 234  | 1.5025          |
| 1.5103        | 40.0  | 240  | 1.5025          |
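The run above can be sanity-checked with a few lines of plain Python: 240 total steps over 40 epochs gives 6 optimizer steps per epoch, which at a train batch size of 512 implies at most 6 × 512 = 3072 training examples, and the best validation loss in the table is 1.5009 (epoch 30), slightly below the final 1.5025.

```python
# Sanity checks on the reported run; values are copied from the table above.
total_steps = 240
num_epochs = 40
train_batch_size = 512

steps_per_epoch = total_steps // num_epochs        # 6 steps per epoch
max_train_examples = steps_per_epoch * train_batch_size  # at most 3072 examples

# (epoch -> validation loss) for the best and final epochs.
val_losses = {29: 1.5010, 30: 1.5009, 38: 1.5020, 39: 1.5025, 40: 1.5025}
best_epoch = min(val_losses, key=val_losses.get)

print(steps_per_epoch)                      # 6
print(max_train_examples)                   # 3072
print(best_epoch, val_losses[best_epoch])   # 30 1.5009
```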

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Model size: 7.8M params (Safetensors, F32)