calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset (the card records the dataset as None). It achieves the following results on the evaluation set:

Loss: 0.0010

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 512
eval_batch_size: 512
seed: 42
optimizer: OptimizerNames.ADAMW_TORCH_FUSED (fused PyTorch AdamW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 80
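
The hyperparameters above could be written as `TrainingArguments` keyword arguments roughly as follows. This is a minimal sketch, not the author's training script. Mapping `train_batch_size` to `per_device_train_batch_size` assumes a single device with no gradient accumulation, and both the `output_dir` default and the helper name `build_training_arguments` are hypothetical.

```python
# Hyperparameters from the list above, expressed as keyword arguments for
# transformers.TrainingArguments. Per-device batch sizes assume one device.
TRAINING_KWARGS = {
    "learning_rate": 1e-3,
    "per_device_train_batch_size": 512,  # assumption: single device
    "per_device_eval_batch_size": 512,   # assumption: single device
    "seed": 42,
    "optim": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 80,
}


def build_training_arguments(output_dir="calculator_model_test"):
    """Build TrainingArguments; output_dir is a hypothetical choice."""
    # Imported lazily so the dictionary can be inspected without transformers.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```

Settings the card does not list, such as warmup steps, weight decay, and the evaluation strategy, are left at their library defaults here and may differ from the original run.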

Training results

Training Loss	Epoch	Step	Validation Loss
2.4893	1.0	13	1.6643
1.2679	2.0	26	0.8866
0.8178	3.0	39	0.7130
0.6528	4.0	52	0.5806
0.5426	5.0	65	0.5071
0.5094	6.0	78	0.4677
0.4514	7.0	91	0.3869
0.3829	8.0	104	0.3313
0.3274	9.0	117	0.2774
0.2904	10.0	130	0.2338
0.2507	11.0	143	0.1997
0.2193	12.0	156	0.1843
0.2027	13.0	169	0.1613
0.1787	14.0	182	0.1374
0.1596	15.0	195	0.1373
0.1485	16.0	208	0.1148
0.1350	17.0	221	0.1013
0.1191	18.0	234	0.0988
0.1056	19.0	247	0.0762
0.0920	20.0	260	0.0643
0.0819	21.0	273	0.0607
0.0733	22.0	286	0.0524
0.0633	23.0	299	0.0388
0.0531	24.0	312	0.0313
0.0461	25.0	325	0.0283
0.0406	26.0	338	0.0226
0.0315	27.0	351	0.0156
0.0245	28.0	364	0.0138
0.0203	29.0	377	0.0112
0.0165	30.0	390	0.0095
0.0145	31.0	403	0.0081
0.0127	32.0	416	0.0070
0.0105	33.0	429	0.0062
0.0098	34.0	442	0.0056
0.0087	35.0	455	0.0047
0.0079	36.0	468	0.0044
0.0067	37.0	481	0.0041
0.0065	38.0	494	0.0039
0.0060	39.0	507	0.0034
0.0053	40.0	520	0.0031
0.0049	41.0	533	0.0029
0.0046	42.0	546	0.0027
0.0042	43.0	559	0.0026
0.0039	44.0	572	0.0022
0.0041	45.0	585	0.0025
0.0037	46.0	598	0.0021
0.0036	47.0	611	0.0022
0.0035	48.0	624	0.0019
0.0031	49.0	637	0.0019
0.0032	50.0	650	0.0019
0.0029	51.0	663	0.0017
0.0026	52.0	676	0.0016
0.0024	53.0	689	0.0015
0.0026	54.0	702	0.0015
0.0023	55.0	715	0.0016
0.0024	56.0	728	0.0014
0.0023	57.0	741	0.0014
0.0023	58.0	754	0.0013
0.0021	59.0	767	0.0013
0.0020	60.0	780	0.0011
0.0019	61.0	793	0.0012
0.0020	62.0	806	0.0012
0.0018	63.0	819	0.0011
0.0017	64.0	832	0.0011
0.0017	65.0	845	0.0011
0.0018	66.0	858	0.0010
0.0016	67.0	871	0.0010
0.0017	68.0	884	0.0010
0.0016	69.0	897	0.0010
0.0014	70.0	910	0.0010
0.0016	71.0	923	0.0010
0.0016	72.0	936	0.0011
0.0015	73.0	949	0.0010
0.0015	74.0	962	0.0011
0.0014	75.0	975	0.0010
0.0014	76.0	988	0.0010
0.0014	77.0	1001	0.0010
0.0013	78.0	1014	0.0010
0.0014	79.0	1027	0.0010
0.0014	80.0	1040	0.0010
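
The Step column allows a quick consistency check: 13 optimizer steps per epoch over 80 epochs gives the 1040 steps in the last row. The sketch below reproduces that arithmetic, bounds the training set size implied by a batch size of 512, and evaluates a linear decay schedule. It assumes a single device, no gradient accumulation, a kept (not dropped) last batch, and zero warmup steps. None of these is stated in the card, and the function names are illustrative.

```python
LEARNING_RATE = 1e-3
BATCH_SIZE = 512
STEPS_PER_EPOCH = 13
NUM_EPOCHS = 80
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 1040, matching the final row


def train_set_size_bounds(steps_per_epoch, batch_size):
    """Smallest and largest dataset sizes n with ceil(n / batch_size) steps."""
    smallest = (steps_per_epoch - 1) * batch_size + 1
    largest = steps_per_epoch * batch_size
    return smallest, largest


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS,
              warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    low, high = train_set_size_bounds(STEPS_PER_EPOCH, BATCH_SIZE)
    print(f"total steps: {TOTAL_STEPS}")
    print(f"implied training set size: {low} to {high} examples")
    for epoch in (0, 20, 40, 60, 80):
        lr = linear_lr(epoch * STEPS_PER_EPOCH)
        print(f"epoch {epoch:2d}: lr = {lr:.6f}")
```

Under these assumptions the learning rate has halved by epoch 40 and reaches zero at step 1040, so the small updates late in training would be in line with the validation loss settling near 0.0010.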

Framework versions

Transformers 5.0.0
PyTorch 2.10.0+cu128
Datasets 4.0.0
Tokenizers 0.22.2

Safetensors

Model size: 7.8M params
Tensor type: F32
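
As a rough guide, the parameter count and tensor type reported here imply the size of the stored weights. The sketch below is only an estimate: it uses the rounded figure of 7.8M parameters, ignores file metadata and any non-parameter buffers, and the helper name `weight_size_mb` is illustrative.

```python
PARAM_COUNT = 7_800_000  # "7.8M params" as reported, so only approximate
BYTES_PER_PARAM = {"F32": 4, "F16": 2, "BF16": 2}


def weight_size_mb(param_count, tensor_type="F32"):
    """Approximate weight size in megabytes (1 MB = 10**6 bytes)."""
    return param_count * BYTES_PER_PARAM[tensor_type] / 1e6


if __name__ == "__main__":
    print(f"approximate F32 weight size: {weight_size_mb(PARAM_COUNT):.1f} MB")
```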
