calculator_model_test

This model is a fine-tuned version of zzox531/calculator_model_test on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0566

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 100
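The hyperparameters above, together with the results table below, let one back out the approximate training-set size: 5 optimizer steps per epoch at a train batch size of 512 implies roughly 2,560 training examples, assuming one optimizer step per batch (the card does not mention gradient accumulation). A minimal sketch:

```python
# Back-of-the-envelope check of the training setup described above.
# Assumes no gradient accumulation (one optimizer step per batch).
train_batch_size = 512
steps_per_epoch = 5   # from the results table: step 5 at epoch 1.0
num_epochs = 100

total_steps = steps_per_epoch * num_epochs            # 500, matching the last table row
approx_train_examples = steps_per_epoch * train_batch_size

print(total_steps)
print(approx_train_examples)
```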

Training results

Training Loss Epoch Step Validation Loss
0.1807 1.0 5 0.1524
0.1720 2.0 10 0.1506
0.1684 3.0 15 0.1488
0.1650 4.0 20 0.1413
0.1600 5.0 25 0.1404
0.1596 6.0 30 0.1395
0.1540 7.0 35 0.1335
0.1535 8.0 40 0.1309
0.1494 9.0 45 0.1302
0.1469 10.0 50 0.1270
0.1438 11.0 55 0.1246
0.1441 12.0 60 0.1232
0.1406 13.0 65 0.1215
0.1368 14.0 70 0.1161
0.1340 15.0 75 0.1152
0.1309 16.0 80 0.1125
0.1293 17.0 85 0.1114
0.1266 18.0 90 0.1087
0.1259 19.0 95 0.1065
0.1229 20.0 100 0.1084
0.1215 21.0 105 0.1039
0.1193 22.0 110 0.1020
0.1180 23.0 115 0.1019
0.1147 24.0 120 0.1017
0.1163 25.0 125 0.0958
0.1137 26.0 130 0.0959
0.1094 27.0 135 0.0936
0.1099 28.0 140 0.0919
0.1095 29.0 145 0.0913
0.1076 30.0 150 0.0901
0.1060 31.0 155 0.0875
0.1057 32.0 160 0.0878
0.1054 33.0 165 0.0876
0.1050 34.0 170 0.0886
0.1032 35.0 175 0.0841
0.1018 36.0 180 0.0843
0.1040 37.0 185 0.0816
0.0998 38.0 190 0.0847
0.0988 39.0 195 0.0805
0.1018 40.0 200 0.0806
0.0992 41.0 205 0.0800
0.0998 42.0 210 0.0780
0.0970 43.0 215 0.0766
0.0948 44.0 220 0.0770
0.0931 45.0 225 0.0745
0.0934 46.0 230 0.0768
0.0924 47.0 235 0.0754
0.0906 48.0 240 0.0734
0.0919 49.0 245 0.0754
0.0897 50.0 250 0.0712
0.0889 51.0 255 0.0721
0.0882 52.0 260 0.0697
0.0881 53.0 265 0.0718
0.0859 54.0 270 0.0690
0.0864 55.0 275 0.0700
0.0844 56.0 280 0.0681
0.0829 57.0 285 0.0686
0.0855 58.0 290 0.0668
0.0817 59.0 295 0.0671
0.0821 60.0 300 0.0668
0.0812 61.0 305 0.0653
0.0800 62.0 310 0.0648
0.0797 63.0 315 0.0650
0.0794 64.0 320 0.0655
0.0792 65.0 325 0.0643
0.0796 66.0 330 0.0641
0.0770 67.0 335 0.0631
0.0796 68.0 340 0.0635
0.0784 69.0 345 0.0626
0.0767 70.0 350 0.0616
0.0768 71.0 355 0.0615
0.0738 72.0 360 0.0609
0.0745 73.0 365 0.0617
0.0784 74.0 370 0.0614
0.0764 75.0 375 0.0613
0.0744 76.0 380 0.0618
0.0750 77.0 385 0.0602
0.0754 78.0 390 0.0599
0.0729 79.0 395 0.0602
0.0727 80.0 400 0.0589
0.0732 81.0 405 0.0597
0.0756 82.0 410 0.0588
0.0722 83.0 415 0.0580
0.0703 84.0 420 0.0592
0.0739 85.0 425 0.0582
0.0705 86.0 430 0.0579
0.0716 87.0 435 0.0584
0.0703 88.0 440 0.0582
0.0707 89.0 445 0.0576
0.0707 90.0 450 0.0574
0.0717 91.0 455 0.0574
0.0712 92.0 460 0.0572
0.0707 93.0 465 0.0572
0.0691 94.0 470 0.0570
0.0694 95.0 475 0.0567
0.0690 96.0 480 0.0567
0.0703 97.0 485 0.0567
0.0698 98.0 490 0.0566
0.0695 99.0 495 0.0566
0.0696 100.0 500 0.0566
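The trajectory above can be summarized in one line; the two values below are transcribed from the first and last rows of the table:

```python
# Summarize the validation-loss trajectory from the results table above.
first_val_loss = 0.1524   # epoch 1
final_val_loss = 0.0566   # epoch 100

reduction = 1 - final_val_loss / first_val_loss
print(f"Validation loss fell by {reduction:.1%} over 100 epochs")
```

Validation loss decreases almost monotonically throughout, suggesting the model had not yet overfit at 100 epochs.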

Framework versions

  • Transformers 5.0.0
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2
Model details

  • Downloads last month: 59
  • Model size: 7.8M parameters (Safetensors, F32)