esm2-baseline-gb1

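This model is a fine-tuned version of facebook/esm2_t12_35M_UR50D. The training dataset was not recorded (the auto-generated card lists it as None). It achieves the following results on the evaluation set, which match the final epoch (50) of the training log: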

Loss: 0.0933
Spearman: 0.9505
Pearson: 0.9608
Mse: 0.0933

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 32
eval_batch_size: 64
seed: 42
optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 212
num_epochs: 50
mixed_precision_training: Native AMP
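
To make the schedule concrete: the training log records 85 optimizer steps per epoch, so 50 epochs give 4250 steps in total, and 212 warmup steps cover roughly the first 5% of the run. The sketch below illustrates a linear-warmup, half-cosine-decay schedule in plain Python. It assumes the common half-cycle cosine shape that decays to zero, and it is not taken from the author's training script. `lr_at_step` is a hypothetical name.

```python
import math

PEAK_LR = 5e-05        # learning_rate from the card
WARMUP_STEPS = 212     # lr_scheduler_warmup_steps from the card
TOTAL_STEPS = 50 * 85  # num_epochs x steps per epoch (85 is read off the training log)


def lr_at_step(step):
    """Hypothetical helper: learning rate after `step` optimizer steps."""
    if step < WARMUP_STEPS:
        # Linear warmup from 0 to the peak learning rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Cosine decay from the peak down to 0 over the remaining steps.
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


# Print the learning rate at a few points: start, mid-warmup, peak, mid-decay, end.
for step in (0, 106, 212, 2231, 4250):
    print(f"step {step:>4}: lr = {lr_at_step(step):.2e}")
```

Under these assumptions the learning rate is nearly zero for the last several epochs. That is consistent with the validation metrics barely moving after about epoch 40.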

Training results

Training Loss	Epoch	Step	Validation Loss	Spearman	Pearson	Mse
1.0087	1.0	85	1.1785	0.3811	0.3269	1.1785
1.0970	2.0	170	0.8212	0.6283	0.5801	0.8212
0.7012	3.0	255	0.7354	0.7396	0.6531	0.7354
0.5652	4.0	340	0.6340	0.7215	0.7052	0.6340
0.6484	5.0	425	0.5649	0.7957	0.7912	0.5649
0.4971	6.0	510	0.4307	0.8170	0.8087	0.4307
0.3682	7.0	595	0.3330	0.8478	0.8559	0.3330
0.3242	8.0	680	0.3750	0.8461	0.8623	0.3750
0.2856	9.0	765	0.2834	0.8644	0.8818	0.2834
0.2004	10.0	850	0.2564	0.8764	0.8963	0.2564
0.2301	11.0	935	0.2626	0.8806	0.9012	0.2626
0.1922	12.0	1020	0.2589	0.8947	0.9060	0.2589
0.1536	13.0	1105	0.2027	0.8897	0.9253	0.2027
0.1490	14.0	1190	0.2104	0.9004	0.9103	0.2104
0.1441	15.0	1275	0.2873	0.9095	0.9255	0.2873
0.0967	16.0	1360	0.1551	0.9125	0.9348	0.1551
0.1270	17.0	1445	0.1474	0.9119	0.9402	0.1474
0.1299	18.0	1530	0.1543	0.9215	0.9415	0.1543
0.0958	19.0	1615	0.1604	0.9197	0.9422	0.1604
0.0794	20.0	1700	0.1248	0.9203	0.9486	0.1248
0.0885	21.0	1785	0.1568	0.9119	0.9412	0.1568
0.0473	22.0	1870	0.1490	0.9280	0.9407	0.1490
0.0696	23.0	1955	0.1321	0.9262	0.9512	0.1321
0.0513	24.0	2040	0.1034	0.9310	0.9568	0.1034
0.0409	25.0	2125	0.1153	0.9359	0.9547	0.1153
0.0297	26.0	2210	0.0937	0.9448	0.9606	0.0937
0.0302	27.0	2295	0.1094	0.9372	0.9538	0.1094
0.0267	28.0	2380	0.0907	0.9416	0.9618	0.0907
0.0228	29.0	2465	0.0940	0.9491	0.9607	0.0940
0.0218	30.0	2550	0.0985	0.9493	0.9606	0.0985
0.0162	31.0	2635	0.0905	0.9494	0.9624	0.0905
0.0097	32.0	2720	0.0958	0.9445	0.9604	0.0958
0.0096	33.0	2805	0.0900	0.9518	0.9622	0.0900
0.0065	34.0	2890	0.0911	0.9510	0.9616	0.0911
0.0054	35.0	2975	0.0949	0.9518	0.9601	0.0949
0.0056	36.0	3060	0.0933	0.9504	0.9613	0.0933
0.0032	37.0	3145	0.0931	0.9511	0.9609	0.0931
0.0020	38.0	3230	0.0932	0.9500	0.9610	0.0932
0.0017	39.0	3315	0.0932	0.9502	0.9610	0.0932
0.0015	40.0	3400	0.0919	0.9506	0.9616	0.0919
0.0011	41.0	3485	0.0926	0.9509	0.9610	0.0926
0.0004	42.0	3570	0.0931	0.9505	0.9609	0.0931
0.0006	43.0	3655	0.0927	0.9507	0.9611	0.0927
0.0004	44.0	3740	0.0933	0.9504	0.9607	0.0933
0.0005	45.0	3825	0.0932	0.9506	0.9608	0.0932
0.0002	46.0	3910	0.0931	0.9507	0.9608	0.0931
0.0002	47.0	3995	0.0931	0.9506	0.9608	0.0931
0.0001	48.0	4080	0.0932	0.9507	0.9608	0.0932
0.0001	49.0	4165	0.0932	0.9506	0.9608	0.0932
0.0001	50.0	4250	0.0933	0.9505	0.9608	0.0933
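
The headline results correspond to the final epoch, which is not the best row in the log: the lowest validation loss is 0.0900 at epoch 33, against 0.0933 at epoch 50. The card does not say whether the published weights come from the final step or from an earlier checkpoint. The sketch below shows one way to scan the log for the best epoch. The rows are copied from the table, and `best_row` is a hypothetical helper name.

```python
# A few rows copied from the table: (epoch, validation_loss, spearman, pearson).
ROWS = [
    (28, 0.0907, 0.9416, 0.9618),
    (31, 0.0905, 0.9494, 0.9624),
    (33, 0.0900, 0.9518, 0.9622),
    (35, 0.0949, 0.9518, 0.9601),
    (40, 0.0919, 0.9506, 0.9616),
    (50, 0.0933, 0.9505, 0.9608),
]


def best_row(rows, column, higher_is_better):
    """Hypothetical helper: return the first row with the best value in `column`."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[column])


lowest_loss = best_row(ROWS, column=1, higher_is_better=False)
highest_pearson = best_row(ROWS, column=3, higher_is_better=True)
print("lowest validation loss:", lowest_loss)
print("highest Pearson:", highest_pearson)
```

The differences between the late epochs are small (a few thousandths in every metric), so which row counts as best depends on the metric chosen.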

Framework versions

Transformers 5.6.2
PyTorch 2.11.0+cu130
Datasets 4.8.5
Tokenizers 0.22.2

Safetensors

Model size: 33.5M params
Tensor type: F32

Model tree for AliSaadatV/esm2-baseline-gb1

Base model: facebook/esm2_t12_35M_UR50D