esm2-baseline-gb1

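This model is a fine-tuned version of facebook/esm2_t12_35M_UR50D. The training dataset is not recorded in this card (the auto-generated text lists it as None). It achieves the following results on the evaluation set, which match the final epoch (step 4250) of the training log: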

  • Loss: 0.0933
  • Spearman: 0.9505
  • Pearson: 0.9608
  • MSE: 0.0933

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 64
  • seed: 42
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 212
  • num_epochs: 50
  • mixed_precision_training: Native AMP
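
To make the scheduler settings concrete, the sketch below evaluates a learning rate at a given optimizer step. It assumes the common form of a cosine schedule with warmup: a linear ramp from 0 to learning_rate over the warmup steps, then a half-cosine decay to 0. The card only names the scheduler type, so this exact shape is an assumption. The total of 4250 steps is taken from the last row of the training log (85 steps per epoch over 50 epochs), and the function name `cosine_lr` is hypothetical.

```python
import math

BASE_LR = 5e-05      # learning_rate from the list above
WARMUP_STEPS = 212   # lr_scheduler_warmup_steps from the list above
TOTAL_STEPS = 4250   # assumption: final step in the training log (85 x 50)


def cosine_lr(step, base_lr=BASE_LR, warmup_steps=WARMUP_STEPS,
              total_steps=TOTAL_STEPS):
    """Learning rate at `step`: linear warmup, then half-cosine decay to 0."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


# Print the rate at the start, end of warmup, one mid-run step, and the end.
for step in (0, 106, 212, 2231, 4250):
    print(step, cosine_lr(step))
```

Under these assumptions the warmup covers roughly the first 5% of training (212 of 4250 steps), and the rate is close to zero over the last few epochs, which is consistent with the evaluation metrics barely moving after about epoch 40.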

Training results

Training Loss | Epoch | Step | Validation Loss | Spearman | Pearson | MSE
1.0087 | 1.0 | 85 | 1.1785 | 0.3811 | 0.3269 | 1.1785
1.0970 | 2.0 | 170 | 0.8212 | 0.6283 | 0.5801 | 0.8212
0.7012 | 3.0 | 255 | 0.7354 | 0.7396 | 0.6531 | 0.7354
0.5652 | 4.0 | 340 | 0.6340 | 0.7215 | 0.7052 | 0.6340
0.6484 | 5.0 | 425 | 0.5649 | 0.7957 | 0.7912 | 0.5649
0.4971 | 6.0 | 510 | 0.4307 | 0.8170 | 0.8087 | 0.4307
0.3682 | 7.0 | 595 | 0.3330 | 0.8478 | 0.8559 | 0.3330
0.3242 | 8.0 | 680 | 0.3750 | 0.8461 | 0.8623 | 0.3750
0.2856 | 9.0 | 765 | 0.2834 | 0.8644 | 0.8818 | 0.2834
0.2004 | 10.0 | 850 | 0.2564 | 0.8764 | 0.8963 | 0.2564
0.2301 | 11.0 | 935 | 0.2626 | 0.8806 | 0.9012 | 0.2626
0.1922 | 12.0 | 1020 | 0.2589 | 0.8947 | 0.9060 | 0.2589
0.1536 | 13.0 | 1105 | 0.2027 | 0.8897 | 0.9253 | 0.2027
0.1490 | 14.0 | 1190 | 0.2104 | 0.9004 | 0.9103 | 0.2104
0.1441 | 15.0 | 1275 | 0.2873 | 0.9095 | 0.9255 | 0.2873
0.0967 | 16.0 | 1360 | 0.1551 | 0.9125 | 0.9348 | 0.1551
0.1270 | 17.0 | 1445 | 0.1474 | 0.9119 | 0.9402 | 0.1474
0.1299 | 18.0 | 1530 | 0.1543 | 0.9215 | 0.9415 | 0.1543
0.0958 | 19.0 | 1615 | 0.1604 | 0.9197 | 0.9422 | 0.1604
0.0794 | 20.0 | 1700 | 0.1248 | 0.9203 | 0.9486 | 0.1248
0.0885 | 21.0 | 1785 | 0.1568 | 0.9119 | 0.9412 | 0.1568
0.0473 | 22.0 | 1870 | 0.1490 | 0.9280 | 0.9407 | 0.1490
0.0696 | 23.0 | 1955 | 0.1321 | 0.9262 | 0.9512 | 0.1321
0.0513 | 24.0 | 2040 | 0.1034 | 0.9310 | 0.9568 | 0.1034
0.0409 | 25.0 | 2125 | 0.1153 | 0.9359 | 0.9547 | 0.1153
0.0297 | 26.0 | 2210 | 0.0937 | 0.9448 | 0.9606 | 0.0937
0.0302 | 27.0 | 2295 | 0.1094 | 0.9372 | 0.9538 | 0.1094
0.0267 | 28.0 | 2380 | 0.0907 | 0.9416 | 0.9618 | 0.0907
0.0228 | 29.0 | 2465 | 0.0940 | 0.9491 | 0.9607 | 0.0940
0.0218 | 30.0 | 2550 | 0.0985 | 0.9493 | 0.9606 | 0.0985
0.0162 | 31.0 | 2635 | 0.0905 | 0.9494 | 0.9624 | 0.0905
0.0097 | 32.0 | 2720 | 0.0958 | 0.9445 | 0.9604 | 0.0958
0.0096 | 33.0 | 2805 | 0.0900 | 0.9518 | 0.9622 | 0.0900
0.0065 | 34.0 | 2890 | 0.0911 | 0.9510 | 0.9616 | 0.0911
0.0054 | 35.0 | 2975 | 0.0949 | 0.9518 | 0.9601 | 0.0949
0.0056 | 36.0 | 3060 | 0.0933 | 0.9504 | 0.9613 | 0.0933
0.0032 | 37.0 | 3145 | 0.0931 | 0.9511 | 0.9609 | 0.0931
0.0020 | 38.0 | 3230 | 0.0932 | 0.9500 | 0.9610 | 0.0932
0.0017 | 39.0 | 3315 | 0.0932 | 0.9502 | 0.9610 | 0.0932
0.0015 | 40.0 | 3400 | 0.0919 | 0.9506 | 0.9616 | 0.0919
0.0011 | 41.0 | 3485 | 0.0926 | 0.9509 | 0.9610 | 0.0926
0.0004 | 42.0 | 3570 | 0.0931 | 0.9505 | 0.9609 | 0.0931
0.0006 | 43.0 | 3655 | 0.0927 | 0.9507 | 0.9611 | 0.0927
0.0004 | 44.0 | 3740 | 0.0933 | 0.9504 | 0.9607 | 0.0933
0.0005 | 45.0 | 3825 | 0.0932 | 0.9506 | 0.9608 | 0.0932
0.0002 | 46.0 | 3910 | 0.0931 | 0.9507 | 0.9608 | 0.0931
0.0002 | 47.0 | 3995 | 0.0931 | 0.9506 | 0.9608 | 0.0931
0.0001 | 48.0 | 4080 | 0.0932 | 0.9507 | 0.9608 | 0.0932
0.0001 | 49.0 | 4165 | 0.0932 | 0.9506 | 0.9608 | 0.0932
0.0001 | 50.0 | 4250 | 0.0933 | 0.9505 | 0.9608 | 0.0933
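
The headline results at the top of this card correspond to the final row (epoch 50), while the lowest validation loss in the log occurs at epoch 33. The sketch below shows one way to pick a "best" epoch from such a log. It uses a handful of rows copied from the table; the function name `best_epoch` and the tie-breaking rule (earliest epoch wins) are assumptions made for illustration, and the card does not state whether any checkpoint selection was performed.

```python
# (epoch, validation_loss, spearman, pearson), copied from selected log rows.
LOG_ROWS = [
    (26, 0.0937, 0.9448, 0.9606),
    (28, 0.0907, 0.9416, 0.9618),
    (31, 0.0905, 0.9494, 0.9624),
    (33, 0.0900, 0.9518, 0.9622),
    (35, 0.0949, 0.9518, 0.9601),
    (40, 0.0919, 0.9506, 0.9616),
    (50, 0.0933, 0.9505, 0.9608),
]


def best_epoch(rows, metric_index, higher_is_better):
    """Return the epoch with the best metric; ties go to the earliest epoch."""
    pick = max if higher_is_better else min
    best_value = pick(row[metric_index] for row in rows)
    # Rows are in chronological order, so the first match is the earliest.
    return next(row[0] for row in rows if row[metric_index] == best_value)


best_by_loss = best_epoch(LOG_ROWS, 1, higher_is_better=False)
best_by_spearman = best_epoch(LOG_ROWS, 2, higher_is_better=True)
best_by_pearson = best_epoch(LOG_ROWS, 3, higher_is_better=True)
print(best_by_loss, best_by_spearman, best_by_pearson)
```

The three criteria do not agree on a single epoch, and the differences between late epochs are in the third or fourth decimal place, so the choice of selection metric matters little for this run.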

Framework versions

  • Transformers 5.6.2
  • Pytorch 2.11.0+cu130
  • Datasets 4.8.5
  • Tokenizers 0.22.2

Model size: 33.5M params (Safetensors, tensor type F32)
Repository: AliSaadatV/esm2-baseline-gb1