mixed-nllb-top200k-mt

This model is a fine-tuned version of facebook/nllb-200-distilled-600M on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.1585
  • Bleu: 13.8458
  • Chrf: 36.5087

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 8e-05
  • train_batch_size: 16
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 128
  • optimizer: OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine_with_min_lr
  • lr_scheduler_warmup_steps: 0.1
  • num_epochs: 5.0
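
The batch-size and scheduler entries above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the training code: `NUM_DEVICES = 1` is inferred from 16 × 8 = 128, the warmup value 0.1 is read as a fraction of total steps, and `MIN_LR_RATE` is a hypothetical floor, since the card does not report the minimum learning rate. The helper names `effective_batch_size` and `lr_at` are made up for this example.

```python
import math

# Values taken from the hyperparameter list and the final step in the results.
PER_DEVICE_BATCH = 16
GRAD_ACCUM_STEPS = 8
PEAK_LR = 8e-05
TOTAL_STEPS = 7115

# Assumptions, not stated in the card.
NUM_DEVICES = 1        # consistent with 16 * 8 = 128
WARMUP_FRACTION = 0.1  # assumes the warmup value 0.1 is a fraction of total steps
MIN_LR_RATE = 0.1      # hypothetical floor, as a fraction of the peak rate


def effective_batch_size(per_device, accum_steps, devices):
    """Number of examples that contribute to one optimizer step."""
    return per_device * accum_steps * devices


def lr_at(step, total_steps=TOTAL_STEPS, peak_lr=PEAK_LR,
          warmup_fraction=WARMUP_FRACTION, min_lr_rate=MIN_LR_RATE):
    """Linear warmup to peak_lr, then cosine decay down to peak_lr * min_lr_rate."""
    warmup_steps = math.ceil(total_steps * warmup_fraction)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return peak_lr * (min_lr_rate + (1.0 - min_lr_rate) * cosine)


if __name__ == "__main__":
    print(effective_batch_size(PER_DEVICE_BATCH, GRAD_ACCUM_STEPS, NUM_DEVICES))
    for step in (0, 356, 712, 3500, 7115):
        print(step, lr_at(step))
```

With a different minimum rate the tail of the curve changes, but the warmup and the peak stay the same.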

Training results

| Training Loss | Epoch  | Step | Validation Loss | Bleu    | Chrf    |
|---------------|--------|------|-----------------|---------|---------|
| 16.4759       | 0.3514 | 500  | 3.1670          | 6.0004  | 24.4272 |
| 11.5069       | 0.7027 | 1000 | 2.7826          | 7.0662  | 28.2690 |
| 9.2266        | 1.0541 | 1500 | 2.5737          | 9.4713  | 29.9534 |
| 7.7717        | 1.4055 | 2000 | 2.4456          | 10.1876 | 32.1491 |
| 7.0392        | 1.7569 | 2500 | 2.3600          | 10.9698 | 32.8122 |
| 6.5306        | 2.1082 | 3000 | 2.2932          | 12.0184 | 34.3068 |
| 5.8066        | 2.4596 | 3500 | 2.2404          | 12.8806 | 34.8744 |
| 5.6442        | 2.8110 | 4000 | 2.2095          | 13.0357 | 35.2123 |
| 5.2669        | 3.1623 | 4500 | 2.1901          | 12.8163 | 35.7831 |
| 5.0469        | 3.5137 | 5000 | 2.1718          | 13.3429 | 36.2136 |
| 4.9437        | 3.8651 | 5500 | 2.1817          | 13.3994 | 35.7160 |
| 4.8553        | 4.2164 | 6000 | 2.1581          | 13.6634 | 36.2458 |
| 4.7227        | 4.5678 | 6500 | 2.1456          | 13.7885 | 36.6517 |
| 4.6865        | 4.9192 | 7000 | 2.1685          | 14.2061 | 36.7383 |
| 4.6865        | 5.0    | 7115 | 2.1585          | 13.8458 | 36.5087 |
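
The reported evaluation results correspond to the final row (step 7115), which is not the best row on any single metric. A small sketch for reading the table programmatically follows; the rows are copied from the results above, while the helper name `best_row` and the tuple layout are assumptions made for this example.

```python
# (step, epoch, validation_loss, bleu, chrf), copied from the training results.
ROWS = [
    (500, 0.3514, 3.1670, 6.0004, 24.4272),
    (1000, 0.7027, 2.7826, 7.0662, 28.2690),
    (1500, 1.0541, 2.5737, 9.4713, 29.9534),
    (2000, 1.4055, 2.4456, 10.1876, 32.1491),
    (2500, 1.7569, 2.3600, 10.9698, 32.8122),
    (3000, 2.1082, 2.2932, 12.0184, 34.3068),
    (3500, 2.4596, 2.2404, 12.8806, 34.8744),
    (4000, 2.8110, 2.2095, 13.0357, 35.2123),
    (4500, 3.1623, 2.1901, 12.8163, 35.7831),
    (5000, 3.5137, 2.1718, 13.3429, 36.2136),
    (5500, 3.8651, 2.1817, 13.3994, 35.7160),
    (6000, 4.2164, 2.1581, 13.6634, 36.2458),
    (6500, 4.5678, 2.1456, 13.7885, 36.6517),
    (7000, 4.9192, 2.1685, 14.2061, 36.7383),
    (7115, 5.0, 2.1585, 13.8458, 36.5087),
]

STEP, EPOCH, VAL_LOSS, BLEU, CHRF = range(5)


def best_row(rows, column, higher_is_better):
    """Return the row with the best value in the given column."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[column])


best_loss = best_row(ROWS, VAL_LOSS, higher_is_better=False)
best_bleu = best_row(ROWS, BLEU, higher_is_better=True)
best_chrf = best_row(ROWS, CHRF, higher_is_better=True)

# Optimizer steps per epoch, derived from the final row (7115 steps over 5 epochs).
steps_per_epoch = ROWS[-1][STEP] / ROWS[-1][EPOCH]

if __name__ == "__main__":
    print("lowest validation loss at step", best_loss[STEP])
    print("highest BLEU at step", best_bleu[STEP])
    print("highest chrF at step", best_chrf[STEP])
    print("steps per epoch:", steps_per_epoch)
```

Whether an earlier checkpoint was saved or is available is not stated in the card, so this only identifies which evaluation steps scored best.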

Framework versions

  • Transformers 5.7.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.8.5
  • Tokenizers 0.22.2