This repository is publicly accessible, but you have to accept the conditions to access its files and content.

mixed-nllb-top200k-mt

This model is a fine-tuned version of facebook/nllb-200-distilled-600M on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 2.1585
Bleu: 13.8458
Chrf: 36.5087

Model description

More information needed

Intended uses & limitations

More information needed
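
The card does not document usage, so the following is only a minimal sketch. It assumes the checkpoint keeps the standard NLLB interface of its base model, facebook/nllb-200-distilled-600M, and that it is loaded from the repository id madoss/mixed-nllb-top200k-mt. The language codes are placeholders: the card does not say which language pair(s) the fine-tuning covered. Because access to the files requires accepting the repository's conditions, loading is expected to need an authenticated Hugging Face session.

```python
MODEL_ID = "madoss/mixed-nllb-top200k-mt"

# Assumption: placeholder FLORES-200 language codes. The card does not state
# which language pair(s) this fine-tune was trained on; replace as needed.
SRC_LANG = "eng_Latn"
TGT_LANG = "fra_Latn"


def translate(text, src_lang=SRC_LANG, tgt_lang=TGT_LANG, model_id=MODEL_ID):
    """Translate a single string with the fine-tuned checkpoint."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, src_lang=src_lang)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        # NLLB selects the target language via the first generated token.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
        max_new_tokens=128,
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]


# Example call (downloads the weights, so it needs network access and auth):
# print(translate("The weather is nice today."))
```

The model and tokenizer are reloaded on every call to keep the sketch short; for repeated use, load them once and reuse them.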

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 8e-05
train_batch_size: 16
eval_batch_size: 32
seed: 42
gradient_accumulation_steps: 8
total_train_batch_size: 128
optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: cosine_with_min_lr
lr_scheduler_warmup_steps: 0.1
num_epochs: 5.0
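
The batch-size figures above can be cross-checked with a little arithmetic. The sketch below assumes a single training device (the card does not state the device count) and takes the final step count, 7115 at epoch 5.0, from the training results reported in this card. The resulting number of training examples is an estimate, not a documented figure.

```python
# Values copied from the card's hyperparameter list.
train_batch_size = 16
gradient_accumulation_steps = 8
num_epochs = 5
total_steps = 7115  # final optimizer step reported in the training results

# Assumption: one device; the card does not state the device count.
num_devices = 1

# Examples consumed per optimizer step.
effective_batch_size = train_batch_size * gradient_accumulation_steps * num_devices

# Optimizer steps per epoch, and the implied upper bound on examples per epoch
# (the last batch of an epoch may be only partially filled).
steps_per_epoch = total_steps // num_epochs
approx_train_examples = steps_per_epoch * effective_batch_size

print(effective_batch_size)   # 128, matching total_train_batch_size
print(steps_per_epoch)        # 1423
print(approx_train_examples)  # 182144
```

Under these assumptions the training split holds roughly 182,000 examples.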

Training results

Training Loss	Epoch	Step	Validation Loss	Bleu	Chrf
16.4759	0.3514	500	3.1670	6.0004	24.4272
11.5069	0.7027	1000	2.7826	7.0662	28.2690
9.2266	1.0541	1500	2.5737	9.4713	29.9534
7.7717	1.4055	2000	2.4456	10.1876	32.1491
7.0392	1.7569	2500	2.3600	10.9698	32.8122
6.5306	2.1082	3000	2.2932	12.0184	34.3068
5.8066	2.4596	3500	2.2404	12.8806	34.8744
5.6442	2.8110	4000	2.2095	13.0357	35.2123
5.2669	3.1623	4500	2.1901	12.8163	35.7831
5.0469	3.5137	5000	2.1718	13.3429	36.2136
4.9437	3.8651	5500	2.1817	13.3994	35.7160
4.8553	4.2164	6000	2.1581	13.6634	36.2458
4.7227	4.5678	6500	2.1456	13.7885	36.6517
4.6865	4.9192	7000	2.1685	14.2061	36.7383
4.6865	5.0	7115	2.1585	13.8458	36.5087
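
The headline metrics in this card come from the final step (7115), which is not the best row for any of the three validation metrics. The sketch below copies the table rows and picks the best step per metric. Whether intermediate checkpoints were saved or published is not stated in the card, so this only illustrates how the table can be read.

```python
# Rows copied from the training results table:
# (step, validation loss, BLEU, chrF)
rows = [
    (500, 3.1670, 6.0004, 24.4272),
    (1000, 2.7826, 7.0662, 28.2690),
    (1500, 2.5737, 9.4713, 29.9534),
    (2000, 2.4456, 10.1876, 32.1491),
    (2500, 2.3600, 10.9698, 32.8122),
    (3000, 2.2932, 12.0184, 34.3068),
    (3500, 2.2404, 12.8806, 34.8744),
    (4000, 2.2095, 13.0357, 35.2123),
    (4500, 2.1901, 12.8163, 35.7831),
    (5000, 2.1718, 13.3429, 36.2136),
    (5500, 2.1817, 13.3994, 35.7160),
    (6000, 2.1581, 13.6634, 36.2458),
    (6500, 2.1456, 13.7885, 36.6517),
    (7000, 2.1685, 14.2061, 36.7383),
    (7115, 2.1585, 13.8458, 36.5087),
]

best_loss = min(rows, key=lambda r: r[1])  # lower is better
best_bleu = max(rows, key=lambda r: r[2])  # higher is better
best_chrf = max(rows, key=lambda r: r[3])  # higher is better

print("lowest validation loss:", best_loss[0], best_loss[1])
print("highest BLEU:", best_bleu[0], best_bleu[2])
print("highest chrF:", best_chrf[0], best_chrf[3])
```

Validation loss is lowest at step 6500 (2.1456), while BLEU and chrF both peak at step 7000 (14.2061 and 36.7383).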

Framework versions

Transformers 5.7.0
Pytorch 2.8.0+cu128
Datasets 4.8.5
Tokenizers 0.22.2

Model size: 0.6B params (Safetensors)
Tensor type: F32
