# XNLI CDA Model with Llama (Cross-Lingual)

This model was fine-tuned on the XNLI dataset using Counterfactual Data Augmentation (CDA), with counterfactuals generated by Llama.
## Training Parameters
- Dataset: XNLI
- Mode: CDA
- Selection Model: Llama
- Selection Method: Random
- Cross Lingual: true
- Train Size: 2400 examples
- Epochs: 12
- Batch Size: 24
- Effective Batch Size: 96 (batch size 24 × 4 gradient accumulation steps)
- Learning Rate: 2e-05
- Patience: 6
- Max Length: 256
- Gradient Accumulation Steps: 4
- Warmup Ratio: 0.1
- Weight Decay: 0.01
- Optimizer: AdamW
- Scheduler: cosine_with_warmup
- Random Seed: 42
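As a sketch, the hyperparameters above can be collected into a plain Python config dict (key names are illustrative, not necessarily those used by the actual training script):

```python
# Training hyperparameters from the list above, gathered into one dict.
# Key names are illustrative; the real training script may differ.
config = {
    "dataset": "xnli",
    "mode": "cda",
    "selection_model": "llama",
    "selection_method": "random",
    "cross_lingual": True,
    "train_size": 2400,
    "epochs": 12,
    "batch_size": 24,
    "learning_rate": 2e-5,
    "patience": 6,
    "max_length": 256,
    "gradient_accumulation_steps": 4,
    "warmup_ratio": 0.1,
    "weight_decay": 0.01,
    "optimizer": "adamw",
    "scheduler": "cosine_with_warmup",
    "seed": 42,
}

# Effective batch size = batch_size * gradient_accumulation_steps.
effective_batch_size = config["batch_size"] * config["gradient_accumulation_steps"]
print(effective_batch_size)  # 96
```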
## Performance

- Overall Accuracy: 58.19% (unweighted mean over the six evaluation languages)
- Overall Loss: 0.0189
### Language-Specific Performance
- English (EN): 70.86%
- German (DE): 61.58%
- Arabic (AR): 55.01%
- Spanish (ES): 63.51%
- Hindi (HI): 51.28%
- Swahili (SW): 46.89%
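The overall accuracy is the unweighted (macro) average of the six per-language scores, which can be verified directly (a quick sanity check, not part of the original evaluation code):

```python
# Per-language XNLI accuracies (%) from the list above.
per_language = {
    "EN": 70.86,
    "DE": 61.58,
    "AR": 55.01,
    "ES": 63.51,
    "HI": 51.28,
    "SW": 46.89,
}

# Overall accuracy is the unweighted (macro) average across languages.
overall = sum(per_language.values()) / len(per_language)
print(round(overall, 2))  # 58.19
```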
## Model Information

- Base Model: bert-base-multilingual-cased
- Task: Natural Language Inference
- Languages: 6 (EN, DE, AR, ES, HI, SW)