RU-Trans-Init

Russian scientific T5 model initialized from EN-T5-Sci using WECHSEL and a language-specific SentencePiece 32k tokenizer.

Model Details

This is one of the non-English scientific T5 transfer models from the paper. The model keeps the EN-T5-Sci Transformer body weights and re-initializes the embedding layer with WECHSEL to match a language-specific target SentencePiece tokenizer.

  • Paper name: RU-Trans-Init
  • Model role: main
  • Source/base model: EN-T5-Sci
  • Code and pipeline: GitHub repository
  • Architecture: T5 encoder-decoder
  • SciLaD dataset: scilons/SciLaD-all-text-v1
  • Evaluation benchmark: Global-MMLU
  • Target-language tokenizer: language-specific SentencePiece 32k tokenizer trained on the Russian SciLaD split

Evaluated against: Global-MMLU (zero-shot).

WECHSEL resources: English fastText embeddings and Russian fastText embeddings (ru), aligned with the Russian bilingual dictionary.
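The core WECHSEL step can be sketched in a few lines: each target-tokenizer token gets an input embedding computed as a similarity-weighted average of source-token embeddings, with similarities measured between static (fastText-style) subword vectors that are assumed to already live in one aligned cross-lingual space. This is a toy, dependency-free illustration, not the paper's implementation; `k` and `temperature` are illustrative hyperparameters.

```python
import math

def cosine(u, v):
    """Cosine similarity between two plain-list vectors."""
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den if den else 0.0

def wechsel_init(src_tok_emb, src_static, tgt_static, k=2, temperature=0.1):
    """Toy WECHSEL-style embedding initialization.

    src_tok_emb: source token -> pretrained model embedding row
    src_static / tgt_static: token -> static vector, assumed already
        aligned into a single cross-lingual space (via a bilingual
        dictionary in the actual method).
    Each target token's embedding is a softmax(similarity / temperature)-
    weighted average of the model embeddings of its k most similar
    source tokens.
    """
    tgt_emb = {}
    dim = len(next(iter(src_tok_emb.values())))
    for t, t_vec in tgt_static.items():
        # k nearest source tokens in the aligned static space
        sims = sorted(
            ((cosine(t_vec, s_vec), s) for s, s_vec in src_static.items()),
            reverse=True,
        )[:k]
        weights = [math.exp(sim / temperature) for sim, _ in sims]
        z = sum(weights)
        vec = [0.0] * dim
        for w, (_, s) in zip(weights, sims):
            for i, x in enumerate(src_tok_emb[s]):
                vec[i] += (w / z) * x
        tgt_emb[t] = vec
    return tgt_emb

# Tiny demo: "наука" is closest to "science" in the aligned space,
# so with k=1 it inherits that token's model embedding exactly.
emb = wechsel_init(
    src_tok_emb={"science": [2.0, 0.0], "cat": [0.0, 2.0]},
    src_static={"science": [1.0, 0.0], "cat": [0.0, 1.0]},
    tgt_static={"наука": [0.9, 0.1]},
    k=1,
)
print(emb["наука"])  # → [2.0, 0.0]
```

In the full method the static vectors are fastText embeddings aligned with the bilingual dictionary listed above, and the non-embedding Transformer weights are copied unchanged from EN-T5-Sci.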

Evaluation

Zero-shot Global-MMLU accuracy reported by the paper aggregation:

Metric             Accuracy
Average            26.36
STEM               27.12
Humanities         24.89
Social Sciences    28.86
Other              25.33

Limitations

The model is evaluated primarily on zero-shot Global-MMLU. Task-specific downstream evaluation is recommended before deploying it in specialized scientific workflows.

Citation

  • Title: Transferring Scientific English Pre-Trained Language Models to Multiple Languages Using Cross-Lingual Transfer
  • Authors: Nikolas Rauscher, Fabio Barth, Georg Rehm
  • Venue: LREC-COLING 2026, citation details TBA after publication
Model size: 0.2B parameters (F32, Safetensors)

Model ID: rausch/ru-t5-sci-transfer-init-spm32k