scientific-multilingual-transfer
Part of the scientific-multilingual-transfer collection, which contains the models from the paper [TBA Link].
Japanese scientific T5 model initialized from EN-T5-Sci using WECHSEL and a language-specific SentencePiece 32k tokenizer.
This is one of the paper's non-English scientific T5 transfer models. It keeps the EN-T5-Sci Transformer weights and reinitializes the embedding matrix with WECHSEL for the target-language SentencePiece tokenizer.
Paper model name: JA-Trans-Init.
WECHSEL resources: English fastText embeddings and Japanese fastText embeddings (ja), together with the English-Japanese bilingual dictionary.
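As a rough illustration of how these resources fit together, the sketch below mimics the WECHSEL recipe on toy data: align the target-language word vectors to the source space via orthogonal Procrustes over bilingual-dictionary pairs, then initialize each target embedding as a similarity-weighted average of source-model embeddings. This is a simplified stand-in (random vectors, two dictionary pairs, whole words instead of subword decomposition), not the paper's implementation; all names here (`en_vecs`, `init_embedding`, etc.) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # toy embedding dimension

# Toy stand-ins for the English / Japanese fastText word vectors.
en_vecs = {"science": rng.normal(size=d), "model": rng.normal(size=d)}
ja_vecs = {"科学": rng.normal(size=d), "モデル": rng.normal(size=d)}

# Step 1: align the Japanese space onto the English space with orthogonal
# Procrustes, fit on bilingual-dictionary translation pairs.
pairs = [("science", "科学"), ("model", "モデル")]
Y = np.stack([en_vecs[e] for e, _ in pairs])
X = np.stack([ja_vecs[j] for _, j in pairs])
U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt  # orthogonal rotation: ja space -> en space
ja_aligned = {w: v @ W for w, v in ja_vecs.items()}

# Step 2: pretrained source-model embeddings (rows of the EN-T5-Sci
# embedding matrix, here random toys).
en_model_emb = {"science": rng.normal(size=d), "model": rng.normal(size=d)}

# Step 3: initialize each target token's embedding as a softmax-weighted
# average of source-model embeddings, weighted by cosine similarity in the
# shared (aligned) fastText space.
def init_embedding(ja_token, temperature=0.1):
    v = ja_aligned[ja_token]
    tokens = list(en_model_emb)
    sims = np.array([
        en_vecs[t] @ v / (np.linalg.norm(en_vecs[t]) * np.linalg.norm(v))
        for t in tokens
    ])
    w = np.exp(sims / temperature)
    w /= w.sum()
    return sum(wi * en_model_emb[t] for wi, t in zip(w, tokens))

emb = init_embedding("科学")
```

In the actual method, similarities are computed between subword units (decomposed into fastText word/character n-gram vectors) rather than whole words, but the alignment-then-weighted-average structure is the same.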
Zero-shot Global-MMLU accuracy, as aggregated in the paper:
| Metric | Accuracy (%) |
|---|---|
| Average | 25.51 |
| STEM | 26.26 |
| Humanities | 27.12 |
| Social Sciences | 23.79 |
| Other | 24.01 |
The model is evaluated primarily with zero-shot Global-MMLU. Downstream task-specific evaluation is recommended before deployment in specialized scientific workflows.
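The paper's evaluation harness is not reproduced here, but zero-shot multiple-choice accuracy of the kind reported above is typically computed by scoring each answer choice with the model's conditional log-likelihood and picking the argmax. The sketch below shows that scaffolding with a toy scorer in place of the model; `pick_answer`, `accuracy`, and `toy_ll` are illustrative names, not from the paper.

```python
# Generic zero-shot multiple-choice evaluation loop. `loglik(question, choice)`
# should return log P(choice | question); with a seq2seq model this would be
# the negative loss when the choice is passed as the labels sequence.
def pick_answer(loglik, question, choices):
    scores = [loglik(question, c) for c in choices]
    return max(range(len(choices)), key=scores.__getitem__)

def accuracy(examples, loglik):
    # examples: iterable of (question, choices, gold_index) triples.
    correct = sum(pick_answer(loglik, q, ch) == gold for q, ch, gold in examples)
    return correct / len(examples)

# Toy scorer standing in for the model: prefers the choice "4".
toy_ll = lambda q, c: 0.0 if c == "4" else -1.0
examples = [("What is 2+2? (A) 3 (B) 4 (C) 22", ["3", "4", "22"], 1)]
acc = accuracy(examples, toy_ll)
print(acc)  # 1.0
```

Swapping `toy_ll` for a function that runs the loaded T5 model on each (question, choice) pair yields the standard likelihood-based MMLU protocol.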
Base model
rausch/en-t5-sci-continued-pretraining-487k
```python
# Load the tokenizer and model directly from the Hugging Face Hub.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("rausch/ja-t5-sci-transfer-init-spm32k")
model = AutoModelForSeq2SeqLM.from_pretrained("rausch/ja-t5-sci-transfer-init-spm32k")
```