---
title: README
emoji: 🌍
colorFrom: yellow
colorTo: gray
sdk: static
pinned: false
---

# Diffutron: A Masked Diffusion Language Model for Turkish Language

   | 🤗 Models   |    📊 Pre-training Dataset   |    📄 Paper   |

## Overview

Diffutron is a lightweight, non-autoregressive Masked Diffusion Language Model (MDLM) optimized for Turkish. It uses a discrete diffusion process to generate text through iterative refinement, which gives it bidirectional context awareness and high parameter efficiency.

## Core Features

* **Architecture:** Discrete Masked Diffusion (MDLM) using a 307M-parameter encoder backbone.
* **Efficiency:** Achieves competitive performance against 2B+ parameter autoregressive models on Turkish benchmarks.
* **Adaptation:** LoRA-based (r=256) continual pre-training on a 2M-sequence Turkish corpus.
* **Instruction Tuning:** Progressive strategy using the LlamaTurk and InstrucTurca datasets for improved instruction following.

## Benchmarks

Diffutron achieves a significant reduction in perplexity and competitive scores across the CETVEL benchmark suite. The final row is the unweighted mean of the seven task scores:

| Benchmark | Diffutron-1st-Stage (0.3B) | Diffutron-2nd-Stage (0.3B) | TURNA (1.1B) | Kumru (2B) | Kanarya (2B) | Llama-3.2 (3B) | Trendyol (7B) | Aya-101 (13B) |
| :--- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| **Belebele_TR** | 22.22 | 27.00 | 22.56 | 29.00 | 28.11 | **55.78** | 36.22 | 22.89 |
| **EXAMS_TR** | 25.95 | 27.74 | 23.66 | **30.03** | **30.03** | 26.21 | 28.50 | 22.90 |
| **IronyTR** | 50.67 | **52.00** | 48.33 | 51.00 | 50.00 | 50.17 | 50.00 | **52.17** |
| **News_Cat** | 23.20 | 32.40 | 32.80 | 26.40 | 66.80 | 64.00 | **81.20** | 20.00 |
| **MNLI_TR** | 33.29 | 32.81 | 34.94 | **36.42** | 33.40 | 34.76 | 35.19 | 27.90 |
| **STS_TR** | 17.77 | **18.78** | 14.21 | 11.75 | 12.91 | 12.91 | 15.52 | 16.97 |
| **XCOPA_TR** | 53.80 | 52.00 | 55.80 | 54.00 | **64.20** | 54.60 | 61.00 | 59.60 |
| **Average** | 32.41 | **34.68** | 33.19 | 34.09 | 40.78 | 42.63 | **43.95** | 31.78 |

## Citation

```bibtex
@misc{diffutron2026,
  title={Diffutron: A Masked Diffusion Language Model for Turkish Language},
  author={Şuayp Talha Kocabay and Talha Rüzgar Akkuş},
  year={2026},
  eprint={2603.20466},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2603.20466},
}
```
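
To make the "iterative refinement" mentioned in the Overview more concrete, here is a minimal, self-contained sketch of how a masked diffusion sampler can fill in a sequence over several steps. It is an illustration only and does not use the Diffutron weights or its actual sampler: `toy_denoiser` is a hypothetical stand-in for the model (it simply returns tokens from a known target with random confidences), and the confidence-based unmasking schedule in `iterative_unmask` is an assumed, commonly used strategy rather than the one confirmed for Diffutron.

```python
import random

MASK = "[MASK]"


def toy_denoiser(tokens, target, rng):
    """Hypothetical stand-in for a masked diffusion model.

    For every masked position it returns (predicted_token, confidence).
    A real model would predict from bidirectional context; this toy
    version just looks up a known target and draws a random confidence.
    """
    return {
        i: (target[i], rng.random())
        for i, tok in enumerate(tokens)
        if tok == MASK
    }


def iterative_unmask(length, steps, denoise):
    """Start fully masked and reveal the most confident predictions each step."""
    tokens = [MASK] * length
    for step in range(steps):
        masked = [i for i, tok in enumerate(tokens) if tok == MASK]
        if not masked:
            break  # everything has already been revealed
        preds = denoise(tokens)
        # Spread the remaining masks evenly over the remaining steps
        # (ceiling division, so the final step always clears every mask).
        remaining_steps = steps - step
        k = -(-len(masked) // remaining_steps)
        most_confident = sorted(masked, key=lambda i: preds[i][1], reverse=True)[:k]
        for i in most_confident:
            tokens[i] = preds[i][0]
    return tokens


target = ["Merhaba", "dünya", ",", "nasılsın", "?"]
rng = random.Random(0)
result = iterative_unmask(
    len(target),
    steps=3,
    denoise=lambda toks: toy_denoiser(toks, target, rng),
)
print(result)
```

Because positions are revealed in order of confidence rather than left to right, each step can condition on tokens on both sides of a remaining mask, which is the bidirectional behaviour the Overview describes. Swapping `toy_denoiser` for a real model call would require that model's own tokenizer and inference API, which are not shown here.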