rubert_level1_v2

This model is a fine-tuned version of DeepPavlov/rubert-base-cased for multilabel classification of software requirements in Russian (Level 1).

It achieves the following results on the evaluation set:

  • Loss: 0.0727
  • F1 Micro: 0.9749
  • F1 Macro: 0.9750
  • F1 Weighted: 0.9750

Model description

Level 1 classifier in a cascaded requirements classification pipeline. Classifies Russian-language text fragments from meeting recordings into three categories:

| Label | Description |
|---|---|
| IsFunctional | Functional requirements — what the system must do |
| IsBusiness | Business requirements — budgets, KPIs, deadlines, regulations |
| Other (OT) | Non-requirements — organizational remarks, transition phrases, context |

IsNonFunctional is derived automatically as a logical OR over the Level 2 predictions; this model does not predict it directly.

The model is part of a cascaded pipeline: Audio → GigaAM-v3 (ASR) → rubert_level1_v2 (L1) → rubert_level2_v2 (L2) → Report

Per-class classification thresholds are stored in thresholds.json in this repository.
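A minimal sketch of how such per-class thresholds could be applied at inference time. The label order and the 0.5 values here are illustrative placeholders; the actual thresholds are the ones in thresholds.json:

```python
import math

# Illustrative label order and thresholds; the real per-class values
# are stored in thresholds.json in this repository.
LABELS = ["IsFunctional", "IsBusiness", "OT"]
THRESHOLDS = {"IsFunctional": 0.5, "IsBusiness": 0.5, "OT": 0.5}

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def apply_thresholds(logits: list[float]) -> dict[str, bool]:
    """Multilabel decision: each logit is squashed independently with a
    sigmoid and compared against its class-specific threshold, so a
    fragment can receive zero, one, or several labels."""
    return {
        label: sigmoid(z) >= THRESHOLDS[label]
        for label, z in zip(LABELS, logits)
    }

# One fragment with a strong functional-requirement signal.
decision = apply_thresholds([2.0, -3.0, -1.0])
print(decision)  # {'IsFunctional': True, 'IsBusiness': False, 'OT': False}
```

Because the decision is per-class rather than argmax, tuning each threshold on validation data trades precision against recall independently for each label.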

Intended uses & limitations

Intended for classification of Russian-language software requirements extracted from meeting audio recordings. Not suitable for general-purpose text classification or non-Russian languages.

Training and evaluation data

Custom Russian-language requirements dataset compiled from:

  • PROMISE dataset (translated to Russian)
  • PURE dataset (parsed from XML, translated to Russian)
  • Synthetically generated examples (Grok, Claude Sonnet) across 14 domain areas

Total: ~9800 labeled examples. Train/test split: 80/20, stratified, seed=42.
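The 80/20 stratified split can be sketched in plain Python. This toy version stratifies on a single label per example; the real corpus is multilabel with ~9800 examples, so these class counts are purely illustrative:

```python
import random
from collections import defaultdict

random.seed(42)  # the card's split uses seed=42

# Toy stand-in for the corpus: 90 examples, 3 balanced classes.
examples = [(f"requirement {i}", i % 3) for i in range(90)]

# Stratified 80/20 split: shuffle within each class, then slice,
# so every class keeps the same train/test proportion.
by_class = defaultdict(list)
for text, label in examples:
    by_class[label].append((text, label))

train, test = [], []
for label, items in by_class.items():
    random.shuffle(items)
    cut = int(len(items) * 0.8)
    train += items[:cut]
    test += items[cut:]

print(len(train), len(test))  # 72 18
```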

Training procedure

Training hyperparameters

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: AdamW with betas=(0.9, 0.999), epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.06
  • num_epochs: 15 (early stopping patience=3)
  • max_length: 96
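The learning-rate schedule implied by these hyperparameters (linear decay with 6% warmup, peak lr 2e-5) can be sketched as a plain function; the total step count below is illustrative, not taken from the card:

```python
# Hyperparameters from the card; TOTAL_STEPS is an assumed example value.
BASE_LR = 2e-5
WARMUP_RATIO = 0.06
TOTAL_STEPS = 1000
WARMUP_STEPS = int(TOTAL_STEPS * WARMUP_RATIO)  # 60

def learning_rate(step: int) -> float:
    """Linear warmup from 0 to BASE_LR over WARMUP_STEPS, then linear
    decay back to 0 at TOTAL_STEPS."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    return BASE_LR * max(0.0, (TOTAL_STEPS - step) / max(1, TOTAL_STEPS - WARMUP_STEPS))

print(learning_rate(0))             # 0.0
print(learning_rate(WARMUP_STEPS))  # 2e-05 (peak)
print(learning_rate(TOTAL_STEPS))   # 0.0
```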

Training results

| Training Loss | Epoch | Validation Loss | F1 Micro | F1 Macro | F1 Weighted |
|---|---|---|---|---|---|
| 0.1007 | 1 | 0.1046 | 0.9030 | 0.8907 | 0.8906 |
| 0.0462 | 2 | 0.0471 | 0.9669 | 0.9671 | 0.9671 |
| 0.0215 | 3 | 0.0467 | 0.9698 | 0.9697 | 0.9697 |
| 0.0170 | 4 | 0.0556 | 0.9689 | 0.9689 | 0.9689 |
| 0.0072 | 5 | 0.0784 | 0.9607 | 0.9604 | 0.9605 |
| 0.0055 | 6 | 0.0608 | 0.9724 | 0.9727 | 0.9724 |

Early stopping triggered after epoch 6.

Per-class results (test set)

| Class | Precision | Recall | F1 | Support |
|---|---|---|---|---|
| IsFunctional | 0.934 | 0.948 | 0.941 | 420 |
| IsBusiness | 0.993 | 0.978 | 0.985 | 416 |
| Other (OT) | 1.000 | 1.000 | 1.000 | 421 |
| micro avg | 0.975 | 0.975 | 0.975 | 1257 |

Framework versions

  • Transformers 4.57.1
  • PyTorch 2.8.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2