rubert_level1_v2

This model is a fine-tuned version of DeepPavlov/rubert-base-cased for multilabel classification of software requirements in Russian (Level 1).

It achieves the following results on the evaluation set:

  • Loss: 0.0727
  • F1 Micro: 0.9749
  • F1 Macro: 0.9750
  • F1 Weighted: 0.9750

Model description

Level 1 classifier in a cascaded requirements classification pipeline. Classifies Russian-language text fragments from meeting recordings into three categories:

| Label | Description |
|---|---|
| IsFunctional | Functional requirements — what the system must do |
| IsBusiness | Business requirements — budgets, KPIs, deadlines, regulations |
| Other (OT) | Non-requirements — organizational remarks, transition phrases, context |

IsNonFunctional is derived automatically as a logical OR over the Level 2 predictions; this model does not predict it directly.

The model is part of a cascaded pipeline: Audio → GigaAM-v3 (ASR) → rubert_level1_v2 (L1) → rubert_level2_v2 (L2) → Report

Per-class classification thresholds are stored in thresholds.json in this repository.
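A minimal sketch of how such per-class thresholds could be applied at inference time. The label order and the 0.5 values here are illustrative placeholders; the actual thresholds are the ones in thresholds.json:

```python
import math

# Illustrative label order and thresholds; the real per-class values
# are stored in thresholds.json in this repository.
LABELS = ["IsFunctional", "IsBusiness", "OT"]
THRESHOLDS = {"IsFunctional": 0.5, "IsBusiness": 0.5, "OT": 0.5}

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def apply_thresholds(logits: list[float]) -> dict[str, bool]:
    """Multilabel decision: each logit is squashed independently with a
    sigmoid and compared against its class-specific threshold, so a
    fragment can receive zero, one, or several labels."""
    return {
        label: sigmoid(z) >= THRESHOLDS[label]
        for label, z in zip(LABELS, logits)
    }

# One fragment with a strong functional-requirement signal.
decision = apply_thresholds([2.0, -3.0, -1.0])
print(decision)  # {'IsFunctional': True, 'IsBusiness': False, 'OT': False}
```

Because the decision is per-class rather than argmax, tuning each threshold on validation data trades precision against recall independently for each label.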

Intended uses & limitations

Intended for classification of Russian-language software requirements extracted from meeting audio recordings. Not suitable for general-purpose text classification or non-Russian languages.

Training and evaluation data

Custom Russian-language requirements dataset compiled from:

  • PROMISE dataset (translated to Russian)
  • PURE dataset (parsed from XML, translated to Russian)
  • Synthetically generated examples (Grok, Claude Sonnet) across 14 domain areas

Total: ~9800 labeled examples. Train/test split: 80/20, stratified, seed=42.
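The 80/20 stratified split can be sketched in plain Python. This toy version stratifies on a single label per example; the real corpus is multilabel with ~9800 examples, so these class counts are purely illustrative:

```python
import random
from collections import defaultdict

random.seed(42)  # the card's split uses seed=42

# Toy stand-in for the corpus: 90 examples, 3 balanced classes.
examples = [(f"requirement {i}", i % 3) for i in range(90)]

# Stratified 80/20 split: shuffle within each class, then slice,
# so every class keeps the same train/test proportion.
by_class = defaultdict(list)
for text, label in examples:
    by_class[label].append((text, label))

train, test = [], []
for label, items in by_class.items():
    random.shuffle(items)
    cut = int(len(items) * 0.8)
    train += items[:cut]
    test += items[cut:]

print(len(train), len(test))  # 72 18
```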

Training procedure

Training hyperparameters

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: AdamW with betas=(0.9, 0.999), epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.06
  • num_epochs: 15 (early stopping patience=3)
  • max_length: 96
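The learning-rate schedule implied by these hyperparameters (linear decay with 6% warmup, peak lr 2e-5) can be sketched as a plain function; the total step count below is illustrative, not taken from the card:

```python
# Hyperparameters from the card; TOTAL_STEPS is an assumed example value.
BASE_LR = 2e-5
WARMUP_RATIO = 0.06
TOTAL_STEPS = 1000
WARMUP_STEPS = int(TOTAL_STEPS * WARMUP_RATIO)  # 60

def learning_rate(step: int) -> float:
    """Linear warmup from 0 to BASE_LR over WARMUP_STEPS, then linear
    decay back to 0 at TOTAL_STEPS."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    return BASE_LR * max(0.0, (TOTAL_STEPS - step) / max(1, TOTAL_STEPS - WARMUP_STEPS))

print(learning_rate(0))             # 0.0
print(learning_rate(WARMUP_STEPS))  # 2e-05 (peak)
print(learning_rate(TOTAL_STEPS))   # 0.0
```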

Training results

| Training Loss | Epoch | Validation Loss | F1 Micro | F1 Macro | F1 Weighted |
|---|---|---|---|---|---|
| 0.1007 | 1 | 0.1046 | 0.9030 | 0.8907 | 0.8906 |
| 0.0462 | 2 | 0.0471 | 0.9669 | 0.9671 | 0.9671 |
| 0.0215 | 3 | 0.0467 | 0.9698 | 0.9697 | 0.9697 |
| 0.0170 | 4 | 0.0556 | 0.9689 | 0.9689 | 0.9689 |
| 0.0072 | 5 | 0.0784 | 0.9607 | 0.9604 | 0.9605 |
| 0.0055 | 6 | 0.0608 | 0.9724 | 0.9727 | 0.9724 |

Early stopping triggered after epoch 6.

Per-class results (test set)

| Class | Precision | Recall | F1 | Support |
|---|---|---|---|---|
| IsFunctional | 0.934 | 0.948 | 0.941 | 420 |
| IsBusiness | 0.993 | 0.978 | 0.985 | 416 |
| Other (OT) | 1.000 | 1.000 | 1.000 | 421 |
| micro avg | 0.975 | 0.975 | 0.975 | 1257 |

Framework versions

  • Transformers 4.57.1
  • PyTorch 2.8.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.22.2