We identified CADRADS scores in 1,429 radiology reports using regular expressions that match either a direct CADRADS mention or the reported degree of stenosis. For training, the direct CADRADS patterns were masked in the report text.
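The labelling and masking step described above could look roughly like the sketch below. The regular expression, the placeholder token, and the sample sentence are all hypothetical; the patterns actually used for this model are not given in the card, and the stenosis-based patterns are left out entirely.

```python
import re
from typing import Optional

# Hypothetical pattern for direct mentions such as "CAD-RADS 3" or "CADRADS: 2".
# The patterns actually used to build the dataset are not published here.
CADRADS_PATTERN = re.compile(
    r"CAD[\s-]?RADS\s*[:=]?\s*([0-5])", flags=re.IGNORECASE
)


def extract_label(report: str) -> Optional[int]:
    """Return the first directly stated score, or None if there is none."""
    match = CADRADS_PATTERN.search(report)
    return int(match.group(1)) if match else None


def mask_direct_mentions(report: str, placeholder: str = "[CADRADS]") -> str:
    """Replace every direct score mention so it cannot leak the label."""
    return CADRADS_PATTERN.sub(placeholder, report)


# Made-up example sentence, not taken from the dataset.
sample = "Conclusie: CAD-RADS 3, matige stenose van de LAD."
label = extract_label(sample)
masked = mask_direct_mentions(sample)
```

Masking the direct mention means the classifier has to rely on the remaining text, such as the stenosis description, rather than reading the score off the report.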

Model card:

| Field | Value |
|---|---|
| Task | Multi-class sequence/text classification |
| Number of samples | 1,429 |
| Number of classes | 6 |
| Labels | 0, 1, 2, 3, 4, 5 |
| Base model | UMCU/CardioBERTa.nl_clinical |
| Architecture | RobertaForSequenceClassification |
| Classification head | Newly initialized before fine-tuning |
| Validation strategy | Stratified 10-fold cross-validation |
| Trained fold | Fold 0 only |
| Fold 0 train size | 1,286 |
| Fold 0 validation size | 143 |
| Class weighting | Yes |
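
The fold 0 sizes in the model card follow from splitting 1,429 samples into 10 folds that are as even as possible. The sketch below assumes the sizing rule documented for scikit-learn's `KFold` (the first `n_samples % n_splits` folds get one extra sample); stratified splitting can shift individual fold sizes slightly, so this is a consistency check rather than a description of the actual split. The helper name `fold_sizes` is hypothetical.

```python
def fold_sizes(n_samples: int, n_splits: int) -> list:
    """Validation-fold sizes when folds are made as even as possible."""
    base, extra = divmod(n_samples, n_splits)
    # The first `extra` folds receive one additional sample.
    return [base + 1 if i < extra else base for i in range(n_splits)]


sizes = fold_sizes(1429, 10)
fold0_val = sizes[0]            # validation size of fold 0
fold0_train = 1429 - fold0_val  # everything else is used for training
```

Under this assumption nine folds hold 143 samples and one holds 142, which agrees with the fold 0 train and validation sizes reported above.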

Label distribution:

| Label | Count | Percentage |
|---|---|---|
| 0 | 345 | 24.1% |
| 1 | 281 | 19.7% |
| 2 | 333 | 23.3% |
| 3 | 317 | 22.2% |
| 4 | 135 | 9.4% |
| 5 | 18 | 1.3% |

Class weights:

| Label | Weight |
|---|---|
| 0 | 0.6914 |
| 1 | 0.8505 |
| 2 | 0.7168 |
| 3 | 0.7494 |
| 4 | 1.7568 |
| 5 | 12.6078 |
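
The weights listed above match the common "balanced" heuristic, `weight_c = n_samples / (n_classes * count_c)`, if it is applied to the fold 0 training split of 1,286 reports. The per-class training counts in the sketch below are not stated in the card; they are inferred as the integers that reproduce the listed weights, so treat them as an assumption.

```python
N_CLASSES = 6

# ASSUMPTION: per-class counts in the fold 0 training split, inferred from
# the listed weights. They are not reported in the model card.
ASSUMED_TRAIN_COUNTS = {0: 310, 1: 252, 2: 299, 3: 286, 4: 122, 5: 17}


def balanced_class_weights(counts: dict, n_classes: int) -> dict:
    """Inverse-frequency weights: total / (n_classes * count) per class."""
    total = sum(counts.values())
    return {label: total / (n_classes * count) for label, count in counts.items()}


weights = balanced_class_weights(ASSUMED_TRAIN_COUNTS, N_CLASSES)
rounded = {label: round(w, 4) for label, w in weights.items()}
```

With these assumed counts the rarest class (label 5) is weighted roughly 18 times more heavily than the most frequent one (label 0), which compensates for its small share of the data.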

10-fold cross-validation results (mean and standard deviation across folds):

```json
"average_results": {
    "avg_eval_accuracy": 0.863542795232936,
    "std_eval_accuracy": 0.03399941533079611,
    "avg_eval_f1": 0.8609888740211058,
    "std_eval_f1": 0.034077598782922706,
    "avg_eval_precision": 0.8642976260040041,
    "std_eval_precision": 0.03264535733017625,
    "avg_eval_recall": 0.863542795232936,
    "std_eval_recall": 0.03399941533079611
}
```
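
In the results above, recall and accuracy agree to every digit. That is what support-weighted averaging of per-class recall produces, since weighted recall is mathematically identical to accuracy. The averaging mode is not stated in the card, so this is an inference. The sketch below shows the identity on made-up labels that are purely illustrative and unrelated to the model's predictions.

```python
import math
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose prediction equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_recall(y_true, y_pred):
    """Per-class recall averaged with weights equal to class support."""
    support = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    n = len(y_true)
    return sum((support[c] / n) * (correct[c] / support[c]) for c in support)


# Toy labels for illustration only.
y_true = [0, 0, 0, 1, 1, 2, 3, 4, 5, 5]
y_pred = [0, 0, 1, 1, 2, 2, 3, 3, 5, 4]

acc = accuracy(y_true, y_pred)
rec = weighted_recall(y_true, y_pred)
```

Because the support weights cancel the per-class denominators, the two numbers coincide for any set of predictions, so the recall row adds no information beyond accuracy under this averaging.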

Model size: 0.4B params (Safetensors, tensor type BF16)

Model repository: UMCU/DutchCADRADS_fromRadioReports_Masked