lv-mbert-large-conloan-lv-ext

Description

This model is part of a Bachelor's thesis at the University of Latvia: "Contextual approach to Latvian loanword detection: dataset creation and classification experiments".

It is a token-classification model fine-tuned on the extended (contrastive) dataset.

Classes and Labels

Dataset Type: Extended (Contrastive)

Labels:

  • O: Outside
  • LOAN: Borrowing (Materiālie aizguvumi)
  • CS: Code-switching (Koda maiņa)
  • NE: Named Entities (Nosauktās entitātes)
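
As a rough illustration of the label scheme above, the four classes can be held in a small enum and checked against a tagged sentence. This is only a sketch: the `Tag` enum, the `validate_tags` helper, and the hand-assigned tags (including the assumption that "mītings" would be labelled LOAN) are hypothetical and do not come from the model or the thesis.

```python
from enum import Enum


class Tag(Enum):
    """The four labels listed in this card (names are the label strings)."""
    O = "Outside"
    LOAN = "Borrowing"
    CS = "Code-switching"
    NE = "Named entity"


def validate_tags(tags):
    """Convert label strings to Tag members; raises KeyError on an unknown label."""
    return [Tag[t] for t in tags]


# Hand-assigned tags for illustration only (not model output).
tokens = ["Šodienas", "mītings", "bija", "ļoti", "produktīvs", "."]
tags = ["O", "LOAN", "O", "O", "O", "O"]

for token, tag in zip(tokens, validate_tags(tags)):
    print(f"{token}\t{tag.name}\t{tag.value}")
```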

Performance (k-fold average)

  • F1 Score: 0.897
  • Std Dev (σ): 0.006
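
For readers unfamiliar with how a k-fold summary of this kind is typically produced, the sketch below averages per-fold F1 scores and reports their spread. The fold scores are made-up placeholders, not the thesis results, and the choice of population standard deviation (`pstdev`) is an assumption; the card does not say which estimator was used.

```python
from statistics import mean, pstdev


def summarize_folds(f1_scores):
    """Return (mean, population standard deviation) of per-fold F1 scores."""
    if not f1_scores:
        raise ValueError("need at least one fold score")
    return mean(f1_scores), pstdev(f1_scores)


# Hypothetical per-fold scores, for illustration only.
example_folds = [0.89, 0.90, 0.91]
avg, sigma = summarize_folds(example_folds)
print(f"F1 = {avg:.3f} (σ = {sigma:.3f})")
```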

Usage

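from transformers import pipeline

# Load the fine-tuned token-classification model from the Hugging Face Hub.
nlp = pipeline("ner", model="jorenchik/lv-mbert-large-conloan-lv-ext")
# Tag a Latvian sentence; each returned item carries a predicted label and a score.
print(nlp("Šodienas mītings bija ļoti produktīvs."))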
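
The pipeline returns a list of dictionaries, one per token (or per span when an aggregation strategy is set). A minimal post-processing sketch, assuming the labels appear as the strings listed in this card, possibly with a `B-`/`I-` prefix, might look like the following. The `loan_candidates` helper and the sample predictions are hypothetical; the sample is hand-written in the pipeline's output shape and is not real model output.

```python
def loan_candidates(predictions, keep=("LOAN", "CS")):
    """Collect (word, label, score) for items whose label is in `keep`.

    Accepts items shaped like the transformers "ner" pipeline output.
    """
    found = []
    for item in predictions:
        # Aggregated output uses "entity_group"; raw output uses "entity".
        label = item.get("entity_group", item.get("entity", "O"))
        # Drop an optional BIO prefix such as "B-" or "I-".
        label = label.split("-", 1)[-1]
        if label in keep:
            found.append((item["word"], label, float(item["score"])))
    return found


# Hand-written sample in the pipeline's output shape (not real model output).
sample = [
    {"entity": "O", "score": 0.99, "word": "Šodienas", "start": 0, "end": 8},
    {"entity": "LOAN", "score": 0.95, "word": "mītings", "start": 9, "end": 16},
]
print(loan_candidates(sample))
```

In practice the `sample` list would be replaced by the return value of `nlp(...)`.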

Model size: 0.4B params (Safetensors, tensor type F32)