Denotational Type Classifier
This model classifies the denotational relation between two word senses, each represented by its definition. It is based on meta-llama/Meta-Llama-3.1-8B, fine-tuned as a causal language model that generates a relation label for lexical-semantic relation classification.
The model predicts one of the following relation types:
- generalization
- specialization
- metaphor
- metonymy
- homonymy
- antonymy
The first five labels correspond to the denotational relation types evaluated in SenseRel. The antonymy label is included because it is present in the fine-tuning data, although it is not part of the expert-annotated SenseRel denotational test set.
Model Details
- Model type: Causal language model
- Base model: meta-llama/Meta-Llama-3.1-8B
- Task: Denotational relation classification via text generation
- Language: English
Intended Use
This model is intended for research on lexical semantics, semantic change, polysemy, and sense-level semantic relations. It can be used to classify the relation between the definitions of two word senses, for example whether a newer meaning is a generalization, specialization, metaphorical extension, metonymic extension, homonym, or antonymic development of another meaning.
Example use cases include:
- studying semantic change in dictionaries or lexical resources;
- analyzing polysemous word senses;
- supporting sense-level semantic annotation;
- scaling exploratory studies of denotational change.
The model is not intended for high-stakes decision-making or for general-purpose natural language understanding outside lexical-semantic relation classification.
Input Format
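The input is a single string containing the two definitions separated by the `<|s|>` token and terminated by the `<|t|>` token. The relation is read from the model's next-token prediction, restricted to the six label tokens.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "ChangeIsKey/denotational-llama-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()  # inference only: disable dropout

definition_1 = "the young of the domestic cow"
definition_2 = "the young of various large mammals"

# Prompt format: first definition, separator token, second definition, target marker.
text = f"{definition_1} <|s|> {definition_2} <|t|>"
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Logits for the token that would follow the prompt.
next_token_logits = outputs.logits[0, -1, :]

label_tokens = [
    "<|generalization|>",
    "<|specialization|>",
    "<|metaphor|>",
    "<|metonymy|>",
    "<|homonymy|>",
    "<|antonymy|>",
]
label_token_ids = tokenizer.convert_tokens_to_ids(label_tokens)

# Compare only the label tokens and pick the highest-scoring one.
label_logits = next_token_logits[label_token_ids]
predicted_id = label_token_ids[torch.argmax(label_logits).item()]
prediction = tokenizer.convert_ids_to_tokens(predicted_id)
print(prediction)
```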
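The example above returns only the top label. If a score per label is useful, the six label logits can be normalized with a softmax restricted to those labels. The following is a minimal sketch in plain Python; the helper name `label_probabilities` and the example logits are hypothetical, and the resulting values are relative scores over the six labels rather than calibrated probabilities.

```python
import math

LABELS = [
    "generalization",
    "specialization",
    "metaphor",
    "metonymy",
    "homonymy",
    "antonymy",
]


def label_probabilities(label_logits, labels=LABELS):
    """Softmax restricted to the label logits; returns {label: probability}."""
    if len(label_logits) != len(labels):
        raise ValueError("expected one logit per label")
    # Subtract the maximum for numerical stability before exponentiating.
    peak = max(label_logits)
    exps = [math.exp(x - peak) for x in label_logits]
    total = sum(exps)
    return {label: e / total for label, e in zip(labels, exps)}


# Hypothetical logits, for illustration only (not real model output).
example_logits = [4.0, 1.0, 0.5, 0.0, -1.0, -2.0]
probs = label_probabilities(example_logits)
best = max(probs, key=probs.get)
print(best, round(probs[best], 3))
```

With the tensor from the inference example, the same helper could be called as `label_probabilities(label_logits.tolist())`, since the label order matches `label_tokens`.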
Training Data
The model was fine-tuned on a combined denotational-relation dataset referred to as WN+CN+UM, constructed from existing lexical-semantic resources:
- ChainNet for metaphor, metonymy, and homonymy relations;
- UniMet for additional metonymy examples;
- WordNet for generalization, specialization, and auto-antonymy examples.
These resources use definition or synset-pair information as proxies for relations between word senses.
Training Procedure
The base Llama model was fine-tuned using a causal language modeling objective.
Each training instance consists of a prompt containing two dictionary definitions followed by the target relation label. During fine-tuning, the model learns to generate the correct label given the input definitions.
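As an illustration of what one training instance might look like, the sketch below builds the prompt in the format shown under Input Format and appends the target label token. The function names are hypothetical, and the exact target formatting (label token appended directly after `<|t|>`, with no extra whitespace or end-of-sequence token) is an assumption, not something this card specifies.

```python
RELATION_LABELS = (
    "generalization",
    "specialization",
    "metaphor",
    "metonymy",
    "homonymy",
    "antonymy",
)


def build_prompt(definition_1, definition_2):
    """Prompt format taken from the inference example on this card."""
    return f"{definition_1} <|s|> {definition_2} <|t|>"


def build_training_text(definition_1, definition_2, label):
    """Prompt followed by the target label token (assumed layout)."""
    if label not in RELATION_LABELS:
        raise ValueError(f"unknown relation label: {label!r}")
    # Assumption: the target is a single special token such as <|metaphor|>.
    return build_prompt(definition_1, definition_2) + f"<|{label}|>"


example = build_training_text(
    "the young of the domestic cow",
    "the young of various large mammals",
    "generalization",
)
print(example)
```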
Evaluation
The model was evaluated on:
- the WN+CN+UM test set;
- the SenseRel expert-annotated denotational dataset.
| Dataset | Weighted F1 |
|---|---|
| WN+CN+UM test set | 0.737 |
| SenseRel denotational dataset | 0.658 |
On the WN+CN+UM test set, this model achieved the best reported score among the evaluated systems.
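For readers who want to score their own predictions in a comparable way, the sketch below computes a weighted F1 in plain Python. It assumes the standard definition, in which per-label F1 scores are averaged with weights proportional to each label's frequency in the gold annotations; this card does not state which implementation produced the reported numbers, and the gold and predicted labels in the example are made up.

```python
from collections import Counter


def weighted_f1(gold, predicted):
    """Support-weighted average of per-label F1 scores."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("gold and predicted must be non-empty and equally long")
    support = Counter(gold)  # number of gold instances per label
    total = 0.0
    for label, n in support.items():
        tp = sum(1 for g, p in zip(gold, predicted) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, predicted) if g != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        total += f1 * n  # weight each label's F1 by its gold support
    return total / len(gold)


# Made-up labels, for illustration only.
gold = ["metaphor", "metaphor", "metonymy", "homonymy"]
predicted = ["metaphor", "metonymy", "metonymy", "homonymy"]
score = weighted_f1(gold, predicted)
print(round(score, 3))
```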
Citation
If you use this model, please cite:
@inproceedings{cassotti-etal-2026-senserel,
title = "{S}ense{R}el: A Sense-Level Benchmark for Denotational and Connotational Meaning Relations",
author = "Cassotti, Pierluigi and
Baes, Naomi and
De Pascale, Stefano and
de S{\'a}, J{\'a}der Martins Camboim and
Periti, Francesco and
Haslam, Nick and
Geeraerts, Dirk and
Tahmasebi, Nina",
booktitle = "Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
month = jul,
year = "2026",
address = "San Diego, California, United States",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2026.acl-long.20/",
pages = "499--515"
}