KingTechnician/yahoo-answers-osmosis
Five-class response sufficiency classifier using DeBERTa-v3 as a cross-encoder. Takes an (objective, response) pair as two segments of a single input sequence, so attention operates directly between the tokens of the two texts.
| Evaluation | Accuracy | Macro F1 |
|---|---|---|
| Yahoo within-domain | 57.9% | 42.7% |
| Triage held-out | 99.2% | 99.2% |
| Class | Precision | Recall | F1 |
|---|---|---|---|
| ADDR_DIRECT | 97.4% | 100% | 98.7% |
| ADDR_PARTIAL | 100% | 98.7% | 99.3% |
| NOADDR_ON | 100% | 98.7% | 99.3% |
| NOADDR_TANGENTIAL | 100% | 98.7% | 99.3% |
| NOADDR_OFF | 98.7% | 100% | 99.3% |
Only 3 of the 374 test samples are misclassified, which matches the 99.2% Triage held-out accuracy reported above.
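As a sanity check, the reported numbers can be recomputed from one another. The sketch below takes the per-class precision and recall from the table above, derives each F1 as the harmonic mean, averages them into a macro F1, and computes accuracy from the 3-in-374 error count. The helper name `f1_score` is illustrative only; inputs are the rounded table values, so agreement is expected only to the reported one decimal place.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (precision, recall) copied from the per-class table, as fractions.
per_class = {
    "ADDR_DIRECT": (0.974, 1.000),
    "ADDR_PARTIAL": (1.000, 0.987),
    "NOADDR_ON": (1.000, 0.987),
    "NOADDR_TANGENTIAL": (1.000, 0.987),
    "NOADDR_OFF": (0.987, 1.000),
}

f1_by_class = {label: f1_score(p, r) for label, (p, r) in per_class.items()}

# Macro F1 is the unweighted mean of the per-class F1 scores.
macro_f1 = sum(f1_by_class.values()) / len(f1_by_class)

# Accuracy from the stated error count: 3 wrong out of 374.
accuracy = (374 - 3) / 374

for label, f1 in f1_by_class.items():
    print(f"{label}: F1 = {f1:.1%}")
print(f"Macro F1 = {macro_f1:.1%}, accuracy = {accuracy:.1%}")
```

Both aggregate values round to 99.2%, consistent with the Triage held-out row of the evaluation table.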
| Model | Yahoo Acc | Triage Acc |
|---|---|---|
| RepProbe linear (L16) | 39.4% | 23.7% |
| SetFit MiniLM joint | 44.2% | 86.9% |
| SetFit ModernBERT joint | 51.9% | 94.7% |
| This model (cross-encoder) | 57.9% | 99.2% |
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model = AutoModelForSequenceClassification.from_pretrained("KingTechnician/osmosis-crossencoder-joint")
tokenizer = AutoTokenizer.from_pretrained("KingTechnician/osmosis-crossencoder-joint")

# Class names, in the order of the model's output logits.
labels = ["ADDR_DIRECT", "ADDR_PARTIAL", "NOADDR_ON", "NOADDR_TANGENTIAL", "NOADDR_OFF"]

objective = "What causes rain?"
response = "Rain forms when water vapor in the atmosphere condenses into droplets."

# Passing two texts encodes them as a single paired sequence.
inputs = tokenizer(objective, response, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    logits = model(**inputs).logits

prediction = logits.argmax(dim=-1).item()
print(f"Prediction: {labels[prediction]}")
# Output: Prediction: ADDR_DIRECT
```
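The usage example returns only the top label. If a confidence score or a ranking over all five classes is useful, the logits can be passed through a softmax. The following is a minimal, dependency-free sketch: `example_logits` is a made-up vector standing in for `logits[0].tolist()`, and `rank_labels` is a hypothetical helper, not part of the model's API. With PyTorch available, `torch.softmax(logits, dim=-1)` computes the same probabilities.

```python
import math

LABELS = ["ADDR_DIRECT", "ADDR_PARTIAL", "NOADDR_ON", "NOADDR_TANGENTIAL", "NOADDR_OFF"]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def rank_labels(logits):
    """Return (label, probability) pairs sorted from most to least likely."""
    probs = softmax(logits)
    return sorted(zip(LABELS, probs), key=lambda pair: pair[1], reverse=True)

# Hypothetical logits for illustration; real values come from the model.
example_logits = [4.1, 1.3, -0.5, -1.2, -2.0]

ranked = rank_labels(example_logits)
for label, prob in ranked:
    print(f"{label}: {prob:.3f}")
```

Softmax probabilities from a fine-tuned classifier are not necessarily well calibrated, so treating them as exact confidences is an assumption that should be checked on held-out data.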
This is a cross-encoder, not a bi-encoder. The model processes [CLS] objective [SEP] response [SEP]
as a single input, allowing full self-attention between objective and response tokens. This is
critical for response sufficiency classification because the judgment depends on token-level
alignment between what was asked and what was answered.
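To make the single-input layout concrete, here is a toy sketch of how a pair is packed into one sequence. It splits on whitespace purely for illustration; the real tokenizer works on subword pieces and applies its own special tokens, so `pack_pair` and its segment markers are hypothetical stand-ins rather than the model's actual preprocessing.

```python
def pack_pair(objective, response):
    """Pack two texts into one sequence: [CLS] objective [SEP] response [SEP].

    Returns the token list and a parallel list marking which text each
    position belongs to (0 = objective side, 1 = response side).
    """
    obj_tokens = objective.split()
    resp_tokens = response.split()
    tokens = ["[CLS]"] + obj_tokens + ["[SEP]"] + resp_tokens + ["[SEP]"]
    # [CLS] and the first [SEP] are grouped with the objective here.
    segments = [0] * (len(obj_tokens) + 2) + [1] * (len(resp_tokens) + 1)
    return tokens, segments

tokens, segments = pack_pair(
    "What causes rain?",
    "Rain forms when water vapor in the atmosphere condenses into droplets.",
)

# Every position can attend to every other position in the packed sequence,
# including pairs that straddle the objective/response boundary.
cross_pairs = segments.count(0) * segments.count(1)
print(tokens)
print(f"{len(tokens)} positions, {cross_pairs} objective-to-response pairs")
```

A bi-encoder, by contrast, would encode the two texts separately and compare only their final vectors, so none of these cross-boundary token pairs would interact inside the encoder.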
| Label | Description |
|---|---|
| ADDR_DIRECT | Response directly and completely addresses the objective |
| ADDR_PARTIAL | Response partially addresses the objective |
| NOADDR_ON | Response is on-topic but does not address the objective |
| NOADDR_TANGENTIAL | Response is tangentially related to the objective |
| NOADDR_OFF | Response is completely off-topic |
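Some downstream uses may only need a yes/no signal for whether the response addressed the objective at all. One possible way to collapse the five classes is by label prefix, as sketched below. This grouping is an assumption read off the label names and descriptions in the table, not a documented feature of the model, and `is_addressed` is a hypothetical helper.

```python
LABELS = ["ADDR_DIRECT", "ADDR_PARTIAL", "NOADDR_ON", "NOADDR_TANGENTIAL", "NOADDR_OFF"]

def is_addressed(label):
    """Return True for ADDR_* labels and False for NOADDR_* labels.

    Assumes the prefix convention implied by the label table; raises
    ValueError for anything outside the five known labels.
    """
    if label not in LABELS:
        raise ValueError(f"Unknown label: {label!r}")
    return label.startswith("ADDR_")

for label in LABELS:
    print(f"{label}: addressed={is_addressed(label)}")
```

Whether ADDR_PARTIAL should count as addressed depends on the application; a stricter pipeline might accept only ADDR_DIRECT.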
Base model: microsoft/deberta-v3-base