DecomposeRL Tiny-Judge: Atomicity (single focus) Judge
A ModernBERT-large classifier that scores whether a generated sub-question is single-focus. This is one of the five binary checks that make up the atomicity sub-signal of DecomposeRL's joint multiplicative quality reward.
It is part of the DecomposeRL tiny-judge stack: eight task-specific LoRA classifier heads on a shared ModernBERT-large backbone that distill a Qwen3-32B LLM judge into small, fast reward models. Swapping the 32B judge for this ~400M-parameter stack cuts GRPO judge compute by ~80% (240 → 48 GPU-hours) while retaining ~99% of in-domain accuracy.
Model Overview
| Property | Value |
|---|---|
| Model Type | ModernBertForSequenceClassification (sequence classification) |
| Base Model | answerdotai/ModernBERT-large (~400M params) |
| Training | LoRA (r=64, α=128), merged into the base before release |
| Labels | 2-way: no / yes |
| Distilled from | Qwen/Qwen3-32B judge labels |
| Dataset / config | dipta007/decomposeRL-tiny-judge · atomicity_single_focus |
| Train split | train_balanced (class-balanced); selected on macro-F1 |
| Language | English |
What it judges
This head is one of five binary atomicity checks (is_question, single_focus, no_conjunctions, verifiable, grounded). At reward time the five yes/no predictions are averaged into the per-question atomicity score R_atom, which is then multiplied with the answerability (R_ans) and answer-correctness (R_corr) sub-signals to form the joint multiplicative quality reward (Eq. 7 in the paper).
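The aggregation described above can be sketched in a few lines. This is an illustration only: the function names and the example values for R_ans and R_corr are hypothetical, and Eq. 7 in the paper may include details (such as how per-question scores are pooled) that this sketch does not capture.

```python
# The five binary atomicity checks named in this card.
ATOMICITY_CHECKS = ("is_question", "single_focus", "no_conjunctions", "verifiable", "grounded")

def atomicity_score(checks: dict[str, int]) -> float:
    """Average the five 0/1 predictions into the per-question score R_atom."""
    return sum(checks[name] for name in ATOMICITY_CHECKS) / len(ATOMICITY_CHECKS)

def quality_reward(r_atom: float, r_ans: float, r_corr: float) -> float:
    """Multiply the three sub-signals; any factor at zero zeroes the reward."""
    return r_atom * r_ans * r_corr

# Hypothetical predictions for one sub-question: four of five checks pass.
checks = {"is_question": 1, "single_focus": 1, "no_conjunctions": 0, "verifiable": 1, "grounded": 1}
r_atom = atomicity_score(checks)
# r_ans and r_corr are made-up values for illustration.
reward = quality_reward(r_atom, r_ans=1.0, r_corr=0.5)
```

Because the reward is a product rather than a sum, a sub-question that fails on any one sub-signal cannot make up for it by scoring well on the others.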
Input format
Claim + candidate sub-question:
Claim: {claim}
Question: {question}
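A small helper keeps the template consistent across calls. The name `format_input` is hypothetical, and the sketch assumes the two fields are joined by a single newline with no trailing whitespace.

```python
def format_input(claim: str, question: str) -> str:
    """Build the two-line 'Claim: ... / Question: ...' string the classifier expects."""
    return f"Claim: {claim}\nQuestion: {question}"

text = format_input(
    "the middle ages period came before the period containing the enlightenment.",
    "When did the Enlightenment period start according to common historical knowledge?",
)
```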
Label space
| Label | Name | Meaning |
|---|---|---|
| 0 | no | the question bundles multiple sub-claims or asks about several things at once |
| 1 | yes | the question targets exactly one sub-claim / one focus |
Quickstart
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo = "dipta007/atomicity-single-focus-judge-balanced"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForSequenceClassification.from_pretrained(repo).eval()

# The claim and the question are separated by a real newline character.
text = (
    'Claim: the middle ages period came before the period containing the enlightenment.\n'
    'Question: When did the Enlightenment period start according to common historical knowledge?'
)
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=8192)
with torch.no_grad():
    logits = model(**inputs).logits
pred = int(logits.argmax(-1))
print(pred, model.config.id2label[pred])
# expected: 1 -> yes
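If a graded score is more useful than a hard label, the two logits can be turned into a probability for the yes class with a softmax. The sketch below uses plain Python and made-up logits, so it runs without the model. The name `yes_probability` is hypothetical, and the code assumes the label order from the table above (index 0 = no, index 1 = yes).

```python
import math

def yes_probability(logits: list[float]) -> float:
    """Softmax over [no, yes] logits, returning the probability of label 1 (yes)."""
    m = max(logits)  # subtract the max before exponentiating, for numerical stability
    exps = [math.exp(x - m) for x in logits]
    return exps[1] / sum(exps)

# Hypothetical logits, not real model output.
p_yes = yes_probability([-1.2, 2.3])
# With two classes, thresholding at 0.5 gives the same decision as argmax.
is_single_focus = p_yes >= 0.5
```

With the tensors from the Quickstart, `logits[0].tolist()` would supply the input list.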
Training Data
Trained on the atomicity_single_focus config of dipta007/decomposeRL-tiny-judge, whose labels are distilled from Qwen3-32B judge calls made during DecomposeRL reward computation. The model is fine-tuned with LoRA on the class-balanced train_balanced split, validated on the natural validation split, and the best checkpoint is chosen by macro-F1. LoRA adapters are merged into the backbone before release, so the model loads with a plain from_pretrained (no PEFT required).
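For reference, macro-F1 is the unweighted mean of the per-class F1 scores, so the no and yes classes count equally however often each appears in the validation split. The sketch below is a plain-Python illustration, not the training code. The labels are made up, and scoring a class with no true or predicted examples as 0 is an assumed convention.

```python
def macro_f1(y_true: list[int], y_pred: list[int], labels=(0, 1)) -> float:
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # Assumed convention: a class with no support and no predictions scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Hypothetical validation labels and predictions (0 = no, 1 = yes).
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
score = macro_f1(y_true, y_pred)  # F1(no) = 0.5, F1(yes) = 0.75
```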
Role in DecomposeRL
DecomposeRL trains a claim-verification policy with GRPO over a seven-reward ensemble. Five of those rewards are scored by an LLM judge, which dominates training-time GPU cost. The tiny-judge stack replaces that 32B judge with eight small distilled heads so reward scoring runs on the same single GPU as training. See the paper (tiny-judge ablation) and the DecomposeRL-7B model for the full reward design.
Intended Use
- In-scope: serving as a fast reward / scoring model inside the DecomposeRL training loop, or as a standalone classifier for the specific judgment above on claim-decomposition traces.
- Out-of-scope: general-purpose fact-checking, use on inputs that do not follow the input format above, or as a standalone end-to-end claim verifier (use DecomposeRL-7B for that).
Citation
@article{dipta2025decomposerl,
title={DecomposeRL: Learning to Ask Useful, Informative, and Diverse Questions for Semi-Supervised, Traceable Claim Verification},
author={Shubhashis Roy Dipta and Ankur Padia and Francis Ferraro},
year={2025},
eprint={2605.27858},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2605.27858v1},
}
License
Released under the Apache 2.0 License.