mmBERT Fact-Check Classifier (LoRA Adapter)

A multilingual binary classifier that determines whether a query requires external fact-checking or can be answered without verification.

Model Description

This model classifies prompts into two categories:

FACT_CHECK_NEEDED: Information-seeking questions whose answers require external verification
NO_FACT_CHECK_NEEDED: Creative, opinion, coding, and math prompts that need no external verification

Language coverage comes from mmBERT's multilingual pretraining on 1800+ languages.

Performance

Metric	Score
Accuracy	96.2%
F1	96.2%
Precision	96.2%
Recall	96.2%
Training Time	151 seconds (MI300X GPU)
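
For reference, these scores follow the standard binary-classification definitions. The sketch below computes them from confusion-matrix counts. The counts are hypothetical and chosen only to illustrate the arithmetic; they are not this model's evaluation data. Treating FACT_CHECK_NEEDED as the positive class is also an assumption, since the card does not state how the scores were averaged.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts, for illustration only
scores = binary_metrics(tp=90, fp=10, fn=5, tn=95)
for name, value in scores.items():
    print(f"{name}: {value:.1%}")
```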

Training Details

Base Model: jhu-clsp/mmBERT-base
LoRA Rank: 32
LoRA Alpha: 64
Trainable Parameters: 6.8M / 314M (2.2%)
Epochs: 10
Batch Size: 64
Learning Rate: 2e-5
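
As a rough illustration of what these hyperparameters imply: standard LoRA scales its low-rank update by alpha / rank, and each adapted weight matrix of shape (d_out, d_in) adds rank * (d_in + d_out) trainable parameters. The sketch below works through that arithmetic. The 768 x 768 layer shape is a hypothetical example, not a statement about which mmBERT modules this adapter targets.

```python
LORA_RANK = 32
LORA_ALPHA = 64


def lora_scaling(rank: int, alpha: int) -> float:
    """Scaling factor applied to the low-rank update in standard LoRA."""
    return alpha / rank


def lora_params_per_matrix(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added by one adapter: A is (rank, d_in), B is (d_out, rank)."""
    return rank * (d_in + d_out)


scaling = lora_scaling(LORA_RANK, LORA_ALPHA)
# Hypothetical 768 x 768 projection, for illustration only
per_matrix = lora_params_per_matrix(768, 768, LORA_RANK)
# Trainable share using the figures reported above (6.8M of 314M)
fraction = 6.8e6 / 314e6

print(f"scaling = {scaling}")
print(f"params per 768x768 matrix = {per_matrix}")
print(f"trainable fraction = {fraction:.1%}")
```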

Training Data Sources

FACT_CHECK_NEEDED:

SQuAD, TriviaQA, HotpotQA, TruthfulQA, CoQA
HaluEval QA, RAG dataset questions

NO_FACT_CHECK_NEEDED:

Dolly (creative_writing, brainstorming)
WritingPrompts, Alpaca (coding, math, opinion)

Usage

import torch
from peft import PeftModel
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load the base model with a 2-class head, then attach the LoRA adapter
base_model = AutoModelForSequenceClassification.from_pretrained(
    "jhu-clsp/mmBERT-base", num_labels=2
)
model = PeftModel.from_pretrained(base_model, "llm-semantic-router/mmbert-fact-check-lora")
tokenizer = AutoTokenizer.from_pretrained("jhu-clsp/mmBERT-base")
model.eval()  # disable dropout for inference

# Classify (class index 1 = FACT_CHECK_NEEDED, index 0 = NO_FACT_CHECK_NEEDED)
queries = [
    "When was the Eiffel Tower built?",  # FACT_CHECK_NEEDED
    "Write a poem about the ocean",       # NO_FACT_CHECK_NEEDED
]

for query in queries:
    inputs = tokenizer(query, return_tensors="pt", truncation=True)
    with torch.no_grad():  # gradients are not needed at inference time
        outputs = model(**inputs)
    label = "FACT_CHECK_NEEDED" if outputs.logits.argmax(-1).item() == 1 else "NO_FACT_CHECK_NEEDED"
    print(f"{query} -> {label}")
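
If a confidence score is more useful than a hard label, the two logits can be converted to probabilities with a softmax. The following is a minimal sketch in plain Python. The helper names, the example logit values, and the 0.5 threshold are illustrative assumptions; the index-to-label mapping follows the usage snippet above (index 1 = FACT_CHECK_NEEDED).

```python
import math

# Mapping assumed from the usage snippet: index 1 = FACT_CHECK_NEEDED
ID2LABEL = {0: "NO_FACT_CHECK_NEEDED", 1: "FACT_CHECK_NEEDED"}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decide(logits, threshold=0.5):
    """Return (label, probability of FACT_CHECK_NEEDED) for one pair of logits."""
    p_fact = softmax(logits)[1]
    label = ID2LABEL[1] if p_fact >= threshold else ID2LABEL[0]
    return label, p_fact


# Hypothetical logits, for illustration only
label, p_fact = decide([-1.2, 2.3])
print(f"{label} (p={p_fact:.3f})")
```

With the real model, the input would be something like `outputs.logits[0].tolist()`. Raising the threshold trades recall on FACT_CHECK_NEEDED for fewer unnecessary retrievals.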

Use Cases

LLM Guardrails: Route factual queries to RAG systems
Hallucination Prevention: Flag queries needing external verification
Cost Optimization: Skip expensive retrieval for creative/coding tasks
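
The routing pattern behind these use cases can be sketched as follows. All names here (`route_query` and the `classify`, `retrieve_and_answer`, and `answer_directly` callables) are hypothetical placeholders, not part of the vLLM Semantic Router API, and the stub classifier is a toy check that only stands in for the model.

```python
from typing import Callable


def route_query(
    query: str,
    classify: Callable[[str], str],
    retrieve_and_answer: Callable[[str], str],
    answer_directly: Callable[[str], str],
) -> str:
    """Send fact-seeking queries through retrieval; answer everything else directly."""
    if classify(query) == "FACT_CHECK_NEEDED":
        return retrieve_and_answer(query)
    return answer_directly(query)


# Hypothetical stand-ins so the sketch runs without the model or a RAG backend
def stub_classify(query: str) -> str:
    # Toy rule for illustration only; replace with the classifier's prediction
    return "FACT_CHECK_NEEDED" if query.rstrip().endswith("?") else "NO_FACT_CHECK_NEEDED"


def stub_rag(query: str) -> str:
    return f"[RAG] {query}"


def stub_direct(query: str) -> str:
    return f"[direct] {query}"


for q in ["When was the Eiffel Tower built?", "Write a poem about the ocean"]:
    print(route_query(q, stub_classify, stub_rag, stub_direct))
```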

Part of vLLM Semantic Router

This model is part of the vLLM Semantic Router project.

License

Apache 2.0
