Resumator v2

Fine-tuned sentence-transformer for resume-to-job matching. Encodes resumes and job descriptions into embeddings for cosine similarity search.

Trained with Matryoshka Representation Learning — supports truncation to 384/256/128 dims at inference with minimal quality loss.

| Property | Value |
|---|---|
| Base Model | all-mpnet-base-v2 (109M params) |
| Native Dimensions | 768 |
| Recommended Dimensions | 384 (Matryoshka truncation) |
| Max Sequence Length | 512 tokens |
| Pooling | Mean pooling + L2 normalization |
| Parameters | ~109.5M |
| Size | ~418 MB |
| Training | MatryoshkaLoss + MultipleNegativesRankingLoss on 15,600 LLM-scored resume-job pairs |

What's New in v2

| Improvement | v1 | v2 |
|---|---|---|
| Base model | all-MiniLM-L6-v2 (22.7M) | all-mpnet-base-v2 (109.5M) |
| Training loss | CosineSimilarityLoss | MatryoshkaLoss + MNRL + CosineSimilarityLoss |
| Training data | 624 pairs | 15,600 LLM-scored pairs |
| Spearman (held-out) | 0.436 | 0.796 |
| Ranking accuracy | 79.9% | 98.0% |

Usage

from sentence_transformers import SentenceTransformer

# Load with Matryoshka truncation to 384 dims
model = SentenceTransformer("shankerram3/resumator", truncate_dim=384)

candidate = "Name: Jane Doe\nSkills: Python, React, PostgreSQL\nResume: Full-stack engineer with 4 years..."
job = "Title: Senior Software Engineer\nCompany: TechCorp\nDescription: Looking for full-stack engineer..."

embeddings = model.encode([candidate, job], normalize_embeddings=True)
similarity = float(embeddings[0] @ embeddings[1])  # cosine similarity
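In a retrieval setting you typically compare one job embedding against a whole matrix of candidate embeddings. A minimal sketch with NumPy — the synthetic unit vectors below stand in for real `model.encode(..., normalize_embeddings=True)` output, and `rank_candidates` is an illustrative helper, not part of the model package:

```python
import numpy as np

def rank_candidates(job_emb: np.ndarray, candidate_embs: np.ndarray, top_k: int = 3):
    """Rank candidates by cosine similarity to a job embedding.

    Assumes all embeddings are already L2-normalized (as with
    normalize_embeddings=True), so a dot product equals cosine similarity.
    """
    scores = candidate_embs @ job_emb          # one dot product per candidate
    order = np.argsort(-scores)[:top_k]        # best matches first
    return [(int(i), float(scores[i])) for i in order]

# Synthetic stand-ins for model.encode output: 1 job + 4 candidates, 384 dims
rng = np.random.default_rng(0)
embs = rng.normal(size=(5, 384))
embs /= np.linalg.norm(embs, axis=1, keepdims=True)  # L2-normalize rows
job_emb, candidate_embs = embs[0], embs[1:]

for idx, score in rank_candidates(job_emb, candidate_embs):
    print(f"candidate {idx}: {score:.3f}")
```

Because everything is normalized, this scales to large candidate pools as a single matrix-vector product, or plugs directly into an ANN index such as FAISS.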

Input Format

Candidate:

Name: {name}
Titles: {title1}, {title2}
Skills: {skill1}, {skill2}, {skill3}
Experience: {years} years
Location: {city}, {state}
Resume: {full_resume_text}

Job:

Title: {job_title}
Company: {company_name}
Location: {city}, {state}
Experience: {years} years
Description: {full_job_description}
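A small helper can assemble these strings from structured records. The function names and dict keys here are illustrative (nothing like this ships with the model); only the output format follows the templates above:

```python
def format_candidate(c: dict) -> str:
    """Build the candidate input string in the format the model expects."""
    return (
        f"Name: {c['name']}\n"
        f"Titles: {', '.join(c['titles'])}\n"
        f"Skills: {', '.join(c['skills'])}\n"
        f"Experience: {c['years']} years\n"
        f"Location: {c['city']}, {c['state']}\n"
        f"Resume: {c['resume']}"
    )

def format_job(j: dict) -> str:
    """Build the job input string in the format the model expects."""
    return (
        f"Title: {j['title']}\n"
        f"Company: {j['company']}\n"
        f"Location: {j['city']}, {j['state']}\n"
        f"Experience: {j['years']} years\n"
        f"Description: {j['description']}"
    )

text = format_candidate({
    "name": "Jane Doe", "titles": ["Software Engineer"],
    "skills": ["Python", "React"], "years": 4,
    "city": "Austin", "state": "TX", "resume": "Full-stack engineer...",
})
print(text.splitlines()[0])  # Name: Jane Doe
```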

Benchmarks

Held-Out Evaluation (Unseen Data)

Evaluated on 500 resume-job pairs using jobs the model never saw during training, scored by Cerebras gpt-oss-120b as ground truth. Run on AMD Instinct MI300X GPU.

| Model | Dims | Params | Spearman | Ranking Acc | Separation | IQR |
|---|---|---|---|---|---|---|
| resumator-v2 (384) | 384 | 109.5M | 0.796 | 98.0% | 0.217 | 0.185 |
| all-mpnet-base-v2 | 768 | 109.5M | 0.511 | 78.7% | 0.124 | 0.160 |
| wynisco-matcher-v1 | 384 | 22.7M | 0.436 | 79.9% | 0.103 | 0.123 |
| all-MiniLM-L6-v2 | 384 | 22.7M | 0.387 | 76.7% | 0.110 | 0.140 |
| bge-base-en-v1.5 | 768 | 109.5M | 0.238 | 61.6% | 0.038 | 0.088 |
| e5-base-v2 | 768 | 109.5M | 0.188 | 55.0% | 0.011 | 0.058 |
  • Spearman: Rank correlation with LLM recruiter scores (higher = better ranking)
  • Ranking Acc: % of (good, bad) pairs where the good match scores higher
  • Separation: Mean cosine sim gap between good matches (≥0.6) and bad matches (≤0.3)
  • IQR: Interquartile range — higher means better discrimination between matches
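The ranking-accuracy and separation metrics are straightforward to reproduce. A dependency-light sketch over toy data (function names are mine; the good/bad cutoffs of 0.6 and 0.3 come from the definitions above):

```python
import numpy as np

def ranking_accuracy(model_scores, llm_scores, good=0.6, bad=0.3):
    """Fraction of (good, bad) pairs where the good match gets the higher model score."""
    good_idx = [i for i, s in enumerate(llm_scores) if s >= good]
    bad_idx = [i for i, s in enumerate(llm_scores) if s <= bad]
    pairs = [(g, b) for g in good_idx for b in bad_idx]
    wins = sum(model_scores[g] > model_scores[b] for g, b in pairs)
    return wins / len(pairs)

def separation(model_scores, llm_scores, good=0.6, bad=0.3):
    """Mean model-score gap between LLM-labeled good and bad matches."""
    goods = [model_scores[i] for i, s in enumerate(llm_scores) if s >= good]
    bads = [model_scores[i] for i, s in enumerate(llm_scores) if s <= bad]
    return float(np.mean(goods) - np.mean(bads))

llm = [0.9, 0.8, 0.7, 0.2, 0.1]          # ground-truth LLM scores
model = [0.65, 0.55, 0.40, 0.45, 0.20]   # model cosine similarities
print(ranking_accuracy(model, llm))       # 5 of 6 pairs correctly ordered
print(separation(model, llm))
```

Spearman is omitted to keep the sketch free of SciPy; `scipy.stats.spearmanr(model, llm)` gives the remaining column.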

MTEB Benchmarks

Standard sentence-transformer evaluation on public benchmarks (not domain-specific):

| Task | v1 | v2 (384) | Delta |
|---|---|---|---|
| STS12 | 0.724 | 0.716 | -0.008 |
| STS13 | 0.808 | 0.833 | +0.025 |
| STS14 | 0.755 | 0.776 | +0.021 |
| STS15 | 0.851 | 0.852 | +0.001 |
| STS16 | 0.785 | 0.793 | +0.007 |
| STSBenchmark | 0.817 | 0.828 | +0.012 |
| SICK-R | 0.775 | 0.804 | +0.028 |
| BIOSSES | 0.798 | 0.812 | +0.014 |
| SprintDuplicateQuestions | 0.941 | 0.907 | -0.034 |
| TwitterSemEval2015 | 0.675 | 0.730 | +0.056 |
| AskUbuntuDupQuestions | 0.633 | 0.654 | +0.021 |
| SciDocsRR | 0.872 | 0.873 | +0.001 |
| StackOverflowDupQuestions | 0.507 | 0.513 | +0.007 |
| Average | 0.765 | 0.777 | +0.012 |

v2 improves on 11 of 13 MTEB tasks despite being fine-tuned for a specific domain.

Speed

MI300X GPU:

| Model | Single | Batch of 50 |
|---|---|---|
| resumator-v2 (384) | 6.15 ms | 13.3 ms (0.27 ms/item) |
| all-mpnet-base-v2 | 6.09 ms | 13.0 ms (0.26 ms/item) |
| wynisco-matcher-v1 | 3.48 ms | 7.6 ms (0.15 ms/item) |

Apple M-series CPU:

| Model | Single | Batch of 50 |
|---|---|---|
| resumator-v2 (384) | 9.6 ms | 72 ms (1.4 ms/item) |
| all-mpnet-base-v2 | 8.2 ms | 61 ms (1.2 ms/item) |
| wynisco-matcher-v1 | 5.7 ms | 15 ms (0.3 ms/item) |

Score Distribution (2,500 resume-job pairs)

| Metric | v1 | v2 (384) |
|---|---|---|
| Mean | 0.546 | 0.376 |
| Std Dev | 0.098 | 0.124 |
| IQR | 0.133 | 0.186 |
| Min / Max | 0.211 / 0.859 | 0.025 / 0.714 |

v2 distribution:
0.0-0.1:                                             13 (  0.5%)
0.1-0.2: ########                                   149 (  6.0%)
0.2-0.3: ################################           590 ( 23.6%)
0.3-0.4: ########################################   722 ( 28.9%)
0.4-0.5: ##############################             543 ( 21.7%)
0.5-0.6: ######################                     410 ( 16.4%)
0.6-0.7: ###                                         72 (  2.9%)
0.7-0.8:                                              1 (  0.0%)

v2 spreads scores more widely (IQR 0.186 vs 0.133), making it easier to set thresholds and distinguish good matches from mediocre ones.
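With the wider spread, a fixed-threshold triage policy becomes practical. The cutoffs below are hypothetical, read off the distribution above (they are not shipped with or recommended by the model; tune them on your own data):

```python
def triage(score: float) -> str:
    """Map a v2 cosine similarity to a review bucket.

    Cutoffs are illustrative: in the distribution above, scores >= 0.55
    sit in the top few percent of random resume-job pairs.
    """
    if score >= 0.55:
        return "strong"    # surface to a recruiter immediately
    if score >= 0.40:
        return "review"    # plausible match, needs a human look
    return "skip"          # within the bulk of random-pair noise

print([triage(s) for s in (0.62, 0.47, 0.21)])  # ['strong', 'review', 'skip']
```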

Matryoshka Dimension Quality

Trained with Matryoshka Representation Learning — quality at different truncation levels:

| Dimensions | Good Match Score | Bad Match Score | Delta |
|---|---|---|---|
| 768 (full) | 0.691 | 0.228 | 0.462 |
| 384 (recommended) | 0.696 | 0.186 | 0.510 |

384-dim truncation actually improves separation — Matryoshka loss concentrates the most discriminative information in the first 384 dimensions.
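Matryoshka truncation itself is just slicing off the leading dimensions and re-normalizing, which is roughly what passing `truncate_dim` to `SentenceTransformer` does for you. A sketch with synthetic 768-dim vectors standing in for full-width model output:

```python
import numpy as np

def truncate(emb: np.ndarray, dims: int) -> np.ndarray:
    """Keep the first `dims` Matryoshka dimensions and re-normalize to unit length."""
    head = emb[..., :dims]
    return head / np.linalg.norm(head, axis=-1, keepdims=True)

# Synthetic stand-ins for two full 768-dim embeddings
rng = np.random.default_rng(1)
full = rng.normal(size=(2, 768))
full /= np.linalg.norm(full, axis=1, keepdims=True)

for d in (768, 384, 256, 128):
    a, b = truncate(full, d)
    print(d, round(float(a @ b), 3))  # cosine similarity at each truncation level
```

Re-normalizing after the slice matters: without it, truncated dot products are no longer cosine similarities.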

Training Details

Data

  • 15,600 resume-job pairs scored 0.0–1.0 by Cerebras gpt-oss-120b
  • 52 candidates × 300 random jobs each
  • Scoring criteria: technical skills overlap (50%), role/title alignment (30%), experience level match (20%)

Training Pipeline

  1. Phase 1: MatryoshkaLoss wrapping MultipleNegativesRankingLoss (5 epochs)
    • Triplets: anchor (candidate) + positive (score ≥ 0.6) + hard negative (score ≤ 0.3)
    • 2,702 triplets, effective batch size 32 (8 × 4 gradient accumulation)
    • Matryoshka dims: [768, 384, 256, 128]
  2. Phase 2: CosineSimilarityLoss refinement (2 epochs)
    • Mid-range pairs (0.3 < score < 0.6) for calibrating the middle of the score range
    • 2,503 pairs, learning rate 5e-6
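The Phase 1 triplet construction can be sketched in plain Python. The field names and pairing scheme here are assumptions — the card states the score cutoffs but not how positives and negatives were combined, so the full cross product below is illustrative only:

```python
def build_triplets(pairs, pos_min=0.6, neg_max=0.3):
    """Turn LLM-scored (candidate, job, score) rows into MNRL triplets.

    For each candidate, every job scored >= pos_min is paired with every
    job scored <= neg_max as (anchor, positive, hard negative).
    Mid-range pairs are left out; Phase 2 uses them for calibration.
    """
    by_candidate = {}
    for cand, job, score in pairs:
        by_candidate.setdefault(cand, []).append((job, score))

    triplets = []
    for cand, jobs in by_candidate.items():
        positives = [j for j, s in jobs if s >= pos_min]
        negatives = [j for j, s in jobs if s <= neg_max]
        for pos in positives:
            for neg in negatives:
                triplets.append((cand, pos, neg))
    return triplets

rows = [
    ("cand_a", "job_1", 0.82),  # good match -> positive
    ("cand_a", "job_2", 0.45),  # mid-range  -> Phase 2 only
    ("cand_a", "job_3", 0.12),  # bad match  -> hard negative
]
print(build_triplets(rows))  # [('cand_a', 'job_1', 'job_3')]
```

The resulting triplets feed `MultipleNegativesRankingLoss` wrapped in `MatryoshkaLoss`, which applies the ranking objective at each of the configured dimension prefixes.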

Training Logs

Phase 1 — MatryoshkaLoss + MNRL (5 epochs):

| Epoch | Step | Loss |
|---|---|---|
| 0.12 | 10 | 2.942 |
| 1.06 | 90 | 1.856 |
| 2.00 | 170 | 1.792 |
| 3.06 | 260 | 1.757 |
| 4.00 | 340 | 1.684 |
| 4.95 | 420 | 1.634 |

Phase 2 — CosineSimilarityLoss (2 epochs):

| Epoch | Step | Loss |
|---|---|---|
| 0.13 | 10 | 0.044 |
| 0.51 | 40 | 0.009 |
| 1.01 | 80 | 0.007 |
| 1.91 | 150 | 0.006 |

Hardware

  • Training: AMD Instinct MI300X (ROCm 6.2), ~3 minutes total
  • Evaluation: AMD Instinct MI300X + Apple M-series CPU

Framework Versions

  • Python: 3.12.3
  • Sentence Transformers: 5.2.3
  • Transformers: 5.3.0
  • PyTorch: 2.5.1+rocm6.2
  • Accelerate: 1.13.0
  • Datasets: 4.7.0

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': False, 'architecture': 'MPNetModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_mean_tokens': True})
  (2): Normalize()
)

Why Fine-Tune?

Generic sentence-transformers treat resumes as arbitrary text. Fine-tuning teaches:

  • Domain vocabulary: "OPT", "H1B", "C2C" are visa types, not random acronyms
  • Structural alignment: Match skills sections to requirements sections
  • Experience calibration: "3 years Java" → "mid-level Java developer", not "senior architect"
  • Description reading: v2 matches on actual job description content, not just title keywords

General-purpose models (bge, e5) score 0.19–0.24 Spearman on resume-job matching despite leading MTEB leaderboards. Domain fine-tuning is essential for this task.

Limitations

  • English only, US tech market bias
  • 512 token limit — key information should appear early in the text
  • Trained on tech/IT recruiting data — may underperform on non-tech roles
  • LLM-scored training data inherits any biases from the scoring model

Citation

BibTeX

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

License

Apache 2.0
