# sandeeppanem/qwen3-0.6b-resume-json
A fine-tuned Qwen3-0.6B model for resume parsing and structured JSON extraction using LoRA (Low-Rank Adaptation).
⚠️ IMPORTANT: This repository contains ONLY the LoRA adapter weights.
You must load the base model `Qwen/Qwen3-0.6B` separately.
See the Usage section below for complete loading instructions.
## Model Details
- Base Model: Qwen/Qwen3-0.6B
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- Training Dataset: sandeeppanem/resume-json-extraction-5k
- Training Examples: 4,879 resumes with structured JSON outputs
- Training Framework: TRL (SFTTrainer) with PEFT
## Training Configuration
### LoRA Parameters
- LoRA Rank (r): 8
- LoRA Alpha: 16
- LoRA Dropout: 0.05
- Target Modules: q_proj, v_proj, k_proj, o_proj
- Bias: none (not trained)
- Trainable Parameters: ~2.3M (0.38% of total 598M parameters)
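The ~2.3M figure can be sanity-checked from the LoRA settings above. This is a back-of-the-envelope sketch, assuming the published Qwen3-0.6B dimensions (hidden size 1024, 28 layers, grouped-query attention with 16 query heads and 8 key/value heads, head dim 128); each adapted projection gains two low-rank factors, A (r × d_in) and B (d_out × r).

```python
# Estimate LoRA trainable parameters for the four target modules.
# Assumed Qwen3-0.6B config: hidden_size=1024, 28 layers,
# 16 query heads / 8 KV heads (GQA), head_dim=128.
hidden = 1024
q_out = 16 * 128   # q_proj / o_proj attention dim (2048)
kv_out = 8 * 128   # k_proj / v_proj output dim (1024)
r = 8              # LoRA rank

def lora_params(d_in, d_out, r):
    # Two low-rank factors per adapted weight: A is (r x d_in), B is (d_out x r)
    return r * (d_in + d_out)

per_layer = (
    lora_params(hidden, q_out, r)     # q_proj
    + lora_params(hidden, kv_out, r)  # k_proj
    + lora_params(hidden, kv_out, r)  # v_proj
    + lora_params(q_out, hidden, r)   # o_proj
)
total = 28 * per_layer
print(total)  # 2293760, i.e. ~2.3M, ~0.38% of 598M
```

This reproduces both numbers quoted above (2,293,760 / 598M ≈ 0.38%).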
### Training Hyperparameters
- Learning Rate: 1e-4
- Batch Size: 2 per device
- Gradient Accumulation Steps: 8
- Effective Batch Size: 16 (2 × 8)
- Epochs: 3
- Max Sequence Length: 1536
- Optimizer: adamw_torch
- Weight Decay: 0.01
- Learning Rate Scheduler: cosine
- Warmup Steps: 100
- Gradient Checkpointing: Enabled (for memory efficiency)
- Mixed Precision: fp16 (float16)
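The schedule implied by these hyperparameters (peak learning rate 1e-4, 100 warmup steps, cosine decay) can be sketched in a few lines. The total step count here is an estimate derived from 4,879 examples, an effective batch of 16, and 3 epochs; it is not taken from the actual training logs.

```python
import math

def lr_at(step, total_steps, warmup_steps=100, peak_lr=1e-4):
    """Linear warmup to peak_lr, then cosine decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Estimated optimizer steps: ceil(4879 / 16) per epoch, times 3 epochs
steps_per_epoch = math.ceil(4879 / 16)  # 305
total_steps = steps_per_epoch * 3       # 915

print(lr_at(0, total_steps))    # 0.0 (start of warmup)
print(lr_at(100, total_steps))  # 1e-4 (peak, end of warmup)
print(lr_at(915, total_steps))  # 0.0 (fully decayed)
```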
## Usage
### Load Model
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-0.6B",
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "sandeeppanem/qwen3-0.6b-resume-json")
model.eval()  # Set to evaluation mode

tokenizer = AutoTokenizer.from_pretrained("sandeeppanem/qwen3-0.6b-resume-json", trust_remote_code=True)

# Set padding token if not present
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
    tokenizer.pad_token_id = tokenizer.eos_token_id
```
### Inference
```python
import torch
import json

# Prepare input
resume_text = """Your resume text here..."""

messages = [
    {
        "role": "system",
        "content": "You are an expert resume parser. Extract structured information from resumes and return ONLY valid JSON. Do not include explanations or extra text."
    },
    {
        "role": "user",
        "content": f"Resume:\n{resume_text}"
    }
]

# Apply chat template
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False
)

# Tokenize and generate
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        do_sample=False,  # IMPORTANT: Use deterministic generation for JSON
        pad_token_id=tokenizer.eos_token_id
    )

# Extract only the assistant's response (after the input prompt)
assistant_response = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:],
    skip_special_tokens=True
).strip()

# Parse JSON
try:
    parsed_json = json.loads(assistant_response)
    print("✅ Valid JSON")
    print(json.dumps(parsed_json, indent=2))
except json.JSONDecodeError as e:
    print(f"⚠️ Invalid JSON: {e}")
    print("Raw response:", assistant_response)
```
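If the model occasionally wraps its output in extra text or markdown fences, a small salvage helper can often recover the JSON object before giving up. This `extract_json` function is a hypothetical convenience, not part of the model's pipeline; it tries a direct parse, then falls back to the outermost braces.

```python
import json

def extract_json(text):
    """Best-effort recovery of a single JSON object from model output.

    Tries a direct parse first; on failure, retries on the substring
    between the first '{' and the last '}'.
    """
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        start, end = text.find("{"), text.rfind("}")
        if start != -1 and end > start:
            return json.loads(text[start:end + 1])
        raise

print(extract_json('```json\n{"name": "Jane Doe"}\n```'))  # {'name': 'Jane Doe'}
```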
## Model Performance
This model was fine-tuned on 4,879 resume examples and is optimized for:
- Extracting structured information from unstructured resume text
- Generating valid JSON output without explanations
- Handling diverse resume formats and job titles
## Limitations
- The model is fine-tuned specifically for resume parsing tasks
- Performance may vary with resumes in languages other than English
- Very long resumes (>1536 tokens) may be truncated
- The model requires the base Qwen3-0.6B model to function
## Citation
If you use this model, please cite:
```bibtex
@misc{qwen3-resume-json,
  title={Qwen3-0.6B Resume JSON Extraction Model},
  author={Sandeep Panem},
  year={2026},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/sandeeppanem/qwen3-0.6b-resume-json}}
}
```
## License
This model is licensed under Apache 2.0, same as the base Qwen3-0.6B model.