Document Forgery Detector
A fine-tuned Vision Transformer (ViT) model for detecting forged or tampered documents. It classifies a document image as either real or forged, with a reported accuracy of 92.2%.
This model was developed as a Final Year Project (FYP) at Sir Syed University of Engineering & Technology (SSUET), Karachi, Pakistan.
Model Details
Model Description
- Model type: Vision Transformer (ViT) fine-tuned for binary image classification
- Base model: google/vit-base-patch16-224
- Developed by: M. Umair Khan, Computer Engineering Technology, SSUET Karachi
- Institution: Sir Syed University of Engineering & Technology (SSUET), Karachi, Pakistan
- Project type: Final Year Project (FYP)
- Language(s): English
- License: MIT
- Finetuned from: google/vit-base-patch16-224
Uses
Direct Use
This model can be used to detect whether a scanned or photographed document has been tampered with or forged. Suitable for:
- Identity document verification (ID cards, passports)
- Academic certificate authentication
- Invoice and financial document fraud detection
- General document integrity checks
Downstream Use
Can be integrated into document verification pipelines, KYC (Know Your Customer) systems, HR onboarding tools, or any workflow that requires document authenticity checks.
Out-of-Scope Use
- This model is not designed for pixel-level forgery localization (it predicts a document-level label only)
- Not suitable for handwriting verification or signature authentication
- Should not be used as the sole verification mechanism in high-stakes legal or financial decisions without human review
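The human-review requirement above can be sketched as a small routing rule. This is an illustration only: the function name `route_prediction`, the 0.90 threshold, and the outcome strings are assumptions rather than part of this model card. It assumes a result dictionary with `label` and `confidence` keys, and that the authentic class is named `real`.

```python
def route_prediction(result, accept_threshold=0.90):
    """Decide what to do with one document-level prediction.

    `result` is assumed to look like {'label': 'real', 'confidence': 0.97}.
    The threshold is illustrative and should be tuned on held-out data.
    """
    # Only a confident "real" prediction skips manual inspection.
    if result["label"] == "real" and result["confidence"] >= accept_threshold:
        return "auto_pass"
    # Everything else (any other label, or low confidence) goes to a person.
    return "human_review"


if __name__ == "__main__":
    print(route_prediction({"label": "real", "confidence": 0.97}))
    print(route_prediction({"label": "real", "confidence": 0.61}))
```

Routing every non-`real` or low-confidence result to a reviewer keeps the model in a screening role rather than making it the final decision maker.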
How to Get Started
from transformers import ViTForImageClassification, ViTImageProcessor
from PIL import Image, ImageChops
import torch
import torch.nn.functional as F
import io

# Load model and processor
model = ViTForImageClassification.from_pretrained('zodumair/document-forgery-detector')
processor = ViTImageProcessor.from_pretrained('zodumair/document-forgery-detector')

def compute_ela(image_path, quality=90, scale=15):
    # Error Level Analysis: re-save as JPEG and amplify the per-pixel difference
    original = Image.open(image_path).convert('RGB')
    buf = io.BytesIO()
    original.save(buf, 'JPEG', quality=quality)
    buf.seek(0)
    recompressed = Image.open(buf).convert('RGB')
    ela = ImageChops.difference(original, recompressed)
    # Largest per-channel difference; fall back to 1 to avoid division by zero
    max_diff = max(ex[1] for ex in ela.getextrema()) or 1
    ela = ela.point(lambda px: min(255, int(px * (255.0 / max_diff) * (scale / 10.0))))
    return ela

def predict(image_path):
    img = Image.open(image_path).convert('RGB')
    ela = compute_ela(image_path)
    # Blend 70% original image with 30% ELA map, matching the training preprocessing
    blended = Image.blend(img, ela, alpha=0.3)
    inputs = processor(images=blended, return_tensors='pt')
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = F.softmax(logits, dim=-1)
    pred = torch.argmax(probs, dim=-1).item()
    return {'label': model.config.id2label[pred], 'confidence': probs[0][pred].item()}

result = predict('your_document.jpg')
print(result)  # e.g. {'label': 'real', 'confidence': 0.97}
Training Details
Training Data
The model was trained on a combined dataset of 2000 real and 2000 forged document images:
- Real documents: Sourced from chainyo/rvl-cdip (RVL-CDIP dataset), which contains real scanned documents across 16 categories including invoices, letters, forms, emails, resumes, and more
- Synthetic real documents: Faker-generated documents (invoices, ID cards, certificates, passports, transcripts) rendered using PIL
- Forged documents: Programmatically generated by applying forgery attack functions to real documents, including:
  - Copy-move attack (region duplication)
  - Text replacement (erase and rewrite field values)
  - Stamp overlay (fake verification stamps)
  - JPEG compression artifacts (double-compression of regions)
  - Splicing (pasting regions from different documents)
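As an illustration of the first attack in the list, a copy-move forgery can be sketched in a few lines. This is not the author's generation code, which the card does not include: the function name `copy_move`, its parameters, and the use of a nested list of pixel values instead of a PIL image are assumptions chosen to keep the example dependency-free.

```python
import random


def copy_move(pixels, box_w, box_h, rng=None):
    """Return a copy of `pixels` with one rectangular region duplicated elsewhere.

    `pixels` is a list of rows, each row a list of pixel values (hypothetical
    stand-in for an image). Also returns the source and destination corners.
    """
    rng = rng or random.Random()
    height, width = len(pixels), len(pixels[0])
    if box_w > width or box_h > height:
        raise ValueError("box does not fit inside the image")
    # Pick a source patch and a destination for the pasted copy.
    # The two may overlap or even coincide; a real generator might resample.
    src_x = rng.randint(0, width - box_w)
    src_y = rng.randint(0, height - box_h)
    dst_x = rng.randint(0, width - box_w)
    dst_y = rng.randint(0, height - box_h)
    forged = [row[:] for row in pixels]  # leave the input untouched
    for dy in range(box_h):
        for dx in range(box_w):
            # Read from the original so overlapping regions copy cleanly.
            forged[dst_y + dy][dst_x + dx] = pixels[src_y + dy][src_x + dx]
    return forged, (src_x, src_y), (dst_x, dst_y)


if __name__ == "__main__":
    image = [[y * 8 + x for x in range(8)] for y in range(6)]
    forged, src, dst = copy_move(image, box_w=3, box_h=2, rng=random.Random(0))
    print("copied from", src, "to", dst)
```

With Pillow, the same idea would typically be expressed with `Image.crop` followed by `Image.paste` on a copy of the document image.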
Preprocessing
Each image undergoes Error Level Analysis (ELA) blending before being passed to the model.
ELA highlights regions with inconsistent compression levels, a reliable indicator of tampering.
The ELA map is blended with the original image at alpha=0.3 before resizing to 224x224.
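The per-pixel arithmetic behind this step can be written out for a single channel value. The sketch below mirrors the rescaling used in the getting-started code and the linear blend that Pillow's `Image.blend` documents as `image1 * (1.0 - alpha) + image2 * alpha`. The helper names `scale_ela` and `blend` are hypothetical and operate on plain numbers rather than images.

```python
def scale_ela(diff, max_diff, scale=15):
    """Rescale one ELA difference value to the 0-255 range, then amplify it."""
    max_diff = max_diff or 1  # identical images give max_diff == 0
    return min(255, int(diff * (255.0 / max_diff) * (scale / 10.0)))


def blend(original, ela, alpha=0.3):
    """Linear blend of one channel value: (1 - alpha) * original + alpha * ela."""
    return original * (1.0 - alpha) + ela * alpha


if __name__ == "__main__":
    ela_value = scale_ela(diff=10, max_diff=20)  # half of the largest difference
    print(ela_value)
    print(blend(original=200, ela=ela_value))
```

With alpha=0.3 the model input stays mostly the original document (70%), with the amplified compression residue contributing the remaining 30%.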
Training Hyperparameters
| Parameter | Value |
|---|---|
| Base model | google/vit-base-patch16-224 |
| Epochs | 20 (best at epoch 13) |
| Batch size | 32 |
| Learning rate | 1e-5 |
| LR scheduler | Cosine |
| Weight decay | 0.05 |
| Warmup steps | 200 |
| Label smoothing | 0.1 |
| Classifier dropout | 0.4 |
| Mixed precision | FP16 |
| Hardware | Google Colab T4 GPU |
| Training time | ~28 minutes |
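To make the scheduler rows concrete, the following sketch computes a learning rate under linear warmup followed by cosine decay, using the 1e-5 peak rate and 200 warmup steps from the table. The card does not say which implementation was used, so the exact shape here (linear warmup, half-cosine decay to zero) is an assumption, as are the function name `cosine_lr` and the total of 2,000 steps in the usage lines.

```python
import math


def cosine_lr(step, total_steps, base_lr=1e-5, warmup_steps=200):
    """Learning rate at `step` for linear warmup followed by cosine decay to zero."""
    if step < warmup_steps:
        # Ramp linearly from 0 up to base_lr over the warmup period.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup schedule that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    total = 2000  # hypothetical step count, not taken from the card
    for step in (0, 100, 200, 1100, 2000):
        print(step, cosine_lr(step, total))
```

The rate peaks at the end of warmup and then falls smoothly, so the later epochs make progressively smaller weight updates.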
Model Details
- Model type: Vision Transformer (ViT) for image classification
- Base model: google/vit-base-patch16-224
- Task: Binary classification (Real vs Forged documents)
- Developed by: M. Umair Khan, Computer Engineering Technology
- Institution: SSUET Karachi, Pakistan
- License: MIT
- Frameworks: PyTorch, HuggingFace Transformers
Training Configuration
| Parameter | Value |
|---|---|
| Base model | google/vit-base-patch16-224 |
| Epochs | 15 |
| Batch size | 32 |
| Learning rate | 1e-5 |
| Scheduler | Cosine |
| Weight decay | 0.05 |
| Warmup steps | 200 |
| Label smoothing | 0.1 |
| Dropout | 0.4 |
| Precision | FP16 |
| Hardware | Google Colab T4 GPU |
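The label smoothing row can be illustrated with a small, dependency-free loss function. It follows the common definition in which the target puts `1 - smoothing` on the true class and spreads `smoothing` uniformly over all classes. Whether training used exactly this definition is an assumption, and the function name `smoothed_cross_entropy` is hypothetical.

```python
import math


def smoothed_cross_entropy(logits, target, smoothing=0.1):
    """Cross-entropy for one example against a label-smoothed target."""
    num_classes = len(logits)
    # Numerically stable log-softmax.
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    log_probs = [z - log_norm for z in logits]
    loss = 0.0
    for k, log_p in enumerate(log_probs):
        # Uniform share for every class, plus the remaining mass on the true class.
        weight = smoothing / num_classes
        if k == target:
            weight += 1.0 - smoothing
        loss -= weight * log_p
    return loss


if __name__ == "__main__":
    confident = [5.0, 0.0]
    print(smoothed_cross_entropy(confident, target=0, smoothing=0.0))
    print(smoothed_cross_entropy(confident, target=0, smoothing=0.1))
```

Under this definition, with two classes and smoothing 0.1 the target is (0.95, 0.05), so the loss cannot fall below roughly 0.199 even for a perfectly calibrated prediction.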
Evaluation Results
Verified Test Performance (500 random samples)
| Metric | Score |
|---|---|
| Accuracy | ~91% |
| F1 Score | ~0.91 |
This result is based on randomized evaluation over 500 unseen test samples.
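For reference, both metrics can be derived from the four confusion-matrix counts. The card does not state which class is treated as positive or whether F1 is averaged across classes, so this sketch assumes binary F1 with the forged class as positive. The function name `accuracy_and_f1` and the counts in the usage lines are made up for illustration and are not the model's results.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Accuracy and binary F1 from confusion-matrix counts (positive = forged)."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f1


if __name__ == "__main__":
    # Hypothetical counts, chosen only to show the arithmetic.
    acc, f1 = accuracy_and_f1(tp=40, fp=10, fn=10, tn=40)
    print(f"accuracy={acc:.3f} f1={f1:.3f}")
```

Accuracy and F1 stay close to each other when the classes are balanced and errors are roughly symmetric, which is consistent with the near-identical values reported above.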
Training Progress
| Epoch | Train Loss | Val Loss | Accuracy | F1 |
|---|---|---|---|---|
| 1 | 0.715 | 0.688 | 0.543 | 0.539 |
| 2 | 0.574 | 0.546 | 0.749 | 0.700 |
| 3 | 0.449 | 0.405 | 0.870 | 0.868 |
| 4 | 0.389 | 0.375 | 0.886 | 0.886 |
| 5 | 0.392 | 0.374 | 0.881 | 0.875 |
| 6 | 0.359 | 0.365 | 0.887 | 0.885 |
| 7 | 0.334 | 0.374 | 0.888 | 0.883 |
| 8 | 0.328 | 0.358 | 0.894 | 0.893 |
| 9 | 0.328 | 0.371 | 0.891 | 0.888 |
| 10 | 0.308 | 0.369 | 0.901 | 0.900 |
| 11 | 0.306 | 0.364 | 0.907 | 0.907 |
| 12 | 0.296 | 0.364 | 0.903 | 0.902 |
| 13 | 0.265 | 0.370 | 0.901 | 0.900 |
| 14 | 0.276 | 0.374 | 0.901 | 0.899 |
| 15 | 0.262 | 0.383 | 0.894 | 0.890 |
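The table can also be scanned programmatically when choosing a checkpoint. The sketch below copies the rows above and reports the best epoch under two common criteria, highest validation accuracy and lowest validation loss. The card does not state which criterion was used for checkpoint selection, and the variable names are illustrative.

```python
# (epoch, train_loss, val_loss, accuracy, f1), copied from the table above
history = [
    (1, 0.715, 0.688, 0.543, 0.539),
    (2, 0.574, 0.546, 0.749, 0.700),
    (3, 0.449, 0.405, 0.870, 0.868),
    (4, 0.389, 0.375, 0.886, 0.886),
    (5, 0.392, 0.374, 0.881, 0.875),
    (6, 0.359, 0.365, 0.887, 0.885),
    (7, 0.334, 0.374, 0.888, 0.883),
    (8, 0.328, 0.358, 0.894, 0.893),
    (9, 0.328, 0.371, 0.891, 0.888),
    (10, 0.308, 0.369, 0.901, 0.900),
    (11, 0.306, 0.364, 0.907, 0.907),
    (12, 0.296, 0.364, 0.903, 0.902),
    (13, 0.265, 0.370, 0.901, 0.900),
    (14, 0.276, 0.374, 0.901, 0.899),
    (15, 0.262, 0.383, 0.894, 0.890),
]

# Criterion 1: highest validation accuracy.
best_by_accuracy = max(history, key=lambda row: row[3])
# Criterion 2: lowest validation loss.
best_by_val_loss = min(history, key=lambda row: row[2])

if __name__ == "__main__":
    print("best accuracy at epoch", best_by_accuracy[0])
    print("lowest val loss at epoch", best_by_val_loss[0])
```

On these rows, accuracy peaks at epoch 11 (0.907) and validation loss is lowest at epoch 8 (0.358), while training loss keeps falling through epoch 15, a pattern that usually indicates mild overfitting in the later epochs.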
Bias, Risks, and Limitations
- The forgery attacks used in training are programmatic, so the model may not generalise well to sophisticated AI-generated forgeries (e.g. deepfake documents, inpainting-based edits)
- Performance may vary on document types not well represented in RVL-CDIP
- The model predicts a document-level label only; it does not localise which region was forged
- Should be used as a screening tool, not as a definitive legal verdict
Environmental Impact
- Hardware: Google Colab T4 GPU (NVIDIA Tesla T4, 16GB VRAM)
- Cloud provider: Google Colab
- Training time: ~28 minutes
- Compute region: Google Cloud (us-central1)
- Carbon emissions can be estimated using the ML Impact Calculator
Citation
If you use this model in your research or project, please cite:
@misc{umair2025forgerydetector,
  author = {M. Umair Khan},
  title = {Document Forgery Detector: A Fine-tuned ViT for Document Authenticity Classification},
  year = {2026},
  publisher = {HuggingFace},
  institution = {Sir Syed University of Engineering \& Technology, Karachi, Pakistan},
  url = {https://huggingface.co/zodumair/document-forgery-detector}
}
Model Card Authors
M. Umair Khan, Computer Engineering Technology (Final Year), Sir Syed University of Engineering & Technology (SSUET), Karachi, Pakistan
This model was developed as part of a Final Year Project (FYP) at SSUET Karachi. Built using HuggingFace Transformers, PyTorch, and Google Colab.