Fine-tuned Vision Transformer (ViT) for AI-generated image detection.
This model is a binary classifier trained to distinguish real photographs from images generated by modern AI models (Stable Diffusion 2.1 / XL / 3, DALL-E 3, Midjourney v6).
Part of the SteganographIA project (MIAGE TPI).
| Metric | Value |
|---|---|
| Accuracy | 0.923 |
| Real precision / recall | 0.88 / 0.99 |
| AI precision / recall | 0.98 / 0.86 |
The model is conservative: it almost never accuses a real image of being AI-generated (1.4% false positive rate), but misses ~14% of AI images by classifying them as real.
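As a quick consistency check, the figures above can be tied together with a few lines of arithmetic. This is a minimal sketch that assumes the evaluation set is balanced 50/50 between real and AI images (the card states this for the rebalanced dataset, but not explicitly for the evaluation split); the function name `accuracy_from_recalls` is illustrative only.

```python
def accuracy_from_recalls(real_recall, ai_recall, real_fraction=0.5):
    """Overall accuracy as the class-weighted mean of the per-class recalls."""
    return real_fraction * real_recall + (1.0 - real_fraction) * ai_recall

# Real recall is 1 minus the false positive rate on real images (1.4%).
real_recall = 1.0 - 0.014
# AI recall is taken from the metrics table (about 14% of AI images are missed).
ai_recall = 0.86

# Assumption: balanced evaluation set, so each class contributes half the weight.
acc = accuracy_from_recalls(real_recall, ai_recall)
print(f"{acc:.3f}")  # 0.923
```

Under that assumption the result matches the reported accuracy, which suggests the table and the error rates quoted above describe the same evaluation.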
Base model: google/vit-base-patch16-224

Dataset: Rajarshi-Roy-research/Defactify_Image_Dataset (Defactify Challenge @ AAAI), rebalanced 50/50 by undersampling the AI class, stratified across the 5 generators.

```python
from transformers import AutoImageProcessor, AutoModelForImageClassification
from PIL import Image
import torch

# Load the image processor and the fine-tuned classifier from the Hub
processor = AutoImageProcessor.from_pretrained("delpot/steganograph-ia-detector")
model = AutoModelForImageClassification.from_pretrained("delpot/steganograph-ia-detector")

# Convert to RGB so grayscale or RGBA files have the three channels the model expects
image = Image.open("path/to/image.jpg").convert("RGB")
inputs = processor(image, return_tensors="pt")

# Inference only, so gradient tracking is disabled
with torch.no_grad():
    logits = model(**inputs).logits

# Pick the highest-scoring class and map its index to a label name
predicted = logits.argmax(-1).item()
print(model.config.id2label[predicted])
```
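
Because the model leans toward predicting "real", some users may prefer a class probability and an adjustable threshold over a hard argmax. The sketch below is one way to do that in plain Python, starting from the two logits as a list (for example `logits[0].tolist()`). The label names `"real"` and `"ai"`, the helper `classify`, and the example logits are hypothetical; check `model.config.id2label` for the actual names before using it.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(logits, id2label, ai_label, threshold=0.5):
    """Return the AI-class probability and whether it reaches the threshold."""
    probs = softmax(logits)
    # Look up which index carries the AI label instead of hard-coding it.
    ai_index = next(i for i, name in id2label.items() if name == ai_label)
    p_ai = probs[ai_index]
    return {"p_ai": p_ai, "flag_as_ai": p_ai >= threshold}

# Hypothetical label mapping and logits, for illustration only.
id2label = {0: "real", 1: "ai"}
example_logits = [1.2, 0.8]

default = classify(example_logits, id2label, ai_label="ai")
stricter = classify(example_logits, id2label, ai_label="ai", threshold=0.35)
print(default, stricter)
```

With two classes, a threshold of 0.5 reproduces the argmax decision. Lowering it flags more images as AI-generated, which should reduce missed AI images at the cost of more false positives on real photographs; the right value would need to be tuned on held-out data.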