VM.AI - Image Classifier

EfficientNet-B4 trained on 14 activity categories for the image-to-prompt pipeline.

Performance

Metric Value
Test samples {test_samples}
Top-1 accuracy {top1}
Top-3 accuracy {top3}
Macro F1 {macro_f1}
Weighted F1 {weighted_f1}

Per-Class Metrics

Class Precision Recall F1 Support
{class_rows}
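
The card does not say how the metrics in the two tables above were computed. The following is a minimal sketch that assumes the standard definitions (top-k accuracy, per-class precision/recall/F1, macro and weighted F1). The function names and the toy 3-class data are hypothetical and are not taken from the evaluation script.

```python
def top_k_accuracy(y_true, ranked_preds, k):
    """Fraction of samples whose true label is among the k highest-ranked predictions."""
    hits = sum(1 for t, ranked in zip(y_true, ranked_preds) if t in ranked[:k])
    return hits / len(y_true)


def per_class_metrics(y_true, y_pred, num_classes):
    """One row per class: precision, recall, F1 and support (number of true samples)."""
    rows = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        rows.append({"class": c, "precision": precision, "recall": recall,
                     "f1": f1, "support": tp + fn})
    return rows


def macro_f1(rows):
    """Unweighted mean of per-class F1."""
    return sum(r["f1"] for r in rows) / len(rows)


def weighted_f1(rows):
    """Mean of per-class F1 weighted by class support."""
    total = sum(r["support"] for r in rows)
    return sum(r["f1"] * r["support"] for r in rows) / total


# Toy data (hypothetical): 6 samples, 3 classes, predictions ranked best-first.
y_true = [0, 0, 1, 1, 2, 2]
ranked_preds = [[0, 1, 2], [1, 0, 2], [1, 2, 0], [1, 0, 2], [2, 0, 1], [0, 1, 2]]
y_pred = [ranked[0] for ranked in ranked_preds]  # top-1 prediction per sample

top1 = top_k_accuracy(y_true, ranked_preds, k=1)
top2 = top_k_accuracy(y_true, ranked_preds, k=2)
rows = per_class_metrics(y_true, y_pred, num_classes=3)
print(top1, top2, macro_f1(rows), weighted_f1(rows))
```

Macro F1 treats every class equally, while weighted F1 follows the class distribution of the test set, so the two diverge when the 14 categories are imbalanced.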

Usage

import torch
import timm
from PIL import Image
from torchvision import transforms

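# Build the architecture with a 14-class head, then load the fine-tuned weights.
model = timm.create_model("efficientnet_b4", pretrained=False, num_classes=14)
model.load_state_dict(torch.load("efficientnet_b4_classifier.pth", map_location="cpu"))
model.eval()  # inference mode: disables dropout, uses stored batch-norm statistics

# Preprocessing: resize to 380x380 and normalize with the ImageNet mean and std.
transform = transforms.Compose([
    transforms.Resize((380, 380)),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

img = Image.open("photo.jpg").convert("RGB")
tensor = transform(img).unsqueeze(0)  # add a batch dimension: (1, 3, 380, 380)
with torch.no_grad():
    logits = model(tensor)
pred = logits.argmax(1).item()  # index of the predicted class, in the range 0..13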
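
The usage snippet above yields only the top-1 class index. Since the card also reports top-3 accuracy, a downstream step may want the three most likely classes with their probabilities. The sketch below shows one way to get them. It uses a hand-written logits tensor in place of the model output so that it runs on its own, and the `class_names` list is a hypothetical placeholder because the card does not list the 14 category names.

```python
import torch

# Hypothetical placeholder labels; replace with the real 14 category names.
class_names = [f"class_{i}" for i in range(14)]

# Stand-in for `logits = model(tensor)`: shape (1, 14), one row per image.
logits = torch.zeros(1, 14)
logits[0, 5] = 3.0
logits[0, 2] = 2.0
logits[0, 9] = 1.0

# Convert logits to probabilities, then take the three largest per row.
probs = torch.softmax(logits, dim=1)
top_probs, top_idx = probs.topk(3, dim=1)

top3 = [
    (class_names[i], p)
    for i, p in zip(top_idx[0].tolist(), top_probs[0].tolist())
]
for name, p in top3:
    print(f"{name}: {p:.3f}")
```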

Training

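Training runs in two phases: 5 epochs with the backbone frozen (only the classifier head is trained), followed by 20 epochs with the last 2 blocks unfrozen. The optimizer is AdamW with a cosine-annealing learning-rate schedule, and training uses automatic mixed precision (AMP). See train_classifier.py for details.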
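
The two-phase schedule described above can be sketched as follows. This is not the code from train_classifier.py: `TinyNet`, `set_trainable`, `run_phase`, the learning rates, and the synthetic batch are all hypothetical stand-ins chosen so the example runs on CPU in a moment, and AMP is left out for the same reason. With a timm EfficientNet the corresponding attributes are likely `model.blocks` and `model.classifier`, but that should be checked against the actual training script.

```python
import torch
from torch import nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for the backbone: four blocks plus a classifier head."""

    def __init__(self, num_classes: int = 14) -> None:
        super().__init__()
        self.blocks = nn.Sequential(
            *[nn.Sequential(nn.Linear(16, 16), nn.ReLU()) for _ in range(4)]
        )
        self.head = nn.Linear(16, num_classes)

    def forward(self, x):
        return self.head(self.blocks(x))


def set_trainable(module: nn.Module, trainable: bool) -> None:
    """Freeze or unfreeze every parameter of a module."""
    for param in module.parameters():
        param.requires_grad = trainable


def run_phase(model: nn.Module, epochs: int, lr: float) -> float:
    """Train the currently unfrozen parameters with AdamW and cosine annealing."""
    params = [p for p in model.parameters() if p.requires_grad]
    optimizer = torch.optim.AdamW(params, lr=lr)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)
    # One synthetic batch stands in for a real data loader.
    x = torch.randn(32, 16)
    y = torch.randint(0, 14, (32,))
    model.train()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = nn.functional.cross_entropy(model(x), y)
        loss.backward()
        optimizer.step()
        scheduler.step()  # anneal the learning rate once per epoch
    return loss.item()


torch.manual_seed(0)
model = TinyNet()

# Phase 1: freeze everything, then train only the head for 5 epochs.
set_trainable(model, False)
set_trainable(model.head, True)
run_phase(model, epochs=5, lr=1e-3)  # learning rate is an assumed value

# Phase 2: additionally unfreeze the last 2 blocks and train for 20 epochs.
for block in model.blocks[-2:]:
    set_trainable(block, True)
run_phase(model, epochs=20, lr=1e-4)  # learning rate is an assumed value
```

Building a fresh optimizer per phase keeps its parameter list in sync with what is currently unfrozen; the smaller learning rate in the second phase is a common choice when fine-tuning pretrained layers, not something the card states.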

Charts

Figures: confusion matrix, per-class metrics, top-K accuracy.

Dataset used to train: maxf-coder/task_image_classifier