How to use kenil-patel-183/mnist-cnn-digit-classifier with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("image-classification", model="kenil-patel-183/mnist-cnn-digit-classifier", trust_remote_code=True)
pipe("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/parrots.png")
# Load model directly
from transformers import AutoModelForImageClassification
model = AutoModelForImageClassification.from_pretrained("kenil-patel-183/mnist-cnn-digit-classifier", trust_remote_code=True, dtype="auto")

This is a Convolutional Neural Network (CNN) model trained on the MNIST dataset for handwritten digit classification.
This model classifies handwritten digits (0-9) from 28x28 grayscale images using a custom CNN architecture with batch normalization.
Architecture Details:
Security Note: Requires trust_remote_code=True because it uses custom model/processor classes.
from transformers import pipeline
clf = pipeline(
    "image-classification",
    model="kenil-patel-183/mnist-cnn-digit-classifier",
    trust_remote_code=True,  # required due to custom classes
)
preds = clf("path/to/digit.png", top_k=1)
print(preds)  # [{'label': '7', 'score': 0.998...}]
import torch
from transformers import AutoConfig, AutoModel, AutoImageProcessor
from PIL import Image
model_id = "kenil-patel-183/mnist-cnn-digit-classifier"
# The custom config, model, and processor classes are loaded from the repository
config = AutoConfig.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(model_id, trust_remote_code=True)
processor = AutoImageProcessor.from_pretrained(model_id, trust_remote_code=True)
image = Image.open("digit.png")
inputs = processor(images=image, return_tensors="pt")
# Inference only, so gradient tracking is disabled
with torch.no_grad():
    outputs = model(**inputs)
logits = outputs.logits
# Index of the highest-scoring class, i.e. the predicted digit
pred = logits.argmax(-1).item()
print(pred)
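The manual path above returns raw logits, while the pipeline reports a `score` between 0 and 1. As far as I know, the image-classification pipeline obtains that score by applying a softmax over the logits for multi-class models. The following is a minimal, dependency-free sketch of that step. The logit values are hypothetical and were not produced by this model. With PyTorch the same result should come from `torch.softmax(logits, dim=-1)`.

```python
import math

def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for the ten digit classes 0-9 (illustration only)
example_logits = [-2.1, 0.3, 1.0, -0.5, 0.2, -1.7, -3.0, 9.4, 0.8, 1.1]

probs = softmax(example_logits)
pred = max(range(len(probs)), key=probs.__getitem__)  # same as argmax
print(pred, round(probs[pred], 3))
```

The argmax of the probabilities is always the same as the argmax of the logits, so the softmax is only needed when a confidence value is wanted.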
MNISTCNN(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (lin): Linear(in_features=3136, out_features=10, bias=True)
  (network): Sequential(
    (0): Conv2d(1, 8, kernel_size=(3, 3), stride=(1, 1))
    (1): BatchNorm2d(8, eps=1e-05, momentum=0.1)
    (2): ReLU()
    (3): MaxPool2d(kernel_size=(2, 2), stride=2)
    (4): Conv2d(8, 16, kernel_size=(3, 3), stride=(1, 1))
    (5): BatchNorm2d(16, eps=1e-05, momentum=0.1)
    (6): ReLU()
    (7): Conv2d(16, 32, kernel_size=(3, 3), stride=(1, 1))
    (8): BatchNorm2d(32, eps=1e-05, momentum=0.1)
    (9): ReLU()
    (10): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1))
    (11): BatchNorm2d(64, eps=1e-05, momentum=0.1)
    (12): ReLU()
  )
)
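The `in_features=3136` of the final linear layer can be checked by tracing a 28x28 input through the printed layers. The sketch below does that arithmetic and also estimates the learnable parameter count. It assumes what the printout suggests but does not state outright: zero padding and dilation 1 in every convolution, convolution biases enabled, and affine batch normalization (one weight and one bias per channel). The helper name `conv_out` is made up for this example.

```python
def conv_out(size, kernel=3, stride=1, padding=0):
    """Output side length of a square convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1

size = 28
size = conv_out(size)                      # (0)  Conv2d(1, 8)    -> 26
size = conv_out(size, kernel=2, stride=2)  # (3)  MaxPool2d       -> 13
size = conv_out(size)                      # (4)  Conv2d(8, 16)   -> 11
size = conv_out(size)                      # (7)  Conv2d(16, 32)  -> 9
size = conv_out(size)                      # (10) Conv2d(32, 64)  -> 7

flat_features = 64 * size * size           # 64 channels of 7x7 feature maps
print(size, flat_features)                 # 7 3136

# Learnable parameters under the assumptions stated above
conv_channels = [(1, 8), (8, 16), (16, 32), (32, 64)]
conv_params = sum(c_in * c_out * 3 * 3 + c_out for c_in, c_out in conv_channels)
bn_params = sum(2 * c_out for _, c_out in conv_channels)  # weight + bias per channel
linear_params = flat_features * 10 + 10
total_params = conv_params + bn_params + linear_params
print(total_params)                        # 55994
```

Under these assumptions the linear layer holds 31,370 of the roughly 56k parameters, which is more than all four convolutions combined.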
For best results, input images should be preprocessed as follows:
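from torchvision import transforms

transform = transforms.Compose([
    transforms.Grayscale(),                      # collapse to a single channel
    transforms.Resize((28, 28)),                 # match the MNIST input size
    transforms.ToTensor(),                       # scale pixels to [0, 1]
    transforms.Normalize((0.1307,), (0.3081,)),  # MNIST mean and std
])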
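To make the normalization step concrete, here is a small sketch of what `ToTensor` followed by `Normalize((0.1307,), (0.3081,))` does to a single 8-bit pixel value. It is pure Python for illustration only. The function name `normalize_pixel` is hypothetical and is not part of torchvision or of this model's processor.

```python
MEAN, STD = 0.1307, 0.3081  # values taken from the transform above

def normalize_pixel(value_0_255):
    """Map an 8-bit grayscale value to the range the model was trained on."""
    scaled = value_0_255 / 255.0   # ToTensor scales 8-bit pixels to [0, 1]
    return (scaled - MEAN) / STD   # Normalize subtracts the mean, divides by std

black = normalize_pixel(0)    # background in MNIST
white = normalize_pixel(255)  # brightest stroke pixel
print(round(black, 4), round(white, 4))
```

Black pixels map to about -0.42 and white pixels to about 2.82. MNIST digits are light strokes on a dark background, so an image with dark ink on light paper would likely need to be inverted before this transform.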
The model achieves 99.25% accuracy on the MNIST test set.