InspectNet-CX
InspectNet-CX is a per-category PatchCore-based anomaly detector for the MVTec AD
benchmark, evaluated across 4 categories (bottle, cable, capsule, leather). Every
number on this card is traceable to a JSON report in reports/eval_harness/ from
this evaluation session. The earlier AUDIT.md concern that no native InspectNet-CX
checkpoints existed is now resolved: PatchCore Lightning checkpoints exist for all
4 categories and (for bottle and leather) for 3 seeds each. See PROVENANCE.md for
the per-metric source-of-truth map.
Headline: PaDiM to PatchCore Ablation
PatchCore replaces the PaDiM head from the prior baseline. The decisive wins are on the categories where PaDiM had headroom:
- Cable: image AUROC 0.8720 (PaDiM) -> 0.9910 (PatchCore, coreset 0.01). Delta +0.1190.
- Capsule: image AUROC 0.8807 (PaDiM) -> 0.9944 (PatchCore, coreset 0.01). Delta +0.1137.
Both margins are large enough that a single-seed measurement suffices (each gap is about two orders of magnitude larger than typical PatchCore seed noise). Bottle and leather reach the image-AUROC ceiling: 1.0000 on all 3 seeds, with zero seed variance.
Full Ablation Table
| category | method | coreset | image AUROC | pixel AUROC | AUPRO@0.3 | image delta | pixel delta | AUPRO delta |
|---|---|---|---|---|---|---|---|---|
| bottle | PaDiM | n/a | 0.9976 | 0.9816 | 0.9406 | | | |
| bottle | PatchCore | 0.01 | 1.0000 (mean, n=3, std=0) | 0.9852 +/- 0.0001 (n=3) | 0.9406 +/- 0.0005 (n=3) | +0.0024 | +0.0036 | +0.0000 |
| cable | PaDiM | n/a | 0.8720 | 0.9551 | 0.8519 | | | |
| cable | PatchCore | 0.01 | 0.9910 (single seed) | 0.9834 (single seed) | 0.9281 (single seed) | +0.1190 | +0.0283 | +0.0761 |
| capsule | PaDiM | n/a | 0.8807 | 0.9849 | 0.9149 | | | |
| capsule | PatchCore | 0.01 | 0.9944 (single seed) | 0.9902 (single seed) | 0.9382 (single seed) | +0.1137 | +0.0053 | +0.0233 |
| leather | PaDiM | n/a | 0.9925 | 0.9882 | 0.9682 | | | |
| leather | PatchCore | 0.01 | 1.0000 (mean, n=3, std=0) | 0.9922 +/- 0.0001 (n=3) | 0.9752 +/- 0.0006 (n=3) | +0.0075 | +0.0040 | +0.0070 |
Cable and capsule PatchCore rows are single-seed (legacy seed-0, see Seed Labeling Note below); the 0.119 and 0.114 image-AUROC margins over PaDiM are far above plausible seed noise so the verdict is robust. Bottle and leather PatchCore rows are mean +/- pstdev across 3 seeds (seed 0 legacy unseeded plus explicit seeds 1 and 2).
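As a quick sanity check, the image-delta column can be recomputed from the image AUROC column. The sketch below copies the values from the table above; the dictionary layout and the function name `image_delta` are illustrative and not part of the package.

```python
# Image AUROC values copied from the ablation table (PatchCore at coreset 0.01).
IMAGE_AUROC = {
    "bottle": {"padim": 0.9976, "patchcore": 1.0000},
    "cable": {"padim": 0.8720, "patchcore": 0.9910},
    "capsule": {"padim": 0.8807, "patchcore": 0.9944},
    "leather": {"padim": 0.9925, "patchcore": 1.0000},
}


def image_delta(category: str) -> float:
    """PatchCore minus PaDiM image AUROC, rounded to the table's 4 decimals."""
    row = IMAGE_AUROC[category]
    return round(row["patchcore"] - row["padim"], 4)


deltas = {category: image_delta(category) for category in IMAGE_AUROC}
for category, delta in deltas.items():
    print(f"{category:8s} {delta:+.4f}")
```

Because the inputs here are the rounded table entries, a delta recomputed this way can differ in the last digit from one computed on the unrounded values in the JSON reports.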
Cable Coreset Sensitivity
| coreset | image AUROC | pixel AUROC | AUPRO@0.3 |
|---|---|---|---|
| 0.01 | 0.9910 | 0.9834 | 0.9281 |
| 0.10 | 0.9856 | 0.9848 | 0.9304 |
| 0.25 | 0.9893 | 0.9844 | 0.9280 |
On cable, a 1% coreset matches 10% and 25% within noise (each of the three metrics varies by less than 0.006 across the sweep), so the paper-default 1% sampling ratio is sufficient for this category.
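The size of the coreset effect can be read off as the max-minus-min spread of each metric across the three ratios. This is a minimal sketch using the values from the table above; `CABLE_SWEEP` and `metric_spread` are hypothetical names introduced for illustration.

```python
# Cable coreset sweep, values copied from the table above.
CABLE_SWEEP = {
    0.01: {"image_auroc": 0.9910, "pixel_auroc": 0.9834, "aupro": 0.9281},
    0.10: {"image_auroc": 0.9856, "pixel_auroc": 0.9848, "aupro": 0.9304},
    0.25: {"image_auroc": 0.9893, "pixel_auroc": 0.9844, "aupro": 0.9280},
}


def metric_spread(sweep: dict, metric: str) -> float:
    """Largest minus smallest value of one metric across all coreset ratios."""
    values = [row[metric] for row in sweep.values()]
    return round(max(values) - min(values), 4)


spreads = {
    metric: metric_spread(CABLE_SWEEP, metric)
    for metric in ("image_auroc", "pixel_auroc", "aupro")
}
for metric, spread in spreads.items():
    print(f"{metric:12s} spread = {spread:.4f}")
```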
Seed Labeling Note
"Seed 0" refers to the legacy unseeded baseline run from Phase B (it predates the
--seed flag added in Phase Bx, so its RNG state is not pinned). Seeds 1 and 2
are pinned explicitly via the --seed flag in scripts/train_patchcore.py. All
three runs are reported as-is; we do not pretend they were drawn identically.
For bottle and leather, image AUROC is exactly 1.0000 across all three seeds, so the seed-0 ambiguity is moot at the image-classification level. Pixel AUROC and AUPRO show non-zero seed variance and are reported as mean +/- pstdev (n=3).
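The "mean +/- pstdev" convention can be reproduced with the standard library. The sketch below assumes the per-seed values are available as a plain list; `summarize_seeds` is an illustrative helper, not a function from the package. The only per-seed values stated on this card are the image AUROC results (1.0000 on all three seeds), so those are used as the example input.

```python
import statistics
from typing import Sequence, Tuple


def summarize_seeds(values: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, population standard deviation) over per-seed values."""
    # pstdev divides by n, not n - 1: the three runs are treated as the whole
    # population being described, not as a sample from a larger one.
    return statistics.fmean(values), statistics.pstdev(values)


# Image AUROC for bottle (and leather) is 1.0000 on seeds 0, 1, and 2.
mean, spread = summarize_seeds([1.0, 1.0, 1.0])
print(f"{mean:.4f} +/- {spread:.4f} (n=3)")
```

With n=3, the sample standard deviation (`statistics.stdev`) would be larger than `pstdev` by a factor of sqrt(3/2), so the choice of estimator is worth stating whenever the spreads are compared with other reports.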
Latency
Per-category, per-device latency, measured on the same hardware in this session. All values in milliseconds, batch size 1, image size 256x256, 50 timed images, 10 warmup images (capsule CPU used 30 warmup images, see note).
CPU (AMD Ryzen 9 9900X 12-Core)
| category | min | median | p95 | mean | std | warmup |
|---|---|---|---|---|---|---|
| bottle | 28.318 | 30.155 | 31.569 | 30.178 | 1.012 | 10 |
| cable | 30.415 | 31.297 | 32.908 | 31.463 | 0.807 | 10 |
| capsule | 28.789 | 29.749 | 32.502 | 30.173 | 1.162 | 30 |
| leather | 30.974 | 32.812 | 35.191 | 32.838 | 1.178 | 10 |
The capsule CPU row is taken from patchcore_capsule_latency_rerun2.json (30 warmup
images, std 1.162 ms). The original capsule CPU run had an unstable warm-up tail
that inflated std; the rerun is the clean number to cite.
CUDA (NVIDIA GeForce RTX 5070, driver 570.211.01, 12227 MiB)
| category | min | median | p95 | mean | std | warmup |
|---|---|---|---|---|---|---|
| bottle | 5.144 | 5.202 | 6.415 | 5.508 | 0.473 | 10 |
| cable | 5.130 | 5.290 | 6.558 | 5.513 | 0.465 | 10 |
| capsule | 5.109 | 5.354 | 6.179 | 5.474 | 0.376 | 10 |
| leather | 5.171 | 5.350 | 6.365 | 5.613 | 0.468 | 10 |
Platform: Linux-6.8.0-117-generic-x86_64-with-glibc2.35, Python 3.10.12, Torch 2.11.0+cu128, Anomalib 2.4.1.
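The measurement protocol described above (discard warmup calls, time a fixed number of single-image calls, report summary statistics in milliseconds) can be sketched as follows. This is not the code in scripts/bench_latency.py: the function names are hypothetical, and the nearest-rank p95 and population standard deviation are assumptions about how the reported columns might be computed.

```python
import math
import statistics
import time
from typing import Callable, Dict, Sequence


def summarize(samples_ms: Sequence[float], warmup: int) -> Dict[str, float]:
    """Collapse per-call timings (ms) into the columns used in the tables."""
    ordered = sorted(samples_ms)
    # Nearest-rank percentile (assumption): smallest value with >= 95% at or below it.
    p95_index = math.ceil(0.95 * len(ordered)) - 1
    return {
        "min": ordered[0],
        "median": statistics.median(ordered),
        "p95": ordered[p95_index],
        "mean": statistics.fmean(ordered),
        "std": statistics.pstdev(ordered),  # population std (assumption)
        "warmup": warmup,
    }


def benchmark(fn: Callable[[], object], timed: int = 50, warmup: int = 10) -> Dict[str, float]:
    """Call fn() warmup times untimed, then time `timed` further calls."""
    for _ in range(warmup):
        fn()  # executed but not recorded
    samples_ms = []
    for _ in range(timed):
        start = time.perf_counter()
        fn()
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    return summarize(samples_ms, warmup)


if __name__ == "__main__":
    # Stand-in workload; a real run would wrap one batch-size-1 forward pass.
    print(benchmark(lambda: sum(range(100_000))))
```

For CUDA timings, the wrapped callable would also need to synchronize the device before returning; otherwise the timer can stop before the asynchronously launched kernels have finished.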
Accuracy/Cost Tradeoff
PatchCore is more accurate than PaDiM on image AUROC in all 4 categories (cable +0.119, capsule +0.114, leather +0.0075, bottle +0.0024), but at a higher inference cost than the lighter PaDiM head: CPU median latency is roughly 30-33 ms/image, and CUDA median latency is roughly 5.2-5.4 ms/image (CUDA means are 5.5-5.6 ms/image). The CPU cost is dominated by the wide_resnet50_2 backbone and the memory-bank nearest-neighbor lookup. On CUDA the model is fast enough for real-time-class inspection workloads; on CPU it sits in the tens of milliseconds per image.
OpenVINO Parity (Measured This Session)
Fresh PatchCore ONNX and OpenVINO exports were produced via Anomalib's
Engine.export(export_type=ExportType.ONNX|OPENVINO, ...) from each trained
Lightning checkpoint. Outputs were compared on N=20 real MVTec AD test images per
category (mix of normal + anomalous) under ONNX Runtime CPU (FP32) and OpenVINO
CPU with INFERENCE_PRECISION_HINT=f32. The inference precision hint matters:
leaving it at the CPU plugin default can silently engage bf16 on AVX-512-BF16
hosts and break parity, which is why the f32 hint is set explicitly.
| category | max abs error (anomaly map) | max abs error (pred score) | pred_label flips (N=20) | pred_mask pixel flips (out of 1,310,720) | source JSON |
|---|---|---|---|---|---|
| bottle | 2.181e-05 | 6.020e-06 | 0/20 | 0 | reports/eval_harness/openvino_parity_patchcore_bottle.json |
| cable | 4.768e-06 | 3.278e-06 | 0/20 | 0 | reports/eval_harness/openvino_parity_patchcore_cable.json |
| capsule | 7.719e-06 | 4.053e-06 | 0/20 | 0 | reports/eval_harness/openvino_parity_patchcore_capsule.json |
| leather | 4.321e-06 | 7.299e-05 | 0/20 | 0 | reports/eval_harness/openvino_parity_patchcore_leather.json |
All 4 categories have status parity_clean per the JSON definition (max_abs_error
<= 1e-3, zero label flips, zero mask pixel flips). Versions: ONNX Runtime 1.23.2,
OpenVINO 2026.1.0-21367-63e31528c62-releases/2026/1.
This is a fresh PatchCore parity measurement. The earlier commit c3594fc covered
PaDiM only and does not transfer to PatchCore by extrapolation; this measurement
replaces that.
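The parity_clean rule quoted above can be expressed as a small check. This is a sketch under stated assumptions, not the logic of scripts/validate_patchcore_export.py: the helper names are hypothetical, the tolerance is assumed to apply to both the anomaly-map and the pred-score error, and the non-clean status name is invented for illustration.

```python
from typing import Sequence

MAX_ABS_TOL = 1e-3  # threshold quoted in the card's parity_clean definition


def max_abs_error(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest element-wise absolute difference between two flattened outputs."""
    if len(a) != len(b):
        raise ValueError("outputs must have the same number of elements")
    return max(abs(x - y) for x, y in zip(a, b))


def count_flips(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions where two label or mask sequences disagree."""
    if len(a) != len(b):
        raise ValueError("outputs must have the same number of elements")
    return sum(1 for x, y in zip(a, b) if x != y)


def parity_status(map_err: float, score_err: float, label_flips: int, mask_flips: int) -> str:
    clean = (
        max(map_err, score_err) <= MAX_ABS_TOL
        and label_flips == 0
        and mask_flips == 0
    )
    # "parity_violation" is a hypothetical name for the failing case.
    return "parity_clean" if clean else "parity_violation"


# Bottle row from the table above.
print(parity_status(2.181e-05, 6.020e-06, label_flips=0, mask_flips=0))

# The mask-pixel denominator in the table header: 20 images of 256x256 pixels.
print(20 * 256 * 256)
```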
License
Package and Card
The code in the InspectNet-CX package, this model card, the per-category result JSON files, and the parity reports are licensed under Apache-2.0.
MVTec AD Dataset Restriction (Important)
The trained PatchCore checkpoints were fit on the MVTec AD dataset, which is distributed under CC BY-NC-SA 4.0. That license is non-commercial.
This restriction propagates to the trained checkpoints. Even though the package code is Apache-2.0, downstream commercial use of the trained PatchCore checkpoints (or any derivative model that was fit on MVTec AD images) is not permitted under MVTec AD's terms. The dataset license overrides the package license for any artifact whose weights or memory bank were built from MVTec AD pixels.
For commercial use, retrain the per-category PatchCore detector on your own commercially licensed data using the package code; the non-commercial restriction does not apply to the resulting weights.
Checkpoints
PatchCore Lightning checkpoints used for the numbers in this card:
| category | seed | coreset | SHA256 | size MB |
|---|---|---|---|---|
| bottle | 0 (legacy unseeded) | 0.01 | b0eb8834ae8d2bece3270cd1ef003427f16e0a109cc6bbb3a85eea49e50461df | 107.6 |
| bottle | 1 | 0.01 | d89e12adc18b806e3da552d261c33b113422bf3e49068c73b2e1223816cabd12 | 107.6 |
| bottle | 2 | 0.01 | b4afc04f0af2dd70de8393754ed47f276cc2778a6ec9e2d87431e894dcedb725 | 107.6 |
| cable | 0 (legacy unseeded) | 0.01 | 29d451c6a03707c155adaf1e5bf33313531c9d8204d8722cbe8f9516aac930c2 | 108.5 |
| capsule | 0 (legacy unseeded) | 0.01 | 25454995713926187e9816613d0e76e8e9531d6ca99becdf2565e8e8ebda8feb | 108.2 |
| leather | 0 (legacy unseeded) | 0.01 | 5cf7c7a793ad441a9c6cd92ee27517c674df35720c1873e61ef7aab5ebc2bd29 | 109.8 |
| leather | 1 | 0.01 | 5af3f908dae9df60fe472718b588deeaaeb93ce7b7b8d286c9077df098375d65 | 109.8 |
| leather | 2 | 0.01 | 268b1d0819ef50353a0ed874dc84ce2f38d5fe1686978ed284f881c5532fbc0e | 109.8 |
Checkpoints are not bundled in this HF repo. They live in the upstream training
tree under artifacts/patchcore_{cat}[_seed{N}]/Patchcore/MVTecAD/{cat}/v0/weights/lightning/model.ckpt
and are reproducible from the documented training commands.
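The path template above can be expanded programmatically. The sketch below assumes that the legacy seed-0 runs carry no `_seed{N}` suffix and that explicitly seeded runs do, which matches the two artifact directories named in the Reproduction commands; `checkpoint_path` is a hypothetical helper, not part of the package.

```python
from pathlib import PurePosixPath
from typing import Optional


def checkpoint_path(category: str, seed: Optional[int] = None) -> PurePosixPath:
    """Expand artifacts/patchcore_{cat}[_seed{N}]/.../model.ckpt for one run."""
    # Assumption: seed=None means the legacy unseeded run (no directory suffix).
    suffix = "" if seed is None else f"_seed{seed}"
    return PurePosixPath(
        f"artifacts/patchcore_{category}{suffix}",
        "Patchcore",
        "MVTecAD",
        category,
        "v0",
        "weights",
        "lightning",
        "model.ckpt",
    )


print(checkpoint_path("bottle"))
print(checkpoint_path("leather", seed=1))
```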
Backbone and Hyperparameters
- Backbone: wide_resnet50_2 (timm).
- Feature layers: layer2, layer3.
- Coreset sampling ratio: 0.01 (main runs), with a 0.10 and 0.25 sweep on cable.
- Image size: 256x256, RGB, BILINEAR resize, divide-by-255 normalization.
- Train/test split: MVTec AD default per-category split.
Verification
See CHECKSUMS.sha256 for SHA256 of every non-README file shipped in this repo.
Verify with:
sha256sum -c CHECKSUMS.sha256
See PROVENANCE.md for the metric-to-JSON map. Every number in the YAML
model-index block and in the ablation, coreset, latency, and parity tables
points to a specific field in a specific JSON under reports/eval_harness/.
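Where `sha256sum` is not available (for example on Windows), the same check can be done with Python's standard library. This is a minimal sketch that assumes the manifest uses the usual `sha256sum` line format (hex digest, whitespace, file name) and resolves file names relative to the manifest's directory, which differs from `sha256sum -c` (that resolves them against the current working directory). The function names are illustrative.

```python
import hashlib
from pathlib import Path
from typing import Dict


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large checkpoints need not fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest_path: str) -> Dict[str, bool]:
    """Return {file name: digest matches} for every entry in the manifest."""
    manifest = Path(manifest_path)
    results = {}
    for line in manifest.read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # sha256sum prefixes binary-mode entries with '*'
        results[name] = sha256_of_file(manifest.parent / name) == expected.lower()
    return results


if __name__ == "__main__":
    for name, ok in verify_manifest("CHECKSUMS.sha256").items():
        print(f"{name}: {'OK' if ok else 'FAILED'}")
```

The same `sha256_of_file` helper can be used to compare a locally retrained or downloaded checkpoint against the SHA256 column of the Checkpoints table.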
Caveats
- Cable and capsule PatchCore rows are single-seed; the +0.119 / +0.114 image AUROC margins over PaDiM are large enough that this is acceptable, but the caveat is real.
- Seed 0 across categories is the legacy unseeded baseline run; only seeds 1 and 2 have pinned RNG state.
- MVTec AD non-commercial license (CC BY-NC-SA 4.0) propagates to the checkpoints and overrides the package Apache-2.0 for downstream commercial use.
- No Jetson, TensorRT, or edge-hardware validation has been performed. CPU latency is on an AMD Ryzen 9 9900X workstation, not on target inspection hardware.
- Pixel-level evaluation uses the standard MVTec AD pixel AUROC and AUPRO@FPR=0.3 with no additional production thresholding.
Reproduction
PaDiM and PatchCore evaluation harness, latency benchmark, and parity script are in the upstream repo:
PYTHONPATH=src python3 scripts/eval_harness.py --method patchcore --dataset mvtec_ad --category cable --coreset 0.01 --output reports/eval_harness/patchcore_cable.json
PYTHONPATH=src python3 scripts/train_patchcore.py --category leather --seed 1 --output artifacts/patchcore_leather_seed1
PYTHONPATH=src python3 scripts/bench_latency.py --checkpoint artifacts/patchcore_bottle/Patchcore/MVTecAD/bottle/v0/weights/lightning/model.ckpt --category bottle --output reports/eval_harness/patchcore_bottle_latency.json
PYTHONPATH=src python3 scripts/validate_patchcore_export.py --category bottle --checkpoint artifacts/patchcore_bottle/Patchcore/MVTecAD/bottle/v0/weights/lightning/model.ckpt --output reports/eval_harness/openvino_parity_patchcore_bottle.json
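To run the evaluation harness over all four categories, the first command above can be generated in a loop. The sketch below uses only the flags that appear in that command; the per-category output file name `patchcore_{category}.json` is extrapolated from the cable example and is an assumption, as is the helper name `eval_command`.

```python
import shlex
from typing import List

CATEGORIES = ("bottle", "cable", "capsule", "leather")


def eval_command(category: str, coreset: float = 0.01) -> List[str]:
    """Argument list mirroring the eval_harness.py invocation shown above."""
    return [
        "python3",
        "scripts/eval_harness.py",
        "--method", "patchcore",
        "--dataset", "mvtec_ad",
        "--category", category,
        "--coreset", f"{coreset:.2f}",
        # Assumed naming pattern, extrapolated from the cable example.
        "--output", f"reports/eval_harness/patchcore_{category}.json",
    ]


for category in CATEGORIES:
    # Printed rather than executed; prefix PYTHONPATH=src as in the commands above.
    print("PYTHONPATH=src " + shlex.join(eval_command(category)))
```

To execute rather than print, the list can be passed to `subprocess.run` with `env` extended by `PYTHONPATH=src` and `check=True`, run from the root of the upstream repo.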
Evaluation results
- Image AUROC (mean over 3 seeds) on MVTec AD (bottle): 1.000 (self-reported)
- Pixel AUROC (mean over 3 seeds) on MVTec AD (bottle): 0.985 (self-reported)
- AUPRO@FPR=0.3 (mean over 3 seeds) on MVTec AD (bottle): 0.941 (self-reported)
- Image AUROC (single seed, coreset 0.01) on MVTec AD (cable): 0.991 (self-reported)
- Pixel AUROC (single seed, coreset 0.01) on MVTec AD (cable): 0.983 (self-reported)
- AUPRO@FPR=0.3 (single seed, coreset 0.01) on MVTec AD (cable): 0.928 (self-reported)
- Image AUROC (single seed, coreset 0.01) on MVTec AD (capsule): 0.994 (self-reported)
- Pixel AUROC (single seed, coreset 0.01) on MVTec AD (capsule): 0.990 (self-reported)