
padim-tf's Introduction

[TF 2.x] PaDiM - Anomaly Detection and Localization

This repository contains an unofficial PaDiM implementation in TensorFlow.

Paper

PaDiM: a Patch Distribution Modeling Framework for Anomaly Detection and Localization. [Link]

Dependencies

  • Windows 10, Python 3.8.8, TensorFlow 2.4.1 (GPU)
  • scikit-learn, scikit-image, Matplotlib

Run

# options: seed, rd, target, batch_size, is_plot, net
python main.py
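
For example (the flag spellings below are an assumption based on the option names in the comment; check the argument parser in main.py for the actual names and defaults):

# illustrative invocation, flag names assumed from the options above
python main.py --target carpet --rd 1000 --batch_size 32 --seed 42 --is_plot True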

Dataset

MVTec AD dataset

Results (AUROC)

Implementation results on MVTec AD

  • Network types (a minimal layer-selection sketch follows this list):

    • PyTorch: WideResNet-50, Rd 550 (from the reference PyTorch implementation) (WR50-Rd550)
    • Net 1: EfficientNetB7 [a_expand_activation layers of blocks 5, 6, 7], Rd 1000 (ENB7-Rd1000)
    • Net 2: EfficientNetB7 [a_expand_activation layers of blocks 4, 6, 7], Rd 1000 (ENB7-Rd1000)
    • Net 3: EfficientNetB7 [a_activation layers of blocks 5, 6, 7], Rd 1000 (ENB7-Rd1000)
  • I observed that the choice of intermediate layers has a noticeable effect on detection performance.

  • Also, a high image-level AUROC does not guarantee a high patch-level AUROC.
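
A minimal sketch of how such a feature extractor could be built with tf.keras, assuming the bracketed notation above maps to the Keras EfficientNet layer naming scheme (e.g. block5a_expand_activation; Net 1 shown):

import tensorflow as tf

# Net 1 (assumed layer names): a_expand_activation of blocks 5, 6, 7
base = tf.keras.applications.EfficientNetB7(include_top=False, weights="imagenet")
layer_names = ["block5a_expand_activation",
               "block6a_expand_activation",
               "block7a_expand_activation"]
net = tf.keras.Model(inputs=base.input,
                     outputs=[base.get_layer(n).output for n in layer_names])
# net(x) then returns the three intermediate feature maps used for patch embeddings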

| Category | PyTorch (Img) | Net 1 (Img) | Net 2 (Img) | Net 3 (Img) |
| --- | --- | --- | --- | --- |
| carpet | 0.999 | 0.950 | 0.982 | 0.996 |
| grid | 0.957 | 0.936 | 0.971 | 0.976 |
| leather | 1.000 | 0.999 | 1.000 | 1.000 |
| tile | 0.974 | 0.957 | 0.984 | 0.981 |
| wood | 0.988 | 0.948 | 0.954 | 0.990 |
| bottle | 0.998 | 0.983 | 0.996 | 0.999 |
| cable | 0.922 | 0.909 | 0.919 | 0.973 |
| capsule | 0.915 | 0.946 | 0.953 | 0.958 |
| hazelnut | 0.933 | 0.983 | 0.973 | 0.997 |
| metal_nut | 0.992 | 0.869 | 0.930 | 0.931 |
| pill | 0.944 | 0.882 | 0.879 | 0.925 |
| screw | 0.844 | 0.632 | 0.767 | 0.895 |
| toothbrush | 0.972 | 0.767 | 0.972 | 0.811 |
| transistor | 0.978 | 0.930 | 0.949 | 0.975 |
| zipper | 0.909 | 0.980 | 0.986 | 0.990 |
| Avg. (tex.) | 0.9840 | 0.9579 | 0.9781 | 0.9885 |
| Avg. (obj.) | 0.9410 | 0.8881 | 0.9323 | 0.9455 |
| Avg. (all) | 0.9550 | 0.9114 | 0.9476 | 0.9598 |

| Category | org. (Patch) | Net 1 (Patch) | Net 2 (Patch) | Net 3 (Patch) |
| --- | --- | --- | --- | --- |
| carpet | 0.990 | 0.973 | 0.854 | 0.829 |
| grid | 0.965 | 0.958 | 0.750 | 0.768 |
| leather | 0.989 | 0.986 | 0.902 | 0.831 |
| tile | 0.939 | 0.905 | 0.729 | 0.748 |
| wood | 0.941 | 0.946 | 0.831 | 0.814 |
| bottle | 0.982 | 0.971 | 0.861 | 0.831 |
| cable | 0.968 | 0.963 | 0.815 | 0.843 |
| capsule | 0.986 | 0.977 | 0.940 | 0.911 |
| hazelnut | 0.979 | 0.965 | 0.876 | 0.834 |
| metal_nut | 0.971 | 0.986 | 0.926 | 0.926 |
| pill | 0.961 | 0.955 | 0.893 | 0.903 |
| screw | 0.983 | 0.986 | 0.941 | 0.893 |
| toothbrush | 0.983 | 0.979 | 0.937 | 0.864 |
| transistor | 0.987 | 0.977 | 0.958 | 0.958 |
| zipper | 0.975 | 0.965 | 0.840 | 0.814 |
| Avg. (tex.) | 0.9650 | 0.9536 | 0.8131 | 0.7979 |
| Avg. (obj.) | 0.9780 | 0.9724 | 0.8987 | 0.8776 |
| Avg. (all) | 0.9730 | 0.9661 | 0.8702 | 0.8510 |

ROC Curve (Net 1) Bottle

(figure: bottle_auroc)

PR Curve (Net 1) Bottle

(figure: bottle_pr)

Localization examples (Net 1) (cherry-picked)

(figures: carpet_ex, grid_ex, leather_ex, tile_ex, wood_ex, bottle_ex, cable_ex, capsule_ex, hazelnut_ex, metalnut_ex, pill_ex, screw_ex, toothbrush_ex, transistor_ex, zipper_ex)


padim-tf's Issues

Gaussian and Mahalanobis distance

Hello, I would like to ask a question. I have found that the feature distribution at certain patch positions is bimodal (a mixture of two Gaussians), so a single mean and covariance may be a biased fit. I would like to identify the channels that follow such a two-Gaussian distribution and then compute the minimum Mahalanobis distance against the two distributions separately. Do you have any ideas about this?
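
A minimal sketch of the idea described in the question, not part of this repo: fit a two-component Gaussian mixture (scikit-learn, already a dependency) to the training embeddings at one patch position, then score a test patch by the minimum Mahalanobis distance over the two components. The function name and shapes are illustrative.

import numpy as np
from sklearn.mixture import GaussianMixture

def min_mahalanobis(train_feats, test_feat):
    # train_feats: (n_samples, c) training embeddings at one patch position
    # test_feat:   (c,) test embedding at the same position
    gmm = GaussianMixture(n_components=2, covariance_type="full",
                          reg_covar=1e-2).fit(train_feats)  # reg_covar mirrors PaDiM's 0.01*I regularization
    dists = []
    for mu, prec in zip(gmm.means_, gmm.precisions_):
        diff = test_feat - mu
        dists.append(np.sqrt(diff @ prec @ diff))  # Mahalanobis distance to one component
    return min(dists)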

Pixel-level localization/mask

I'm trying to reproduce this implementation. Although image-level detection works well, I get odd results for the pixel-level localization/mask; it seems I get a lot of false positives.
Please see the attached photos.
I did not change any code or requirements, so everything is as recommended in this repo.
Could you please comment on whether I'm missing anything?

Thanks.

(attached images: leather_47, leather_93)

epoch and out of memory problem

I'm applying augmentation to my own dataset (not MVTec) and training for several epochs. Since there are many batches, I run out of memory when the out list accumulates in RAM. How can I compute the mean and covariance without this?

out = []
for x, _, _ in train_set:
    l1, l2, l3 = net(x)
    # concatenate the three feature maps and flatten spatial dims to (b, h*w, c)
    _out = tf.reshape(embedding_concat(embedding_concat(l1, l2), l3),
                      (batch_size, h * w, c))
    out.append(_out.numpy())  # every batch is kept in RAM, which runs out of memory
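
One way to avoid holding every batch in RAM is to accumulate running sums instead of the raw embeddings. A sketch below, reusing the names from the snippet above (net, embedding_concat, train_set, batch_size, h, w, c); note the per-position covariance array is (h*w, c, c) and is itself large when c is big, so reducing c (the rd option) also helps.

import numpy as np
import tensorflow as tf

n = 0
s = np.zeros((h * w, c))        # running sum of embeddings
sq = np.zeros((h * w, c, c))    # running sum of per-position outer products
for x, _, _ in train_set:
    l1, l2, l3 = net(x)
    emb = tf.reshape(embedding_concat(embedding_concat(l1, l2), l3),
                     (batch_size, h * w, c)).numpy()
    n += emb.shape[0]
    s += emb.sum(axis=0)
    sq += np.einsum('bpi,bpj->pij', emb, emb)

mean = s / n                                         # (h*w, c)
cov = sq / n - np.einsum('pi,pj->pij', mean, mean)   # biased covariance estimate, (h*w, c, c)
# (PaDiM additionally adds a small epsilon*I to the covariance for numerical stability)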
