InCTRL (CVPR 2024)

Official PyTorch implementation of "Toward Generalist Anomaly Detection via In-context Residual Learning with Few-shot Sample Prompts".

Overview

In this work, we propose to train a Generalist Anomaly Detection (GAD) model that uses few-shot normal images as sample prompts to perform AD on diverse datasets on the fly. To this end, we introduce a novel approach that learns an in-context residual learning model for GAD, termed InCTRL. It is trained on an auxiliary dataset to discriminate anomalies from normal samples based on a holistic evaluation of the residuals between query images and few-shot normal sample prompts. Regardless of the dataset, by the definition of an anomaly, larger residuals are expected for anomalies than for normal samples, which enables InCTRL to generalize across different domains without further training. Comprehensive experiments on nine AD datasets establish a GAD benchmark that encapsulates the detection of industrial defect anomalies, medical anomalies, and semantic anomalies in both one-vs-all and multi-class settings, on which InCTRL is the best performer and significantly outperforms state-of-the-art competing methods.
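
The residual intuition described above can be illustrated with a toy sketch. This is not the InCTRL model: it assumes features have already been extracted as plain lists of floats, it uses a simple nearest-prompt Euclidean distance in place of the learned residual evaluation, and the names `l2_distance` and `residual_score` are hypothetical.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def residual_score(query, prompts):
    """Toy anomaly score: distance from the query to its nearest normal prompt.

    A larger value means the query deviates more from every normal prompt.
    """
    if not prompts:
        raise ValueError("at least one normal prompt is required")
    return min(l2_distance(query, p) for p in prompts)


# Hypothetical 2-D features for k=2 normal sample prompts.
prompts = [[0.0, 0.0], [1.0, 0.0]]
normal_query = [0.5, 0.0]
anomalous_query = [3.0, 4.0]

normal_score = residual_score(normal_query, prompts)
anomalous_score = residual_score(anomalous_query, prompts)
print(normal_score, anomalous_score)
```

In this toy setting the anomalous query receives the larger score, which is the ordering the residual argument relies on.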

Setup

  • python >= 3.10.11
  • torch >= 1.13.0
  • torchvision >= 0.14.0
  • scipy >= 1.10.1
  • scikit-image >= 0.21.0
  • numpy >= 1.24.3
  • tqdm >= 4.64.0

Device

Single NVIDIA GeForce RTX 3090

Run

Step 1. Download the anomaly detection datasets (ELPV, SDD, AITEX, VisA, MVTec AD, BrainMRI, HeadCT, MNIST, CIFAR-10) and convert them to the MVTec AD format (see the convert script).

The dataset folder structure should look like:

DATA_PATH/
    subset_1/
        train/
            good/
        test/
            good/
            defect_class_1/
            defect_class_2/
            defect_class_3/
            ...
    ...
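
Before generating the JSON files, it can help to confirm that a converted dataset matches the layout above. The following is a minimal sketch, not part of the repository: the function name `check_mvtec_layout` is hypothetical, and it only checks that the folders shown in the tree exist.

```python
from pathlib import Path


def check_mvtec_layout(data_path):
    """Return a list of problems found in an MVTec AD-style dataset folder."""
    problems = []
    root = Path(data_path)
    # Every direct subfolder of DATA_PATH is treated as one subset.
    subsets = sorted(p for p in root.iterdir() if p.is_dir()) if root.is_dir() else []
    if not subsets:
        problems.append(f"no subset folders under {root}")
    for subset in subsets:
        for required in ("train/good", "test/good"):
            if not (subset / required).is_dir():
                problems.append(f"{subset.name}: missing {required}/")
        test_dir = subset / "test"
        if test_dir.is_dir():
            # Any folder in test/ other than good/ counts as a defect class.
            defects = [d for d in test_dir.iterdir() if d.is_dir() and d.name != "good"]
            if not defects:
                problems.append(f"{subset.name}: no defect class folders in test/")
    return problems
```

Calling `check_mvtec_layout("DATA_PATH")` returns an empty list when every subset has `train/good/`, `test/good/`, and at least one defect class folder.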

Step 2. Generate the training/test JSON files for the anomaly detection datasets (see the generate script).

The JSON folder structure should look like:

JSON_PATH/
    dataset_1/
        subset_1/
            subset_1_train_normal.json
            subset_1_train_outlier.json
            subset_1_val_normal.json
            subset_1_val_outlier.json
        subset_2/
        subset_3/
        ...
    ...
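
The naming pattern in the tree above can be written down as a small helper for locating or checking the four files of a subset. This is a sketch under the assumption that every subset follows exactly the `JSON_PATH/dataset/subset/subset_<split>.json` pattern shown; the function names are hypothetical and the contents of the JSON files are not inspected.

```python
from pathlib import Path

# The four file suffixes shown in the folder structure above.
SUFFIXES = ("train_normal", "train_outlier", "val_normal", "val_outlier")


def expected_json_files(json_path, dataset, subset):
    """List the four JSON paths expected for one subset of one dataset."""
    base = Path(json_path) / dataset / subset
    return [base / f"{subset}_{suffix}.json" for suffix in SUFFIXES]


def missing_json_files(json_path, dataset, subset):
    """Return the expected JSON paths that do not exist on disk."""
    return [p for p in expected_json_files(json_path, dataset, subset) if not p.is_file()]
```

Note that the quick-start example in this README passes a path without the per-subset folder, so adjust the pattern to match where your generated files actually live.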

Step 3. Download the few-shot normal samples for inference from Google Drive.

Step 4. Download the pre-trained models from Google Drive.

Step 5. Quick Start

Set TEST.CHECKPOINT_FILE_PATH in the config to the path of the pre-trained model, then run:

python test.py --val_normal_json_path $normal-json-files-for-testing --val_outlier_json_path $abnormal-json-files-for-testing --category $dataset-class-name --few_shot_dir $path-to-few-shot-samples

For example, to run on the candle category of VisA with k=2:

python test.py --val_normal_json_path /AD_json/visa/candle_val_normal.json --val_outlier_json_path /AD_json/visa/candle_val_outlier.json --category candle --few_shot_dir /fs_samples/visa/2/
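
To evaluate several categories or shot counts, the command above can be assembled programmatically. The sketch below only reuses the flags and path pattern from the example; the helper name `build_test_command` and the root arguments are hypothetical, and the paths should be adapted to your own layout.

```python
import shlex


def build_test_command(json_root, fs_root, dataset, category, k):
    """Assemble the test.py command for one category, following the example's path pattern."""
    return [
        "python", "test.py",
        "--val_normal_json_path", f"{json_root}/{dataset}/{category}_val_normal.json",
        "--val_outlier_json_path", f"{json_root}/{dataset}/{category}_val_outlier.json",
        "--category", category,
        "--few_shot_dir", f"{fs_root}/{dataset}/{k}/",
    ]


# Reproduces the candle / k=2 example as an argument list.
cmd = build_test_command("/AD_json", "/fs_samples", "visa", "candle", 2)
print(shlex.join(cmd))
```

The returned list can be passed directly to `subprocess.run(cmd, check=True)` from the repository root.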

Training

python main.py --normal_json_path $normal-json-files-for-training --outlier_json_path $abnormal-json-files-for-training --val_normal_json_path $normal-json-files-for-testing --val_outlier_json_path $abnormal-json-files-for-testing

Implementation of WinCLIP

WinCLIP is one of the main competing methods to ours, but its official implementation is not publicly available. We reproduced the results of WinCLIP based on extensive communication with its authors and used our implementation for the experiments in the paper. Our implementation has been released at WinCLIP.

Citation

@inproceedings{zhu2024toward,
  title={Toward Generalist Anomaly Detection via In-context Residual Learning with Few-shot Sample Prompts},
  author={Zhu, Jiawen and Pang, Guansong},
  booktitle={Proceedings of the IEEE/CVF conference on computer vision and pattern recognition},
  year={2024}
}
