
ddad's People


arimousa



ddad's Issues

Time of training


Thanks for releasing the code! I am using an RTX A6000 to train the model on the "Hazelnut" class of the MVTec dataset, and training takes over 10 hours to finish. I do not have much experience in training diffusion models. Is it common for training to take this long, and is it necessary to train for so many epochs to achieve competitive performance on the MVTec dataset?

Thank you very much :) !

Why nothing happens when running eval

~/diffusion/DDAD$ python --eval True
Class: screw w: 2 v: 1 load_chp: 2000 feature extractor: wide_resnet101_2 w_DA: 3 DLlambda: 0.1
config.model.test_trajectoy_steps=250 ,

This seems to be because the --eval is unknown in parse_args()
def parse_args():
    cmdline_parser = argparse.ArgumentParser('DDAD')
    cmdline_parser.add_argument('-cfg', '--config',
                                default=os.path.join(os.path.dirname(os.path.abspath(__file__)), 'config.yaml'),
                                help='config file')
    cmdline_parser.add_argument('--train', default=False, help='Train the diffusion model')
    cmdline_parser.add_argument('--detection', default=False, help='Detection anomalies')
    cmdline_parser.add_argument('--domain_adaptation', default=False, help='Domain adaptation')
    args, unknowns = cmdline_parser.parse_known_args()
    return args
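
The silent behaviour reported here can be reproduced in isolation. The sketch below is an illustration, not the repository's code: it keeps only the three mode flags from the pasted parse_args() and takes an explicit argv list (an assumption, so that it runs without a command line). Because parse_known_args() collects unrecognised arguments instead of raising an error, "--eval True" is dropped and every mode flag keeps its default.

```python
import argparse


def parse_args(argv):
    # Reduced copy of the pasted parser; argv is passed in explicitly (assumption).
    parser = argparse.ArgumentParser("DDAD")
    parser.add_argument("--train", default=False)
    parser.add_argument("--detection", default=False)
    parser.add_argument("--domain_adaptation", default=False)
    # parse_known_args returns (namespace, list of unrecognised arguments).
    return parser.parse_known_args(argv)


args, unknowns = parse_args(["--eval", "True"])
print("unrecognised:", unknowns)
print("modes:", args.train, args.detection, args.domain_adaptation)

# Values given on the command line arrive as strings, so "False" is truthy.
args2, _ = parse_args(["--train", "False"])
print(repr(args2.train), bool(args2.train))
```

Switching to parse_args(), or checking the list of unrecognised arguments, would turn a mistyped flag into a visible error. Since the flags have no type, "--train False" also enables training, because any non-empty string is truthy.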

Regarding the Execution Results of MVTecAD Bottle Evaluation.

Thank you for sharing such a wonderful piece of work.

I tried executing the evaluation after training MVTecAD bottle on Colab Pro+, and the following log was output:

Sample :  0  predicted as:  0  label is:  1 
Sample :  10  predicted as:  0  label is:  1 
Sample :  15  predicted as:  0  label is:  1 
Sample :  49  predicted as:  0  label is:  1 
Sample :  50  predicted as:  0  label is:  1 
Sample :  51  predicted as:  0  label is:  1 
Sample :  52  predicted as:  0  label is:  1 
Sample :  53  predicted as:  0  label is:  1 
Sample :  55  predicted as:  0  label is:  1 
Sample :  58  predicted as:  0  label is:  1 
Sample :  59  predicted as:  0  label is:  1 
Sample :  60  predicted as:  0  label is:  1 
Sample :  61  predicted as:  0  label is:  1 
Sample :  62  predicted as:  0  label is:  1 

AUROC: 1.0
AUROC pixel level: 0.9292386174201965 
PRO: 0.7771466867947617
threshold:  0.15196724

The AUROC pixel level result is lower than the value mentioned in the Readme. Upon checking the images in the results, 13 out of 83 seemed to be misclassified.

If there are any specific points or precautions to consider for reproduction, please let me know.
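
An image-level AUROC of 1.0 together with a list of missed defective samples is not necessarily a contradiction. AUROC only depends on how the scores are ranked, while the "predicted as" column depends on one particular threshold. The sketch below uses made-up scores and a made-up threshold (both are assumptions, not values from this run) and a plain pairwise AUROC rather than the repository's metric code.

```python
def auroc(scores, labels):
    """Pairwise AUROC: fraction of (defective, good) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical anomaly scores: every defective sample scores above every good one.
scores = [0.02, 0.05, 0.08, 0.10, 0.12, 0.30]
labels = [0, 0, 0, 1, 1, 1]

threshold = 0.15  # hypothetical threshold, chosen too high for these scores
preds = [int(s > threshold) for s in scores]
missed = [i for i, (p, y) in enumerate(zip(preds, labels)) if p == 0 and y == 1]

print("AUROC:", auroc(scores, labels))
print("missed defective samples:", missed)
```

In this toy case the ranking is perfect, yet two defective samples fall below the threshold. If the same thing is happening in the run above, the misclassification list says more about how the threshold was chosen than about the image-level ranking.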

Finetuning hyperparameter w

Hello, thank you for your excellent work! I wanted to ask how you got values for hyperparameter w for MVTec AD and VisA. I was not able to find it in the paper, sorry if I have missed something.

Data read error during testing

FileNotFoundError: Caught FileNotFoundError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "E:\SoftWare\anaconda3\envs\lsomer\lib\site-packages\torch\utils\data_utils\", line 308, in _worker_loop
data = fetcher.fetch(index)
File "E:\SoftWare\anaconda3\envs\lsomer\lib\site-packages\torch\utils\data_utils\", line 51, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "E:\SoftWare\anaconda3\envs\lsomer\lib\site-packages\torch\utils\data_utils\", line 51, in
data = [self.dataset[idx] for idx in possibly_batched_index]
File "D:\Code\Anomaly_Detection\DDAD-main\", line 67, in getitem
target =
File "E:\SoftWare\anaconda3\envs\lsomer\lib\site-packages\PIL\", line 3227, in open
fp =, "rb")
FileNotFoundError: [Errno 2] No such file or directory: 'datasets/MVTec\bottle\test\broken_large\000_mask.png'

When I run the test, the above error occurs. There is no 000_mask.png under the MVTec\bottle\test\broken_large\ path.
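
The path in the error message still contains \test\ rather than ground_truth, which suggests that a string replacement on "/test/" did not match the backslash-separated path produced on Windows. The sketch below is one possible separator-independent way to derive the mask path. The helper name mask_path_for and its path_cls parameter are hypothetical, and the "_mask" suffix is taken from the error message above.

```python
from pathlib import PurePosixPath, PureWindowsPath


def mask_path_for(image_file, path_cls=PurePosixPath):
    """Swap the last 'test' directory for 'ground_truth' and append '_mask'."""
    image_path = path_cls(image_file)
    parts = list(image_path.parts)
    last_test = len(parts) - 1 - parts[::-1].index("test")
    parts[last_test] = "ground_truth"
    mask_name = image_path.stem + "_mask" + image_path.suffix
    return path_cls(*parts).with_name(mask_name)


windows_file = "datasets/MVTec\\bottle\\test\\broken_large\\000.png"

# Plain string replacement: "/test/" never occurs, so only the suffix changes.
naive = windows_file.replace("/test/", "/ground_truth/").replace(".png", "_mask.png")
fixed = mask_path_for(windows_file, PureWindowsPath)

print(naive)
print(fixed.as_posix())
```

In real code, pathlib.Path would pick the right flavour for the current operating system; the pure path classes are used here only so the example behaves the same everywhere.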


data :
name: VisA_dataset #MVTec #MTD #VisA
data_dir: /home/anywhere3090l/Desktop/henry/ddadvisa/VisA_dataset #MVTec #VisA #MTD
category: pcb4 #['carpet', 'bottle', 'hazelnut', 'leather', 'cable', 'capsule', 'grid', 'pill', 'transistor', 'metal_nut', 'screw','toothbrush', 'zipper', 'tile', 'wood']
# ['candle', 'capsules', 'cashew', 'chewinggum', 'fryum', 'macaroni1', 'macaroni2', 'pcb1', 'pcb2' ,'pcb3', 'pcb4', 'pipe_fryum']
image_size: 256
batch_size: 8 # 32 for DDAD and 16 for DDADS
DA_batch_size: 32 #16 for MVTec and [macaroni2, pcb1] in VisA, and 32 for other categories in VisA
test_batch_size: 32 #16 for MVTec, 32 for VisA
mask : True
imput_channel : 3

DDADS: False
checkpoint_dir: /home/anywhere3090l/Desktop/henry/ddadvisa/checkpoints/VisA #MTD #MVTec #VisA
checkpoint_name: weights
exp_name: default
feature_extractor: wide_resnet101_2 #wide_resnet101_2 # wide_resnet50_2 #resnet50
learning_rate: 3e-4
weight_decay: 0.05
epochs: 1
load_chp : 750 # From this epoch checkpoint will be loaded. Every 250 epochs a checkpoint is saved. Try to load 750 or 1000 epochs for Visa and 1000-1500-2000 for MVTec.
DA_epochs: 3 # Number of epochs for Domain adaptation.
DA_chp: 8
v : 7 #7 # 1 for MVTec and cashew in VisA, and 7 for VisA (1.5 for cashew). Control parameter for pixel-wise and feature-wise comparison. v * D_p + D_f
w : 8 # Conditioning parameter. The higher the value, the more the model is conditioned on the target image. "Fine-tuning this parameter results in better performance".
w_DA : 3 #3 # Conditioning parameter for domain adaptation. The higher the value, the more the model is conditioned on the target image.
DLlambda : 0.01 # 0.1 for MVTec and 0.01 for VisA
trajectory_steps: 1000
test_trajectoy_steps: 250 # Starting point for denoising trajectory.
test_trajectoy_steps_DA: 250 # Starting point for denoising trajectory for domain adaptation.
skip : 25 # Number of steps to skip for denoising trajectory.
skip_DA : 25
eta : 1 # Stochasticity parameter for denoising process.
beta_start : 0.0001
beta_end : 0.02
device: 'cuda' #<"cpu", "gpu", "tpu", "ipu">
save_model: True
num_workers : 2
seed : 42

auroc: True
pro: True
misclassifications: False
visualisation: False
import os
from glob import glob
from pathlib import Path
import shutil
import numpy as np
import csv
import torch
from PIL import Image
from torchvision import transforms
import torch.nn.functional as F
import torchvision.datasets as datasets
from torchvision.datasets import CIFAR10

class Dataset_maker(torch.utils.data.Dataset):  # base class assumed; lost in the paste
    def __init__(self, root, category, config, is_train=True):
        self.image_transform = transforms.Compose(
            [
                transforms.ToTensor(),  # Scales data into [0,1]
                transforms.Lambda(lambda t: (t * 2) - 1),  # Scale between [-1, 1]
            ]
        )
        self.config = config
        self.mask_transform = transforms.Compose(
            [
                transforms.ToTensor(),  # Scales data into [0,1]
            ]
        )
        if is_train:
            if category:
                self.image_files = glob(
                    os.path.join(root, category, "train", "good", "*.JPG")
                )
            else:
                self.image_files = glob(
                    os.path.join(root, "train", "good", "*.JPG")  # pattern assumed by symmetry
                )
        else:
            if category:
                self.image_files = glob(os.path.join(root, category, "test", "*", "*.png"))
            else:
                self.image_files = glob(os.path.join(root, "test", "*", "*.png"))
        self.is_train = is_train

    def __getitem__(self, index):
        image_file = self.image_files[index]
        image = Image.open(image_file)  # call assumed; lost in the paste
        image = self.image_transform(image)
        if image.shape[0] == 1:
            # Grayscale image: repeat the single channel three times.
            # The size arguments were lost in the paste; the image's own size is used here.
            image = image.expand(3, image.shape[-2], image.shape[-1])
        if self.is_train:
            label = 'good'
            return image, label
        else:
            if self.config.data.mask:  # condition assumed; `mask` is a key under `data` in the config
                if os.path.dirname(image_file).endswith("good"):
                    target = torch.zeros([1, image.shape[-2], image.shape[-1]])
                    label = 'good'
                else:
                    if self.config.data.name == 'MVTec':  # attribute assumed; `name` is a key under `data`
                        target = Image.open(
                            image_file.replace("/test/", "/ground_truth/").replace(
                                ".png", "_mask.png"
                            )
                        )
                    else:
                        target = Image.open(
                            image_file.replace("/test/", "/ground_truth/"))
                    target = self.mask_transform(target)
                    label = 'defective'
            else:
                if os.path.dirname(image_file).endswith("good"):
                    target = torch.zeros([1, image.shape[-2], image.shape[-1]])
                    label = 'good'
                else:
                    target = torch.zeros([1, image.shape[-2], image.shape[-1]])
                    label = 'defective'
            return image, target, label

    def __len__(self):
        return len(self.image_files)

(base) anywhere3090l@3090l:~/Desktop/henry/ddadvisa$ python --eval True
++++++++++testloader++++++++++ < object at 0x7fb37972da50>
++++++++++test_dataset++++++++++ <dataset.Dataset_maker object at 0x7fb37ab03b90>
Traceback (most recent call last):
File "/home/anywhere3090l/Desktop/henry/ddadvisa/", line 90, in
File "/home/anywhere3090l/Desktop/henry/ddadvisa/", line 36, in test
evaluate(unet, config)
File "/home/anywhere3090l/Desktop/henry/ddadvisa/", line 82, in evaluate
threshold = metric(labels_list, predictions, anomaly_map_list, gt_list, config)
File "/home/anywhere3090l/Desktop/henry/ddadvisa/", line 18, in metric
pro = compute_pro(gt_list, anomaly_map_list, num_th = 200)
File "/home/anywhere3090l/Desktop/henry/ddadvisa/", line 75, in compute_pro
results_embeddings = amaps[1]
IndexError: list index out of range
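
One thing that may be worth checking, as a guess rather than a diagnosis: the pasted dataset code globs the training folder for .JPG files but the test folder for .png files. If the test images are also stored as .JPG, the test glob matches nothing and the lists passed to the metric stay empty. The sketch below counts files by extension and compares that with what a ".png" pattern finds. The helper count_suffixes and the folder name "bad" are hypothetical.

```python
from collections import Counter
from pathlib import Path
import tempfile


def count_suffixes(folder):
    """Count files under `folder` by extension, exactly as spelled on disk."""
    return Counter(p.suffix for p in Path(folder).rglob("*") if p.is_file())


with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout: <root>/pcb4/test/bad/*.JPG
    test_dir = Path(root) / "pcb4" / "test"
    (test_dir / "bad").mkdir(parents=True)
    (test_dir / "bad" / "000.JPG").touch()
    (test_dir / "bad" / "001.JPG").touch()

    counts = count_suffixes(test_dir)
    matched = list(test_dir.glob("*/*.png"))

print("extensions on disk:", dict(counts))
print("files matched by */*.png:", len(matched))
```

Running the same count on the real test folder and comparing it with the pattern used in the dataset class would confirm or rule out this explanation quickly.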

Some questions about training hyperparameters

Why is the number of epochs for pre-training the UNet set to such a large value (> 1000), such as 1500, 2000, or 3000? My own training set is relatively large, so pre-training with your settings would take a lot of time. Also, how should I set the number of epochs for fine-tuning the feature extractor? The training loss, including the pre-training loss, does not provide useful information for judging how well the model is doing during training. Thank you.

problem about load checkpoints

Very nice work. But something went wrong when I loaded the provided checkpoint to evaluate and test the model.

Class: hazelnut w: 8 v: 1 load_chp: 2000 feature extractor: wide_resnet101_2 w_DA: 3 DLlambda: 0.1
config.model.test_trajectoy_steps=250 ,
Detecting Anomalies...
Traceback (most recent call last):
File "D:\code\DDAD-main\", line 96, in
File "D:\code\DDAD-main\", line 36, in detection
checkpoint = torch.load(os.path.join(os.getcwd(), config.model.checkpoint_dir,, str(config.model.load_chp)))
File "C:\Users\23871\anaconda3\envs\vicuna\lib\site-packages\torch\", line 791, in load
with _open_file_like(f, 'rb') as opened_file:
File "C:\Users\23871\anaconda3\envs\vicuna\lib\site-packages\torch\", line 271, in _open_file_like
return _open_file(name_or_buffer, mode)
File "C:\Users\23871\anaconda3\envs\vicuna\lib\site-packages\torch\", line 252, in init
super().init(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: 'D:\code\DDAD-main\checkpoints/MVTec\hazelnut\2000'

I made sure the path to the file was correct. I noticed that the downloaded checkpoint is a .zip folder, such as checkpoints/MVTec/hazelnut/; is that right? I tried extracting the zip file, but that did not work either. How should I load the checkpoint?

The checkpoint is like this:
My Python version is 3.10 and my PyTorch version is 2.0. Also, when I execute python --eval True, there is no --eval in args, so I changed it to python --detection True.
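
Some background that may help here: since PyTorch 1.6, torch.save writes a zip-based archive by default, so a checkpoint file can look like a zip file to the operating system even when its name has no extension. Extracting it produces a folder containing data.pkl, which torch.load is not meant to read on its own. The sketch below uses only the standard library and a hypothetical helper, describe_checkpoint, to tell these cases apart before calling torch.load.

```python
import os
import tempfile
import zipfile


def describe_checkpoint(path):
    """Classify a checkpoint path: 'missing', 'extracted', 'directory', 'zip' or 'other'."""
    if os.path.isdir(path):
        # An extracted zip-format checkpoint is a folder that contains data.pkl.
        if os.path.isfile(os.path.join(path, "data.pkl")):
            return "extracted"
        return "directory"
    if not os.path.isfile(path):
        return "missing"
    # Zip-format checkpoints should be passed to torch.load unextracted.
    return "zip" if zipfile.is_zipfile(path) else "other"


with tempfile.TemporaryDirectory() as root:
    # Stand-in for a zip-format checkpoint named "2000" (no extension).
    archive = os.path.join(root, "2000")
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("2000/data.pkl", b"placeholder")

    # Stand-in for the same checkpoint after it has been unzipped.
    extracted = os.path.join(root, "unzipped")
    os.makedirs(extracted)
    open(os.path.join(extracted, "data.pkl"), "wb").close()

    kinds = [
        describe_checkpoint(archive),
        describe_checkpoint(extracted),
        describe_checkpoint(os.path.join(root, "does_not_exist")),
    ]

print(kinds)
```

If the helper reports "missing" for the path built from checkpoint_dir, category and load_chp, the downloaded file probably has a different name (for example with a .zip extension) from the one the code expects.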

Request for Fine-Tuning Parameters in MVTec's config.yaml

Hello, thank you for releasing this awesome work.
Regarding the results on the MVTec dataset reported in the paper, could you please provide the fine-tuned parameters, such as 'load_chp', 'DA_epochs', 'w', and so on?
I've encountered lower results for certain classes. I think it might be due to my parameter settings.

If it's not possible, that's okay too. Thank you very much.


(base) anywhere3090l@3090l:~/Desktop/henry/ddadvisa$ python --train True
Class: pcb4 w: 2 v: 1 load_chp: 2000 feature extractor: wide_resnet101_2 w_DA: 3 DLlambda: 0.1
config.model.test_trajectoy_steps=250 ,
Num params: 32952707
Epoch 0 | Loss: 196538.21875
(base) anywhere3090l@3090l:~/Desktop/henry/ddadvisa$ python --domain_adaptation True
Class: pcb4 w: 2 v: 1 load_chp: 2000 feature extractor: wide_resnet101_2 w_DA: 3 DLlambda: 0.1
config.model.test_trajectoy_steps=250 ,
Domain Adaptation...
Epoch 0 | Loss: 0.02777327597141266
(base) anywhere3090l@3090l:~/Desktop/henry/ddadvisa$ python --eval True
Class: pcb4 w: 2 v: 1 load_chp: 2000 feature extractor: wide_resnet101_2 w_DA: 3 DLlambda: 0.1
config.model.test_trajectoy_steps=250 ,

problem about feat0

How do I save feat0? Does this mean that fine-tuning is not required for categories with FE_epoch = 0?
In the code, the "feat" model is saved starting from 1. Does this mean that I should set DA_chp=1?
I can hardly reproduce the "carpet" experiment with w=0, load_chp=2500, and DA_chp=1.
I am curious at which step I went wrong. Looking forward to your reply.

Some questions about the training time and the final results

First, thank you for your excellent work. I ran into some questions when reproducing the results:

  1. The training time is extremely long. I used three 3090 GPUs to train the model on the MVTec dataset, and it took almost 2 days to finish training all classes.
  2. There is a large gap between the reported results and the ones I reproduced, for example for the carpet class:
    AUROC: 0.6737560033798218
    AUROC pixel level: 0.8384530544281006
    threshold: 0.59561723
    I kept almost all of the parameters in config.yaml unchanged except for the batch size. Is this setting the same as the one you used? If not, what should I change to get similar results?

detection issue using the checkpoints file you provided

Hello, I encountered the following issue while using the checkpoints file you provided for testing. I'm not sure if you can point out the mistake or provide a solution to the problem. Thank you very much

Class: leather w: 8 v: 1 load_chp: 2000/data.pkl feature extractor: wide_resnet101_2 w_DA: 3 DLlambda: 0.1
config.model.test_trajectoy_steps=250 ,
Detecting Anomalies...
Traceback (most recent call last):
File "", line 96, in
File "", line 36, in detection
checkpoint = torch.load(os.path.join(os.getcwd(), config.model.checkpoint_dir,, str(config.model.load_chp)))
File "/home/dell/anaconda3/envs/diffusion2.0/lib/python3.8/site-packages/torch/", line 815, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "/home/dell/anaconda3/envs/diffusion2.0/lib/python3.8/site-packages/torch/", line 1033, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: A load persistent id instruction was encountered,
but no persistent_load function was specified.

About the meaning of "n" of "DDAD-S-n"

Hello, I'd like to ask about the meaning of 10 in DDAD-S-10 in your paper. Your paper describes it as "n refers to the number of denoising iterations", but isn't the number of denoising iterations 1000? I don't see a setting of 10 in the code. Or should I repeat the training process 10 times to get it?
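
One possible reading, offered as an assumption rather than the authors' answer: the config.yaml quoted in another issue on this page sets trajectory_steps to 1000 for training, but starts the test-time trajectory at test_trajectoy_steps = 250 and skips 25 steps at a time. If the sampler visits every skip-th timestep below the starting point, the number of denoising iterations at test time works out as follows.

```python
# Values copied from the quoted config.yaml; the stepping scheme is an assumption.
test_trajectory_steps = 250  # config key is spelled `test_trajectoy_steps`
skip = 25

# Timesteps visited if the sampler takes every `skip`-th step below the start.
timesteps = list(range(0, test_trajectory_steps, skip))
n_iterations = len(timesteps)

print(timesteps)
print("denoising iterations:", n_iterations)
```

Under that assumption, n would be controlled by the ratio of the two settings rather than by a separate value of 10 in the code.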

Finetuning & result

Hi. I really enjoyed your idea of using a diffusion model and domain adaptation for anomaly detection while reading your paper. I followed the sample with the MVTec dataset, but my scores were not as good as those in the paper (AUROC: (86.0,96.3),90.8).

I changed the 'batch_size' parameter to 16 and did not change any other parameters or code. Is this a normal result? Can you guide me on how to get a better result?
