
Comments (4)

hq-deng commented on September 2, 2024

Hello,

Pre+OCE means that we re-train the bottleneck block, while pre alone means we only use the frozen bottleneck block. We train the bottleneck block on one-class samples, which is why it is called the one-class embedding (OCE). Pre+MFF is meaningless, because the pre-trained bottleneck block (e.g., the 4th layer of a ResNet) received features from the last layer rather than multi-scale features when it was trained on ImageNet. If we want to fuse multi-scale features, we have to train the bottleneck block to adapt to them.
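To make the interface issue concrete, here is a minimal sketch (assuming a torchvision ResNet-50 backbone; this is an illustration, not the actual RD4AD code) of why the frozen 4th layer only matches the single-scale output of layer 3:

```python
# Minimal sketch, assuming a torchvision ResNet-50 (not the actual RD4AD code).
import torch
import torch.nn.functional as F
from torchvision.models import resnet50

m = resnet50(weights="IMAGENET1K_V1").eval()
x = torch.randn(1, 3, 256, 256)

with torch.no_grad():
    f0 = m.maxpool(m.relu(m.bn1(m.conv1(x))))
    f1 = m.layer1(f0)   # (1,  256, 64, 64)
    f2 = m.layer2(f1)   # (1,  512, 32, 32)
    f3 = m.layer3(f2)   # (1, 1024, 16, 16)

    # The pre-trained bottleneck (layer4) was only ever trained to consume f3:
    print(m.layer4[0].conv1.in_channels)  # 1024, exactly f3's channel count
    print(m.layer4(f3).shape)             # torch.Size([1, 2048, 8, 8])

    # A naive multi-scale fusion (resize + concat) has 256+512+1024 = 1792 channels,
    # which the frozen layer4 never saw during ImageNet pre-training and cannot even
    # ingest without a newly trained projection -- so "pre+MFF" is not a meaningful setting.
    fused = torch.cat([F.avg_pool2d(f1, 4), F.avg_pool2d(f2, 2), f3], dim=1)
    print(fused.shape)                    # torch.Size([1, 1792, 16, 16])
```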

The reasoning is the same as why an auto-encoder can achieve anomaly detection. During training, we compress normal images to a low-dimensional code and then restore them. Although the code contains less information, normal images are still reconstructed well because that is what the model was trained on, while an anomalous case produces a larger error because some of the anomalous information is discarded.

A toy case: suppose the normal case is [0,0,0,0,0,0,0] and the anomalous case is [0,0,1,1,1,0,0]. If we compress to a code of length 5, the anomalous code might be [0,1,1,1,0]; if we compress to a code of length 3, it might be [0,1,0]. The normal case is still restored as [0,0,0,0,0,0,0], but the anomalous case is restored as [0,0,1,1,1,0,0] and [0,0,0,1,0,0,0] respectively. This case is very extreme and only meant to build intuition; in fact, previous studies have shown that a more compact latent code leads to a larger anomaly error.

Coming back to the OCBE: the default compression is roughly [(16,16,1024)] -> [(8,8,2048)], while with MFF it is [(64,64,256),(32,32,512),(16,16,1024)] -> [(8,8,2048)]. Although the target space is the same, there is much more information at the input, so the input is compressed relatively more.
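A quick back-of-the-envelope count with those shapes (ignoring any projection layers) shows the difference in compression:

```python
# Element counts for the shapes quoted above (back-of-the-envelope, no projection layers).
out = 8 * 8 * 2048                                        # 131,072 output elements

default_in = 16 * 16 * 1024                               # 262,144 input elements
mff_in = 64 * 64 * 256 + 32 * 32 * 512 + 16 * 16 * 1024   # 1,835,008 input elements

print(default_in / out)  # 2.0  -> ~2x compression for the default OCBE input
print(mff_in / out)      # 14.0 -> ~14x compression once MFF features are fused in
```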


tommying commented on September 2, 2024


Thank you very much for the reply and I'm sorry to bother you again.

  1. I still can't understand why pre+MFF is meaningless.
    You said that the pre-trained bottleneck block (like the 4th layer of a ResNet) received features from the last layer rather
    than multi-scale features when it was trained on ImageNet, but I didn't really understand that.

    Pre+MFF gets the multi-scale features from the pre-trained encoder, uses MFF to align them, and then passes the result to
    the decoder. It only uses the outputs of the different blocks that were pre-trained on ImageNet, with no OCE module.

  2. OCE adopts the 4th residual block of ResNet. You said pre+OCE means that we re-train the bottleneck block. So, is OCE
    part of the teacher network (the 4th residual block of the teacher network), or is OCE a new residual block shaped like the
    teacher's 4th residual block?


hq-deng commented on September 2, 2024

Since we use the 1st, 2nd, and 3rd layers of a ResNet as the teacher encoder, the dimensions of the features from the 3rd layer already fit the 4th layer of the ResNet, so it is natural to use the 4th layer of the encoder as the bottleneck layer. Pre refers to the pre-trained but frozen 4th layer, treated the same way as the 1st, 2nd, and 3rd layers in the teacher encoder; the whole encoder is a model pre-trained on ImageNet. Because the 4th layer received features from the 3rd layer during ImageNet pre-training, not from MFF, we have to re-train it on MVTec when we want to add MFF features.
OCE is part of the ResNet used in the teacher network, but in the OCE setting we train the 4th layer while keeping the teacher encoder (1st, 2nd, 3rd layers) frozen, or modify it into MFF+OCE.
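In code, that setup might look roughly like the sketch below (assuming a torchvision ResNet-50; names such as `bottleneck` are placeholders for illustration, not the actual RD4AD module names):

```python
# Illustrative sketch only: freeze the teacher encoder, train a copied 4th block as the OCE.
import copy
import torch
from torchvision.models import resnet50

backbone = resnet50(weights="IMAGENET1K_V1")

# Teacher encoder = pre-trained stem + layers 1-3, frozen.
teacher = torch.nn.Sequential(
    backbone.conv1, backbone.bn1, backbone.relu, backbone.maxpool,
    backbone.layer1, backbone.layer2, backbone.layer3,
)
for p in teacher.parameters():
    p.requires_grad = False
teacher.eval()

# OCE bottleneck = a copy of the pre-trained 4th residual block, re-trained on
# normal (one-class) images only. For MFF+OCE it would instead consume fused
# multi-scale features.
bottleneck = copy.deepcopy(backbone.layer4)

# Only the bottleneck (plus the reversed student decoder, omitted here) is optimized.
optimizer = torch.optim.Adam(bottleneck.parameters(), lr=5e-3)

with torch.no_grad():
    f3 = teacher(torch.randn(1, 3, 256, 256))  # (1, 1024, 16, 16), frozen teacher features
emb = bottleneck(f3)                            # (1, 2048, 8, 8), trainable one-class embedding
```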


tommying commented on September 2, 2024


Thanks a lot.

