
secu's Issues

Reproducing self-labeling results on CIFAR10

Hi, first, thank you for this nice work and for open-sourcing your code.

I was able to reproduce your results on CIFAR10 (without self-labeling) with your code base, but I encountered some issues and open questions when using the SCAN library to conduct the self-labeling fine-tuning experiments. It would be great if you could help me with my reproduction effort.

Question 1) SCAN uses a linear cluster head for the self-labeling, whereas SeCu uses a Projection+Prediction+ClusterHead design. Did you reuse SeCu's head during self-labeling, or did you train a new one based on SCAN's design using only the ResNet18 backbone of SeCu? Mainly, I am asking whether you replaced this line in SCAN:

self.cluster_head = nn.ModuleList([nn.Linear(self.backbone_dim, nclusters) for _ in range(self.nheads)])

with a head that suits SeCu, like:

self.cluster_head = nn.Sequential(
    nn.Linear(self.backbone_dim, self.backbone_dim),
    nn.BatchNorm1d(self.backbone_dim),
    nn.ReLU(inplace=True),
    nn.Linear(self.backbone_dim, 128),
    nn.Linear(128, nclusters, bias=False))

Question 2) I tried both options, but receive the "Mask in MaskedCrossEntropyLoss is all zeros." error message from SCAN's selflabel.py script. This already happens with your specified parameters and a threshold of 0.9. In the appendix of the arXiv version of SeCu, you state in the Self-labeling paragraph:

Before selecting the confident instances by the prediction from the weak augmentation with a threshold of 0.9, we have a warm-up period with 10 epochs, where all instances are trained with the fixed pseudo label from the assignment of pre-trained SeCu.

I could not find an option for a warm-up period in SCAN, and I am not sure how exactly you implemented this. It would be great if you could clarify the self-labeling process in more detail, especially the warm-up period.
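For reference, the schedule described in the quoted paragraph could be sketched as follows. This is only my guess at the logic, not the authors' code; `probs_weak` (softmax outputs on the weak augmentation), `pseudo_labels` (the fixed assignment from pre-trained SeCu), and the epoch cutoff are illustrative names:

```python
import numpy as np

def selflabel_targets(epoch, probs_weak, pseudo_labels,
                      warmup_epochs=10, threshold=0.9):
    """Return (targets, mask) for one batch during self-labeling.

    During the warm-up period, every instance is trained with its fixed
    pseudo label from the pre-trained SeCu assignment; afterwards, only
    instances whose weak-augmentation prediction exceeds the confidence
    threshold are kept, with the argmax prediction as the target.
    """
    if epoch < warmup_epochs:
        # warm-up: train all instances with the fixed pseudo labels
        return pseudo_labels, np.ones(len(pseudo_labels), dtype=bool)
    confidence = probs_weak.max(axis=1)   # max softmax probability
    targets = probs_weak.argmax(axis=1)   # predicted cluster id
    mask = confidence > threshold         # keep confident instances only
    return targets, mask
```

With such a warm-up, the mask is all ones for the first 10 epochs, which would avoid the "all zeros" error until the model has become confident enough on some instances.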

Thank you and thanks again for the great work.

Best,
Lukas

Question about ablation experiments

Hello, first of all, thank you very much for your work. I have a question that has me quite confused, and I hope you can answer it:
I want to try one of the ablation experiments from your paper: replacing the SeCu loss with the standard cross-entropy loss. I just changed the calculation of loss_c as follows: loss_proj_c += criterion(proj_c1 / self.tw, label) + criterion(proj_c2 / self.tw, label), without changing any other settings. The result is very strange, so I think I misunderstood something. I would be very grateful for more detailed guidance.
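For context, assuming criterion is nn.CrossEntropyLoss, the line above computes standard cross-entropy on temperature-scaled cluster logits. A minimal NumPy illustration of that computation (proj_c1, self.tw, and label correspond to the cluster logits, the temperature, and the pseudo labels in the snippet; the values in the test are made up):

```python
import numpy as np

def temperature_ce(logits, labels, tw):
    """Standard cross-entropy on temperature-scaled logits.

    Equivalent to nn.CrossEntropyLoss()(logits / tw, labels):
    softmax over the scaled logits, then mean negative log-likelihood
    of the target class.
    """
    z = logits / tw
    z = z - z.max(axis=1, keepdims=True)  # for numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()
```

Note that a smaller temperature tw sharpens the scaled distribution, so for correctly classified instances the loss shrinks; if tw was tuned for the SeCu loss, it may behave quite differently when plugged into plain cross-entropy.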

Model evaluation

Hello

First of all, thank you for publishing this code. I'm having difficulty evaluating the trained model. Adapting eval.py from SCAN does not seem straightforward, and I'm not sure whether I've done it correctly. Having trained the model for 401 epochs, the ACC remains at 0.1. Here is the training log from the last epochs:

use time : 64.4565258026123
Epoch: [399][ 0/391] Time 33.955 (33.955) Data 33.852 (33.852) Loss 7.7962e+00 (7.7962e+00)
Epoch: [399][100/391] Time 0.075 ( 0.415) Data 0.000 ( 0.336) Loss 7.5979e+00 (7.9200e+00)
Epoch: [399][200/391] Time 0.075 ( 0.248) Data 0.000 ( 0.169) Loss 7.7325e+00 (7.8598e+00)
Epoch: [399][300/391] Time 0.074 ( 0.191) Data 0.000 ( 0.113) Loss 7.3591e+00 (7.8692e+00)
Epoch: [399][391/391] Time 0.058 ( 0.165) Data 0.000 ( 0.087) Loss 7.8860e+00 (7.8535e+00)
max and min cluster size for 10-class clustering is (5120.0,4736.0)
max and min cluster size for 20-class clustering is (2688.0,2304.0)
max and min cluster size for 30-class clustering is (1920.0,1408.0)
max and min cluster size for 40-class clustering is (1664.0,976.0)
max and min cluster size for 50-class clustering is (1280.0,512.0)
max and min cluster size for 60-class clustering is (1152.0,384.0)
max and min cluster size for 70-class clustering is (1152.0,256.0)
max and min cluster size for 80-class clustering is (896.0,256.0)
max and min cluster size for 90-class clustering is (1152.0,0.0)
max and min cluster size for 100-class clustering is (896.0,0.0)
use time : 65.68861746788025
Epoch: [400][ 0/391] Time 32.947 (32.947) Data 32.842 (32.842) Loss 8.7333e+00 (8.7333e+00)
Epoch: [400][100/391] Time 0.074 ( 0.404) Data 0.000 ( 0.326) Loss 7.6575e+00 (7.7840e+00)
Epoch: [400][200/391] Time 0.092 ( 0.242) Data 0.000 ( 0.164) Loss 8.2370e+00 (7.8897e+00)
Epoch: [400][300/391] Time 0.096 ( 0.188) Data 0.000 ( 0.110) Loss 7.3230e+00 (7.8503e+00)
Epoch: [400][391/391] Time 0.059 ( 0.162) Data 0.000 ( 0.084) Loss 7.5377e+00 (7.8445e+00)
max and min cluster size for 10-class clustering is (5248.0,4864.0)
max and min cluster size for 20-class clustering is (2640.0,2304.0)
max and min cluster size for 30-class clustering is (1792.0,1536.0)
max and min cluster size for 40-class clustering is (1408.0,1024.0)
max and min cluster size for 50-class clustering is (1408.0,640.0)
max and min cluster size for 60-class clustering is (1280.0,384.0)
max and min cluster size for 70-class clustering is (1024.0,256.0)
max and min cluster size for 80-class clustering is (1024.0,256.0)
max and min cluster size for 90-class clustering is (896.0,0.0)
max and min cluster size for 100-class clustering is (768.0,128.0)
use time : 64.5689332485199

Here is my adaptation to get the cluster prediction (in order to reuse code from SCAN):

(screenshot of the adapted code attached in the original issue; not reproduced here)

I'm not sure if this is the right way to get the prediction. Please correct me if I'm wrong. Of course, it would be better if you could provide the evaluation script.
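For reference, the usual clustering-ACC evaluation takes the argmax over the cluster-head logits as each image's predicted cluster and then finds the best one-to-one mapping between predicted cluster ids and ground-truth class ids (SCAN does this with the Hungarian algorithm). The brute-force matching below is a small self-contained stand-in for that step, only feasible for a handful of clusters:

```python
import numpy as np
from itertools import permutations

def cluster_accuracy(pred, truth, n_clusters):
    """Clustering accuracy: the best accuracy over all one-to-one
    mappings from predicted cluster ids to ground-truth class ids.

    Brute force over permutations for illustration; for real runs
    (e.g. 10+ clusters) use scipy.optimize.linear_sum_assignment
    on the confusion matrix instead.
    """
    best = 0.0
    for perm in permutations(range(n_clusters)):
        mapped = np.array([perm[p] for p in pred])
        best = max(best, float((mapped == truth).mean()))
    return best
```

An ACC stuck at exactly 0.1 on CIFAR10 often means the prediction is constant (one cluster absorbs everything) or the mapping step is missing, so it may be worth checking the argmax distribution before the matching.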

Question about the dataset

Hello, first of all, thank you very much for your outstanding contribution!
I had some problems replicating your paper's results. Could you please give the source of the datasets (such as CIFAR10) used in the paper?

Question about closed-form solution

Hello, I would like to ask how the optimization through the closed-form solution mentioned in your paper is realized. I could not find it in the code you provided. Thank you very much!
