visionlearninggroup / ssda_mme
Semi-supervised Domain Adaptation via Minimax Entropy
License: MIT License
Hi @ksaito-ut ,
What's the difference between unlabeled_target_images_XXXX_1.txt and unlabeled_target_images_XXXX_3.txt?
Dear Authors,
Thank you for making the code open source! I had a small question while reading the paper. I was wondering if you could let me know the approximate hyper-parameters used to produce the t-SNE embeddings shown in the paper, i.e., the perplexity, number of iterations, and learning rate, among others. I am plotting the t-SNE using this function. Please do let me know if it would be possible.
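For reference, here is roughly how I am calling it via scikit-learn's `TSNE` (the features are random stand-ins, and the hyper-parameter values are placeholders, which is precisely what I am asking about):

```python
import numpy as np
from sklearn.manifold import TSNE

# Stand-in for the network's extracted features: 200 samples, 64-D (placeholder values).
rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 64))

# Placeholder hyper-parameters -- the values actually used for the paper's
# figures are what I am asking about.
emb = TSNE(n_components=2, perplexity=30.0, random_state=0).fit_transform(feats)
# emb has shape (200, 2), one 2-D point per sample, ready for a scatter plot.
```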
Thanks,
Megh
Thanks for sharing your project!
Based on the data splits and hyper-parameter settings in the paper, I can obtain similar classification performance on DomainNet when K = 3, but not when K = 1 (e.g., on P to R I get 73.65 accuracy instead of the 76.1 claimed in the paper; on R to S I get 59.655 accuracy rather than the 61.0 claimed in the paper).
Am I missing some important details?
Hi,
thanks for publishing the code.
I have a question regarding training with the gradient reversal layer.
As I understand from the code (main.py), training consists of two steps:
My question is why you need two steps, since in the paper you mention it is done in one step.
Could it not be done like this:
Thanks in advance
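To illustrate what I mean by one step: as far as I understand, the gradient reversal layer just flips the sign of the entropy gradient at the boundary between F and C, so a single backward pass could drive both adversarial updates. A minimal NumPy sketch of that sign flip (toy logits of my own, not the repo's main.py; `lam` is a hypothetical trade-off weight; signs follow the paper's description, where C maximizes and F minimizes the entropy):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def mean_entropy(z):
    # Mean Shannon entropy of the softmax distributions over a batch of logits.
    p = softmax(z)
    return -(p * np.log(p + 1e-12)).sum(axis=1).mean()

def entropy_grad(z):
    # Analytic gradient of mean_entropy w.r.t. the logits:
    # per sample, dH/dz_i = -p_i * (log p_i + H_sample), averaged over the batch.
    p = softmax(z)
    h = -(p * np.log(p + 1e-12)).sum(axis=1, keepdims=True)
    return -p * (np.log(p + 1e-12) + h) / z.shape[0]

rng = np.random.default_rng(0)
logits = rng.normal(size=(8, 5))   # stand-in for C(F(x)) on unlabeled target data
g = entropy_grad(logits)

lam = 0.1
grad_into_C = lam * g              # C ascends the entropy (maximization step)
grad_into_F = -lam * g             # the reversal layer flips the sign: F descends the entropy
```

The same entropy gradient is computed once, and only its sign differs between the two players, which is why I am asking whether the two explicit steps in main.py are necessary.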
Hello, thank you for sharing the code.
Could you publish the commands for training the model to reproduce the results of Table 1 in the paper for the 3-shot setting? I ran training with "labeled_source_images_real.txt" as the source annotations, "labeled_target_images_sketch_3.txt" as the labeled target, and "unlabeled_target_images_sketch_3.txt" for domain adaptation. The model performance was tested on "unlabeled_target_images_sketch_3.txt".
For ResNet-34 I obtained the following results:
ACC All 62.775959, ACC Averaged over Classes 63.848788
Table 1 of the paper states that the model performance is 72.2%.
Maybe I made a mistake in the training setup. Thank you
First, thanks for sharing your code!
I am a little confused about how maximizing the entropy of unlabeled data w.r.t. the classifier works, in the training objectives of Section 3.2.
First, you train F (the feature extractor) and C on labeled data with cross-entropy minimization as the objective; it is intuitive that the prototype of a class (say, class A) ends up at the center of that class's feature distribution.
In the second step, you mention that maximizing the entropy of unlabeled data w.r.t. the classifier forces/pushes all the prototypes (representative points) toward the feature distribution of the target domain.
That is correct, but how can you ensure that the prototype of class A (initially at the center of the source class A feature distribution) is pushed toward the center of the *target* class A feature distribution?
I ask because the figures in your paper show each class-specific prototype (initially at the center of the source feature distribution for that class) being pushed to the class-specific center of the target feature distribution.
Or is class-specific updating not needed (or not achievable) at this stage, with the next step then minimizing the entropy w.r.t. the feature extractor?
If so, do you think first minimizing the entropy w.r.t. the feature extractor and then maximizing it w.r.t. the classifier would work better, or does the order not matter because the minimax steps are applied alternately?
It is easy and intuitive to see that alternating minimax training can refine the performance, but I am still confused about how the first entropy-maximization step works (pushing prototypes to class-specific target centers).
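To make my confusion concrete, here is a tiny NumPy sketch in which the rows of a linear classifier's weight matrix play the role of prototypes, and one gradient-ascent step on the entropy is taken w.r.t. those rows (toy data, an unnormalized classifier, and a step size of my own choosing, not the paper's temperature-scaled cosine classifier):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def mean_entropy(feats, W):
    # Mean entropy of class posteriors for a linear classifier with weights W.
    p = softmax(feats @ W.T)
    return -(p * np.log(p + 1e-12)).sum(axis=1).mean()

rng = np.random.default_rng(1)
feats = rng.normal(size=(32, 16))   # stand-in for F(x) on unlabeled target samples
W = rng.normal(size=(4, 16))        # each row of W acts as one class prototype

# Analytic gradient of the mean entropy w.r.t. W, via the chain rule through the logits:
# per sample, dH/dz_i = -p_i * (log p_i + H_sample); then dH/dW = (dH/dz)^T @ feats.
p = softmax(feats @ W.T)
h = -(p * np.log(p + 1e-12)).sum(axis=1, keepdims=True)
g_logits = -p * (np.log(p + 1e-12) + h) / feats.shape[0]
g_W = g_logits.T @ feats

W_up = W + 0.01 * g_W               # one ascent step: the classifier maximizes entropy
```

The step provably raises the entropy (pulling the posteriors toward uniform, i.e., all prototypes toward the unlabeled features), but nothing in it looks class-specific per se, and that is exactly the part I am asking about.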
Lines 195 to 204 in 81c3a9c
Lines 28 to 41 in 81c3a9c
Thank you for your code.
From your code it seems that:
the ENT method minimizes entropy w.r.t. the classifier but maximizes it w.r.t. the feature extractor;
the AdENT method maximizes entropy w.r.t. the classifier but minimizes it w.r.t. the feature extractor, which is the method proposed in your paper.
But in the paper, the ENT method seems to be described as minimizing entropy w.r.t. both the classifier and the feature extractor, as in Yves Grandvalet and Yoshua Bengio, "Semi-supervised learning by entropy minimization," NIPS, 2005.
So I am very confused about this. I am looking forward to hearing from you.
Hi @ksaito-ut
Thanks for sharing the code.
Did you try the unsupervised setting on DomainNet using ResNet-34? If so, could you please share the accuracy values you obtained?
Thanks