shjo-april / PuzzleCAM
[ICIP 2021] Puzzle-CAM: Improved localization via matching partial and full features.
In your code there is a confidence loss (ShannonEntropyLoss) applied to the tiled logits. This is not mentioned in your paper; is there a reason for that? What is the idea behind it?
Thanks again, love your work/paper!
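For context, a Shannon-entropy confidence loss over logits usually has the following shape. This is a minimal PyTorch sketch, not necessarily the repository's exact `ShannonEntropyLoss` (in particular, the multi-label setting might use sigmoid rather than softmax):

```python
import torch


def shannon_entropy_loss(logits: torch.Tensor) -> torch.Tensor:
    """Mean per-sample entropy of the predicted distribution.

    Minimizing this pushes predictions toward confident (low-entropy)
    outputs, which can act as a regularizer on the tiled branch.
    logits: shape (N, C).
    """
    probs = torch.softmax(logits, dim=1)
    # clamp avoids log(0) for numerically zero probabilities
    entropy = -(probs * torch.log(probs.clamp_min(1e-8))).sum(dim=1)
    return entropy.mean()
```

With uniform logits over C classes the loss equals log(C), its maximum, so gradient descent on it drives the prediction away from uniformity.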
When I used the released weights for inference and evaluation, the mIoU I obtained differed from the mIoU reported in the paper. May I ask whether these weights correspond to the paper? If so, how can I reproduce the paper's results? Looking forward to your reply.
I ran train_classification.py with the baseline ResNet-50 and only got a best train_mIoU of 44.12%, which is lower than the 47.82% in your paper. I used four NVIDIA 1080 Ti GPUs. Could you share the experimental details? Thanks!
I want to make a visual comparison with Puzzle-CAM; however, I cannot reproduce the performance. For Puzzle-CAM with ResNet-50, I only obtain 48.53 mIoU, which is lower than the 51.53 reported in your paper.
Hello! Thank you for this wonderful paper and code. Are you going to release the trained weights?
Dear author:
I noticed that if the image size isn't 512 x 512, an error occurs. With an image size of 1280 x 496, I get a tensor size error when the puzzle module is computed: the original feature map has 31 rows but the re-assembled feature has 32. After changing the image size to 1280 x 512, it works.
So I think this may be a small bug. It would be better to fix it or add a note in the code.
Thanks for your work!
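The mismatch reported above can be explained with simple arithmetic: the puzzle module splits the feature map into 2x2 tiles and re-assembles them, so with a stride-16 backbone the input side must be a multiple of 32 for the merged map to match the original. A hedged sketch of the constraint (the helper name is illustrative, not the repository's actual API):

```python
def puzzle_shape_ok(height: int, width: int, stride: int = 16, tiles: int = 2) -> bool:
    """Return True if an input of (height, width) survives tile/merge intact."""
    fh, fw = height // stride, width // stride  # backbone feature-map size
    # When a feature dimension is odd, each tile is rounded up, so merging
    # the tiles yields a larger map than the original (e.g. 31 -> 16+16 = 32).
    merged_h = -(-fh // tiles) * tiles          # ceil-division, then re-merge
    merged_w = -(-fw // tiles) * tiles
    return (merged_h, merged_w) == (fh, fw)


print(puzzle_shape_ok(496, 1280))  # False: 496 / 16 = 31, which merges to 32
print(puzzle_shape_ok(512, 1280))  # True: 512 / 16 = 32 merges back cleanly
```

This matches the 31-vs-32 mismatch in the report: 496 / 16 = 31 feature rows, but two tiles of height 16 merge back to 32.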
Hello, thank you for the great repository! It's pretty impressive how organized it is.
I have a criticism (or maybe a question, in case I got it wrong) regarding the training of the classifier, though:
I understand the importance of measuring and logging the mIoU during training (especially when creating the ablation section of your paper); however, it doesn't strike me as correct to save the model with the best mIoU. This procedural decision is based on fully supervised segmentation information, which should not be available in a truly weakly supervised setting, while resulting in a model better suited for segmentation.
The paper doesn't address this. Am I right to assume all models were trained like this? Were there any trainings where other metrics were considered when saving the model (e.g. classification loss or Eq (7) in the paper)?
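One mask-free alternative implied by the question is to checkpoint on a signal available under weak supervision, such as the validation classification loss. A hedged sketch (names like `val_class_loss` and `save_fn` are illustrative, not the repository's code):

```python
# Sketch: checkpoint on a weakly-supervised signal (classification loss)
# instead of the fully supervised mIoU used in the training script.
best = float("inf")


def maybe_save(val_class_loss: float, save_fn) -> bool:
    """Save via save_fn() when the validation classification loss improves.

    Returns True if a checkpoint was written.
    """
    global best
    if val_class_loss < best:
        best = val_class_loss
        save_fn()
        return True
    return False
```

Whether this selects a model as good for segmentation as the mIoU criterion is exactly the open question the issue raises.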
Hello, I would like to ask a question: when I train with train_classification_with_puzzle.py, why does my mIoU stay at 4.23% and never improve? I used two 2080 Ti GPUs for training.
Like this
[i] iteration=66, learning_rate=0.0994, alpha=0.03, loss=1.2468, class_loss=0.6384, p_class_loss=0.6042, re_loss=0.3179, conf_loss=0.0000, time=65sec
[i] iteration=132, learning_rate=0.0988, alpha=0.08, loss=0.5721, class_loss=0.2831, p_class_loss=0.2832, re_loss=0.0733, conf_loss=0.0000, time=54sec
[i] iteration=198, learning_rate=0.0982, alpha=0.13, loss=0.5701, class_loss=0.2822, p_class_loss=0.2793, re_loss=0.0651, conf_loss=0.0000, time=54sec
[i] iteration=264, learning_rate=0.0976, alpha=0.19, loss=0.5488, class_loss=0.2696, p_class_loss=0.2712, re_loss=0.0434, conf_loss=0.0000, time=54sec
[i] iteration=330, learning_rate=0.0970, alpha=0.24, loss=0.5321, class_loss=0.2615, p_class_loss=0.2606, re_loss=0.0416, conf_loss=0.0000, time=54sec
[i] iteration=396, learning_rate=0.0964, alpha=0.29, loss=0.5358, class_loss=0.2632, p_class_loss=0.2591, re_loss=0.0462, conf_loss=0.0000, time=53sec
[i] iteration=462, learning_rate=0.0958, alpha=0.35, loss=0.5387, class_loss=0.2635, p_class_loss=0.2608, re_loss=0.0417, conf_loss=0.0000, time=54sec
[i] iteration=528, learning_rate=0.0952, alpha=0.40, loss=0.5292, class_loss=0.2578, p_class_loss=0.2573, re_loss=0.0351, conf_loss=0.0000, time=54sec
[i] iteration=594, learning_rate=0.0946, alpha=0.45, loss=0.5279, class_loss=0.2573, p_class_loss=0.2545, re_loss=0.0356, conf_loss=0.0000, time=53sec
[i] iteration=660, learning_rate=0.0940, alpha=0.51, loss=0.5173, class_loss=0.2518, p_class_loss=0.2496, re_loss=0.0313, conf_loss=0.0000, time=53sec
[i] save model
[i] iteration=661, threshold=0.10, train_mIoU=4.23%, best_train_mIoU=4.23%, time=28sec
[i] iteration=726, learning_rate=0.0934, alpha=0.56, loss=0.5259, class_loss=0.2554, p_class_loss=0.2537, re_loss=0.0303, conf_loss=0.0000, time=83sec
[i] iteration=792, learning_rate=0.0928, alpha=0.61, loss=0.5114, class_loss=0.2484, p_class_loss=0.2483, re_loss=0.0241, conf_loss=0.0000, time=53sec
[i] iteration=858, learning_rate=0.0922, alpha=0.67, loss=0.5194, class_loss=0.2523, p_class_loss=0.2526, re_loss=0.0219, conf_loss=0.0000, time=53sec
[i] iteration=924, learning_rate=0.0916, alpha=0.72, loss=0.5110, class_loss=0.2479, p_class_loss=0.2472, re_loss=0.0221, conf_loss=0.0000, time=53sec
[i] iteration=990, learning_rate=0.0910, alpha=0.77, loss=0.5185, class_loss=0.2515, p_class_loss=0.2500, re_loss=0.0220, conf_loss=0.0000, time=53sec
[i] iteration=1,056, learning_rate=0.0904, alpha=0.83, loss=0.5102, class_loss=0.2470, p_class_loss=0.2465, re_loss=0.0202, conf_loss=0.0000, time=53sec
[i] iteration=1,122, learning_rate=0.0898, alpha=0.88, loss=0.5295, class_loss=0.2542, p_class_loss=0.2514, re_loss=0.0271, conf_loss=0.0000, time=53sec
[i] iteration=1,188, learning_rate=0.0892, alpha=0.93, loss=0.5289, class_loss=0.2525, p_class_loss=0.2503, re_loss=0.0280, conf_loss=0.0000, time=53sec
[i] iteration=1,254, learning_rate=0.0886, alpha=0.98, loss=0.5302, class_loss=0.2542, p_class_loss=0.2531, re_loss=0.0233, conf_loss=0.0000, time=53sec
[i] iteration=1,320, learning_rate=0.0879, alpha=1.04, loss=0.5196, class_loss=0.2501, p_class_loss=0.2486, re_loss=0.0201, conf_loss=0.0000, time=53sec
[i] iteration=1,322, threshold=0.10, train_mIoU=4.19%, best_train_mIoU=4.23%, time=29sec
[i] iteration=1,386, learning_rate=0.0873, alpha=1.09, loss=0.5252, class_loss=0.2533, p_class_loss=0.2521, re_loss=0.0182, conf_loss=0.0000, time=84sec
Hello,
I want to know how you calculated the fg_threshold and bg_threshold used in make_affinity_labels (correct me if I am wrong). After running inference_classification.py, the output suggests a command for evaluate.py, which reports some threshold value; I am guessing that threshold is the bg_threshold. Then how did you obtain the fg_threshold value? An explanation of how these thresholds are calculated would be helpful.
Thank you,
Avinash.
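For context, the threshold reported by an evaluation step like this typically comes from a grid sweep over CAM thresholds that maximizes IoU against the available masks. A class-agnostic sketch (the function and argument names are assumptions, not the repository's API):

```python
import numpy as np


def best_threshold(cam_scores, gt_masks, thresholds=np.arange(0.05, 0.55, 0.05)):
    """Sweep CAM thresholds and return the one maximizing foreground IoU.

    cam_scores: list of (H, W) float arrays in [0, 1] (per-pixel CAM maximum).
    gt_masks:   list of (H, W) bool arrays (True = foreground).
    """
    best_t, best_iou = None, -1.0
    for t in thresholds:
        inter = union = 0
        for cam, gt in zip(cam_scores, gt_masks):
            pred = cam >= t  # pixels above t are predicted foreground
            inter += np.logical_and(pred, gt).sum()
            union += np.logical_or(pred, gt).sum()
        iou = inter / max(union, 1)
        if iou > best_iou:
            best_t, best_iou = t, iou
    return best_t, best_iou
```

A single swept threshold like this would naturally serve as one of the two cut-offs; how the second one is derived is exactly what the question asks the author to clarify.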
Greetings,
Thank you for your code and excellent paper; they offer a novel way to relate the most discriminative part of an object to its other parts.
While reading your code I found some questions; would you have time to answer them? Thanks!
I noticed that you apply color jitter and RandAugment in your training step. I am curious how much improvement they provide; did you run an ablation study on them?
In Table 2 of your paper, the backbone of AffinityNet is ResNet-38. Why did you write ResNet-50?
In my experiments, the RW result for AffinityNet based on ResNet-50 reached 65.42%, which is higher than yours.
Hi, I can't access the ResNeSt checkpoints because of the permissions. Have the permissions changed? Thanks
Hello Author,
Thanks for sharing this great work!
I have a question about how to set hyperparameter alpha for AffinityNet. In AffinityNet, there's a parameter 'alpha' that adjusts the background confidence scores, also described in equation (2) in the paper.
For your experiment setting, I'm wondering what alpha value you used.
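For context, the alpha in question scales the background confidence derived from the CAMs, as in Eq. (2) of the AffinityNet paper. A sketch of that computation (not this repository's exact function):

```python
import numpy as np


def add_background_score(cams: np.ndarray, alpha: float) -> np.ndarray:
    """Append a background channel to class CAMs, as in AffinityNet's Eq. (2).

    cams:  (C, H, W) class activation maps normalized to [0, 1].
    alpha: a larger alpha suppresses the background score, so more pixels
           are treated as potential foreground.
    """
    bg = (1.0 - np.max(cams, axis=0)) ** alpha  # (H, W) background confidence
    return np.concatenate([bg[None], cams], axis=0)
```

AffinityNet-style pipelines often run this twice, with a low and a high alpha, to carve out confident foreground and confident background regions respectively.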
Hello, the link you gave (http://host.robots.ox.ac.uk/pascal/VOC/voc2012/#devkit) cannot be opened.
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input> in <module>
      1 from core.puzzle_utils import *
----> 2 from core.networks import *
      3 from core.datasets import *
      4
      5 from tools.general.io_utils import *

/working/PuzzleCAM/core/networks.py in <module>
     24 # Normalization
     25 #######################################################################
---> 26 from .sync_batchnorm.batchnorm import SynchronizedBatchNorm2d
     27
     28 class FixedBatchNorm(nn.BatchNorm2d):

ModuleNotFoundError: No module named 'core.sync_batchnorm'
Dear Sanghyun Jo,
I was wondering if you were able to share training logs with your final parameters, losses and mious?
Thanks, Alex
EDIT:
Also, I would like to ask whether you think it is reasonable to rely entirely on the loss for training and validation, as I have no ground-truth masks for validation. For this purpose I additionally compute a "raw loss" that does not multiply the RE loss by alpha, on both the train and validation sets (because otherwise my loss metric would be influenced by the number of epochs).
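The "raw loss" described above amounts to dropping the alpha weighting from the logged objective. A sketch using the quantities that appear in the training log (a simplification, not the repository's exact code):

```python
def total_loss(class_loss: float, p_class_loss: float, re_loss: float,
               alpha: float, conf_loss: float = 0.0) -> float:
    """Training objective as logged: alpha ramps up over training."""
    return class_loss + p_class_loss + alpha * re_loss + conf_loss


def raw_loss(class_loss: float, p_class_loss: float, re_loss: float,
             conf_loss: float = 0.0) -> float:
    """Schedule-free variant, comparable across epochs and on validation."""
    return class_loss + p_class_loss + re_loss + conf_loss
```

Because alpha grows with the iteration count, only the raw variant gives numbers that are comparable between early and late epochs, which is the point the question makes.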
I am trying to train with ResNeSt-101, and I also added the affinity and random-walk (RW) steps.
When I train, it runs with the specified code, but the resulting affinity labels are not effective: the pseudo labels are almost invisible, close to entirely black. I don't know where the problem is; could someone explain the details? Help!
Can you give me the details of how the threshold values are decided? I think there is a problem with this procedure, or I have missed something.
Thanks