shjo-april / PuzzleCAM
[ICIP 2021] Puzzle-CAM: Improved localization via matching partial and full features.
In your code there is a confidence loss (ShannonEntropyLoss) applied to the tiled logits. This is not mentioned in your paper; is there a reason for that? What is the idea behind it?
Thanks again, love your work/paper!
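For context, a Shannon-entropy confidence loss over logits usually has the following shape. This is a minimal PyTorch sketch, not necessarily the repository's exact `ShannonEntropyLoss` (in particular, the multi-label setting might use sigmoid rather than softmax):

```python
import torch


def shannon_entropy_loss(logits: torch.Tensor) -> torch.Tensor:
    """Mean per-sample entropy of the predicted distribution.

    Minimizing this pushes predictions toward confident (low-entropy)
    outputs, which can act as a regularizer on the tiled branch.
    logits: shape (N, C).
    """
    probs = torch.softmax(logits, dim=1)
    # clamp avoids log(0) for numerically zero probabilities
    entropy = -(probs * torch.log(probs.clamp_min(1e-8))).sum(dim=1)
    return entropy.mean()
```

With uniform logits over C classes the loss equals log(C), its maximum, so gradient descent on it drives the prediction away from uniformity.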
When I used the released weights for inference and evaluation, the mIoU I obtained differed from the mIoU reported in the paper. May I ask whether these weights correspond to the paper? If so, how can I reproduce the paper's results? Looking forward to your reply.
I ran train_classification.py with the baseline ResNet-50 and only got a best train_mIoU of 44.12%, which is lower than the 47.82% in your paper. I used four NVIDIA 1080 Ti GPUs. Could you share the experimental details? Thanks!
I want to make a visual comparison with Puzzle-CAM; however, I cannot reproduce the performance. For Puzzle-CAM with ResNet-50, I only obtain 48.53 mIoU, which is lower than the 51.53 reported in your paper.
Hello! Thank you for this wonderful paper and code. Are you going to release the trained weights?
Dear author:
I noticed that if the image size isn't 512 x 512, an error occurs. With an image size of 1280 x 496, I get a tensor size error when the puzzle module is computed: the original feature map has 31 rows but the re-assembled feature has 32. After changing the image size to 1280 x 512, it works.
So I think this may be a small bug. It would be better to fix it or add a note in the code.
Thanks for your work!
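The mismatch reported above can be explained with simple arithmetic: the puzzle module splits the feature map into 2x2 tiles and re-assembles them, so with a stride-16 backbone the input side must be a multiple of 32 for the merged map to match the original. A hedged sketch of the constraint (the helper name is illustrative, not the repository's actual API):

```python
def puzzle_shape_ok(height: int, width: int, stride: int = 16, tiles: int = 2) -> bool:
    """Return True if an input of (height, width) survives tile/merge intact."""
    fh, fw = height // stride, width // stride  # backbone feature-map size
    # When a feature dimension is odd, each tile is rounded up, so merging
    # the tiles yields a larger map than the original (e.g. 31 -> 16+16 = 32).
    merged_h = -(-fh // tiles) * tiles          # ceil-division, then re-merge
    merged_w = -(-fw // tiles) * tiles
    return (merged_h, merged_w) == (fh, fw)


print(puzzle_shape_ok(496, 1280))  # False: 496 / 16 = 31, which merges to 32
print(puzzle_shape_ok(512, 1280))  # True: 512 / 16 = 32 merges back cleanly
```

This matches the 31-vs-32 mismatch in the report: 496 / 16 = 31 feature rows, but two tiles of height 16 merge back to 32.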
Hello, thank you for the great repository! It's pretty impressive how organized it is.
I have a criticism (or maybe a question, in case I got it wrong) regarding the training of the classifier, though:
I understand the importance of measuring and logging the mIoU during training (especially when creating the ablation section of your paper); however, it doesn't strike me as correct to save the model with the best mIoU. This procedural decision is based on fully supervised segmentation information, which should not be available in a truly weakly supervised setting, while resulting in a model better suited for segmentation.
The paper doesn't address this. Am I right to assume all models were trained like this? Were there any trainings where other metrics were considered when saving the model (e.g. classification loss or Eq (7) in the paper)?
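One mask-free alternative implied by the question is to checkpoint on a signal available under weak supervision, such as the validation classification loss. A hedged sketch (names like `val_class_loss` and `save_fn` are illustrative, not the repository's code):

```python
# Sketch: checkpoint on a weakly-supervised signal (classification loss)
# instead of the fully supervised mIoU used in the training script.
best = float("inf")


def maybe_save(val_class_loss: float, save_fn) -> bool:
    """Save via save_fn() when the validation classification loss improves.

    Returns True if a checkpoint was written.
    """
    global best
    if val_class_loss < best:
        best = val_class_loss
        save_fn()
        return True
    return False
```

Whether this selects a model as good for segmentation as the mIoU criterion is exactly the open question the issue raises.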
Hello, I would like to ask a question: when I train with train_classification_with_puzzle.py, why does my mIoU stay at 4.23% and never improve? I used two 2080 Ti GPUs for training.
Like this
[i] iteration=66, learning_rate=0.0994, alpha=0.03, loss=1.2468, class_loss=0.6384, p_class_loss=0.6042, re_loss=0.3179, conf_loss=0.0000, time=65sec
[i] iteration=132, learning_rate=0.0988, alpha=0.08, loss=0.5721, class_loss=0.2831, p_class_loss=0.2832, re_loss=0.0733, conf_loss=0.0000, time=54sec
[i] iteration=198, learning_rate=0.0982, alpha=0.13, loss=0.5701, class_loss=0.2822, p_class_loss=0.2793, re_loss=0.0651, conf_loss=0.0000, time=54sec
[i] iteration=264, learning_rate=0.0976, alpha=0.19, loss=0.5488, class_loss=0.2696, p_class_loss=0.2712, re_loss=0.0434, conf_loss=0.0000, time=54sec
[i] iteration=330, learning_rate=0.0970, alpha=0.24, loss=0.5321, class_loss=0.2615, p_class_loss=0.2606, re_loss=0.0416, conf_loss=0.0000, time=54sec
[i] iteration=396, learning_rate=0.0964, alpha=0.29, loss=0.5358, class_loss=0.2632, p_class_loss=0.2591, re_loss=0.0462, conf_loss=0.0000, time=53sec
[i] iteration=462, learning_rate=0.0958, alpha=0.35, loss=0.5387, class_loss=0.2635, p_class_loss=0.2608, re_loss=0.0417, conf_loss=0.0000, time=54sec
[i] iteration=528, learning_rate=0.0952, alpha=0.40, loss=0.5292, class_loss=0.2578, p_class_loss=0.2573, re_loss=0.0351, conf_loss=0.0000, time=54sec
[i] iteration=594, learning_rate=0.0946, alpha=0.45, loss=0.5279, class_loss=0.2573, p_class_loss=0.2545, re_loss=0.0356, conf_loss=0.0000, time=53sec
[i] iteration=660, learning_rate=0.0940, alpha=0.51, loss=0.5173, class_loss=0.2518, p_class_loss=0.2496, re_loss=0.0313, conf_loss=0.0000, time=53sec
[i] save model
[i] iteration=661, threshold=0.10, train_mIoU=4.23%, best_train_mIoU=4.23%, time=28sec
[i] iteration=726, learning_rate=0.0934, alpha=0.56, loss=0.5259, class_loss=0.2554, p_class_loss=0.2537, re_loss=0.0303, conf_loss=0.0000, time=83sec
[i] iteration=792, learning_rate=0.0928, alpha=0.61, loss=0.5114, class_loss=0.2484, p_class_loss=0.2483, re_loss=0.0241, conf_loss=0.0000, time=53sec
[i] iteration=858, learning_rate=0.0922, alpha=0.67, loss=0.5194, class_loss=0.2523, p_class_loss=0.2526, re_loss=0.0219, conf_loss=0.0000, time=53sec
[i] iteration=924, learning_rate=0.0916, alpha=0.72, loss=0.5110, class_loss=0.2479, p_class_loss=0.2472, re_loss=0.0221, conf_loss=0.0000, time=53sec
[i] iteration=990, learning_rate=0.0910, alpha=0.77, loss=0.5185, class_loss=0.2515, p_class_loss=0.2500, re_loss=0.0220, conf_loss=0.0000, time=53sec
[i] iteration=1,056, learning_rate=0.0904, alpha=0.83, loss=0.5102, class_loss=0.2470, p_class_loss=0.2465, re_loss=0.0202, conf_loss=0.0000, time=53sec
[i] iteration=1,122, learning_rate=0.0898, alpha=0.88, loss=0.5295, class_loss=0.2542, p_class_loss=0.2514, re_loss=0.0271, conf_loss=0.0000, time=53sec
[i] iteration=1,188, learning_rate=0.0892, alpha=0.93, loss=0.5289, class_loss=0.2525, p_class_loss=0.2503, re_loss=0.0280, conf_loss=0.0000, time=53sec
[i] iteration=1,254, learning_rate=0.0886, alpha=0.98, loss=0.5302, class_loss=0.2542, p_class_loss=0.2531, re_loss=0.0233, conf_loss=0.0000, time=53sec
[i] iteration=1,320, learning_rate=0.0879, alpha=1.04, loss=0.5196, class_loss=0.2501, p_class_loss=0.2486, re_loss=0.0201, conf_loss=0.0000, time=53sec
[i] iteration=1,322, threshold=0.10, train_mIoU=4.19%, best_train_mIoU=4.23%, time=29sec
[i] iteration=1,386, learning_rate=0.0873, alpha=1.09, loss=0.5252, class_loss=0.2533, p_class_loss=0.2521, re_loss=0.0182, conf_loss=0.0000, time=84sec
Hello,
I want to know how you calculated the fg_threshold and bg_threshold used in make_affinity_labels (correct me if I am wrong). After running inference_classification.py, the output suggests a command for evaluate.py, which reports some threshold value; I am guessing that threshold is the bg_threshold. Then how did you obtain the fg_threshold value? An explanation of how these thresholds are calculated would be helpful.
Thank you,
Avinash.
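For context, the threshold reported by an evaluation step like this typically comes from a grid sweep over CAM thresholds that maximizes IoU against the available masks. A class-agnostic sketch (the function and argument names are assumptions, not the repository's API):

```python
import numpy as np


def best_threshold(cam_scores, gt_masks, thresholds=np.arange(0.05, 0.55, 0.05)):
    """Sweep CAM thresholds and return the one maximizing foreground IoU.

    cam_scores: list of (H, W) float arrays in [0, 1] (per-pixel CAM maximum).
    gt_masks:   list of (H, W) bool arrays (True = foreground).
    """
    best_t, best_iou = None, -1.0
    for t in thresholds:
        inter = union = 0
        for cam, gt in zip(cam_scores, gt_masks):
            pred = cam >= t  # pixels above t are predicted foreground
            inter += np.logical_and(pred, gt).sum()
            union += np.logical_or(pred, gt).sum()
        iou = inter / max(union, 1)
        if iou > best_iou:
            best_t, best_iou = t, iou
    return best_t, best_iou
```

A single swept threshold like this would naturally serve as one of the two cut-offs; how the second one is derived is exactly what the question asks the author to clarify.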
Greetings,
Thank you for your code and excellent paper; they offer a novel way to relate the most discriminative part of an object to its other parts.
While reading your code I found some questions; would you have time to answer them? Thanks!
I noticed that you apply color jitter and RandAugment in your training step. I am curious how much improvement they provide; did you run an ablation study on them?
In Table 2 of your paper, the backbone of AffinityNet is ResNet-38. Why did you write ResNet-50?
In my experiments, the RW result for AffinityNet based on ResNet-50 reached 65.42%, which is higher than yours.
Hi, I can't access the ResNeSt checkpoints because of the permissions. Have the permissions changed? Thanks
Hello Author,
Thanks for sharing this great work!
I have a question about how to set hyperparameter alpha for AffinityNet. In AffinityNet, there's a parameter 'alpha' that adjusts the background confidence scores, also described in equation (2) in the paper.
For your experiment setting, I'm wondering what alpha value you used.
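For context, the alpha in question scales the background confidence derived from the CAMs, as in Eq. (2) of the AffinityNet paper. A sketch of that computation (not this repository's exact function):

```python
import numpy as np


def add_background_score(cams: np.ndarray, alpha: float) -> np.ndarray:
    """Append a background channel to class CAMs, as in AffinityNet's Eq. (2).

    cams:  (C, H, W) class activation maps normalized to [0, 1].
    alpha: a larger alpha suppresses the background score, so more pixels
           are treated as potential foreground.
    """
    bg = (1.0 - np.max(cams, axis=0)) ** alpha  # (H, W) background confidence
    return np.concatenate([bg[None], cams], axis=0)
```

AffinityNet-style pipelines often run this twice, with a low and a high alpha, to carve out confident foreground and confident background regions respectively.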
Hello, the link you gave (http://host.robots.ox.ac.uk/pascal/VOC/voc2012/#devkit) cannot be opened.
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input> in <module>
      1 from core.puzzle_utils import *
----> 2 from core.networks import *
      3 from core.datasets import *
      4
      5 from tools.general.io_utils import *

/working/PuzzleCAM/core/networks.py in <module>
     24 # Normalization
     25 #######################################################################
---> 26 from .sync_batchnorm.batchnorm import SynchronizedBatchNorm2d
     27
     28 class FixedBatchNorm(nn.BatchNorm2d):

ModuleNotFoundError: No module named 'core.sync_batchnorm'
Dear Sanghyun Jo,
I was wondering if you were able to share training logs with your final parameters, losses and mious?
Thanks, Alex
EDIT:
Also, I would like to ask whether you think it is reasonable to rely entirely on the loss for training and validation, as I have no ground-truth masks for validation. For this purpose I additionally compute a "raw loss" that does not multiply the RE loss by alpha, on both the train and validation sets (because otherwise my loss metric would be influenced by the number of epochs).
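The "raw loss" described above amounts to dropping the alpha weighting from the logged objective. A sketch using the quantities that appear in the training log (a simplification, not the repository's exact code):

```python
def total_loss(class_loss: float, p_class_loss: float, re_loss: float,
               alpha: float, conf_loss: float = 0.0) -> float:
    """Training objective as logged: alpha ramps up over training."""
    return class_loss + p_class_loss + alpha * re_loss + conf_loss


def raw_loss(class_loss: float, p_class_loss: float, re_loss: float,
             conf_loss: float = 0.0) -> float:
    """Schedule-free variant, comparable across epochs and on validation."""
    return class_loss + p_class_loss + re_loss + conf_loss
```

Because alpha grows with the iteration count, only the raw variant gives numbers that are comparable between early and late epochs, which is the point the question makes.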
I am trying to train with ResNeSt-101, and I also added the affinity and random-walk (RW) steps.
When I train, it runs with the specified code, but the resulting affinity labels are not effective: the pseudo labels are almost invisible, close to entirely black. I don't know where the problem is; could someone explain the details? Help!
Can you give me the details of how the threshold values are decided? I think there is a problem with this procedure, or I have missed something.
Thanks