
PFENet's Issues

scripts for processing coco dataset

It is written that pairs of the form "image_path_1 label_path_1" are needed for training the network. But the annotations downloaded from COCO are in .json format. Could you provide a script to generate semantic segmentation masks in .png from the COCO .json annotations?

Thanks
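
In case it helps others, here is a rough sketch of such a conversion using pycocotools (the paths, output naming, and the choice to write raw COCO category ids are my assumptions, not the authors' script):

```python
import os
import numpy as np
from PIL import Image
from pycocotools.coco import COCO

ann_file = 'annotations/instances_train2014.json'   # assumed path
out_dir = 'train2014_semantic_png'                   # assumed output folder
os.makedirs(out_dir, exist_ok=True)

coco = COCO(ann_file)
for img_id in coco.getImgIds():
    info = coco.loadImgs(img_id)[0]
    mask = np.zeros((info['height'], info['width']), dtype=np.uint8)
    anns = coco.loadAnns(coco.getAnnIds(imgIds=img_id, iscrowd=False))
    for ann in anns:
        # Later annotations overwrite earlier ones where instances overlap.
        mask[coco.annToMask(ann) == 1] = ann['category_id']   # raw COCO ids (1..90)
    Image.fromarray(mask).save(
        os.path.join(out_dir, info['file_name'].replace('.jpg', '.png')))
```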

how to generate SegmentationClassAug?

I have downloaded the PASCAL VOC 2012 dataset and the SBD dataset from their official websites, but I don't know how to use them in the code. Could you show me your dataset folder structure, or the method for generating SegmentationClassAug?
Thanks, bubble from hfut!
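
For what it's worth, SegmentationClassAug is usually built by converting the SBD class annotations (cls/*.mat) to .png and merging them with VOC2012's own SegmentationClass. A rough sketch of the conversion step is below; the paths and the .mat field layout are assumptions to verify against your download, not the official recipe:

```python
import os
import numpy as np
import scipy.io
from PIL import Image

sbd_cls_dir = 'benchmark_RELEASE/dataset/cls'            # assumed SBD path
out_dir = 'VOCdevkit/VOC2012/SegmentationClassAug'        # assumed output path
os.makedirs(out_dir, exist_ok=True)

for name in os.listdir(sbd_cls_dir):
    if not name.endswith('.mat'):
        continue
    mat = scipy.io.loadmat(os.path.join(sbd_cls_dir, name))
    # Common SBD layout: a 'GTcls' struct with a 'Segmentation' field.
    seg = mat['GTcls']['Segmentation'][0][0].astype(np.uint8)
    Image.fromarray(seg).save(os.path.join(out_dir, name.replace('.mat', '.png')))
```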

Preprocessing

Hi, I'm reading your code and I have a question:

Is there a particular reason why you decided to implement the transformations (in transform.py) by hand, instead of relying on the torchvision.transforms package?
Do you think it could be a source of slowdowns in the code?

Thanks in advance!
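
Not the authors' answer, but for context: segmentation transforms have to apply exactly the same random parameters to the image and its label, which the standard single-input torchvision.transforms pipeline did not support out of the box at the time, and that is a common reason for hand-written joint transforms. A minimal sketch of such a paired transform (my own illustration, not the repo's transform.py):

```python
import random
import numpy as np

class PairedRandomHorizontalFlip:
    """Flip image and label together so the geometry stays aligned."""
    def __init__(self, p: float = 0.5):
        self.p = p

    def __call__(self, image: np.ndarray, label: np.ndarray):
        if random.random() < self.p:
            # The same random decision is applied to both arrays.
            image = np.fliplr(image).copy()
            label = np.fliplr(label).copy()
        return image, label
```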

similarity matrix

The paper says: "For each x_q ∈ X_Q, we take the maximum similarity among all support pixels as the correspondence value c_q."
Working through the calculation on paper, I think this means taking the maximum along each column.
But in the code, I find that it is the opposite:

similarity = similarity.max(1)[0].view(bsize, sp_sz*sp_sz)

I think it should be max(0)[0].
What's your opinion?
Thanks for your reply
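
To make the axis question concrete, here is a toy version of the correspondence computation; the shapes, names, and the layout of the similarity matrix are my assumptions, not the repo's. If similarity has shape [B, n_support_pixels, n_query_pixels], then max over dim=1 reduces the support axis and leaves one value c_q per query pixel, as in the quoted sentence; whether dim 1 or dim 0 is correct in the actual code depends on how that matrix is laid out there.

```python
import torch
import torch.nn.functional as F

B, C, H, W = 2, 256, 30, 30
query_feat = torch.randn(B, C, H * W)   # query pixels along the last dim
supp_feat = torch.randn(B, C, H * W)    # support pixels along the last dim

q = F.normalize(query_feat, dim=1)      # L2-normalize channels for cosine similarity
s = F.normalize(supp_feat, dim=1)

# similarity[b, i, j] = cosine similarity of support pixel i and query pixel j
similarity = torch.bmm(s.transpose(1, 2), q)     # [B, H*W (support), H*W (query)]

# Max over the support axis -> one correspondence value per query pixel.
corr = similarity.max(1)[0].view(B, H * W)       # [B, H*W (query)]
```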

question about resnet

Good work for FSS.

  1. I have a question about resnet-v2: you download it from your own path. Where did you download it from, or did you pre-train it yourself? Since resnet-v2 is different from the original resnet, have you noticed any effect from using the former? Previous work such as CANet uses the original resnet.
  2. The support feature from layer3 is multiplied with the mask:

         supp_feat_4 = self.layer4(supp_feat_3*mask)
         final_supp_list.append(supp_feat_4)
         for i, tmp_supp_feat in enumerate(final_supp_list):
             tmp_supp_feat_4 = tmp_supp_feat * tmp_mask

     I notice the mask operation is applied twice, while there is only one in the paper. Please let me know if I missed some details (a sketch of my reading is below).
     Thanks!
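
For reference, here is a minimal sketch of how I read the two masking operations from the quoted lines; the layer call, shapes, and names are placeholders rather than the exact repo code:

```python
import torch
import torch.nn.functional as F

# Placeholder tensors: mid-level support features and a binary support mask.
supp_feat_3 = torch.randn(2, 1024, 60, 60)                 # [B, C, h, w]
supp_mask = torch.randint(0, 2, (2, 1, 473, 473)).float()  # [B, 1, H, W]

# First masking: zero out background before layer4 so the high-level support
# features are computed mostly from foreground pixels.
mask = F.interpolate(supp_mask, size=supp_feat_3.shape[-2:],
                     mode='bilinear', align_corners=True)
supp_feat_4 = supp_feat_3 * mask        # in the repo this product is fed to layer4

# Second masking: applied again to the layer4 output before the prior
# (similarity) computation, so only foreground support pixels contribute.
tmp_mask = F.interpolate(supp_mask, size=supp_feat_4.shape[-2:],
                         mode='bilinear', align_corners=True)
tmp_supp_feat_4 = supp_feat_4 * tmp_mask
```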

Colormap

Hello!

Thank you for this work. Could you provide the colormap you use to visualize segmentation results?

Thank you!

About the crop size in COCO

Hi, I found in your paper that the crop size is 473 x 473 for training on COCO, with a learning rate of 0.005.
However, the crop size in the COCO config is 641 x 641, with a learning rate of 0.02.
I just want to know the exact crop size and learning rate for the COCO dataset, thank you!

Question about test episodes

Thank you for sharing! I'm confused about the set from which test episodes are sampled. In this work, the dataset is divided into a training set and a validation set. As far as I understand, both the train and val sets contain 4 folds. If you test on fold 0, training episodes of folds 1, 2, 3 are sampled from the training set, and test episodes of fold 0 are sampled from the val set? Do I have any misunderstanding? Thank you.

remove small objects

Thanks for sharing your work!
I noticed that you remove all objects whose area is smaller than 2x32x32, following the practice of OSLSM. However, some recent works, such as PANet (ICCV'19) and CANet (CVPR'19), include all objects in their source code regardless of size. So I am wondering whether it is somewhat unfair to compare with these methods directly? Or are there details that I have overlooked?
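
For context, a minimal sketch of that filtering rule, assuming per-instance binary masks (my reading of the OSLSM practice, not the repo's exact code):

```python
import numpy as np

def keep_object(instance_mask: np.ndarray, min_area: int = 2 * 32 * 32) -> bool:
    """Keep an instance only if its foreground area reaches the threshold."""
    return int(instance_mask.sum()) >= min_area
```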

VOC dataset question

Hello! Thanks for sharing your code. I noticed that your setting on the VOC dataset is different from PANet's (https://github.com/kaixin96/PANet). Their training data and test data both come from the training set, while your training data are 15 classes from the training set and your test data are 5 classes from the validation set. The numbers of images in each class are also different. May I ask why there are two different settings? Are the performances under the two settings comparable? Thank you in advance.

time

Hello, thanks for your work. By the way, when will PFENet++ be open-sourced?

COCO2014 dataset label

Hi!

I noticed that you used .png label images for COCO2014. Can you provide them?

I'm trying to generate them myself, and I find that when converting COCO's .json annotations to an image, one pixel can correspond to multiple categories. How can I handle this situation to ensure that I get the same results as you?
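
Not the authors' recipe, but one common way to make the overlapping-pixel case deterministic is to paint the instance masks in a fixed order, for example largest area first so that smaller objects end up on top:

```python
import numpy as np

def paint_semantic_mask(coco, anns, height, width):
    """coco: a pycocotools COCO object; anns: annotation dicts for one image."""
    mask = np.zeros((height, width), dtype=np.uint8)
    # Largest instances first, so smaller overlapping objects overwrite them.
    for ann in sorted(anns, key=lambda a: a['area'], reverse=True):
        mask[coco.annToMask(ann) == 1] = ann['category_id']
    return mask
```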

Train / Val / Test split

Hi,

Thank you for the great work. I'm new to few-shot segmentation and I was just trying to get my head around how the data split is made. From the code, I seem to understand that the validation and test sets are the same, i.e. the best model during training is picked based on its performance on the test set (with the novel classes). Am I missing something here?

Thanks in advance,
Malik

The gpu setting

I have changed train_gpu to 4, but the model still runs on GPU 0. Could you give me some advice?

Results fluctuate greatly even with all random seeds set

Hi, thanks for sharing your work!
There's a question that troubles me: I set the same random seed every time, but the results still fluctuate greatly (by about 1%).
Is this due to the "Dropout" operation, or something I have neglected?

Regards,
Lang
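
Not the authors' answer, but a common source of such fluctuation is cuDNN nondeterminism rather than the Python/NumPy/PyTorch seeds themselves. A minimal sketch of the extra switches people usually set when they want repeatable runs (at some speed cost):

```python
import random
import numpy as np
import torch

def set_full_determinism(seed: int = 123):
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Disable the nondeterministic cuDNN autotuner and kernel selection.
    torch.backends.cudnn.benchmark = False
    torch.backends.cudnn.deterministic = True
```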

Experiment detail

Hi, this project is very helpful. But I would like to know some details:

  1. The backbones you provide are named resnet50_v2.pth and resnet101_v2.pth. Did you train these backbones from scratch for the few-shot segmentation task?
  2. I get the same result with the pre-trained model, but the model I trained myself is about 2% mIoU lower. Is the config the same as the setting in the paper?

Thanks.

Weighted_GAP

This function returns a number? Shouldn't it return a tensor?

(screenshot attached)
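
For reference, masked (weighted) global average pooling of this kind normally returns a tensor of shape [B, C, 1, 1] rather than a Python number; a minimal sketch of the idea (my own code, not the repo's exact implementation):

```python
import torch

def weighted_gap(supp_feat: torch.Tensor, mask: torch.Tensor, eps: float = 1e-4) -> torch.Tensor:
    """Average features over the masked (foreground) region only.

    supp_feat: [B, C, H, W]; mask: [B, 1, H, W] with values in {0, 1}.
    Returns a [B, C, 1, 1] tensor (the per-image foreground prototype).
    """
    masked = supp_feat * mask
    area = mask.sum(dim=(2, 3), keepdim=True) + eps    # [B, 1, 1, 1]
    return masked.sum(dim=(2, 3), keepdim=True) / area
```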

PFENet++ code

Thanks for your work. By the way, when will you open-source the PFENet++ code?

What does the "split" means

In README

Update the config file by specifying the target split and path (weights) for loading the checkpoint.

What does the split mean?
What role does 'split' play in the program?
e.g.

assert args.split in [0, 1, 2, 3, 999]

Thank you .

Is this code wrong?

In train.py, in the validate function:
(screenshot attached)
I find that len(subcls) == 5 and len(subcls[0]) == 20; I set 5-shot, batch_size_val = 20, and split = 1. I can't work out the purpose of 'subcls[0].cpu().numpy()[0]', and I think this operation is wrong.

Comparison of the number of model parameters

Hi,

Good work on FSS. I have a question about how the number of model parameters is counted. In Table 1 of the paper, the number of model parameters is compared with that of other methods.

It seems that only the number of trainable parameters (10.8M for the proposed modules) is counted. The number of fixed backbone parameters (23.6M) is not included.

But the 19.0M for CANet includes both the fixed backbone parameters and the trainable head parameters.

Is this a fair comparison? Or if I missed some details, please let me know.

Thanks!

Problem about dataset setting and baseline

Thanks for sharing your work! I would like to know why the PASCAL training set only contains about 5,900 images. In the setting of CANet or OSLSM, the number of training images is larger than that, right?

n way setting

Hello, thanks for your excellent work. I didn't find any parameter that sets the number of ways; does the batch size of the support images actually define n_way? Thanks.

some results

@Saralyliu
Hi,

Thanks for your attention.

The pre-trained weights of resnet-v2 are obtained from the official repo of PSPNet (https://github.com/hszhao/semseg). The difference from the original resnet lies only in layer0, where the v2 version applies the deep-stem strategy. We used resnet-v2 to reproduce CANet and got results rather comparable to the ones reported in the CANet paper.

The mask in "supp_feat_4 = self.layer4(supp_feat_3*mask)" is used for screening out the redundant background region, and I remember that it does not affect the performance much; you can try it out by sending feat_3 to layer4 without the masking operation.

The other mask, used in "tmp_supp_feat_4 = tmp_supp_feat * tmp_mask", is more important, since it is used for the prior calculation.

Thank you for your reply. If I understand correctly, resnet-v2 and resnet-50 are interchangeable as feature extractors? Recently we ran VOC group 0 with your code; the training set has 5,955 images and the val set has 1,449, and the best mIoU we obtained is 58.57 at epoch 124 without any modifications. We can't reach your 61.7 mIoU in the 1-shot case. Waiting for your suggestions, thank you!

COCO dataset

Hello, thank you very much for your work. Can you tell me how to get the ground truth of the COCO dataset? Can you share it with me? Thank you.

Hi, I noticed that the script at this link was modified on 2021/6/21. The COCO label images generated with the modified script are different from those generated before. Is there something wrong on my side, or with the modified script? Can you provide the previous version that you used? Thanks!

Hi, I noticed that the script at this link was modified on 2021/6/21. The COCO label images generated with the modified script are different from those generated before. Is there something wrong on my side, or with the modified script? Can you provide the previous version that you used? Thanks!

Originally posted by @JJ-res101 in #20 (comment)
Sorry, I have the same problem. Could you please provide the original version? Thank you a lot.

Questions about the paper

Hello, I want to try few-shot segmentation for medical images. There are two questions I could not understand; I hope you can help me.

  1. The ground truth M_Q of the query image is invisible to the model, so how did you use the cross-entropy loss for training?
  2. In the experiments, what do 1-shot and 5-shot mean in Table 2? Could you please give me more implementation details?

Problem about reproduced accuracy

Hi, thanks for sharing the code! I have trained the model without any modification, but the results are always about 1% worse than the reported accuracy.
Here are some reproduced results, with the reported results in parentheses, on the PASCAL VOC dataset: Fold-0 60.8 (61.7), Fold-1 68.5 (69.5), Fold-2 53.9 (55.4).
So I wonder whether I am missing some tricks needed to reach the reported results? Do I need to keep fine-tuning the model?

dataset composition

Sorry to bother you again. The train_list you provide consists of 'voc_train' and 'sbd_train'. I wonder whether 'sbd_val' should be included too (like 'trainaug' in DeepLab); I did not find a detailed description in OSLSM.
I am also still a little confused about the image count: 'sbd_train' + 'voc_train' = 8284, so why is the number about 5900 in your setting?
I am not sure whether I missed some preprocessing.

train list and test list

Can you share the train_list.txt and val_list.txt files? I'd like to know the size and the division of your training and test sets.

Question about the number of test samples

#6 mentions and provides the training log; during the training stage, count=2000 is used for evaluation, but in the code the value during training is 5000. Does this affect the evaluation results?

about scale_lr

In the experiment setting for PASCAL-5i, the initial lr is set to 0.0025. But I saw that there is a scale_lr in poly_learning_rate(), and its default value is 10. So the initial lr is actually 0.025, right?
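
To make the question concrete, here is a sketch of a poly schedule in which the parameter groups after some split index get their learning rate multiplied by scale_lr; the names and the exact grouping are assumptions to be checked against util.py and train.py:

```python
def poly_learning_rate(optimizer, base_lr, curr_iter, max_iter,
                       power=0.9, index_split=-1, scale_lr=10.0):
    """Poly decay; parameter groups after index_split get lr * scale_lr."""
    lr = base_lr * (1 - float(curr_iter) / max_iter) ** power
    for index, param_group in enumerate(optimizer.param_groups):
        param_group['lr'] = lr * scale_lr if index > index_split else lr
```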

question about train.py

Why is one subtracted here (subcls-1)?
I understand that subcls here is the index of class_chosen in the sublist, so it should start from 0, but subtracting one makes it a negative number.

subcls = subcls[0].cpu().numpy()[0]
class_intersection_meter[(subcls-1)%split_gap] += intersection[1]
class_union_meter[(subcls-1)%split_gap] += union[1]

Performance inconsistency between paper and reproduction

Thank you for your great work. I learned a lot from your paper.

I tested the pre-trained model you provided (ResNet50-based, PASCAL VOC, 1-shot), but I got better performance than reported in your paper.

Method Split0 Split1 Split2 Split3 Mean
Reproduce 61.8 69.9 56.3 56.6 61.2
Paper 61.7 69.5 55.4 56.3 60.8

Is this performance fluctuation within the normal range? I used the same code and settings as in your GitHub repo.

I also tried to train another baseline experiment (ResNet50-based for Pascal VOC, 5 shots) by myself using your configs.

Method Split0 Split1 Split2 Split3 Mean
Reproduce 64.7 71.5 55.5 60.6 63.1
Paper 63.1 70.7 55.8 57.9 61.9

There seems to be a greater fluctuation.

model parameters in optimizer

Hello, thank you for the excellent work. I noticed that when setting up the optimizer, the modules of the model are all listed separately, even though the backbone layers' requires_grad is already set to False. Why not use model.parameters() instead? Would it influence the training procedure?
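
Not an authoritative answer, but the alternative the question hints at would look roughly like the sketch below: filter on requires_grad so only trainable tensors reach the optimizer. Listing modules explicitly mainly makes it easy to give different groups their own learning rates, which a flat filter does not provide.

```python
import torch

def build_optimizer(model, base_lr=0.0025, momentum=0.9, weight_decay=1e-4):
    # Assumes the backbone's requires_grad is already False, as in the question.
    trainable = [p for p in model.parameters() if p.requires_grad]
    return torch.optim.SGD(trainable, lr=base_lr,
                           momentum=momentum, weight_decay=weight_decay)
```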

Question regarding evaluation with ignore_label

Hello, I have a question regarding evaluation

In lines 52-64 of PFENet/util/util.py (function intersectionAndUnion), the code computes the intersection and union between the GT mask and the predicted mask. I found that it uses ignore_label=255 to refine the prediction right before computing the intersection and union. Why did you adopt this kind of refinement (using the mask boundary, i.e., ignore_label)?
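
To illustrate the mechanism in question, here is a simplified version of what such an intersection/union routine does with the ignore label (my own sketch, not the exact util.py code): predictions are overwritten with the ignore value wherever the ground truth is ignored, so those pixels, including VOC's object-boundary band, drop out of both intersection and union.

```python
import numpy as np

def intersection_and_union(pred, target, num_classes, ignore_index=255):
    pred = pred.copy().reshape(-1)
    target = target.reshape(-1)
    # Wherever the GT is "ignore", force the prediction to "ignore" too,
    # so these pixels count toward neither intersection nor union.
    pred[target == ignore_index] = ignore_index
    intersection = pred[pred == target]
    area_inter, _ = np.histogram(intersection, bins=np.arange(num_classes + 1))
    area_pred, _ = np.histogram(pred, bins=np.arange(num_classes + 1))
    area_target, _ = np.histogram(target, bins=np.arange(num_classes + 1))
    area_union = area_pred + area_target - area_inter
    return area_inter, area_union
```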

This refinement would make sense if the code were only trying to ignore the zero-padded regions (which are set to the ignore_label, e.g., 255) in both the predicted and GT masks. However, because the object boundary is also set to ignore_label=255, the prediction is refined further than that.

In the paper, the mIoU of PFENet in the 1-shot setting (with the ResNet50 backbone) is 60.8% on PASCAL-5i, but when I reproduce the model without using the object boundary (ignore label), the mIoU drops to 56.2%. May I ask why you adopted such prediction refinement using the GT boundary?

I attach example predictions: the top image shows the model's raw prediction, while the bottom image shows the prediction refined using the GT boundary.

about coco-dataset

In the COCO list files, I see ".png" paths for the annotations. How do you get the COCO .png annotations? Thanks.
