bdl's Issues

Training CycleGAN

Hello,
I'm trying to train the CycleGAN with the same configuration you used, but on a different dataset. After following the readme.md file, I ran the code, but there are several opt parameters inside the cycle_gan_model.py file whose default values I do not know. Just to name a few: opt.no_lsgan, opt.fineSize, and so on...
Could you please tell me what the default values for these parameters were, so that I can retrain the CycleGAN on my dataset? My dataset also has a different number of labels, so I cannot use the weights you provided for the segmentation model...
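For reference, here is a hedged sketch of how these options could be declared. It is based on the legacy pytorch-CycleGAN-and-pix2pix option style, the author's later remark that no_lsgan should be False by default (i.e. LSGAN is used), and the 452/1024 crop/load sizes mentioned elsewhere in this repo's cyclegan readme; the exact defaults used for the released models are an assumption, not confirmed here:

import argparse

# Hypothetical defaults for the legacy options referenced above (assumed, not confirmed).
parser = argparse.ArgumentParser()
parser.add_argument('--no_lsgan', action='store_true',
                    help='if set, use a vanilla GAN loss instead of least-squares GAN (default: LSGAN)')
parser.add_argument('--loadSize', type=int, default=1024,
                    help='scale images to this size before cropping')
parser.add_argument('--fineSize', type=int, default=452,
                    help='crop size fed to the generators and discriminators')
opt = parser.parse_args([])   # -> no_lsgan=False, loadSize=1024, fineSize=452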

Number of common classes in Synthia dataset

Hi Yunsheng,

I noticed that in your paper you mentioned:

For SYNTHIA [28], we use the SYNTHIA-RAND-CITYSCAPES set, which contains 9,400 images with resolution 1280×760 and 16 common categories with Cityscapes [5].

which means only 16 classes should be involved in training and evaluation. However, I found the following in the SYNTHIA dataset configuration:

self.id_to_trainid = {3: 0, 4: 1, 2: 2, 21: 3, 5: 4, 7: 5, 15: 6, 9: 7, 6: 8, 16: 9, 1: 10, 10: 11, 17: 12, 8: 13, 18: 14, 19: 15, 20: 16, 12: 17, 11: 18}

so there are 19 classes in total, just like GTA V and Cityscapes. Is that an inconsistency between the code and the paper, or did I miss something?

Thanks,
Kaihong
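For context, here is a minimal sketch of how a 16-class (and 13-class) mIoU can be computed from a 19-class per-class IoU array by masking out the categories SYNTHIA does not share with Cityscapes. The index sets follow the standard SYNTHIA→Cityscapes protocol and are an assumption about this repo's evaluation.py, not a quote from it:

import numpy as np

# Per-class IoUs in the usual 19-class Cityscapes train-ID order (dummy values here).
iou_per_class = np.random.rand(19)

# SYNTHIA has no terrain (9), truck (14) or train (16); the 13-class protocol
# additionally drops wall (3), fence (4) and pole (5).  (Standard protocol, assumed.)
missing_16 = {9, 14, 16}
missing_13 = missing_16 | {3, 4, 5}

miou19 = iou_per_class.mean()
miou16 = iou_per_class[[i for i in range(19) if i not in missing_16]].mean()
miou13 = iou_per_class[[i for i in range(19) if i not in missing_13]].mean()
print(miou19, miou16, miou13)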

About threshold and mask in SSL

# Excerpt from SSL.py; predicted_label / predicted_prob hold the argmax labels and
# the corresponding max softmax probabilities for all target images, and image_name
# and targetloader come from the target-domain dataloader.
import numpy as np
from PIL import Image

# Per-class threshold: the median confidence of each class, capped at 0.9.
thres = []
for i in range(19):
    x = predicted_prob[predicted_label==i]
    if len(x) == 0:
        thres.append(0)
        continue
    x = np.sort(x)
    thres.append(x[np.int(np.round(len(x)*0.5))])
print (thres)
thres = np.array(thres)
thres[thres>0.9]=0.9
print (thres)

# Pseudo-label generation: pixels whose confidence falls below the class threshold
# are set to 255 (the ignore index) so they are excluded from the self-supervised loss.
for index in range(len(targetloader)):
    name = image_name[index]
    label = predicted_label[index]
    prob = predicted_prob[index]
    for i in range(19):
        label[(prob<thres[i])*(label==i)] = 255
    output = np.asarray(label, dtype=np.uint8)
    output = Image.fromarray(output)
    name = name.split('/')[-1]
    output.save('%s/%s' % (args.save, name))
  1. I don't really understand this code. You describe in the paper that you choose 0.9 as the threshold, but it seems that you choose the median probability as the threshold [thres.append(x[np.int(np.round(len(x)*0.5))])].
  2. Pixels with confidence higher than 0.9 should be chosen for the pseudo labels, but the code [label[(prob<thres[i])*(label==i)] = 255] sets prob < thres to 255. Why? Shouldn't it be > (or set to 0)?
  3. Sorry, my question may be a little confusing, but I wonder how the mask and the threshold are selected. Thanks
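To make the mechanics concrete, here is a minimal self-contained toy example (not from the repository) showing that 255 acts as the ignore index: pixels below the per-class threshold are excluded from training rather than kept, which is why the comparison is prob < thres:

import numpy as np

# Toy 2x3 prediction: per-pixel argmax label and its confidence.
label = np.array([[0, 0, 1], [1, 1, 0]])
prob  = np.array([[0.95, 0.40, 0.80], [0.60, 0.92, 0.85]])

# Suppose class 0's capped median confidence is 0.85 and class 1's is 0.90.
thres = np.array([0.85, 0.90])

for i in range(2):
    # Low-confidence pixels of class i become 255, i.e. "ignore" for the segmentation
    # loss (e.g. nn.CrossEntropyLoss(ignore_index=255) skips them entirely).
    label[(prob < thres[i]) * (label == i)] = 255

print(label)
# [[  0 255 255]
#  [255   1   0]]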

GTA5 image size

What I did to train Cycle-GAN is to resize the image to 1024×X or X×1024 and then crop a patch of size 452×452. You can choose another size based on the GPU you use.

Originally posted by @liyunsheng13 in #11 (comment)

The original size of GTA5 images is 1914×1052. After resizing the width of the images to 1024, the height should be 1052 * 1024 // 1914 = 562, yet the images you provided are 1024×564. Is there any extra processing, such as padding?
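One possible explanation (an assumption on my part, not confirmed in this thread) is the standard CycleGAN preprocessing, which resizes to the target width and then rounds each dimension to the nearest multiple of 4 so the generator's down/up-sampling stays aligned; that rounding maps a height of roughly 563 to 564:

# Hypothetical reconstruction of a scale-width-then-round-to-multiple-of-4 step.
def scale_width_pow4(ow, oh, target_w=1024, base=4):
    h = target_w * oh / ow              # keep the aspect ratio
    h = int(round(h / base)) * base     # round the height to the nearest multiple of 4
    return target_w, h

print(scale_width_pow4(1914, 1052))     # (1024, 564): 562.9 rounds up to 564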

quick question

Hello,

In this scenario (mentioned in the previous post):

  (1) train CycleGAN to get translated images
  (2) train BDL.py to get the segmentation model with the translated images and source images,
  (3) train SSL.py to get pseudo-labels and refine the segmentation model,
  (4) retrain CycleGAN with an additional perceptual loss.

Do "train CycleGAN" mean one-batch iteration (A - B - A, B - A - B) or complete training phase with 20 epochs?

Thank you.

Issues with reproducing results using pre-trained weights on PyTorch 1.7 and CUDA 11

I get results that are 1.5 to 2 percent worse when I evaluate the downloaded pre-trained weights on PyTorch 1.7 and CUDA 11. Is anyone else experiencing this?

GTA to Cityscapes

python evaluation.py --restore-from ./snapshots/gta2city/gta_2_city_deeplab --save ./results/gta2city/ \
    --data-dir-target /data/cityscapes/ --data-list-target ./dataset/cityscapes_list/val.txt \
    --gt_dir /data/cityscapes/gtFine/val/ --devkit_dir ./dataset/cityscapes_list/
===>road:       90.9
===>sidewalk:   45.38
===>building:   83.56
===>wall:       32.19
===>fence:      25.46
===>pole:       27.81
===>light:      35.3
===>sign:       36.23
===>vegetation: 84.17
===>terrain:    39.95
===>sky:        83.95
===>person:     56.5
===>rider:      29.17
===>car:        81.48
===>truck:      31.93
===>bus:        46.8
===>train:      2.93
===>motocycle:  27.75
===>bicycle:    31.79
===> mIoU19: 47.01
===> mIoU16: 51.15
===> mIoU13: 56.38

Expected mIoU19: 48.5

Synthia to Cityscapes

$ python evaluation.py --restore-from ./snapshots/syn2city/syn_2_city_deeplab --save ./results/syn2city/ \
    --data-dir-target /data/cityscapes/ --data-list-target ./dataset/cityscapes_list/val.txt \
    --gt_dir /data/cityscapes/gtFine/val/ --devkit_dir ./dataset/cityscapes_list/
===>road:       86.27
===>sidewalk:   46.58
===>building:   79.1
===>wall:       6.04
===>fence:      0.55
===>pole:       23.55
===>light:      7.88
===>sign:       10.77
===>vegetation: 78.46
===>terrain:    0.0
===>sky:        81.57
===>person:     52.87
===>rider:      28.51
===>car:        71.43
===>truck:      0.0
===>bus:        36.6
===>train:      0.0
===>motocycle:  26.26
===>bicycle:    37.15
===> mIoU19: 35.45
===> mIoU16: 42.1
===> mIoU13: 49.5

Expected mIoU13: 51.4

problem with the loss becoming NaN

Hi, thanks for the code.
I'm trying to train GTA2Cityscapes on VGG and DeepLab, but no matter whether I train with or without self-supervised learning, the source segmentation loss becomes NaN after 2 iterations. Can you tell me how to fix it? Thank you!

My environment:
CUDA version: 11.1, pytorch-gpu: 1.2.0
Here is my log when training the self-supervised VGG:
[screenshot of the training log]

Some questions about using SSL in another domain adaptation framework

Hi, I attempted to apply your SSL strategy to my own DA framework, but my performance decreased significantly (from the initial 44.3% to 41.5%).

My DA framework is similar to AdaptSegNet, and I use your SSL.py to generate the pseudo labels.
My losses are:
L = L_seg_source + λ * L_da                  (1)
L = L_seg_source + λ1 * L_da + λ2 * L_ssl    (2)
First, I use (1) to train my network, and the best mIoU is 44.3%. Then I use the best model to generate pseudo labels (with SSL.py). Finally, I use the pseudo labels to train with Eq. (2), but the result decreases.

Could you give me some suggestions on how to use the pseudo labels?
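A minimal sketch of how the combined objective in Eq. (2) is typically wired up, assuming pseudo labels produced by SSL.py with 255 as the ignore index; the loss weights below are placeholders, not values used in this thread:

import torch.nn as nn

seg_criterion = nn.CrossEntropyLoss(ignore_index=255)   # 255 marks unreliable pseudo pixels

def total_loss(pred_src, gt_src, pred_tgt, pseudo_tgt, loss_da,
               lambda_da=0.001, lambda_ssl=1.0):
    # Eq. (2): supervised source loss + adversarial loss + pseudo-label loss.
    l_seg_source = seg_criterion(pred_src, gt_src)
    l_ssl = seg_criterion(pred_tgt, pseudo_tgt)
    return l_seg_source + lambda_da * loss_da + lambda_ssl * l_ssl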

About early stopping in SSL training

Hi, thanks for sharing the code. I noted that in SSL training you set early stopping at 120,000 iterations and choose the model at that iteration for the next stage. In the paper, one stage of SSL training improves the mIoU from the initial 42.7 to 47.2, and then from 44.3 to 48.5. Did you observe the mIoU at intermediate iterations during training, such as at 60,000 iterations?

Which model is selected for the next iteration?

Hello @liyunsheng13, thank you very much for the code. I have questions about Algorithm 1 in your paper:

How do you select which M^k_i model (trained with Eqn. 3) will be used for the next iteration? I imagine that you validate all snapshots of M^k_i (saved during training) and pick the best?
A similar question applies to F^k (trained with Eqn. 2) and M^k_0 (trained with Eqn. 1).

Thanks

Reproducing M0(1)[F(1)]

Hi,
Thanks for a great paper and code.
I was trying to reproduce the result for M0(1)[F(1)] without SSL. As far as I understand, this requires CycleGAN training and segmentation training (without SSL).
I used the files you released for CycleGAN and merged them into the updated code of the CycleGAN repository.
I got 40.7 mIoU while I should get 42.7.
Is there any chance you could also release the files that you didn't change for CycleGAN (including the command lines)?
I used a patch size of 400 instead of 452 due to GPU memory size. Do you think such a difference makes sense?
The command lines I used are:
CycleGAN train:
python train.py --dataroot datasets/gta/ --display_id -1 --init_weights deep_lab_checkpoint/cyclegan_sem_model.pth --niter 10 --niter_decay 10 --crop_size 400 --load_size 1024 --lambda_identity 0
CycleGAN test:
python test.py --dataroot datasets/gta/ --name --load_size 1024 --preprocess scale_width --num_test 10000000
BDL:
python BDL.py --snapshot-dir ./snapshots/gta2city --init-weights DeepLab/DeepLab_init.pth --num-steps-stop 80000 --model DeepLab --data-dir <data_dir> --data-list dataset/gta5_list/train.txt --data-list-target dataset/cityscapes_list/train.txt --data-dir-target <data_dir>

How to train CycleGAN with perceptual loss?

First of all, thanks for sharing your code. After reading your code and paper carefully, I still have a question: how is the CycleGAN trained in bidirectional learning?
My understanding is as follows:

  1. Train CycleGAN with the corresponding losses to get translated GTA images.
  2. BDL.py: train the segmentation network with the translated GTA images and Cityscapes images.
  3. SSL.py: obtain pseudo labels for Cityscapes and fine-tune the segmentation network.
  4. Train CycleGAN with the perceptual loss while fixing the parameters of the segmentation network (sketched below).

Is my understanding right? Thanks for your answers!
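A minimal sketch of the perceptual-loss idea in step 4, assuming a frozen segmentation network M trained in the previous step; the exact layer and weighting are not specified here, so this is an illustration rather than the repository's implementation:

import torch
import torch.nn.functional as F

def perceptual_loss(seg_net, source_img, translated_img):
    # seg_net is the segmentation model from the previous BDL step; its parameters
    # are frozen so that gradients only flow into the CycleGAN generator.
    for p in seg_net.parameters():
        p.requires_grad = False
    with torch.no_grad():
        ref = seg_net(source_img)        # semantic prediction for the original image
    out = seg_net(translated_img)        # prediction for the translation (grads reach G)
    return F.mse_loss(out, ref)          # penalize semantic drift caused by translation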

Difference of first and second translation model's result

When I first train the translation model, CycleGAN transforms sky into vegetation or buildings. So when I train the segmentation model on the translated source images, there are many noisy labels that disrupt training.

My questions are:

  1. Did you get the same pattern with the first translation model?
  2. If so, how did you deal with the falsely generated artifacts?
  3. Are the translated GTA and SYNTHIA images you uploaded the results of the second translation model?
  4. If so, does the second translation model not generate wrong artifacts in your experiments?

About the implementation

Hi Yunsheng, nice work.
I was a bit confused when reading the paper.
I understood 'bi-directional' as a simultaneous forward and backward pass; however, when I checked the code, I found it is not simultaneous training.

If I am correct, the training step might be:
(1) train CycleGAN to get translated images
(2) train BDL.py to get segmentation model with the translated images and source images,
(3) train SSL.py to get pseudo-labels and refine the segmentation model.
(4) retrain CycleGAN with an additional perceptual loss.

Then what's the next step? Does step (1) try to get better-translated images with the model from step (4), and then we repeat (2)-(3)?
Does that mean this needs to be done multiple times, depending on the number of steps we want to try?
I am a bit confused about it, since I thought 'bi-directional' meant simultaneous training of the image-to-image translation and the segmentation.

Moreover, the CycleGAN folder is not complete. Some modules are missing:
from util.image_pool import ImagePool
from .base_model import BaseModel
from fcn8s_LSD import FCN8s_LSD #lys
For the last one, is fcn8s_LSD the model from step (4)?

thanks for your help.

How to evaluate the provided pre-trained models to get the same results with the paper

Hi, I am sorry to disturb you again. I was trying to evaluate the pre-trained models provided by this project, but I ran into some difficulties. Can you give me some suggestions? Thanks in advance!

You provide pre-trained models in the README file: GTA5_deeplab, GTA5_VGG, SYNTHIA_deeplab, and SYNTHIA_VGG. In my understanding, I should get the same results as the paper by running evaluation.py on the test dataset. However, the results I got are as follows:

  1. GTA5→Cityscapes, DeepLab (48.5 in paper), the result is exactly the same as the paper
python evaluation.py --restore-from gta_2_city_deeplab --model DeepLab --save test
===> mIoU19: 48.52
  2. GTA5→Cityscapes, VGG (41.3 in paper), the result is slightly lower than the paper
python evaluation.py --restore-from gta_2_city_vgg --model VGG --save test
===> mIoU19: 41.06
  3. SYNTHIA→Cityscapes, DeepLab (51.4 in paper), the result is slightly lower than the paper
python evaluation.py --restore-from syn_2_city_deeplab --model DeepLab --save test
===> mIoU13: 51.32
  4. SYNTHIA→Cityscapes, VGG (39.0 in paper), the result is slightly lower than the paper
python evaluation.py --restore-from syn_2_city_vgg --model VGG --save test
===> mIoU16: 38.81

About SYNTHIA to Cityscapes

Thanks for sharing the codes!
Could you please show more results for the SYNTHIA to Cityscapes task, such as M0(2)[F(2)] and M1(2)[F(2)]?
Thanks a lot.

BDL adversarial training

First of all, this is just a small question.
My understanding of "real" and "fake":
I found that in your training code, the adversarial loss on the output probability appears to be the opposite of the common setting, which would be 1 for "real" data and 0 for "fake" data. In this case, since we have the ground truth for the translated synthetic data, we need to push the unlabelled/pseudo-labelled real data to behave like the labelled data, which means the translated synthetic data should be "real" and the target data should be "fake".
Common setting of adversarial training:
When we train the generator (segmentation network), we push the discriminator's output for the generator's input towards "real" (1). When we train the discriminator, the input produced by the generator should be classified as "fake" (0).
Your setting:
You reverse the domain labels for the real data and the generator (segmentation) output. Although it may have little influence on the final result, I still want to check whether this is a personal preference or a deliberate design.
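For reference, a minimal sketch of the common convention described above (not this repository's code), using a binary cross-entropy loss on a patch discriminator D applied to the segmentation softmax outputs:

import torch
import torch.nn.functional as F

SOURCE, TARGET = 1.0, 0.0   # "real" vs "fake" domain labels in the common convention

def generator_adv_loss(D, pred_target):
    # Train the segmentation net so target-domain outputs fool D into predicting "source".
    d_out = D(F.softmax(pred_target, dim=1))
    return F.binary_cross_entropy_with_logits(d_out, torch.full_like(d_out, SOURCE))

def discriminator_loss(D, pred_source, pred_target):
    # Train D to separate source-domain outputs (1) from target-domain outputs (0).
    d_src = D(F.softmax(pred_source.detach(), dim=1))
    d_tgt = D(F.softmax(pred_target.detach(), dim=1))
    return (F.binary_cross_entropy_with_logits(d_src, torch.full_like(d_src, SOURCE)) +
            F.binary_cross_entropy_with_logits(d_tgt, torch.full_like(d_tgt, TARGET)))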

A small question

It appears that you use Adam together with a step LR schedule. Isn't Adam supposed to be an adaptive optimizer, designed to replace a handcrafted LR schedule? I wonder why you don't just use SGD with momentum if you are adjusting the LR yourself.

error when running train.py for CycleGAN

I ran into a problem when running train.py for CycleGAN. When I replace the model files in CycleGAN with the uploaded CycleGAN files from @liyunsheng13, I get a "list index out of range" error, as the following picture shows:
[screenshot of the CycleGAN error]
Do you have any solution for this?

About the GTA2Cityscapes images and SSL model

Thanks for sharing your code. I want to know the iteration number of the provided GTA2Cityscapes dataset and the SSL_step1 and SSL_step2 models: is K=1 or K=2? In other words, after training, will we get an mIoU of 47.2 or 48.5?

Synthia to Cityscapes

Would you please release your trained CycleGAN model for Synthia->Cityscapes translation? Thanks.

First Image Translation

Hi, I have read the paper and still have some questions.

  1. CycleGAN is trained with a perceptual loss. Does the first image translation use the perceptual loss? If so, which segmentation model parameters are used? Is it the source-only model with 33.6 mIoU from the paper?

  2. With the first translated images in hand, when starting the adversarial training of the segmentation model, are the initial model parameters the ImageNet-pretrained parameters or the source-image-pretrained parameters with an mIoU of 33.6?

Train CycleGAN

Thanks for sharing the code!
I have a question about training Cycle-GAN.
Did you use all 24,966 GTA5 images and 2,975 Cityscapes images for training the Cycle-GAN?
When I train it on an RTX 2080 Ti, it takes a few days.

Reproducing M0(2)(F(2)) (the released SSL1)

I followed your training steps by using

python BDL.py --snapshot-dir ./snapshots/gta2city \
    --init-weights /path/to/inital_weights \
    --num-steps-stop 80000 \
    --model DeepLab

I took the initial DeepLabV2 model as the initial weights. The source data is the provided translated "GTA5 as CityScapes (DeepLab)" images, and the target data is the training set of Cityscapes. Then I only got 43.6 mIoU while I should get 44.3.

ssl.py

Could you give more explanation about using self-supervised training?

Why is there no forward-direction process in BDL.py?

In your paper, with bidirectional learning, the image translation model and the segmentation adaptation model can be learned alternately. However, I think there is no forward-direction process in BDL.py. Could you explain the reason? Thank you in advance.

Complete training procedure

Could you provide the complete training procedure of your network for the GTA5 to Cityscapes dataset? I am confused about how to train the CycleGAN and how to use it in the overall procedure.

Train and eval size of Cycle-GAN

Hi,
Thanks for sharing the code!
I have a small question about the training of Cycle-GAN.
During training, the Cycle-GAN input images are resized to 256x256, but the original image size is 1024x564 (e.g. for GTA). So after training the Cycle-GAN, how can you generate the transferred GTA images at the original size (1024x564)?
If you directly feed the original GTA image to a model trained with 256x256 inputs, does that harm the transfer quality?
Thanks very much, and looking forward to your reply!
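One common way to handle this (an assumption about the intended workflow, consistent with the --preprocess scale_width --load_size 1024 test command quoted in an earlier issue) is to rely on the generator being fully convolutional: at test time the image is simply resized to the target width while keeping the aspect ratio and passed through. A minimal sketch:

import torch
from PIL import Image
import torchvision.transforms.functional as TF

def translate_full_size(generator, path, target_w=1024):
    img = Image.open(path).convert('RGB')
    w, h = img.size
    new_h = int(round(target_w * h / w))         # keep the aspect ratio
    new_h = int(round(new_h / 4)) * 4            # round to a multiple of 4 for clean up/down-sampling
    img = img.resize((target_w, new_h), Image.BICUBIC)
    x = TF.to_tensor(img).unsqueeze(0) * 2 - 1   # CycleGAN generators expect inputs in [-1, 1]
    with torch.no_grad():
        y = generator(x)                         # fully convolutional: accepts sizes other than 256x256
    return TF.to_pil_image((y.squeeze(0).clamp(-1, 1) + 1) / 2)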

How do you decide when to stop the training?

Hi, I am curious how you decide when to stop training and how you choose the final snapshots; it's not clarified in your paper. I found the "early stopping" parameter in your code. How should this hyper-parameter be set?

fineSize (cropSize) for Cycle-Gan

Hi,
I am a bit confused about the fineSize parameter for the CycleGAN training process.
In the readme file found in the cyclegan folder, the fineSize mentioned for training is 452, while in a previous post you say CycleGAN is trained with fineSize=1024.
This is the previous post I am referring to: #28

Which one is right?
And in general, what is the fineSize policy for CycleGAN training?
It seems reasonable to choose the same input size used in BDL for that dataset, so that CycleGAN is optimized to output images at that size and they can be used later in the BDL.py training.

Thanks =)

Inconsistency between paper and code

Hi, I found two inconsistencies between the paper and the code. Can you give me some suggestions? Thanks!

  1. The paper says that "When training the segmentation adaptation model, images are resized with the long side to be 1,024 and the ratio is kept." However, the image sizes in the dataloader code are as follows:

    image_sizes = {'cityscapes': (1024,512), 'gta5': (1280, 720), 'synthia': (1280, 760)}

  2. The paper says that "For FCN-8s with VGG16, we use Adam as the optimizer with momentum as 0.9 and 0.99. The initial learning rate is 1 × 10^-5 and decreased with 'step' learning rate policy with step size as 5000 and γ = 0.1." However, the code for adjusting the learning rate is as follows; is the number 5000 correct?

    optimizer.param_groups[0]['lr'] = args.learning_rate * (0.1**(int(i/50000)))

    optimizer.param_groups[0]['lr'] = args.learning_rate_D * (0.1**(int(i/50000)))

AttributeError: 'NotImplementedError' object has no attribute 'step'

Hi. I have some questions about training CycleGAN with the perceptual loss.
When I trained the second round, the terminal reported the following error:
scheduler.step()
AttributeError: 'NotImplementedError' object has no attribute 'step'
Can you help me? My choice of learning rate policy is: --lr_policy=linear
Thank you.

how did you set discriminator's weight?

In the inner loop for SSL, if N=2, the segmentation network is trained 3 times in a row.
When training the segmentation network, how did you set the discriminator's weights?
Do you initialize the discriminator every time, or reuse the previous discriminator's weights?

The same question applies to the outer loop's weights.

how did you make initial weight?

Before training the FCN8s-VGG16 model, I tested the initial weights you provide and got 30.0 mIoU.
In the paper, you write that it is an ImageNet-pretrained model.
But in my opinion, the initial model's high mIoU means it was trained with segmentation supervision.
How did you create the initial weights?

Dataset download failed

Hi Yunsheng, I can't open the "GTA5 as CityScapes" URL on Google Drive. Could you please provide other URLs, for example on Baidu Yun? Thanks!

SSL for SYNTHIA

Hi,
When I train the segmentation network with the pseudo labels using your code,
training is stable for GTA5 and the mIoU increases steadily.
But SYNTHIA shows a different pattern:
the mIoU first rises steeply in the early iterations, but then decreases as training proceeds.
I think that, because the layout gap between SYNTHIA and Cityscapes is much bigger, an overfitting problem occurs in the SYNTHIA setting.
Did you get the same pattern?

About the training on CycleGAN

Hello,
Which files in the official CycleGAN code should be replaced by your code? I tried to replace ./models/ and ./options/, but it doesn't work.

Inconsistency fix from CycleGan repository

First of all, thanks for this contribution.

As some may have noticed, there are some inconsistencies between the BDL repository and the CycleGAN repository, which has been updated since BDL used it, so some code and params in the BDL repository need an update as well. These fixes are based on the history of the CycleGAN repository.
I hope I did not miss anything. If anyone notices something wrong or anything I missed, please comment on this issue.

Param name updates (old -> new):
(fix: change the names in the provided script in BDL/cyclegan/readme.md)

  1. resize_or_crop -> preprocess
  2. niter -> n_epochs
  3. niter_decay -> n_epochs_decay
  4. loadSize -> load_size
  5. fineSize -> crop_size (also change all opt.fineSize calls to opt.crop_size in the given cycle_gan_model.py file)
  6. batchSize -> batch_size

More complex param updates:

  1. lr_policy: the default was changed from 'lambda' to 'linear'. Change the default back to 'lambda'.
    Alternatively, you can copy the linear policy implementation from the CycleGAN repository's networks.py file into the given networks.py.

  2. The which_direction parameter was changed to direction. Go to the given cycle_gan_model.py file and change all opt.which_direction usages to opt.direction.

  3. which_model_netG and which_model_netD were changed to 'netG' and 'netD'. Go to the given cycle_gan_model.py file and change all which_model_netG and which_model_netD
    usages to netG and netD.

  4. The no_lsgan parameter was changed to gan_mode, and its type changed from bool to str, indicating the desired GAN mode.
    A previous issue raised this, and the repository owner commented that the no_lsgan param should be False by default.
    Since this parameter was not changed in the given run command (BDL/cyclegan/readme.md), we want to use the LSGAN; the gan_mode parameter defaults to "lsgan", so we're good.
    All that is left is to change two lines in the given cycle_gan_model.py file (#line num: old -> new):
    62: use_sigmoid = opt.no_lsgan -> use_sigmoid = opt.gan_mode != 'lsgan'
    74: self.criterionGAN = networks.GANLoss(use_lsgan=not opt.no_lsgan).to(self.device) -> self.criterionGAN = networks.GANLoss(use_lsgan=opt.gan_mode == 'lsgan').to(self.device)

Additional problems:

  1. CycleGANModel has a method called Initialize(). This may have served as the __init__ function in an older version of the original CycleGAN repository, but that is no longer the case.
    Rename Initialize to a proper constructor (__init__(self, opt)) and change the call to the base model constructor on line 28 from BaseModel.Initialize(opt) to BaseModel.__init__(self, opt).
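A short, self-contained sketch of what the constructor described in point 1 could look like after the change; BaseModel is stubbed out here and the listed attributes are placeholders, not the actual contents of cycle_gan_model.py:

class BaseModel:                        # stub; the real class lives in models/base_model.py
    def __init__(self, opt):
        self.opt = opt

class CycleGANModel(BaseModel):
    def __init__(self, opt):
        BaseModel.__init__(self, opt)   # replaces the old BaseModel.Initialize(opt) call
        self.loss_names = ['D_A', 'G_A', 'cycle_A', 'D_B', 'G_B', 'cycle_B']
        # ... build the generators, discriminators and losses as in the original file ...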

About confidence threshold in SSL

    # Excerpt from SSL.py: the per-class threshold is the median confidence of each
    # class, capped at 0.9.
    for i in range(19):
        x = predicted_prob[predicted_label==i]
        if len(x) == 0:
            thres.append(0)
            continue
        x = np.sort(x)
        thres.append(x[np.int(np.round(len(x)*0.5))])
    print(thres)
    thres = np.array(thres)
    thres[thres>0.9]=0.9
    print(thres)

In the code above, taking len(x) * 0.5 seems to pick the median confidence per class.
In some cases, the corresponding confidence value could be 0.4 or 0.6. Is that right?

If my interpretation is correct, are there any references on assigning pseudo-labels this way?
If not, is it an empirical trick?

M(0) model

Hi, could you release the M(0) DeepLab models for GTA5 and SYNTHIA?
