datasetgan_release's People

Contributors

arieling

datasetgan_release's Issues

Please update requirements.txt

I'm getting errors when trying to use requirements.txt on Python 3.6. For example, pip install -r requirements.txt shows this error message.

(quick-test) xxx@xxx:~/workspace/quick-test$ pip install -r requirements.txt
Collecting torch==1.5.1
  Using cached torch-1.5.1-cp36-cp36m-manylinux1_x86_64.whl (753.2 MB)
Collecting torchvision==0.5.0
  Using cached torchvision-0.5.0-cp36-cp36m-manylinux1_x86_64.whl (4.0 MB)
Collecting numpy
  Using cached numpy-1.19.5-cp36-cp36m-manylinux2010_x86_64.whl (14.8 MB)
Collecting future
  Using cached future-0.18.2-py3-none-any.whl
INFO: pip is looking at multiple versions of <Python from Requires-Python> to determine which version is compatible with other requirements. This could take a while.
INFO: pip is looking at multiple versions of torch to determine which version is compatible with other requirements. This could take a while.
ERROR: Cannot install -r requirements.txt (line 2) and torch==1.5.1 because these package versions have conflicting dependencies.

The conflict is caused by:
    The user requested torch==1.5.1
    torchvision 0.5.0 depends on torch==1.4.0

To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict

ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/user_guide/#fixing-conflicting-dependencies

For reference, this is a fresh virtualenv. The output from pip freeze is shown below.

(quick-test) xxx@xxx:~/workspace/quick-test$ pip freeze
pkg-resources==0.0.0
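
For reference, the conflict comes from pinning torch==1.5.1 together with torchvision==0.5.0, which requires torch==1.4.0. A minimal sketch of an updated requirements.txt that keeps torch 1.5.1, assuming the code also works with torchvision 0.6.1 (the release paired with torch 1.5.x; this pairing has not been confirmed by the authors):

torch==1.5.1
torchvision==0.6.1
numpy
future

Alternatively, keeping torchvision==0.5.0 would mean downgrading to torch==1.4.0.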

Cars 32 Class Missing

Hello. I am interested in getting the full 32 classes for cars as published in the paper. Will you be releasing experiments (JSON files), checkpoints, and other supporting files to allow us to train and generate these 32-class car images and annotations? Thank you for the excellent work so far!

Affine Layers extracted from StyleGAN1

Thank you for open-sourcing this project! We are trying to implement the model and would like to better understand the affine_layers variable that is used to generate the synthetic images. Which layers are these, taken from the pre-trained StyleGAN1 (SG1)? Are they specific hidden layers? Also, if we were to swap this SG1 for our own model, would the corresponding affine_layers together with the randomly generated latent variable be sufficient?

Thank you for your time, we appreciate it!
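
As context for this question: in this repo the feature maps are returned directly by the synthesis network (the traceback in the next issue shows img_list, affine_layers = g_all.module.g_synthesis(style_latents)). For a custom generator, a minimal sketch of one generic way to collect analogous per-block feature maps is to register forward hooks; the ToyGenerator below is purely hypothetical and only illustrates the mechanism, it is not the repo's code:

import torch
import torch.nn as nn

# Hypothetical toy generator standing in for a StyleGAN synthesis network.
class ToyGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        self.block1 = nn.Conv2d(512, 256, 3, padding=1)
        self.block2 = nn.Conv2d(256, 128, 3, padding=1)
        self.to_rgb = nn.Conv2d(128, 3, 1)

    def forward(self, x):
        x = torch.relu(self.block1(x))
        x = torch.relu(self.block2(x))
        return self.to_rgb(x)

features = []  # filled by the hooks, one tensor per hooked block

def save_activation(module, inputs, output):
    features.append(output.detach())

g = ToyGenerator().eval()
# Register a hook on every block whose output we want as a "feature map".
for block in [g.block1, g.block2]:
    block.register_forward_hook(save_activation)

with torch.no_grad():
    img = g(torch.randn(1, 512, 4, 4))  # random latent-like input

print(img.shape, [f.shape for f in features])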

RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

When running the interpreter's training, the error below appears for me. Does anyone know what is going on?

Traceback (most recent call last):
  File "train_interpreter.py", line 579, in <module>
    main(opts)
  File "train_interpreter.py", line 447, in main
    all_feature_maps_train_all, all_mask_train_all, num_data = prepare_data(args, palette)
  File "train_interpreter.py", line 403, in prepare_data
    img, feature_maps = latent_to_image(g_all, upsamplers, latent_input.unsqueeze(0), dim=args['dim'][1],
  File ".../datasetGAN_release/datasetGAN/../utils/utils.py", line 76, in latent_to_image
    img_list, affine_layers = g_all.module.g_synthesis(style_latents)
  File ".../.conda/envs/dg/lib/python3.8/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File ".../datasetGAN_release/datasetGAN/../models/stylegan1.py", line 558, in forward
    x, x2 = m(dlatents_in[:, 2 * i:2 * i + 2])
  File ".../.conda/envs/dg/lib/python3.8/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File ".../datasetGAN_release/datasetGAN/../models/stylegan1.py", line 370, in forward
    x = self.conv(x)
  File ".../.conda/envs/dg/lib/python3.8/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File ".../mspacheco/datasetGAN_release/datasetGAN/../models/stylegan1.py", line 125, in forward
    return F.conv2d(x, self.weight * self.w_mul, bias, padding=self.kernel_size // 2)
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED

stylegan2 support

Hello,

Is StyleGAN2 supported in your code?
I saw stylegan2-ffhq-config-f.pt in the pretrain folder, but there is only stylegan1.py in models.

Question about the feature maps

Hello, I have a question about the feature maps that are fed into the style interpreter. The paper says that the feature maps are the outputs of the AdaIN layers. However, the code below (the forward function of the G_synthesis block) suggests something different: x is the output of the second convolution layer in the block. Which one is the correct version, or am I missing something?

Thanks in advance

def forward(self, x, dlatents_in_range, latent_after_trans=None):
    x = self.conv0_up(x)
    if latent_after_trans is None:
        x = self.epi1(x, dlatents_in_range[:, 0])
    else:
        x = self.epi1(x, dlatents_in_range[:, 0], latent_after_trans[0])  # latent_after_trans is a list
    x = self.conv1(x)
    if latent_after_trans is None:
        x1 = self.epi2(x, dlatents_in_range[:, 1])
    else:
        x1 = self.epi2(x, dlatents_in_range[:, 1], latent_after_trans[1])
    return x1, x

Question about key point

  1. In generate_data, what is the purpose of using 10 models?
  2. I noticed that make_training_data.py seems to be used to generate data, but I want to train the first stage with my own labeled training data. How should I proceed?

Thank you!

label image generated is empty

Hi there,

I tried to use the pretrained interpreter to run the sampling code below:
python train_interpreter.py --generate_data True --exp experiments/car_20.json --resume checkpoints/interpreter_checkpoint/car_20 --num_sample 10
The images are generated correctly, but the label maps are all black. Any idea why this happens? Thank you!

Does RTX3090 support training and inference of this network?

The authors indicate that the graphics card used was a Tesla V100, which is a relatively high-end configuration for a typical lab. Have you run training or inference on lower-end graphics cards, such as the RTX 30 or RTX 20 series?

About CelebA-Mask-8

As far as I know, the original CelebA dataset does not provide semantic segmentation labels, while the CelebAMask-HQ dataset has 19-class semantic segmentation. Could you please describe how you obtained the CelebA-Mask-8 test data, or release your test dataset?

Training deeplab on higher resolution images

Hi!

Thanks once again for your responsiveness :)

May I ask about the procedure used to train DeepLab-v3 on 1024 x 1024 images? The code provided in the repo samples with train_interpreter.py and trains train_deeplab.py on 512 x 512 images for the face_34 task.

When doing the same for 1024 x 1024 images, the cross_validation script produces the result below:
[attached result: trained_deeplab_1024_500]

Could you provide more information on the procedure for training DeepLab on 1024 x 1024 images? How large was the training set you used, and what were the number of epochs and batch size? I ask because I also ran into CUDA OOM errors when training on a 32 GB NVIDIA Tesla V100.

Confusion regarding checkpoints

Can you please clarify where to download the StyleGAN checkpoints from?

The linked repo has pickled models but no checkpoints, and certainly no .pt checkpoints. It is also a TensorFlow implementation, so I'm a little confused...

Specifically, where can I find:

  • karras2019stylegan-cars-512x384.for_g_all.pt
  • karras2019stylegan-cats-256x256.for_g_all.pt
  • karras2019stylegan-celebahq-1024x1024.for_g_all.pt

Images generated by the StyleGAN of the DatasetGAN different from the original ones (with the same StyleGAN)

Hi,

When I ran the training part of "train_interpreter.py" and looked at the resulting "train_data.jpg", I saw that the generated images (row 2) are not exactly the same as the original ones (row 1). This problem also seems to be present in the images shared on your Google Drive (e.g. "train_data.jpg" from car_20). For car_20, however, the difference is quite small and does not really affect the correspondence with the mask images.

In my case, however, with 128x128 images and a StyleGAN with a fairly high FID-50k (around 23-24), the difference between the generated images means that the regions of interest have shifted. The labeled parts of the mask images therefore no longer correspond to the correct regions of the generated images, and the pixel_classifiers are not trained on the correct information.

Do you know of a way to fix this reproducibility problem, so that the StyleGAN generates the same image for a given latent code?

Note: the "original" images I generated at the start were produced with the ".pt" file of the pre-trained StyleGAN (after the [TensorFlow -> PyTorch] conversion) and used the same "avg_latent" as the one that was then used for DatasetGAN. I also used the same PC for generating the "original" images and for training DatasetGAN, so normally there shouldn't be any problem related to these parts.


Update # 1: I found that fixing the random seeds of PyTorch and NumPy helped make the images generated by the StyleGAN inside DatasetGAN look like the original ones. However, this is only true for the first selected images, which had consecutive numbering when I created them.

Also, I don't understand how fixing the random seeds could have helped, as I don't see any part of DatasetGAN that uses random values (unless it comes from the additive noise of the StyleGAN).
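
For anyone hitting the same issue, a minimal seeding sketch, assuming the nondeterminism comes from the Python/NumPy/PyTorch RNGs and from cuDNN; where exactly the seeds need to be set depends on the code path:

import random
import numpy as np
import torch

def set_seed(seed=0):
    # Seed the Python, NumPy and PyTorch RNGs so the same latents/noise are drawn each run.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # Trade speed for determinism in cuDNN convolutions.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

set_seed(0)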

CUDA error: CUBLAS_STATUS_EXECUTION_FAILED

When I try to reproduce the example execution (steps 1 and 2), the following message appears: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc).

I run the scripts on an RTX 3080 graphics card and tried the car_20 example. How can I solve this?

Mismatch between images and annotations in Cat-16

Hi,
I have found that in Cat-16 some annotations do not match the images. Examples are as follows

[Table comparing, for image_23 and image_26: the image provided on Google Drive, the image generated from StyleGAN, and the annotation shown in the supplementary material]

Clearly, neither the images provided on Google Drive nor the images generated from StyleGAN match the annotations; only the images shown in the supplementary material do. The mismatch is quite obvious: an extra leg is annotated in image_23, while one leg is missing in the annotation of image_26. I tend to believe that the released images and latents were taken incorrectly. Is it possible to fix this issue?

Mis-implementation of JS divergence

Hi, according to the definition of JS divergence (as given in your supplementary file), the JS divergence is calculated as the difference between the entropy of the average probabilities and the average of the entropies:

JS(p_1, ..., p_N) = H( (1/N) Σ_i p_i ) - (1/N) Σ_i H(p_i)

However, in your code the first term of JS, i.e. the entropy of the average probabilities, is implemented as:

full_entropy = Categorical(logits=mean_seg).entropy()

where mean_seg is defined as average segmentation map of 10 outputs of ensembled pixel_classifiers.

Specifically, I have traced the implementation of mean_seg:

mean_seg = mean_seg / len(all_seg)

-->

if mean_seg is None:
    mean_seg = img_seg
else:
    mean_seg += img_seg

--> img_seg:

img_seg = classifier(affine_layers)
img_seg = img_seg.squeeze()

In fact, img_seg contains unnormalized probabilities, i.e. logits in the sense of the PyTorch distribution's argument. I think the code averages logits instead of probabilities (since the Sigmoid is commented out in pixel_classifier):

class pixel_classifier(nn.Module):
    def __init__(self, numpy_class, dim):
        super(pixel_classifier, self).__init__()
        if numpy_class < 32:
            self.layers = nn.Sequential(
                nn.Linear(dim, 128),
                nn.ReLU(),
                nn.BatchNorm1d(num_features=128),
                nn.Linear(128, 32),
                nn.ReLU(),
                nn.BatchNorm1d(num_features=32),
                nn.Linear(32, numpy_class),
                # nn.Sigmoid()
            )
        else:
            self.layers = nn.Sequential(
                nn.Linear(dim, 256),
                nn.ReLU(),
                nn.BatchNorm1d(num_features=256),
                nn.Linear(256, 128),
                nn.ReLU(),
                nn.BatchNorm1d(num_features=128),
                nn.Linear(128, numpy_class),
                # nn.Sigmoid()
            )

TL;DR

Swapping the order of the softmax and the averaging (softmax does not commute with a linear combination of logits) leads to a mis-implementation of the JS divergence.
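
For reference, a minimal sketch of the JS term computed on probabilities rather than logits, i.e. softmax each classifier's output first, average the probabilities, and only then take the entropies; the variable names are illustrative, not the repo's:

import torch
import torch.nn.functional as F
from torch.distributions import Categorical

def js_uncertainty(logits_list):
    """logits_list: list of [num_pixels, num_classes] tensors, one per ensembled classifier."""
    probs = [F.softmax(l, dim=-1) for l in logits_list]          # normalize each model's output
    mean_probs = torch.stack(probs, dim=0).mean(dim=0)           # average probabilities, not logits
    entropy_of_mean = Categorical(probs=mean_probs).entropy()    # H(mean_i p_i)
    mean_entropy = torch.stack(
        [Categorical(probs=p).entropy() for p in probs], dim=0
    ).mean(dim=0)                                                 # (1/N) sum_i H(p_i)
    return entropy_of_mean - mean_entropy                         # per-pixel JS divergence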

ADE-Car-12 testing set and PASCAL-Car-5

Hi,
Nice work! I am curious about how ADE-Car-12 and PASCAL-Car-5 were constructed. The paper says ADE-Car-12 contains 50 and 250 images for training and testing; I wonder which exact images those are (e.g. image indices). Are they preprocessed, and is there any instruction for constructing the set? I also wonder which 900 images are taken from PASCAL-Part for PASCAL-Car-5, and how the 5 classes are defined. Thanks very much!

Using DatasetGan under EditGan

Is it possible to use only the DatasetGAN part of the new release under EditGAN? I want to generate an annotated dataset based on StyleGAN 2, but without the EditGAN app; is that still possible?

Error in cross-validation

I reproduced the face_34 example, but an error was raised when executing the cross-validation. There seems to be an incompatibility between the checkpoint generated by the downstream task and the classifier torchvision.models.segmentation.deeplabv3_resnet101(pretrained=False, progress=False, num_classes=args['testing_data_number_class'], aux_loss=None). Could you please help me with this issue?

The code I ran: python test_deeplab_cross_validation.py --exp experiments/face_34.json --resume model_dir/face_34 --cross_validate True

The error I got:
Opt {'exp_dir': 'model_dir/face_34', 'batch_size': 64, 'category': 'face', 'debug': False, 'dim': [512, 512, 5088], 'deeplab_res': 512, 'number_class': 34, 'testing_data_number_class': 34, 'max_training': 16, 'stylegan_ver': '1', 'annotation_data_from_w': False, 'annotation_mask_path': './dataset_release/annotation/training_data/face_processed', 'testing_path': './dataset_release/annotation/testing_data/face_34_class', 'average_latent': './dataset_release/training_latent/face_34/avg_latent_stylegan1.npy', 'annotation_image_latent_path': './dataset_release/training_latent/face_34/latent_stylegan1.npy', 'stylegan_checkpoint': './checkpoints/stylegan_pretrain/karras2019stylegan-celebahq-1024x1024.for_g_all.pt', 'model_num': 10, 'upsample_mode': 'bilinear'}
Downloading: "https://download.pytorch.org/models/resnet101-5d3b4d8f.pth" to /root/.cache/torch/hub/checkpoints/resnet101-5d3b4d8f.pth
100% 170M/170M [00:02<00:00, 65.8MB/s]
Val Data length, 4
Testing Data length, 16
Traceback (most recent call last):
  File "test_deeplab_cross_validation.py", line 511, in <module>
    cross_validate(args.resume, opts)
  File "test_deeplab_cross_validation.py", line 129, in cross_validate
    classifier.load_state_dict(checkpoint['model_state_dict'])
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1224, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for DeepLabV3:
Missing key(s) in state_dict: "backbone.conv1.weight", "backbone.bn1.weight", "backbone.bn1.bias", "backbone.bn1.running_mean", "backbone.bn1.running_var", "backbone.layer1.0.conv1.weight", "backbone.layer1.0.bn1.weight", "backbone.layer1.0.bn1.bias", "backbone.layer1.0.bn1.running_mean", "backbone.layer1.0.bn1.running_var", "backbone.layer1.0.conv2.weight", "backbone.layer1.0.bn2.weight", "backbone.layer1.0.bn2.bias", "backbone.layer1.0.bn2.running_mean", "backbone.layer1.0.bn2.running_var", "backbone.layer1.0.conv3.weight", "backbone.layer1.0.bn3.weight", "backbone.layer1.0.bn3.bias", "backbone.layer1.0.bn3.running_mean", "backbone.layer1.0.bn3.running_var", "backbone.layer1.0.downsample.0.weight", "backbone.layer1.0.downsample.1.weight", "backbone.layer1.0.downsample.1.bias", "backbone.layer1.0.downsample.1.running_mean", "backbone.layer1.0.downsample.1.running_var", "backbone.layer1.1.conv1.weight", "backbone.layer1.1.bn1.weight", "backbone.layer1.1.bn1.bias", "backbone.layer1.1.bn1.running_mean", "backbone.layer1.1.bn1.running_var", "backbone.layer1.1.conv2.weight", "backbone.layer1.1.bn2.weight", "backbone.layer1.1.bn2.bias", "backbone.layer1.1.bn2.running_mean", "backbone.layer1.1.bn2.running_var", "backbone.layer1.1.conv3.weight", "backbone.layer1.1.bn3.weight", "backbone.layer1.1.bn3.bias", "backbone.layer1.1.bn3.running_mean", "backbone.layer1.1.bn3.running_var", "backbone.layer1.2.conv1.weight", "backbone.layer1.2.bn1.weight", "backbone.layer1.2.bn1.bias", "backbone.layer1.2.bn1.running_mean", "backbone.layer1.2.bn1.running_var", "backbone.layer1.2.conv2.weight", "backbone.layer1.2.bn2.weight", "backbone.layer1.2.bn2.bias", "backbone.layer1.2.bn2.running_mean", "backbone.layer1.2.bn2.running_var", "backbone.layer1.2.conv3.weight", "backbone.layer1.2.bn3.weight", "backbone.layer1.2.bn3.bias", "backbone.layer1.2.bn3.running_mean", "backbone.layer1.2.bn3.running_var", "backbone.layer2.0.conv1.weight", "backbone.layer2.0.bn1.weight", "backbone.layer2.0.bn1.bias", "backbone.layer2.0.bn1.running_mean", "backbone.layer2.0.bn1.running_var", "backbone.layer2.0.conv2.weight", "backbone.layer2.0.bn2.weight", "backbone.layer2.0.bn2.bias", "backbone.layer2.0.bn2.running_mean", "backbone.layer2.0.bn2.running_var", "backbone.layer2.0.conv3.weight", "backbone.layer2.0.bn3.weight", "backbone.layer2.0.bn3.bias", "backbone.layer2.0.bn3.running_mean", "backbone.layer2.0.bn3.running_var", "backbone.layer2.0.downsample.0.weight", "backbone.layer2.0.downsample.1.weight", "backbone.layer2.0.downsample.1.bias", "backbone.layer2.0.downsample.1.running_mean", "backbone.layer2.0.downsample.1.running_var", "backbone.layer2.1.conv1.weight", "backbone.layer2.1.bn1.weight", "backbone.layer2.1.bn1.bias", "backbone.layer2.1.bn1.running_mean", "backbone.layer2.1.bn1.running_var", "backbone.layer2.1.conv2.weight", "backbone.layer2.1.bn2.weight", "backbone.layer2.1.bn2.bias", "backbone.layer2.1.bn2.running_mean", "backbone.layer2.1.bn2.running_var", "backbone.layer2.1.conv3.weight", "backbone.layer2.1.bn3.weight", "backbone.layer2.1.bn3.bias", "backbone.layer2.1.bn3.running_mean", "backbone.layer2.1.bn3.running_var", "backbone.layer2.2.conv1.weight", "backbone.layer2.2.bn1.weight", "backbone.layer2.2.bn1.bias", "backbone.layer2.2.bn1.running_mean", "backbone.layer2.2.bn1.running_var", "backbone.layer2.2.conv2.weight", "backbone.layer2.2.bn2.weight", "backbone.layer2.2.bn2.bias", "backbone.layer2.2.bn2.running_mean", "backbone.layer2.2.bn2.running_var", "backbone.layer2.2.conv3.weight", 
"backbone.layer2.2.bn3.weight", "backbone.layer2.2.bn3.bias", "backbone.layer2.2.bn3.running_mean", "backbone.layer2.2.bn3.running_var", "backbone.layer2.3.conv1.weight", "backbone.layer2.3.bn1.weight", "backbone.layer2.3.bn1.bias", "backbone.layer2.3.bn1.running_mean", "backbone.layer2.3.bn1.running_var", "backbone.layer2.3.conv2.weight", "backbone.layer2.3.bn2.weight", "backbone.layer2.3.bn2.bias", "backbone.layer2.3.bn2.running_mean", "backbone.layer2.3.bn2.running_var", "backbone.layer2.3.conv3.weight", "backbone.layer2.3.bn3.weight", "backbone.layer2.3.bn3.bias", "backbone.layer2.3.bn3.running_mean", "backbone.layer2.3.bn3.running_var", "backbone.layer3.0.conv1.weight", "backbone.layer3.0.bn1.weight", "backbone.layer3.0.bn1.bias", "backbone.layer3.0.bn1.running_mean", "backbone.layer3.0.bn1.running_var", "backbone.layer3.0.conv2.weight", "backbone.layer3.0.bn2.weight", "backbone.layer3.0.bn2.bias", "backbone.layer3.0.bn2.running_mean", "backbone.layer3.0.bn2.running_var", "backbone.layer3.0.conv3.weight", "backbone.layer3.0.bn3.weight", "backbone.layer3.0.bn3.bias", "backbone.layer3.0.bn3.running_mean", "backbone.layer3.0.bn3.running_var", "backbone.layer3.0.downsample.0.weight", "backbone.layer3.0.downsample.1.weight", "backbone.layer3.0.downsample.1.bias", "backbone.layer3.0.downsample.1.running_mean", "backbone.layer3.0.downsample.1.running_var", "backbone.layer3.1.conv1.weight", "backbone.layer3.1.bn1.weight", "backbone.layer3.1.bn1.bias", "backbone.layer3.1.bn1.running_mean", "backbone.layer3.1.bn1.running_var", "backbone.layer3.1.conv2.weight", "backbone.layer3.1.bn2.weight", "backbone.layer3.1.bn2.bias", "backbone.layer3.1.bn2.running_mean", "backbone.layer3.1.bn2.running_var", "backbone.layer3.1.conv3.weight", "backbone.layer3.1.bn3.weight", "backbone.layer3.1.bn3.bias", "backbone.layer3.1.bn3.running_mean", "backbone.layer3.1.bn3.running_var", "backbone.layer3.2.conv1.weight", "backbone.layer3.2.bn1.weight", "backbone.layer3.2.bn1.bias", "backbone.layer3.2.bn1.running_mean", "backbone.layer3.2.bn1.running_var", "backbone.layer3.2.conv2.weight", "backbone.layer3.2.bn2.weight", "backbone.layer3.2.bn2.bias", "backbone.layer3.2.bn2.running_mean", "backbone.layer3.2.bn2.running_var", "backbone.layer3.2.conv3.weight", "backbone.layer3.2.bn3.weight", "backbone.layer3.2.bn3.bias", "backbone.layer3.2.bn3.running_mean", "backbone.layer3.2.bn3.running_var", "backbone.layer3.3.conv1.weight", "backbone.layer3.3.bn1.weight", "backbone.layer3.3.bn1.bias", "backbone.layer3.3.bn1.running_mean", "backbone.layer3.3.bn1.running_var", "backbone.layer3.3.conv2.weight", "backbone.layer3.3.bn2.weight", "backbone.layer3.3.bn2.bias", "backbone.layer3.3.bn2.running_mean", "backbone.layer3.3.bn2.running_var", "backbone.layer3.3.conv3.weight", "backbone.layer3.3.bn3.weight", "backbone.layer3.3.bn3.bias", "backbone.layer3.3.bn3.running_mean", "backbone.layer3.3.bn3.running_var", "backbone.layer3.4.conv1.weight", "backbone.layer3.4.bn1.weight", "backbone.layer3.4.bn1.bias", "backbone.layer3.4.bn1.running_mean", "backbone.layer3.4.bn1.running_var", "backbone.layer3.4.conv2.weight", "backbone.layer3.4.bn2.weight", "backbone.layer3.4.bn2.bias", "backbone.layer3.4.bn2.running_mean", "backbone.layer3.4.bn2.running_var", "backbone.layer3.4.conv3.weight", "backbone.layer3.4.bn3.weight", "backbone.layer3.4.bn3.bias", "backbone.layer3.4.bn3.running_mean", "backbone.layer3.4.bn3.running_var", "backbone.layer3.5.conv1.weight", "backbone.layer3.5.bn1.weight", "backbone.layer3.5.bn1.bias", 
"backbone.layer3.5.bn1.running_mean", "backbone.layer3.5.bn1.running_var", "backbone.layer3.5.conv2.weight", "backbone.layer3.5.bn2.weight", "backbone.layer3.5.bn2.bias", "backbone.layer3.5.bn2.running_mean", "backbone.layer3.5.bn2.running_var", "backbone.layer3.5.conv3.weight", "backbone.layer3.5.bn3.weight", "backbone.layer3.5.bn3.bias", "backbone.layer3.5.bn3.running_mean", "backbone.layer3.5.bn3.running_var", "backbone.layer3.6.conv1.weight", "backbone.layer3.6.bn1.weight", "backbone.layer3.6.bn1.bias", "backbone.layer3.6.bn1.running_mean", "backbone.layer3.6.bn1.running_var", "backbone.layer3.6.conv2.weight", "backbone.layer3.6.bn2.weight", "backbone.layer3.6.bn2.bias", "backbone.layer3.6.bn2.running_mean", "backbone.layer3.6.bn2.running_var", "backbone.layer3.6.conv3.weight", "backbone.layer3.6.bn3.weight", "backbone.layer3.6.bn3.bias", "backbone.layer3.6.bn3.running_mean", "backbone.layer3.6.bn3.running_var", "backbone.layer3.7.conv1.weight", "backbone.layer3.7.bn1.weight", "backbone.layer3.7.bn1.bias", "backbone.layer3.7.bn1.running_mean", "backbone.layer3.7.bn1.running_var", "backbone.layer3.7.conv2.weight", "backbone.layer3.7.bn2.weight", "backbone.layer3.7.bn2.bias", "backbone.layer3.7.bn2.running_mean", "backbone.layer3.7.bn2.running_var", "backbone.layer3.7.conv3.weight", "backbone.layer3.7.bn3.weight", "backbone.layer3.7.bn3.bias", "backbone.layer3.7.bn3.running_mean", "backbone.layer3.7.bn3.running_var", "backbone.layer3.8.conv1.weight", "backbone.layer3.8.bn1.weight", "backbone.layer3.8.bn1.bias", "backbone.layer3.8.bn1.running_mean", "backbone.layer3.8.bn1.running_var", "backbone.layer3.8.conv2.weight", "backbone.layer3.8.bn2.weight", "backbone.layer3.8.bn2.bias", "backbone.layer3.8.bn2.running_mean", "backbone.layer3.8.bn2.running_var", "backbone.layer3.8.conv3.weight", "backbone.layer3.8.bn3.weight", "backbone.layer3.8.bn3.bias", "backbone.layer3.8.bn3.running_mean", "backbone.layer3.8.bn3.running_var", "backbone.layer3.9.conv1.weight", "backbone.layer3.9.bn1.weight", "backbone.layer3.9.bn1.bias", "backbone.layer3.9.bn1.running_mean", "backbone.layer3.9.bn1.running_var", "backbone.layer3.9.conv2.weight", "backbone.layer3.9.bn2.weight", "backbone.layer3.9.bn2.bias", "backbone.layer3.9.bn2.running_mean", "backbone.layer3.9.bn2.running_var", "backbone.layer3.9.conv3.weight", "backbone.layer3.9.bn3.weight", "backbone.layer3.9.bn3.bias", "backbone.layer3.9.bn3.running_mean", "backbone.layer3.9.bn3.running_var", "backbone.layer3.10.conv1.weight", "backbone.layer3.10.bn1.weight", "backbone.layer3.10.bn1.bias", "backbone.layer3.10.bn1.running_mean", "backbone.layer3.10.bn1.running_var", "backbone.layer3.10.conv2.weight", "backbone.layer3.10.bn2.weight", "backbone.layer3.10.bn2.bias", "backbone.layer3.10.bn2.running_mean", "backbone.layer3.10.bn2.running_var", "backbone.layer3.10.conv3.weight", "backbone.layer3.10.bn3.weight", "backbone.layer3.10.bn3.bias", "backbone.layer3.10.bn3.running_mean", "backbone.layer3.10.bn3.running_var", "backbone.layer3.11.conv1.weight", "backbone.layer3.11.bn1.weight", "backbone.layer3.11.bn1.bias", "backbone.layer3.11.bn1.running_mean", "backbone.layer3.11.bn1.running_var", "backbone.layer3.11.conv2.weight", "backbone.layer3.11.bn2.weight", "backbone.layer3.11.bn2.bias", "backbone.layer3.11.bn2.running_mean", "backbone.layer3.11.bn2.running_var", "backbone.layer3.11.conv3.weight", "backbone.layer3.11.bn3.weight", "backbone.layer3.11.bn3.bias", "backbone.layer3.11.bn3.running_mean", "backbone.layer3.11.bn3.running_var", 
"backbone.layer3.12.conv1.weight", "backbone.layer3.12.bn1.weight", "backbone.layer3.12.bn1.bias", "backbone.layer3.12.bn1.running_mean", "backbone.layer3.12.bn1.running_var", "backbone.layer3.12.conv2.weight", "backbone.layer3.12.bn2.weight", "backbone.layer3.12.bn2.bias", "backbone.layer3.12.bn2.running_mean", "backbone.layer3.12.bn2.running_var", "backbone.layer3.12.conv3.weight", "backbone.layer3.12.bn3.weight", "backbone.layer3.12.bn3.bias", "backbone.layer3.12.bn3.running_mean", "backbone.layer3.12.bn3.running_var", "backbone.layer3.13.conv1.weight", "backbone.layer3.13.bn1.weight", "backbone.layer3.13.bn1.bias", "backbone.layer3.13.bn1.running_mean", "backbone.layer3.13.bn1.running_var", "backbone.layer3.13.conv2.weight", "backbone.layer3.13.bn2.weight", "backbone.layer3.13.bn2.bias", "backbone.layer3.13.bn2.running_mean", "backbone.layer3.13.bn2.running_var", "backbone.layer3.13.conv3.weight", "backbone.layer3.13.bn3.weight", "backbone.layer3.13.bn3.bias", "backbone.layer3.13.bn3.running_mean", "backbone.layer3.13.bn3.running_var", "backbone.layer3.14.conv1.weight", "backbone.layer3.14.bn1.weight", "backbone.layer3.14.bn1.bias", "backbone.layer3.14.bn1.running_mean", "backbone.layer3.14.bn1.running_var", "backbone.layer3.14.conv2.weight", "backbone.layer3.14.bn2.weight", "backbone.layer3.14.bn2.bias", "backbone.layer3.14.bn2.running_mean", "backbone.layer3.14.bn2.running_var", "backbone.layer3.14.conv3.weight", "backbone.layer3.14.bn3.weight", "backbone.layer3.14.bn3.bias", "backbone.layer3.14.bn3.running_mean", "backbone.layer3.14.bn3.running_var", "backbone.layer3.15.conv1.weight", "backbone.layer3.15.bn1.weight", "backbone.layer3.15.bn1.bias", "backbone.layer3.15.bn1.running_mean", "backbone.layer3.15.bn1.running_var", "backbone.layer3.15.conv2.weight", "backbone.layer3.15.bn2.weight", "backbone.layer3.15.bn2.bias", "backbone.layer3.15.bn2.running_mean", "backbone.layer3.15.bn2.running_var", "backbone.layer3.15.conv3.weight", "backbone.layer3.15.bn3.weight", "backbone.layer3.15.bn3.bias", "backbone.layer3.15.bn3.running_mean", "backbone.layer3.15.bn3.running_var", "backbone.layer3.16.conv1.weight", "backbone.layer3.16.bn1.weight", "backbone.layer3.16.bn1.bias", "backbone.layer3.16.bn1.running_mean", "backbone.layer3.16.bn1.running_var", "backbone.layer3.16.conv2.weight", "backbone.layer3.16.bn2.weight", "backbone.layer3.16.bn2.bias", "backbone.layer3.16.bn2.running_mean", "backbone.layer3.16.bn2.running_var", "backbone.layer3.16.conv3.weight", "backbone.layer3.16.bn3.weight", "backbone.layer3.16.bn3.bias", "backbone.layer3.16.bn3.running_mean", "backbone.layer3.16.bn3.running_var", "backbone.layer3.17.conv1.weight", "backbone.layer3.17.bn1.weight", "backbone.layer3.17.bn1.bias", "backbone.layer3.17.bn1.running_mean", "backbone.layer3.17.bn1.running_var", "backbone.layer3.17.conv2.weight", "backbone.layer3.17.bn2.weight", "backbone.layer3.17.bn2.bias", "backbone.layer3.17.bn2.running_mean", "backbone.layer3.17.bn2.running_var", "backbone.layer3.17.conv3.weight", "backbone.layer3.17.bn3.weight", "backbone.layer3.17.bn3.bias", "backbone.layer3.17.bn3.running_mean", "backbone.layer3.17.bn3.running_var", "backbone.layer3.18.conv1.weight", "backbone.layer3.18.bn1.weight", "backbone.layer3.18.bn1.bias", "backbone.layer3.18.bn1.running_mean", "backbone.layer3.18.bn1.running_var", "backbone.layer3.18.conv2.weight", "backbone.layer3.18.bn2.weight", "backbone.layer3.18.bn2.bias", "backbone.layer3.18.bn2.running_mean", "backbone.layer3.18.bn2.running_var", "backbone.layer3.18.conv3.weight", 
"backbone.layer3.18.bn3.weight", "backbone.layer3.18.bn3.bias", "backbone.layer3.18.bn3.running_mean", "backbone.layer3.18.bn3.running_var", "backbone.layer3.19.conv1.weight", "backbone.layer3.19.bn1.weight", "backbone.layer3.19.bn1.bias", "backbone.layer3.19.bn1.running_mean", "backbone.layer3.19.bn1.running_var", "backbone.layer3.19.conv2.weight", "backbone.layer3.19.bn2.weight", "backbone.layer3.19.bn2.bias", "backbone.layer3.19.bn2.running_mean", "backbone.layer3.19.bn2.running_var", "backbone.layer3.19.conv3.weight", "backbone.layer3.19.bn3.weight", "backbone.layer3.19.bn3.bias", "backbone.layer3.19.bn3.running_mean", "backbone.layer3.19.bn3.running_var", "backbone.layer3.20.conv1.weight", "backbone.layer3.20.bn1.weight", "backbone.layer3.20.bn1.bias", "backbone.layer3.20.bn1.running_mean", "backbone.layer3.20.bn1.running_var", "backbone.layer3.20.conv2.weight", "backbone.layer3.20.bn2.weight", "backbone.layer3.20.bn2.bias", "backbone.layer3.20.bn2.running_mean", "backbone.layer3.20.bn2.running_var", "backbone.layer3.20.conv3.weight", "backbone.layer3.20.bn3.weight", "backbone.layer3.20.bn3.bias", "backbone.layer3.20.bn3.running_mean", "backbone.layer3.20.bn3.running_var", "backbone.layer3.21.conv1.weight", "backbone.layer3.21.bn1.weight", "backbone.layer3.21.bn1.bias", "backbone.layer3.21.bn1.running_mean", "backbone.layer3.21.bn1.running_var", "backbone.layer3.21.conv2.weight", "backbone.layer3.21.bn2.weight", "backbone.layer3.21.bn2.bias", "backbone.layer3.21.bn2.running_mean", "backbone.layer3.21.bn2.running_var", "backbone.layer3.21.conv3.weight", "backbone.layer3.21.bn3.weight", "backbone.layer3.21.bn3.bias", "backbone.layer3.21.bn3.running_mean", "backbone.layer3.21.bn3.running_var", "backbone.layer3.22.conv1.weight", "backbone.layer3.22.bn1.weight", "backbone.layer3.22.bn1.bias", "backbone.layer3.22.bn1.running_mean", "backbone.layer3.22.bn1.running_var", "backbone.layer3.22.conv2.weight", "backbone.layer3.22.bn2.weight", "backbone.layer3.22.bn2.bias", "backbone.layer3.22.bn2.running_mean", "backbone.layer3.22.bn2.running_var", "backbone.layer3.22.conv3.weight", "backbone.layer3.22.bn3.weight", "backbone.layer3.22.bn3.bias", "backbone.layer3.22.bn3.running_mean", "backbone.layer3.22.bn3.running_var", "backbone.layer4.0.conv1.weight", "backbone.layer4.0.bn1.weight", "backbone.layer4.0.bn1.bias", "backbone.layer4.0.bn1.running_mean", "backbone.layer4.0.bn1.running_var", "backbone.layer4.0.conv2.weight", "backbone.layer4.0.bn2.weight", "backbone.layer4.0.bn2.bias", "backbone.layer4.0.bn2.running_mean", "backbone.layer4.0.bn2.running_var", "backbone.layer4.0.conv3.weight", "backbone.layer4.0.bn3.weight", "backbone.layer4.0.bn3.bias", "backbone.layer4.0.bn3.running_mean", "backbone.layer4.0.bn3.running_var", "backbone.layer4.0.downsample.0.weight", "backbone.layer4.0.downsample.1.weight", "backbone.layer4.0.downsample.1.bias", "backbone.layer4.0.downsample.1.running_mean", "backbone.layer4.0.downsample.1.running_var", "backbone.layer4.1.conv1.weight", "backbone.layer4.1.bn1.weight", "backbone.layer4.1.bn1.bias", "backbone.layer4.1.bn1.running_mean", "backbone.layer4.1.bn1.running_var", "backbone.layer4.1.conv2.weight", "backbone.layer4.1.bn2.weight", "backbone.layer4.1.bn2.bias", "backbone.layer4.1.bn2.running_mean", "backbone.layer4.1.bn2.running_var", "backbone.layer4.1.conv3.weight", "backbone.layer4.1.bn3.weight", "backbone.layer4.1.bn3.bias", "backbone.layer4.1.bn3.running_mean", "backbone.layer4.1.bn3.running_var", "backbone.layer4.2.conv1.weight", 
"backbone.layer4.2.bn1.weight", "backbone.layer4.2.bn1.bias", "backbone.layer4.2.bn1.running_mean", "backbone.layer4.2.bn1.running_var", "backbone.layer4.2.conv2.weight", "backbone.layer4.2.bn2.weight", "backbone.layer4.2.bn2.bias", "backbone.layer4.2.bn2.running_mean", "backbone.layer4.2.bn2.running_var", "backbone.layer4.2.conv3.weight", "backbone.layer4.2.bn3.weight", "backbone.layer4.2.bn3.bias", "backbone.layer4.2.bn3.running_mean", "backbone.layer4.2.bn3.running_var", "classifier.0.convs.0.0.weight", "classifier.0.convs.0.1.weight", "classifier.0.convs.0.1.bias", "classifier.0.convs.0.1.running_mean", "classifier.0.convs.0.1.running_var", "classifier.0.convs.1.0.weight", "classifier.0.convs.1.1.weight", "classifier.0.convs.1.1.bias", "classifier.0.convs.1.1.running_mean", "classifier.0.convs.1.1.running_var", "classifier.0.convs.2.0.weight", "classifier.0.convs.2.1.weight", "classifier.0.convs.2.1.bias", "classifier.0.convs.2.1.running_mean", "classifier.0.convs.2.1.running_var", "classifier.0.convs.3.0.weight", "classifier.0.convs.3.1.weight", "classifier.0.convs.3.1.bias", "classifier.0.convs.3.1.running_mean", "classifier.0.convs.3.1.running_var", "classifier.0.convs.4.1.weight", "classifier.0.convs.4.2.weight", "classifier.0.convs.4.2.bias", "classifier.0.convs.4.2.running_mean", "classifier.0.convs.4.2.running_var", "classifier.0.project.0.weight", "classifier.0.project.1.weight", "classifier.0.project.1.bias", "classifier.0.project.1.running_mean", "classifier.0.project.1.running_var", "classifier.1.weight", "classifier.2.weight", "classifier.2.bias", "classifier.2.running_mean", "classifier.2.running_var", "classifier.4.weight", "classifier.4.bias".
Unexpected key(s) in state_dict: "module.layers.0.weight", "module.layers.0.bias", "module.layers.2.weight", "module.layers.2.bias", "module.layers.2.running_mean", "module.layers.2.running_var", "module.layers.2.num_batches_tracked", "module.layers.3.weight", "module.layers.3.bias", "module.layers.5.weight", "module.layers.5.bias", "module.layers.5.running_mean", "module.layers.5.running_var", "module.layers.5.num_batches_tracked", "module.layers.6.weight", "module.layers.6.bias".

Cache Data to Decrease RAM

Hi @arieling,

Great job with this repository, it is awesome.

On the ReadMe page you state that:

"One can cache the data returned from prepare_data function to disk but it will increase trianing time due to I/O burden."

How would I implement this?

Thank You!
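
A minimal sketch of one way to do this, assuming prepare_data returns NumPy-compatible arrays; the cache path below is a placeholder and whether the I/O cost is acceptable depends on the disk:

import os
import numpy as np

def cached_prepare_data(prepare_data_fn, args, palette, cache_path='cache/prepare_data.npz'):
    """Wrap the repo's prepare_data so its (large) output is cached to disk between runs."""
    if os.path.exists(cache_path):
        cached = np.load(cache_path)
        return cached['features'], cached['masks'], int(cached['num_data'])
    features, masks, num_data = prepare_data_fn(args, palette)  # the original, slow call
    os.makedirs(os.path.dirname(cache_path), exist_ok=True)
    np.savez(cache_path, features=features, masks=masks, num_data=num_data)
    return features, masks, num_data

For example, main could call cached_prepare_data(prepare_data, args, palette) instead of prepare_data(args, palette).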

CUDA out of memory

How can I run the interpreter in Google Colab? I get a CUDA out of memory error.

Opt {'exp_dir': 'model_dir/face_34', 'batch_size': 1, 'category': 'face', 'debug': False, 'dim': [512, 512, 5088], 'deeplab_res': 512, 'number_class': 34, 'testing_data_number_class': 34, 'max_training': 1, 'stylegan_ver': '1', 'annotation_data_from_w': False, 'annotation_mask_path': './dataset_release/annotation/training_data/face_processed', 'testing_path': './dataset_release/annotation/testing_data/face_34_class', 'average_latent': './dataset_release/training_latent/face_34/avg_latent_stylegan1.npy', 'annotation_image_latent_path': './dataset_release/training_latent/face_34/latent_stylegan1.npy', 'stylegan_checkpoint': './checkpoints/stylegan_pretrain/karras2019stylegan-celebahq-1024x1024.for_g_all.pt', 'model_num': 10, 'upsample_mode': 'bilinear'}
MODEL_NUMBER 0 MODEL_NUMBER 1 MODEL_NUMBER 2 MODEL_NUMBER 3 MODEL_NUMBER 4 MODEL_NUMBER 5 MODEL_NUMBER 6 MODEL_NUMBER 7 MODEL_NUMBER 8 MODEL_NUMBER 9
num_sample: 1 Genearte 0 Out of: 1
tcmalloc: large alloc 5335154688 bytes == 0x55a304d12000 @ 0x7fc98fd25b6b 0x7fc98fd45379 0x7fc93469bb4a 0x7fc93469d5fa 0x7fc9369cd78a 0x7fc936c1630b 0x7fc936c5db37 0x7fc97fae6325 0x7fc97fae69cb 0x7fc97faea9fe 0x7fc97fac270b 0x55a2ac2f0045 0x55a2ac2b0c52 0x55a2ac3244d9 0x55a2ac31e9ee 0x55a2ac2b1bda 0x55a2ac320737 0x55a2ac31e9ee 0x55a2ac2b1bda 0x55a2ac320737 0x55a2ac31e9ee 0x55a2ac31e6f3 0x55a2ac3e84c2 0x55a2ac3e883d 0x55a2ac3e86e6 0x55a2ac3c0163 0x55a2ac3bfe0c 0x7fc98eb2dbf7 0x55a2ac3bfcea
/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py:2506: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
  "See the documentation of nn.Upsample for details.".format(mode))
Traceback (most recent call last):
  File "/content/datasetGAN_release/datasetGAN/train_interpreter.py", line 579, in <module>
    generate_data(opts, args.resume, args.num_sample, vis=args.save_vis, start_step=args.start_step)
  File "/content/datasetGAN_release/datasetGAN/train_interpreter.py", line 263, in generate_data
    return_upsampled_layers=True)
  File "../utils/utils.py", line 101, in latent_to_image
    affine_layers[i])
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/upsampling.py", line 131, in forward
    return F.interpolate(input, self.size, self.scale_factor, self.mode, self.align_corners)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/functional.py", line 2530, in interpolate
    return torch._C._nn.upsample_bilinear2d(input, _output_size(2), align_corners)
RuntimeError: CUDA out of memory. Tried to allocate 8.00 GiB (GPU 0; 15.90 GiB total capacity; 9.27 GiB already allocated; 5.59 GiB free; 9.38 GiB reserved in total by PyTorch)

RAM required to run training

We are planning to adapt DatasetGAN to other contexts, so we need to run the training phase to obtain the required checkpoint. That is why I would like to know whether 160 GB of RAM or VRAM is required to run the test.

What is the exact PCK used for keypoint detection

What is the exact PCK used for keypoint detection?

To be more specific, PCK means the keypoint accuracy where a keypoint is counted as "correct" if it lies within a certain distance of the GT keypoint. I cannot find the exact threshold definition in the paper.

Thanks in advance.

Conflicts between paper and code about DeeplabV3

Hi. As stated in the original paper, the part segmentation task is trained with DeepLab-V3 on an ImageNet pre-trained ResNet151 backbone. However, the code uses

classifier = torchvision.models.segmentation.deeplabv3_resnet101(pretrained=False, progress=False,
                                                                 num_classes=num_class, aux_loss=None)

Which setting is correct if I want to reproduce your results? BTW, should the pretrained flag above be set to True or not, and would it boost performance?

No detailed description of the config is given

Hi,
To run DatasetGAN, I need not only the model (.pt file) but also the .npy files for average_latent and annotation_image_latent. However, the README only describes configurations based on the four pre-prepared config files.
I understand that I need to write my own config and prepare the necessary data, but there is no explanation of how to do that.
Or is this code just a demo that is not supposed to work with other datasets?
Thank you.
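
For what it's worth, a hedged sketch of what a custom experiment config might look like, pieced together from the Opt dictionaries printed elsewhere in these issues; all paths and class counts are placeholders, and the exact keys the code expects should be checked against train_interpreter.py:

import json
import os

# Field names taken from the Opt dicts printed by the repo's scripts; values are placeholders.
custom_experiment = {
    "exp_dir": "model_dir/my_category",
    "batch_size": 64,
    "category": "my_category",
    "debug": False,
    "dim": [512, 512, 5088],                 # feature-map resolution and channel count
    "deeplab_res": 512,
    "number_class": 10,
    "testing_data_number_class": 10,
    "max_training": 16,
    "stylegan_ver": "1",
    "annotation_data_from_w": False,
    "annotation_mask_path": "./my_data/annotation/training_data",
    "testing_path": "./my_data/annotation/testing_data",
    "average_latent": "./my_data/avg_latent_stylegan1.npy",
    "annotation_image_latent_path": "./my_data/latent_stylegan1.npy",
    "stylegan_checkpoint": "./checkpoints/my_stylegan.for_g_all.pt",
    "model_num": 10,
    "upsample_mode": "bilinear",
}

os.makedirs("experiments", exist_ok=True)
with open("experiments/my_category.json", "w") as f:
    json.dump(custom_experiment, f, indent=2)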

What are the car-20 class names?

The file name_and_palette.txt only lists the car-32 classes:
32 classes: car_name = ['background', 'back bumper', 'bumper', 'car body', 'car_light_right', 'car_light_left','door_back', 'fender','door_front', 'grilles', 'back handle', 'front handle', 'hoods', 'license_plate_front', 'licence_plate_back','logo','mirror','roof','running boards', 'taillight right', 'taillight left','back wheel', 'front wheel','trunks','wheelhub_back','wheelhub_front','spoke_back', 'spoke_front', 'door_window_back', 'back windshield', 'door_window_front', 'windshield']

Can you release pretrained models?

Hi, could you please release the pretrained semantic segmentation networks?

Please understand that if your paper is going to be cited and compared against, you should either provide your test data or release your pre-trained weights.

Thank you.

checkpoint

Hi, thank you for your excellent work! It seems that the provided checkpoint is not correct.

I ran datasetGAN/test_deeplab_cross_validation.py, and the error is as follows:
"File "datasetGAN/test_deeplab_cross_validation.py", line 357, in test
    classifier.load_state_dict(checkpoint['model_state_dict'])
Missing key(s) in state_dict: "backbone.conv1.weight", "backbone.bn1.weight", "backbone.bn1.bias", "backbone.bn1.running_mean", "backbone.bn1.running_var"....
Unexpected key(s) in state_dict: "module.layers.0.weight", ..."

The parameters of the classifier and of the provided .pth model are not consistent.

Lack of definition of 8 classes of CelebAMask-HQ

Hello.

CelebAMask-HQ provides 19 classes of face annotations according to the original paper and their GitHub repo.

Label list
0: 'background' 1: 'skin' 2: 'nose'
3: 'eye_g' 4: 'l_eye' 5: 'r_eye'
6: 'l_brow' 7: 'r_brow' 8: 'l_ear'
9: 'r_ear' 10: 'mouth' 11: 'u_lip'
12: 'l_lip' 13: 'hair' 14: 'hat'
15: 'ear_r' 16: 'neck_l' 17: 'neck'
18: 'cloth'

However, in your work only 8 classes are used. How do you define those classes?

Moreover, SemanticGAN, which is also work from your lab, seems to adopt the same 8-class protocol (see the following snapshot from SemanticGAN). Do you share the same definition?

[snapshot from SemanticGAN]

BTW, let me make a guess: I suspect your mask labels are defined as follows and can be obtained by directly merging the original 19-class masks.

8-class ID and the IDs it corresponds to in the original 19 classes
0: 'background' 0, 3, 14, 15, 16, 17, 18
1: 'skin' 1
2: 'nose' 2
3: 'eye' 4, 5
4: 'brow' 6, 7
5: 'ear' 8, 9
6: 'mouth' 10, 11, 12
7: 'hair' 13

Please reply whether I have this right. Also, if I am correct, how do you deal with eye_g (i.e. glasses) from the original setting, since assigning glasses to background will lead to aliasing in the label image?

Here is a snapshot from the original CelebAMask-HQ with glasses.
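
If the guessed mapping above is right, a minimal sketch of the corresponding merge, assuming the 19-class masks are HxW integer label images; the mapping is the guess from this issue, not something confirmed by the authors:

import numpy as np

# Guessed mapping from the 19 CelebAMask-HQ classes to the 8-class protocol above.
MERGE_19_TO_8 = {
    0: 0, 3: 0, 14: 0, 15: 0, 16: 0, 17: 0, 18: 0,   # background and accessories
    1: 1,                                             # skin
    2: 2,                                             # nose
    4: 3, 5: 3,                                       # eyes
    6: 4, 7: 4,                                       # brows
    8: 5, 9: 5,                                       # ears
    10: 6, 11: 6, 12: 6,                              # mouth and lips
    13: 7,                                            # hair
}

def merge_mask(mask_19: np.ndarray) -> np.ndarray:
    """Map an HxW array of 19-class IDs to the guessed 8-class IDs via a lookup table."""
    lut = np.zeros(19, dtype=np.uint8)
    for src, dst in MERGE_19_TO_8.items():
        lut[src] = dst
    return lut[mask_19]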

Some questions about the keypoint detection experiments

Nice work and thanks for sharing the code and data! I have a few questions regarding the keypoint detection experiments in table 2:

  1. The exact definition of PCK: I noticed that different papers are using slightly different versions of PCK in terms of the threshold (e.g., the keypoint is considered correct if its distance to GT is within X% of the longer side of an image in some works, while others use the diagonal length of an image as a reference). Could you clarify the PCK metric used in your paper?

  2. Is it possible to share the keypoint annotations and images for comparisons?

Question on the StyleGAN version used

Hello,

In the "README.md" file, the url links which are linked to the StyleGAN point to the Github of the "StyleGAN V1", but on the Google Drive, some files seem to be rather linked to the "StyleGAN V2" (ex. "Datasetgan_release_checkpoints/styegan_pretrain/stylegan2-ffhq-config-f.pt" OR "datasetGAN_data_release/training_latent/car20/latent_stylegan2_encoder.npy").

The DatasetGAN can probably work with any version of StyleGAN, but for the research and code of this Github, which version of StyleGAN was used ?

Experiment on stylegan3

Hello,

Thanks for your great work. Have you conducted experiments with StyleGAN3, and does the method still work there?

About the test data in your paper

In your paper, the testing datasets are selected as subsets of ADE20K and PASCAL, such as ADE-Car-12 and ADE-Car-5. How can these datasets be obtained? An identical testing dataset is essential for a fair comparison. Thank you.

Can I reduce the number of input features of the final classification layer?

At present, I am training with 256 x 256 x 3 images, and the final StyleGAN feature output is 256 x 256 x 4864. Due to memory limits, I can only load the features of about 100 images at a time with 400 GB of memory, and it takes dozens of minutes. Is there any way to use more images for training? For example, is it feasible to reduce the number of input features?
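
One possible workaround, which is not part of this repo, is to randomly subsample pixels per image before building the training matrix, since the pixel classifier is trained on individual pixels anyway; a minimal sketch, assuming per-image features of shape [H*W, C] and masks of shape [H*W]:

import numpy as np

def subsample_pixels(feature_maps, mask, max_pixels=20000, rng=None):
    """Keep a random subset of pixels from one image to reduce memory.

    feature_maps: [H*W, C] array of per-pixel features; mask: [H*W] array of labels.
    """
    rng = rng or np.random.default_rng(0)
    num_pixels = feature_maps.shape[0]
    if num_pixels <= max_pixels:
        return feature_maps, mask
    idx = rng.choice(num_pixels, size=max_pixels, replace=False)
    return feature_maps[idx], mask[idx]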

"latent_stylegan1.npy" mismatch with image and label

The training process mentioned in #19 (comment) consists of the following steps:

- Step 0: Train StyleGAN V1 with your own data set to get a pre-trained StyleGAN (or download instead a pre-trained StyleGAN provided by the Github of the StyleGAN V1 or any other sources)

- Step 1: Follow the instructions in the link for [Tensorflow -> PyTorch] conversion of the pre-trained StyleGAN weights file in issue #1 . This will give you the ".pt" file you need to perform all of the following steps.

- Step 2: **avg_latent_stylegan1.npy**:
  - First, you need to load the pre-trained StyleGAN you got in Step 1. The initialization will however exclude the use of the threshold for this step (as is also done in Step 1 if you want to test the pre-trained StyleGAN).
  - For this step, you only need to use the "G_mapping" part (the mapping network) of "g_all" (the complete StyleGAN V1).
  - You need to do a for loop over "range(0, 8000)" where on each iteration you compute a random "Z latent code" with "np.random.randn(1, 512)" (standard Gaussian, N(0, 1)), send it as input to "G_mapping", and capture its output (the "W latent code").
  - You then simply calculate the average of the 8,000 "W latent codes" that you obtained, and store this result in a NumPy file that you can name "avg_latent_stylegan1.npy" (or any other name). (See the code sketch after these quoted steps.)

- Step 3: **latent_stylegan1.npy** + generation of fake images:
   - First, you need to load the pre-trained StyleGAN you got in Step 1. This time, however, you will include the threshold in the architecture, since you now have the "avg_latent" needed for its initialization.
   - Make a for loop over "range(0, nb_img_you_want)" where at each iteration you first create a random "Z latent code" ("latent = np.random.randn(1, 512)") and then pass it as input to the StyleGAN using the "latent_to_image" function (from the "utils.utils" file of this GitHub repo). During this loop, you must store the latent codes linked to the creation of each image, since they will be saved in the "latent_stylegan1.npy" file. You also need to save the generated images, as you will then need to label them manually (using, for example, LabelMe: https://github.com/wkentaro/labelme).
   - If you choose to label only a subset of all generated images, then "latent_stylegan1.npy" should contain only the latent codes related to the images you have chosen, but this can be done later, after you have manually labelled some of the images generated by the pre-trained StyleGAN.

- Step 4: Manually label some fake images generated by the StyleGAN (by using, for example, LabelMe https://github.com/wkentaro/labelme) in order to obtain their corresponding "mask".

- Step 5: Now that you have "avg_latent_stylegan1.npy", "latent_stylegan1.npy", the pre-trained StyleGAN, and some (image, mask) pairs for the DatasetGAN training dataset, you can run the DatasetGAN training via "train_interpreter.py" from this GitHub repo. Some small modifications might be necessary (e.g. the file extension used for the images, which could differ from ".jpg").

I hope you are now better able to implement your own implementation of all of these parts. If you haven't already, I strongly suggest that you read the StyleGAN V1 paper first, then read and understand all the code in the "train_interpreter.py" file on this Github, before you start implementing all of these steps.

_Originally posted by @PoissonChasseur in https://github.com/nv-tlabs/datasetGAN_release/issues/19#issuecomment-892114720_
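
A minimal sketch of Steps 2 and 3 above, assuming a loaded g_all whose mapping submodule is exposed as g_all.module.g_mapping (mirroring the g_synthesis attribute seen in the tracebacks in other issues); details such as the threshold/truncation handling and the exact W-latent shape are simplified:

import numpy as np
import torch

def compute_avg_latent(g_mapping, n_samples=8000, latent_dim=512, device='cuda'):
    # Step 2: average the W codes of n_samples random Z codes (-> avg_latent_stylegan1.npy).
    w_codes = []
    with torch.no_grad():
        for _ in range(n_samples):
            z = torch.from_numpy(np.random.randn(1, latent_dim)).float().to(device)
            w_codes.append(g_mapping(z).cpu().numpy())
    return np.mean(np.concatenate(w_codes, axis=0), axis=0, keepdims=True)

def sample_training_latents(n_images, latent_dim=512):
    # Step 3: draw and keep the Z codes whose images will be labelled (-> latent_stylegan1.npy).
    return np.stack([np.random.randn(latent_dim) for _ in range(n_images)], axis=0)

# Example usage (g_all and latent_to_image come from the repo; attribute names may differ):
# np.save('avg_latent_stylegan1.npy', compute_avg_latent(g_all.module.g_mapping))
# latents = sample_training_latents(30)
# np.save('latent_stylegan1.npy', latents)
# Each latents[i] is then passed through latent_to_image (as in train_interpreter.py) and the
# returned image saved to disk for manual labelling.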

I followed these steps and found that the generated dataset is very strange... so I checked the code and found the following:

  1. "make_training_data.py" does not save the latent code for the first image (image_0.jpg) in "latent_stylegan1.npy".
  2. "train_interpreter.py" loads the latent codes from "latent_stylegan1.npy" and, at the same time, loads the image starting from "image_0.jpg" to generate the input:
    im_frame = np.load(os.path.join(args['annotation_mask_path'], name))

Question on how to get the color palettes

Hi,

The "train_interpreter.py" file uses at different times the "palette" variable which is obtained from the "utils.data_util.py" file. I would like to know if there is an easy way to get these values (eg via LabelMe) OR are they just arbitrary values chosen manually (for example) ?

Some minor modifications for the "models/stylegan1.py" file and questions about the StyleGAN code

Hello,

When I was analyzing all the code in this Github, there are some parts that I had to fix in the "models/stylegan1.py" file:

Class LayerEpilogue:
-> Change "layers.append(('pixel_norm', PixelNorm()))" to "layers.append(('pixel_norm', PixelNormLayer()))".

Class G_synthesis:
-> The Spyder IDE warns that the variable "last_channels" may be unset because it is not initialized before the for loop in "__init__". This is technically not an error here, but I fixed it to remove the warning.
-> There are also a few unused variables: randomize_noise, num_styles, torgbs, and batch_size (in the "forward" method).

Class Downscale2d:
-> The undefined variable "factor" in the "forward" method is probably meant to be "self.factor".

Also, I did not find any code to train and test the StyleGAN itself. Are you just using the original StyleGAN GitHub code for that?
If so, why is all this StyleGAN-related code defined here? (Note that I haven't read the StyleGAN paper or code yet.)

Lacking documentation on how to create the average latent file

The example configuration files (https://github.com/nv-tlabs/datasetGAN_release/tree/master/datasetGAN/experiments), e.g. "cat_16.json" or "car_20.json", contain fields named "average_latent" and "annotation_image_latent_path". Both of these fields are paths to .npy files.

This repo does not describe how these files are generated. Even after training a StyleGAN model, one cannot use DatasetGAN without them. Please provide documentation on how to create these files for a custom dataset.

I imagine you'd want to provide examples using your up-to-date repos, e.g. https://github.com/NVlabs/stylegan2-ada-pytorch and/or https://github.com/NVlabs/stylegan2-ada.

Cat example dies on lines 449 - 450 of train_interpreter.py

The documentation suggests that we should be able to run python train_interpreter.py --exp experiments/cat_16.json. When I do so, the process dies without an error message at lines 449-450 of train_interpreter.py (see below).

train_data = trainData(torch.FloatTensor(all_feature_maps_train_all),
                       torch.FloatTensor(all_mask_train_all))

Can anyone suggest ideas for why this is happening? How about workarounds?

Bedroom data

Hi,

Could you please share the bedroom train and test data with the corresponding annotations? As far as I can see, the annotation in the supplementary materials doesn't fully correspond to the one described in the .txt file, so it's not clear how to annotate the images properly to reproduce the results.

How to reduce the number of input features of pixel_classifier

Hi,

After training my own StyleGAN on 128x128 images in order to then use DatasetGAN, the code in "train_interpreter.py" reports that the total number of features going into the pixel_classifier (the value dim[-1]) would be 79,691,776, which is far too large and not comparable to the values in the provided configuration files (.json files).

Is there a way to reduce or correct this value?

Note: this value of 79,691,776 (= 128 x 128 x 4864) is the one given by the line "feature_maps = feature_maps.reshape(-1, args["dim"][2])" in the "prepare_data" function.

My system halted

The code seems to take a large amount of memory, which makes my computer halt. How large should the memory capacity be?

Control graphics memory usage

I got a CUDA OOM while running train_interpreter.py. I found a "batch_size" variable in train_interpreter.py, but changing it doesn't seem to help. Can I limit the memory usage (for example, from the JSON config)? I have a 2080 Ti with 11 GB of GPU memory.
Thank you.

Opt {'exp_dir': 'model_dir/face_34', 'category': 'face', 'debug': False, 'dim': [512, 512, 5088], 'deeplab_res': 512, 'number_class': 34, 'testing_data_number_class': 34, 'max_training': 16, 'stylegan_ver': '1', 'annotation_data_from_w': False, 'annotation_mask_path': './dataset_release/annotation/training_data/face_processed', 'testing_path': './dataset_release/annotation/testing_data/face_34_class', 'average_latent': './dataset_release/training_latent/face_34/avg_latent_stylegan1.npy', 'annotation_image_latent_path': './dataset_release/training_latent/face_34/latent_stylegan1.npy', 'stylegan_checkpoint': './dataset_release/stylegan_pretrain/karras2019stylegan-ffhq-1024x1024.for_g_all.pt', 'model_num': 10, 'upsample_mode': 'bilinear'}
/home/udemegane/anaconda3/envs/dataset/lib/python3.7/site-packages/torch/nn/functional.py:2973: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
  "See the documentation of nn.Upsample for details.".format(mode))
Traceback (most recent call last):
  File "train_interpreter.py", line 578, in <module>
    main(opts)
  File "train_interpreter.py", line 446, in main
    all_feature_maps_train_all, all_mask_train_all, num_data = prepare_data(args, palette)
  File "train_interpreter.py", line 403, in prepare_data
    return_upsampled_layers=True, use_style_latents=args['annotation_data_from_w'])
  File "../utils/utils.py", line 94, in latent_to_image
    affine_layers_upsamples = torch.FloatTensor(1, number_feautre, dim, dim).cuda()
RuntimeError: CUDA out of memory. Tried to allocate 4.97 GiB (GPU 0; 10.76 GiB total capacity; 7.61 GiB already allocated; 2.33 GiB free; 7.65 GiB reserved in total by PyTorch)

Missing pickle files for training deeplab

Hello,
Thank you so much for providing the code for this paper :)
I am trying to generate an annotated face dataset, but I am having some trouble with train_deeplab.py.
Around line 95 there is all_pickle = glob.glob(data_path + '/*.pickle'), which, if I understood the docs correctly, loads the 16 manually annotated samples. However, I can't seem to find these .pickle files.
I guess one could load the training/test data provided in dataset_release, but it would still be missing the uncertainty_score.
