Comments (9)

LeandreSassi commented on August 25, 2024

I found the solution here dvschultz/stylegan2-ada-pytorch#45 (comment)

Thanks again. Maybe this could be helpful for updating your Colab notebook!

Have a good day,

Leandre Sassi

from steam-stylegan2-ada-pytorch.

woctezuma commented on August 25, 2024

Alright, I believe the notebook works now. I am not sure if one has to downgrade the PyTorch version.

LeandreSassi commented on August 25, 2024

No need to downgrade Python or PyTorch ;) I just trained a model and it all went well.

woctezuma commented on August 25, 2024

Hello!

Warning

First of all, know that training a model on Google Colab is painful. It is much more convenient to use a paid platform if you don't want to deal with sessions closing unexpectedly, frequently resuming training, underpowered machines, etc.

Important

Second, I did not have time to post the results which I had obtained. From what I can remember, they looked like the ones in my other repositories at steam-stylegan2 (game banners) and at steam-lightweight-gan (Steam-OneFace-small dataset). The illustration in the current repository is a projection using a model pre-trained by Nvidia, without game banners!

Tip

Third, there may be better methods nowadays, either via a newer version of StyleGAN or with diffusion models.

In the cell below, after running it with your custom datasets I get the error posted in the title of my Issue.

I will have a look.

Also, do you really need to give it a resume file when you want to train a model for the first time?

This is done to perform transfer learning from FFHQ trained at 256x256. See the documentation by Nvidia at:

Transfer learning is not mandatory, but it may help reach a satisfying result without using too much computation time.

This is mostly relevant if the original dataset (here, FFHQ, which contains faces) is similar to your own dataset (e.g. banners of Steam games which feature a single prominent face). For the GIF illustration shown on the README, you can see projections obtained "with a network pre-trained by Nvidia on the LSUN DOG dataset", hence why I chose banners of games which featured a dog.
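To make the optional nature of the resume file concrete, here is a minimal sketch that assembles a `train.py` command line with or without `--resume`. The helper name `build_train_command` and both paths are hypothetical, and the flag values are an assumption about a typical Colab run, not the notebook's actual cell.

```python
import shlex


def build_train_command(data_path, out_dir, resume="ffhq256", gpus=1, snap=10):
    """Assemble a train.py invocation as a list of arguments.

    Pass resume=None to train from scratch instead of transfer learning.
    (Hypothetical helper, not part of the repository.)
    """
    cmd = [
        "python", "train.py",
        f"--outdir={out_dir}",
        f"--data={data_path}",
        f"--gpus={gpus}",
        f"--snap={snap}",
    ]
    if resume is not None:
        # Start from pre-trained weights rather than a random initialization.
        cmd.append(f"--resume={resume}")
    return cmd


# Hypothetical dataset and output locations.
transfer = build_train_command("/content/datasets/banners.zip", "/content/training-runs")
scratch = build_train_command("/content/datasets/banners.zip", "/content/training-runs", resume=None)

print(shlex.join(transfer))
print(shlex.join(scratch))
```

The only difference between the two commands is the `--resume` argument, so dropping it is enough to train a model for the first time without a pre-trained network.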

woctezuma commented on August 25, 2024

Alright, I see the same error message as you saw.

Traceback (most recent call last):
  File "/content/stylegan2-ada-pytorch/train.py", line 556, in <module>
    main() # pylint: disable=no-value-for-parameter
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/content/stylegan2-ada-pytorch/train.py", line 549, in main
    subprocess_fn(rank=0, args=args, temp_dir=temp_dir)
  File "/content/stylegan2-ada-pytorch/train.py", line 399, in subprocess_fn
    training_loop.training_loop(rank=rank, **args)
  File "/content/stylegan2-ada-pytorch/training/training_loop.py", line 299, in training_loop
    loss.accumulate_gradients(phase=phase.name, real_img=real_img, real_c=real_c, gen_z=gen_z, gen_c=gen_c, sync=sync, gain=gain)
  File "/content/stylegan2-ada-pytorch/training/loss.py", line 131, in accumulate_gradients
    (real_logits * 0 + loss_Dreal + loss_Dr1).mean().mul(gain).backward()
  File "/usr/local/lib/python3.10/dist-packages/torch/_tensor.py", line 492, in backward
    torch.autograd.backward(
  File "/usr/local/lib/python3.10/dist-packages/torch/autograd/__init__.py", line 251, in backward
    Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: derivative for aten::grid_sampler_2d_backward is not implemented

As well as a few warnings:

Creating output directory...
Launching processes...
Loading training set...

/usr/local/lib/python3.10/dist-packages/torch/utils/data/sampler.py:64: UserWarning: `data_source` argument is not used and will be removed in 2.2.0.You may still have custom implementation that utilizes it.
  warnings.warn("`data_source` argument is not used and will be removed in 2.2.0."

/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py:557: UserWarning: This DataLoader will create 3 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
  warnings.warn(_create_warning_msg(

Num images:  2472
Image shape: [3, 256, 256]
Label shape: [0]

And many times:

/content/stylegan2-ada-pytorch/torch_utils/ops/conv2d_gradfix.py:55: UserWarning: conv2d_gradfix not supported on PyTorch 2.1.0+cu121. Falling back to torch.nn.functional.conv2d().
  warnings.warn(f'conv2d_gradfix not supported on PyTorch {torch.__version__}. Falling back to torch.nn.functional.conv2d().')

And a few times:

/content/stylegan2-ada-pytorch/torch_utils/ops/grid_sample_gradfix.py:39: UserWarning: grid_sample_gradfix not supported on PyTorch 2.1.0+cu121. Falling back to torch.nn.functional.grid_sample().
  warnings.warn(f'grid_sample_gradfix not supported on PyTorch {torch.__version__}. Falling back to torch.nn.functional.grid_sample().')
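The DataLoader warning above is unrelated to the crash, but it can be silenced by requesting no more workers than the machine offers. The sketch below assumes the suggested maximum tracks the CPU count; `clamp_workers` is a hypothetical helper, and its result would be passed to the `--workers` option of `train.py`.

```python
import os


def clamp_workers(requested, available=None):
    """Return a worker count that does not exceed the available CPUs.

    `available` defaults to os.cpu_count(); this is an assumption about how
    the suggested maximum is derived, not PyTorch's exact rule.
    """
    if available is None:
        available = os.cpu_count() or 1
    # Never go below one worker, never above what the machine offers.
    return max(1, min(requested, available))


# The log above reports 3 requested workers and a suggested maximum of 2.
workers = clamp_workers(3, available=2)
print(f"--workers={workers}")
```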

woctezuma commented on August 25, 2024

Related issues:

Related pull request, with a simple fix which may work:

It seems that one has to use an earlier version of PyTorch (version 1), while version 2 is installed by default on Colab nowadays.
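A quick way to see which side of that version boundary a session is on is to inspect the version string from the warnings (`2.1.0+cu121`). This is a minimal sketch: both helper names are hypothetical, and the prefix list is an assumption about which releases the custom ops accept, not something stated in this thread.

```python
def torch_major_version(version_string):
    """Extract the major version from a string such as '2.1.0+cu121'."""
    return int(version_string.split(".", 1)[0])


def uses_fallback(version_string, supported_prefixes=("1.7.", "1.8.", "1.9")):
    """Return True when the version matches none of the supported prefixes.

    The default prefixes are an assumption made for illustration.
    """
    return not any(version_string.startswith(p) for p in supported_prefixes)


colab_version = "2.1.0+cu121"  # value reported in the warnings above
print(torch_major_version(colab_version))
print(uses_fallback(colab_version))
```

In a live session, the string would come from `torch.__version__` instead of being hard-coded.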

LeandreSassi commented on August 25, 2024

Hi Woctezuma,

Thanks a lot for your response. I saw Jeff Heaton training his GANs on Colab Pro. Is that what you mean by a paid platform? If not, which would you recommend?

Thanks !

Leandre Sassi

woctezuma commented on August 25, 2024

Hello!

I saw Jeff Heaton training his GANs on Colab Pro. Is that what you mean by a paid platform? If not, which would you recommend?

I have only had some experience with OVH's AI Training and AI Notebooks, as I had a chance to try their platform for free when they were beta-testing their service near the official launch.

I don't have any experience with other platforms, so I do not know enough to recommend one over another. 😅

I found the solution here dvschultz/stylegan2-ada-pytorch#45 (comment)

Thanks again. Maybe this could be helpful for updating your Colab notebook!

Thank you! I will have a look at:

as well as:

Have a nice day!

woctezuma commented on August 25, 2024

For info, the following commit:

fixes this warning which appeared many times:

/content/stylegan2-ada-pytorch/torch_utils/ops/conv2d_gradfix.py:55: UserWarning: conv2d_gradfix not supported on PyTorch 2.1.0+cu121. Falling back to torch.nn.functional.conv2d().
  warnings.warn(f'conv2d_gradfix not supported on PyTorch {torch.__version__}. Falling back to torch.nn.functional.conv2d().')
