Comments (28)

yuval-alaluf commented on July 24, 2024

Ok.
First, notice that you are using transforms_config.FFHQEncodeTransforms, which defines transform_source as None. Now, refer to ImagesDataset:

if self.source_transform:
	# a source transform is defined, so the source image is used
	from_im = self.source_transform(from_im)
else:
	# no source transform: the source image is overwritten with the target image
	from_im = to_im

You will notice that if self.source_transform is None, we set from_im = to_im in line 31. And since transform_source is defined as None, we fall exactly into that case.

Therefore, you're actually not using the paired data you defined! That is, you're "overwriting" the source data with the target data. That is why your logs show only the toon images (since those are your target images).

A couple of questions you may have:

  1. Why did we do it this way? Because in ffhq_encode, the task at hand is to "reconstruct" the images. Therefore, here it makes sense to set from_im = to_im.
  2. Why did we tell you to use ffhq_encode for the toonify task? Because as I mentioned above, we don't use paired toons data. We simply use the real FFHQ data and therefore it was fine to set the source images equal to the target images.

However, this is not what you want to do. What you need to do is define your own transform class where your transforms_dict is something like:

transforms_dict = {
			'transform_gt_train': transforms.Compose([
				transforms.Resize((256, 256)),
				transforms.ToTensor(),
				transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])]),
			'transform_source': transforms.Compose([
				transforms.Resize((256, 256)),
				transforms.ToTensor(),
				transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])]),
			'transform_test': transforms.Compose([
				transforms.Resize((256, 256)),
				transforms.ToTensor(),
				transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])]),
			'transform_inference': transforms.Compose([
				transforms.Resize((256, 256)),
				transforms.ToTensor(),
				transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])])
}

This way, transform_source is not None and you do not set from_im = to_im in ImagesDataset.
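
For reference, here is a minimal sketch of what such a ToonifyTransforms class could look like if added to configs/transforms_config.py alongside the existing classes (the TransformsConfig base class and the torchvision import are assumed to match the repo's file):

from torchvision import transforms  # already imported at the top of transforms_config.py

class ToonifyTransforms(TransformsConfig):

	def __init__(self, opts):
		super(ToonifyTransforms, self).__init__(opts)

	def get_transforms(self):
		# All four transforms are identical here; what matters is that
		# 'transform_source' is defined, so ImagesDataset keeps the real
		# source images instead of overwriting them with the targets.
		base_transform = transforms.Compose([
			transforms.Resize((256, 256)),
			transforms.ToTensor(),
			transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])])
		return {
			'transform_gt_train': base_transform,
			'transform_source': base_transform,
			'transform_test': base_transform,
			'transform_inference': base_transform,
		}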

Then, you can make a new dataset type in data_configs.py. Something like:

'toonify': {
		'transforms': transforms_config.ToonifyTransforms,
		'train_source_root': dataset_paths['real_train'],
		'train_target_root': dataset_paths['toons_train'],
		'test_source_root': dataset_paths['real_test'],
		'test_target_root': dataset_paths['toons_test'],
},
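
(For this to work, the corresponding keys would also need to exist in dataset_paths in configs/paths_config.py; a sketch with placeholder paths:)

dataset_paths = {
	# ... existing entries ...
	'real_train': '/path/to/real/train',
	'toons_train': '/path/to/toons/train',
	'real_test': '/path/to/real/test',
	'toons_test': '/path/to/toons/test',
}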

And call your training script using --dataset_type=toonify.
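
For example, a full invocation might look something like this (a sketch only: the flags follow the training options documented in the repo's README, the lambda values are illustrative, and the paths are placeholders):

python scripts/train.py \
--dataset_type=toonify \
--exp_dir=/path/to/experiment \
--workers=8 \
--batch_size=8 \
--test_batch_size=8 \
--test_workers=8 \
--val_interval=2500 \
--save_interval=5000 \
--start_from_latent_avg \
--lpips_lambda=0.8 \
--l2_lambda=1 \
--id_lambda=0.1 \
--w_norm_lambda=0.005 \
--stylegan_weights=/path/to/toonify/stylegan_weights.pt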

To make sure everything works as expected, you should see in the logs the real source image next to the toons target image, followed by the output toons image.

Very long answer, but I hope it helps clear up any confusion. Let me know if you have any further questions. 😄


justinpinkney commented on July 24, 2024

Would be very interested to see how the results of your training run turn out!


yuval-alaluf commented on July 24, 2024
  1. Yes. Now I understand what your data paths are, and they look good.
  2. You are correct regarding the logs not being correct. I think I know what the issue is. I will look into it and explain in a bit 😄


yuval-alaluf commented on July 24, 2024

I'm pretty sure the test data is not aligned. There are a lot of images that are very rotated with respect to the corresponding toon image.
I would run realignment again on the test set to make sure. The train data seems fine.
Notice that you don't need to retrain the model. Simply take the trained model and run inference on the test data after realignment.
Regarding adding more data, I wouldn't rush to add more data before you know that alignment isn't the problem.
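
If it helps, here is a minimal sketch of realigning a single image with the repo's alignment helper, as done in the inference notebook (assuming scripts/align_all_parallel.py and a locally downloaded dlib landmarks model):

import dlib
from scripts.align_all_parallel import align_face

# dlib's 68-point facial landmarks model, downloaded separately
predictor = dlib.shape_predictor('shape_predictor_68_face_landmarks.dat')
# align_face returns the aligned face as a PIL image
aligned_image = align_face(filepath='test/0001.jpg', predictor=predictor)
aligned_image.save('test_aligned/0001.jpg')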


yuval-alaluf commented on July 24, 2024

Sounds good. Since you also have paired data, I invite you to play around with the loss lambdas. You may be able to get better results with a different combination. Specifically, I would recommend starting by decreasing the w_norm_lambda a bit (maybe to 0.01?).
In any case, I feel like we can close this issue as it seems like we've solved the issue you had. Feel free to reopen the issue if needed and I am looking forward to seeing your results!
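
For example, keeping everything else identical and overriding just that one flag on the training command:

python scripts/train.py ... --w_norm_lambda=0.01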


yuval-alaluf commented on July 24, 2024

I ran the source image you linked above through the toonified model we uploaded and I got a similar result to what you got using your trained model. Therefore, I don't think you're doing anything wrong.
I will say that our toonified model is not trained on pairs of (real, toon) and therefore we run for a small number of steps. If you have paired data, you may find that running for more iterations will improve the results.
Regarding making the eyes bigger, this all depends on the data you use for training. If your target toons have larger eyes, you are more likely to generate toons with larger eyes. Our toonify model tends not to produce smaller eyes since we did not use paired data.
I hope this helps.


justmaulik commented on July 24, 2024

I tried running it for longer, but the result did not improve after 27k steps (best). You can check the training run here.
And since you mentioned that your toon model is not trained on pairs, how does the training work? Can I train the model by just providing target images?
In the path config it goes like this:

'ffhq_train': '/content/drive/My Drive/Style/pixel2style2pixel/Data/Train/Train_IN',
'ffhq_target': '/content/drive/My Drive/Style/pixel2style2pixel/Data/Train/Target_IN',
'toon_train': '/content/drive/My Drive/Style/pixel2style2pixel/Data/Test/Test_IN',
'toon_target': '/content/drive/My Drive/Style/pixel2style2pixel/Data/Test/Target_IN',

In the data config file we provided details like this:

'ffhq_encode': {
		'transforms': transforms_config.EncodeTransforms,
		'train_source_root': dataset_paths['ffhq_train'],
		'train_target_root': dataset_paths['ffhq_target'],
		'test_source_root': dataset_paths['toon_train'],
		'test_target_root': dataset_paths['toon_target'],
	},


yuval-alaluf commented on July 24, 2024

Ah. I think I see the problem now.

  1. Based on the logs you linked, it appears that you are trying to encode toon images. That is, you're training using pairs of (toon, toon). However, if I understand you correctly, you want to train using pairs of (real, toon). That is, the training source data should be real face images while the training target data should be the corresponding toon images.
  2. The data config you specified above seems a bit strange if I understood it correctly. I believe what you want is something like:
'ffhq_encode': {
		'transforms': transforms_config.EncodeTransforms,
		'train_source_root': dataset_paths['ffhq_train'],
		'train_target_root': dataset_paths['ffhq_toons_train'],
		'test_source_root': dataset_paths['real_test'],
		'test_target_root': dataset_paths['toons_test'],
	},

where ffhq_toons_train is the path to the toons images corresponding to the FFHQ data, real_test is the path to the test set containing real face images, and toons_test is the path to the test set containing the corresponding toons images.

As for how we trained our toons model with no paired data: we train our toons model exactly like the ffhq_encode task but replace the FFHQ StyleGAN model with the toons StyleGAN model. That is, we train using only real face images.


justmaulik commented on July 24, 2024
  1. I think the structure you described in point 2 is the same as what we used for the above training.
    The structure below is what I put in the data config file, which I am pretty sure should be correct. Or am I doing something wrong?
    ref

  2. The images in the logs seem confusing: the input is a toon image, whereas it should be a real image, shouldn't it?

justmaulik commented on July 24, 2024

Thanks @yuval-alaluf, after changing the transform class and the toonify dict as you described, the logs are now as expected.
The other thing I was wondering is how I can train with just a StyleGAN weights file (.pt) and without paired images, as you did in your toon model.


yuval-alaluf commented on July 24, 2024

@justmaulik ,
Not sure what you mean by "just stylegan weights file".
But if you want to reproduce the toons model we trained with no paired data, you need to specify only a few things:

  1. Make sure --dataset_type is ffhq_encode
    - this will make the training data simply real FFHQ face images, with no paired toons data.
  2. Set --stylegan_weights to the file containing the toonified StyleGAN generator that we linked in the repo (here is the link to download it).
  3. Set the losses to what we specified in the README.

Basically, the main difference is that instead of using the FFHQ StyleGAN generator, you will use the Toons StyleGAN generator.
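
Concretely, reproducing that setup might look something like this (a sketch only; take the exact loss lambdas from the toonify example in the README, the values and paths here are illustrative):

python scripts/train.py \
--dataset_type=ffhq_encode \
--exp_dir=/path/to/experiment \
--start_from_latent_avg \
--lpips_lambda=0.8 \
--l2_lambda=1 \
--id_lambda=0.1 \
--w_norm_lambda=0.025 \
--stylegan_weights=/path/to/toonify_stylegan.pt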


justmaulik commented on July 24, 2024

Thanks, @yuval-alaluf. I actually have a toon StyleGAN model, created by following @justinpinkney's blog post, that turned out really good but not perfect (~10% bad images). I was curious to see how the results change with pixel2style2pixel.

I did the training and the logs look better than the previous runs, but it's not learning anymore: the best loss was achieved at the start of training and then just stayed flat. The current dataset count is ~920 pairs in train and 100 pairs in test.

  1. Is the training data now enough?
  2. I am using the StyleGAN model I mentioned above as --stylegan_weights.
  3. The images in the test and train data are generated using the same stylegan_weights.
  4. The output in the test folder looks like the same few faces with only slight changes, and is far from matching the input.

Waiting for the follow-up post on @justinpinkney's blog :) Or, if possible, could you explain a bit about how you managed to achieve your results so I can give that method a try? 😊


yuval-alaluf commented on July 24, 2024

Looks much better on the train set!
I think the problem you're getting on the test set is that the real image data is not aligned. Other than that, everything else looks good.
I believe that if you align the data you should see much better results.
Let me know if this solves the problem. I'm interested in seeing your results!


justmaulik commented on July 24, 2024

All the images used were tagged as aligned in Nvidia's repo. I think they're just missing the blur effect. Still, I will start a new run after aligning all the images and will share the results.
The output in the train folder is perfect, but in the test folder it's still a bit far from perfect.
If I add a more varied set of data pairs (5k-10k pairs), will it improve significantly? Or are 900 pairs more than enough for this type of training?


justmaulik commented on July 24, 2024

I tried aligned images but it did not improve the results as expected. It's more random: some images give good results, but most are just like this. The output is good but far from the input. I will try with a larger dataset and share the results.


justmaulik commented on July 24, 2024

> @justmaulik Were you able to solve this? I am facing the same issue. I trained with 3k pairs, but the output still does not exactly match the input.

No, I tried a few things but did not achieve the results I was aiming for. At some point, I trained with ~20k pairs (40k images) but the results did not improve much.


Nerdyvedi commented on July 24, 2024

@yuval-alaluf Can we reopen this issue? I am facing the same problem. The results look good in the train logs, but on the test set the output seems to have no connection with the input.


yuval-alaluf commented on July 24, 2024

Hi @Nerdyvedi ,
I'll be happy to help, but since your question is not specifically about the ffhq_encode task, I think the best thing to do is to open a new issue where you can provide me some more details as to what you're trying to do, the training setting, and the results you're seeing on the train/test.
I will try to help out as much as possible.


cvmlddev commented on July 24, 2024

@yuval-alaluf Would it be possible for you to share your dataset? Your dataset looks much better than ours. How did you create it?
I am close to getting good results, but my dataset is not good enough.


yuval-alaluf commented on July 24, 2024

Hi @cvmlddev,
As I mentioned in your issue (#83), I think part of the reason you're getting unsatisfactory results is the use of paired data.
Regarding the data, for our toonify model we actually didn't use any paired data at all. The training was done using only real images from FFHQ with the toonify StyleGAN generator. This allowed for more flexible results during inference.


cvmlddev commented on July 24, 2024

Oops, sorry, I meant to tag @justmaulik. @justmaulik, would it be possible for you to share your paired toon dataset?


justmaulik commented on July 24, 2024

@cvmlddev Sure, I had 20k pairs (40k images) but can't find them now; I will share them if I find them somewhere.
Here is the 1k set I just found.


cvmlddev commented on July 24, 2024

@justmaulik Thanks a lot!
How did you generate it?
Also, I wonder why you are not getting good results; I was able to get decent results with the default parameters.


justmaulik commented on July 24, 2024

I trained an FFHQ model with a small dataset, then used it to create paired data for training pixel2style2pixel.
The problem I faced was that the results were not close to the input: they were good, but far from the input.
Here I found just the toon folder :) maybe it can help you somehow.


cvmlddev commented on July 24, 2024

Thanks a lot!


saschaglo commented on July 24, 2024

@justmaulik again, one year later: did you get better results in the end, or did you end up using a completely different approach?


leslie-ds commented on July 24, 2024

How can I generate cartoon images like those in https://drive.google.com/drive/folders/1-6gZXiSDwT8hJxJcwqdGd2qErm-608rL ? Thanks!


goldwater668 commented on July 24, 2024

> @cvmlddev Sure, I had 20k pairs (40k images) but can't find them now; I will share them if I find them somewhere. Here is the 1k set I just found.

Hi, can you share your dataset again? I can't find it here. Thanks!
