Comments (28)
Ok.
First, notice that you are using transforms_config.FFHQEncodeTransforms, which defines transform_source equal to None. Now, refer to ImagesDataset (pixel2style2pixel/datasets/images_dataset.py, lines 28 to 31 at commit 89935c4). You will notice that if self.source_transform is None, we set from_im = to_im in line 31. And since transform_source is defined as None, we fall exactly into that case.
Therefore, you're actually not using the paired data you defined! That is, you're "overwriting" the source data with the target data. That is why in your logs you see only the toon images (since those are your target images).
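The fallback described above can be sketched as follows (a simplified paraphrase of the logic in ImagesDataset, not the exact repo code; the real class loads images from disk with PIL):

```python
class ImagesDataset:
    """Simplified sketch of the source/target pairing logic described above."""

    def __init__(self, source_paths, target_paths,
                 source_transform=None, target_transform=None):
        self.source_paths = source_paths
        self.target_paths = target_paths
        self.source_transform = source_transform
        self.target_transform = target_transform

    def __getitem__(self, index):
        # Stand-ins for image loading; the real dataset opens files with PIL.
        from_im = self.source_paths[index]
        to_im = self.target_paths[index]
        if self.target_transform:
            to_im = self.target_transform(to_im)
        if self.source_transform:
            from_im = self.source_transform(from_im)
        else:
            # source_transform is None -> the source is replaced by the target,
            # so any paired source data is silently ignored.
            from_im = to_im
        return from_im, to_im
```

With transform_source set to None, __getitem__ returns the target twice, which matches the logs showing only toon images.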
A couple of questions you may have:
- Why did we do it this way? Because in ffhq_encode, the task at hand is to "reconstruct" the images, so here it makes sense to set from_im = to_im.
- Why did we tell you to use ffhq_encode for the toonify task? Because, as I mentioned above, we don't use paired toon data. We simply use the real FFHQ data, and therefore it was fine to set the source images equal to the target images.
However, this is not what you want to do. What you need to do is define your own transform class where your transforms_dict is something like:
transforms_dict = {
'transform_gt_train': transforms.Compose([
transforms.Resize((256, 256)),
transforms.ToTensor(),
transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])]),
'transform_source': transforms.Compose([
transforms.Resize((256, 256)),
transforms.ToTensor(),
transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])]),
'transform_test': transforms.Compose([
transforms.Resize((256, 256)),
transforms.ToTensor(),
transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])]),
'transform_inference': transforms.Compose([
transforms.Resize((256, 256)),
transforms.ToTensor(),
transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])])
}
This way, transform_source is not None and you do not set from_im = to_im in ImagesDataset.
Then, you can make a new dataset type in data_configs.py
. Something like:
'toonify': {
'transforms': transforms_config.ToonifyTransforms,
'train_source_root': dataset_paths['real_train'],
'train_target_root': dataset_paths['toons_train'],
'test_source_root': dataset_paths['real_test'],
'test_target_root': dataset_paths['toons_test'],
},
And call your training script using --dataset_type=toonify
.
To make sure everything works as expected, you should see in the logs the real source image next to the toons target image, followed by the output toons image.
Very long answer, but I hope it helps clear up any confusion. Let me know if you have any further questions. 😄
from pixel2style2pixel.
Would be very interested to see how the results of your training run turn out!
- Yes. Now I understand what your data paths are, and they look good.
- You are correct regarding the logs not being correct. I think I know what the issue is. I will look into it and explain in a bit 😄
I'm pretty sure the test data is not aligned. There are a lot of images that are very rotated with respect to the corresponding toon image.
I would run realignment again on the test to make sure. The train data seems fine.
Notice that you don't need to retrain the model. Simply take the trained model and run inference on the test data after realignment.
Regarding adding more data, I wouldn't rush to add more data before you know that alignment isn't the problem.
Sounds good. Since you also have paired data, I invite you to play around with the loss lambdas. You may be able to get better results with a different combination. Specifically, I would recommend starting by decreasing the w_norm_lambda a bit (maybe to 0.01?).
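For intuition, the w-norm term penalizes the distance of the predicted latents from the generator's average latent, so its lambda trades fidelity to the input against staying near the latent mean. A rough sketch of such a regularizer follows; the exact formulation in the repo's loss code may differ, and the shapes are illustrative:

```python
import torch


def w_norm_loss(latent, latent_avg):
    """Mean (per-sample) squared distance of predicted latents from the average latent.

    latent:     (batch, n_styles, 512) codes predicted by the encoder
    latent_avg: (512,) average W latent of the generator
    Illustrative only; check the repo's w-norm loss for the exact formulation.
    """
    return torch.sum((latent - latent_avg) ** 2) / latent.shape[0]
```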
In any case, I feel like we can close this issue as it seems like we've solved the issue you had. Feel free to reopen the issue if needed and I am looking forward to seeing your results!
I ran the source image you linked above through the toonified model we uploaded and I got a similar result to what you got using your trained model. Therefore, I don't think you're doing anything wrong.
I will say that our toonified model is not trained on pairs of (real, toon) and therefore we run for a small number of steps. If you have paired data, you may find that running for more iterations will improve the results.
Regarding making the eyes bigger, this all depends on the data you use for training. If your target toons have larger eyes, you are more likely to generate toons with larger eyes. Our toonify model tends not to produce smaller eyes since we did not use paired data.
I hope this helps.
I tried to run it for a longer time, but the result did not improve after 27k steps (the best); you can check the training run here.
And since you mentioned that your toon model is not trained on pairs, how does the training work? Can I train the model by just providing target images?
In the path config it goes like this:
'ffhq_train': '/content/drive/My Drive/Style/pixel2style2pixel/Data/Train/Train_IN',
'ffhq_target': '/content/drive/My Drive/Style/pixel2style2pixel/Data/Train/Target_IN',
'toon_train': '/content/drive/My Drive/Style/pixel2style2pixel/Data/Test/Test_IN',
'toon_target': '/content/drive/My Drive/Style/pixel2style2pixel/Data/Test/Target_IN',
In the data config file we provided details like this:
'ffhq_encode': {
'transforms': transforms_config.EncodeTransforms,
'train_source_root': dataset_paths['ffhq_train'],
'train_target_root': dataset_paths['ffhq_target'],
'test_source_root': dataset_paths['toon_train'],
'test_target_root': dataset_paths['toon_target'],
},
Ah. I think I see the problem now.
- Based on the logs you linked, it appears that you are trying to encode toon images. That is, you're training using pairs of (toon, toon). However, if I understand you correctly, you want to train using pairs of (real, toon). That is, the training source data should be real face images while the training target data should be the corresponding toon images.
- The data config you specified above seems a bit strange if I understood it correctly. I believe what you want is something like:
'ffhq_encode': {
'transforms': transforms_config.EncodeTransforms,
'train_source_root': dataset_paths['ffhq_train'],
'train_target_root': dataset_paths['ffhq_toons_train'],
'test_source_root': dataset_paths['real_test'],
'test_target_root': dataset_paths['toons_test'],
},
where ffhq_toons_train is the path to the toon images corresponding to the FFHQ data, real_test is the path to the test set containing real face images, and toons_test is the path to the test set containing the corresponding toon images.
As for how we trained our toons model with no paired data: we train our toons model exactly like the ffhq_encode task but replace the StyleGAN model from FFHQ to the StyleGAN toons model. That is, we train using only real face images.
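In config terms, the unpaired setup just described boils down to pointing both the source and target roots at the same real-face data; the paths and dict values below are placeholders for illustration, not the repo's actual entries:

```python
# Hypothetical illustration of the unpaired toonify setup: real faces serve as
# both source and target (a reconstruction objective), and the toonification
# comes entirely from swapping in the toon StyleGAN as the decoder.
dataset_paths = {
    'ffhq': '/path/to/ffhq',                 # placeholder
    'celeba_test': '/path/to/celeba_test',   # placeholder
}

DATASETS = {
    'ffhq_encode': {
        'transforms': 'transforms_config.EncodeTransforms',  # string stand-in
        'train_source_root': dataset_paths['ffhq'],
        'train_target_root': dataset_paths['ffhq'],          # targets = sources
        'test_source_root': dataset_paths['celeba_test'],
        'test_target_root': dataset_paths['celeba_test'],
    },
}
```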
- I think the structure you described in point 2 is the same as what we used for the above training. The structure below is what I mentioned in the data config file; I am pretty sure it should be correct, or am I doing something wrong?
- The images in the logs seem confusing, as the input is a toon image whereas it should be a real image, shouldn't it?
Thanks @yuval-alaluf - after changing the transform class and the toonify dict as you described, the logs are now as expected.
The other thing I was wondering: how can I train with just a StyleGAN weights file (.pt) and without paired images, as you did in your toon model?
@justmaulik ,
Not sure what you mean by "just stylegan weights file".
But if you want to reproduce the toons model we trained with no paired data, you need to specify only a few things:
- Make sure --dataset_type is ffhq_encode - this will take the data to simply be real FFHQ face images with no paired toon data.
- Set --stylegan_weights to the file containing the toonified StyleGAN generator that we linked in the repo (here is the link to download it).
- Set the losses to what we specified in the README.
Basically, the main difference is that instead of using the FFHQ StyleGAN generator, you will use the Toons StyleGAN generator.
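Mechanically, swapping generators is just a question of which checkpoint the decoder loads. A rough sketch is below; the 'g_ema' key follows the common rosinality-style StyleGAN2 checkpoint layout and strict=False is an assumption, so verify both against your actual .pt file:

```python
import torch


def load_toon_decoder(decoder, stylegan_weights_path):
    """Load a (toonified) StyleGAN generator checkpoint into a decoder module.

    Hypothetical helper: the 'g_ema' key is an assumption about the checkpoint
    format, not something guaranteed by the pSp repo.
    """
    ckpt = torch.load(stylegan_weights_path, map_location='cpu')
    state = ckpt.get('g_ema', ckpt)  # fall back to a raw state_dict
    decoder.load_state_dict(state, strict=False)
    return decoder
```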
Thanks, @yuval-alaluf. I actually have a toon StyleGAN model that was created by following @justinpinkney's blog post; it turned out really good, but not perfect (~10% bad images). I was curious to see how the results change with pixel2style2pixel.
I did the training, and the logs are getting better than in previous runs, but it's not learning anymore: the best loss was achieved at the start of training and then just stayed still. The current dataset count is ~920 pairs in train and 100 pairs in test.
- Is the training data now enough?
- I am using the StyleGAN model mentioned above as --stylegan_weights.
- The images in the test and train data were generated using the same stylegan_weights.
- The output in the test folder looks like it just gives the same few faces with slight changes, far from the input.
Waiting for the follow-up post on @justinpinkney's blog :) or, if possible, could you explain a bit about how you managed to achieve your results so I can give that method a try. 😊
Looks much better on the train!
I think the problem you're getting on the test set is that the real image data is not aligned. Other than that, everything else looks good.
I believe that if you align the data you should see much better results.
Let me know if this solves the problem. I'm interested in seeing your results!
All the images used were tagged as aligned in Nvidia's repo; I think it's just missing the blur effect. Still, I will start a new run after aligning all the images and will share the results.
The output in the train folder is perfect, but in the test folder it's still a bit far from perfect.
If I add a greater variety of data pairs (5k-10k pairs), will it improve significantly? Or are 900 pairs more than enough for this type of training?
I tried aligned images, but it did not improve the results as expected; it's more random. Some images give good results, but most are just like this: the output is good but far from the input. I will try with a larger dataset and share the results.
@justmaulik Were you able to solve this? I am facing the same issue: I trained with 3k pairs, but the output still does not exactly match the input.
No, I tried a few things but did not achieve the results I was aiming for. At some point I trained with ~20k pairs (40k images), but the results did not improve much.
@yuval-alaluf Can we reopen this issue? I am facing the same problem. The results look good in the train logs, but on the test set the output seems to have no connection with the input.
Hi @Nerdyvedi ,
I'll be happy to help, but since your question is not specifically about the ffhq_encode task, I think the best thing to do is to open a new issue where you can provide me with some more details about what you're trying to do, the training setting, and the results you're seeing on train/test.
I will try to help out as much as possible.
@yuval-alaluf Would it be possible for you to share your dataset? Your dataset looks much better than ours. How did you create it?
I am close to getting good results, but my dataset is not good enough.
Hi @cvmlddev,
As I mentioned in your issue (#83), I think part of the reason you're getting unsatisfactory results is the use of paired data.
Regarding the data, for our toonify model we actually didn't use any paired data at all. The training was done using only real images from FFHQ using the toonify StyleGAN. This allowed for more flexible results during inference.
Oops sorry, meant to tag @justmaulik . @justmaulik Would it be possible for you to share your paired toon dataset?
@cvmlddev Sure, I had 20k pairs (40k images) but can't find them now; I will share them if I find them somewhere.
Here is the 1k set I just found.
@justmaulik Thanks a lot!
How did you generate it?
Also, I wonder why you are not getting good results; I was able to get decent results with the default parameters.
I trained an FFHQ model with a small dataset, then created paired data with it to train pixel2style2pixel.
The problem I faced was that the results were not close to the input: they were good, but far from the input.
Here I found just the toon folder :) maybe it can help you somehow.
Thanks a lot!
@justmaulik, again, one year later: did you get better results in the end, or did you end up using a completely different approach?
How can I generate cartoon images like https://drive.google.com/drive/folders/1-6gZXiSDwT8hJxJcwqdGd2qErm-608rL ? Thanks!
@cvmlddev Sure, I had 20k pairs (40k images) but can't find it now; will share it if I find it somewhere. Here is the 1k set I just found.
Hi, can you share your dataset again? I can't find it here. Thanks!