
Comments (5)

KeithGeorgeCiantar commented on June 15, 2024

Hello, thank you for your comment!

Could you please elaborate on what type of degradation you have for your dataset (e.g., bicubic downsampling, blurring, noise, etc.)?

The sample Q-EDSR config expects the PCA vector of the blurring kernel to be passed as an input alongside the image. These vectors are stored in a file in the same folder as the LR images. To create blurred LR images together with the file containing the kernel vectors, check out the data_prep.md file in the Documentation folder.

from rumpy.

biduan commented on June 15, 2024

Hi Keith,

Thank you very much.

"These vectors are stored in a file, in the same folder with the LR images. "
// Comment: Is the vector file named something like xxx_pca_matrix.pth? Is additional configuration required in the model training configuration to enable this xxx_pca_matrix.pth file?

Below is my dataset pipeline. I reused the “Documentation/sample_degradation_generators/blur_downsample_noise_compress.toml” file to prepare the DIV2K dataset.

/************

pipeline = [
    [ "realesrganblur", "b-config" ],
    [ "downsample", "d-config" ],
    [ "realesrgannoise", "n-config" ],
    [ "randomcompress", "c-config" ],
]


[deg_configs.b-config]
device = 0
pca_batch_len = 10000
kernel_range = [ "iso", "aniso", "generalized_iso", "generalized_aniso", "plateau_aniso", "plateau_iso", "sinc",]
pca_length = 100
request_full_kernels = true
request_pca_kernels = true
request_kernel_metadata = true
use_kernel_code = true
sigma_x_range = [ 0.2, 3.0,]
sigma_y_range = [ 0.2, 3.0,]
normalize_metadata = true

[deg_configs.d-config]
scale = 4

[deg_configs.n-config]
gaussian_poisson_ratio = 0.5
gaussian_noise_sigma_range = [ 1.0, 30.0,]
poisson_noise_scale_range = [ 0.05, 3.0,]
gray_noise_probability = 0.4
device = "cuda"

pca_batch_len = 10000
pca_length = 100
request_full_kernels = true
request_pca_kernels = true

[deg_configs.c-config.jpeg_params]
compression_range = [ 30, 95,]
random_compression = true

[deg_configs.c-config.jm_params]
random_compression = true



KeithGeorgeCiantar commented on June 15, 2024

Thanks for your reply, I will try to address your comments and questions as best I can!


Let's start with the vector file that I mentioned before. When you generate LR images using the pipeline, the folder containing the LR images will include files such as degradation_metadata.csv and degradation_hyperparameters.csv. If you open degradation_metadata.csv you will see a set of columns and rows containing the degradation information for each image. If you set the PCA options to true, a file named xxx_pca_matrix.pth will also be generated. However, this is only used to keep track of how the blurring kernels were converted to PCA, in case you need to re-generate data with the same matrix or need to apply inverse PCA.
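As a standard-library-only illustration (not the repo's exact loader), the per-image degradation parameters in degradation_metadata.csv can be read into a lookup table; the column names below are a small illustrative subset of the header format produced by the pipeline:

```python
import csv
import io

# Illustrative excerpt of a degradation_metadata.csv file; real files
# contain one row per generated LR image and many more columns.
csv_text = """image,0-realesrganblur-sigma_x,0-realesrganblur-sigma_y,1-downsample-scale
0801.png,1.7,0.9,4
0802.png,2.4,2.1,4
"""

# Build a per-image lookup of degradation parameters.
metadata = {}
for row in csv.DictReader(io.StringIO(csv_text)):
    image = row.pop("image")
    metadata[image] = {k: float(v) for k, v in row.items()}

print(metadata["0801.png"]["0-realesrganblur-sigma_x"])  # -> 1.7
```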


Since you're using the sample config with blurring, downsampling, noise and compression, when you generate the LR images, the degradation_metadata.csv file will have a series of column headers in the first row. If you set request_full_kernels = false and request_pca_kernels = false, the file header will contain the following:

image, 0-realesrganblur-sigma_x, 0-realesrganblur-sigma_y, 0-realesrganblur-rotation, 0-realesrganblur-kernel_type, 0-realesrganblur-beta_p, 0-realesrganblur-beta_g, 0-realesrganblur-omega_c, 0-realesrganblur-kernel_size, 1-downsample-scale, 2-realesrgannoise-gaussian_noise_scale, 2-realesrgannoise-gray_noise, 2-realesrgannoise-poisson_noise_scale, 3-randomcompress-jm_qpi, 3-randomcompress-jpeg_quality

If you set request_full_kernels = true and request_pca_kernels = true, the file header will have the same values as before, but will also contain:

0-realesrganblur-unmodified_blur_kernel, 0-realesrganblur-blur_kernel

These two columns will have the entire blur kernel vector and the PCA vector of the kernel, respectively. The length of the PCA kernel vector depends on the size you choose in your config, so if you want 100 values you'd set pca_length = 100 and if you want 10 you'd have pca_length = 10. In general, I strongly suggest choosing a value of 10, as larger vectors don't add much information.
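To make the relationship between the two columns concrete, here is a conceptual numpy sketch of how a full blur-kernel vector relates to its PCA vector (this is an illustration under my own assumptions, not the repository's implementation; the kernel size and data are stand-ins):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a set of flattened blur kernels (e.g. 21x21 -> 441 values);
# the real pipeline generates these from the configured kernel_range.
kernels = rng.normal(size=(500, 441))

pca_length = 10  # corresponds to pca_length in the degradation config

# PCA via SVD of the mean-centred kernel matrix.
mean = kernels.mean(axis=0)
_, _, vt = np.linalg.svd(kernels - mean, full_matrices=False)
pca_matrix = vt[:pca_length]  # conceptually what xxx_pca_matrix.pth stores

# Forward PCA: 441-value kernel -> pca_length-value vector fed to the model
# (the "blur_kernel" column).
reduced = (kernels - mean) @ pca_matrix.T
print(reduced.shape)  # (500, 10)

# Inverse PCA: approximate reconstruction of the original kernel
# (the "unmodified_blur_kernel" column holds the exact original).
reconstructed = reduced @ pca_matrix + mean
```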


Now that we have covered how the degradation metadata is generated, the last thing to go over is how it is used during training. Let's take a look at the sample Q-EDSR config that you were planning to run. If you look at lines 38-47, there are some parameters for the model to train, but there's also a parameter called metadata.

[model]
name = 'qedsr' # model architecture name
[model.internal_params] # parameters specific for each model
scale = 4 # super resolution scale factor
lr = 1e-4 # learning rate
num_blocks = 32
num_features = 256
res_scale = 0.1
metadata = [ "blur_kernel"]
q_layer_nonlinearity = true

This parameter is used to tell the model which data will be fed as input alongside the image. In the sample we provided, the term blur_kernel is set as the metadata; however, I realise there is a mistake on our end, as the metadata labels were updated when we changed the degradation system.

It is important to note that even though the sample Q-EDSR config only makes use of the blurring kernel, you can set the metadata parameter to take any degradation information from the degradation_metadata.csv file.

I will provide a few sample training configs that should work on the DIV2K dataset you have generated. Make sure to copy each config in its entirety, as some parameters have been added and some lines have been removed.


Option 1: Metadata = full blurring kernel, noise parameters, compression parameters

experiment = "q-edsr"  # experiment name
experiment_save_loc = '../../../Results/SISR'  # experiment save location

[data]
batch_size = 8  # data batch size
dataloader_threads = 8  # parallel threads to use to speed up data processing

[data.training_sets.data_1]
name = 'div2k'
lr = '../../../Data/div2k/LR'
hr = '../../../Data/div2k/HR'
degradation_metadata = 'on_site'
crop = 64
random_augment = true
ignore_degradation_location = true

[data.eval_sets.data_1]
name = 'div2k'
lr = '../../../Data/div2k/LR'
hr = '../../../Data/div2k/HR'
degradation_metadata = 'on_site'
ignore_degradation_location = true

[model]
name = 'qedsr'  # model architecture name
[model.internal_params] # parameters specific for each model
scale = 4 # super resolution scale factor
lr = 1e-4 # learning rate
num_blocks = 32
num_features = 256
res_scale = 0.1
metadata = ['realesrganblur-unmodified_blur_kernel',
'realesrgannoise-gaussian_noise_scale', 'realesrgannoise-gray_noise', 'realesrgannoise-poisson_noise_scale',
'randomcompress-jm_qpi', 'randomcompress-jpeg_quality']
q_layer_nonlinearity = true
ignore_degradation_location = true

scheduler = 'cosine_annealing_warm_restarts'
[model.internal_params.scheduler_params]
t_mult = 1  # no change in LR throughout training
restart_period = 40000  # number of batches for restart to occur
lr_min = 1e-7  # minimum learning rate

[training]
gpu = 'single' # one of multi, single, off
sp_gpu = 0 # initial gpu to use
seed = 8  # random seed
epoch_cutoff = 1300  # epochs requested
metrics = ['PSNR', 'SSIM']  # metrics to calculate on validation set
id_source = 'standard'  # location of ids to use for face recognition metrics
logging = 'visual' # one of visual or text

Option 2: Metadata = blurring kernel parameters, noise parameters, compression parameters

experiment = "q-edsr"  # experiment name
experiment_save_loc = '../../../Results/SISR'  # experiment save location

[data]
batch_size = 8  # data batch size
dataloader_threads = 8  # parallel threads to use to speed up data processing

[data.training_sets.data_1]
name = 'div2k'
lr = '../../../Data/div2k/LR'
hr = '../../../Data/div2k/HR'
degradation_metadata = 'on_site'
crop = 64
random_augment = true
ignore_degradation_location = true

[data.eval_sets.data_1]
name = 'div2k'
lr = '../../../Data/div2k/LR'
hr = '../../../Data/div2k/HR'
degradation_metadata = 'on_site'
ignore_degradation_location = true

[model]
name = 'qedsr'  # model architecture name
[model.internal_params] # parameters specific for each model
scale = 4 # super resolution scale factor
lr = 1e-4 # learning rate
num_blocks = 32
num_features = 256
res_scale = 0.1
metadata = ['realesrganblur-sigma_x', 'realesrganblur-sigma_y', 'realesrganblur-rotation',
'realesrganblur-iso_aniso_type', 'realesrganblur-generalized_type',
'realesrganblur-plateau_type', 'realesrganblur-sinc_type',
'realesrganblur-beta_p', 'realesrganblur-beta_g', 'realesrganblur-omega_c',
'realesrgannoise-gaussian_noise_scale', 'realesrgannoise-gray_noise', 'realesrgannoise-poisson_noise_scale',
'randomcompress-jm_qpi', 'randomcompress-jpeg_quality']
q_layer_nonlinearity = true
ignore_degradation_location = true

scheduler = 'cosine_annealing_warm_restarts'
[model.internal_params.scheduler_params]
t_mult = 1  # no change in LR throughout training
restart_period = 40000  # number of batches for restart to occur
lr_min = 1e-7  # minimum learning rate

[training]
gpu = 'single' # one of multi, single, off
sp_gpu = 0 # initial gpu to use
seed = 8  # random seed
epoch_cutoff = 1300  # epochs requested
metrics = ['PSNR', 'SSIM']  # metrics to calculate on validation set
id_source = 'standard'  # location of ids to use for face recognition metrics
logging = 'visual' # one of visual or text

Option 3: Metadata = PCA blurring kernel only
Note: for this option, you need to re-generate the LR images and set pca_length = 10 in the pipeline config.

experiment = "q-edsr"  # experiment name
experiment_save_loc = '../../../Results/SISR'  # experiment save location

[data]
batch_size = 8  # data batch size
dataloader_threads = 8  # parallel threads to use to speed up data processing

[data.training_sets.data_1]
name = 'div2k'
lr = '../../../Data/div2k/LR'
hr = '../../../Data/div2k/HR'
degradation_metadata = 'on_site'
crop = 64
random_augment = true
ignore_degradation_location = true

[data.eval_sets.data_1]
name = 'div2k'
lr = '../../../Data/div2k/LR'
hr = '../../../Data/div2k/HR'
degradation_metadata = 'on_site'
ignore_degradation_location = true

[model]
name = 'qedsr'  # model architecture name
[model.internal_params] # parameters specific for each model
scale = 4 # super resolution scale factor
lr = 1e-4 # learning rate
num_blocks = 32
num_features = 256
res_scale = 0.1
metadata = ['realesrganblur-blur_kernel']
q_layer_nonlinearity = true
ignore_degradation_location = true

scheduler = 'cosine_annealing_warm_restarts'
[model.internal_params.scheduler_params]
t_mult = 1  # no change in LR throughout training
restart_period = 40000  # number of batches for restart to occur
lr_min = 1e-7  # minimum learning rate

[training]
gpu = 'single' # one of multi, single, off
sp_gpu = 0 # initial gpu to use
seed = 8  # random seed
epoch_cutoff = 1300  # epochs requested
metrics = ['PSNR', 'SSIM']  # metrics to calculate on validation set
id_source = 'standard'  # location of ids to use for face recognition metrics
logging = 'visual' # one of visual or text

Hope this helps!


biduan commented on June 15, 2024

Hi Keith,

Thank you very much for the detailed information.

I made some changes to the code, and the Q-EDSR model now runs successfully in my environment. It has been running stably for 3 days and is still training.

Here is what I did; I hope it helps others.

1. Set up the training data preprocessing configuration file

Refer to Documentation/sample_degradation_generators/blur_downsample_noise_compress.toml.

In this configuration, pca_length is 8, request_pca_kernels = true and request_kernel_metadata = true.
.......

[deg_configs.b-config]
device = 0
pca_batch_len = 10000
kernel_range = [ "iso", "aniso", "generalized_iso", "generalized_aniso", "plateau_aniso", "plateau_iso", "sinc",]
pca_length = 8
request_full_kernels = false
request_pca_kernels = true 
request_kernel_metadata = true
use_kernel_code = true
sigma_x_range = [ 0.2, 3.0,]
sigma_y_range = [ 0.2, 3.0,]
normalize_metadata = true

.......

2. Q-EDSR model training configuration

Refer to Documentation/sample_config_files/div2k/q-edsr.toml.

The metadata is "0-realesrganblur-blur_kernel" and ignore_degradation_location = false.

............

[model]
name = 'qedsr'  # model architecture name
[model.internal_params] # parameters specific for each model
scale = 4 # super resolution scale factor
lr = 1e-4 # learning rate
num_blocks = 32
num_features = 256
res_scale = 0.1
metadata = [ "0-realesrganblur-blur_kernel"]
q_layer_nonlinearity = true
#ignore_degradation_location = true
ignore_degradation_location = false
scheduler = 'cosine_annealing_warm_restarts'
[model.internal_params.scheduler_params]
t_mult = 1  # no change in LR throughout training
restart_period = 40000  # number of batches for restart to occur
lr_min = 1e-7  # minimum learning rate

3. Modify the code

In step #1, pca_length = 8, so 8 PCA metadata values will be generated; self.num_metadata therefore needs to be set to 8.

./SISR/models/attention_manipulators/__init__.py

def generate_channels(self, x, metadata, keys):
    # extra_channels = torch.ones(x.size(0), self.num_metadata)
    extra_channels = torch.ones(x.size(0), 8)

There may be another option: set the metadata in step #2 as below. If so, self.num_metadata would not need to be hardcoded as 8.

metadata = [ "0-realesrganblur-blur_kernel","0-realesrganblur-blur_kernel","0-realesrganblur-blur_kernel","0-realesrganblur-blur_kernel","0-realesrganblur-blur_kernel","0-realesrganblur-blur_kernel","0-realesrganblur-blur_kernel","0-realesrganblur-blur_kernel"]


KeithGeorgeCiantar commented on June 15, 2024

Hello again, great to hear that you managed to get the model to train! I really appreciate the time and effort you put into your response; it will help other people who come across issues when using this code.

I will add information to what you stated, mainly to ensure further clarity.


Starting with the training config, there should technically be two possible ways of using the metadata. The first way is what you did, where you have metadata = ["0-realesrganblur-blur_kernel"] and ignore_degradation_location = false, and the second way is to have metadata = ["realesrganblur-blur_kernel"] and ignore_degradation_location = true. Both of these should give the same result.
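Conceptually, the difference between the two settings amounts to whether the numeric stage prefix is stripped from the CSV column names before matching them against the config's metadata keys. The following is a hypothetical sketch of that matching logic, not the repository's actual code (the function name and structure are my own):

```python
def match_columns(columns, metadata_keys, ignore_degradation_location):
    """Hypothetical illustration of how config metadata keys could be
    matched against degradation_metadata.csv column headers."""
    matched = []
    for col in columns:
        # Columns look like "0-realesrganblur-blur_kernel"; with
        # ignore_degradation_location the leading "<stage>-" is dropped
        # before comparing against the configured keys.
        name = col.split("-", 1)[1] if ignore_degradation_location else col
        if name in metadata_keys:
            matched.append(col)
    return matched

columns = ["0-realesrganblur-blur_kernel", "2-realesrgannoise-gray_noise"]

# Both configurations select the same column:
a = match_columns(columns, ["0-realesrganblur-blur_kernel"], False)
b = match_columns(columns, ["realesrganblur-blur_kernel"], True)
print(a == b == ["0-realesrganblur-blur_kernel"])  # True
```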


Regarding self.num_metadata, it is currently being set here:

if 'blur_kernel' in metadata:
    self.num_metadata += 9
elif 'unmodified_blur_kernel' in metadata or any('unmodified_blur_kernel' in meta_op for meta_op in metadata):
    self.num_metadata += 440

and we use self.num_metadata += 9 because our PCA blur kernel was always of size 10. In your case, changing it directly to 8 should work fine. Otherwise, you could do self.num_metadata += 7 in the code linked above. Of course, for a more complete solution it would be a good idea to set self.num_metadata dynamically, so that it's more flexible. We opted for the hardcoded approach, as we never really changed the size of the PCA vector and were focusing on other things.
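One possible shape for such a dynamic version (a sketch under my own assumptions, not the repository's code; the helper name and the full-kernel size parameter are hypothetical) would derive the channel count from the config instead of hardcoding it:

```python
def count_metadata_channels(metadata_keys, pca_length, full_kernel_size=441):
    """Hypothetical helper: derive the number of metadata channels from
    the degradation config instead of hardcoding the PCA vector size.

    Assumes PCA blur kernels contribute pca_length values, full kernels
    contribute full_kernel_size values, and every other key is a scalar.
    """
    total = 0
    for key in metadata_keys:
        if "unmodified_blur_kernel" in key:
            total += full_kernel_size
        elif "blur_kernel" in key:
            total += pca_length
        else:
            total += 1  # scalar entries such as sigma_x or jpeg_quality
    return total

# With pca_length = 8, a single PCA kernel key yields 8 channels:
print(count_metadata_channels(["0-realesrganblur-blur_kernel"], pca_length=8))  # 8
```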

As a small side note to this, thank you for pointing out that the config we provided had the PCA length set to 100! This has been updated by my colleague, so that it matches with the correct length expected by the system.

