xuecaihu / meta-sr-pytorch
Meta-SR: A Magnification-Arbitrary Network for Super-Resolution (CVPR2019)
Can you provide the pre-trained RDN model?
"--load" doesn't run correctly.
Error message: No such file or directory: 'psnr_log.pt'
Does it support Windows 10 + PyTorch 1.2? Thanks!
run: python main.py --model metardn --ext sep --save metardn --cpu --batch_size 500 --test_only --data_test Demo --pre_train ./experiment/metardn/model/model_1000.pt --save_results --scale 1.5
but it runs very slowly. Why?
With "--test_data DIV2K --data_range 801-900", it does not run.
if mod(h,scale) ~= 0
h = h - 4;
end
What is the reason for doing this? Subtracting 4 from h does not guarantee that it is divisible by scale.
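The point of the question can be checked with a little arithmetic. The following is a minimal Python sketch, not the repository's MATLAB code: `subtract_four` mirrors the snippet above, `crop_to_multiple` is a hypothetical alternative, and integer scales are assumed so that the remainder is exact.

```python
def subtract_four(h: int, scale: int) -> int:
    """Mirror of the snippet in question: drop 4 pixels when h is not divisible."""
    return h - 4 if h % scale != 0 else h


def crop_to_multiple(h: int, scale: int) -> int:
    """Hypothetical alternative: largest value <= h that is a multiple of scale."""
    return h - (h % scale)


if __name__ == "__main__":
    for h, scale in [(10, 3), (11, 3), (2040, 7)]:
        a = subtract_four(h, scale)
        b = crop_to_multiple(h, scale)
        # Report whether each result is actually divisible by scale.
        print(h, scale, a, a % scale == 0, b, b % scale == 0)
```

Subtracting the remainder always lands on a multiple, whereas subtracting a fixed 4 only does so for some combinations of h and scale.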
Thanks for providing the detailed code with instructions to train and test.
I am wondering whether something is wrong on my side, because I get fairly low PSNR when evaluating both my own trained model and the pre-trained model downloaded from Google Drive.
I trained the model following the instructions in the README, but the PSNR for 1.1x on B100 is only around 29 dB. I then evaluated the pre-trained model, and its PSNR for 1.1x is also around 29 dB.
Would you mind telling me whether the pre-trained model is the model generating the scores in the paper or preliminary model? Thanks a lot!
Hello, do I need to provide down-scaled labels at the corresponding scale factor in order to test?
Can I test my own data directly without these labels, with the test images as the only input?
Thank you.
Hello, I would like to ask why the input of the meta upscale module is (1,1,3).
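For context, here is a minimal sketch of the per-pixel input vector as the paper describes it: for output pixel (i, j) and scale r, the vector is (i/r - floor(i/r), j/r - floor(j/r), 1/r), so each output pixel contributes three values. This may be where the trailing 3 comes from. The function name `position_vectors` is made up for illustration, and this is plain Python rather than the repository's `input_matrix_wpn`.

```python
import math


def position_vectors(out_h, out_w, scale):
    """Return one (offset_h, offset_w, 1/scale) tuple per output pixel, row by row."""
    vectors = []
    for i in range(out_h):
        # Fractional part of the projected row coordinate on the LR grid.
        off_i = i / scale - math.floor(i / scale)
        for j in range(out_w):
            off_j = j / scale - math.floor(j / scale)
            vectors.append((off_i, off_j, 1.0 / scale))
    return vectors


if __name__ == "__main__":
    vecs = position_vectors(3, 4, 1.5)
    print(len(vecs), len(vecs[0]))
```

Under this reading, the full input has one row of three numbers per output pixel, matching the "HWx3" description used elsewhere in these issues.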
Hi @XuecaiHu ,
As the title says, I have a few questions about this paper.
According to the paper, I think Meta-SR should work with any scale factor,
such as 8x, 16x or 16.5x. Is that right?
If scale factors above 4x work,
have you evaluated the performance at 8x or 16x? If not, why?
File "D:\CY\Super-resolution\Meta-SR-Pytorch-master\trainer.py", line 132, in train
for batch, (lr, hr, _, idx_scale) in enumerate(self.loader_train):
ValueError: not enough values to unpack (expected 4, got 3)
Hello, I have a problem with the training dataset. As you said, you have a .m script to generate the training dataset, but it seems your HR images are all in one folder. Is it possible to make pairs for scales 1.1 to 4.0 using just the default DIV2K HR images? They may not form exact pairs; for example, 2040 is not exactly divisible by 1.9 or 1.8.
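One way to picture the size mismatch described above is to take the LR size as floor(H / scale) and then crop the HR image to the largest size that LR image can reproduce. This is only a sketch under that assumption: `paired_sizes` is a hypothetical helper and is not taken from the repository's .m script.

```python
import math

EPS = 1e-9  # guard against float results such as 1199.9999999999998


def paired_sizes(hr_size: int, scale: float):
    """Return (lr_size, cropped_hr_size) for one image dimension."""
    lr_size = math.floor(hr_size / scale + EPS)
    # Largest HR size that the LR image can be upscaled to at this scale.
    cropped_hr_size = math.floor(lr_size * scale + EPS)
    return lr_size, cropped_hr_size


if __name__ == "__main__":
    for tenths in range(11, 41):  # scales 1.1, 1.2, ..., 4.0
        scale = tenths / 10
        print(scale, paired_sizes(2040, scale))
```

With this choice the HR crop is never larger than the original image, so a pair can be formed for every scale even when the division is not exact.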
Hi, thanks for your good work.
I would like to know the shape of the weights generated by the Meta-Upscale Module. Is the shape (width, height, channel) related to the scale factor, and how can we get it?
Thanks in advance.
Hi, Xuecai
Thank you for sharing this wonderful work!!!
I have a small question about the parameters in the FC layer. In 'metardn' model, the first FC layer is like: nn.Linear(3,256). But in 'metaedsr' model, the first FC layer is like: nn.Linear(6,256). I was confused about the parameter setting, I think both of them should be nn.Linear(3,256). Because according to input_matrix_wpn function, the output pos_mat size is 1x(HW)x3. Looking forward to your reply.
Zewei
Hi, Xuecai,
I noticed that in your code, the metardn model takes "pos_mat" as input and the shape of output is [N, outC, scale_int×inH, scale_int×inW], where scale_int = math.ceil(scale). And then after model output, you have an extra step of torch.masked_select to get the final output with shape [N, outC, int(scale×inH), int(scale×inW)].
I wonder why you don't input the "pos_mat" with the right shape, and directly get the final output with shape [N, outC, int(scale×inH), int(scale×inW)]. I think this will waste some computation resources.
I'll appreciate it greatly if you can answer my question
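The shape arithmetic described in this question can be sketched as follows. It assumes only what the post states (output at `math.ceil(scale)` times the input size, then a selection down to `int(scale * size)`); the function names are hypothetical and no PyTorch is involved.

```python
import math


def output_shapes(in_h, in_w, scale):
    """Return ((padded_h, padded_w), (final_h, final_w)) as described in the post."""
    scale_int = math.ceil(scale)
    padded = (scale_int * in_h, scale_int * in_w)
    final = (int(scale * in_h), int(scale * in_w))
    return padded, final


def discarded_fraction(in_h, in_w, scale):
    """Fraction of computed output pixels that the mask throws away."""
    (ph, pw), (fh, fw) = output_shapes(in_h, in_w, scale)
    return 1.0 - (fh * fw) / (ph * pw)


if __name__ == "__main__":
    for scale in (1.5, 2.0, 3.3):
        print(scale, output_shapes(50, 50, scale), discarded_fraction(50, 50, scale))
```

For integer scales nothing is discarded; the overhead the question refers to only appears for fractional scales.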
Hello Xuecai Hu, I have some questions:
1. What should I do with that part of the dataset?
2. Which dataset did you use? I have all the datasets.
3. What folder format should the dataset use?
When testing, I ran into a problem when calculating PSNR or SSIM. I downloaded the test dataset from another website rather than from the author's link, because the Baidu Yun Pan link does not work and I cannot access Google Drive. Has anyone else met the same problem?
Hi, I found that using JPG images during training introduces a lot of JPEG compression noise, but EDSR does not have this problem. Has anyone faced a similar problem?
Why did you set 50?
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
    if __name__ == '__main__':
        freeze_support()
        ...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
forrtl: error (200): program aborting due to control-C event
How can I solve this bug?
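The error text itself points to the usual remedy. On Windows, worker processes are spawned rather than forked, so the entry script is imported again in each worker, and any code that starts processes has to sit under the `if __name__ == '__main__':` guard. The sketch below shows the idiom with the standard library only; `square` and `run` are placeholders, and how the guard maps onto this repository's `main.py` is an assumption.

```python
import multiprocessing as mp


def square(x):
    """Work done inside a worker process; must be defined at module level."""
    return x * x


def run(values):
    """Start a small pool of workers and collect their results."""
    with mp.Pool(processes=2) as pool:
        return pool.map(square, values)


if __name__ == "__main__":
    # Only needed for frozen executables, harmless otherwise.
    mp.freeze_support()
    print(run([1, 2, 3]))
```

In a training script, the same idea would mean moving the top-level code that builds the data loaders and starts training into a function called from inside the guard.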
@XuecaiHu When I test the EDSR and Meta-SR pre-trained models on my computer using only the CPU, I find that Meta-SR needs much more memory. Could I change something to reduce the memory cost? I want to use Meta-SR on a cellphone. If I deploy on Android with 8 GB of memory and a maximum image size of 2000x1500, is that OK? Thank you.
sir,
What is the final loss when training on DIV2K?
My loss is about 4.0 and fluctuates around 4.0 during training. Is that normal?
thank you very much
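Hello, thank you for providing the code!
My PyTorch version is 1.2.0, and the test code runs fine. I then wanted to train the metaedsr model with two GPUs, but after finishing one epoch it hangs and GPU utilization drops to 0. What could be the reason?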
[Epoch 1] Learning rate: 1.00e-4
[1600/16000] [L1: 50.8884] 20.6+507.0s
......
[16000/16000] [L1: 15.0077] 14.5+151.8s
Is this correct?
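1. What exactly are the procedures and structures of BicuConv and Meta-Bicu? Could you explain them more clearly? Since Meta-Bicu already has the Meta upscale module, which handles multiple scales and fractional scales, why is bicubic still needed?
2. The end of Section 4.3 introduces the concept of FOV. Could you explain exactly how FOV is defined? Why does the FOV of interpolation shrink as the factor grows while that of Meta stays the same, and why is staying the same better?
3. For the control experiments with RDN124 and EDSR124, are the training and test procedures the same? Were the models retrained under the new procedure, or were the original works' pre-trained models used directly? If retrained, did training share a single model across multiple scales? If they were not retrained but used as already trained, then lower metrics are to be expected.
4. I found two small issues in the paper. First, in Section 4.3, bicubic, the weakest baseline, is not time-consuming, whereas x1 is indeed very time-consuming. Second, in Section 4.2, L2 loss cannot be called the traditional loss of the SISR field; since EDSR, almost everyone has used L1 loss. Neither is a big problem.
5. Regarding the robustness of the meta module: is it robust only to the scales seen in training, or also to intermediate scales? How robust is it?
Example: train with scales from 1x to 4x, uniformly spaced in steps of 0.1x. Then (1) what are the metrics for a scale such as 2.55x? (2) For scales near the ends of the distribution, such as 3.9x or 1.1x, is the metric advantage smaller than for scales nearer the middle, such as 2.1x or 3.1x?
6. I have been thinking about what fundamentally drives the metric improvement. The structure essentially achieves two things. (1) Multi-scale training. Because of (2), the number of scales becomes very large, since it extends to fractional scales; otherwise there might only be 2x, 3x and 4x. (2) Different pixels learn to use different filters, also called meta or dynamic filters. Which matters more for the improvement, (1) or (2)?
(1) amounts to different tasks sharing all parameters during training, which has a multi-task flavor, so an improvement would be natural. As for the dynamic filters in (2), papers on image/video reconstruction over the past two years have often used them and reported gains, but why do they bring gains? In the meta upscale structure of Meta-SR, the input to filter learning is only an HWx3 tensor representing two pieces of information, the pixel position and the scale factor. That feels more like adaptation, and I cannot see where a metric improvement would come from.
So my personal feeling is that (1) is the real source of the gain, but without (2) there is no (1), so both are necessary. Could we see more detailed control experiments to verify this? For example:
(1) Replace the meta module with bicubic to handle arbitrary scales, removing the influence of (2). Compare training on 2x~4x and testing only on 3x against training only on 3x and testing on 3x, to show whether (1) brings an improvement or not.
(2) Keep the meta module and train and test with a single integer scale only, in which case the input to the meta module would need to be designed differently. The control is the traditional conv + pixel_shuffle approach of the SISR field; whether the meta module improves over it would show whether (2) brings an improvement or not.
7. On the implementation side, for fractional scales the current approach is to repeat_x up to the integer scale slightly larger than the fractional one, apply the meta convolution, remove the irrelevant rows and columns with a generated mask, and then compute the loss. Could one instead upsample with nearest-neighbor directly to the target scale, apply the meta convolution, and compute the loss directly?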
Hi,
Does this code support running with multiple GPUs? What is the meaning of the parameter "n_GPUs"? Does it stand for the number of GPUs used?
Thanks!
I have a parallel version implemented with parfor:
But I do not have permission to upload the *.m file to this branch.
Line 107 in f2cf094
I've tried to run test code but this error raised:
Traceback (most recent call last):
File "test.py", line 20, in <module>
while not t.terminate():
File "D:\Artificial Intelligence\SuperResolution\Images\Meta-SR-Pytorch-master\trainer.py", line 284, in terminate
self.test()
File "D:\Artificial Intelligence\SuperResolution\Images\Meta-SR-Pytorch-master\trainer.py", line 233, in test
sr = self.model(lr, idx_scale,pos_mat = scale_coord_map)
File "C:\Users\127051\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
File "D:\Artificial Intelligence\SuperResolution\Images\Meta-SR-Pytorch-master\model\__init__.py", line 52, in forward
return self.forward_chop(x)
File "D:\Artificial Intelligence\SuperResolution\Images\Meta-SR-Pytorch-master\model\__init__.py", line 132, in forward_chop
sr_batch = self.model(lr_batch)
File "C:\Users\127051\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py", line 541, in __call__
result = self.forward(*input, **kwargs)
TypeError: forward() missing 1 required positional argument: 'pos_mat'
Can you explain the mask_mat function in detail? I couldn't understand why there is a mask_mat.
Hi @XuecaiHu, I have no problem training with my own dataset to get the .pt files (GPU: 1080 Ti, Anaconda3 virtualenv). However, I cannot run the test step. My dataset consists of 480x270 (low-resolution) and 1920x1080 (high-resolution) images.
When I run: python main.py --model metardn --ext sep --save metardn --n_GPUs 1 --batch_size 1 --test_only --data_test Set5 --pre_train ./experiment/metardn/model/minetrained.pt --save_results --scale 4.0
I get the error below after replacing the benchmark files with my own validation files (480x270, 1920x1080). It runs with no issue when using the downloaded benchmark files:
[4.0]
./benchmark/Set5/HR
./benchmark/Set5/LR_bicubic
1
Making model...
Loading model from ./experiment/metardn/model/model_1000.pt
load_model_mode=1
Evaluation:
THCudaCheck FAIL file=/pytorch/aten/src/THC/generic/THCStorage.cu line=58 error=2 : out of memory
Traceback (most recent call last):
File "main.py", line 18, in <module>
while not t.terminate():
File "/home/gaofei/Meta-SR-Pytorch/trainer.py", line 269, in terminate
self.test()
File "/home/gaofei/Meta-SR-Pytorch/trainer.py", line 217, in test
sr = self.model(lr, idx_scale,scale_coord_map)
File "/home/gaofei/anaconda3/envs/metasr/lib/python3.5/site-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/home/gaofei/Meta-SR-Pytorch/model/__init__.py", line 56, in forward
return self.model(x,pos_mat)
File "/home/gaofei/anaconda3/envs/metasr/lib/python3.5/site-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/home/gaofei/Meta-SR-Pytorch/model/metardn.py", line 125, in forward
local_weight = self.P2W(pos_mat.view(pos_mat.size(1),-1)) ### (outH*outW, outC*inC*kernel_size*kernel_size)
File "/home/gaofei/anaconda3/envs/metasr/lib/python3.5/site-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/home/gaofei/Meta-SR-Pytorch/model/metardn.py", line 60, in forward
output = self.meta_block(x)
File "/home/gaofei/anaconda3/envs/metasr/lib/python3.5/site-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/home/gaofei/anaconda3/envs/metasr/lib/python3.5/site-packages/torch/nn/modules/container.py", line 91, in forward
input = module(input)
File "/home/gaofei/anaconda3/envs/metasr/lib/python3.5/site-packages/torch/nn/modules/module.py", line 491, in __call__
result = self.forward(*input, **kwargs)
File "/home/gaofei/anaconda3/envs/metasr/lib/python3.5/site-packages/torch/nn/modules/linear.py", line 55, in forward
return F.linear(input, self.weight, self.bias)
File "/home/gaofei/anaconda3/envs/metasr/lib/python3.5/site-packages/torch/nn/functional.py", line 992, in linear
return torch.addmm(bias, input, weight.t())
RuntimeError: cuda runtime error (2) : out of memory at /pytorch/aten/src/THC/generic/THCStorage.cu:58
Could you please help me out?
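A rough size estimate may help explain why a 1920x1080 output runs out of memory where the small benchmark images do not. The shape comment on the `self.P2W(...)` line in the traceback appears to describe a predicted weight tensor with outH*outW rows and outC*inC*kernel_size*kernel_size columns. The sketch below multiplies that out. The defaults `in_c=64`, `out_c=3`, `kernel_size=3` and 4-byte floats are assumptions for illustration, not values confirmed in this thread.

```python
def predicted_weight_bytes(out_h, out_w, in_c=64, out_c=3, kernel_size=3,
                           bytes_per_value=4):
    """Bytes needed for one weight vector per output pixel (assumed layout)."""
    rows = out_h * out_w
    cols = out_c * in_c * kernel_size * kernel_size
    return rows * cols * bytes_per_value


if __name__ == "__main__":
    for name, (h, w) in {"512x512": (512, 512), "1920x1080": (1080, 1920)}.items():
        gib = predicted_weight_bytes(h, w) / 2**30
        print(f"{name}: {gib:.2f} GiB for the predicted weights alone")
```

Under these assumptions the predicted weights for a 1920x1080 output come to roughly 13 GiB before any activations are counted, more than a typical 11 GB 1080 Ti provides. Testing on smaller crops or tiles is one way such a cost could be reduced, though whether the repository supports that for this model is not established here.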
Before the last commit, I could run the code fine, both the demo and DIV2K. But when I cloned the code after the change two weeks ago, there was an error in the part of the code that was changed last.
Details are below.
$ python main.py --model metardn --ext sep --save metardn --lr_decay 200 --epochs 1000 --n_GPUs 1 --batch_size 1
[1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0, 2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7, 2.8, 2.9, 3.0, 3.1, 3.2, 3.3, 3.4, 3.5, 3.6, 3.7, 3.8, 3.9, 4.0]
1
./benchmark/Set5/HR
./benchmark/Set5/LR_bicubic
5
Making model...
Preparing loss function:
1.000 * L1
[Epoch 1] Learning rate: 1.00e-4
THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=844 error=11 : invalid argument
Traceback (most recent call last):
File "main.py", line 20, in <module>
t.train()
File "/home/xkey/dml/Meta-SR-Pytorch/trainer.py", line 143, in train
scale_coord_map, mask = self.input_matrix_wpn_new(H,W,self.args.scale[idx_scale]) ### get the position matrix, mask
File "/home/xkey/dml/Meta-SR-Pytorch/trainer.py", line 106, in input_matrix_wpn_new
while(pos_mat[i][0][0]!=0):
IndexError: index 200 is out of bounds for dimension 0 with size 200
Is there something wrong about my option?
Waiting for your reply!
Hello,
After 100 epochs, the loss is between 4 and 5. In your opinion, we should use 1000 epochs. I finished that, but the loss is still between 4 and 5. Is this correct?
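I would like to ask some questions about your code. In the meta.py network structure from your paper, why are there two branches that perform different operations? It looks like one does weight prediction after projection and the other does direct projection. What is the rationale for this? And what is the purpose of the concatenation at the end?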
Hey @XuecaiHu,
Nice work and thanks for sharing your code. I have a question about your meta upscale module.
Thanks!
Hi I tried to download the test dataset from the pan.baidu link but it wasn't working. I think it is saying something about file not found.
Error:
raise NotSupportedError(base.range(), "slicing multiple dimensions at the same time isn't supported yet")
torch.jit.frontend.NotSupportedError: slicing multiple dimensions at the same time isn't supported yet
proposals (Tensor): boxes to be encoded
"""
# perform some unpacking to make it JIT-fusion friendly
wx = weights[0]
wy = weights[1]
ww = weights[2]
wh = weights[3]
proposals_x1 = proposals[:, 0].unsqueeze(1)
~~~~~~~~~ <--- HERE
proposals_y1 = proposals[:, 1].unsqueeze(1)
proposals_x2 = proposals[:, 2].unsqueeze(1)
proposals_y2 = proposals[:, 3].unsqueeze(1)
reference_boxes_x1 = reference_boxes[:, 0].unsqueeze(1)
reference_boxes_y1 = reference_boxes[:, 1].unsqueeze(1)
reference_boxes_x2 = reference_boxes[:, 2].unsqueeze(1)
reference_boxes_y2 = reference_boxes[:, 3].unsqueeze(1)
Hi,
Thanks for sharing your code.
The training phase needs 1000 epochs.
If I stop the training at epoch 500, how do I load the model and continue training?
Should I just set args.resume=500? Do I also need to change the parameter args.start_epoch?
Thank you.
Hi Xuecai,
Are there any issues with using this on 1x256x256 images (with n_colours = 1 instead of 3)? Are there any parameters that should be tweaked or modified to accommodate that n_colours =1? I ran into this issue and I am not sure where to start looking to debug this.
79 output = torch.masked_select(output,mask)
---> 80 output = output.contiguous().view(N,C,outH,outW)
81 loss = criterion(output,target)
82
RuntimeError: shape '[2, 1, 256, 256]' is invalid for input of size 393216
Here batch size is 2, n_colours = out_channels = 1, and the HR image size is 256x256
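The numbers in the error message can be unpacked with a short calculation. This sketch only does the arithmetic on the sizes quoted above; `implied_channels` is a hypothetical helper, and where a channel count of 3 might still be hard-coded in the repository is not something this post establishes.

```python
def implied_channels(selected_values, batch, out_h, out_w):
    """How many channels the selected values correspond to for the given N, H, W."""
    per_channel = batch * out_h * out_w
    if selected_values % per_channel != 0:
        raise ValueError("selected size is not a whole number of channels")
    return selected_values // per_channel


if __name__ == "__main__":
    # Sizes taken from the RuntimeError: 393216 values, target view [2, 1, 256, 256].
    print(implied_channels(393216, batch=2, out_h=256, out_w=256))
```

The result suggests that `torch.masked_select` returned three channels' worth of values, so either the network output or the mask still seems to have three channels even though `n_colours` is 1.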
Hello XuecaiHu,
I'm trying to run the test demo. When I run the code, an error occurs:
python main.py --model metardn --save metardn --ext sep --pre_train ./experiment/metardn/model/model_1000.pt --test_only --data_test Set5 --scale 4 --n_GPUs 1
the error information is:
File "main.py", line 18, in <module>
while not t.terminate():
File "/Meta-SR-Pytorch-master/trainer.py", line 270, in terminate
self.test()
File "/Meta-SR-Pytorch-master/trainer.py", line 210, in test
scale_coord_map, mask = self.input_matrix_wpn(H,W,self.args.scale[idx_scale])
File "/Meta-SR-Pytorch-master/trainer.py", line 76, in input_matrix_wpn
h_offset[int_h_project_coord[i], flag, 0] = offset_h_coord[i]
IndexError: index 4 is out of bounds for dimension 0 with size 4
Looking forward to your reply, thank you very much
Hi, Xuecai.
Thanks for your wonderful work about arbitrary upscale factor of SR.
What I want to know is: if I only have one GPU, how should I modify the code?
Besides, could you tell me how long the training process takes?
Thanks a lot.
Sir:
When I test with my trained model, it runs out of memory no matter which scale I select: 1.5, 2.0, or 4.0.
My GPU has 32 GB of memory, but I still hit this problem when testing.
thank you very much!
(1)scale==4.0 error:
RuntimeError: CUDA out of memory. Tried to allocate 5.94 GiB (GPU 0; 31.72 GiB total capacity; 25.29 GiB already allocated; 3.15 GiB free; 2.23 GiB cached)
(2)scale==2.0 error:
RuntimeError: CUDA out of memory. Tried to allocate 5.94 GiB (GPU 0; 31.72 GiB total capacity; 27.52 GiB already allocated; 1.50 GiB free; 1.65 GiB cached)
(3)scale==1.5 error:
RuntimeError: CUDA out of memory. Tried to allocate 31.66 GiB (GPU 0; 31.72 GiB total capacity; 10.15 GiB already allocated; 19.06 GiB free; 1.47 GiB cached)
and my script is:
python test.py --model metardn --save metardn --ext sep --pre_train ./experiment/metardn/model/model_1000.pt --test_only --data_test DIV2K --dir_data ./benchmark --scale 1.5 --n_GPUs 3 --data_range 1-1 0 --batch_size 1
./experiment/metardn/model/model_1000.pt --test_only --data_test DIV2K --dir_data ./benchmark --scale 1.5 --n_GPUs 1 --data_range 1-1 0 --batch_size 1
Hello, I can't figure out the get_model(self) function in model/__init__.py. Can you explain what "self.model.module" means? I could not find a module attribute or function in the metardn class (self.model). Looking forward to your reply, and thanks!
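As background, `torch.nn.DataParallel` wraps a network and keeps the original network in its `.module` attribute, so `self.model.module` only exists when the model has been wrapped that way. That would also be consistent with the `AttributeError: 'MetaRDN' object has no attribute 'module'` reported in another issue on this page. The sketch below illustrates the unwrapping idea with stand-in classes so it runs without PyTorch; `Net`, `Wrapper` and `unwrap` are hypothetical names and not the repository's code.

```python
class Net:
    """Stand-in for a network such as MetaRDN (no PyTorch required)."""

    def state_dict(self):
        return {"weight": 1.0}


class Wrapper:
    """Stand-in for torch.nn.DataParallel, which stores the wrapped net as .module."""

    def __init__(self, module):
        self.module = module


def unwrap(model):
    """Return the underlying network whether or not it has been wrapped."""
    return getattr(model, "module", model)


if __name__ == "__main__":
    net = Net()
    print(unwrap(net) is net, unwrap(Wrapper(net)) is net)
```

With real PyTorch, an `isinstance(model, torch.nn.DataParallel)` check would be a stricter test than `getattr`, since a network could legitimately have its own attribute named `module`.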
Hi,
I could not find detailed setup for training EDSR(x1) in the paper.
Would you mind telling me the setup details?
Unknown lists are below:
thank you.
I have installed the required modules, but I still get errors when running:
RuntimeError: cuda runtime error (2) : out of memory at C:/Users/Administrator/Downloads/new-builder/win-wheel/pytorch/aten/src/THC/THCTensorRandom.cu:25
RuntimeError: cuda runtime error (30) : unknown error at C:/Users/Administrator/Downloads/new-builder/win-wheel/pytorch/aten/src/THC/THCTensorRandom.cu:25
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
    if __name__ == '__main__':
        freeze_support()
        ...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
If anyone has encountered this error, can you help me?
thanks!
Hi,
I have some questions about the paper. Take the RDN-based Meta-SR as an example. How is the Feature Learning Module trained? Is the upsampling module simply removed from RDN x4? In addition, how is the meta-upscale module trained? Are the parameters of the Feature Learning Module fixed while the meta-upscale module is trained, or are both trained jointly?
If it is convenient, could you describe the training pipeline in detail? What are the input and the label?
Thank you very much, and much respect!!
Look forward to your kind reply!
When I run main.py with command:
python main.py --model metardn --save metardn --ext sep --pre_train ./experiment/metardn/model/model_1000.pt --test_only --data_test Set5 --scale 1.5 --cpu
The following error occurs:
[1.5]
./benchmark/Set5/HR
./benchmark/Set5/LR_bicubic
5
Making model...
Loading model from ./experiment/metardn/model/model_1000.pt
Traceback (most recent call last):
File "main.py", line 15, in <module>
model = model.Model(args, checkpoint)
File "/Users/junfenghe/Code/Github/Meta-SR-Pytorch/model/__init__.py", line 34, in __init__
cpu=args.cpu
File "/Users/junfenghe/Code/Github/Meta-SR-Pytorch/model/__init__.py", line 101, in load
self.get_model().load_state_dict(
File "/Users/junfenghe/Code/Github/Meta-SR-Pytorch/model/__init__.py", line 60, in get_model
return self.model.module
File "/Users/junfenghe/anaconda3/envs/py35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 532, in __getattr__
type(self).__name__, name))
AttributeError: 'MetaRDN' object has no attribute 'module'
My system is macOS Mojave 10.14.4, python 3.5, and I think my package versions are all correct:
scikit-image 0.13.1 py35h1de35cc_1 defaults
torch 0.4.0 pypi_0 pypi
torchvision 0.2.0 pypi_0 pypi