ahupujr / efnet
Event-based Fusion for Motion Deblurring with Cross-modal Attention (ECCV'22 Oral) https://ahupujr.github.io/EFNet/
License: Other
Hi, how can I run EFNet on a custom dataset of video frames in which the root contains subfolders, each holding the frames of a separate video? The YAML configs here don't cover datasets of this type; you only mentioned the h5 format.
How do I deblur blurry video and event data that I have captured myself, and what data format does the network require at inference?
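For readers with the same layout, here is a minimal sketch of how the folder structure described above (a root with one subfolder of frames per video) could be enumerated. It only covers the directory traversal and is not EFNet's dataset class. The function name `list_video_frames`, the accepted file suffixes, and the demo folder names are assumptions made for illustration.

```python
import tempfile
from pathlib import Path

# Assumed frame formats; adjust to whatever the captured frames use.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def list_video_frames(root):
    """Map each video subfolder name to its frame paths in sorted order."""
    videos = {}
    for video_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        # Lexicographic sort: frame names should be zero-padded (0001.png, ...).
        frames = sorted(
            f for f in video_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
        if frames:
            videos[video_dir.name] = frames
    return videos


# Demo on a throwaway directory with two hypothetical videos.
with tempfile.TemporaryDirectory() as tmp:
    layout = {"video_a": ["0002.png", "0001.png"], "video_b": ["0001.jpg"]}
    for video, names in layout.items():
        (Path(tmp) / video).mkdir()
        for name in names:
            (Path(tmp) / video / name).touch()
    demo = {
        video: [f.name for f in frames]
        for video, frames in list_video_frames(tmp).items()
    }
```

Each entry in the returned mapping could then be processed as one sequence. The event data for each frame would still have to be paired separately, which this sketch does not attempt.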
Hi,
I want to run the test script on my own frames without creating a dataset, but I am not sure how to do it. Could you please guide me on how to run the test script on my own frames?
Hi, can you share the GoPro dataset with the raw events?
Thank you for your awesome code!
I am hoping you might open-source the log files you have from training, for example the training and validation loss as a function of epoch (and/or batch), with an estimate of the runtime.
Hello author, while training your network I ran into a NaN loss during the iterations, and training cannot proceed normally. How can I solve this?
2023-06-16 21:58:43,765 INFO: [debug..][epoch: 0, iter: 151, lr:(2.000e-04,2.000e-05,)] [eta: 395 days, 5:42:49, time (data): 0.489 (0.001)] l_pix: -3.0743e+01
2023-06-16 21:58:44,519 INFO: [debug..][epoch: 0, iter: 152, lr:(2.000e-04,2.000e-05,)] [eta: 392 days, 15:56:30, time (data): 0.482 (0.001)] l_pix: -2.8084e+01
2023-06-16 21:58:44,519 INFO: Saving models and training states.
Test 00001088: 100%|██████████| 1089/1089 [23:56<00:00, 1.29s/image]
2023-06-16 22:22:48,051 INFO: Validation debug, # psnr: 28.4838 # ssim: 0.9036
2023-06-16 22:22:48,546 INFO: [debug..][epoch: 0, iter: 153, lr:(2.000e-04,2.000e-05,)] [eta: 411 days, 19:13:58, time (data): 0.490 (0.001)] l_pix: -3.0197e+01
2023-06-16 22:22:49,036 INFO: [debug..][epoch: 0, iter: 154, lr:(2.000e-04,2.000e-05,)] [eta: 409 days, 3:35:46, time (data): 0.489 (0.001)] l_pix: -2.7506e+01
2023-06-16 22:22:49,505 INFO: [debug..][epoch: 0, iter: 155, lr:(2.000e-04,2.000e-05,)] [eta: 406 days, 12:46:06, time (data): 0.470 (0.001)] l_pix: inf
2023-06-16 22:22:49,988 INFO: [debug..][epoch: 0, iter: 156, lr:(2.000e-04,2.000e-05,)] [eta: 403 days, 22:44:43, time (data): 0.482 (0.002)] l_pix: nan
2023-06-16 22:22:50,474 INFO: [debug..][epoch: 0, iter: 157, lr:(2.000e-04,2.000e-05,)] [eta: 401 days, 9:30:33, time (data): 0.487 (0.002)] l_pix: nan
2023-06-16 22:22:50,938 INFO: [debug..][epoch: 0, iter: 158, lr:(2.000e-04,2.000e-05,)] [eta: 398 days, 21:02:06, time (data): 0.464 (0.001)] l_pix: nan
2023-06-16 22:22:51,750 INFO: [debug..][epoch: 0, iter: 159, lr:(2.000e-04,2.000e-05,)] [eta: 396 days, 9:26:15, time (data): 0.487 (0.001)] l_pix: nan
2023-06-16 22:22:52,240 INFO: [debug..][epoch: 0, iter: 160, lr:(2.000e-04,2.000e-05,)] [eta: 393 days, 22:28:10, time (data): 0.490 (0.002)] l_pix: nan
2023-06-16 22:22:52,240 INFO: Saving models and training states.
Test 00001088: 100%|██████████| 1089/1089 [21:33<00:00, 1.19s/image]
2023-06-16 22:44:32,789 INFO: Validation debug, # psnr: -42.1933 # ssim: 0.0002
2023-06-16 22:44:33,276 INFO: [debug..][epoch: 0, iter: 161, lr:(2.000e-04,2.000e-05,)] [eta: 410 days, 1:52:18, time (data): 0.481 (0.001)] l_pix: nan
2023-06-16 22:44:33,763 INFO: [debug..][epoch: 0, iter: 162, lr:(2.000e-04,2.000e-05,)] [eta: 407 days, 13:36:31, time (data): 0.485 (0.001)] l_pix: nan
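When reading a long log like the one above, it can help to locate the exact iteration where `l_pix` first stops being finite. The following is a minimal sketch that scans log lines in the format shown here. The function name `first_non_finite` and the regular expression are assumptions written for this log format; they are not part of EFNet or BasicSR.

```python
import math
import re

# Captures the iteration number and the l_pix value from a training log line.
LINE_RE = re.compile(r"iter:\s*(\d+),.*?l_pix:\s*(\S+)")


def first_non_finite(log_lines):
    """Return (iteration, value) for the first inf/nan l_pix, or None."""
    for line in log_lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # validation, checkpoint, and progress-bar lines
        iteration = int(match.group(1))
        value = float(match.group(2))  # float() parses "inf" and "nan"
        if not math.isfinite(value):
            return iteration, value
    return None


# Three lines copied from the log above.
sample = [
    "2023-06-16 22:22:49,036 INFO: [debug..][epoch: 0, iter: 154, lr:(2.000e-04,2.000e-05,)] [eta: 409 days, 3:35:46, time (data): 0.489 (0.001)] l_pix: -2.7506e+01",
    "2023-06-16 22:22:49,505 INFO: [debug..][epoch: 0, iter: 155, lr:(2.000e-04,2.000e-05,)] [eta: 406 days, 12:46:06, time (data): 0.470 (0.001)] l_pix: inf",
    "2023-06-16 22:22:49,988 INFO: [debug..][epoch: 0, iter: 156, lr:(2.000e-04,2.000e-05,)] [eta: 403 days, 22:44:43, time (data): 0.482 (0.002)] l_pix: nan",
]
result = first_non_finite(sample)
```

On these lines the loss becomes `inf` at iteration 155 and `nan` from iteration 156 onward, so the batch loaded at iteration 155 is the first one to inspect.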
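Hello author, while training your network I get a PSNR of -42 and an l_pix of nan, so training cannot proceed normally. How can I solve this? I only changed the dataset path and kept everything else unchanged.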
2023-06-15 21:08:13,778 INFO: Model [ImageEventRestorationModel] is created.
2023-06-15 21:08:13,804 INFO: Resuming training from epoch: 0, iter: 648.
2023-06-15 21:08:13,951 INFO: Start training from epoch: 0, iter: 648
2023-06-15 21:08:18,450 INFO: [debug..][epoch: 0, iter: 649, lr:(2.000e-04,2.000e-05,)] [eta: 5 days, 8:38:06, time (data): 4.499 (1.987)] l_pix: nan
2023-06-15 21:08:19,032 INFO: [debug..][epoch: 0, iter: 650, lr:(2.000e-04,2.000e-05,)] [eta: 4 days, 0:29:47, time (data): 0.581 (0.003)] l_pix: nan
2023-06-15 21:08:19,602 INFO: [debug..][epoch: 0, iter: 651, lr:(2.000e-04,2.000e-05,)] [eta: 3 days, 8:16:14, time (data): 0.570 (0.003)] l_pix: nan
2023-06-15 21:08:20,166 INFO: [debug..][epoch: 0, iter: 652, lr:(2.000e-04,2.000e-05,)] [eta: 2 days, 22:27:08, time (data): 0.563 (0.002)] l_pix: nan
2023-06-15 21:08:20,727 INFO: [debug..][epoch: 0, iter: 653, lr:(2.000e-04,2.000e-05,)] [eta: 2 days, 15:53:43, time (data): 0.561 (0.002)] l_pix: nan
2023-06-15 21:08:21,331 INFO: [debug..][epoch: 0, iter: 654, lr:(2.000e-04,2.000e-05,)] [eta: 2 days, 11:32:32, time (data): 0.603 (0.002)] l_pix: nan
2023-06-15 21:08:21,946 INFO: [debug..][epoch: 0, iter: 655, lr:(2.000e-04,2.000e-05,)] [eta: 2 days, 8:21:24, time (data): 0.615 (0.002)] l_pix: nan
2023-06-15 21:08:22,558 INFO: [debug..][epoch: 0, iter: 656, lr:(2.000e-04,2.000e-05,)] [eta: 2 days, 5:51:33, time (data): 0.611 (0.002)] l_pix: nan
2023-06-15 21:08:22,558 INFO: Saving models and training states.
2023-06-15 21:26:51,089 INFO: Validation debug, # psnr: -42.1933 # ssim: 0.0002
2023-06-15 21:26:51,656 INFO: [debug..][epoch: 0, iter: 657, lr:(2.000e-04,2.000e-05,)] [eta: 257 days, 21:51:19, time (data): 0.563 (0.002)] l_pix: nan
2023-06-15 21:26:52,259 INFO: [debug..][epoch: 0, iter: 658, lr:(2.000e-04,2.000e-05,)] [eta: 234 days, 14:08:50, time (data): 0.602 (0.003)] l_pix: nan
2023-06-15 21:26:52,840 INFO: [debug..][epoch: 0, iter: 659, lr:(2.000e-04,2.000e-05,)] [eta: 215 days, 3:37:24, time (data): 0.581 (0.002)] l_pix: nan
I ran the experiment on a single GPU, changing only the number of GPUs and keeping everything else the same, but in the end I did not get any results. How can I get results with a single GPU?
The raw GoPro data are not in H5 format, so how should I handle this?
Thank you.
Could you check whether there are any problems with the qualitative results? The links all point to the GoPro result.
What is the shape of an ideal input tensor?
4-GPU distributed training fails. My machine has 8 Titan GPUs, and the error message is: ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -11) local_rank: 0; torch.distributed.elastic.multiprocessing.errors.ChildFailedError. I used the training command given in the README.
Hi, I really appreciate your excellent work and am trying to retrain the network, but it takes quite a long time to reach the results reported in the paper. Is it normal for training to take about 5 days with a batch size of 4 for 800k iterations on a GTX 1080Ti? Could you please share some details about the training time?
I would appreciate an early reply.
Dear author, first of all, thank you for your excellent work. I noticed that each image has a corresponding single-channel mask. How is this mask generated, and what is it used for?
Hello, first of all, thank you for your good research.
I was trying to reproduce the performance reported in your paper with the SCER-GoPro dataset that you shared as a link.
(Before I started training from scratch, I had checked that your pre-trained weights gave me a PSNR of 35.44.
I thought this difference was not that big.)
When I trained on the SCER-GoPro dataset with the code implementation here, the total number of iterations was set to 200k, which does not match the paper (the paper says the total number of iterations is 300k). I therefore thought the discrepancy came from that difference.
So I trained for an additional 100k iterations; however, the performance became lower than at 200k iterations.
Is this caused by an issue with resuming training, or do I have to modify some part of the experimental settings in the code to obtain the same performance?
The results are as follows:
psnr 35.46 ssim 0.972 at 200K iter
psnr: 34.4930 ssim: 0.9662 at 200K + 100K iter
I would really appreciate it if you could answer my question!
Hello!
Thank you for the nice work!
I really appreciate it.
I'm curious about the qualitative results files you uploaded.
I think the links for GoPro test, REBlur test, and REBlur addition are all identical (they all point to GoPro test).
Can you update the file links (Google Drive) for the REBlur test and REBlur addition results?
Thanks!
Hi,
Thanks for your great work. Any plans for releasing the dataset?
I find the link in the repo is unavailable.
Thanks a lot.
How can I modify the code to run on multiple GPUs? Thanks!
raise subprocess.CalledProcessError(returncode=process.returncode,
subprocess.CalledProcessError: Command '['/root/miniconda3/bin/python', '-u', 'basicsr/train.py', '--local_rank=0', '-opt', 'options/train/GoPro/EFNet.yml', '--launcher', 'pytorch']' died with <Signals.SIGKILL: 9>.
(base) root@autodl-container-5d4911a352-200b5df6:~/autodl-tmp/EFNet-main/EFNet-main# /root/miniconda3/lib/python3.8/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 35 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
Could you kindly upload the pre-trained model?