ahupujr / efnet Goto Github PK

Event-based Fusion for Motion Deblurring with Cross-modal Attention (ECCV'22 Oral) https://ahupujr.github.io/EFNet/

License: Other

Python 100.00%

eccv2022 oral deblurring event-camera image-restoration

efnet's Issues

Running EFNet on custom dataset

Hi, how can I run EFNet on a custom dataset of video frames where the root folder contains subfolders, each holding the frames of a separate video? The YAML configs provided here don't cover datasets of this type; only the H5 format is mentioned.
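As a starting point for the folder layout described above, here is a minimal sketch that indexes such a root directory into per-video frame lists. It is not part of the EFNet code base: the function name `index_video_frames` and the accepted extensions are assumptions made for illustration, and the resulting index would still have to be connected to a dataset class or converted to the H5 format the repository mentions.

```python
from pathlib import Path

# Hypothetical set of frame extensions; adjust to match your data.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".bmp"}


def index_video_frames(root):
    """Map each video subfolder name to a sorted list of its frame paths.

    Assumes the layout root/<video_name>/<frame files>. Subfolders that
    contain no image files are skipped.
    """
    root = Path(root)
    index = {}
    for video_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        frames = sorted(
            f for f in video_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_EXTS
        )
        if frames:
            index[video_dir.name] = frames
    return index
```

Sorting by file name only gives the correct temporal order when frame names are zero-padded (for example `0001.png`, `0002.png`).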

How to deblur blurry video and event data that I shot myself

How do I deblur blurry video and event data that I captured myself, and what data format does the network require at inference time?

running on my own frames

Hi,
I want to run the test script on my own frames without creating a dataset, but I am not sure how to do it. Could you please guide me on how to run the test script on my own frames?

Questions about the dataset

Hi, can you share the GoPro dataset with the raw events?

Log Files from Training

Thank you for your awesome code!

I am hoping you might open-source the log files from your training runs, for example the training and validation loss as a function of epoch (and/or batch), along with an estimate of the runtime.

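Loss is NaN during training

Hello, while training your network I ran into a NaN loss partway through the iterations, and training cannot proceed normally. How can I solve this?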
2023-06-16 21:58:43,765 INFO: [debug..][epoch: 0, iter: 151, lr:(2.000e-04,2.000e-05,)] [eta: 395 days, 5:42:49, time (data): 0.489 (0.001)] l_pix: -3.0743e+01
2023-06-16 21:58:44,519 INFO: [debug..][epoch: 0, iter: 152, lr:(2.000e-04,2.000e-05,)] [eta: 392 days, 15:56:30, time (data): 0.482 (0.001)] l_pix: -2.8084e+01
2023-06-16 21:58:44,519 INFO: Saving models and training states.
Test 00001088: 100%|██████████| 1089/1089 [23:56<00:00, 1.29s/image]
2023-06-16 22:22:48,051 INFO: Validation debug, # psnr: 28.4838 # ssim: 0.9036
2023-06-16 22:22:48,546 INFO: [debug..][epoch: 0, iter: 153, lr:(2.000e-04,2.000e-05,)] [eta: 411 days, 19:13:58, time (data): 0.490 (0.001)] l_pix: -3.0197e+01
2023-06-16 22:22:49,036 INFO: [debug..][epoch: 0, iter: 154, lr:(2.000e-04,2.000e-05,)] [eta: 409 days, 3:35:46, time (data): 0.489 (0.001)] l_pix: -2.7506e+01
2023-06-16 22:22:49,505 INFO: [debug..][epoch: 0, iter: 155, lr:(2.000e-04,2.000e-05,)] [eta: 406 days, 12:46:06, time (data): 0.470 (0.001)] l_pix: inf
2023-06-16 22:22:49,988 INFO: [debug..][epoch: 0, iter: 156, lr:(2.000e-04,2.000e-05,)] [eta: 403 days, 22:44:43, time (data): 0.482 (0.002)] l_pix: nan
2023-06-16 22:22:50,474 INFO: [debug..][epoch: 0, iter: 157, lr:(2.000e-04,2.000e-05,)] [eta: 401 days, 9:30:33, time (data): 0.487 (0.002)] l_pix: nan
2023-06-16 22:22:50,938 INFO: [debug..][epoch: 0, iter: 158, lr:(2.000e-04,2.000e-05,)] [eta: 398 days, 21:02:06, time (data): 0.464 (0.001)] l_pix: nan
2023-06-16 22:22:51,750 INFO: [debug..][epoch: 0, iter: 159, lr:(2.000e-04,2.000e-05,)] [eta: 396 days, 9:26:15, time (data): 0.487 (0.001)] l_pix: nan
2023-06-16 22:22:52,240 INFO: [debug..][epoch: 0, iter: 160, lr:(2.000e-04,2.000e-05,)] [eta: 393 days, 22:28:10, time (data): 0.490 (0.002)] l_pix: nan
2023-06-16 22:22:52,240 INFO: Saving models and training states.
Test 00001088: 100%|██████████| 1089/1089 [21:33<00:00, 1.19s/image]
2023-06-16 22:44:32,789 INFO: Validation debug, # psnr: -42.1933 # ssim: 0.0002
2023-06-16 22:44:33,276 INFO: [debug..][epoch: 0, iter: 161, lr:(2.000e-04,2.000e-05,)] [eta: 410 days, 1:52:18, time (data): 0.481 (0.001)] l_pix: nan
2023-06-16 22:44:33,763 INFO: [debug..][epoch: 0, iter: 162, lr:(2.000e-04,2.000e-05,)] [eta: 407 days, 13:36:31, time (data): 0.485 (0.001)] l_pix: nan
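To pin down where the loss diverges in a log like the one above, a small parser can report the first iteration whose `l_pix` value is not finite. This is a sketch based only on the line format shown in this issue; the helper name `first_non_finite_iter` is hypothetical and is not part of the repository.

```python
import math
import re

# Matches training lines such as "... iter: 155, lr:(...)] ... l_pix: inf".
LINE_RE = re.compile(r"iter:\s*(\d+),.*?l_pix:\s*(\S+)")


def first_non_finite_iter(log_lines):
    """Return the first iteration whose l_pix is inf or nan, else None."""
    for line in log_lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # validation, checkpoint, and progress-bar lines
        try:
            value = float(match.group(2))
        except ValueError:
            continue  # unexpected token after "l_pix:"
        if not math.isfinite(value):
            return int(match.group(1))
    return None
```

Applied to the log above, this returns 155: `l_pix` is finite through iteration 154, becomes `inf` at 155, and is `nan` from 156 onward. The last healthy checkpoint is therefore the one saved at iteration 152.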

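Loss is NaN

Hello, while training your network, the PSNR became -42 and l_pix became nan, so training cannot proceed normally. How can I solve this? I only changed the dataset path; everything else was left unchanged.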

2023-06-15 21:08:13,778 INFO: Model [ImageEventRestorationModel] is created.
2023-06-15 21:08:13,804 INFO: Resuming training from epoch: 0, iter: 648.
2023-06-15 21:08:13,951 INFO: Start training from epoch: 0, iter: 648
2023-06-15 21:08:18,450 INFO: [debug..][epoch: 0, iter: 649, lr:(2.000e-04,2.000e-05,)] [eta: 5 days, 8:38:06, time (data): 4.499 (1.987)] l_pix: nan
2023-06-15 21:08:19,032 INFO: [debug..][epoch: 0, iter: 650, lr:(2.000e-04,2.000e-05,)] [eta: 4 days, 0:29:47, time (data): 0.581 (0.003)] l_pix: nan
2023-06-15 21:08:19,602 INFO: [debug..][epoch: 0, iter: 651, lr:(2.000e-04,2.000e-05,)] [eta: 3 days, 8:16:14, time (data): 0.570 (0.003)] l_pix: nan
2023-06-15 21:08:20,166 INFO: [debug..][epoch: 0, iter: 652, lr:(2.000e-04,2.000e-05,)] [eta: 2 days, 22:27:08, time (data): 0.563 (0.002)] l_pix: nan
2023-06-15 21:08:20,727 INFO: [debug..][epoch: 0, iter: 653, lr:(2.000e-04,2.000e-05,)] [eta: 2 days, 15:53:43, time (data): 0.561 (0.002)] l_pix: nan
2023-06-15 21:08:21,331 INFO: [debug..][epoch: 0, iter: 654, lr:(2.000e-04,2.000e-05,)] [eta: 2 days, 11:32:32, time (data): 0.603 (0.002)] l_pix: nan
2023-06-15 21:08:21,946 INFO: [debug..][epoch: 0, iter: 655, lr:(2.000e-04,2.000e-05,)] [eta: 2 days, 8:21:24, time (data): 0.615 (0.002)] l_pix: nan
2023-06-15 21:08:22,558 INFO: [debug..][epoch: 0, iter: 656, lr:(2.000e-04,2.000e-05,)] [eta: 2 days, 5:51:33, time (data): 0.611 (0.002)] l_pix: nan
2023-06-15 21:08:22,558 INFO: Saving models and training states.
2023-06-15 21:26:51,089 INFO: Validation debug, # psnr: -42.1933 # ssim: 0.0002
2023-06-15 21:26:51,656 INFO: [debug..][epoch: 0, iter: 657, lr:(2.000e-04,2.000e-05,)] [eta: 257 days, 21:51:19, time (data): 0.563 (0.002)] l_pix: nan
2023-06-15 21:26:52,259 INFO: [debug..][epoch: 0, iter: 658, lr:(2.000e-04,2.000e-05,)] [eta: 234 days, 14:08:50, time (data): 0.602 (0.003)] l_pix: nan
2023-06-15 21:26:52,840 INFO: [debug..][epoch: 0, iter: 659, lr:(2.000e-04,2.000e-05,)] [eta: 215 days, 3:37:24, time (data): 0.581 (0.002)] l_pix: nan

The experimental results did not live up to those in the paper

I ran the experiment on a single GPU and only changed the number of GPUs, keeping everything else the same, but I did not obtain the results reported in the paper. How can I reproduce those results with a single GPU?

How to convert the GoPro data to H5 format?

The raw GoPro data are not in H5 format, so how can I convert them?

Thank you.

All the qualitative results link to the same GoPro results

Could you check whether there are any problems with the qualitative results? They all link to the GoPro result.

Ideal shape for x and event in `def forward(self, x, event, mask=None)` in EFNet

What is the shape of an ideal input tensor?

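Distributed training error

Distributed training on 4 GPUs fails. My machine has 8 Titan GPUs, and the error message is: ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: -11) local_rank: 0; torch.distributed.elastic.multiprocessing.errors.ChildFailedError. I used the training command given in the README.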

Question about training time

Hi, I really appreciate your excellent work and am trying to retrain the network, but it is taking quite a long time to reach the results in the paper. Is it normal for training to take about 5 days with a batch size of 4 for 800k iterations on a GTX 1080Ti? Could you please share some details about the time your training took?
I would appreciate an early reply.
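For reference, the figures quoted above can be turned into a per-iteration time, which is easier to compare against the `time (data)` column of the training log. The sketch below is plain arithmetic; the function names are hypothetical and nothing here comes from the EFNet code.

```python
SECONDS_PER_DAY = 24 * 3600


def seconds_per_iteration(days, iterations):
    """Average wall-clock seconds per training iteration."""
    return days * SECONDS_PER_DAY / iterations


def total_days(sec_per_iter, iterations):
    """Wall-clock days needed for a number of iterations at a given speed."""
    return sec_per_iter * iterations / SECONDS_PER_DAY


# 5 days for 800k iterations works out to about 0.54 s per iteration.
rate = seconds_per_iteration(days=5, iterations=800_000)
```

At that rate, a 200k-iteration run would take about 1.25 days, not counting validation passes.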

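What is the mask used for?

Dear authors, first of all, thank you for your excellent work. I noticed that each image has a corresponding single-channel mask. How is this mask generated, and what is it used for?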

Training from scratch (200k iters + 100k iters) doesn't achieve the reported performance (PSNR 35.46) with GoPro

Hello, and first of all, thank you for your good research.

I was trying to reproduce the performance reported in your paper with the SCER-GoPro dataset that you shared as a link.
(Before training from scratch, I checked that your pre-trained weights gave me PSNR 35.44.
I thought this difference was not that big.)

When I trained with the SCER-GoPro dataset and the code in this repository, the total number of iterations was set to 200k, which does not match the paper (the paper says the total number of iterations was 300k). I thought the inconsistency came from that difference.
I therefore trained for an additional 100k iterations; however, the performance became lower than at 200k iterations.

Is this caused by an issue with resuming training, or do I have to modify some part of the experimental settings in the code to obtain the same performance?
The results are as follows:

psnr 35.46 ssim 0.972 at 200K iter
psnr: 34.4930 ssim: 0.9662 at 200K + 100K iter

I would really appreciate it if you could answer my question!

About qualitative results

Hello!
Thank you for the nice work!
I really appreciate it.
I'm curious about the qualitative results files you uploaded.
I think the links for the GoPro test, REBlur test, and REBlur addition are all identical (they all point to the GoPro test).
Could you update the Google Drive file links for the REBlur test and REBlur addition results?
Thanks!

Dataset release

Hi,

Thanks for your great work. Any plans for releasing the dataset?
I find the link in the repo is unavailable.

Thanks a lot.

multi-GPU

How can I modify the code to run on multiple GPUs? Thanks!

train error

raise subprocess.CalledProcessError(returncode=process.returncode,

subprocess.CalledProcessError: Command '['/root/miniconda3/bin/python', '-u', 'basicsr/train.py', '--local_rank=0', '-opt', 'options/train/GoPro/EFNet.yml', '--launcher', 'pytorch']' died with <Signals.SIGKILL: 9>.
(base) root@autodl-container-5d4911a352-200b5df6:~/autodl-tmp/EFNet-main/EFNet-main# /root/miniconda3/lib/python3.8/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 35 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '

Pretrained model does not exist at the provided link

Could you kindly upload the pre-trained model?
