bit-da / i2v-gan

ACMMM2021 paper "I2V-GAN: Unpaired Infrared-to-Visible Video Translation"

Home Page: https://dl.acm.org/doi/10.1145/3474085.3475445

License: MIT License

Python 99.30% HTML 0.70%

pytorch video-to-video-translation infrared-to-visible-translation gans

I2V-GAN

This repository is the official PyTorch implementation of the ACMMM2021 paper
"I2V-GAN: Unpaired Infrared-to-Visible Video Translation". [Arxiv] [ACM DL]

Traffic I2V Example:

Download a pretrained model from Baidu Netdisk [Access code: Traf] or Google Drive.

Monitoring I2V Example:

Flower Translation Example:

Introduction

Abstract

Human vision is often adversely affected by complex environmental factors, especially in night vision scenarios. Thus, infrared cameras are often leveraged to help enhance the visual effects via detecting infrared radiation in the surrounding environment, but the infrared videos are undesirable due to the lack of detailed semantic information. In such a case, an effective video-to-video translation method from the infrared domain to the visible counterpart is strongly needed by overcoming the intrinsic huge gap between infrared and visible fields.
Our work proposes an infrared-to-visible (I2V) video translation method, I2V-GAN, which generates fine-grained and spatio-temporally consistent visible-light videos from unpaired infrared videos.
The backbone network follows Cycle-GAN and Recycle-GAN.

Technically, our model capitalizes on three types of constraints: an adversarial constraint to generate synthetic frames that are similar to real ones; cyclic consistency, with an added perceptual loss, for effective content conversion as well as style preservation; and similarity constraints across and within domains to enhance content and motion consistency in both the spatial and temporal dimensions at a fine-grained level.

IRVI Dataset

Download from Baidu Netdisk [Access code: IRVI] or Google Drive.

Data Structure

SUBSET                TRAIN    TEST    TOTAL FRAMES
Traffic               17000    1000    18000
Monitoring  sub-1      1384     347     1731
            sub-2      1040     260     1300
            sub-3      1232     308     1540
            sub-4       672     169      841
            sub-5       752     188      940
            all subs                    6352
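As a quick sanity check on a downloaded copy, the frame counts in the table can be compared with what is actually on disk. The sketch below is not part of the repository: `count_frames` is a hypothetical helper, and it assumes that each split lives in its own sub-folder (for example `trainA` or `testA`, the folder names that appear in the issue reports further down) and that frames are stored as ordinary image files.

```python
import os
import tempfile

# Assumed frame extensions; adjust to match the files in your copy of IRVI.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp"}


def count_frames(root):
    """Count image files under each top-level sub-folder of `root` (recursively)."""
    counts = {}
    for split in sorted(os.listdir(root)):
        split_dir = os.path.join(root, split)
        if not os.path.isdir(split_dir):
            continue  # skip loose files next to the split folders
        total = 0
        for _dirpath, _dirnames, filenames in os.walk(split_dir):
            total += sum(
                1 for name in filenames
                if os.path.splitext(name)[1].lower() in IMAGE_EXTS
            )
        counts[split] = total
    return counts


if __name__ == "__main__":
    # Demo on a throwaway directory so the sketch runs without the dataset.
    with tempfile.TemporaryDirectory() as tmp:
        for split, n in (("trainA", 3), ("testA", 1)):
            os.makedirs(os.path.join(tmp, split))
            for i in range(n):
                open(os.path.join(tmp, split, f"{i:05d}.png"), "w").close()
        print(count_frames(tmp))
```

Pointing `count_frames` at one subset folder of the real dataset gives per-split counts that can be compared with the TRAIN and TEST columns above.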

Installation

The code is implemented with Python 3.6 and PyTorch 1.9.0 for CUDA 11.2.

Install dependencies:
pip install -r requirements.txt

Usage

Train

python train.py --dataroot /path/to/dataset \
--display_env visdom_env_name --name exp_name \
--model i2vgan --which_model_netG resnet_6blocks \
--no_dropout --pool_size 0 \
--which_model_netP unet_128 --npf 8 --dataset_mode unaligned_triplet

Test

python test.py --dataroot /path/to/dataset \
--which_epoch latest --name exp_name --model cycle_gan \
--which_model_netG resnet_6blocks --which_model_netP unet_128 \
--dataset_mode unaligned --no_dropout --loadSize 256 --resize_or_crop crop

Citation

If you find our work useful in your research or publication, please cite our work:

@inproceedings{I2V-GAN2021,
  title     = {I2V-GAN: Unpaired Infrared-to-Visible Video Translation},
  author    = {Shuang Li and Bingfeng Han and Zhenjie Yu and Chi Harold Liu and Kai Chen and Shuigen Wang},
  booktitle = {ACMMM},
  year      = {2021}
}

Acknowledgements

This code borrows heavily from the PyTorch implementations of Cycle-GAN and Pix2Pix, and from Recycle-GAN.
A huge thanks to them!

@inproceedings{CycleGAN2017,
  title     = {Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks},
  author    = {Zhu, Jun-Yan and Park, Taesung and Isola, Phillip and Efros, Alexei A},
  booktitle = {ICCV},
  year      = {2017}
}

@inproceedings{Recycle-GAN2018,
  title     = {Recycle-GAN: Unsupervised Video Retargeting},
  author    = {Aayush Bansal and Shugao Ma and Deva Ramanan and Yaser Sheikh},
  booktitle = {ECCV},
  year      = {2018}
}

i2v-gan's Issues

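Academic help request: Hello, I am an image algorithm engineer and am very interested in your research. I ran into some problems while running the code; could you help me with them?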

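Hello, I would like to ask a question. I ran into a problem with your company's I2V-GAN infrared-to-true-color conversion algorithm; could you help me? How much GPU memory is needed to run this code?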

Hello! I am working on visible-to-infrared translation. Can your model perform this task after being retrained?

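Testing problem

Hello, I used the downloaded pretrained model and ran the test directly on the data in the two folders IRVI/single/traffic/testA and IRVI/single/traffic/testB. In the generated results, *_real_A.png is clearly identical to the images in testA, and the same is true for testB.

The downloaded pretrained model is placed under I2V-GAN/checkpoints/experiment_name, and testA and testB are placed under I2V-GAN/img. The test command is:
python .\test.py --dataroot .\img\ --name experiment_name

My questions are:

Is it correct that, in the generated results, *_real_A.png is the predicted result and *_real_B.png is the reference image?
Why is the generated *_real_A.png the same as the images in testA, both infrared, with no change at all?

I am currently studying algorithms in this area and hope to get a detailed answer from you. Thank you very much!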

CUDA error: out of memory

Hello! Thanks for sharing your work. I followed the instructions for running python train.py, but got the following error.

Could you give me some suggestions? The dataroot I used is IRVI/triplet/traffic, and my GPU device is RTX3090.

How can I fix this path-not-found error that appears when I try to reproduce the code?

assert os.path.isdir(dir), '%s is not a valid directory' % dir
AssertionError: ../D:/I2V-GAN-main/IRVI/single/traffic/trainA is not a valid directory
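The path in this assertion, `../D:/I2V-GAN-main/...`, looks like a relative prefix joined onto an absolute Windows path, which cannot exist on disk. One way to catch this before training starts is to normalize and validate the data root up front. The sketch below is an assumption-laden illustration, not code from the repository: `resolve_dataroot` is a hypothetical helper, and the default sub-folder names are taken from the `trainA` folder named in the error above.

```python
import os
import tempfile


def resolve_dataroot(dataroot, subfolders=("trainA", "trainB")):
    """Return an absolute data root, or raise if an expected sub-folder is missing."""
    root = os.path.abspath(os.path.expanduser(dataroot))
    missing = [
        name for name in subfolders
        if not os.path.isdir(os.path.join(root, name))
    ]
    if missing:
        raise FileNotFoundError(
            f"{root} is missing sub-folders: {', '.join(missing)}"
        )
    return root


if __name__ == "__main__":
    # Demo on a throwaway directory that mimics the expected layout.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("trainA", "trainB"):
            os.makedirs(os.path.join(tmp, name))
        print(resolve_dataroot(tmp))
```

Passing the resolved absolute path to `--dataroot`, rather than a path that is later prefixed with `../`, is one plausible way to avoid the malformed path; whether the prefix comes from the command line or from the code is not stated in the report.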

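Question about the test results

Sorry to bother you again. Yesterday I ran the test code. What do _fake_A, _fake_B, _rec_A, and _rec_B in the results each mean? And why is there no result showing the infrared image converted to visible light?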
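For the output names asked about here, the upstream Cycle-GAN codebase that this repository borrows from uses a fixed convention, sketched below with toy string "generators" in place of networks. Two assumptions apply: that this repository keeps the upstream naming, and that domain A is infrared and domain B is visible, as the testA/testB descriptions in the other reports suggest. Under those assumptions, `fake_B` would be the infrared-to-visible result, while `real_A` is simply the input.

```python
def cycle_outputs(real_A, real_B, G_A, G_B):
    """Name the six images the way the upstream Cycle-GAN test code does.

    G_A maps domain A to domain B; G_B maps domain B to domain A.
    """
    fake_B = G_A(real_A)  # A translated into B
    rec_A = G_B(fake_B)   # fake_B mapped back to A (cycle reconstruction)
    fake_A = G_B(real_B)  # B translated into A
    rec_B = G_A(fake_A)   # fake_A mapped back to B (cycle reconstruction)
    return {
        "real_A": real_A, "fake_B": fake_B, "rec_A": rec_A,
        "real_B": real_B, "fake_A": fake_A, "rec_B": rec_B,
    }


if __name__ == "__main__":
    # Toy stand-ins for the two generators, so the data flow is visible.
    def to_visible(x):
        return f"to_visible({x})"

    def to_infrared(x):
        return f"to_infrared({x})"

    outputs = cycle_outputs("ir_frame", "vis_frame", to_visible, to_infrared)
    for name, value in outputs.items():
        print(f"{name}: {value}")
```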

Pretrained model

Unable to Download IRVI Dataset

I'm trying to download the data from the provided link. It asked me to install Baidu Netdisk, which I did, but it also requires a Baidu account, which I don't have. It seems that a Baidu account can no longer be opened from outside China. Are there other ways to download the data?

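Problem with test results

Hello, I used the pretrained model to test directly on my own data, with the testA folder holding the test set and the testB folder holding the corresponding ground truth. I found that:

In the generated results, real_A.png and real_B.png are identical to the data in the testA and testB folders, respectively.
The test set is an infrared test set, and the generated real_A.png is still infrared; only its size differs.

My question is:

Is real_A.png the generated result? If so, the result barely differs from the input, and the provided test set gives the same effect. What is the cause?

When will your group release the code for the CVPR 2023 paper? The link in the paper is dead.

As in the title.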

FID evaluation in IRVI benchmark

Hi, thanks for your great work! This is a good contribution. Could you please advise me on how you calculated the FID score for your benchmark?
Thank you!
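For reference, the standard definition of the Fréchet Inception Distance between a set of real frames r and a set of generated frames g is given below, where the means and covariances are those of the image features of each set. This is the general formula, not a statement of how the authors computed their benchmark numbers; the feature extractor (usually Inception-v3 pooling features) and the exact frames compared are choices the source does not specify.

```latex
\mathrm{FID}(r, g) = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```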

Help request

Why does the code hang at this step with no response while running?
I am running it on Windows in a CPU-only environment.

visible to infrared conversion query.

@BingfengHan / @paperheart
I have a query regarding I2V-GAN method.
If we replace RGB/visible images in TrainA folder of dataset and replace Infrared images in TrainB folder, will we be able to train the model for Visible to infrared videos generation?