
maybeshewill-cv / attentive-gan-derainnet


Unofficial TensorFlow implementation of the "Attentive Generative Adversarial Network for Raindrop Removal from A Single Image" (CVPR 2018) model https://maybeshewill-cv.github.io/attentive-gan-derainnet/

License: MIT License

Python 100.00%
gan cvpr2018 derain dcgan-tensorflow attention-mechanism deep-learning attention-map tensorflow raindrop network-architecture


attentive-gan-derainnet's People

Contributors

dependabot[bot], ichn-hu, maybeshewill-cv


attentive-gan-derainnet's Issues

th

Thanks for your good work. When I run the original author's PyTorch code, it runs out of memory on a 1080 GPU, but you can run yours on a 1070 GPU. What is the problem with the PyTorch version? Also, to what size are the training images resized, and will different training images influence the result?
Thanks

Got an error when running test_model.py

Please give me some advice on fixing this error, which I got when running test_model.py.

command:

python3 tools/test_model.py --weights_path model/new_model/derain_gan_2018-11-02-19-55-27.ckpt-200000 --image_path data/test_data/test_1.png

and the error is:

/home/tuencns/.local/lib/python3.5/site-packages/h5py/init.py:36: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
from ._conv import register_converters as _register_converters
VGG16 Network init complete
WARNING:tensorflow:From /home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating:
Colocations handled automatically by placer.
WARNING:tensorflow:From /home/tuencns/ws/raindrops2/attentive_gan_model/cnn_basenet.py:402: conv2d_transpose (from tensorflow.python.layers.convolutional) is deprecated and will be removed in a future version.
Instructions for updating:
Use keras.layers.conv2d_transpose instead.
2019-03-07 10:02:37.177862: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-03-07 10:02:37.198260: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3192000000 Hz
2019-03-07 10:02:37.198666: I tensorflow/compiler/xla/service/service.cc:150] XLA service 0x7022340 executing computations on platform Host. Devices:
2019-03-07 10:02:37.198690: I tensorflow/compiler/xla/service/service.cc:158] StreamExecutor device (0): ,
WARNING:tensorflow:From /home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
2019-03-07 10:02:37.420497: W tensorflow/core/framework/op_kernel.cc:1401] OP_REQUIRES failed at save_restore_v2_ops.cc:184 : Not found: Key derain_net_loss/attentive_autoencoder_loss/autoencoder_inference/gn_1/beta not found in checkpoint
Traceback (most recent call last):
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1334, in _do_call
return fn(*args)
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1319, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1407, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.NotFoundError: Key derain_net_loss/attentive_autoencoder_loss/autoencoder_inference/gn_1/beta not found in checkpoint
[[{{node save/RestoreV2}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 1276, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 929, in run
run_metadata_ptr)
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1152, in _run
feed_dict_tensor, options, run_metadata)
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1328, in _do_run
run_metadata)
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/client/session.py", line 1348, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.NotFoundError: Key derain_net_loss/attentive_autoencoder_loss/autoencoder_inference/gn_1/beta not found in checkpoint
[[node save/RestoreV2 (defined at tools/test_model.py:88) ]]

Caused by op 'save/RestoreV2', defined at:
File "tools/test_model.py", line 148, in
test_model(args.image_path, args.weights_path)
File "tools/test_model.py", line 88, in test_model
saver = tf.train.Saver()
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 832, in init
self.build()
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 844, in build
self._build(self._filename, build_save=True, build_restore=True)
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 881, in _build
build_save=build_save, build_restore=build_restore)
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 513, in _build_internal
restore_sequentially, reshape)
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 332, in _AddRestoreOps
restore_sequentially)
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 580, in bulk_restore
return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/ops/gen_io_ops.py", line 1572, in restore_v2
name=name)
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
op_def=op_def)
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
op_def=op_def)
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1801, in init
self._traceback = tf_stack.extract_stack()

NotFoundError (see above for traceback): Key derain_net_loss/attentive_autoencoder_loss/autoencoder_inference/gn_1/beta not found in checkpoint
[[node save/RestoreV2 (defined at tools/test_model.py:88) ]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 1286, in restore
names_to_keys = object_graph_key_mapping(save_path)
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 1591, in object_graph_key_mapping
checkpointable.OBJECT_GRAPH_PROTO_KEY)
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 370, in get_tensor
status)
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/framework/errors_impl.py", line 528, in exit
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: Key _CHECKPOINTABLE_OBJECT_GRAPH not found in checkpoint

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "tools/test_model.py", line 148, in
test_model(args.image_path, args.weights_path)
File "tools/test_model.py", line 92, in test_model
saver.restore(sess=sess, save_path=weights_path)
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 1292, in restore
err, "a Variable name or other graph key that is missing")
tensorflow.python.framework.errors_impl.NotFoundError: Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Key derain_net_loss/attentive_autoencoder_loss/autoencoder_inference/gn_1/beta not found in checkpoint
[[node save/RestoreV2 (defined at tools/test_model.py:88) ]]

Caused by op 'save/RestoreV2', defined at:
File "tools/test_model.py", line 148, in
test_model(args.image_path, args.weights_path)
File "tools/test_model.py", line 88, in test_model
saver = tf.train.Saver()
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 832, in init
self.build()
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 844, in build
self._build(self._filename, build_save=True, build_restore=True)
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 881, in _build
build_save=build_save, build_restore=build_restore)
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 513, in _build_internal
restore_sequentially, reshape)
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 332, in _AddRestoreOps
restore_sequentially)
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/training/saver.py", line 580, in bulk_restore
return io_ops.restore_v2(filename_tensor, names, slices, dtypes)
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/ops/gen_io_ops.py", line 1572, in restore_v2
name=name)
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/framework/op_def_library.py", line 788, in _apply_op_helper
op_def=op_def)
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 3300, in create_op
op_def=op_def)
File "/home/tuencns/.local/lib/python3.5/site-packages/tensorflow/python/framework/ops.py", line 1801, in init
self._traceback = tf_stack.extract_stack()

NotFoundError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Key derain_net_loss/attentive_autoencoder_loss/autoencoder_inference/gn_1/beta not found in checkpoint
[[node save/RestoreV2 (defined at tools/test_model.py:88) ]]

Thanks,
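
The error above says that the graph built by test_model.py expects a variable (`.../gn_1/beta`) that the checkpoint file does not contain, which usually means the checkpoint was saved from a different version of the model code. One way to confirm this is to compare the two sets of names. The sketch below only does the comparison; it assumes the names have already been collected, for example with `tf.train.list_variables(weights_path)` for the checkpoint and `tf.global_variables()` for the graph in TensorFlow 1.x. The helper name `diff_variable_names` and the `conv_1/W` and `bn_1/beta` keys are made up for illustration.

```python
def diff_variable_names(graph_names, checkpoint_names):
    """Return (missing_in_checkpoint, unused_in_checkpoint), both sorted.

    Variable names such as 'scope/var:0' are normalised by dropping the
    ':<index>' suffix so they compare equal to checkpoint keys.
    """
    def normalise(name):
        return name.split(":")[0]

    graph = {normalise(n) for n in graph_names}
    ckpt = {normalise(n) for n in checkpoint_names}
    return sorted(graph - ckpt), sorted(ckpt - graph)


# Illustrative names only: just the gn_1/beta key comes from the error message.
graph_vars = [
    "derain_net_loss/attentive_autoencoder_loss/autoencoder_inference/gn_1/beta:0",
    "derain_net_loss/attentive_autoencoder_loss/autoencoder_inference/conv_1/W:0",
]
ckpt_vars = [
    "derain_net_loss/attentive_autoencoder_loss/autoencoder_inference/conv_1/W",
    "derain_net_loss/attentive_autoencoder_loss/autoencoder_inference/bn_1/beta",
]
missing, unused = diff_variable_names(graph_vars, ckpt_vars)
print("missing from checkpoint:", missing)
print("in checkpoint but not in graph:", unused)
```

If the second list is non-empty, the checkpoint and the code probably come from different revisions, and using the matching checkpoint or code revision is likely simpler than renaming variables.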

train

Hello, when I train the model, the code only runs as far as line 266 of train_model.py ("Training from scratch") and then does not proceed any further. Why? Thanks.

Running test model got error

Hi! Thank you for providing the TensorFlow code! I got an error when running test_model.py.
When I run

cd REPO_ROOT_DIR
python tools/test_model.py --weights_path model/new_model/derain_gan_2018-11-02-19-55-27.ckpt-200000 --image_path data/test_data/test_1.png

I got the error below:
/home/tuencns/anaconda3/lib/python3.6/site-packages/h5py/init.py:36: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
from ._conv import register_converters as _register_converters
Traceback (most recent call last):
File "tools/test_model.py", line 22, in
from attentive_gan_model import derain_drop_net
ModuleNotFoundError: No module named 'attentive_gan_model'

Please help me fix this.

Thanks
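
`ModuleNotFoundError: No module named 'attentive_gan_model'` typically appears because Python puts the script's own folder (`tools/`) on the import path, not the repository root where the `attentive_gan_model` package lives. Setting `PYTHONPATH` to the repository root before running is one option. Another is to add the root to `sys.path` at the top of the script, as in this minimal sketch. The helper names and the `repo/` layout are assumptions for illustration, not code from the repository.

```python
import os
import sys


def repo_root_for(script_path):
    """Return the directory one level above the folder holding script_path."""
    return os.path.dirname(os.path.dirname(os.path.abspath(script_path)))


def ensure_on_sys_path(directory):
    """Prepend directory to sys.path once so its packages become importable."""
    if directory not in sys.path:
        sys.path.insert(0, directory)
    return directory


# Hypothetical layout: <repo>/tools/test_model.py and <repo>/attentive_gan_model/
root = ensure_on_sys_path(repo_root_for(os.path.join("repo", "tools", "test_model.py")))
print(root)
```

Inside the real script, `repo_root_for(__file__)` would play the role of the hard-coded demo path.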

train with gpu

Hello,
Which parameters should I modify so that training runs on the GPU?
I look forward to your reply.
Thank you very much.

Problem when running test_model

test setup failed
file F:\GAN_model\test_model.py, line 43
def test_model(image_path, weights_path):
E fixture 'image_path' not found

  available fixtures: cache, capfd, capfdbinary, caplog, capsys, capsysbinary, doctest_namespace, monkeypatch, pytestconfig, record_property, record_xml_attribute, recwarn, tmp_path, tmp_path_factory, tmpdir, tmpdir_factory
  use 'pytest --fixtures [testpath]' for help on them.

Please help!

About training

File "tools/train_model.py", line 217, in
train_model(args.dataset_dir, weights_path=args.weights_path)
File "tools/train_model.py", line 175, in train_model
gt_imgs, label_imgs, mask_imgs = train_dataset.next_batch(CFG.TRAIN.BATCH_SIZE)
File "/home/hp/文档/gitcode/attentive-gan-derainnet-master/tools/data_provider/data_provider.py", line 144, in next_batch
gt_image = cv2.resize(gt_image, (CFG.TRAIN.IMG_WIDTH, CFG.TRAIN.IMG_HEIGHT))
cv2.error: OpenCV(3.4.3) /io/opencv/modules/imgproc/src/resize.cpp:4044: error: (-215:Assertion failed) !ssize.empty() in function 'resize'

Why does this error occur during training? Thank you very much.
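
The assertion `!ssize.empty()` is raised by `cv2.resize` when its input image is empty, and the most common cause is that `cv2.imread` returned `None` because a path could not be read. A minimal sketch for checking every listed path before training starts is shown below. It assumes the index file holds whitespace-separated image paths on each line (so paths containing spaces are not supported), and `find_missing_images` is a hypothetical helper, not part of the repository.

```python
import os
import tempfile


def find_missing_images(index_file):
    """Return (line_number, path) for every listed path that is not a file."""
    missing = []
    with open(index_file, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            for path in line.split():
                if not os.path.isfile(path):
                    missing.append((line_number, path))
    return missing


# Self-contained demo with a temporary index file.
with tempfile.TemporaryDirectory() as tmp:
    existing = os.path.join(tmp, "rain_0.png")
    open(existing, "wb").close()
    absent = os.path.join(tmp, "clean_0.png")  # deliberately not created
    index = os.path.join(tmp, "train.txt")
    with open(index, "w", encoding="utf-8") as handle:
        handle.write("{} {}\n".format(existing, absent))
    problems = find_missing_images(index)
    print(problems)
```

An empty result does not prove that every file decodes correctly, but it rules out the simplest cause.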

dataset

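A question about the dataset: how was the training dataset obtained? Is there a way to create one? If there is a method for synthesizing a dataset, could you share it? Thanks.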

How do we test the code?

Hi,
I cloned the code and put all the .py files in the same directory. When I use the PyCharm Python console:
python tools/test_model.py --weights_path model/derain_gan_v2_2018-07-23-11-26-23.ckpt-200000
--image_path data/test_data/test_1.png

It returns python35 test_model.py --weights_path model/derain_gan_v2_2018-07-23-11-26-23.ckpt-200000 --image_path data/test_data/test_1.png
File "", line 1
python35 test_model.py --weights_path model/derain_gan_v2_2018-07-23-11-26-23.ckpt-200000 --image_path data/test_data/test_1.png
^
SyntaxError: invalid syntax.

When I directly run the test_model.py on pycharm, it returns D:\E\Anaconda3\envs\python35\python.exe D:/E/Deraincode/tools/test_model.py
Traceback (most recent call last):
File "D:/E/Deraincode/tools/test_model.py", line 136, in
test_model(args.image_path, args.weights_path)
File "D:/E/Deraincode/tools/test_model.py", line 64, in test_model
assert ops.exists(image_path)
File "D:\E\Anaconda3\envs\python35\lib\genericpath.py", line 19, in exists
os.stat(path)
TypeError: stat: can't specify None for path argument

Could you please provide more detailed instructions on how to test the model after cloning it from GitHub? Thank you so much.
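
Two separate things seem to be happening here. `python tools/test_model.py ...` is a shell command, so typing it into a Python console gives `SyntaxError`. Running the script directly without arguments leaves `image_path` as `None`, which is what the `os.stat` `TypeError` reports. In PyCharm the arguments normally go into the run configuration's parameters field. The sketch below shows how marking both flags as required would make a missing argument fail with a clear message. The flag names are taken from the command above; the parser itself is an illustration, not the repository's code.

```python
import argparse


def build_parser():
    """Parser with both flags marked required, so a missing one fails early."""
    parser = argparse.ArgumentParser(description="Sketch of a test_model.py CLI")
    parser.add_argument("--image_path", type=str, required=True)
    parser.add_argument("--weights_path", type=str, required=True)
    return parser


# Passing an explicit list mimics what the shell would hand to the script.
args = build_parser().parse_args([
    "--weights_path", "model/derain_gan_v2_2018-07-23-11-26-23.ckpt-200000",
    "--image_path", "data/test_data/test_1.png",
])
print(args.image_path, args.weights_path)
```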

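About where to place the dataset

Hello, first of all thank you for sharing such excellent code. I have some questions. I downloaded the dataset provided by the original authors, created a training_data_example folder under the data folder, and put the training images in it. Is that the right way to do it? Also, when I run the program I get the following error: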
Traceback (most recent call last):
File "tools/train_model.py", line 209, in
train_model(args.dataset_dir, weights_path=args.weights_path)
File "tools/train_model.py", line 58, in train_model
train_dataset = data_provider.DataSet(ops.join(dataset_dir, 'train.txt'))
File "/home/liuguangyu/attentive-gan-derainnet-master/tools/data_provider.py", line 30, in init
self._gt_img_list, self._gt_label_list = self._init_dataset(dataset_info_file)
File "/home/liuguangyu/attentive-gan-derainnet-master/tools/data_provider.py", line 43, in _init_dataset
assert ops.exists(dataset_info_file), '{:s} 不存▒?.format(dataset_info_file)'
AssertionError: {:s} 不存▒?.format(dataset_info_file)
What is the problem here? I am a beginner and there is a lot I don't understand, so I apologize for taking up your time. Thank you!!

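Training results

Hello, I downloaded your code to test it, but because the model could not be loaded, I retrained it and then tested again. The test results are terrible and no longer resemble the original image at all. Do you know what might cause this? I am using Python 3.6 and a 1080 Ti. Thanks! (Below is the result for test1.png; the results for other images look the same.)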
src_img
derain_ret

How do we train the GAN

Dear friend,
I was following your instructions of training the network. However, there's a traceback.

Traceback (most recent call last):
File "D:/gitGAN/attentive-gan-derainnet-master/tools/train_model.py", line 212, in
train_model(args.dataset_dir, weights_path=args.weights_path)
File "D:/gitGAN/attentive-gan-derainnet-master/tools/train_model.py", line 154, in train_model
encoding='latin1').item()
File "D:\Anaconda3333\lib\site-packages\numpy\lib\npyio.py", line 384, in load
fid = open(file, "rb")
FileNotFoundError: [Errno 2] No such file or directory: './data/vgg16.npy'

Could you please clarify this "vgg16.npy"?

Best

Error when testing with the provided model

Hello, testing with "derain_gan_2018-11-02-19-55-27.ckpt-200000" produces the following error, but testing with the "derain_gan_2018-10-09-11-06-58.ckpt-200000" model does not.

NotFoundError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Key derain_net_loss/attentive_autoencoder_loss/autoencoder_inference/gn_1/beta not found in checkpoint
[[node save/RestoreV2 (defined at test_model.py:99) = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

Ubuntu 18.04 Tensorflow 1.12

My test results are poor and I don't know the reason

Hello, I used your weights to test my picture, but the results are not good. In fact, there is no difference between my picture and the result. I have no idea why, so please help me.

sorry

A small question: how is train.txt generated?
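
A minimal sketch of generating such an index file is given below. The expected format is not stated in this thread, so the sketch assumes one image pair per line, written as the rainy image path followed by the clean image path, with pairs matched by identical file names in two folders. The folder names `rain_image` and `clean_image` and the helper `build_index_lines` are hypothetical and should be adapted to whatever the data provider actually parses.

```python
import os
import tempfile


def build_index_lines(rain_dir, clean_dir):
    """Pair files that share a name in both folders, one pair per line."""
    clean_names = set(os.listdir(clean_dir))
    lines = []
    for name in sorted(os.listdir(rain_dir)):
        if name in clean_names:
            lines.append("{} {}".format(os.path.join(rain_dir, name),
                                        os.path.join(clean_dir, name)))
    return lines


with tempfile.TemporaryDirectory() as tmp:
    rain_dir = os.path.join(tmp, "rain_image")
    clean_dir = os.path.join(tmp, "clean_image")
    os.makedirs(rain_dir)
    os.makedirs(clean_dir)
    # 1.png has no clean counterpart, so it is left out of the index.
    for folder, names in ((rain_dir, ["0.png", "1.png"]), (clean_dir, ["0.png"])):
        for name in names:
            open(os.path.join(folder, name), "wb").close()
    index_lines = build_index_lines(rain_dir, clean_dir)
    with open(os.path.join(tmp, "train.txt"), "w", encoding="utf-8") as handle:
        handle.write("\n".join(index_lines) + "\n")
    print(index_lines)
```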

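Test results

Hello, using the test set released by the original authors (test_a) and the weight file you released, I only get a PSNR of 26.38. Have you run this test yourself?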
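
For reference, PSNR is defined as 10·log10(MAX² / MSE), where MAX is the largest possible pixel value and MSE is the mean squared error between the two images. Reported numbers depend on choices such as the value range and the colour space the metric is computed in, so those need to match before two results can be compared. A minimal pure-Python sketch, assuming 8-bit values and flat pixel lists (the sample values are invented):

```python
import math


def psnr(reference, restored, max_value=255.0):
    """PSNR in dB between two equally sized sequences of pixel values."""
    if len(reference) != len(restored):
        raise ValueError("inputs must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10.0 * math.log10(max_value ** 2 / mse)


clean = [52, 55, 61, 59]      # made-up ground-truth pixels
derained = [54, 55, 60, 57]   # made-up network output
print(round(psnr(clean, derained), 2))
```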

Is data augmentation used?

In tf_io_pipline_tools.py I see some functions that process images, but I don't see them being called anywhere else. Does this repo use any data augmentation methods?

Test model path and skimage

Maybe "scikit-image" should be appended to the requirements and the test model "--weights_path model/new_model/derain_gan_2018-10-09-11-06-58.ckpt-200000" should be updated according to your newest weights.

SSIM barely changes, staying at 0.02

Hello, when training with the new code today, the SSIM value stays around 0.02. Have you run into a similar problem?

loss will be nan after several epochs

I0907 12:21:20.068805 22418 train_model.py:144] Training from scratch
I0907 12:21:31.173571 22418 train_model.py:199] Epoch: 0 D_loss: 0.58040 G_loss: 112.36253 Ssim: -0.00274 Cost_time: 6.89916s
I0907 12:21:32.760401 22418 train_model.py:199] Epoch: 1 D_loss: 1.32604 G_loss: 244.45976 Ssim: 0.01849 Cost_time: 0.55185s
I0907 12:21:33.305763 22418 train_model.py:199] Epoch: 2 D_loss: 2.09662 G_loss: 3334.70142 Ssim: 0.03258 Cost_time: 0.54503s
I0907 12:21:33.864548 22418 train_model.py:199] Epoch: 3 D_loss: 1.15596 G_loss: 22600.34570 Ssim: -0.00045 Cost_time: 0.55843s
I0907 12:21:34.411998 22418 train_model.py:199] Epoch: 4 D_loss: 0.46043 G_loss: 143877.51562 Ssim: -0.00022 Cost_time: 0.54708s
I0907 12:21:34.967912 22418 train_model.py:199] Epoch: 5 D_loss: 8.88247 G_loss: 2506968.25000 Ssim: 0.00032 Cost_time: 0.55560s
I0907 12:21:35.513916 22418 train_model.py:199] Epoch: 6 D_loss: 0.86375 G_loss: 11050401398784.00000 Ssim: 0.00000 Cost_time: 0.54586s
I0907 12:21:36.065352 22418 train_model.py:199] Epoch: 7 D_loss: 10.04829 G_loss: 7430020006188193046925154451456.00000 Ssim: 0.00000 Cost_time: 0.55113s
I0907 12:21:36.615558 22418 train_model.py:199] Epoch: 8 D_loss: 4.36992 G_loss: inf Ssim: 0.01403 Cost_time: 0.55008s
I0907 12:21:37.156896 22418 train_model.py:199] Epoch: 9 D_loss: nan G_loss: nan Ssim: 0.00493 Cost_time: 0.54096s
I0907 12:21:37.691573 22418 train_model.py:199] Epoch: 10 D_loss: nan G_loss: nan Ssim: nan Cost_time: 0.53423s
I0907 12:21:38.227119 22418 train_model.py:199] Epoch: 11 D_loss: nan G_loss: nan Ssim: nan Cost_time: 0.53518s
I0907 12:21:38.760516 22418 train_model.py:199] Epoch: 12 D_loss: nan G_loss: nan Ssim: nan Cost_time: 0.53310s
I0907 12:21:39.295391 22418 train_model.py:199] Epoch: 13 D_loss: nan G_loss: nan Ssim: nan Cost_time: 0.53458s
I0907 12:21:39.832836 22418 train_model.py:199] Epoch: 14 D_loss: nan G_loss: nan Ssim: nan Cost_time: 0.53707s

Hi, thank you for sharing the code. I have read your code, and when I run your model, G_loss and D_loss become nan after several epochs.
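
In the log above G_loss grows by roughly an order of magnitude per step before reaching inf, which looks like divergence rather than one bad batch. Commonly tried remedies are a lower learning rate or gradient clipping, though neither is confirmed as the fix here. The sketch below only detects the point of divergence from log lines so a run can be stopped early. The regular expression follows the log format shown above, and the `limit` threshold is an arbitrary assumption.

```python
import math
import re

LOSS_PATTERN = re.compile(r"Epoch: (\d+) D_loss: (\S+) G_loss: (\S+)")


def first_divergent_epoch(log_lines, limit=1e6):
    """Return the first epoch whose D_loss or G_loss is non-finite or > limit."""
    for line in log_lines:
        match = LOSS_PATTERN.search(line)
        if match is None:
            continue
        epoch = int(match.group(1))
        losses = [float(match.group(2)), float(match.group(3))]
        if any(not math.isfinite(v) or abs(v) > limit for v in losses):
            return epoch
    return None


# Lines shortened from the log above.
log = [
    "Epoch: 4 D_loss: 0.46043 G_loss: 143877.51562 Ssim: -0.00022",
    "Epoch: 5 D_loss: 8.88247 G_loss: 2506968.25000 Ssim: 0.00032",
    "Epoch: 8 D_loss: 4.36992 G_loss: inf Ssim: 0.01403",
    "Epoch: 9 D_loss: nan G_loss: nan Ssim: 0.00493",
]
print(first_divergent_epoch(log))
```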

How to Visualize attention map?

In your project there are some attention maps visualized as probability maps.
How can I visualize the attention map?
Is there any code for this in your project?

gpu util problem

[Screenshot]

As shown above, gpu memory is allocated, but volatile GPU-util is 0%. I do not know why gpu is not used for learning.

How to achieve a better result of deraining?

Dear Friend,
I ran the test_model.py with the test_1.jpg, test_2.png and test_3.png (directory: \attentive-gan-derainnet-master\data\test_data). However, it seemed the model could not get rid of the raindrops as well as the photo pairs(with raindrops, and after deraining) illustrated in the readme. May I ask what is your future plan? Or could you give me some instructions for continuing to train the model?
I am very curious about your answer. Thank you so much!

Running test_model.py gives an error

Hi! Thank you for providing the TensorFlow code! After cloning this repository, I got an error when running test_model.py.

tensorflow.python.framework.errors_impl.NotFoundError: Key derain_net_loss/attentive_autoencoder_loss/autoencoder_inference/gn_1/beta not found in checkpoint
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

NotFoundError (see above for traceback): Restoring from checkpoint failed. This is most likely due to a Variable name or other graph key that is missing from the checkpoint. Please ensure that you have not altered the graph expected based on the checkpoint. Original error:

Key derain_net_loss/attentive_autoencoder_loss/autoencoder_inference/gn_1/beta not found in checkpoint
[[Node: save/RestoreV2 = RestoreV2[dtypes=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, ..., DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2/tensor_names, save/RestoreV2/shape_and_slices)]]

Can you help to check this error?

Continue to train

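Hello, I would like to continue training your model because I find it very powerful. I have some questions that I hope you can answer.
If I train with a dataset different from the original authors' (a new dataset, all new image pairs), how will that affect the network? I want to train on a broader range of images and then build on the original trained model to make it more widely applicable. How should I arrange the dataset? Should I mix my new image pairs with the original dataset and continue training? Or is there a way to train on different types separately and then merge the weights afterwards?
Thank you :)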

What would be the recommended configuration of training on Tesla P4 GPU

Dear Friend,
Thank you for your code. I am going to train the network from scratch on Tesla P4. Could you please give any suggestion about the global configuration? I keep getting a warning of exceeding system's memory. Does this increase the time for training?

My Global configuration is as follows:
I1120 16:16:41.641378 23545 train_model.py:135] Global configuration is as follows:
I1120 16:16:41.641667 23545 train_model.py:136] {'TRAIN': {'IMG_HEIGHT': 480, 'BATCH_SIZE': 1, 'LEARNING_RATE': 0.002, 'IMG_WIDTH': 360, 'GPU_MEMORY_FRACTION': 1, 'EPOCHS': 200010, 'TF_ALLOW_GROWTH': True}, 'TEST': {'IMG_WIDTH': 360, 'IMG_HEIGHT': 480, 'BATCH_SIZE': 1, 'GPU_MEMORY_FRACTION': 1, 'TF_ALLOW_GROWTH': True}}
I1120 16:16:45.810997 23545 train_model.py:144] Training from scratch

  2018-11-20 16:17:30.196406: W tensorflow/core/framework/allocator.cc:108] **Allocation of 2211840000 exceeds 10% of system memory.**

 Thank you so much:)

My Best,
Zhao
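
To put the number in the warning into perspective, the arithmetic below converts the reported allocation to GiB and, assuming the tensor holds 4-byte float32 values, to values per pixel of a 480x360 image. The float32 assumption is not confirmed by the log, and the sketch does not identify which tensor is being allocated.

```python
def bytes_to_gib(num_bytes):
    """Convert a byte count to GiB (2**30 bytes)."""
    return num_bytes / float(2 ** 30)


def float32_elements(num_bytes):
    """Number of 4-byte float32 values that fit in num_bytes."""
    return num_bytes // 4


allocation = 2211840000  # value from the allocator warning above
per_pixel = float32_elements(allocation) / (480 * 360)  # IMG_HEIGHT * IMG_WIDTH
print(round(bytes_to_gib(allocation), 2), per_pixel)
```

Under that assumption a single allocation of about 2.06 GiB corresponds to 3200 values per pixel, so reducing the image size in the configuration would shrink it proportionally.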

attention module

Can this attention module be used for image retrieval? I would like to apply this attentive model to image retrieval. Thanks.

loss function

Hi, is the negative sign added before the lgan variable at line 69 of derain_drop_net.py meant to make lgan positive? Is that right? Thanks.
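
In the standard GAN objective the discriminator maximises log D(real) + log(1 − D(fake)). Because D outputs values in (0, 1), both logarithms are non-positive, so negating the sum gives a non-negative quantity that an optimizer can minimise. Whether line 69 does exactly this is not confirmed here; the sketch below only illustrates the sign convention, and `eps` is a small constant added as an assumption to avoid log(0).

```python
import math


def discriminator_gan_loss(d_real, d_fake, eps=1e-12):
    """Negated log-likelihood: -(log D(real) + log(1 - D(fake)))."""
    return -(math.log(d_real + eps) + math.log(1.0 - d_fake + eps))


print(discriminator_gan_loss(0.9, 0.2))   # confident, correct discriminator
print(discriminator_gan_loss(0.5, 0.5))   # undecided discriminator
```

The better the discriminator separates real from generated images, the closer this value gets to zero.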

BN problem

Thanks for the implementation!
Question: since batch_size is set to 1, why is batch normalization still used? Does it function the same as instance normalization?
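
During training, batch normalization computes per-channel statistics over all samples and pixels, while instance normalization computes them over the pixels of each sample separately. With a batch of one these statistics coincide. The two usually differ at inference time, where batch normalization typically switches to moving averages collected during training. A small pure-Python sketch of the training-time equivalence, using a made-up 2x2 image with two channels:

```python
def mean_and_variance(values):
    """Population mean and variance of a flat list of numbers."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, variance


def batch_norm_stats(batch, channel):
    """Statistics over every sample and pixel of one channel (batch norm)."""
    values = [pixel[channel] for sample in batch for row in sample for pixel in row]
    return mean_and_variance(values)


def instance_norm_stats(sample, channel):
    """Statistics over the pixels of one channel in one sample (instance norm)."""
    values = [pixel[channel] for row in sample for pixel in row]
    return mean_and_variance(values)


# One 2x2 image with 2 channels, laid out as [height][width][channel].
image = [[[1.0, 10.0], [2.0, 20.0]],
         [[3.0, 30.0], [4.0, 40.0]]]
batch_of_one = [image]
print(batch_norm_stats(batch_of_one, 0), instance_norm_stats(image, 0))
```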

Got an error when running train_model.py

Hi,
First, I want to say thank you for sharing this model. Here is the issue I ran into.
My device has an i7-7700HQ CPU and a 1050 Ti GPU, and I used the data sample you prepared to run train_model.

And the error is like:

(tensorflow) D:\attentive-gan-derainnet-master>python tools/train_model.py --dataset_dir D:\attentive-gan-derainnet-master\data_provider
D:\software\Anaconda\envs\tensorflow\lib\site-packages\h5py_init_.py:34: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
from ._conv import register_converters as _register_converters
VGG16 Network init complete
2019-03-18 14:34:37.549633: I T:\src\github\tensorflow\tensorflow\core\platform\cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
I0318 14:34:38.240563 309452 train_model.py:255] Global configuration is as follows:
I0318 14:34:38.402141 309452 train_model.py:256] {'TEST': {'IMG_HEIGHT': 240, 'GPU_MEMORY_FRACTION': 0.8, 'IMG_WIDTH': 360, 'BATCH_SIZE': 1, 'TF_ALLOW_GROWTH': False}, 'TRAIN': {'CROP_IMG_HEIGHT': 240, 'GPU_MEMORY_FRACTION': 0.95, 'IMG_WIDTH': 376, 'IMG_HEIGHT': 256, 'LEARNING_RATE': 0.0002, 'GPU_NUM': 1, 'BATCH_SIZE': 1, 'CPU_MULTI_PROCESS_NUMS': 6, 'CROP_IMG_WIDTH': 360, 'TF_ALLOW_GROWTH': True, 'EPOCHS': 100010}}
I0318 14:34:40.289161 309452 train_model.py:264] Training from scratch
Traceback (most recent call last):
File "D:\software\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 1278, in _do_call
return fn(*args)
File "D:\software\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 1263, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "D:\software\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 1350, in _call_tf_sessionrun
run_metadata)
**tensorflow.python.framework.errors_impl.OutOfRangeError: End of sequence
[[Node: val_IteratorGetNext = IteratorGetNextoutput_shapes=[[1,240,360,3], [1,240,360,3], [1,240,360,1]], output_types=[DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

During handling of the above exception, another exception occurred:**

Traceback (most recent call last):
File "tools/train_model.py", line 341, in
train_model(args.dataset_dir, weights_path=args.weights_path)
File "tools/train_model.py", line 294, in train_model
train_psnr, train_summary_op, val_summary_op]
File "D:\software\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 877, in run
run_metadata_ptr)
File "D:\software\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 1100, in _run
feed_dict_tensor, options, run_metadata)
File "D:\software\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 1272, in _do_run
run_metadata)
File "D:\software\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\client\session.py", line 1291, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: End of sequence
[[Node: val_IteratorGetNext = IteratorGetNextoutput_shapes=[[1,240,360,3], [1,240,360,3], [1,240,360,1]], output_types=[DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

Caused by op 'val_IteratorGetNext', defined at:
File "tools/train_model.py", line 341, in
train_model(args.dataset_dir, weights_path=args.weights_path)
File "tools/train_model.py", line 117, in train_model
val_input_tensor, val_label_tensor, val_mask_tensor = val_dataset.inputs(CFG.TRAIN.BATCH_SIZE, 1)
File "D:\attentive-gan-derainnet-master\data_provider\data_feed_pipline.py", line 296, in inputs
return iterator.get_next(name='{:s}_IteratorGetNext'.format(self._dataset_flags))
File "D:\software\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\data\ops\iterator_ops.py", line 410, in get_next
name=name)), self._output_types,
File "D:\software\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\ops\gen_dataset_ops.py", line 2107, in iterator_get_next
output_shapes=output_shapes, name=name)
File "D:\software\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "D:\software\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\util\deprecation.py", line 454, in new_func
return func(*args, **kwargs)
File "D:\software\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\framework\ops.py", line 3155, in create_op
op_def=op_def)
File "D:\software\Anaconda\envs\tensorflow\lib\site-packages\tensorflow\python\framework\ops.py", line 1717, in init
self._traceback = tf_stack.extract_stack()

OutOfRangeError (see above for traceback): End of sequence
[[Node: val_IteratorGetNext = IteratorGetNextoutput_shapes=[[1,240,360,3], [1,240,360,3], [1,240,360,1]], output_types=[DT_FLOAT, DT_FLOAT, DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"]]

I don't know how to figure this out. Please help if you see this. Thank you very much!
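
`OutOfRangeError: End of sequence` on `val_IteratorGetNext` means the validation iterator ran out of elements while training was still asking for more. With `tf.data` the usual remedies are to call `Dataset.repeat()` on the validation dataset or to provide enough validation samples. The plain-Python analogy below, which does not use TensorFlow and is not the repository's pipeline, shows the difference between a single pass and a repeating source.

```python
import itertools


def batches(samples, batch_size, repeat=False):
    """Yield lists of batch_size samples; with repeat=True never run out."""
    source = itertools.cycle(samples) if repeat else iter(samples)
    while True:
        batch = list(itertools.islice(source, batch_size))
        if len(batch) < batch_size:
            return  # finite source exhausted: the analogue of "End of sequence"
        yield batch


val_samples = ["val_0", "val_1", "val_2"]  # made-up sample names
one_pass = list(batches(val_samples, batch_size=1))
endless = batches(val_samples, batch_size=1, repeat=True)
first_five = [next(endless) for _ in range(5)]
print(len(one_pass), first_five)
```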

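Network training input size

Hello! If the training input is not 256x376, which files besides config/global_config.py need to be modified? Or must the training input be 256x376, and otherwise be resized to that size? Thanks!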

Architecture

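Hello,
Is the architecture of your code exactly the same as that of the original paper authors' PyTorch code? If there are modifications, which parts differ? I have made some progress in my research using your code. May I use your code to write a paper?
Thank you.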

cudnn

UnknownError (see above for traceback): Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
Hi, I am using cuda-9.0 and cudnn-7.0 and cannot run this code. Why?

Binary mask M

Hello, I have a question: could you tell me how M is generated?

Outstanding work

Hi
The original author didn't publish the training code, so thank you very much for your work. Have you reproduced the results reported in the paper?

About the training

Hi,
thx a lot for your excellent code. It helps me a lot.
But, I have some questions about the training.

  1. The initial learning rate: you said that "In my experiment, the training epochs are 200010, batch size is 1, the initialized learning rate is 0.001." However, when I checked the config file, I found that it was 0.002. When I trained with 0.002, the D_loss was very large. I guess the initial learning rate is quite important, so I wonder what the exact value is?
  2. When you trained your model, did you crop the original images into small patches? Did you augment the images or not? I think directly resizing to 240x360 is not a good choice.
    Thx again for your excellent code.

Zewei
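
On the cropping question, a configuration printed in another issue on this page lists `IMG_HEIGHT` 256, `IMG_WIDTH` 376, `CROP_IMG_HEIGHT` 240 and `CROP_IMG_WIDTH` 360, which suggests a resize followed by a crop. The sketch below shows one way a random crop can be applied to an image pair so that both stay aligned. It uses nested lists rather than real images, and the helper names are hypothetical rather than taken from the repository.

```python
import random


def random_crop_box(img_height, img_width, crop_height, crop_width, rng=random):
    """Return (top, left, bottom, right) of a random crop inside the image."""
    if crop_height > img_height or crop_width > img_width:
        raise ValueError("crop must not be larger than the image")
    top = rng.randint(0, img_height - crop_height)
    left = rng.randint(0, img_width - crop_width)
    return top, left, top + crop_height, left + crop_width


def crop(image, box):
    """Crop a [height][width] nested list with a (top, left, bottom, right) box."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


rng = random.Random(0)
box = random_crop_box(256, 376, 240, 360, rng=rng)
rain = [[(y, x) for x in range(376)] for y in range(256)]
clean = [[(y, x) for x in range(376)] for y in range(256)]
# The same box is applied to both images so the pair stays aligned.
rain_patch, clean_patch = crop(rain, box), crop(clean, box)
print(box, len(rain_patch), len(rain_patch[0]))
```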
