trevor-m / tensorflow-srgan

Tensorflow implementation of "Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network" (Ledig et al. 2017)

License: MIT License

Python 100.00%

deep-learning tensorflow srgan vgg19 convolutional-neural-networks generative-adversarial-network gan super-resolution srresnet

tensorflow-srgan's Introduction

SRGAN in TensorFlow

This is an implementation of the paper Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network using TensorFlow.

Usage

Set up

Download the VGG19 weights provided by TensorFlow-Slim. Place the vgg_19.ckpt file in this directory.
Download a dataset of images. I recommend ImageNet or Places205. Specify the directory containing your dataset using the --train-dir argument when training the model.

Training

SRResNet-MSE

python train.py --name srresnet-mse --content-loss mse --train-dir path/to/dataset

SRResNet-VGG22

python train.py --name srresnet-vgg22 --content-loss vgg22 --train-dir path/to/dataset

SRGAN-MSE

python train.py --name srgan-mse --use-gan --content-loss mse --train-dir path/to/dataset --load results/srresnet-mse/weights-1000000

SRGAN-VGG22

python train.py --name srgan-vgg22 --use-gan --content-loss vgg22 --train-dir path/to/dataset --load results/srresnet-mse/weights-1000000

SRGAN-VGG54

python train.py --name srgan-vgg54 --use-gan --content-loss vgg54 --train-dir path/to/dataset --load results/srresnet-mse/weights-1000000

Results

Set5	Ledig SRResNet	This SRResNet	Ledig SRGAN	This SRGAN
PSNR	32.05	32.11	29.40	28.21
SSIM	0.9019	0.8933	0.8472	0.8200

Set14	Ledig SRResNet	This SRResNet	Ledig SRGAN	This SRGAN
PSNR	28.49	28.61	26.02	25.74
SSIM	0.8184	0.7809	0.7397	0.6909

BSD100	Ledig SRResNet	This SRResNet	Ledig SRGAN	This SRGAN
PSNR	27.58	27.57	25.16	24.80
SSIM	0.7620	0.7346	0.6688	0.6314
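For reference, the PSNR rows above follow the standard definition 10 * log10(MAX^2 / MSE). The sketch below is a minimal pure-Python illustration on flat lists of 8-bit pixel values. It is not the repository's evaluation code, and it does not model preprocessing a benchmark may apply, such as evaluating only the luminance channel or cropping borders. The function name `psnr` and the sample pixel values are assumptions made for illustration.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return float("inf")  # identical inputs have unbounded PSNR
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 4-pixel example: ground truth vs. a super-resolved estimate.
hr = [52, 55, 61, 59]
sr = [50, 56, 60, 62]
print(round(psnr(hr, sr), 2))
```

Because PSNR depends only on the mean squared error, a model trained with an MSE content loss tends to score higher on it than a GAN-trained model, which is consistent with the SRResNet and SRGAN columns above.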


tensorflow-srgan's Issues

Regarding tensorflow version

I am using this code to generate SR images with the SRGAN model.
Recently I tried it on my GPU system, but it does not use the GPU and runs only on the CPU. I think this is caused by a mismatch between the TensorFlow version and the installed CUDA and cuDNN packages. Can you please suggest a compatible TensorFlow version, along with matching versions of CUDA, cuDNN, and the other important Python packages?

How can we test other images

I want to test other images using the checkpoints saved during training. How can I do this?

What is the motivation for using a large kernel size at the input and output?

https://github.com/trevor-m/tensorflow-SRGAN/blob/master/srgan.py#L43
https://github.com/trevor-m/tensorflow-SRGAN/blob/master/srgan.py#L59

Number of trainable parameters

Is it possible to calculate the number of trainable parameters automatically (not manually)?
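One way to avoid counting by hand is to multiply out each variable's shape and sum the results. The sketch below does this for a plain list of shapes. In TensorFlow 1.x the shapes could presumably be collected with `[v.get_shape().as_list() for v in tf.trainable_variables()]` once the graph has been built. The helper name `count_parameters` and the example shapes are assumptions for illustration and are not taken from the repository.

```python
from functools import reduce
from operator import mul


def count_parameters(shapes):
    """Sum the number of elements over a list of variable shapes."""
    # reduce(mul, shape, 1) multiplies the dimensions; a scalar ([]) counts as 1.
    return sum(reduce(mul, shape, 1) for shape in shapes)


# Hypothetical shapes: one 9x9 convolution kernel mapping 3 -> 64 channels, plus its bias.
example_shapes = [
    [9, 9, 3, 64],
    [64],
]
print(count_parameters(example_shapes))
```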

OOM problem

Hi @trevor-m

I am using your repo as a basis for implementing an SRGAN. My setup differs in a few ways, especially in the dataset.

I have two .npy files that contain 8000 patches of size 1616 (both LR and SR). I made the corresponding modifications to implement mini-batch training using tf.train.shuffle_batch.

During training, memory usage grows by roughly 1 GB every 50 iterations. Sooner or later, depending on the batch size, CPU memory consumption reaches the maximum and training stops.

Perhaps my problem is naive; I am a newbie with TensorFlow. What would you recommend to prevent such high memory consumption?

How can I use this for an upscaling factor of 8?

I want to super-resolve images with an upscaling factor of 8. How can I do this?
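In the paper's generator, the 4x factor comes from two sub-pixel convolution blocks that each upscale by 2x. Assuming this implementation follows the same layout (not verified against srgan.py), an 8x model would need three such blocks, and the low-resolution training inputs would have to be downsampled by 8 instead of 4. The arithmetic can be sketched as follows; the helper name `num_upsample_blocks` is hypothetical.

```python
def num_upsample_blocks(scale, block_factor=2):
    """Number of block_factor-x upsampling blocks needed to reach `scale`."""
    if scale < 1:
        raise ValueError("scale must be at least 1")
    blocks = 0
    remaining = scale
    while remaining > 1:
        if remaining % block_factor != 0:
            # e.g. a scale of 6 cannot be built from 2x blocks alone
            raise ValueError("scale must be a power of block_factor")
        remaining //= block_factor
        blocks += 1
    return blocks


print(num_upsample_blocks(4))  # the paper's 4x setting
print(num_upsample_blocks(8))  # the 8x setting asked about here
```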

Artifacts in the inference output

Hi,

I've used your implementation of SRGAN. The steps are below.
I used 800 images from div_2k as the training dataset and 90 images from div_2k as test images.
I first ran SRResNet with VGG54 for 10^5 iterations, then used the resulting weights to initialize (--load) SRGAN with VGG54 and ran it for another 10^5 iterations.

The PSNR and SSIM values are given below for Set5, Set14, BSD100, and the 90 div_2k images (all averages):
[BSD100] PSNR: 25.18, SSIM: 0.6398
[Set14] PSNR: 26.25, SSIM: 0.6966
[Set5] PSNR: 29.33, SSIM: 0.8370
[div2k-90] PSNR: 26.48, SSIM: 0.6984

But some of the images (from all four sets above) had artifacts (please see the attached images).
The artifacts are present at some iterations and absent at others. Even iterations with low training loss show artifacts.

Can you help me with the following questions?

Should I also train SRGAN with VGG54 for another 10^5 iterations with a learning rate of 1e-5? Will that solve the artifact issue?
Why do the artifacts appear? Is there a fix, or does some saturation need to be applied?
Have you encountered these artifacts? Is there any way I can correct them?
In the files I've attached, iteration 195500 had the lowest training loss but had artifacts. Iteration 198500 had a worse training loss but no artifacts. Which iteration's weights should I choose?

Looking forward to your reply, as I'm a bit stuck on this.

Benchmark.evaluate has a bug

Benchmarks work for validate() when the comparison images are loaded from disk, but there is a bug when using the model output.

Should vgg_19.ckpt be placed directly in the directory? I get KeyError: vgg19_1/vgg_19/conv2/conv2_2

OutOfRangeError in tf.train.batch

I'm trying to train srresnet-mse on my own dataset, and training sometimes fails with an error. The first time it occurred between iterations 0 and 100, then between 100 and 200, then after 600. My dataset contains about one hundred thousand images, and I suspect the dataset is the cause. Can you help me understand what the problem is?

/opt/ds/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Logging results for this session in folder "results/srresnet-mse".
2018-09-03 12:22:04.150374: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-09-03 12:22:07.392768: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1212] Found device 0 with properties:
name: GeForce GTX 1080 Ti major: 6 minor: 1 memoryClockRate(GHz): 1.582
pciBusID: 0000:c1:00.0
totalMemory: 10.92GiB freeMemory: 10.76GiB
2018-09-03 12:22:07.392867: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1312] Adding visible gpu devices: 0
2018-09-03 12:22:07.811512: I tensorflow/core/common_runtime/gpu/gpu_device.cc:993] Creating TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 10415 MB memory) -> physical GPU (device: 0, name: GeForce GTX 1080 Ti, pci bus id: 0000:c1:00.0, compute capability: 6.1)
[0] Test: 0.4038988, Train: 0.5311046 [Set5] PSNR: 11.46, SSIM: 0.1051 [Set14] PSNR: 12.50, SSIM: 0.0841 [BSD100] PSNR: 13.13, SSIM: 0.1036
[100] Test: 0.2326521, Train: 0.3028869 [Set5] PSNR: 13.47, SSIM: 0.4380 [Set14] PSNR: 14.62, SSIM: 0.4203 [BSD100] PSNR: 15.39, SSIM: 0.4153
2018-09-03 12:23:46.013589: W tensorflow/core/kernels/queue_base.cc:277] _0_input_producer: Skipping cancelled enqueue attempt with queue not closed
2018-09-03 12:23:46.026015: W tensorflow/core/kernels/queue_base.cc:277] _2_input_producer_1: Skipping cancelled enqueue attempt with queue not closed
2018-09-03 12:23:46.026804: W tensorflow/core/kernels/queue_base.cc:277] _5_batch_2/fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2018-09-03 12:23:46.027426: W tensorflow/core/kernels/queue_base.cc:277] _3_batch_1/fifo_queue: Skipping cancelled enqueue attempt with queue not closed
2018-09-03 12:23:46.027743: W tensorflow/core/kernels/queue_base.cc:277] _4_input_producer_2: Skipping cancelled enqueue attempt with queue not closed
Traceback (most recent call last):
  File "/opt/ds/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1361, in _do_call
    return fn(*args)
  File "/opt/ds/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1340, in _run_fn
    target_list, status, run_metadata)
  File "/opt/ds/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/errors_impl.py", line 516, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.OutOfRangeError: FIFOQueue '_1_batch/fifo_queue' is closed and has insufficient elements (requested 14, current size 0)
         [[Node: batch = QueueDequeueManyV2[component_types=[DT_UINT8], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](batch/fifo_queue, batch/n)]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train.py", line 134, in <module>
    main()
  File "train.py", line 121, in main
    batch_hr = sess.run(get_train_batch)
  File "/opt/ds/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 905, in run
    run_metadata_ptr)
  File "/opt/ds/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1137, in _run
    feed_dict_tensor, options, run_metadata)
  File "/opt/ds/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1355, in _do_run
    options, run_metadata)
  File "/opt/ds/anaconda3/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1374, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: FIFOQueue '_1_batch/fifo_queue' is closed and has insufficient elements (requested 14, current size 0)
         [[Node: batch = QueueDequeueManyV2[component_types=[DT_UINT8], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](batch/fifo_queue, batch/n)]]

Caused by op 'batch', defined at:
  File "train.py", line 134, in <module>
    main()
  File "train.py", line 68, in main
    get_train_batch, get_val_batch, get_eval_batch = build_inputs(args, sess)
  File "/home/ds/ykochnev/SRGAN-orig/utilities.py", line 55, in build_inputs
    get_train_batch = build_input_pipeline(train_filenames, batch_size=args.batch_size, img_size=args.image_size, random_crop=True)
  File "/home/ds/ykochnev/SRGAN-orig/utilities.py", line 36, in build_input_pipeline
    image_batch = tf.train.batch([image], batch_size=batch_size, num_threads=num_threads, capacity=10 * batch_size)
  File "/opt/ds/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/input.py", line 989, in batch
    name=name)
  File "/opt/ds/anaconda3/lib/python3.6/site-packages/tensorflow/python/training/input.py", line 763, in _batch
    dequeued = queue.dequeue_many(batch_size, name=name)
  File "/opt/ds/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/data_flow_ops.py", line 483, in dequeue_many
    self._queue_ref, n=n, component_types=self._dtypes, name=name)
  File "/opt/ds/anaconda3/lib/python3.6/site-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 2430, in _queue_dequeue_many_v2
    component_types=component_types, timeout_ms=timeout_ms, name=name)
  File "/opt/ds/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/opt/ds/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3271, in create_op
    op_def=op_def)
  File "/opt/ds/anaconda3/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1650, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

OutOfRangeError (see above for traceback): FIFOQueue '_1_batch/fifo_queue' is closed and has insufficient elements (requested 14, current size 0)
         [[Node: batch = QueueDequeueManyV2[component_types=[DT_UINT8], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](batch/fifo_queue, batch/n)]]
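With TensorFlow 1.x queue-based input pipelines, an OutOfRangeError like this is often a symptom of a reader thread failing earlier, for instance on an empty or non-image file, which then closes the queue. That is only one possible cause and is not confirmed for this report. As a first check, the dataset could be scanned for files that are empty or lack a JPEG or PNG signature. The sketch below uses only the standard library; the function name `find_suspect_images` and the assumption that the dataset holds only JPEG and PNG files are illustrative.

```python
import os

# File signatures (magic bytes) for the two formats assumed to be in the dataset.
JPEG_MAGIC = b"\xff\xd8\xff"
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def find_suspect_images(root):
    """Return paths under `root` that are empty or do not start like a JPEG/PNG."""
    suspects = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                header = handle.read(8)
            # An empty file yields b"", which matches neither signature.
            if not (header.startswith(JPEG_MAGIC) or header.startswith(PNG_MAGIC)):
                suspects.append(path)
    return suspects
```

Calling `find_suspect_images("path/to/dataset")` returns a list of files worth inspecting or removing. The check only looks at the header, so truncated files and images smaller than the training crop size would still pass and need a separate check.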
