
arbitrary_style_transfer's Introduction

Arbitrary-Style-Transfer

Arbitrary-Style-Per-Model Fast Neural Style Transfer Method

Description

A deep convolutional neural network with an Encoder-AdaIN-Decoder architecture serves as the Style Transfer Network (STN). It accepts two arbitrary images as inputs (one as content, the other as style) and outputs a generated image that recombines the content and spatial structure of the former with the style (color, texture) of the latter, without re-training the network. The STN is trained using the MS-COCO dataset (about 12.6 GB) and the WikiArt dataset (about 36 GB).

This code is based on Huang et al., "Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization" (ICCV 2017).

[Figure: stn_overview] System overview (picture from Huang et al.'s original paper). The encoder is a fixed VGG-19 (up to relu4_1) pre-trained on the ImageNet dataset for image classification. We train the decoder to invert the AdaIN output from feature space back to image space.
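
In essence, AdaIN aligns the channel-wise mean and standard deviation of the content features with those of the style features. Below is a minimal NumPy sketch of the operation, assuming NHWC feature maps (the repo itself works on TensorFlow tensors):

```python
import numpy as np

def adain(content_feat, style_feat, epsilon=1e-5):
    """Adaptive Instance Normalization (Huang et al., ICCV 2017).

    Shifts the per-channel statistics of the content feature map so that
    they match those of the style feature map. Inputs are NHWC arrays.
    """
    # Per-sample, per-channel statistics over the spatial axes (H, W).
    c_mean = content_feat.mean(axis=(1, 2), keepdims=True)
    c_std = content_feat.std(axis=(1, 2), keepdims=True)
    s_mean = style_feat.mean(axis=(1, 2), keepdims=True)
    s_std = style_feat.std(axis=(1, 2), keepdims=True)

    # AdaIN(x, y) = sigma(y) * (x - mu(x)) / sigma(x) + mu(y)
    normalized = (content_feat - c_mean) / (c_std + epsilon)
    return s_std * normalized + s_mean
```

The decoder is then trained to map adain(f(c), f(s)) back to image space, where f is the fixed VGG-19 encoder.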

Prerequisites

Trained Model

You can download my trained model from here; it was trained with a style weight of 2.0.
Alternatively, run download_trained_model.sh in the repo.

Manual

  • The main file main.py is a demo that contains both the training procedure and the inference procedure (inference means generating stylized images).
    You can switch between the two procedures by changing the flag IS_TRAINING.
  • By default,
    (1) The content images lie in the folder "./images/content/"
    (2) The style images lie in the folder "./images/style/"
    (3) The weights file of the pre-trained VGG-19 lies in the current working directory. (See Prerequisites above. By the way, download_vgg19.sh already takes care of this.)
    (4) The MS-COCO images dataset for training lies in the folder "../MS_COCO/" (See Prerequisites above)
    (5) The WikiArt images dataset for training lies in the folder "../WikiArt/" (See Prerequisites above)
    (6) The checkpoint files of trained models lie in the folder "./models/" (You should create this folder manually before training.)
    (7) After the inference procedure, the stylized images are generated and written to the folder "./outputs/"
  • For training, make sure (3), (4), (5) and (6) are prepared correctly.
  • For inference, make sure (1), (2), (3) and (6) are prepared correctly.
  • Of course, you can organize the files and folders however you like; just modify the related parameters in main.py (see the configuration sketch after this list).
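
Putting the defaults above together, the configuration near the top of main.py looks roughly like the sketch below. Only the flag IS_TRAINING is confirmed by this README; the other variable names (and the VGG weights filename) are illustrative assumptions, so check them against the actual main.py:

```python
# Hypothetical configuration sketch -- only IS_TRAINING is named in the
# README; verify the remaining names against main.py before relying on them.
IS_TRAINING = False  # True: train the STN; False: infer (generate stylized images)

CONTENT_DIR = './images/content/'               # (1) content images for inference
STYLE_DIR   = './images/style/'                 # (2) style images for inference
VGG_PATH    = './imagenet-vgg-verydeep-19.mat'  # (3) pre-trained VGG-19 weights (filename assumed)
COCO_DIR    = '../MS_COCO/'                     # (4) content training set
WIKIART_DIR = '../WikiArt/'                     # (5) style training set
MODEL_DIR   = './models/'                       # (6) checkpoints (create this folder first)
OUTPUT_DIR  = './outputs/'                      # (7) where stylized images are written
```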

Results

[Figure: style images and the corresponding outputs (generated images)]

My Running Environment

Hardware

  • CPU: Intel® Core™ i9-7900X (3.30GHz x 10 cores, 20 threads)
  • GPU: NVIDIA® Titan Xp (Architecture: Pascal, Frame buffer: 12GB)
  • Memory: 32GB DDR4

Operating System

  • Ubuntu 16.04.3 LTS

Software

  • Python 3.6.2
  • NumPy 1.13.1
  • TensorFlow 1.3.0
  • SciPy 0.19.1
  • CUDA 8.0.61
  • cuDNN 6.0.21

References

  • The encoder, implemented with the first few layers (up to relu4_1) of a pre-trained VGG-19, is based on Anish Athalye's vgg.py

Citation

  @misc{ye2017arbitrarystyletransfer,
    author = {Wengao Ye},
    title = {Arbitrary Style Transfer},
    year = {2017},
    publisher = {GitHub},
    journal = {GitHub repository},
    howpublished = {\url{https://github.com/elleryqueenhomels/arbitrary_style_transfer}}
  }


arbitrary_style_transfer's Issues

Is it possible to run this on a mobile device?

What is the speed compared to Fast style transfer?
Also, is it possible to shrink the model? The model plus VGG-19 is about 200 MB, which is too much for mobile devices.
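
One way to probe the size question without touching the architecture is to cast the trained weights to float16, which roughly halves the checkpoint's footprint; whether stylization quality survives the cast would need testing. A sketch using the TF 1.x checkpoint reader (the checkpoint path is a placeholder):

```python
import numpy as np
import tensorflow as tf

CKPT_PATH = './models/style_weight_2e0.ckpt'  # placeholder; use your own checkpoint

# Read every variable from the checkpoint and re-save it as float16 in a
# compressed .npz archive, roughly halving the on-disk size.
reader = tf.train.NewCheckpointReader(CKPT_PATH)
halved = {name.replace('/', '__'): reader.get_tensor(name).astype(np.float16)
          for name in reader.get_variable_to_shape_map()}
np.savez_compressed('stn_weights_fp16.npz', **halved)
```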

How should I train my own model?

Hello,
I have a question about training a new model: how should I train my own model? I always get an error and have no idea why.
I would appreciate it if you could tell me how to do this. Thank you @elleryqueenhomels

Resource Exhausted Error while testing

ResourceExhaustedError (see above for traceback): OOM when allocating tensor with shape[1,64,1600,1330] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc

	 [[node Conv2D_10 (defined at /localhome/prathmeshmadhu/work/EFI/Code/arbitrary_style_transfer/encoder.py:95)  = Conv2D[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="VALID", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:GPU:0"](Conv2D_10-0-TransposeNHWCToNCHW-LayoutOptimizer, encoder/conv1_2/kernel/read)]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

	 [[{{node clip_by_value/_77}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_487_clip_by_value", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

I am trying to generate 40,000+ different stylized images; it generates about 220 images and then always throws this resource error. My hardware details are as follows:

GPU: GeForce GTX 970, 4 GB.
CPU: 8 cores, 64 GB RAM.

Do let me know if you need any further details.
Thanks.
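
The failing allocation ([1, 64, 1600, 1330]) is the 64-channel conv1 feature map of a 1600x1330 input, which is a lot for a 4 GB card. A common workaround, sketched here with Pillow (the size cap is an assumption to tune for your GPU), is to shrink each content image before stylizing it:

```python
from PIL import Image

MAX_SIDE = 1024  # assumed cap; lower it until inference fits in GPU memory

def shrink_to_fit(in_path, out_path, max_side=MAX_SIDE):
    """Downscale an image so its longer side is at most max_side pixels."""
    img = Image.open(in_path)
    scale = max_side / float(max(img.size))
    if scale < 1.0:  # only ever shrink, never enlarge
        new_size = (int(img.width * scale), int(img.height * scale))
        img = img.resize(new_size, Image.LANCZOS)
    img.save(out_path)
```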

The results I got from this program were not right

Firstly, thank you very much. This program is very useful and easy to understand.
But after I trained this network, I did not get the right result; I just got an image of a single color.
I changed BATCH_SIZE to 2 and used 10,000 images (content and style) for training.
I do not know where I went wrong.
I would be very grateful for any advice. Thank you.

When I train the model, why is it always interrupted?

I downloaded the WikiArt and MS-COCO datasets; they contain about 80k and 110k images respectively. I preprocessed the content and style images (the WikiArt dataset has some damaged pictures, but MS-COCO does not). Then I began training the model, but it is always interrupted before finishing the first epoch. Can you give me some advice?
The erroneous output is below (one epoch should have about 9,900 steps):

step: 5620, total loss: 2517.050, elapsed time: 0:54:17.473793
content loss: 2478.894
style loss : 3815.609, weighted style loss: 38.156

Something wrong happens! Current model is saved to <D:\wxd\arbitrary_style_transfer-master_0\tmp_model>

Done training! Elapsed time: 0:54:30.136937
Model is saved to: model_retrain/style_weight_1e-2.ckpt

Successfully! Done all training...

we end at 2019-01-06 10:24:25.703097

Yeah, I think your suggestion is right, and I will give it a try. Thank you very much. I still have a question: when you trained the model, was the loss as big as mine? The content loss is about 2000-5000 and it seems to never converge.

Originally posted by @wxd19961228 in #8 (comment)
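
For context on the loss magnitudes discussed above: the objective in Huang et al. is L = L_c + λ·L_s, where the content loss compares the re-encoded output against the AdaIN target and the style loss matches channel-wise means and standard deviations over several VGG layers. A NumPy sketch of that bookkeeping follows (NHWC layout assumed; note that reducing with a sum versus a mean changes the absolute numbers, which is one reason raw loss values are hard to compare across implementations):

```python
import numpy as np

def mean_std(feat):
    """Channel-wise mean and std over the spatial axes of an NHWC map."""
    return feat.mean(axis=(1, 2)), feat.std(axis=(1, 2))

def total_loss(enc_output, adain_target, gen_feats, style_feats,
               style_weight=2.0):
    """Huang et al. objective: content loss + weighted style loss.

    enc_output:   the generated image re-encoded at relu4_1
    adain_target: AdaIN(content, style) features (the decoder's target)
    gen_feats:    generated-image features at several VGG layers
    style_feats:  style-image features at the same layers
    """
    content_loss = np.mean((enc_output - adain_target) ** 2)

    style_loss = 0.0
    for f_gen, f_sty in zip(gen_feats, style_feats):
        mu_g, sd_g = mean_std(f_gen)
        mu_s, sd_s = mean_std(f_sty)
        style_loss += np.mean((mu_g - mu_s) ** 2) + np.mean((sd_g - sd_s) ** 2)

    return content_loss + style_weight * style_loss
```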

VGG Model File Source (And a question)

Two questions:

  1. Where can I find the code used to train the VGG-19 model that is available for download in the repository's README?

  2. During training, are the weights of the VGG model updated, or are they kept fixed? The project I'm working on is style transfer for webpage screenshots. I suspect that I need to train a model for classifying webpages before I can do style transfer. That's why it would be useful to know how you trained this VGG model, so that I can do the same (with my own dataset).

Many thanks, great repo!
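
On the second question: per the README, the encoder is a fixed VGG-19 and only the decoder is trained. One common TF 1.x way to guarantee that (in the spirit of Anish Athalye's vgg.py, which this repo credits) is to build the encoder's conv layers from tf.constant, so the pre-trained kernels never enter tf.trainable_variables(). A hedged sketch; the kernel and bias values would come from the downloaded VGG-19 weights file:

```python
import tensorflow as tf

def frozen_conv_relu(x, kernel_values, bias_values):
    """Conv + ReLU whose weights are tf.constant: no gradients flow into
    them and they never appear in tf.trainable_variables(), so the
    optimizer can only ever update the decoder."""
    kernel = tf.constant(kernel_values, dtype=tf.float32)
    bias = tf.constant(bias_values, dtype=tf.float32)
    conv = tf.nn.conv2d(x, kernel, strides=[1, 1, 1, 1], padding='SAME')
    return tf.nn.relu(tf.nn.bias_add(conv, bias))
```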
