
fast-neural-style's Introduction

fast-neural-style

This is the code for the paper

Perceptual Losses for Real-Time Style Transfer and Super-Resolution
Justin Johnson, Alexandre Alahi, Li Fei-Fei
Presented at ECCV 2016

The paper builds on A Neural Algorithm of Artistic Style by Leon A. Gatys, Alexander S. Ecker, and Matthias Bethge by training feedforward neural networks that apply artistic styles to images. After training, our feedforward networks can stylize images hundreds of times faster than the optimization-based method presented by Gatys et al.

This repository also includes an implementation of instance normalization as described in the paper Instance Normalization: The Missing Ingredient for Fast Stylization by Dmitry Ulyanov, Andrea Vedaldi, and Victor Lempitsky. This simple trick significantly improves the quality of feedforward style transfer models.

Stylizing this image of the Stanford campus at a resolution of 1200x630 takes 50 milliseconds on a Pascal Titan X.

In this repository we provide:

  • Pretrained style transfer models (see Pretrained Models below)
  • A script for stylizing new images with a trained model (fast_neural_style.lua)
  • A real-time webcam demo (webcam_demo.lua)
  • Code and instructions for training new models
  • An optimization-based baseline (slow_neural_style.lua)

If you find this code useful for your research, please cite:

@inproceedings{Johnson2016Perceptual,
  title={Perceptual losses for real-time style transfer and super-resolution},
  author={Johnson, Justin and Alahi, Alexandre and Fei-Fei, Li},
  booktitle={European Conference on Computer Vision},
  year={2016}
}

Setup

All code is implemented in Torch.

First install Torch, then update / install the following packages:

luarocks install torch
luarocks install nn
luarocks install image
luarocks install lua-cjson

(Optional) GPU Acceleration

If you have an NVIDIA GPU, you can accelerate all operations with CUDA.

First install CUDA, then update / install the following packages:

luarocks install cutorch
luarocks install cunn

(Optional) cuDNN

When using CUDA, you can use cuDNN to accelerate convolutions.

First download cuDNN and copy the libraries to /usr/local/cuda/lib64/. Then install the Torch bindings for cuDNN:

luarocks install cudnn

Pretrained Models

Download all pretrained style transfer models by running the script

bash models/download_style_transfer_models.sh

This will download ten model files (~200MB) to the folder models/.

Models from the paper

The style transfer models we used in the paper will be located in the folder models/eccv16. Here are some example results where we use these models to stylize this image of the Chicago skyline at an image size of 512.


Models with instance normalization

As discussed in the paper Instance Normalization: The Missing Ingredient for Fast Stylization by Dmitry Ulyanov, Andrea Vedaldi, and Victor Lempitsky, replacing batch normalization with instance normalization significantly improves the quality of feedforward style transfer models.

We have trained several models with instance normalization; after downloading pretrained models they will be in the folder models/instance_norm.

These models use the same architecture as those used in our paper, except with half the number of filters per layer and with instance normalization instead of batch normalization. Using narrower layers makes the models smaller and faster without sacrificing model quality.

Here are some example outputs from these models, with an image size of 1024.



Running on new images

The script fast_neural_style.lua lets you use a trained model to stylize new images:

th fast_neural_style.lua \
  -model models/eccv16/starry_night.t7 \
  -input_image images/content/chicago.jpg \
  -output_image out.png

You can run the same model on an entire directory of images like this:

th fast_neural_style.lua \
  -model models/eccv16/starry_night.t7 \
  -input_dir images/content/ \
  -output_dir out/

You can control the size of the output images using the -image_size flag.

By default this script runs on CPU; to run on GPU, add the flag -gpu specifying the GPU on which to run.
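For example, a GPU run with an explicit output size might look like this (the flag values here are purely illustrative):

th fast_neural_style.lua \
  -model models/instance_norm/candy.t7 \
  -input_image images/content/chicago.jpg \
  -output_image out.png \
  -image_size 1024 \
  -gpu 0

If cuDNN is installed, adding -backend cuda -use_cudnn 1 (as several of the commands in the issues below do) should enable it.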

The full set of options for this script is described here.

Webcam demo

You can use the script webcam_demo.lua to run one or more models in real-time off a webcam stream. To run this demo you need to use qlua instead of th:

qlua webcam_demo.lua -models models/instance_norm/candy.t7 -gpu 0

You can run multiple models at the same time by passing a comma-separated list to the -models flag:

qlua webcam_demo.lua \
  -models models/instance_norm/candy.t7,models/instance_norm/udnie.t7 \
  -gpu 0

With a Pascal Titan X you can easily run four models in real time at 640x480.

The webcam demo depends on a few extra Lua packages:

  • camera
  • qtlua

You can install / update these packages by running:

luarocks install camera
luarocks install qtlua

The full set of options for this script is described here.

Training new models

You can find instructions for training new models here.

Optimization-based Style Transfer

The script slow_neural_style.lua is similar to the original neural-style, and uses the optimization-based style-transfer method described by Gatys et al.

This script uses the same code for computing losses as the feedforward training script, allowing for fair comparisons between feedforward style transfer networks and optimization-based style transfer.
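A minimal invocation looks like the following, assuming you have downloaded the default VGG-16 loss network (the style image path below is just a placeholder):

th slow_neural_style.lua \
  -content_image images/content/chicago.jpg \
  -style_image path/to/style.jpg \
  -output_image out.png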

Compared to the original neural-style, this script has the following improvements:

  • Remove dependency on protobuf and loadcaffe
  • Support for many more CNN architectures, including ResNets

The full set of options for this script is described here.

License

Free for personal or research use; for commercial use please contact me.

fast-neural-style's People

Contributors

dmitryulyanov, houxianxu, htoyryla, jcjohnson, junrushao, programmarchy, reddragon, reilnuud, romawhite47


fast-neural-style's Issues

Training parameters and procedure

So I am trying to train some models in an impressionism style (although I guess that doesn't matter). However, after reading all the steps and the flags I have some questions:

  • Once I create the dataset (the first .h5 file, before any training), can that dataset only be trained once? Say I want to end up with two models from two different styles: should I copy my original .h5 file built from the COCO dataset and train each copy with a different style, or does training create a new .h5 file and leave the original alone? Put differently, do I need to repeat the "prepare the dataset" step every time, or will the first .h5 file never be overwritten by step 2, "train the model"?
  • Does the size flag influence the maximum size of the final stylized picture, or the speed at which it runs? What is a reasonable value for it?
  • What is a reasonable range of values for num_iterations, from acceptable-and-fast up to the point where results stop improving much?
  • What does the batch_size value influence?
  • This one is more about the process: I am using my university's computing resources and can choose to prioritize CPU or GPU. What should I prioritize when preparing the dataset, training the model, and running the stylization itself: GPU, CPU, or both?

It would be really helpful if you could post some example training commands together with your results, because there are so many variables that I don't know where to start.

I hope this clears the path for other newbies in this area and, as always, thanks for being so helpful.
Antonio
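(For reference, the training commands quoted in other issues further down all follow the same pattern; the values below are simply the ones used there, not recommendations:)

th train.lua \
  -h5_file path/to/dataset.h5 \
  -style_image path/to/style/image.jpg \
  -style_image_size 384 \
  -content_weights 1.0 \
  -style_weights 5.0 \
  -checkpoint_name checkpoint \
  -gpu 0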

Run train on AWS EC2

I set up an AWS EC2 instance with a GPU. When I ran

sudo apt-get install libhdf5-dev

I got the following error:

Package libhdf5-dev is not available, but is referred to by another package.
This may mean that the package is missing, has been obsoleted, or
is only available from another source

So I could not install torch-hdf5.

Can you help?

training images & validation images ?

Sorry for this dumb question, but what exactly are the validation images? I've trained on the MS-COCO dataset, but I have never seen a validation folder in any other neural-training GitHub project.

Are they the content images?

I am also getting this error:
HDF5-DIAG: Error detected in HDF5 (1.8.16) thread 140659249059648:
#000: ../../../src/H5Dio.c line 173 in H5Dread(): can't read data
major: Dataset
minor: Read failed
#1: ../../../src/H5Dio.c line 437 in H5D__read(): src and dest data spaces have different sizes
major: Invalid arguments to routine
minor: Bad value
/usr/local/bin/luajit: /usr/local/share/lua/5.1/hdf5/dataset.lua:160: HDF5DataSet:partial() - failed reading data from [HDF5DataSet (83886080 /train2014/images DATASET)]
stack traceback:
[C]: in function 'assert'
/usr/local/share/lua/5.1/hdf5/dataset.lua:160: in function 'partial'
./fast_neural_style/DataLoader.lua:79: in function 'getBatch'
train.lua:160: in function 'opfunc'
/usr/local/share/lua/5.1/optim/adam.lua:37: in function 'adam'
train.lua:240: in function 'main'
train.lua:328: in main chunk
[C]: in function 'dofile'
/usr/local/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x004057a0

cannot train a new model

I encountered the following error when training a model. I didn't change anything in Johnson's code, and Torch + CUDA work very well on my machine. Does anyone have any ideas? Thanks very much!

/home/xwang/torch/install/bin/luajit: /home/xwang/torch/install/share/lua/5.1/nn/Container.lua:67:
In 6 module of nn.Sequential:
/home/xwang/torch/install/share/lua/5.1/nn/THNN.lua:109: wrong number of arguments for function call
stack traceback:
    [C]: in function 'v'
    /home/xwang/torch/install/share/lua/5.1/nn/THNN.lua:109: in function 'SpatialMaxPooling_updateOutput'
    ...ang/torch/install/share/lua/5.1/nn/SpatialMaxPooling.lua:42: in function <...ang/torch/install/share/lua/5.1/nn/SpatialMaxPooling.lua:31>
    [C]: in function 'xpcall'
    /home/xwang/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
    /home/xwang/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
    ./fast_neural_style/PerceptualCriterion.lua:79: in function 'setStyleTarget'
    train.lua:136: in function 'main'
    train.lua:329: in main chunk
    [C]: in function 'dofile'
    ...wang/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00405d50

WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
    [C]: in function 'error'
    /home/xwang/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
    /home/xwang/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
    ./fast_neural_style/PerceptualCriterion.lua:79: in function 'setStyleTarget'
    train.lua:136: in function 'main'
    train.lua:329: in main chunk
    [C]: in function 'dofile'
    ...wang/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00405d50

Style image as input to fast_neural_style.lua?

Hi,

Is it possible to pass a style image as input to one of your pre-trained fast-neural-style models?

It seems to me that it is possible to pass a style image to slow_neural_style — is that right?

Thank you for your amazing work and for making it open!

Could not load ResNet model

ubuntu@ip-Address:~/fast-neural-style$ th slow_neural_style.lua -content_image content.jpg -style_image style.jpg -output_image out.png -save_every 50 -print_every 50 -preprocessing resnet -loss_network models/resnet_50_1by2_nsfw.caffemodel -num_iterations 1550 -gpu 0 -backend cuda -use_cudnn 1 -optimizer adam
ERROR: Could not load loss network from models/resnet_50_1by2_nsfw.caffemodel
You may need to download the VGG-16 model by running:
bash models/download_vgg16.sh
ubuntu@ip-Address:~/fast-neural-style$

about the style loss function

Hi Justin,

I have a question about the style loss function. It seems that the Gram matrix is normalized by 1/(C*H*W), and you used MSECriterion on the Gram matrix, which means dividing by another C*C. I was reading the paper and I thought you only need the 1/(C*H*W) part. Could you please explain what you used? Thanks.
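Writing out the two normalizations the question refers to (this is just a restatement of the description above, not a claim about what the code actually computes): with $\phi(x)$ the $C \times HW$ feature map at the chosen layer,

$$
G_{ij}(x) = \frac{1}{CHW}\sum_{k=1}^{HW} \phi_{ik}(x)\,\phi_{jk}(x),
\qquad
\ell_{\text{style}} = \frac{1}{C^2}\,\big\lVert G(\hat{y}) - G(y_s)\big\rVert_F^2 ,
$$

where the extra $1/C^2$ factor would come from MSECriterion averaging over the $C \times C$ entries of the Gram matrix, on top of the $1/(CHW)$ factor the paper defines.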

Support for OpenCL?

I wish I could answer my own question, but I don't have a GPU yet. :/

Torch supports OpenCL pretty well now, compared to the huge lag in other popular frameworks. Is there a reason you suggest only CUDA in the readme, or should this work on any GPU-accelerated Torch setup?

Thanks!

Using two models at the same time

Hi :)
In the readme there is an option, for the camera style transfer, to use more than one model when stylizing an image. I tried to use it with fast_neural_style, but it says that the second model is an invalid argument. When I use it like this:
th fast_neural_style.lua -model models/eccv16/stasio.t7;models/eccv16/V.t7 -input_image images/content.chickago.jpg -output_image dwas.jpg
Writing output image to out.png
bash: models/eccv16/V.t7: Permission denied
It didn't write dwas.jpg; it wrote out.png and used only the first model.
I tried to sudo this, but permission is still denied.

preprocessing

@jcjohnson
Hi Johnson,

I was wondering whether you should preprocess the raw images before sending them into the generated model, and how. Does the code preprocess images with the following steps?
1. Normalize the images: image / 255
2. Subtract the ResNet mean {0.485, 0.456, 0.406}
3. Divide by the standard deviation {0.229, 0.224, 0.225}

And how can I deprocess the generated image back into a raw image?
Thank you ;)
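For what it's worth, the three steps described above can be written as a minimal Torch sketch like the one below. This is just the questioner's description turned into code, not necessarily what the repo's own preprocessing module does.

require 'torch'

-- Mean and std quoted in the question (ResNet-style preprocessing).
local mean = torch.Tensor({0.485, 0.456, 0.406}):view(3, 1, 1)
local std  = torch.Tensor({0.229, 0.224, 0.225}):view(3, 1, 1)

-- img: a 3 x H x W tensor with values in [0, 255]
local function preprocess(img)
  local out = img:double():div(255)    -- step 1: scale to [0, 1]
  out:add(-1, mean:expandAs(out))      -- step 2: subtract the mean
  out:cdiv(std:expandAs(out))          -- step 3: divide by the std
  return out
end

-- Undo the three steps in reverse order ("deprocessing").
local function deprocess(img)
  local out = img:clone()
  out:cmul(std:expandAs(out))
  out:add(mean:expandAs(out))
  return out:mul(255)
end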

Why model file size is not fixed, but seems to increase with the training schedule?

The model file size does not seem to be fixed; it grows with the training schedule. I trained some models with style_image_size set to 512: the model was about 25MB when using 40000 images, but it increased to 27.1MB with 50000 images.
I also noticed that the file sizes of the models you provide in the instance_norm folder vary between roughly 10 and 20MB. Are those models smaller because style_image_size was only 256?
These model files seem a little too big to me. Is there any way to reduce them? @jcjohnson

Fast Neural Style

Hi Johnson,

Hope you are doing well. We want your permission to use the "Fast Neural Style" functionality in our Android and iOS native apps for commercial purposes.

Thanks,
MetiCode Team

problem with output_dir

Hi guys! I have used the normal neural-style code for some weeks, and now I am trying this one. For some strange reason, when I use individual files as inputs, it runs without problems. However, when I try to use folders as inputs I get an error:
bad argument #1 to 'lower'.
I searched the internet and the only thing I could find is jcjohnson/neural-style#62, but I don't really understand the solution posted there.
Here is what I run and what I get:

[antonioa1@wh-520-9-5 fnsm]$ th fast_neural_style.lua -model models/instance_norm/1.t7 -input_image inputs/1.jpg -image_size 100 -backend cuda -use_cudnn 1
Writing output image to out.png 

[antonioa1@wh-520-9-5 fnsm]$ th fast_neural_style.lua -model models/instance_norm/1.t7 -input_dir inputs/ -output_dir out/ -image_size 100 -backend cuda -use_cudnn 1
/home/a/antonioa1/torch/install/bin/luajit: ./fast_neural_style/utils.lua:119: bad argument #1 to 'lower' (string expected, got no value)
stack traceback:
    [C]: in function 'lower'
    ./fast_neural_style/utils.lua:119: in function 'is_image_file'
    fast_neural_style.lua:109: in function 'main'
    fast_neural_style.lua:124: in main chunk
    [C]: in function 'dofile'
    ...ioa1/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x004066d0


Speed on Titan X Pascal

Hi! I tried to reproduce your timings. I tested with image size 1024 on a Titan X Pascal, and I also installed cuDNN, but I got around 1.5-2s.

th fast_neural_style.lua \
  -model models/instance_norm/candy.t7 \
  -input_image images/content/chicago.jpg \
  -output_image out.png \
  -gpu 0

Did I miss something? Should I use an SSD?
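One thing worth double-checking: the command above only passes -gpu 0, while other commands in this document also pass the cuDNN flags. If cuDNN is installed, something like the following should make sure it is actually being used (this is only a guess at the cause, not a confirmed fix):

th fast_neural_style.lua \
  -model models/instance_norm/candy.t7 \
  -input_image images/content/chicago.jpg \
  -output_image out.png \
  -gpu 0 \
  -backend cuda \
  -use_cudnn 1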

Error installing qtlua and lua-cjson

ubuntu@ip-Address:~/torch$ luarocks install qtlua
Installing https://raw.githubusercontent.com/torch/rocks/master/qtlua-scm-1.rockspec...
Using https://raw.githubusercontent.com/torch/rocks/master/qtlua-scm-1.rockspec... switching to 'build' mode
Cloning into 'qtlua'...
remote: Counting objects: 169, done.
remote: Compressing objects: 100% (163/163), done.
remote: Total 169 (delta 12), reused 124 (delta 2), pack-reused 0
Receiving objects: 100% (169/169), 363.48 KiB | 0 bytes/s, done.
Resolving deltas: 100% (12/12), done.
Checking connectivity... done.
cmake -E make_directory build && cd build && cmake .. -DCMAKE_BUILD_TYPE=Release -DLUA=/home/ubuntu/torch/install/bin/luajit -DLUA_BINDIR="/home/ubuntu/torch/install/bin" -DLUA_INCDIR="/home/ubuntu/torch/install/include" -DLUA_LIBDIR="/home/ubuntu/torch/install/lib" -DLUADIR="/home/ubuntu/torch/install/lib/luarocks/rocks/qtlua/scm-1/lua" -DLIBDIR="/home/ubuntu/torch/install/lib/luarocks/rocks/qtlua/scm-1/lib" -DCONFDIR="/home/ubuntu/torch/install/lib/luarocks/rocks/qtlua/scm-1/conf" && make

-- The C compiler identification is GNU 5.4.0
-- The CXX compiler identification is GNU 5.4.0
-- Check for working C compiler: /usr/bin/cc
-- Check for working C compiler: /usr/bin/cc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Detecting C compile features
-- Detecting C compile features - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
CMake Error at /usr/share/cmake-3.5/Modules/FindQt4.cmake:1326 (message):
  Found unsuitable Qt version "" from NOTFOUND, this code requires Qt 4.x
Call Stack (most recent call first):
  CMakeLists.txt:38 (FIND_PACKAGE)


-- Configuring incomplete, errors occurred!
See also "/tmp/luarocks_qtlua-scm-1-8207/qtlua/build/CMakeFiles/CMakeOutput.log".

Error: Build error: Failed building.

And the second error:

ubuntu@ip-Address:~$ luarocks install lua-cjson
Installing https://raw.githubusercontent.com/rocks-moonscript-org/moonrocks-mirror/master/lua-cjson-2.1.0-1.src.rock...
Using https://raw.githubusercontent.com/rocks-moonscript-org/moonrocks-mirror/master/lua-cjson-2.1.0-1.src.rock... switching to 'build' mode

Error: Failed unpacking rock file: /tmp/luarocks_luarocks-rock-lua-cjson-2.1.0-1-4081/lua-cjson-2.1.0-1.src.rock
ubuntu@ip-Address:~$

Everything else seemed to install without issue.

Edit:

The lua-cjson issue is solved with sudo apt-get install unzip.

Solution: could not install camera for webcam_demo

For some reason, I could not install lua---camera on my computer with the better GPU, probably due to a mismatch between lua---camera and my OpenCV 3.x.

As I had the cv module for Lua installed (for https://github.com/szagoruyko/torch-opencv-demos), I quickly patched webcam_demo to use cv instead. If anyone has a similar problem, here's my code: https://gist.github.com/htoyryla/c0ca5dcb7c00720a9e2b3160c494c685 . Webcam params are not implemented except for webcam_idx.

PS. Thanks for releasing the code. I especially enjoy how well many of the models use lines (a huge change from the original neural-style).

More trained models

Are there any more instance_norm based trained models I can download and try?

Cannot load pre-trained model file on android and iOS

Hi Everyone,

I tried to implement the stylization step on Android and iOS. After generating the apps on both platforms, I tried to execute fast_neural_style.lua using a pre-trained model file that was trained on a desktop computer running Ubuntu 14.04. The stylization step ran on simulators on both platforms, i.e., the Android emulator from Google and the Xcode simulator from Apple.

However, the pretrained model file cannot be loaded on either simulator. The following is the error message:
"File.lua:168: read error: read 0 blocks instead of 1".

I have tried all the methods I could find online, for example using "ascii" mode for saving and loading, using the exact path for the model file, and providing "b64" or "apkbinary64" when loading, but no luck so far. On both simulators, I was able to load the sample model file that comes with the torch7-android installation package. (The sample model file is trivial and has nothing to do with the fast-neural-style application.) But I could not load the model files that were generated by the training step of this application. The training step was run on a 64-bit desktop computer.

Is there anyone who had successfully loaded the pre-trained model files on either android or iOS and could provide some help?

Thanks and Regards,
Michael

Error in CuDNN: CUDNN_STATUS_ALLOC_FAILED in training

Epoch 1.000000, Iteration 20695 / 40000, loss = 323682.617689 0.001
/home/ubuntu/torch/install/bin/luajit: /home/ubuntu/.luarocks/share/lua/5.1/nn/Container.lua:67:
In 5 module of nn.Sequential:
/home/ubuntu/torch/install/share/lua/5.1/cudnn/init.lua:58: Error in CuDNN: CUDNN_STATUS_ALLOC_FAILED
stack traceback:
[C]: in function 'error'
/home/ubuntu/torch/install/share/lua/5.1/cudnn/init.lua:58: in function 'errcheck'
...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:186: in function 'createIODescriptors'
...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:364: in function <...torch/install/share/lua/5.1/cudnn/SpatialConvolution.lua:361>
[C]: in function 'xpcall'
/home/ubuntu/.luarocks/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
/home/ubuntu/.luarocks/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
train.lua:164: in function 'opfunc'
/home/ubuntu/torch/install/share/lua/5.1/optim/adam.lua:33: in function 'adam'
train.lua:240: in function 'main'
train.lua:328: in main chunk
[C]: in function 'dofile'
...untu/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00406670

This happens when it gets to the validation set (40k images); as far as I know the 80k training images work fine... maybe? I am not sure whether it goes over them first.
What could this be? The results look quite bad too; I guess it didn't finish its iterations.

My h5 file is 22+ GB. It did fail with a strange error (something like "cannot allocate/open") when it was missing 4 (four) images from the validation set.

Edit: I redid it with a new h5 file that completed correctly (same size to the bit, though)... still the same error. It has some problems starting the second epoch for some reason; I read the code but did not find any clear cause.

Avoid checkerboard pattern

Hey,

Your algorithm has been mentioned in a recent article about checkerboard artifacts. It says you could avoid them by "switching deconvolutional layers for resize-convolution layers". I'm not sure what that means, but I thought I would mention it.
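In Torch terms, "resize-convolution" usually means replacing each learned upsampling (SpatialFullConvolution) layer with a fixed nearest-neighbor upsample followed by an ordinary convolution. A minimal sketch is below; the layer sizes are copied from the network printouts elsewhere in this document, and this is not a claim about how the repo actually implements it:

require 'nn'

-- Instead of a learned 2x upsampling layer such as:
--   nn.SpatialFullConvolution(128, 64, 3, 3, 2, 2, 1, 1, 1, 1)
-- use a fixed nearest-neighbor resize followed by a normal convolution:
local resize_conv = nn.Sequential()
resize_conv:add(nn.SpatialUpSamplingNearest(2))                    -- double H and W
resize_conv:add(nn.SpatialConvolution(128, 64, 3, 3, 1, 1, 1, 1)) -- 3x3 conv, stride 1, pad 1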

Training installation instructions vs Ubuntu 14.04.5

Hi. Following https://github.com/jcjohnson/fast-neural-style/blob/master/doc/training.md on a Ubuntu 14.04.5 system, I noticed a few things.

  • A missing sudo apt-get install python-virtualenv step
  • The pip install -r requirements.txt step ran into a UnicodeDecodeError: 'ascii' codec can't decode ... issue; restarting with LC_ALL=C LANG=C helped (the locale is ordinarily set to en_GB.utf8)
  • Apparently installing numpy through requirements.txt is unreliable; a Google search suggests that other projects have similar problems. Installing it beforehand in the virtual environment with pip install numpy helped.

resume from checkpoint

Thank you for implementing the resume from checkpoint option.

I noticed that using it did not resume with the proper loss. The loss was much higher than when I had stopped training... but playing with the learning rate I noticed the following:

Lowering the learning rate to 1e-4 instead of the default 1e-3 actually resumes training properly without raising the loss.

I'm not sure if there is a science to this, but it is worth trying if you restart training.
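For anyone else trying this: assuming the resume flag is called -resume_from_checkpoint and the learning rate flag -learning_rate (I am going from memory here, so check train.lua for the exact names), the resumed run would look something like:

th train.lua \
  -h5_file path/to/dataset.h5 \
  -style_image path/to/style/image.jpg \
  -resume_from_checkpoint checkpoint.t7 \
  -learning_rate 1e-4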

default model

Hi!
For the default models presented in your article, did you use the same default settings for all styles?

Training killed

Hi Justin:

I was training with the following command:

th train.lua -h5_file dataset_file.h5 -style_image images/styles/the_scream.jpg -style_image_size 384 -content_weights 1.0 -style_weights 5.0 -checkpoint_name checkpoint -gpu -1

The program started and printed out the network, then it said "Killed" and exited. I am wondering if you have experienced the same problem before. Is it due to a memory issue or something else?

Here is the print-out:

Initializing model from scratch
nn.Sequential {
[input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> (12) -> (13) -> (14) -> (15) -> (16) -> (17) -> (18) -> (19) -> (20) -> (21) -> (22) -> (23) -> (24) -> output]

///network details

Killed.

Thanks a lot.

-seed option?

I am playing around with your code to process a directory of frames from a video. I am running the code in a Bash shell on Windows 10, and even without CUDA support it is blazingly fast. Nice work! I am getting a lot of artifacts in the background, though. When I was doing a similar process with your original neural-style, there was a -seed option that seemed to smooth out the frame-to-frame differences.

I was wondering if it would be difficult to have a similar option in fast-neural-style?

I have used Manuel Ruder et al.'s artistic-videos code, but my system isn't up to the task without it taking days of processing. It does produce nice results using the optical flow and temporal consistency. I got nearly the same results when processing with the -seed option in neural-style without the optical flow.

Generate image very slow

Hi, I am running fast_neural_style.lua to generate a new image from "chicago.jpg". It took about 10 minutes on my Dell computer with two Intel Core i7 2.7GHz CPUs (4 cores). I did not use a GPU. Is that time normal? It seems too long to me.

About the style layer

Hi Justin,
Why did you eventually choose relu1_2, relu2_2, relu3_3, relu4_3 instead of relu1_1, relu2_1, relu3_1, relu4_1 as in the original neural style paper?

prisma seems to preserve more detail

I find that Prisma seems to preserve more detail when doing style transfer. The following is an example:

(Images: the original photo, the udnie style image, the Prisma result using the udnie style, and the fast-neural-style result using the udnie style.)

There are the following differences:

  1. The udnie style has a lot of red, yellow, and orange, which also shows up in the fast-neural-style result but does not appear in the Prisma result
  2. The Prisma result has a smooth sky
  3. Prisma preserves more detail in the bridge and the trees

I do not know how Prisma achieves this; I have already tuned many hyperparameters during training.

Missing neuralstyle2.utils

Convert_network.lua has this require

local utils = require 'neuralstyle2.utils'

but this module is not included in the project. I even tried to patch something together from the existing modules, filling in the obviously missing pieces (such as InstanceNormalization), but eventually failed.

-init image option

Thank you for sharing this great work!

I was wondering if there is a reason why an option to initialize the output image from the content image is not present in slow_neural_style.lua. I find it helps a lot to initialize from the content rather than from noise. Is it to stay closer to the train.py results?

I find that initializing with the content and setting the content weight near 0 tends to produce the best style transfer.

Script errors out when running pre-trained models

There seems to be a problem when downloading and running the pre-trained models as mentioned in the wiki.

~/work/fast-neural-style on  master ⌚ 17:38:27
$ th fast_neural_style.lua -model models/eccv16/starry_night.t7 -input_image images/content/chicago.jpg -output_image out.png
ERROR: Could not load model from models/eccv16/starry_night.t7
You may need to download the pretrained models by running
bash models/instance_norm/download_models.sh

~/work/fast-neural-style on  master ⌚ 17:38:54
$ ls models/instance_norm/download_models.sh                                                                   ‹ruby-1.8.7-p374›
ls: models/instance_norm/download_models.sh: No such file or directory

~/work/fast-neural-style on  master ⌚ 17:39:00
$ ls models/eccv16/starry_night.t7                                                                             ‹ruby-1.8.7-p374›
models/eccv16/starry_night.t7

  1. models/instance_norm/download_models.sh does not exist.
  2. models/eccv16/starry_night.t7 exists, but the script still errors out; it is not clear why.

Wrong number of arguments for function call

Hi,
I get the error below when trying to run train.lua like this:

th train.lua -h5_file /media/rainer/dada/coco/coco10k.h5 -loss_network models/vgg16.t7 -style_image ~/neural-style/style/10/s08.png -style_image_size 256 -content_weights 1.0 -style_weights 5.0 -checkpoint_name checkpoint -gpu 0

What's wrong? Thanks!

/home/rainer/torch/install/bin/luajit: /home/rainer/torch/install/share/lua/5.1/nn/Container.lua:67:
In 6 module of nn.Sequential:
/home/rainer/torch/install/share/lua/5.1/nn/THNN.lua:109: wrong number of arguments for function call
stack traceback:
[C]: in function 'v'
/home/rainer/torch/install/share/lua/5.1/nn/THNN.lua:109: in function 'SpatialMaxPooling_updateOutput'
...ner/torch/install/share/lua/5.1/nn/SpatialMaxPooling.lua:42: in function <...ner/torch/install/share/lua/5.1/nn/SpatialMaxPooling.lua:31>
[C]: in function 'xpcall'
/home/rainer/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
/home/rainer/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
./fast_neural_style/PerceptualCriterion.lua:78: in function 'setStyleTarget'
train.lua:127: in function 'main'
train.lua:320: in main chunk
[C]: in function 'dofile'
...iner/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00405d50

Training is very slow

I set up everything on a MacBook Pro and started training with:

th train.lua \
  -h5_file path/to/dataset.h5 \
  -style_image path/to/style/image.jpg \
  -style_image_size 384 \
  -content_weights 1.0 \
  -style_weights 5.0 \
  -checkpoint_name checkpoint \
  -gpu -1

It is very slow; the current prediction is that it will take 30 days to complete 40000 iterations. Are the 384 image size and 40000 iterations really necessary?
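For comparison, the training commands in other issues here run on the GPU with -gpu 0 instead of -gpu -1; CPU-only training is expected to be far slower than GPU training, so an estimate of weeks on a MacBook is not surprising. A GPU run would look like the same command with only the -gpu value changed:

th train.lua \
  -h5_file path/to/dataset.h5 \
  -style_image path/to/style/image.jpg \
  -style_image_size 384 \
  -content_weights 1.0 \
  -style_weights 5.0 \
  -checkpoint_name checkpoint \
  -gpu 0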

Minimum number of images of dataset

Hi,
I am trying to train a model using the COCO dataset.
It has 80000 (train) and 40000 (val) images.
I have generated the h5 file (24GB in size) from the COCO dataset.

My system has 32GB of RAM and an NVIDIA GPU.

With -gpu -1, when I try to train the model, the process is very slow: each iteration takes about 10 seconds, and there are 40,000 iterations, so it needs almost 5 days to complete.

With -gpu 0, the result is an "out of memory" error.

So:

  1. What are the minimum specs to train a model with/without a GPU?
  2. How many COCO images are sufficient to reduce the size of the h5 file while still getting the best results?
  3. Will a smaller h5 file reduce the required system specs?

What kind of GPU

Can you share what kind of GPU machine you are using, and how long it usually takes to train an instance-normalization based model?

Super Resolution?

Hi,

Thanks for releasing your code!
I would like to know if you're going to release your code for super-resolution x4 / x8.

Error when running make_style_dataset

I got this small error when trying to run the make_style_dataset script. I downloaded the COCO dataset and extracted it into the folder train, as I saw in another issue. Did I get any of the flags wrong or something?

[antonioa1@wh-520-9-9 fnsm]$ python scripts/make_style_dataset.py --train_dir train/train2014/ --val_dir train/val2014/ --output_file train/output.h5 --max_images -1 --num_workers 2
  File "scripts/make_style_dataset.py", line 60
    print filename
                 ^
SyntaxError: Missing parentheses in call to 'print'
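That SyntaxError is the Python 2 print statement on line 60 of the script being parsed by a Python 3 interpreter. Running the script under Python 2 (or converting its print statements to print(...) calls) should avoid it; for example, assuming a python2 binary is on your PATH:

python2 scripts/make_style_dataset.py \
  --train_dir train/train2014/ \
  --val_dir train/val2014/ \
  --output_file train/output.h5 \
  --max_images -1 \
  --num_workers 2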

Why is my model file size so big (551MB)?

When I use this command:
th train.lua -h5_file out/file.h5 -style_image out/starry_night.jpg -style_image_size 384 -content_weights 1.0 -style_weights 5.0 -checkpoint_name checkpoint -gpu 0
the resulting model file is 551MB. Is there anything wrong? @jcjohnson

/home/ubuntu/torch/install/bin/luajit: cannot open <data/cnns/vgg16.t7> in mode r at /home/ubuntu/torch/pkg/torch/lib/TH/THDiskFile.c:649

ubuntu@ip-Address:~/fast-neural-style$ th train.lua -h5_file /home/ubuntu/fast-neural-style/models/smallvgg2.t7 -style_image /home/ubuntu/fast-neural-style/images/styles/raime.jpg -style_image_size 384 -content_weights 1.0 -style_weights 5.0 -checkpoint_name checkpoint -gpu 0
nn.Sequential {
  [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> (12) -> (13) -> (14) -> (15) -> (16) -> (17) -> (18) -> (19) -> (20) -> (21) -> (22) -> (23) -> (24) -> output]
  (1): cudnn.SpatialConvolution(3 -> 32, 9x9, 1,1, 4,4)
  (2): nn.InstanceNormalization
  (3): cudnn.ReLU
  (4): cudnn.SpatialConvolution(32 -> 64, 3x3, 2,2, 1,1)
  (5): nn.InstanceNormalization
  (6): cudnn.ReLU
  (7): cudnn.SpatialConvolution(64 -> 128, 3x3, 2,2, 1,1)
  (8): nn.InstanceNormalization
  (9): cudnn.ReLU
  (10): nn.Sequential {
    [input -> (1) -> (2) -> output]
    (1): nn.ConcatTable {
      input
        |`-> (1): nn.Sequential {
        |      [input -> (1) -> (2) -> (3) -> (4) -> (5) -> output]
        |      (1): cudnn.SpatialConvolution(128 -> 128, 3x3)
        |      (2): nn.InstanceNormalization
        |      (3): cudnn.ReLU
        |      (4): cudnn.SpatialConvolution(128 -> 128, 3x3)
        |      (5): nn.InstanceNormalization
        |    }
         `-> (2): nn.ShaveImage
         ... -> output
    }
    (2): nn.CAddTable
  }
  (11): nn.Sequential {
    [input -> (1) -> (2) -> output]
    (1): nn.ConcatTable {
      input
        |`-> (1): nn.Sequential {
        |      [input -> (1) -> (2) -> (3) -> (4) -> (5) -> output]
        |      (1): cudnn.SpatialConvolution(128 -> 128, 3x3)
        |      (2): nn.InstanceNormalization
        |      (3): cudnn.ReLU
        |      (4): cudnn.SpatialConvolution(128 -> 128, 3x3)
        |      (5): nn.InstanceNormalization
        |    }
         `-> (2): nn.ShaveImage
         ... -> output
    }
    (2): nn.CAddTable
  }
  (12): nn.Sequential {
    [input -> (1) -> (2) -> output]
    (1): nn.ConcatTable {
      input
        |`-> (1): nn.Sequential {
        |      [input -> (1) -> (2) -> (3) -> (4) -> (5) -> output]
        |      (1): cudnn.SpatialConvolution(128 -> 128, 3x3)
        |      (2): nn.InstanceNormalization
        |      (3): cudnn.ReLU
        |      (4): cudnn.SpatialConvolution(128 -> 128, 3x3)
        |      (5): nn.InstanceNormalization
        |    }
         `-> (2): nn.ShaveImage
         ... -> output
    }
    (2): nn.CAddTable
  }
  (13): nn.Sequential {
    [input -> (1) -> (2) -> output]
    (1): nn.ConcatTable {
      input
        |`-> (1): nn.Sequential {
        |      [input -> (1) -> (2) -> (3) -> (4) -> (5) -> output]
        |      (1): cudnn.SpatialConvolution(128 -> 128, 3x3)
        |      (2): nn.InstanceNormalization
        |      (3): cudnn.ReLU
        |      (4): cudnn.SpatialConvolution(128 -> 128, 3x3)
        |      (5): nn.InstanceNormalization
        |    }
         `-> (2): nn.ShaveImage
         ... -> output
    }
    (2): nn.CAddTable
  }
  (14): nn.Sequential {
    [input -> (1) -> (2) -> output]
    (1): nn.ConcatTable {
      input
        |`-> (1): nn.Sequential {
        |      [input -> (1) -> (2) -> (3) -> (4) -> (5) -> output]
        |      (1): cudnn.SpatialConvolution(128 -> 128, 3x3)
        |      (2): nn.InstanceNormalization
        |      (3): cudnn.ReLU
        |      (4): cudnn.SpatialConvolution(128 -> 128, 3x3)
        |      (5): nn.InstanceNormalization
        |    }
         `-> (2): nn.ShaveImage
         ... -> output
    }
    (2): nn.CAddTable
  }
  (15): cudnn.SpatialFullConvolution(128 -> 64, 3x3, 2,2, 1,1, 1,1)
  (16): nn.InstanceNormalization
  (17): cudnn.ReLU
  (18): cudnn.SpatialFullConvolution(64 -> 32, 3x3, 2,2, 1,1, 1,1)
  (19): nn.InstanceNormalization
  (20): cudnn.ReLU
  (21): cudnn.SpatialConvolution(32 -> 3, 9x9, 1,1, 4,4)
  (22): cudnn.Tanh
  (23): nn.MulConstant
  (24): nn.TotalVariation
}
/home/ubuntu/torch/install/bin/luajit: cannot open <data/cnns/vgg16.t7> in mode r  at /home/ubuntu/torch/pkg/torch/lib/TH/THDiskFile.c:649
stack traceback:
        [C]: at 0x7fb3dfe037e0
        [C]: in function 'DiskFile'
        /home/ubuntu/torch/install/share/lua/5.1/torch/File.lua:405: in function 'load'
        train.lua:110: in function 'main'
        train.lua:320: in main chunk
        [C]: in function 'dofile'
        ...untu/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
        [C]: at 0x00405d50
ubuntu@ip-Address:~/fast-neural-style$

How to use simultaneous DeepDream and style transfer?

https://www.reddit.com/r/deepdream/comments/55ty0k/fastneuralstyle_feedforward_style_transfer/d8du3xe

ubuntu@ip-Address:~/fast-neural-style$ th slow_neural_style.lua -deepdream_layers inception_4b -image_size 1000 -content_image model7.jpg -style_image fW15Zc4.jpg -num_iterations 1500 -backend cuda  -use_cudnn 1 -optimizer adam
nn.Sequential {
  [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> (12) -> (13) -> (14) -> (15) -> (16) -> (17) -> (18) -> (19) -> (20) -> (21) -> (22) -> (23) -> (24) -> (25) -> (26) -> (27) -> (28) -> (29) -> (30) -> (31) -> (32) -> (33) -> (34) -> (35) -> (36) -> (37) -> (38) -> (39) -> (40) -> output]
  (1): nn.SpatialConvolution(3 -> 64, 3x3, 1,1, 1,1)
  (2): nn.ReLU
  (3): nn.SpatialConvolution(64 -> 64, 3x3, 1,1, 1,1)
  (4): nn.ReLU
  (5): nn.SpatialMaxPooling(2x2, 2,2)
  (6): nn.SpatialConvolution(64 -> 128, 3x3, 1,1, 1,1)
  (7): nn.ReLU
  (8): nn.SpatialConvolution(128 -> 128, 3x3, 1,1, 1,1)
  (9): nn.ReLU
  (10): nn.SpatialMaxPooling(2x2, 2,2)
  (11): nn.SpatialConvolution(128 -> 256, 3x3, 1,1, 1,1)
  (12): nn.ReLU
  (13): nn.SpatialConvolution(256 -> 256, 3x3, 1,1, 1,1)
  (14): nn.ReLU
  (15): nn.SpatialConvolution(256 -> 256, 3x3, 1,1, 1,1)
  (16): nn.ReLU
  (17): nn.SpatialMaxPooling(2x2, 2,2)
  (18): nn.SpatialConvolution(256 -> 512, 3x3, 1,1, 1,1)
  (19): nn.ReLU
  (20): nn.SpatialConvolution(512 -> 512, 3x3, 1,1, 1,1)
  (21): nn.ReLU
  (22): nn.SpatialConvolution(512 -> 512, 3x3, 1,1, 1,1)
  (23): nn.ReLU
  (24): nn.SpatialMaxPooling(2x2, 2,2)
  (25): nn.SpatialConvolution(512 -> 512, 3x3, 1,1, 1,1)
  (26): nn.ReLU
  (27): nn.SpatialConvolution(512 -> 512, 3x3, 1,1, 1,1)
  (28): nn.ReLU
  (29): nn.SpatialConvolution(512 -> 512, 3x3, 1,1, 1,1)
  (30): nn.ReLU
  (31): nn.SpatialMaxPooling(2x2, 2,2)
  (32): nn.View(-1)
  (33): nn.Linear(25088 -> 4096)
  (34): nn.ReLU
  (35): nn.Dropout(0.500000)
  (36): nn.Linear(4096 -> 4096)
  (37): nn.ReLU
  (38): nn.Dropout(0.500000)
  (39): nn.Linear(4096 -> 1000)
  (40): nn.SoftMax
}
/home/ubuntu/torch/install/bin/luajit: ./fast_neural_style/utils.lua:37: size mismatch between layers "inception_4b" and weights ""
stack traceback:
        [C]: in function 'error'
        ./fast_neural_style/utils.lua:37: in function 'parse_layers'
        slow_neural_style.lua:86: in function 'main'
        slow_neural_style.lua:172: in main chunk
        [C]: in function 'dofile'
        ...untu/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
        [C]: at 0x00405d50
ubuntu@ip-Address:~/fast-neural-style$

Any chance of OpenCL?

Hi!

I have an RX 480 with 8GB of RAM. I have successfully installed Hugh Perkins' cltorch and used the regular neural-style and VGG19 with it. Looking at fast_neural_style.lua, I see it uses CUDA or nothing... Any chance of having an OpenCL backend as well?
