
artistic-videos

This is the torch implementation for the paper "Artistic style transfer for videos", based on the neural-style code by Justin Johnson (https://github.com/jcjohnson/neural-style).

Our algorithm transfers the style from one image (for example, a painting) to a whole video sequence and generates consistent and stable stylized video sequences.

UPDATE: A much faster version which runs in under one second per frame is available at fast-artistic-videos, but it only works for precomputed style templates. This repository allows arbitrary styles, but needs several minutes per frame.

Example video: Artistic style transfer for videos

Contact

For issues or questions related to this implementation, please use the issue tracker. For everything else, including licensing issues, please email us. Our contact details can be found in our paper.

Setup

Tested with Ubuntu 14.04.

  • Install torch7, loadcaffe and the CUDA backend (otherwise you have to use CPU mode which is horribly slow) and download the VGG model, as described by jcjohnson: neural-style#setup. Optional: Install cuDNN. This requires registration as a developer with NVIDIA, but significantly reduces memory usage. For non-Nvidia GPUs you can also use the OpenCL backend.
  • To use the temporal consistency constraints, you need a utility which estimates the optical flow between two images. You can use DeepFlow, which we also used in our paper. In this case, just download both DeepFlow and DeepMatching (CPU version) from their website and place the static binaries (deepmatching-static and deepflow2-static) in the main directory of this repository. Then, the scripts included in this repository can be used to generate the optical flow for all frames as well as the certainty of the flow field; an example invocation is shown after this list. If you want to use a different optical flow algorithm, specify the path to your optical flow utility in the first line of makeOptFlow.sh; the flow files have to be created in the Middlebury file format.
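
For example, to compute the flow and certainty files for a frame sequence (a minimal sketch; the frame pattern and output folder shown here are illustrative, and you should check the usage notes at the top of makeOptFlow.sh for the exact argument list):

./makeOptFlow.sh ./video/frame_%04d.ppm ./video/flow_450x350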

Requirements

A fast GPU with a large amount of video memory is recommended to execute this script. Running in CPU mode is impractical due to the enormous running time.

For a resolution of 450x350, you will need at least a 4 GB GPU (around 3.5 GB memory usage). If you use cuDNN, a 2 GB GPU is sufficient (around 1.7 GB memory usage). Memory usage scales linearly with resolution, so if you experience an out-of-memory error, downscale the video.
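
For example, ffmpeg can downscale the video before you extract frames (the target width of 640 is illustrative; -2 keeps the aspect ratio with an even height):

ffmpeg -i video.mp4 -vf scale=640:-2 video_small.mp4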

Other ways to reduce the memory footprint are to use the ADAM optimizer instead of L-BFGS and/or to use the NIN ImageNet model instead of VGG-19. However, we didn't test our method with either of these, and you will likely get inferior results. A starting point is sketched below.
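
As a starting point for the NIN model, the following sketch adapts a suggestion from the issue tracker (see the issues below); the model and prototxt paths assume you downloaded the NIN model as described in the neural-style repository:

th artistic_video.lua ... -model_file models/nin_imagenet_conv.caffemodel -proto_file models/train_val.prototxt -content_layers relu0,relu3,relu7,relu12 -style_layers relu0,relu3,relu7,relu12 -content_weight 10 -style_weight 1000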

Simple style transfer

To perform style transfer with mostly the default parameters, execute stylizeVideo.sh <path_to_video> <path_to_style_image>. This script performs all the steps necessary to create a stylized version of the video. Note: you must have ffmpeg (or libav-tools for Ubuntu 14.10 and earlier) installed.
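
For example (both file names are placeholders):

./stylizeVideo.sh mymovie.mp4 vangogh_starry_night.jpg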

A more advanced version of this script, which computes optical flow in parallel to the video stylization for improved performance, can be found in NameRX's fork: NameRX/artistic-videos

FAQ

See here for a list of frequently asked questions.

Advanced Usage

Please read the script stylizeVideo.sh to see exactly which steps you have to perform in advance. Basically, you have to save the frames of the video as individual image files, and you have to compute the optical flow between all adjacent frames as well as the certainty of the flow field (both can be accomplished with makeOptFlow.sh), as sketched below.
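
A minimal sketch of these two preprocessing steps, assuming ffmpeg and the scripts from this repository (all paths are illustrative):

mkdir mymovie
ffmpeg -i mymovie.mp4 mymovie/frame_%04d.ppm
./makeOptFlow.sh ./mymovie/frame_%04d.ppm ./mymovie/flow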

There are two versions of this algorithm: a single-pass and a multi-pass version. The multi-pass version yields better results in the case of strong camera motion, but needs more iterations per frame.

Basic usage:

th artistic_video.lua <arguments> [-args <fileName>]
th artistic_video_multiPass.lua <arguments> [-args <fileName>]

Arguments can be given on the command line and/or written in a file with one argument per line. Specify the path to this file through the option -args. Arguments given on the command line override arguments written in this file.
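
For example, with a file video.args (a hypothetical name) containing one argument per line:

-style_image style.jpg
-content_pattern mymovie/frame_%04d.ppm
-num_images 0

the following call uses these arguments but overrides the GPU setting:

th artistic_video.lua -args video.args -gpu 0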

Basic arguments:

  • -style_image: The style image.
  • -content_pattern: A file path pattern for the individual frames of the video, for example frame_%04d.png.
  • -num_images: The number of frames. Set to 0 to process all available frames.
  • -start_number: The index of the first frame. Default: 1
  • -gpu: Zero-indexed ID of the GPU to use; for CPU mode set -gpu to -1.

Arguments for the single-pass algorithm (only present in artistic_video.lua)

  • -flow_pattern: A file path pattern for files that store the backward flow between the frames. The placeholder in square brackets refers to the frame position where the optical flow starts and the placeholder in braces refers to the frame index where the optical flow points to. For example flow_[%02d]_{%02d}.flo means the flow files are named flow_02_01.flo, flow_03_02.flo, etc. If you use the script included in this repository (makeOptFlow.sh), the filename pattern will be backward_[%d]_{%d}.flo.
  • -flowWeight_pattern: A file path pattern for the weights / certainty of the flow field. These files should be greyscale images where a white pixel indicates a high flow weight and a black pixel a low weight, respectively. Same format as above. If you use the script, the filename pattern will be reliable_[%d]_{%d}.pgm.
  • -flow_relative_indices: The indices for the long-term consistency constraint, as a comma-separated list. Indices should be relative to the current frame. For example, 1,2,4 means that frames i-1, i-2 and i-4, warped to the current frame at position i, are used as the consistency constraint. The default value is 1, which means short-term consistency only. If you use non-default values, you have to compute the corresponding long-term flow as well. An example single-pass invocation follows this list.
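
Putting these together, a single-pass run using the files produced by makeOptFlow.sh might look like this (all paths are illustrative):

th artistic_video.lua -style_image style.jpg -content_pattern mymovie/frame_%04d.ppm -flow_pattern mymovie/flow/backward_[%d]_{%d}.flo -flowWeight_pattern mymovie/flow/reliable_[%d]_{%d}.pgm -gpu 0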

Arguments for the multi-pass algorithm (only present in artistic_video_multiPass.lua)

  • -forwardFlow_pattern: A file path pattern for the forward flow. Same format as in -flow_pattern.
  • -backwardFlow_pattern: A file path pattern for the backward flow. Same format as above.
  • -forwardFlow_weight_pattern: A file path pattern for the weights / certainty of the forward flow. Same format as above.
  • -backwardFlow_weight_pattern: A file path pattern for the weights / certainty of the backward flow. Same format as above.
  • -num_passes: Number of passes. Default: 15.
  • -use_temporalLoss_after: Uses temporal consistency loss in given pass and afterwards. Default: 8.
  • -blendWeight: The blending factor of the previous stylized frame. The higher this value, the stronger the temporal consistency. The default value is 1, which means that the previous stylized frame is blended equally with the current frame. An example multi-pass invocation follows this list.
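
For example, a multi-pass run with the files produced by makeOptFlow.sh might be invoked as follows (paths are illustrative; since the placeholder order encodes the flow direction, the same reliable_[%d]_{%d}.pgm pattern can serve as both weight patterns):

th artistic_video_multiPass.lua -style_image style.jpg -content_pattern mymovie/frame_%04d.ppm -forwardFlow_pattern mymovie/flow/forward_[%d]_{%d}.flo -backwardFlow_pattern mymovie/flow/backward_[%d]_{%d}.flo -forwardFlow_weight_pattern mymovie/flow/reliable_[%d]_{%d}.pgm -backwardFlow_weight_pattern mymovie/flow/reliable_[%d]_{%d}.pgm -num_passes 15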

Optimization options:

  • -content_weight: How much to weight the content reconstruction term. Default is 5e0.
  • -style_weight: How much to weight the style reconstruction term. Default is 1e2.
  • -temporal_weight: How much to weight the temporal consistency loss. Default is 1e3. Set to 0 to disable the temporal consistency loss.
  • -temporal_loss_criterion: Which error function is used for the temporal consistency loss. Can be either mse for the mean squared error or smoothl1 for the smooth L1 criterion.
  • -tv_weight: Weight of total-variation (TV) regularization; this helps to smooth the image. Default is 1e-3. Set to 0 to disable TV regularization.
  • -num_iterations:
    • Single-pass: Two comma-separated values for the maximum number of iterations for the first frame and for subsequent frames. Default is 2000,1000.
    • Multi-pass: A single value for the number of iterations per pass.
  • -tol_loss_relative: Stop if the relative change of the loss function in an interval of tol_loss_relative_interval iterations falls below this threshold. Default is 0.0001 which means that the optimizer stops if the loss function changes less than 0.01% in the given interval. Meaningful values are between 0.001 and 0.0001 in the default interval.
  • -tol_loss_relative_interval: See above. Default value: 50.
  • -init:
    • Single-pass: Two comma-separated values for the initialization method for the first frame and for subsequent frames; one of random, image, prev or prevWarped. Default is random,prevWarped which uses a noise initialization for the first frame and the previous stylized frame warped for subsequent frames. image initializes with the content frames. prev initializes with the previous stylized frames without warping.
    • Multi-pass: One value for the initialization method. Either random or image.
  • -optimizer: The optimization algorithm to use; either lbfgs or adam; default is lbfgs. L-BFGS tends to give better results, but uses more memory. Switching to ADAM will reduce memory usage; when using ADAM you will probably need to play with other parameters to get good results, especially the style weight, content weight, and learning rate. You may also want to normalize gradients when using ADAM. A sketch is shown after this list.
  • -learning_rate: Learning rate to use with the ADAM optimizer. Default is 1e1.
  • -normalize_gradients: If this flag is present, style and content gradients from each layer will be L1 normalized. Idea from andersbll/neural_artistic_style.
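
A sketch of a lower-memory ADAM configuration (all flags are documented above; expect to tune -style_weight, -content_weight and -learning_rate away from their defaults to get good results):

th artistic_video.lua ... -optimizer adam -learning_rate 1e1 -normalize_gradients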

Output options:

  • -output_image: Name of the output image. Default is out.png which will produce output images of the form out-<frameIdx>.png for the single-pass and out-<frameIdx>_<passIdx>.png for the multi-pass algorithm.
  • -number_format: Which number format to use for the output image. For example, %04d adds up to three leading zeros. Some users reported that ffmpeg may use lexicographical sorting in some cases; without leading zeros, the output frames would then be combined in the wrong order (see the example after this list). Default: %d.
  • -output_folder: Directory where the output images should be saved. Must end with a slash.
  • -print_iter: Print progress every print_iter iterations. Set to 0 to disable printing.
  • -save_iter: Save the image every save_iter iterations. Set to 0 to disable saving intermediate results.
  • -save_init: If this option is present, the initialization image will be saved.
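
For example, to write zero-padded frames and then assemble them into a video with ffmpeg (the frame rate and file names are illustrative):

th artistic_video.lua ... -output_folder mymovie-out/ -number_format %04d
ffmpeg -framerate 30 -i mymovie-out/out-%04d.png stylized.mp4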

Other arguments:

  • -content_layers: Comma-separated list of layer names to use for content reconstruction. Default is relu4_2.
  • -style_layers: Comma-separated list of layer names to use for style reconstruction. Default is relu1_1,relu2_1,relu3_1,relu4_1,relu5_1.
  • -style_blend_weights: The weight for blending the style of multiple style images, as a comma-separated list, such as -style_blend_weights 3,7. By default, all style images are equally weighted.
  • -style_scale: Scale at which to extract features from the style image, relative to the size of the content video. Default is 1.0.
  • -proto_file: Path to the deploy.txt file for the VGG Caffe model.
  • -model_file: Path to the .caffemodel file for the VGG Caffe model. Default is the original VGG-19 model; you can also try the normalized VGG-19 model used in the paper.
  • -pooling: The type of pooling layers to use; one of max or avg. Default is max. The VGG-19 model uses max pooling layers, but Gatys et al. mentioned that replacing these layers with average pooling layers can improve the results. We haven't been able to get good results using average pooling, but the option is here.
  • -backend: nn, cudnn or clnn. Default is nn. cudnn requires cudnn.torch and may reduce memory usage. clnn requires cltorch and clnn.
  • -cudnn_autotune: When using the cuDNN backend, pass this flag to use the built-in cuDNN autotuner to select the best convolution algorithms for your architecture. This will make the first iteration a bit slower and can take a bit more memory, but may significantly speed up the cuDNN backend. An example is shown after this list.
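
For example, to run with the cuDNN backend and autotuning enabled:

th artistic_video.lua ... -backend cudnn -cudnn_autotune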

Acknowledgement

  • This work was inspired by the paper A Neural Algorithm of Artistic Style by Leon A. Gatys, Alexander S. Ecker, and Matthias Bethge, which introduced an approach for style transfer in still images.
  • Our implementation is based on Justin Johnson's implementation neural-style.

Citation

If you use this code or its parts in your research, please cite the following paper:

@inproceedings{RuderDB2016,
  author = {Manuel Ruder and Alexey Dosovitskiy and Thomas Brox},
  title = {Artistic Style Transfer for Videos},
  booktitle = {German Conference on Pattern Recognition},
  pages     = {26--36},
  year      = {2016},
}

artistic-videos's Issues

Gray Screen or Garbled Mess is rendered out

I have everything working correctly (I think), but when it comes time to render out the final PNG frames, the first several frames are gray and the rest are a garbled mess of pixels and bright colors, like a huge graphical glitch. I'm not at my computer right now, so I can post a picture later, but just know that it looks... wrong.

I'm using the same video file; I checked the .ppm files it made and they look fine. I've tried different artistic jpg files but I get the same result as above.

I'm on Ubuntu 16.04 with a GTX 1070 8GB and have CUDA 8 installed, using cudnn. I'm only running the basic script: stylizeVideo.sh <MyVideo.mov> <some_painting.jpg>

Each frame is taking about 30 seconds to process. They look wrong (obviously), but I'm wondering if that's an appropriate render time, too.

I'm wondering if there's a sweet spot for the type of artistic image and/or video? Current video (well, ppm sequence) is 960x540. Let me know if I need to provide further info or if anyone has any suggestions. Thanks!

Using the -content_pattern flag

I am using this flag after th artistic_video_multiPass.lua to point to a directory containing multiple .jpgs (the frames of my video).

I get an error of "detected 0 content images", so clearly I am using this input flag incorrectly. Can anyone provide an example of how the input image directory should be properly specified via -content_pattern on the command line for artistic_video_multiPass?

style segmentation

Hi, I'm wondering if there is an easy way to employ image segmentation with this technique: basically, where the foreground gets processed with one style and the background gets processed with another.

Has anyone tried this?

Speed up processing using half resolution optical flow?

Hello!

Is there a way to use an optical flow generated at half resolution to perform a style transfer at full resolution? Or maybe there is a way to upscale optical flow passes to full resolution afterwards?

For example, the optical flow calculation for one frame at 1280x720 takes ~3 minutes on my PC, but at 640x360 only ~1 minute.

The reason I thought about this optimization is that in video post-production software, motion vectors can be calculated at half resolution, and usually there is no big visible difference between using full-res and half-res optical flow.

Thank you!

Distribution in a compute cluster

Hello, and thanks for this inspiring work! I wonder if it's possible to split/distribute the work between a number of cluster nodes. For the frame-to-frame optical flow estimation this seems to be no problem; however, the styling algorithm takes the predecessor result frame into account, as far as I can tell. Have you thought about this, and do you see a way to split the computation while maintaining global temporal consistency, maybe with some amount of communication between the nodes at runtime?

terminate called after throwing an instance of 'EFilterIncompatibleSize' makeOptFlow.sh: line 56: 2586 Aborted (core dumped)


Welcome to Ubuntu 14.04.4 LTS (GNU/Linux 3.13.0-79-generic x86_64)

ubuntu@ip-Address:~$ cd artistic-videos
ubuntu@ip-Address:~/artistic-videos$ ./stylizeVideo.sh /home/ubuntu/artistic-videos/girl.mp4 /home/ubuntu/artistic-videos/vgsn_larger.jpg

Which backend do you want to use? For Nvidia GPU, use cudnn if available, otherwise nn. For non-Nvidia GPU, use clnn. Note: You have to have the given backend installed in order to use it. [nn]
 > cudnn

This algorithm needs a lot of memory. For a resolution of 450x350 you'll need roughly 2GB VRAM. VRAM usage increases linear with resolution. Please enter a resolution at which the video should be processed, in the format w:h, or leave blank to use the original resolution
 > 450x350
ffmpeg version N-80026-g936751b Copyright (c) 2000-2016 the FFmpeg developers
  built with gcc 4.8 (Ubuntu 4.8.4-2ubuntu1~14.04.3)
  configuration: --extra-libs=-ldl --prefix=/opt/ffmpeg --mandir=/usr/share/man --enable-avresample --disable-debug --enable-nonfree --enable-gpl --enable-version3 --enable-libopencore-amrnb --enable-libopencore-amrwb --disable-decoder=amrnb --disable-decoder=amrwb --enable-libpulse --enable-libfreetype --enable-gnutls --enable-libx264 --enable-libx265 --enable-libfdk-aac --enable-libvorbis --enable-libmp3lame --enable-libopus --enable-libvpx --enable-libspeex --enable-libass --enable-avisynth --enable-libsoxr --enable-libxvid --enable-libvidstab
  libavutil      55. 24.100 / 55. 24.100
  libavcodec     57. 42.100 / 57. 42.100
  libavformat    57. 36.100 / 57. 36.100
  libavdevice    57.  0.101 / 57.  0.101
  libavfilter     6. 45.100 /  6. 45.100
  libavresample   3.  0.  0 /  3.  0.  0
  libswscale      4.  1.100 /  4.  1.100
  libswresample   2.  0.101 /  2.  0.101
  libpostproc    54.  0.100 / 54.  0.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from '/home/ubuntu/artistic-videos/girl.mp4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf57.36.100
  Duration: 00:00:14.02, start: 0.000000, bitrate: 833 kb/s
    Stream #0:0(und): Video: h264 (Constrained Baseline) (avc1 / 0x31637661), yuv420p, 640x640 [SAR 1:1 DAR 1:1], 831 kb/s, 29.97 fps, 29.97 tbr, 11988 tbn (default)
    Metadata:
      handler_name    : VideoHandler
[image2 @ 0x268d220] Using AVStream.codec to pass codec parameters to muxers is deprecated, use AVStream.codecpar instead.
Output #0, image2, to 'girl/frame_%04d.ppm':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2avc1mp41
    encoder         : Lavf57.36.100
    Stream #0:0(und): Video: ppm, rgb24, 450x350 [SAR 7:9 DAR 1:1], q=2-31, 200 kb/s, 29.97 fps, 29.97 tbn (default)
    Metadata:
      handler_name    : VideoHandler
      encoder         : Lavc57.42.100 ppm
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> ppm (native))
Press [q] to stop, [?] for help
frame=  420 fps=318 q=-0.0 Lsize=N/A time=00:00:14.01 bitrate=N/A speed=10.6x
video:193805kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown

How much do you want to weight the style reconstruction term? Default value: 1e2 for a resolution of 450x350. Increase for a higher resolution. [1e2]
 > 0.5

Enter the zero-indexed ID of the GPU to use, or -1 for CPU mode (very slow!). [0]
 > 0
terminate called after throwing an instance of 'EFilterIncompatibleSize'
makeOptFlow.sh: line 56:  2584 Aborted                 (core dumped) ./consistencyChecker/consistencyChecker "${folderName}/backward_${j}_${i}.flo" "${folderName}/forward_${i}_${j}.flo" "${folderName}/reliable_${j}_${i}.pgm"
Could not open ./girl/flow_640x640/forward_128_129.flo
Could not open ./girl/flow_640x640/backward_129_128.flo
Exception EFilterIncompatibleSize: Initial container size: 0  Resulting container size: 0
terminate called after throwing an instance of 'EFilterIncompatibleSize'
makeOptFlow.sh: line 56:  2586 Aborted                 (core dumped) ./consistencyChecker/consistencyChecker "${folderName}/forward_${i}_${j}.flo" "${folderName}/backward_${j}_${i}.flo" "${folderName}/reliable_${i}_${j}.pgm"
Place deepflow2-static and deepmatching-static in this directory.
Place deepflow2-static and deepmatching-static in this directory.
Could not open ./girl/flow_640x640/backward_130_129.flo
Could not open ./girl/flow_640x640/forward_129_130.flo
Exception EFilterIncompatibleSize: Initial container size: 0  Resulting container size: 0
terminate called after throwing an instance of 'EFilterIncompatibleSize'
makeOptFlow.sh: line 56:  2592 Aborted                 (core dumped) ./consistencyChecker/consistencyChecker "${folderName}/backward_${j}_${i}.flo" "${folderName}/forward_${i}_${j}.flo" "${folderName}/reliable_${j}_${i}.pgm"
Could not open ./girl/flow_640x640/forward_129_130.flo
Could not open ./girl/flow_640x640/backward_130_129.flo
Exception EFilterIncompatibleSize: Initial container size: 0  Resulting container size: 0
terminate called after throwing an instance of 'EFilterIncompatibleSize'
makeOptFlow.sh: line 56:  2594 Aborted                 (core dumped) ./consistencyChecker/consistencyChecker "${folderName}/forward_${i}_${j}.flo" "${folderName}/backward_${j}_${i}.flo" "${folderName}/reliable_${i}_${j}.pgm"
Place deepflow2-static and deepmatching-static in this directory.
Place deepflow2-static and deepmatching-static in this directory.
Could not open ./girl/flow_640x640/backward_131_130.flo
Could not open ./girl/flow_640x640/forward_130_131.flo
Exception EFilterIncompatibleSize: Initial container size: 0  Resulting container size: 0
terminate called after throwing an instance of 'EFilterIncompatibleSize'
makeOptFlow.sh: line 56:  2600 Aborted                 (core dumped) ./consistencyChecker/consistencyChecker "${folderName}/backward_${j}_${i}.flo" "${folderName}/forward_${i}_${j}.flo" "${folderName}/reliable_${j}_${i}.pgm"
Could not open ./girl/flow_640x640/forward_130_131.flo
Could not open ./girl/flow_640x640/backward_131_130.flo
Exception EFilterIncompatibleSize: Initial container size: 0  Resulting container size: 0
terminate called after throwing an instance of 'EFilterIncompatibleSize'
makeOptFlow.sh: line 56:  2602 Aborted                 (core dumped) ./consistencyChecker/consistencyChecker "${folderName}/forward_${i}_${j}.flo" "${folderName}/backward_${j}_${i}.flo" "${folderName}/reliable_${i}_${j}.pgm"
Place deepflow2-static and deepmatching-static in this directory.
Place deepflow2-static and deepmatching-static in this directory.
Could not open ./girl/flow_640x640/backward_132_131.flo
Could not open ./girl/flow_640x640/forward_131_132.flo
Exception EFilterIncompatibleSize: Initial container size: 0  Resulting container size: 0
terminate called after throwing an instance of 'EFilterIncompatibleSize'
makeOptFlow.sh: line 56:  2608 Aborted                 (core dumped) ./consistencyChecker/consistencyChecker "${folderName}/backward_${j}_${i}.flo" "${folderName}/forward_${i}_${j}.flo" "${folderName}/reliable_${j}_${i}.pgm"
Could not open ./girl/flow_640x640/forward_131_132.flo
Could not open ./girl/flow_640x640/backward_132_131.flo
Exception EFilterIncompatibleSize: Initial container size: 0  Resulting container size: 0
terminate called after throwing an instance of 'EFilterIncompatibleSize'
makeOptFlow.sh: line 56:  2610 Aborted                 (core dumped) ./consistencyChecker/consistencyChecker "${folderName}/forward_${i}_${j}.flo" "${folderName}/backward_${j}_${i}.flo" "${folderName}/reliable_${i}_${j}.pgm"
Place deepflow2-static and deepmatching-static in this directory.
Place deepflow2-static and deepmatching-static in this directory.
Could not open ./girl/flow_640x640/backward_133_132.flo
Could not open ./girl/flow_640x640/forward_132_133.flo
Exception EFilterIncompatibleSize: Initial container size: 0  Resulting container size: 0
terminate called after throwing an instance of 'EFilterIncompatibleSize'

Question from a novice

I get as far as generating the first image using the included stylizeVideo.sh; however, when that image completes I see this error:
/usr/local/bin/luajit: /usr/local/share/lua/5.1/image/init.lua:346: video_6/frame_0002.ppm: No such file or directory
stack traceback:
[C]: in function 'error'
/usr/local/share/lua/5.1/image/init.lua:346: in function 'load'
./artistic_video_core.lua:557: in function 'getContentImage'
artistic_video.lua:131: in function 'main'
artistic_video.lua:340: in main chunk
[C]: in function 'dofile'
/usr/local/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x010d746c40
admins-Mac-Pro:artistic-videos-master 2 sampsen$

I can see that frame_0001.ppm was generated but not frame_0002.ppm. I also have an empty folder for flow_400x300 and a single out-1.png file generated.

Does this indicate an issue with my deepflow utility? I am daring to try this on OSX 10.10.5...

Other ideas are greatly appreciated. Thanks

Interview request from the techtag web magazine

Hello Manuel,

we at the techtag team came across your brilliant idea of making videos look like paintings through a video on tagesschau.de. Since our web magazine techtag ( www.techtag.de ) reports on news in IT, digitalization and net culture, the program you developed is of great interest to us. We would be happy to publish an article about you and your program.
Are you interested in taking part? If so, you can send me an e-mail at [email protected] :)

Best regards
Diana Gedeon

DeepMatching dependency - GPU version

I've been able to use the deepmatching-static binary.

However, I'm having a heck of a time trying to compile the GPU version of DeepMatching.

Has anybody been able to successfully compile this on Ubuntu? It seems like the code from 2015 doesn't match current installs of caffe, and the documentation on the Google tools is thin as to what's expected.

If anybody has had any luck getting this compiled, please let me know. Specifically, I'm having a hard time understanding which Google tools are required and where they are expected to be installed on Ubuntu.

write_png_file png_create_write_struct failed

Hi, I am trying to run the artistic-videos code on my Mac.

After the complete compilation I get the following error:

Iteration 1800 / 2000

Content 1 loss: 3857077.500000
Style 1 loss: 4100.569916
Style 2 loss: 93125.927734
Style 3 loss: 116407.360840
Style 4 loss: 786244.873047
Style 5 loss: 8864.884949
Total loss: 4865821.116486
Iteration 1900 / 2000
Content 1 loss: 3846950.937500
Style 1 loss: 3968.808746
Style 2 loss: 90534.338379
Style 3 loss: 114340.576172
Style 4 loss: 778716.308594
Style 5 loss: 8792.427063
Total loss: 4843303.396454
Iteration 2000 / 2000
Content 1 loss: 3836125.000000
Style 1 loss: 3857.199097

Style 2 loss: 88356.286621
Style 3 loss: 112853.234863
Style 4 loss: 773070.703125
Style 5 loss: 8708.604431
Total loss: 4822971.028137
libpng warning: Application built with libpng-1.5.27 but running with 1.6.29
/Users/research/torch/install/bin/luajit: /Users/research/torch/install/share/lua/5.1/image/init.lua:177: [write_png_file] png_create_write_struct failed
stack traceback:
[C]: in function 'save'
/Users/research/torch/install/share/lua/5.1/image/init.lua:177: in function 'saver'
/Users/research/torch/install/share/lua/5.1/image/init.lua:457: in function 'save'
./artistic_video_core.lua:497: in function 'save_image'
./artistic_video_core.lua:80: in function 'maybe_save'
./artistic_video_core.lua:107: in function 'opfunc'
./lbfgs.lua:214: in function 'optimize'
./artistic_video_core.lua:120: in function 'runOptimization'
artistic_video.lua:262: in function 'main'
artistic_video.lua:359: in main chunk
[C]: in function 'dofile'
...arch/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x0101cfe350
bartgs48:artistic-videos research$

I have reinstalled libjpeg, checked the versions of libpng, uninstalled and reinstalled luarocks and the luarocks image package, and tried to change the code to make the output command use png in the stylizeVideo.sh code.

But none of the above is working, and I can't figure out what the problem is.

Background: I am new to using C++.

Interrupt without any information

Hi! I have encountered a problem when I run the code you offered. It interrupts without any error information during style transfer. I have tried several times, but I encounter the same problem every time. I don't know why; can you give me any suggestion?

Trying to build for osx

I'm on OS X El Capitan and I'm trying to build deepflow2. I simply type 'make'. However, I get this error:

gcc -o deepflow2.o -Wall -g -O3 -msse4 -fPIC -c deepflow2.c
gcc -o image.o -Wall -g -O3 -msse4 -fPIC -c image.c
image.c:20:10: fatal error: 'malloc.h' file not found
#include <malloc.h>
         ^
1 error generated.
make: *** [image.o] Error 1

Out of memory. use Adam? downsample?

  1. I want to transfer higher-resolution videos like 1K, 2K, even 4K; however, my server's GPU (NV GTX 1070) has 8GB of VRAM. With cudnn, the maximum frame size I can take from the video is 1080x720. A similar problem was mentioned in #1.
  2. jcjohnson said in Memory Usage that we can use Adam instead of L-BFGS. How can I use this method here?
  3. Also, in OpenCL usage with the NIN Model:
th neural_style.lua -style_image examples/inputs/picasso_selfport1907.jpg -content_image examples/inputs/brad_pitt.jpg -output_image profile.png -model_file models/nin_imagenet_conv.caffemodel -proto_file models/train_val.prototxt -gpu 0 -backend clnn -num_iterations 1000 -seed 123 -content_layers relu0,relu3,relu7,relu12 -style_layers relu0 ,relu3,relu7,relu12 -content_weight 10 -style_weight 1000 -image_size 512 -optimizer adam

And in #1, @manuelruder suggested:

th artistic_video.lua ... -model_file models/nin_imagenet_conv.caffemodel -proto_file models/train_val.prototxt -content_layers relu0,relu3,relu7,relu12 -style_layers relu0,relu3,relu7,relu12 -content_weight 10 -style_weight 1000 ...

The methods above might help.
4. How do I downsample the video to reduce memory usage?
5. I think the effectiveness of the above methods is limited. How else can I reduce memory usage?

Grey glitch on several frames

Hi,

I've been using your lib for a while now and I can't solve this issue.
When there's a big movement, there's a grey frame surrounding part of the rendered image.
I've played with every argument of the deepflow command with no result.
I guess this is more relevant to the deepflow discussion group, but I was wondering if anyone has a clue what's going on here. An example: http://i.imgur.com/C5tBiew.jpg

My image resolution is 1280 × 720

approximating a fast neural style

Is there a way to approximate the effect so it can be done in real time? For example, you are familiar with the pixelization effect used to obscure the nasty private bits in video streams? Instead of a color-averaged large block, one might use a 'neural style' thumbnail whose color average matches the calculated average of an 8×8 pixel sample in the image. The neural style thumbnail to which I refer is something that can be precalculated ahead of time from Deep Dream (https://en.wikipedia.org/wiki/DeepDream). So instead of getting a solid blob of color, you get a neural pattern, in a video.

transfer the style but not the color

Just wondering if there's a way to transfer just the style but not the color, like the -original_colors flag from jcjohnson's original code?

Thanks a lot!

license

Hello,
I am working as a video designer in the theater and would love to use 2-4 minutes of artistic-videos in a 90-minute play. Of course I would cite your amazing work in every way possible and not take credit for it. It would just fit so well in the context of the play. Do you see any chance that this would be possible?

"size mismatch," on example and some images in CPU only mode

When I tested some images, I got this error:

 [libprotobuf WARNING google/protobuf/io/coded_stream.cc:505] Reading dangerously large protocol message.  If the message turns out to be larger than 1073741824 bytes, parsing will be halted for security reasons.  To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 574671192
Successfully loaded models/VGG_ILSVRC_19_layers.caffemodel
conv1_1: 64 3 3 3
conv1_2: 64 64 3 3
conv2_1: 128 64 3 3
conv2_2: 128 128 3 3
conv3_1: 256 128 3 3
conv3_2: 256 256 3 3
conv3_3: 256 256 3 3
conv3_4: 256 256 3 3
conv4_1: 512 256 3 3
conv4_2: 512 512 3 3
conv4_3: 512 512 3 3
conv4_4: 512 512 3 3
conv5_1: 512 512 3 3
conv5_2: 512 512 3 3
conv5_3: 512 512 3 3
conv5_4: 512 512 3 3
fc6: 1 1 25088 4096
fc7: 1 1 4096 4096
fc8: 1 1 4096 1000
Style image size: 450 x 350 
Setting up temporal consistency.    
Setting up temporal layer   1   :   nil 
Setting up temporal layer   2   :   nil 
(... the same line repeats for temporal layers 3 through 45 ...)
Setting up temporal layer   46  :   nil 
/usr/local/bin/luajit: /usr/local/share/lua/5.1/nn/Linear.lua:39: size mismatch, [4096 x 25088], [84480] at /tmp/luarocks_torch-scm-1-9520/torch7/lib/TH/generic/THTensorMath.c:707
stack traceback:
    [C]: in function 'addmv'
    /usr/local/share/lua/5.1/nn/Linear.lua:39: in function 'updateOutput'
    /usr/local/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
    ./artistic_video_core.lua:17: in function 'runOptimization'
    artistic_video.lua:259: in function 'main'
    artistic_video.lua:355: in main chunk
    [C]: in function 'dofile'
    /usr/local/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x00406260

I thought some of the command's arguments were wrong, so I tried without any arguments (i.e., the default settings).
The same error occurred.
I found the same error reported for neural-style (jcjohnson/neural-style#19), but that error came from using 'content_layers' instead of 'content_image'.

Here are the commands from my first and second attempts.
first,
th artistic_video.lua -style_image style.jpg -content_pattern image/illusion%03d.jpg -num_images 40 -start_number 0 -output_folder output/ -number_format %03d -gpu -1
second,
th artistic_video.lua -gpu -1

cf) I also tried artistic_video_multiPass.lua, and nothing really changed.

Multi-GPU update?

I am wondering if it would be an easy fix to incorporate the new multi-GPU functionality jcjohnson has built into neural-style into your video script? It would be nice to have the option of using both GPUs to create larger images.

Simple style transfer Processing Times

Hi, a 7-frame video at a resolution of 450x350 takes about 10 minutes to fully process with "Simple style transfer" to create the final video; the same 7 frames at 900x700 resolution take 44 minutes.
Do you know if I'm using CUDA with these times? If not, how can I activate it for Simple style transfer?

can not open */backward.flo file

Hi,
Somewhere during iteration, I get the following error. It throws the same error with both the cudnn and nn backends. I am using avconv.

$ sh stylizeVideo.sh <video_file> <style_file>
...
...
Iteration 1651 / 2000
  Content 1 loss: 1067995.625000
  Style 1 loss: 323.281898
  Style 2 loss: 7382.588501
  Style 3 loss: 13015.654297
  Style 4 loss: 354105.585938
  Style 5 loss: 754.768982
  Total loss: 1443577.504616
Reading flow file "video/flow_200:200/backward_2_1.flo".
/home/ec2-user/torch/install/bin/luajit: cannot open <video/flow_200:200/backward_2_1.flo> in mode r  at /home/ec2-user/torch/pkg/torch/lib/TH/THDiskFile.c:649
stack traceback:
    [C]: at 0x7fb06715f720
    [C]: in function 'DiskFile'
    ./flowFileLoader.lua:15: in function 'load'
    artistic_video.lua:177: in function 'main'
    artistic_video.lua:355: in main chunk
    [C]: in function 'dofile'
    ...user/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
    [C]: at 0x004064a0

Time to process in multipass

I have a 10-second clip with 285 frames. It looks like it is taking 85 seconds per frame in the first five passes. At that rate, it will take a little over 4 days to render 15 passes. I was wondering if it will speed up in later passes as the total losses start to decrease?

OptFlow performance question

First of all, thanks for putting it all together. Really newbie question here: is it normal that the OptFlow process takes about 3 minutes for the backward/forward/reliable set for one video frame? (i7-2600k)

deepmatching-static segfaults on ubuntu 16.04

Hello,

I migrated my setup to a fresh machine and am running into a strange issue with deepmatching segfaulting shortly after starting (10 minutes?).

Ubuntu 16.04
Linux mlart 4.4.0-59-generic #80-Ubuntu SMP Fri Jan 6 17:47:47 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
GTX Titan X (Maxwell)
CPU: Intel(R) Core(TM) i7-5775C CPU @ 3.30GHz

I installed systemd-coredump, and it spins up a bunch of processes as soon as deepmatching-static starts. Here's an example coredump. I'm not very familiar with debugging these, so sorry for the noob question here.

$ coredumpctl info 8618
PID: 8618 (deepmatching-st)
UID: 0 (root)
GID: 0 (root)
Signal: 11 (SEGV)
Timestamp: Tue 2017-01-17 12:16:17 PST (30min ago)
Command Line: ./deepmatching-static /content/vid/example/frames/frame_0004.ppm /content/vid/example/frames/frame_0005.ppm -nt 0
Executable: /root/artistic-videos/deepmatching-static
Control Group: /docker/922e2593770bb0190376fdae5482d60ff373a0aa32565f3d84e33b482e1a0c3b
Slice: -.slice
Boot ID: 156c400d834845b2a36f1e00da8dcf63
Machine ID: 57adaac235924da8aeaa7dd1b3a478e9
Hostname: mlart
Coredump: /var/lib/systemd/coredump/core.deepmatching-st.0.156c400d834845b2a36f1e00da8dcf63.8618.1484684177000000000000.xz
Message: Process 8618 (deepmatching-st) of user 0 dumped core.

            Stack trace of thread 329:
            #0  0x0000000000457bb0 n/a (/var/lib/docker/aufs/diff/f9fada9ab7f83d8797c8b483087cce5fdfb9f61eabe3e9a8f5fa4649103e7989/root/artistic-videos/deepmatching-static)

I'm running this with nvidia-docker, which was working great on my previous machine. I thought I'd ping this repo about possible solutions before I go ordering more RAM and start swapping hardware.

No video created, only ppm files

Hello,
I was able to install the script, and I am surely asking a stupid question, but after running stylizeVideo.sh I only get ppm files without the style transfer applied. I wonder if I'm missing an extra step to get the style-transferred video?

prevWarped Optical flow Grey

I am running into issues when using prevWarped with optical flow: when the camera pans, grey areas show up. Do I add keyframes to -flow_relative_indices to fix this? Should I leave the indices at 1? Right now I have -flow_relative_indices 1,12,32 because I didn't fully understand what they meant (I just reread the readme). If I read correctly, each input number (representing each frame you want) would be the image that prevWarped would switch to at that frame and after.

Does anybody have any tips they picked up from their experience with using prevWarped and/or optical flow?


run-deepflow.sh: line 10: ./deepmatching-static: Permission denied

Hi Manuel! First of all, I would like to express my admiration for you and this wonderful application. I come from Spain and studied for one year as an Erasmus student in Karlsruhe, so it is pretty cool to see that all these new utilities are being developed so near to where I lived! I have also seen the paper from "Leon" from Tübingen.

So, I successfully ran the jcjohnson program in Ubuntu. I am using my university's computing resources to run the code much faster. However, when trying your approach for videos, I get this error and I am not very sure how to solve it. I am not a computer science guy, so my knowledge of Linux is pretty much what I discovered through trial and error while getting the jcjohnson code working. I am not sure if I understood the installation process correctly. First, I clicked the "clone and download" button on GitHub (the guys from my university told me not to use sudo commands, and the git clone steps for jcjohnson used them, so I didn't want to risk it). I uncompressed it and placed the folder in my work directory. Then I also uncompressed the DeepFlow and DeepMatching archives and placed those folders in the artistic-videos folder. As that didn't work, I thought that maybe you meant copying the deepflow2-static and deepmatching-static files into the artistic-videos folder rather than leaving them in their own folders, but that didn't work either.

I hope you can help me.
Thank you very much!

[antonioa1@wh-520-9-5 avm]$ bash stylizeVideo.sh inputs/f.mov model/1.jpg

Which backend do you want to use? For Nvidia GPU, use cudnn if available, otherwise nn. For non-Nvidia GPU, use clnn. Note: You have to have the given backend installed in order to use it. [nn] 
 > cudm^[[D^[[3~^[^[^[^[^[^[^[[D^[[D^[[Ccudnn
Unknown backend.
[antonioa1@wh-520-9-5 avm]$ cudnn
bash: cudnn: command not found
[antonioa1@wh-520-9-5 avm]$ bash stylizeVideo.sh inputs/f.mov model/1.jpg

Which backend do you want to use? For Nvidia GPU, use cudnn if available, otherwise nn. For non-Nvidia GPU, use clnn. Note: You have to have the given backend installed in order to use it. [nn] 
 > cudnn

This algorithm needs a lot of memory. For a resolution of 450x350 you'll need roughly 2GB VRAM. VRAM usage increases linear with resolution. Please enter a resolution at which the video should be processed, in the format w:h, or leave blank to use the original resolution
 > 450x350
ffmpeg version 2.8.5 Copyright (c) 2000-2016 the FFmpeg developers
  built with gcc 4.4.7 (GCC) 20120313 (Red Hat 4.4.7-4)
  configuration: --prefix=/apps/ffmpeg/2.8.5 --enable-gpl --enable-postproc --enable-version3 --enable-x11grab --enable-shared
  libavutil      54. 31.100 / 54. 31.100
  libavcodec     56. 60.100 / 56. 60.100
  libavformat    56. 40.101 / 56. 40.101
  libavdevice    56.  4.100 / 56.  4.100
  libavfilter     5. 40.101 /  5. 40.101
  libswscale      3.  1.101 /  3.  1.101
  libswresample   1.  2.101 /  1.  2.101
  libpostproc    53.  3.100 / 53.  3.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'inputs/f.mov':
  Metadata:
    major_brand     : qt  
    minor_version   : 0
    compatible_brands: qt  
    creation_time   : 2016-09-18 15:13:11
  Duration: 00:00:11.67, start: 0.000000, bitrate: 17509 kb/s
    Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709), 1920x1080, 17411 kb/s, 29.98 fps, 29.97 tbr, 600 tbn, 1200 tbc (default)
    Metadata:
      rotate          : 180
      creation_time   : 2016-09-18 15:13:11
      handler_name    : Core Media Data Handler
      encoder         : H.264
    Side data:
      displaymatrix: rotation of -180.00 degrees
    Stream #0:1(und): Audio: aac (LC) (mp4a / 0x6134706D), 44100 Hz, mono, fltp, 90 kb/s (default)
    Metadata:
      creation_time   : 2016-09-18 15:13:11
      handler_name    : Core Media Data Handler
    Stream #0:2(und): Data: none (mebx / 0x7862656D), 0 kb/s (default)
    Metadata:
      creation_time   : 2016-09-18 15:13:11
      handler_name    : Core Media Data Handler
    Stream #0:3(und): Data: none (mebx / 0x7862656D), 0 kb/s (default)
    Metadata:
      creation_time   : 2016-09-18 15:13:11
      handler_name    : Core Media Data Handler
Output #0, image2, to 'f/frame_%04d.ppm':
  Metadata:
    major_brand     : qt  
    minor_version   : 0
    compatible_brands: qt  
    encoder         : Lavf56.40.101
    Stream #0:0(und): Video: ppm, rgb24, 450x350, q=2-31, 200 kb/s, 29.97 fps, 29.97 tbn, 29.97 tbc (default)
    Metadata:
      handler_name    : Core Media Data Handler
      creation_time   : 2016-09-18 15:13:11
      encoder         : Lavc56.60.100 ppm
Stream mapping:
  Stream #0:0 -> #0:0 (h264 (native) -> ppm (native))
Press [q] to stop, [?] for help
frame=  350 fps= 10 q=-0.0 Lsize=N/A time=00:00:11.67 bitrate=N/A    
video:161504kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown

How much do you want to weight the style reconstruction term? Default value: 1e2 for a resolution of 450x350. Increase for a higher resolution. [1e2] 
 > 1e2

Enter the zero-indexed ID of the GPU to use, or -1 for CPU mode (very slow!). [0] 
 > 0

Computing optical flow. This may take a while...
run-deepflow.sh: line 10: ./deepmatching-static: Permission denied
run-deepflow.sh: line 10: ./deepflow2-static: Permission denied
run-deepflow.sh: line 10: ./deepmatching-static: Permission denied
run-deepflow.sh: line 10: ./deepflow2-static: Permission denied
Could not open ./f/flow_450x350/backward_2_1.flo
Could not open ./f/flow_450x350/forward_1_2.flo
Exception EFilterIncompatibleSize: Initial container size: 0  Resulting container size: 0
terminate called after throwing an instance of 'EFilterIncompatibleSize'
makeOptFlow.sh: line 56: 31032 Aborted                 ./consistencyChecker/consistencyChecker "${folderName}/backward_${j}_${i}.flo" "${folderName}/forward_${i}_${j}.flo" "${folderName}/reliable_${j}_${i}.pgm"
Could not open ./f/flow_450x350/forward_1_2.flo
Could not open ./f/flow_450x350/backward_2_1.flo
Exception EFilterIncompatibleSize: Initial container size: 0  Resulting container size: 0
terminate called after throwing an instance of 'EFilterIncompatibleSize'
makeOptFlow.sh: line 56: 31033 Aborted                 ./consistencyChecker/consistencyChecker "${folderName}/forward_${i}_${j}.flo" "${folderName}/backward_${j}_${i}.flo" "${folderName}/reliable_${i}_${j}.pgm"
run-deepflow.sh: line 10: ./deepmatching-static: Permission denied
run-deepflow.sh: line 10: ./deepflow2-static: Permission denied
run-deepflow.sh: line 10: ./deepmatching-static: Permission denied
run-deepflow.sh: line 10: ./deepflow2-static: Permission denied
Could not open ./f/flow_450x350/backward_3_2.flo
Could not open ./f/flow_450x350/forward_2_3.flo
Exception EFilterIncompatibleSize: Initial container size: 0  Resulting container size: 0
terminate called after throwing an instance of 'EFilterIncompatibleSize'
makeOptFlow.sh: line 56: 31042 Aborted                 ./consistencyChecker/consistencyChecker "${folderName}/backward_${j}_${i}.flo" "${folderName}/forward_${i}_${j}.flo" "${folderName}/reliable_${j}_${i}.pgm"
Could not open ./f/flow_450x350/forward_2_3.flo
Could not open ./f/flow_450x350/backward_3_2.flo
Exception EFilterIncompatibleSize: Initial container size: 0  Resulting container size: 0
terminate called after throwing an instance of 'EFilterIncompatibleSize'
makeOptFlow.sh: line 56: 31043 Aborted                 ./consistencyChecker/consistencyChecker "${folderName}/forward_${i}_${j}.flo" "${folderName}/backward_${j}_${i}.flo" "${folderName}/reliable_${i}_${j}.pgm"
run-deepflow.sh: line 10: ./deepmatching-static: Permission denied
run-deepflow.sh: line 10: ./deepflow2-static: Permission denied
run-deepflow.sh: line 10: ./deepflow2-static: Permission denied
run-deepflow.sh: line 10: ./deepmatching-static: Permission denied
Could not open ./f/flow_450x350/backward_4_3.flo
Could not open ./f/flow_450x350/forward_3_4.flo
Exception EFilterIncompatibleSize: Initial container size: 0  Resulting container size: 0
terminate called after throwing an instance of 'EFilterIncompatibleSize'
makeOptFlow.sh: line 56: 31052 Aborted                 ./consistencyChecker/consistencyChecker "${folderName}/backward_${j}_${i}.flo" "${folderName}/forward_${i}_${j}.flo" "${folderName}/reliable_${j}_${i}.pgm"
Could not open ./f/flow_450x350/forward_3_4.flo
Could not open ./f/flow_450x350/backward_4_3.flo
Exception EFilterIncompatibleSize: Initial container size: 0  Resulting container size: 0
terminate called after throwing an instance of 'EFilterIncompatibleSize'
makeOptFlow.sh: line 56: 31053 Aborted                 ./consistencyChecker/consistencyChecker "${folderName}/forward_${i}_${j}.flo" "${folderName}/backward_${j}_${i}.flo" "${folderName}/reliable_${i}_${j}.pgm"
run-deepflow.sh: line 10: ./deepmatching-static: Permission denied
run-deepflow.sh: line 10: ./deepflow2-static: Permission denied
run-deepflow.sh: line 10: ./deepmatching-static: Permission denied
run-deepflow.sh: line 10: ./deepflow2-static: Permission denied
Could not open ./f/flow_450x350/backward_5_4.flo
Could not open ./f/flow_450x350/forward_4_5.flo
Exception EFilterIncompatibleSize: Initial container size: 0  Resulting container size: 0
terminate called after throwing an instance of 'EFilterIncompatibleSize'
makeOptFlow.sh: line 56: 31062 Aborted                 ./consistencyChecker/consistencyChecker "${folderName}/backward_${j}_${i}.flo" "${folderName}/forward_${i}_${j}.flo" "${folderName}/reliable_${j}_${i}.pgm"
Could not open ./f/flow_450x350/forward_4_5.flo
Could not open ./f/flow_450x350/backward_5_4.flo
Exception EFilterIncompatibleSize: Initial container size: 0  Resulting container size: 0
terminate called after throwing an instance of 'EFilterIncompatibleSize'
makeOptFlow.sh: line 56: 31063 Aborted                 ./consistencyChecker/consistencyChecker "${folderName}/forward_${i}_${j}.flo" "${folderName}/backward_${j}_${i}.flo" "${folderName}/reliable_${i}_${j}.pgm"
[The same cycle repeats for every subsequent frame pair (5_6, 6_7, 7_8, 8_9): "Permission denied" for ./deepmatching-static and ./deepflow2-static, "Could not open" for the forward/backward .flo files, the EFilterIncompatibleSize exception, and the consistencyChecker abort.]
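
The "Permission denied" lines point at the root cause here: the downloaded static binaries are missing the executable bit, so no flow files are ever written, and the "Could not open" / EFilterIncompatibleSize / consistencyChecker errors all follow from that. A likely fix (an assumption from the error text, not a confirmed resolution from this thread), run from the repository root:

# Make the DeepMatching/DeepFlow binaries executable, then re-run makeOptFlow.sh
chmod +x deepmatching-static deepflow2-static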

Running out of memory

I assume this is normal, but I'm reporting it in case it's a real issue. I'm using a GTX 780 on Debian.

`eirexe@EIREXE-Debian:~/artistic-videos$ th artistic_video.lua -style_image starry_night.jpg -content_pattern frame_%04d.ppm -num_images 24 -gpu 0 -optimizer adam
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:537] Reading dangerously large protocol message. If the message turns out to be larger than 1073741824 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 574671192
Successfully loaded models/VGG_ILSVRC_19_layers.caffemodel
conv1_1: 64 3 3 3
conv1_2: 64 64 3 3
conv2_1: 128 64 3 3
conv2_2: 128 128 3 3
conv3_1: 256 128 3 3
conv3_2: 256 256 3 3
conv3_3: 256 256 3 3
conv3_4: 256 256 3 3
conv4_1: 512 256 3 3
conv4_2: 512 512 3 3
conv4_3: 512 512 3 3
conv4_4: 512 512 3 3
conv5_1: 512 512 3 3
conv5_2: 512 512 3 3
conv5_3: 512 512 3 3
conv5_4: 512 512 3 3
fc6: 1 1 25088 4096
fc7: 1 1 4096 4096
fc8: 1 1 4096 1000
Style image size: 607 x 379
Setting up temporal consistency.
Setting up style layer 2 : relu1_1
Setting up style layer 7 : relu2_1
Setting up style layer 12 : relu3_1
Setting up style layer 21 : relu4_1
THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-7004/cutorch/lib/THC/generic/THCStorage.cu line=41 error=2 : out of memory
/home/eirexe/torch/install/bin/luajit: /home/eirexe/torch/install/share/lua/5.1/nn/Container.lua:67:
In 19 module of nn.Sequential:
/home/eirexe/torch/install/share/lua/5.1/nn/THNN.lua:109: cuda runtime error (2) : out of memory at /tmp/luarocks_cutorch-scm-1-7004/cutorch/lib/THC/generic/THCStorage.cu:41
stack traceback:
[C]: in function 'v'
/home/eirexe/torch/install/share/lua/5.1/nn/THNN.lua:109: in function 'SpatialConvolutionMM_updateOutput'
...xe/torch/install/share/lua/5.1/nn/SpatialConvolution.lua:111: in function <...xe/torch/install/share/lua/5.1/nn/SpatialConvolution.lua:107>
[C]: in function 'xpcall'
/home/eirexe/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
/home/eirexe/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
./artistic_video_core.lua:229: in function 'buildNet'
artistic_video.lua:102: in function 'main'
artistic_video.lua:340: in main chunk
[C]: in function 'dofile'
...rexe/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00405bd0

WARNING: If you see a stack trace below, it doesn't point to the place where this error occured. Please use only the one above.
stack traceback:
[C]: in function 'error'
/home/eirexe/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
/home/eirexe/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
./artistic_video_core.lua:229: in function 'buildNet'
artistic_video.lua:102: in function 'main'
artistic_video.lua:340: in main chunk
[C]: in function 'dofile'
...rexe/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00405bd0
`
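
For context (not stated in the report above): a GTX 780 typically has 3 GB of VRAM, so an out-of-memory failure while building the network is plausible at this resolution rather than a bug. One mitigation sketch, assuming ffmpeg is available, is to re-extract the frames at a smaller size (input path and target height are illustrative); combining this with -backend cudnn, as used later in this thread, should reduce memory usage further:

# Re-extract the frames at a lower resolution before stylizing;
# scale=-1:350 caps the height at 350 px and keeps the aspect ratio.
ffmpeg -i input.mp4 -vf scale=-1:350 frame_%04d.ppm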

How can I confirm I'm using GPU/CUDA?

I've been running th artistic_video.lua -backend cudnn for about 30 minutes now and it's still going. I installed cudnn and cudnn.torch, so I assume it's being used, but I'm not sure. LuaJIT is taking 100% of one of my CPU cores, but I'd like to verify that I am in fact using the GPU and not just the CPU in this case.

Is there a trivial way to determine this?

Thanks!
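
A quick check, assuming an NVIDIA card with the standard driver utilities installed: watch the GPU while the job runs. If CUDA is actually in use, the luajit process should show up in the process table with a large memory allocation, and GPU-Util should be non-zero during iterations; 100% CPU on one core is not by itself evidence that the GPU is idle.

# Refresh nvidia-smi once per second; look for luajit under "Processes"
# and for non-zero GPU-Util while iterations are running.
watch -n 1 nvidia-smi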

Conversion stopping after a few frames

The optical flow has been successfully computed with DeepMatching and DeepFlow. However, when proceeding to stylize the images, the process stops at the 3rd or 4th frame. Any thoughts?

Iteration 500 / 1000
Content 1 loss: 1703792.031250
Temporal 1 loss: 45211.532593
Style 1 loss: 2419.038391
Style 2 loss: 23474.081421
Style 3 loss: 28717.443848
Style 4 loss: 399992.993164
Style 5 loss: 3270.827103
Total loss: 2206877.947769
Iteration 600 / 1000
Content 1 loss: 1703150.468750
Temporal 1 loss: 45308.925629
Style 1 loss: 2418.774605
Style 2 loss: 23459.86785
Style 3 loss: 28724.108887
Style 4 loss: 400028.320312
Style 5 loss: 3272.040176
Total loss: 2206362.506218
<optim.lbfgs> relative change in function value is less than tolFunRelative
Iteration 651 / 1000
Content 1 loss: 1702996.406250
Temporal 1 loss: 45350.559235
Style 1 loss: 2418.964767
Style 2 loss: 23457.32879
Style 3 loss: 28720.361328
Style 4 loss: 399955.883789
Style 5 loss: 3271.706390
Total loss: 2206171.210556

avconv version 9.18-6:9.18-0ubuntu0.14.04.1, Copyright (c) 2000-2014 the Libav developers
built on Mar 16 2015 13:19:10 with gcc 4.8 (Ubuntu 4.8.2-19ubuntu1)
Input #0, image2, from 'video20/out-%d.png':
Duration: 00:00:00.20, start: 0.000000, bitrate: N/A
Stream #0.0: Video: png, rgb24, 350x250, 25 fps, 25 tbr, 25 tbn
[libx264 @ 0x1143140] using cpu capabilities: MMX2 SSE2Fast SSSE3 SSE4.2 AVX AVX2 FMA3 LZCNT BMI2
[libx264 @ 0x1143140] profile High, level 1.3
[libx264 @ 0x1143140] 264 - core 142 r2389 956c8d8 - H.264/MPEG-4 AVC codec - Copyleft 2003-2014 - http://www.videolan.org/x264.html - options: cabac=1 ref=3 deblock=1:0:0 analyse=0x3:0x113 me=hex subme=7 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=16 chroma_me=1 trellis=1 8x8dct=1 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=6 lookahead_threads=1 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=3 b_pyramid=2 b_adapt=1 b_bias=0 direct=1 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=40 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.25 aq=1:1.00
Output #0, mp4, to 'video20-stylized.mp4':
Metadata:
encoder : Lavf54.20.4
Stream #0.0: Video: libx264, yuv420p, 350x250, q=-1--1, 25 tbn, 25 tbc
Stream mapping:
Stream #0:0 -> #0:0 (png -> libx264)
Press ctrl-c to stop encoding
frame= 5 fps= 0 q=32763.0 Lsize= 28kB time=0.12 bitrate=1880.1kbits/s
video:0kB audio:0kB global headers:0kB muxing overhead 72210.256410%
[libx264 @ 0x1143140] frame I:1 Avg QP:27.64 size: 12578
[libx264 @ 0x1143140] frame P:2 Avg QP:29.01 size: 5289
[libx264 @ 0x1143140] frame B:2 Avg QP:30.24 size: 1711
[libx264 @ 0x1143140] consecutive B-frames: 20.0% 80.0% 0.0% 0.0%
[libx264 @ 0x1143140] mb I I16..4: 2.0% 54.5% 43.5%
[libx264 @ 0x1143140] mb P I16..4: 1.7% 4.3% 2.6% P16..4: 40.2% 34.2% 15.8% 0.0% 0.0% skip: 1.3%
[libx264 @ 0x1143140] mb B I16..4: 0.0% 0.0% 0.4% B16..8: 45.5% 17.8% 4.0% direct: 5.8% skip:26.6% L0:28.4% L1:42.6% BI:29.0%
[libx264 @ 0x1143140] 8x8 transform intra:53.5% inter:55.1%
[libx264 @ 0x1143140] coded y,uvDC,uvAC intra: 92.2% 100.0% 97.3% inter: 45.6% 63.3% 19.0%
[libx264 @ 0x1143140] i16 v,h,dc,p: 0% 53% 0% 47%
[libx264 @ 0x1143140] i8 v,h,dc,ddl,ddr,vr,hd,vl,hu: 15% 27% 14% 5% 6% 6% 10% 6% 11%
[libx264 @ 0x1143140] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 15% 25% 15% 8% 8% 7% 9% 7% 8%
[libx264 @ 0x1143140] i8c dc,h,v,p: 47% 22% 14% 18%
[libx264 @ 0x1143140] Weighted P-Frames: Y:50.0% UV:50.0%
[libx264 @ 0x1143140] ref P L0: 83.5% 16.1% 0.4%
[libx264 @ 0x1143140] kb/s:1063.12

Multi-GPU option?

Just saw the other post about multi-GPU -- closing as a dup. - Phil

In the original neural-style code there is a form of model parallelism that allows the VGG net to be split across multiple GPUs.

Is there any reason to think that adding similar code to this implementation is fundamentally bound to fail? In other words, is there something about applying the optical flow to video frames that requires that the model be maintained on a single GPU?

Processing stops with "bad argument #2 to 'add' (sizes do not match)" after first successful image

Apologies if I've missed something very basic, but I'm a bit out of my depth and I don't know where to go from here.
For a brief bit of background: I'm on a 1080, I can make neural-style images in about a minute and a half without difficulty, I can do deep dream video with deepdreamanim, and I've been able to run the provided example. (Though I should note that it doesn't reach full iterations on each step.)
But, despite my best efforts, I come back to "bad argument #2 to 'add' (sizes do not match)" after the first image is created.
I appreciate all the help I can get, and don't be afraid of breakin' it down eli5 style. Thank you.

al@al-teletran1:~/artistic-videos$ th artistic_video.lua -style_image /home/al/artistic-videos/example/1.jpg -content_pattern /home/al/artistic-videos/frames/frame_%04d.ppm -num_images 0 -num_iterations 900

[libprotobuf WARNING google/protobuf/io/coded_stream.cc:537] Reading dangerously large protocol message.  If the message turns out to be larger than 1073741824 bytes, parsing will be halted for security reasons.  To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 574671192
Successfully loaded models/VGG_ILSVRC_19_layers.caffemodel
conv1_1: 64 3 3 3
conv1_2: 64 64 3 3
conv2_1: 128 64 3 3
conv2_2: 128 128 3 3
conv3_1: 256 128 3 3
conv3_2: 256 256 3 3
conv3_3: 256 256 3 3
conv3_4: 256 256 3 3
conv4_1: 512 256 3 3
conv4_2: 512 512 3 3
conv4_3: 512 512 3 3
conv4_4: 512 512 3 3
conv5_1: 512 512 3 3
conv5_2: 512 512 3 3
conv5_3: 512 512 3 3
conv5_4: 512 512 3 3
fc6: 1 1 25088 4096
fc7: 1 1 4096 4096
fc8: 1 1 4096 1000
Style image size: 486 x 631	
Setting up temporal consistency.	
Setting up style layer  	2	:	relu1_1	
Setting up style layer  	7	:	relu2_1	
Setting up style layer  	12	:	relu3_1	
Setting up style layer  	21	:	relu4_1	
Setting up content layer	23	:	relu4_2	
Setting up style layer  	30	:	relu5_1	
Detected 5150 content images.	
Running optimization with L-BFGS	
<optim.lbfgs> 	creating recyclable direction/step/history buffers	
Iteration 100 / 900	
  Content 1 loss: 5079415.000000	
  Style 1 loss: 25260.379028	
  Style 2 loss: 273015.502930	
  Style 3 loss: 165728.356934	
  Style 4 loss: 586837.890625	
  Style 5 loss: 2422.522545	
  Total loss: 6132679.652061	

SKIPPING A FEW

Iteration 900 / 900	
  Content 1 loss: 4669215.312500	
  Style 1 loss: 441.764927	
  Style 2 loss: 15081.632996	
  Style 3 loss: 27847.726440	
  Style 4 loss: 239924.487305	
  Style 5 loss: 2429.140282	
  Total loss: 4954940.064449	
<optim.lbfgs> 	reached max number of iterations	
Running time: 280s	
Iteration 900 / 900	
  Content 1 loss: 4669215.312500	
  Style 1 loss: 441.764927	
  Style 2 loss: 15081.632996	
  Style 3 loss: 27847.726440	
  Style 4 loss: 239924.487305	
  Style 5 loss: 2429.140282	
  Total loss: 4954940.064449	
Reading flow file "example/deepflow/backward_2_1.flo".	
Reading flowWeights file "example/deepflow/reliable_2_1.pgm".	
WARNING: Skipping content loss	
Reading flow file "example/deepflow/backward_2_1.flo".	
WARNING: Skipping content loss	
Running optimization with L-BFGS	
WARNING: Skipping content loss	
/home/al/torch/install/bin/luajit: /home/al/torch/install/share/lua/5.1/nn/Container.lua:67: 
In 30 module of nn.Sequential:
./artistic_video_core.lua:286: bad argument #2 to 'add' (sizes do not match at /tmp/luarocks_cutorch-scm-1-4077/cutorch/lib/THC/generated/../generic/THCTensorMathPointwise.cu:216)
stack traceback:
	[C]: in function 'add'
	./artistic_video_core.lua:286: in function 'updateGradInput'
	/home/al/torch/install/share/lua/5.1/nn/Module.lua:31: in function </home/al/torch/install/share/lua/5.1/nn/Module.lua:29>
	[C]: in function 'xpcall'
	/home/al/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
	/home/al/torch/install/share/lua/5.1/nn/Sequential.lua:84: in function 'backward'
	./artistic_video_core.lua:93: in function 'opfunc'
	./lbfgs.lua:68: in function 'optimize'
	./artistic_video_core.lua:120: in function 'runOptimization'
	artistic_video.lua:262: in function 'main'
	artistic_video.lua:359: in main chunk
	[C]: in function 'dofile'
	...e/al/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
	[C]: at 0x00405d50

WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
stack traceback:
	[C]: in function 'error'
	/home/al/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
	/home/al/torch/install/share/lua/5.1/nn/Sequential.lua:84: in function 'backward'
	./artistic_video_core.lua:93: in function 'opfunc'
	./lbfgs.lua:68: in function 'optimize'
	./artistic_video_core.lua:120: in function 'runOptimization'
	artistic_video.lua:262: in function 'main'
	artistic_video.lua:359: in main chunk
	[C]: in function 'dofile'
	...e/al/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
	[C]: at 0x00405d50
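
One thing stands out in this log (a plausible cause, not a confirmed diagnosis): the flow files are read from example/deepflow/ while the content frames come from /home/al/artistic-videos/frames/, so the flow fields were computed for the example sequence and will not match these frames' resolution. That is consistent with a size mismatch surfacing only once frame 2 is initialized from the warped frame 1. If that is the cause, computing flow for these frames and pointing the script at it should help; a sketch, assuming makeOptFlow.sh takes <framePattern> <outputFolder> as it does when called from stylizeVideo.sh:

# Compute flow for *these* frames, then pass matching patterns
# (the [%d]/{%d} placeholders follow the repository's default pattern style).
./makeOptFlow.sh /home/al/artistic-videos/frames/frame_%04d.ppm /home/al/artistic-videos/frames/flow
th artistic_video.lua -style_image /home/al/artistic-videos/example/1.jpg \
  -content_pattern /home/al/artistic-videos/frames/frame_%04d.ppm \
  -flow_pattern /home/al/artistic-videos/frames/flow/backward_[%d]_{%d}.flo \
  -flowWeight_pattern /home/al/artistic-videos/frames/flow/reliable_[%d]_{%d}.pgm \
  -num_images 0 -num_iterations 900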

-continue_with option

If a render session gets interrupted, does this option allow us to restart at the 'next' frame and maintain the previous style continuity? I didn't see it referenced in the documentation other than what is in the artistic_video.lua file.

I've noticed that different runs of the same video and style frame result in different initial frames (probably due to random initialization).

I'm trying to figure out if it's possible to safely interrupt a long render without creating a glitch between frames and what settings would need to be set either during the initial run so as to preserve the start state and/or later during a resumption.

Thank you for your help.

Image.warp incorrect arguments

Greetings,

Thanks for all your work on this project! I'm very excited to be getting it up and running.

Currently, I'm running into an error whilst trying to process images, and unfortunately I've been unable to find any help docs that pointed me to a fix.

Running Ubuntu 14.04, CUDA 7.5, latest pull of artistic-videos, all on AWS g2.2xlarge.

It looks like the inputs to the image.warp function are incorrect – not sure why this would be. A misconfigured lib someplace?

I generated this error with the following line:

th artistic_video.lua -content_pattern example/marple8_%02d.ppm -style_image example/seated-nude.jpg -num_images 0 -gpu 0

Output:

Reading flow file "example/deepflow/backward_2_1.flo".
[res] image.warp([dst,]src,field,[mode,offset,clamp])

Warps image src (of size KxHxW) according to flow field field. The latter has size 2xHxW where the first dimension is for the (y,x) flow field. String mode can take on values lanczos, bicubic, bilinear (the default), or simple. When offset is true (the default), (x,y) is added to the flow field. The clamp variable specifies how to handle the interpolation of samples off the input image. Permitted values are strings clamp (the default) or pad. If dst is specified, it is used to store the result of the warp. Otherwise, returns a new res Tensor.
usage:
image.warp(
torch.Tensor -- input image (KxHxW)
torch.Tensor -- (y,x) flow field (2xHxW)
[string] -- mode: lanczos | bicubic | bilinear | simple
[string] -- offset mode (add (x,y) to flow field)
[string] -- clamp mode: how to handle interp of samples off the input image (clamp | pad)
)
image.warp(
torch.Tensor -- destination
torch.Tensor -- input image (KxHxW)
torch.Tensor -- (y,x) flow field (2xHxW)
[string] -- mode: lanczos | bicubic | bilinear | simple
[string] -- offset mode (add (x,y) to flow field)
[string] -- clamp mode: how to handle interp of samples off the input image (clamp | pad)
)
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

/home/ubuntu/torch-distro/install/bin/luajit: ...ubuntu/torch-distro/install/share/lua/5.1/dok/inline.lua:736: <image.warp> incorrect arguments
stack traceback:
[C]: in function 'error'
...ubuntu/torch-distro/install/share/lua/5.1/dok/inline.lua:736: in function 'error'
...ubuntu/torch-distro/install/share/lua/5.1/image/init.lua:807: in function 'warp'
artistic_video.lua:293: in function 'warpImage'
artistic_video.lua:182: in function 'main'
artistic_video.lua:359: in main chunk
[C]: in function 'dofile'
...rch-distro/install/lib/luarocks/rocks/trepl/scm-1/bin/th:131: in main chunk
[C]: at 0x00406670

Any help is greatly appreciated.

EDIT:
Success after editing line 293:
Original:
result = image.warp(img, flow, 'bilinear', true, 'pad', -1)

Changed:
result = image.warp(img, flow, 'bilinear', true, 'clamp')
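
A plausible explanation for the original error (an assumption, not confirmed in this thread): the call passes a sixth argument, a pad fill value of -1, which only newer versions of the torch image package accept; an older image package fails the argument check with "incorrect arguments", matching the usage dump above, which lists no pad-value parameter. Besides the edit above, updating the package may also work:

# Untested alternative to editing the source: update the torch image package
# so image.warp accepts the 'pad' clamp mode with an explicit fill value.
luarocks install image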

Error 0x00405d70

Hello! First of all, congratulations on your amazing work! I am a new Ubuntu user and I am interested in NNs, so I thought I'd run your code with the default parameters. As you can see from the attached screenshot, after a few hours of computing the optical flow I run into the specific issue shown. I do not know what else I can do; as you can see from the folder screenshot, every file is there. I would appreciate it if you could help me further on this matter.

Thank you very much!
George.

Issue screenshot: (image)

Folder screenshot: (image)

Question about dependencies

If I already have JCJohnson's neural-style installed, do I basically just need to install ffmpeg, and run this script? Is there a full list of dependencies etc?

How to do real time processing

Dear manuelruder:

I ran ./stylizeVideo.sh ./ball.mp4 ./example/seated-nude.jpg

ball.mp4 is a 720p, 25 fps, 18-second video, and I processed it to a 640x480 stylized video.
It took 5 hours to complete.

Machine configuration:
OS............: ubuntu14.04
Memory........: 32G
CPU...........: Intel Core i7-6700K 4.00GHz*8
DisplayCard...: NVIDIA GeForce GTX 1080
Harddisk......: 256G ssd


gtx1080@suker:~$ nvidia-smi
Tue Aug 23 21:17:24 2016
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.35 Driver Version: 367.35 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce GTX 1080 Off | 0000:01:00.0 On | N/A |
| 34% 35C P8 10W / 200W | 1306MiB / 8112MiB | 5% Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1169 G /usr/bin/X 1014MiB |
| 0 2025 G compiz 59MiB |
| 0 3571 G unity-control-center 2MiB |
| 0 10412 G ...ves-passed-by-fd --v8-snapshot-passed-by- 227MiB |
+-----------------------------------------------------------------------------+
gtx1080@suker:~$

I am looking forward to your reply.
Thanks

Issue with some frames crashing after 2 iterations

Hi all,

Apologies if this was fixed elsewhere, but I cannot find it if so. I'm a noob to this type of software and it's a little bewildering.

Some frames (very often the first frame, but overall around 1 in 15 of the completed frames) will crash after only going for one or two iterations. The result is that the frame is blurry compared to the ones that completed without issue. The script then carries on and attempts to create the next frame.

No obvious error, but the log differs between frames which did and didn't crash; for the bad frames it shows the tolX message and a one-second running time, as below:

Reading flow file ".../wavelarge/flow_1280:720/backward_19_18.flo".
Reading flowWeights file ".../wavelarge/flow_1280:720/reliable_19_18.pgm".
Reading flow file ".../wavelarge/flow_1280:720/backward_19_18.flo".
Running optimization with L-BFGS
<optim.lbfgs> creating recyclable direction/step/history buffers
<optim.lbfgs> function value changing less than tolX
Running time: 1s
Iteration 2 / 300
Content 1 loss: 9200722.500000
Temporal 1 loss: 0.000000
Style 1 loss: 812805.512238
Style 2 loss: 266905000.195312
Style 3 loss: 722221992.968750
Style 4 loss: 21909036787.500000
Style 5 loss: 1525010.723877
Total loss: 22909702319.400177
Reading flow file

I've tried setting -tol_loss_relative to 0, but that doesn't change anything. It has also been suggested that lowering the style weight might help, but the problem still occurs with the weight lowered significantly. Any ideas?

Error running line 84 of stylizeVideo.sh

I got an error after computing the optical flow.

The error is at line 84 of stylizeVideo.sh, which is:

th artistic_video.lua \

Any idea why?

Here is my terminal output.

zed@zed-desktop:~/artistic-videos$ sudo bash stylizeVideo.sh 'wingsuit/wing-%03d.png' 'styles/mountains.jpg'

Which backend do you want to use? For Nvidia GPU, use cudnn if available, otherwise nn. For non-Nvidia GPU, use clnn. Note: You have to have the given backend installed in order to use it. [nn]

clnn

This algorithm needs a lot of memory. For a resolution of 450x350 you'll need roughly 4GB VRAM. VRAM usage increases linear with resolution. Maximum recommended resolution with a Titan X 12GB: 960:540. Please enter a resolution at which the video should be processed, in the format w:h, or leave blank to use the original resolution

600:338
ffmpeg version N-80283-g84efdab Copyright (c) 2000-2016 the FFmpeg developers
built with gcc 4.8 (Ubuntu 4.8.4-2ubuntu1~14.04.3)
configuration: --extra-libs=-ldl --prefix=/opt/ffmpeg --mandir=/usr/share/man --enable-avresample --disable-debug --enable-nonfree --enable-gpl --enable-version3 --enable-libopencore-amrnb --enable-libopencore-amrwb --disable-decoder=amrnb --disable-decoder=amrwb --enable-libpulse --enable-libfreetype --enable-gnutls --enable-libx264 --enable-libx265 --enable-libfdk-aac --enable-libvorbis --enable-libmp3lame --enable-libopus --enable-libvpx --enable-libspeex --enable-libass --enable-avisynth --enable-libsoxr --enable-libxvid --enable-libvidstab
libavutil 55. 24.100 / 55. 24.100
libavcodec 57. 46.100 / 57. 46.100
libavformat 57. 37.101 / 57. 37.101
libavdevice 57. 0.101 / 57. 0.101
libavfilter 6. 46.101 / 6. 46.101
libavresample 3. 0. 0 / 3. 0. 0
libswscale 4. 1.100 / 4. 1.100
libswresample 2. 0.101 / 2. 0.101
libpostproc 54. 0.100 / 54. 0.100
Input #0, image2, from 'wingsuit/wing-%03d.png':
Duration: 00:00:08.72, start: 0.000000, bitrate: N/A
Stream #0:0: Video: png, rgb24(pc), 720x404 [SAR 254:255 DAR 3048:1717], 25 fps, 25 tbr, 25 tbn, 25 tbc
[image2 @ 0x21267a0] Using AVStream.codec to pass codec parameters to muxers is deprecated, use AVStream.codecpar instead.
Output #0, image2, to 'wing-x03d/frame_%04d.ppm':
Metadata:
encoder : Lavf57.37.101
Stream #0:0: Video: ppm, rgb24, 600x338 [SAR 42926:42925 DAR 3048:1717], q=2-31, 200 kb/s, 25 fps, 25 tbn, 25 tbc
Metadata:
encoder : Lavc57.46.100 ppm
Stream mapping:
Stream #0:0 -> #0:0 (png (native) -> ppm (native))
Press [q] to stop, [?] for help
frame= 218 fps=0.0 q=-0.0 Lsize=N/A time=00:00:08.72 bitrate=N/A speed=10.1x
video:129526kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown

How much do you want to weight the style reconstruction term? Default value: 1e2 for a resolution of 450x350. Increase for a higher resolution. [1e2]

1e2

Enter the zero-indexed ID of the GPU to use, or -1 for CPU mode (very slow!). [0]

0

Computing optical flow. This may take a while...
stylizeVideo.sh: line 84: th: command not found

I ended up running

th artistic_video.lua -style_image 'styles/mountains.jpg' -content_pattern wing-x03d/frame_%04d.ppm -flow_pattern wing-x03d/flow_600:338/backward_[%d]_{%d}.flo -flowWeight_pattern wing-x03d/flow_600:338/reliable_[%d]_{%d}.pgm -num_images 218 -output_image wing.png -number_format %04d -output_folder '/home/zed/wingsuit/render/' -save_iter 0 -gpu 0 -backend clnn

to use the flow files, and it seems to be working that way.
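
For anyone hitting the same thing, a likely cause is visible in the transcript (an inference, not a confirmed fix): the script was launched with sudo, and sudo resets PATH, so a per-user torch install (~/torch/install/bin, where th lives) drops off it, hence "th: command not found". Two ways around that:

# Run without sudo; the script itself does not need root:
bash stylizeVideo.sh 'wingsuit/wing-%03d.png' 'styles/mountains.jpg'
# Or, if sudo is needed for some other reason, preserve the caller's PATH:
sudo env "PATH=$PATH" bash stylizeVideo.sh 'wingsuit/wing-%03d.png' 'styles/mountains.jpg'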

Why is the optical flow calculated on the CPU?

I'm trying to use this program for the first time.
I have a 3-minute-long video with something like 5000 frames.
I can see the optical flow files appearing in my folder, but it's quite slow.
At the moment it's working on files named like 60_61, and it takes a non-negligible time to advance by one pair.

Getting to 4999_5000 will take ages. I have a GTX 1070; is there a way to use it?

Does it really have to compute flows all the way up to 4999_5000, or is the naming different?

deepmatching GPU code

As a side question, has anybody managed to get the DeepMatching GPU code to compile on Ubuntu? It seems to have been written against some version of caffe/protobuf that no longer seems compatible once the SWIG code is compiled. Or perhaps I'm just doing something wrong. I've beaten my head against it for a couple of days now with zero progress.

I'm currently using the CPU static binary provided on the original author's site, which works fine, but I would love to be able to speed up the matching process.

Newbie question

First off, this is incredible work; your videos are amazing. I recognize that I don't have the best GPU. Since it began producing output files, it has been taking about 200 seconds per frame. At this rate it will take a little over 25 hours to render an 18-second clip of 450 frames. Does this sound reasonable?

As a hobbyist I have been exploring Justin Johnson's neural-style as well as Chuan Li's convolutional neural net with Markov Random Fields. I find it easier to get nice results with the latter. Are you aware of any branch of your programs that uses cnnmrf?

I know that it would not be a trivial endeavor to modify your programs. If I were to attempt it, could you point me to which files and functions I would need to look at?

Thank you for your time.
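
(For reference, the arithmetic is consistent: 450 frames × 200 s/frame = 90,000 s = 25 h, so the estimate matches the observed per-frame time.)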

EpicFlow instead of DeepFlow

Have you experimented with EpicFlow as the optical flow engine? If so, would you recommend it over DeepFlow (faster? better? etc.)?

I've also noticed that for large video sizes (960x540) DeepFlow is very slow (~22 seconds per frame). Unfortunately it appears to be single-threaded, unlike DeepMatching, which is multi-threaded and runs at about 5 seconds for a forward/backward match per frame on my setup. I'm using the -R 2 setting on DeepMatching to cut the number of matches down, but I would still like to find a way to reduce the DeepFlow computation time.

I'm currently working on a shell script to create multiple DeepFlow processes to keep all my cores occupied, but if you've found a workaround that already solves this problem, let me know. I didn't see a multi-threaded version of DeepFlow on the author's site.

Thanks
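
A minimal parallelization sketch for the DeepFlow bottleneck described above, assuming frames named frame_0001.ppm onward and that run-deepflow.sh takes <image1> <image2> <output.flo> as it does when invoked from makeOptFlow.sh (NJOBS and the frame count are illustrative):

NJOBS=4
mkdir -p flow
seq 1 217 | xargs -P "$NJOBS" -I{} bash -c '
  i=$1; j=$((i + 1))
  f1=$(printf "frame_%04d.ppm" "$i"); f2=$(printf "frame_%04d.ppm" "$j")
  ./run-deepflow.sh "$f1" "$f2" "flow/forward_${i}_${j}.flo"
  ./run-deepflow.sh "$f2" "$f1" "flow/backward_${j}_${i}.flo"
' _ {}

Note that DeepMatching inside run-deepflow.sh is itself multi-threaded, so fewer concurrent jobs than cores may already saturate the CPU.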

I don't know how to make .flo file

I was computing the flow for the webarebears video, but I got error messages like:
could not open ./webarebears/flow_450:350/backward_00_00.flo
could not open ./webarebears/flow_450:350/forward_00_00.flo

But the point is that I do not know how to create files in that .flo format.

I think the following part of makeOptFlow.sh is not working properly:
while true; do
    file1=$(printf "$filePattern" "$i")
    file2=$(printf "$filePattern" "$j")
    if [ -a $file2 ]; then
        if [ ! -f ${folderName}/forward_${i}_${j}.flo ]; then
            eval $flowCommandLine "$file1" "$file2" "${folderName}/forward_${i}_${j}.flo"
        fi
        if [ ! -f ${folderName}/backward_${j}_${i}.flo ]; then
            eval $flowCommandLine "$file2" "$file1" "${folderName}/backward_${j}_${i}.flo"
        fi
        ./consistencyChecker/consistencyChecker "${folderName}/backward_${j}_${i}.flo" "${folderName}/forward_${i}_${j}.flo" "${folderName}/reliable_${j}_${i}.pgm"
        ./consistencyChecker/consistencyChecker "${folderName}/forward_${i}_${j}.flo" "${folderName}/backward_${j}_${i}.flo" "${folderName}/reliable_${i}_${j}.pgm"
    else
        break
    fi
    i=$[$i + 1]
    j=$[$j + 1]
done

Any advice or comments are appreciated.

Thank you
