
style-transfer's Introduction

Hey there, I'm Frank 👋

Professional presser of computer buttons, currently at Zilliz. Feel free to connect with me on Twitter or LinkedIn.

My education:

  • 2009 - 2013: BS with Honors, Electrical Engineering @ Stanford University (minor in CS)
  • 2013 - 2014: MS, Electrical Engineering @ Stanford University

My background:

  • 2014 - 2016: SDE, Computer Vision & Machine Learning @ Yahoo
  • 2016 - 2021: CTO & Co-founder @ Orion Innovations
  • 2021 - present: Head of AI & ML @ Zilliz

My podcasts/presentations:

style-transfer's People

Contributors

dpaiton, fzliu, pjturcot, tobegit3hub


style-transfer's Issues

Can't get a converged result when using AlexNet

Hi @fzliu,
Recently I have been trying to use a smaller network to generate neural art. However, with AlexNet it is harder to converge and I get a very blurry image. I don't know why; maybe it's AlexNet's LRN layers, the large stride in the first layers, or some improper hyper-parameter.

Could you share the hyper-parameters you used for AlexNet, such as learning_rate and style_weight?

Thanks in advance.

Issue about Loss function

Content Loss and Style Loss are two loss functions used here.
_compute_style_grad() -------- line 100
_compute_content_grad() -------- line 114
I do not understand why the gradient is multiplied by (Fl > 0) when it is computed.
Is there a specific reason?
@fzliu
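A likely explanation: the layer activations pass through a ReLU, and multiplying by `(Fl > 0)` applies the ReLU derivative, zeroing the gradient wherever the activation was clipped to zero. A toy numpy sketch (values hypothetical):

```python
import numpy as np

# Fl stands in for a layer's (filters x positions) activation matrix.
Fl = np.array([[1.5, -0.3],
               [0.0,  2.0]])
upstream = np.ones_like(Fl)  # gradient arriving from the loss

# (Fl > 0) is the ReLU derivative: it zeroes the gradient at every
# position where the ReLU output was non-positive.
grad = upstream * (Fl > 0)
```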

Model.ckpt

A bit off topic, but how did you get the model.ckpt you shared on Google Drive to be so small? I am getting much larger files, and they come split into .ckpt-meta and .pbtxt files.

Why do backpropagation on the original VGG net?

Hi
I've been reading the original paper and this design for a couple of days, and one implementation decision puzzles me.

By my understanding, the original VGG model from Caffe is used to extract content from the input image and then to turn content into style (Gram matrices); the loss is defined as a weighted combination of the feature loss and style loss. It appears the original VGG model does not involve any back-propagation, so may I ask whether there is a reason for this part of the design?

The relevant code is here

Thanks!
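For what it's worth, the backward passes here are presumably used not to train VGG but to obtain the gradient of the loss with respect to the input pixels, which the optimizer then updates while the network weights stay frozen. A toy linear stand-in for the frozen network (all values hypothetical):

```python
import numpy as np

# Toy frozen "network": a fixed linear layer y = W @ x.
W = np.array([[1.0, 2.0],
              [0.0, 1.0]])
t = np.array([1.0, 0.0])   # "content" target features
x = np.zeros(2)            # the image being optimized

y = W @ x
loss = 0.5 * np.sum((y - t) ** 2)

# Backprop: gradient w.r.t. the *input* x, with W held fixed.
# This is the quantity the optimizer needs to update the image.
grad_x = W.T @ (y - t)
```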

Support for multiple style images

I admit that I didn't dig too deeply into the code, but judging by the parser's arguments it seems that multiple style images are not supported yet. Am I correct? Will this be available in the near future?

shapeless blob?

Any idea what is going on with this?

style.py:main:14:09:25.062 -- Starting style transfer.
style.py:main:14:09:25.062 -- Running net on CPU.
style.py:main:14:09:26.432 -- Successfully loaded images.
style.py:main:14:09:29.054 -- Successfully loaded model vgg16.
Traceback (most recent call last):
File "style.py", line 520, in
main(args)
File "style.py", line 499, in main
n_iter=args.num_iters, verbose=args.verbose)
File "style.py", line 401, in transfer_style
orig_dim = min(self.net.blobs["data"].shape[2:])
AttributeError: 'Blob' object has no attribute 'shape'

GPU Load 5% avg

Hi !

I'm on Windows 10 64-bit; I got Caffe compiled and style.py works. But using GPU-Z I can see that the GPU load oscillates between 0 and 20% (averaging 5%) while my CPU is maxed out.

Is this a standard situation? Do you see the same GPU load?

PS: I get this warning in Caffe's common.cpp; I don't know if it's relevant:
style.py:main:20:29:23.556 -- Starting style transfer.
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0717 20:29:24.659956 4260 common.cpp:36] System entropy source not available, using fallback algorithm to generate seed instead.

style.py: AttributeError: 'module' object has no attribute 'set_device'

danusya@werewolf:~/style-transfer$ python ./style.py -s ~/****.jpg -c /tmp/****.jpg -v
style.py:main:21:12:25.433 -- Starting style transfer.
Traceback (most recent call last):
  File "./style.py", line 520, in <module>
    main(args)
  File "./style.py", line 481, in main
    caffe.set_device(args.gpu_id)
AttributeError: 'module' object has no attribute 'set_device'

On Windows, minimisation fails

Hi,

I'm using pycaffe on Windows, but the output is identical to the content image. From what I understand, only a single iteration is run, and for some reason the minimisation considers the first output to be already minimised. I don't know the BFGS code at all; it's possible that Caffe is reporting some value as 0 instead of the right value, but I don't know which one.

Here is the log :

`(env) C:\Users\vlj\Documents\GitHub\style-transfer [master ≡ +2 ~1 -0 !]> python .\style.py -s .\images\style\starry_night.jpg -c .\images\content\sanfrancisco.jpg -m VGG16 -g 0 -v -
style.py:main:19:44:01.838 -- Starting style transfer.
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0121 19:44:02.426594 14300 common.cpp:36] System entropy source not available, using fallback algorithm to generate seed instead.
style.py:main:19:44:02.427 -- Running net on GPU 0.
style.py:main:19:44:02.494 -- Successfully loaded images.
style.py:main:19:44:02.557 -- Successfully loaded model VGG16.
C:\Users\vlj\Documents\GitHub\style-transfer\env\lib\site-packages\skimage\transform\_warps.py:84: UserWarning: The default mode, 'constant', will be changed to 'reflect' in skimage
warn("The default mode, 'constant', will be changed to 'reflect' in "
RUNNING THE L-BFGS-B CODE

       * * *

Machine precision = 2.220D-16
N = 525312 M = 8
The initial X is infeasible. Restart with its projection.

At X0 40 variables are exactly at the bounds

At iterate 0 f= 0.00000D+00 |proj g|= 0.00000D+00

       * * *

Tit = total number of iterations
Tnf = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip = number of BFGS updates skipped
Nact = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F = final function value

       * * *

N Tit Tnf Tnint Skip Nact Projg F
***** 0 1 0 0 0 0.000D+00 0.000D+00
F = 0.0000000000000000

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL

Cauchy time 0.000E+00 seconds.
Subspace minimization time 0.000E+00 seconds.
Line search time 0.000E+00 seconds.

Total User time 0.000E+00 seconds.

style.py:main:19:44:05.648 -- Ran 0 iterations in 3s.
C:\Users\vlj\Documents\GitHub\style-transfer\env\lib\site-packages\skimage\util\dtype.py:122: UserWarning: Possible precision loss when converting from float32 to uint8
.format(dtypeobj_in, dtypeobj_out))
style.py:main:19:44:05.704 -- Output saved to outputs/sanfrancisco-starry_night-VGG16-content-1e4-512.jpg.`

Poor output quality when using GoogleNet and CaffeNet

[output image]

Command:

python2 style.py -c "$ROOT_DIR/johannesburg.jpg" -s "$ROOT_DIR/starry_night.jpg" -o "$ROOT_DIR/starry_johannesburg.jpg" --model googlenet

[output image]

Command:

python2 style.py -c "$ROOT_DIR/johannesburg.jpg" -s "$ROOT_DIR/starry_night.jpg" -o "$ROOT_DIR/starry_johannesburg.jpg" --model caffenet

Is this normal? I'm running Gentoo with a GeForce 750 Ti 2Gb, driver version 361.28, CUDA 7.0.28, Caffe built from git today. I'm getting out of memory errors when I try to run with the default neural network.

Check failed !

I downloaded CaffeNet and ran: python style.py -s style.jpg -c content.jpg -g -1 -m caffenet -n 200 -o "./outputs"
The terminal just prints:
F1211 23:40:22.295680 10440 inner_product_layer.cpp:64] Check failed: K_ == new_K (9216 vs. 82944) Input size incompatible with inner product parameters.
*** Check failure stack trace: ***

How am I supposed to pass the images in?

python style.py -s <style_image> -c <content_image> -m <model_name> -g 0

Sorry, it's not clear to me how I'm supposed to specify the content I want to process, or in what directory. Which of these is correct?

  1. python style.py -s <style_fruit.jpg> -c <content_orange> -m <model_vgg> -g 0
  2. python style.py -s <fruit.jpg> -c <orange.png> -m -g 0
  3. python style.py -s fruit.jpg -c orange.png -m vgg -g 0

Where to get the 'stylenet' model?

Hi,

in the models folder there is 'stylenet', however it cannot be automatically downloaded (as not included in the download script). Which model is this, can you provide a link so I can fetch it?

Edit: I would also need the related weights, as defined in lines 59-85, for the other models.

Thanks in advance, Georg.

'Blob' object has no attribute 'shape'

On line 148 [[ x.reshape(net.blobs["data"].shape[1:]) ]]
net.blobs["data"] is a Blob object, which doesn't seem to have a shape field.

I think it's supposed to be [[ x.reshape(net.blobs["data"].data.shape[1:]) ]]
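This looks like a pycaffe version difference: some builds expose `Blob.shape` directly, while others only expose the shape through the underlying `.data` array. A defensive helper (name and structure hypothetical) that works either way:

```python
import numpy as np

def data_shape(blob):
    """Return a blob's shape, falling back to .data.shape on builds
    where the Blob object itself has no `shape` attribute."""
    shape = getattr(blob, "shape", None)
    return tuple(shape) if shape is not None else tuple(blob.data.shape)
```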

Just a question

I have a quick question. I was thinking of writing a wrapper for https://github.com/jwetzl/CudaLBFGS to avoid using scipy. Do you think that would help speed up the optimizations? I have large batches of images to process, so unfortunately waiting is not an option. If you have any advice or ideas, I'd appreciate it.

IOError: [Errno 2] No such file or directory: 'outputs/sanfrancisco-starry_night-vgg19-content-1e4-512.jpg'

Hi,

it seems that I have problems with writing the output image. Here is a copy from my terminal.

iki@iki-CELSIUS-R570-2:~/artistic_style/style-transfer$ python style.py -s ./images/style/starry_night.jpg -c ./images/content/sanfrancisco.jpg -m vgg19 -g 0
style.py:main:21:04:09.368 -- Starting style transfer.
style.py:main:21:04:09.650 -- Running net on GPU 0.
style.py:main:21:04:09.699 -- Successfully loaded images.
style.py:main:21:04:13.625 -- Successfully loaded model vgg19.
Optimizing: 100% ||||||||||||||||||||||||||||||||||||||||||||||||| Time: 0:14:26
style.py:main:21:18:41.323 -- Ran 513 iterations in 868s.
/home/iki/.local/lib/python2.7/site-packages/skimage/util/dtype.py:110: UserWarning: Possible precision loss when converting from float32 to uint8
"%s to %s" % (dtypeobj_in, dtypeobj))
Traceback (most recent call last):
File "style.py", line 520, in <module>
main(args)
File "style.py", line 514, in main
imsave(out_path, img_as_ubyte(img_out))
File "/usr/lib/python2.7/dist-packages/scipy/misc/pilutil.py", line 168, in imsave
im.save(name)
File "/usr/lib/python2.7/dist-packages/PIL/Image.py", line 1676, in save
fp = builtins.open(fp, "wb")
IOError: [Errno 2] No such file or directory: 'outputs/sanfrancisco-starry_night-vgg19-content-1e4-512.jpg'

Thank you for the implementation!

Tomi

CPU only?

My Caffe build is CPU-only; can I use this project?

Making GPU works

Previously I had the Blob issue, which was resolved after changing that line.

But I cannot get the GPU working.

[py27] C:\Users\Jimmy\style-transfer-master>python style.py -s C:\Users\Jimmy\st
yle-transfer-master\images\style\starry_night.jpg -c C:\Users\Jimmy\style-transf
er-master\images\content\blackdog.jpg -m VGG16 -g 1
style.py:main:11:46:43.924 -- Starting style transfer.
WARNING: Logging before InitGoogleLogging() is written to STDERR
F0406 11:46:43.947576 25172 common.cpp:145] Check failed: error == cudaSuccess (
10 vs. 0) invalid device ordinal
*** Check failure stack trace: ***

Resolution

Just a quick question: is there any way to set the resolution of the output?

Thanks for putting this together. I am having a lot of fun with it :)

Why does img0 need to be flattened?

Hello, I tried to use a different network with different inputs, and I noticed that in the minimize function the author uses img0.flatten(). Why is that? I ask because I also need to feed image-size information into my network.
I would really appreciate it if someone could give me a hint!

Some about implementation

Hi, Frank. Brilliant work! But I have some questions about your implementation.
In style.py, you write the following in lines 184 to 210:
grad = net.blobs[layer].diff[0]
grad += wl * g.reshape(grad.shape) * ratio
grad = net.blobs[next_layer].diff[0]
Why do you sum the net gradient into the loss gradient, and why do you use the next_layer gradient to override the total gradient?
Thank you!
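My reading of this pattern: each layer's style/content gradient is added onto whatever gradient has already flowed down from layers above (hence the `+=`), and after the backward step the next lower layer's diff holds the fully accumulated gradient, which is why it replaces the running total. A toy numpy sketch of the accumulation (values hypothetical):

```python
import numpy as np

# Toy two-layer chain: a = relu(x), b = 2 * a, with a loss attached at
# both a and b (mirroring losses attached at multiple network layers).
x = np.array([1.0, -1.0])
a = np.maximum(x, 0.0)
b = 2.0 * a

grad_a = 2.0 * np.ones_like(b)   # gradient arriving at a from the loss on b
grad_a += np.ones_like(a)        # += the loss attached directly at a
grad_x = grad_a * (x > 0)        # backward step; grad_x now *replaces* the
                                 # running total, like reading the next
                                 # (lower) layer's diff after net.backward
```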

Basis for googlenet weights

Currently, in the GoogLeNet model, the "content" representation goes to inception_3a/output (mainly) and conv2/3x3 (2e-4), and the "style" representation goes to 5 layers from conv1/7x7_s2 to inception_5a/output.
Is there some basis for this choice?

inception_3a/output is close to the input of the network. Activating this layer via the "deepdream" method produces only spiral patterns and edge strokes, not even "eyes".
In VGG-19, "content" goes into conv4_2, which is the 10th of 16 convolution layers and is placed after the 3rd maxpool layer. In GoogLeNet the 3rd maxpool is pool3, so the layers analogous to conv4_2 should be inception_4b, 4c, or 4d. I see only one test in the commit history, with inception_3b + inception_4a (50% each), and it looks like no higher layers were tested.
Has anyone searched for where the "content" representation should go in GoogLeNet?

PyPi

Hi there,

Was wondering if you had any plans / concerns about putting this up on PyPI (Python Package Index)? Otherwise, if I wanted to use this in a project of mine, how would you like me to commit your work into my repo? I will surely be linking back to your project in my README.md but I was wondering if you wanted all your files intact or can I just take style.py & the scripts directory since I believe that is all I'd need.

Thanks!

VGG model cannot run?

Hi!
I have successfully run the code and generated result.jpg with the default GoogLeNet model.
I tried to run VGG-19 by setting model_name = "vgg", but it ran 0 iterations and generated a result identical to the content image.
I found that the gradient in the data layer is zero. Could you kindly help me solve this problem?

Provide outputs directory by default

The first time I ran style-transfer, it failed because the outputs directory didn't exist. It worked after I created the folder.


It would be much better to provide this directory by default. I may send a pull request to address this.

Style isn't transferred successfully, only content

I use the command:
python style.py -s images/style/starry_night.jpg -c images/content/johannesburg.jpg -m vgg19 -g 0

The images and model load successfully, but the output image has only the content, not the style. Does anyone know what's going on?

batch_size

Hi! Where can I change batch_size? I can't run the vgg16 model on a GTX 960 4 GB!

Where does the parameter n_iter take effect in the minimize() function?

Excuse me, but I wonder where the parameter n_iter takes effect in the minimize() function.

if self.use_pbar and not verbose:
    self._create_pbar(n_iter)
    self.pbar.start()
    res = minimize(style_optfn, img0.flatten(), **minfn_args).nit
    self.pbar.finish()
else:
    res = minimize(style_optfn, img0.flatten(), **minfn_args).nit
return res

In this code, n_iter is used to create the pbar, and n_iter enters the style_optfn function as an options parameter, but in the definition of

style_optfn(x, net, weights, layers, reprs, ratio)

there is no iteration option.
Does anyone have ideas? Thank you!
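One plausible reading: n_iter is forwarded inside minfn_args as options={"maxiter": n_iter} (this is an assumption about the code, not confirmed), so the iteration cap is enforced by SciPy's L-BFGS-B driver rather than by style_optfn itself. A minimal SciPy sketch:

```python
import numpy as np
from scipy.optimize import minimize

n_iter = 5  # plays the role of the n_iter argument

# Minimize a simple quadratic; "maxiter" caps the L-BFGS-B iterations,
# while the objective function itself never sees n_iter.
res = minimize(lambda x: np.sum((x - 3.0) ** 2),
               np.zeros(2),
               method="L-BFGS-B",
               jac=lambda x: 2.0 * (x - 3.0),
               options={"maxiter": n_iter})
```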
