
style-transfer's Introduction

Hey there, I'm Frank 👋

Professional presser of computer buttons, currently at Zilliz. Feel free to connect with me on Twitter or LinkedIn.

My education:

  • 2009 - 2013: BS with Honors, Electrical Engineering @ Stanford University (minor in CS)
  • 2013 - 2014: MS, Electrical Engineering @ Stanford University

My background:

  • 2014 - 2016: SDE, Computer Vision & Machine Learning @ Yahoo
  • 2016 - 2021: CTO & Co-founder @ Orion Innovations
  • 2021 - present: Head of AI & ML @ Zilliz

My podcasts/presentations:

style-transfer's People

Contributors

dpaiton, fzliu, pjturcot, tobegit3hub


style-transfer's Issues

Can't get a converged result when using AlexNet

Hi @fzliu,
Recently I have been trying to use a smaller network to generate neural art. However, with AlexNet it is harder to converge and I get a very blurry image. I don't know why; maybe it's AlexNet's LRN layers, the large stride in the first layers, or some improper hyper-parameter.

Could you share the hyper-parameters you used for AlexNet, such as learning_rate and style_weight?

Thanks in advance.

Issue about Loss function

Content Loss and Style Loss are two loss functions used here.
_compute_style_grad() -------- line 100
_compute_content_grad() -------- line 114
I do not understand why the gradient is multiplied by (Fl > 0) when it is computed.
Is there a specific reason?
@fzliu
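A likely explanation: the layer activations pass through a ReLU, and multiplying by `(Fl > 0)` applies the ReLU derivative, zeroing the gradient wherever the activation was clipped to zero. A toy numpy sketch (values hypothetical):

```python
import numpy as np

# Fl stands in for a layer's (filters x positions) activation matrix.
Fl = np.array([[1.5, -0.3],
               [0.0,  2.0]])
upstream = np.ones_like(Fl)  # gradient arriving from the loss

# (Fl > 0) is the ReLU derivative: it zeroes the gradient at every
# position where the ReLU output was non-positive.
grad = upstream * (Fl > 0)
```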

Model.ckpt

A bit off topic, but how did you get the model.ckpt you shared on Google Drive to be so small? I am getting much larger files, and they come split into .ckpt-meta and .pbtxt files.

Why do backpropagation on the original VGG net?

Hi
I've been reading the original paper and this design for a couple of days, and one implementation decision puzzles me.

By my understanding, the original VGG model from Caffe is used to extract content from the input image and then to turn content into style (Gram matrices); the loss is defined as a weighted combination of the feature loss and style loss. It appears the original VGG model does not involve any back-propagation, so may I ask whether there is a reason for this part of the design?

The relevant code is here

Thanks!
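For what it's worth, the backward passes here are presumably used not to train VGG but to obtain the gradient of the loss with respect to the input pixels, which the optimizer then updates while the network weights stay frozen. A toy linear stand-in for the frozen network (all values hypothetical):

```python
import numpy as np

# Toy frozen "network": a fixed linear layer y = W @ x.
W = np.array([[1.0, 2.0],
              [0.0, 1.0]])
t = np.array([1.0, 0.0])   # "content" target features
x = np.zeros(2)            # the image being optimized

y = W @ x
loss = 0.5 * np.sum((y - t) ** 2)

# Backprop: gradient w.r.t. the *input* x, with W held fixed.
# This is the quantity the optimizer needs to update the image.
grad_x = W.T @ (y - t)
```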

Support for multiple style images

I admit that I didn't dig too deeply into the code, but judging by the parser's arguments it seems that multiple style images are not supported yet. Am I correct? Will this be available in the near future?

shapeless blob?

Any idea what is going on with this?

style.py:main:14:09:25.062 -- Starting style transfer.
style.py:main:14:09:25.062 -- Running net on CPU.
style.py:main:14:09:26.432 -- Successfully loaded images.
style.py:main:14:09:29.054 -- Successfully loaded model vgg16.
Traceback (most recent call last):
File "style.py", line 520, in
main(args)
File "style.py", line 499, in main
n_iter=args.num_iters, verbose=args.verbose)
File "style.py", line 401, in transfer_style
orig_dim = min(self.net.blobs["data"].shape[2:])
AttributeError: 'Blob' object has no attribute 'shape'

GPU Load 5% avg

Hi !

I'm on Windows 10 64-bit; I got Caffe compiled and style.py works. But using GPU-Z I can see that the GPU load oscillates between 0 and 20% (averaging 5%) while my CPU is maxed out.

Is this a standard situation? Do you see the same GPU load?

PS: I get this warning in Caffe's common.cpp; I don't know if it's relevant:
style.py:main:20:29:23.556 -- Starting style transfer.
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0717 20:29:24.659956 4260 common.cpp:36] System entropy source not available, using fallback algorithm to generate seed instead.

style.py: AttributeError: 'module' object has no attribute 'set_device'

danusya@werewolf:~/style-transfer$ python ./style.py -s ~/****.jpg -c /tmp/****.jpg -v
style.py:main:21:12:25.433 -- Starting style transfer.
Traceback (most recent call last):
  File "./style.py", line 520, in <module>
    main(args)
  File "./style.py", line 481, in main
    caffe.set_device(args.gpu_id)
AttributeError: 'module' object has no attribute 'set_device'

On Windows, minimisation fails

Hi,

I'm using pycaffe on Windows, but the output is identical to the content image. From what I understand, only a single iteration is run, and for some reason the minimisation considers the first output to be already minimised. I don't know the BFGS code at all; it's possible that Caffe is reporting some value as 0 instead of the right value, but I don't know which one.

Here is the log :

`(env) C:\Users\vlj\Documents\GitHub\style-transfer [master ≡ +2 ~1 -0 !]> python .\style.py -s .\images\style\starry_night.jpg -c .\images\content\sanfrancisco.jpg -m VGG16 -g 0 -v -
style.py:main:19:44:01.838 -- Starting style transfer.
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0121 19:44:02.426594 14300 common.cpp:36] System entropy source not available, using fallback algorithm to generate seed instead.
style.py:main:19:44:02.427 -- Running net on GPU 0.
style.py:main:19:44:02.494 -- Successfully loaded images.
style.py:main:19:44:02.557 -- Successfully loaded model VGG16.
C:\Users\vlj\Documents\GitHub\style-transfer\env\lib\site-packages\skimage\transform\_warps.py:84: UserWarning: The default mode, 'constant', will be changed to 'reflect' in skimage
warn("The default mode, 'constant', will be changed to 'reflect' in "
RUNNING THE L-BFGS-B CODE

       * * *

Machine precision = 2.220D-16
N = 525312 M = 8
The initial X is infeasible. Restart with its projection.

At X0 40 variables are exactly at the bounds

At iterate 0 f= 0.00000D+00 |proj g|= 0.00000D+00

       * * *

Tit = total number of iterations
Tnf = total number of function evaluations
Tnint = total number of segments explored during Cauchy searches
Skip = number of BFGS updates skipped
Nact = number of active bounds at final generalized Cauchy point
Projg = norm of the final projected gradient
F = final function value

       * * *

N Tit Tnf Tnint Skip Nact Projg F
***** 0 1 0 0 0 0.000D+00 0.000D+00
F = 0.0000000000000000

CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL

Cauchy time 0.000E+00 seconds.
Subspace minimization time 0.000E+00 seconds.
Line search time 0.000E+00 seconds.

Total User time 0.000E+00 seconds.

style.py:main:19:44:05.648 -- Ran 0 iterations in 3s.
C:\Users\vlj\Documents\GitHub\style-transfer\env\lib\site-packages\skimage\util\dtype.py:122: UserWarning: Possible precision loss when converting from float32 to uint8
.format(dtypeobj_in, dtypeobj_out))
style.py:main:19:44:05.704 -- Output saved to outputs/sanfrancisco-starry_night-VGG16-content-1e4-512.jpg.`

Poor output quality when using GoogleNet and CaffeNet

[output image]

Command:

python2 style.py -c "$ROOT_DIR/johannesburg.jpg" -s "$ROOT_DIR/starry_night.jpg" -o "$ROOT_DIR/starry_johannesburg.jpg" --model googlenet

[output image]

Command:

python2 style.py -c "$ROOT_DIR/johannesburg.jpg" -s "$ROOT_DIR/starry_night.jpg" -o "$ROOT_DIR/starry_johannesburg.jpg" --model caffenet

Is this normal? I'm running Gentoo with a GeForce 750 Ti 2Gb, driver version 361.28, CUDA 7.0.28, Caffe built from git today. I'm getting out of memory errors when I try to run with the default neural network.

Check failed !

I downloaded CaffeNet and ran: python style.py -s style.jpg -c content.jpg -g -1 -m caffenet -n 200 -o "./outputs"
The terminal just prints:
F1211 23:40:22.295680 10440 inner_product_layer.cpp:64] Check failed: K_ == new_K (9216 vs. 82944) Input size incompatible with inner product parameters.
*** Check failure stack trace: ***

How am I supposed to pass the images in?

python style.py -s <style_image> -c <content_image> -m <model_name> -g 0

Sorry, it's not clear to me how I'm supposed to specify the content I want to process, or in what directory. Which of these is correct?

  1. python style.py -s <style_fruit.jpg> -c <content_orange> -m <model_vgg> -g 0
  2. python style.py -s <fruit.jpg> -c <orange.png> -m -g 0
  3. python style.py -s fruit.jpg -c orange.png -m vgg -g 0

Where to get the 'stylenet' model?

Hi,

in the models folder there is 'stylenet', however it cannot be automatically downloaded (as not included in the download script). Which model is this, can you provide a link so I can fetch it?

Edit: I would also need the related weights, as defined in lines 59-85, for the other models.

Thanks in advance, Georg.

'Blob' object has no attribute 'shape'

On line 148 [[ x.reshape(net.blobs["data"].shape[1:]) ]]
net.blobs["data"] is a Blob object, which doesn't seem to have a shape field.

I think it's supposed to be [[ x.reshape(net.blobs["data"].data.shape[1:]) ]]
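This looks like a pycaffe version difference: some builds expose `Blob.shape` directly, while others only expose the shape through the underlying `.data` array. A defensive helper (name and structure hypothetical) that works either way:

```python
import numpy as np

def data_shape(blob):
    """Return a blob's shape, falling back to .data.shape on builds
    where the Blob object itself has no `shape` attribute."""
    shape = getattr(blob, "shape", None)
    return tuple(shape) if shape is not None else tuple(blob.data.shape)
```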

Just a question

I have a quick question. I was thinking of writing a wrapper for https://github.com/jwetzl/CudaLBFGS to avoid using scipy. Do you think that would help speed up the optimizations? I have large batches of images to process, so unfortunately waiting is not an option. If you have any advice or ideas, I'd appreciate it.

IOError: [Errno 2] No such file or directory: 'outputs/sanfrancisco-starry_night-vgg19-content-1e4-512.jpg'

Hi,

it seems that I have problems with writing the output image. Here is a copy from my terminal.

iki@iki-CELSIUS-R570-2:~/artistic_style/style-transfer$ python style.py -s ./images/style/starry_night.jpg -c ./images/content/sanfrancisco.jpg -m vgg19 -g 0
style.py:main:21:04:09.368 -- Starting style transfer.
style.py:main:21:04:09.650 -- Running net on GPU 0.
style.py:main:21:04:09.699 -- Successfully loaded images.
style.py:main:21:04:13.625 -- Successfully loaded model vgg19.
Optimizing: 100% ||||||||||||||||||||||||||||||||||||||||||||||||| Time: 0:14:26
style.py:main:21:18:41.323 -- Ran 513 iterations in 868s.
/home/iki/.local/lib/python2.7/site-packages/skimage/util/dtype.py:110: UserWarning: Possible precision loss when converting from float32 to uint8
"%s to %s" % (dtypeobj_in, dtypeobj))
Traceback (most recent call last):
File "style.py", line 520, in <module>
main(args)
File "style.py", line 514, in main
imsave(out_path, img_as_ubyte(img_out))
File "/usr/lib/python2.7/dist-packages/scipy/misc/pilutil.py", line 168, in imsave
im.save(name)
File "/usr/lib/python2.7/dist-packages/PIL/Image.py", line 1676, in save
fp = builtins.open(fp, "wb")
IOError: [Errno 2] No such file or directory: 'outputs/sanfrancisco-starry_night-vgg19-content-1e4-512.jpg'

Thank you for the implementation!

Tomi

CPU only?

My Caffe build is CPU-only; can I use this project?

Making GPU works

Previously I had the Blob issue, which was resolved after changing that line.

But I cannot get the GPU working.

[py27] C:\Users\Jimmy\style-transfer-master>python style.py -s C:\Users\Jimmy\st
yle-transfer-master\images\style\starry_night.jpg -c C:\Users\Jimmy\style-transf
er-master\images\content\blackdog.jpg -m VGG16 -g 1
style.py:main:11:46:43.924 -- Starting style transfer.
WARNING: Logging before InitGoogleLogging() is written to STDERR
F0406 11:46:43.947576 25172 common.cpp:145] Check failed: error == cudaSuccess (
10 vs. 0) invalid device ordinal
*** Check failure stack trace: ***

Resolution

Just a quick question: is there any way to set the resolution of the output?

Thanks for putting this together. I am having a lot of fun with it :)

Why does img0 need to be flattened?

Hello, I tried to use a different network with different inputs, and I noticed that in the minimize function the author uses img0.flatten(). Why is that? I ask because I also need to feed image-size information into my network.
I would really appreciate it if someone could give me a hint!

Some about implementation

Hi, Frank. Brilliant work! But I have some questions about your implementation.
In style.py, you write the following in lines 184 to 210:
grad = net.blobs[layer].diff[0]
grad += wl * g.reshape(grad.shape) * ratio
grad = net.blobs[next_layer].diff[0]
Why do you sum the net gradient into the loss gradient, and why do you use the next_layer gradient to override the total gradient?
Thank you!
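My reading of this pattern: each layer's style/content gradient is added onto whatever gradient has already flowed down from layers above (hence the `+=`), and after the backward step the next lower layer's diff holds the fully accumulated gradient, which is why it replaces the running total. A toy numpy sketch of the accumulation (values hypothetical):

```python
import numpy as np

# Toy two-layer chain: a = relu(x), b = 2 * a, with a loss attached at
# both a and b (mirroring losses attached at multiple network layers).
x = np.array([1.0, -1.0])
a = np.maximum(x, 0.0)
b = 2.0 * a

grad_a = 2.0 * np.ones_like(b)   # gradient arriving at a from the loss on b
grad_a += np.ones_like(a)        # += the loss attached directly at a
grad_x = grad_a * (x > 0)        # backward step; grad_x now *replaces* the
                                 # running total, like reading the next
                                 # (lower) layer's diff after net.backward
```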

Basis for googlenet weights

Currently, in the GoogLeNet model, the "content" representation goes to inception_3a/output (mainly) and conv2/3x3 (2e-4), and the "style" representation goes to 5 layers from conv1/7x7_s2 to inception_5a/output.
Is there some basis for this choice?

inception_3a/output is close to the input of the network. Activating this layer via the "deepdream" method produces only spiral patterns and edge strokes, not even "eyes".
In VGG-19, "content" goes into conv4_2, which is the 10th of 16 convolution layers and is placed after the 3rd maxpool layer. In GoogLeNet the 3rd maxpool is pool3, so the layers analogous to conv4_2 should be inception_4b, 4c, or 4d. I see only one test in the commit history, with inception_3b + inception_4a (50% each), and it looks like no higher layers were tested.
Has anyone searched for where the "content" representation should go in GoogLeNet?

PyPi

Hi there,

Was wondering if you had any plans / concerns about putting this up on PyPI (Python Package Index)? Otherwise, if I wanted to use this in a project of mine, how would you like me to commit your work into my repo? I will surely be linking back to your project in my README.md but I was wondering if you wanted all your files intact or can I just take style.py & the scripts directory since I believe that is all I'd need.

Thanks!

VGG model cannot run?

Hi!
I have successfully run the code and generated result.jpg with the default GoogLeNet model.
I tried to run VGG-19 by setting model_name = "vgg", but it ran 0 iterations and generated a result identical to the content image.
I found that the gradient in the data layer is zero. Could you kindly help me solve this problem?

Provide outputs directory by default

The first time I ran style-transfer, it failed because the outputs directory didn't exist. It worked after I created the folder.


It would be much better to provide this directory by default. I may send a pull request to address this.

Style isn't transferred successfully, only content

I use the command:
python style.py -s images/style/starry_night.jpg -c images/content/johannesburg.jpg -m vgg19 -g 0

The images and model load successfully, but the output image has only the content, not the style. Does anyone know what's going on?

batch_size

Hi! Where can I change batch_size? I can't run the vgg16 model on a GTX 960 4 GB!

Where does the parameter n_iter take effect in the minimize() function?

Excuse me, but I wonder where the parameter n_iter takes effect in the minimize() function.

if self.use_pbar and not verbose:
    self._create_pbar(n_iter)
    self.pbar.start()
    res = minimize(style_optfn, img0.flatten(), **minfn_args).nit
    self.pbar.finish()
else:
    res = minimize(style_optfn, img0.flatten(), **minfn_args).nit
return res

In this code, n_iter is used to create the pbar, and n_iter enters the style_optfn function as an options parameter, but in the definition of

style_optfn(x, net, weights, layers, reprs, ratio)

there is no iteration option.
Does anyone have ideas? Thank you!
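One plausible reading: n_iter is forwarded inside minfn_args as options={"maxiter": n_iter} (this is an assumption about the code, not confirmed), so the iteration cap is enforced by SciPy's L-BFGS-B driver rather than by style_optfn itself. A minimal SciPy sketch:

```python
import numpy as np
from scipy.optimize import minimize

n_iter = 5  # plays the role of the n_iter argument

# Minimize a simple quadratic; "maxiter" caps the L-BFGS-B iterations,
# while the objective function itself never sees n_iter.
res = minimize(lambda x: np.sum((x - 3.0) ** 2),
               np.zeros(2),
               method="L-BFGS-B",
               jac=lambda x: 2.0 * (x - 3.0),
               options={"maxiter": n_iter})
```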
