
URST's Introduction

Towards Ultra-Resolution Neural Style Transfer via Thumbnail Instance Normalization (AAAI'2022)

Official PyTorch implementation for our URST (Ultra-Resolution Style Transfer) framework.

URST is a versatile framework for ultra-high-resolution style transfer under limited GPU memory, and it can be easily plugged into most existing neural style transfer methods.

As the input resolution grows, the memory cost of URST hardly increases. In theory, it supports style transfer for images of arbitrary resolution.
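The key idea behind Thumbnail Instance Normalization (TIN) can be sketched as follows. This is a simplified NumPy illustration under our own assumptions, not the repository's actual TIN module (which operates on feature maps inside the network): each full-resolution patch is normalized with statistics computed from a shared thumbnail, so all patches are transformed consistently and stitch together without seams.

```python
import numpy as np

def thumbnail_instance_norm(patch, thumb, eps=1e-5):
    """Normalize a (C, H, W) patch with channel-wise statistics taken
    from the (C, h, w) thumbnail rather than from the patch itself."""
    # Because every patch of the same image shares the thumbnail's
    # mean/std, the stitched result has no normalization seams.
    mean = thumb.mean(axis=(1, 2), keepdims=True)  # shape (C, 1, 1)
    std = thumb.std(axis=(1, 2), keepdims=True)
    return (patch - mean) / (std + eps)
```

By contrast, ordinary instance normalization would compute mean/std per patch, producing visible brightness and contrast jumps at patch borders.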

One ultra-high-resolution stylized result at 12000 x 8000 pixels (i.e., 96 megapixels).

This repository is developed based on six representative style transfer methods, which are Johnson et al., MSG-Net, AdaIN, WCT, LinearWCT, and Wang et al. (Collaborative Distillation).

For details see Towards Ultra-Resolution Neural Style Transfer via Thumbnail Instance Normalization.

If you use this code for a paper please cite:

@inproceedings{chen2022towards,
  title={Towards Ultra-Resolution Neural Style Transfer via Thumbnail Instance Normalization},
  author={Chen, Zhe and Wang, Wenhai and Xie, Enze and Lu, Tong and Luo, Ping},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  year={2022}
}

Environment

  • Python 3.6, Pillow, tqdm, torchfile, PyTorch 1.1+ (for inference)

    pip install pillow
    pip install tqdm
    pip install torchfile
    conda install pytorch==1.1.0 torchvision==0.3.0 -c pytorch
  • tensorboardX (for training)

    pip install tensorboardX

Then, clone the repository locally:

git clone https://github.com/czczup/URST.git

Test (Ultra-high Resolution Style Transfer)

Step 1: Prepare images

  • Content images and style images are placed in examples/.
  • Since the ultra-high-resolution images are quite large, we do not include them in this repository. Please download them from this Google Drive.
  • All content images used in this repository are collected from pexels.com.

Step 2: Prepare models

  • Download models from this Google Drive. Unzip and merge them into this repository.

Step 3: Stylization

First, choose a specific style transfer method and enter the directory.

Then, please run the corresponding script. The stylized results will be saved in output/.

  • For Johnson et al., we use the PyTorch implementation Fast-Neural-Style-Transfer.

    cd Johnson2016Perceptual/
    CUDA_VISIBLE_DEVICES=<gpu_id> python test.py --content <content_path> --model <model_path> --URST
  • For MSG-Net, we use the official PyTorch implementation PyTorch-Multi-Style-Transfer.

    cd Zhang2017MultiStyle/
    CUDA_VISIBLE_DEVICES=<gpu_id> python test.py --content <content_path> --style <style_path> --URST
  • For AdaIN, we use the PyTorch implementation pytorch-AdaIN.

    cd Huang2017AdaIN/
    CUDA_VISIBLE_DEVICES=<gpu_id> python test.py --content <content_path> --style <style_path> --URST
  • For WCT, we use the PyTorch implementation PytorchWCT.

    cd Li2017Universal/
    CUDA_VISIBLE_DEVICES=<gpu_id> python test.py --content <content_path> --style <style_path> --URST
  • For LinearWCT, we use the official PyTorch implementation LinearStyleTransfer.

    cd Li2018Learning/
    CUDA_VISIBLE_DEVICES=<gpu_id> python test.py --content <content_path> --style <style_path> --URST
  • For Wang et al. (Collaborative Distillation), we use the official PyTorch implementation Collaborative-Distillation.

    cd Wang2020Collaborative/PytorchWCT/
    CUDA_VISIBLE_DEVICES=<gpu_id> python test.py --content <content_path> --style <style_path> --URST
  • For Multimodal Transfer, we use the PyTorch implementation multimodal_style_transfer.

    cd Wang2017Multimodal/
    CUDA_VISIBLE_DEVICES=<gpu_id> python test.py --content <content_path> --model <model_name> --URST
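Putting the pieces together, a typical run with the AdaIN variant might look like the following. The GPU id and image paths below are placeholders for illustration, not files guaranteed to ship with the repository:

    cd Huang2017AdaIN/
    CUDA_VISIBLE_DEVICES=0 python test.py --content ../examples/content/content.jpg --style ../examples/style/style.jpg --URST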

Optional arguments:

  • --patch_size: The maximum size of each patch. The default setting is 1000.
  • --style_size: The size of the style image. The default setting is 1024.
  • --thumb_size: The size of the thumbnail image. The default setting is 1024.
  • --URST: Use our URST framework to process ultra-high resolution images.
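To see why --patch_size bounds memory, consider how an image can be covered by bounded tiles. The sketch below is our own illustration of the idea, not the repository's code:

```python
def split_into_patches(height, width, patch_size=1000):
    """Cover a height x width image with non-overlapping tiles whose
    sides never exceed patch_size, returned as (top, left, h, w) boxes."""
    boxes = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            h = min(patch_size, height - top)
            w = min(patch_size, width - left)
            boxes.append((top, left, h, w))
    return boxes

# The 12000 x 8000 example above splits into 12 x 8 = 96 tiles, each
# processed in turn, so peak memory depends on patch_size, not image size.
boxes = split_into_patches(12000, 8000)
```

Each tile is stylized independently and written back into the output canvas, so only one tile's activations need to be resident on the GPU at a time.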

Train (Enlarge the Stroke Size)

Step 1: Prepare datasets

Download the MS-COCO 2014 dataset and WikiArt dataset.

  • MS-COCO

    wget http://msvocds.blob.core.windows.net/coco2014/train2014.zip
  • WikiArt

    • Either download it manually from Kaggle.
    • Or install kaggle-cli and download it by running:
    kg download -u <username> -p <password> -c painter-by-numbers -f train.zip

Step 2: Prepare models

The same as Step 2 in the test phase.

Step 3: Train the decoder with our stroke perceptual loss

  • For AdaIN:

    cd Huang2017AdaIN/
    CUDA_VISIBLE_DEVICES=<gpu_id> python trainv2.py --content_dir <coco_path> --style_dir <wikiart_path>
  • For LinearWCT:

    cd Li2018Learning/
    CUDA_VISIBLE_DEVICES=<gpu_id> python trainv2.py --contentPath <coco_path> --stylePath <wikiart_path>

License

This repository is released under the Apache 2.0 license as found in the LICENSE file.

URST's People

Contributors

czczup, rahulbhalley, ttuananh112


URST's Issues

TIN application

Hi,

Sorry to bother you again. Let's say, roughly (forgive my phrasing):

Using cysmith's neural transfer (L-BFGS optimizer), I process 4 x (500 x 500) slices of a 2000 x 2000 pixel image A, and I also process one whole-picture thumbnail of image A at 500 x 500 pixels (downsampled, I suppose, from the original 2000 x 2000 image).

Now, using your TIN method:

Would I be able to stitch these back together and then get rid of the seam problems around the stitched areas?

Is it possible to string together or isolate some code to achieve something like this?

Thanks again for your time and consideration.

  • Apologies if I've missed or misunderstood something from your method / paper.

Possible to change from the Adam optimizer to L-BFGS?

Is it possible to change from the Adam optimizer to L-BFGS?

I'm not particularly good at any of this (code etc.), so even a simple yes or no will suffice, if only to keep me from wasting hours cutting and pasting code that will never work.

thanks again

Google Colab

Hello!

Any chance you could make a Google Colab version? I am having trouble setting this up.

Li2017Universal - unhashable type: 'list'

    (Wang) C:\Users\Bob\URST\Li2017Universal>python test.py --content c:\users\bob\desktop\ds.jpg --style c:\users\bob\desktop\dsstyle.jpg --URST
    Traceback (most recent call last):
      File "test.py", line 166, in <module>
        wct = WCT(args).to(device)
      File "C:\Users\Bob\URST\Li2017Universal\util.py", line 16, in __init__
        vgg1 = torchfile.load(args.vgg1)
      File "C:\Users\Bob\anaconda3\envs\Wang\lib\site-packages\torchfile.py", line 424, in load
        return reader.read_obj()
      File "C:\Users\Bob\anaconda3\envs\Wang\lib\site-packages\torchfile.py", line 370, in read_obj
        obj._obj = self.read_obj()
      File "C:\Users\Bob\anaconda3\envs\Wang\lib\site-packages\torchfile.py", line 385, in read_obj
        k = self.read_obj()
      File "C:\Users\Bob\anaconda3\envs\Wang\lib\site-packages\torchfile.py", line 386, in read_obj
        v = self.read_obj()
      File "C:\Users\Bob\anaconda3\envs\Wang\lib\site-packages\torchfile.py", line 370, in read_obj
        obj._obj = self.read_obj()
      File "C:\Users\Bob\anaconda3\envs\Wang\lib\site-packages\torchfile.py", line 387, in read_obj
        obj[k] = v
    TypeError: unhashable type: 'list'

stroke size controlling

In your paper, you mention changing the decoder of AdaIN to change the stroke size.
What is the difference between the decoders, e.g. "decoder_stroke_perceptual_loss_1.pth"?
In my case, I would like to stylize an ultra-high-resolution image (8192 x 8192) while applying the same "huge" stroke size that I would get if I resized the picture to 1024 x 1024.

Possible to use TIN in Aesthetic UST method?

Hi @czczup! Thanks for your awesome work!

Since you've shown that many style transfer methods are compatible with TIN, I wonder if it's possible to use TIN in Aesthetic UST?

The following figure gives an overview of how AesUST works.

[Figure: AesUST architecture overview]

They introduced AesSA module which confuses me where to put the TIN module. Could you please help me out?

Best,
Rahul Bhalley

Wrong instructions in Readme

I want to use Wang2020Collaborative for style transfer. The README says to use test.py with the content and style arguments, but there is no test.py file, and main.py (which I think you mean) lacks these arguments. Could you please update the instructions in the README so that the code can be used easily?
