Code Monkey home page Code Monkey logo

dip_final's Introduction

title disqus
Digital Image Processing
hackmd

Digital Image Processing

Table of Contents

Requirements

  • torch (version: 1.2.0)
  • torchvision (version: 0.4.0)
  • Pillow (version: 6.1.0)
  • matplotlib (version: 3.1.1)
  • yacs (version: >= 0.1.4)

Downloads

Clone or download the whole project

git clone https://github.com/peter0749/DIP_Final.git

Preparing for training data and some of pretrained models.

  • MSCOCO train2014 is needed to train the network.
  • Our pretrained model is released here.
  • Also needs to download the pretrain embedding GloVe from here.
  • Our exsample images and videos can be found here

Usage

Arguments

Global arguments

  • config_file: Path to the config file
  • mode: Run the mode for the model, train for training the model, and inference for inference the image through a trained model.

Inference arguments

  • --imsize: Size for resizing input images (resize shorter side of the image)
  • --cropsize: Size for crop input images (crop the image into squares)
  • --cencrop: Flag for crop the center reigion of the image (default: randomly crop)
  • --check-point: Check point path for loading trained network
  • --content_path: Content image path to evalute the network
  • --style_path: Style image path to evalute the network
  • --mask_path: Mask image path for masked stylization
  • --style-strength: Content vs Style interpolation weight (1.0: style, 0.0: content, default: 1.0)
  • --interpolatoin-weights: Weights for multiple style interpolation
  • --patch-size: Patch size of style decorator (default: 3)
  • --patch-stride: Patch stride of style decorator (default: 1)

Configs

As you can in the configs/testing.yaml file, there are many hyperparameters used in training the model, so if you want to retrain the model with different settings from us, please feel free to rewrite it.

MODEL:
  USE_DATAPARALLEL: True # Use multi-gpu for training
  USABLE_GPUS: [0,1,2,3] # Specify the gpu device numbers you use

DATASET:
  DATA_ROOT: /tmp2/Avatar-Net/dataset/train2014 # Change the data root to the folder you download MSCOCO dataset.

IMG_PROCESSING:
  IMSIZE: 512 # Fix the image size for the first input to the model
  CROPSIZE: 256 # Crop the image size for data augmentation
  CENCROP: False # Whether center crop the image or not.

LOSS:
  FEATURE_WEIGHT: 0.1 # The affect weight on content while training 
  TV_WEIGHT: 1.0 # The affect weight on generalizing the model 

TRAINING:
  MAX_ITER: 8000 # Training iterations
  LEARNING_RATE: 0.001 # Learning rate while training
  BATCH_SIZE: 16 # You can reduce the batch size to meet your gpu memory limitation.
  USE_CUDA: True # Whether use cuda device for training 
  CHECK_PER_ITER: 100 # The time period for training, and do validation on each time step

OUTPUT:
  CHECKPOINT_PREFIX: avatarnet_batch16_iter8000_1.0e-03 # Name for the checkpoint for your training

Train

python main.py configs/testing.yaml train

Inference

Single style:

Only use one style image to transfer the style to the content image.

python main.py configs/testing.yaml inference --ckpt ./checkpoints/{checkpoint} --imsize 512 --cropsize 512 --cencrop --content_path {content image path} --style_path {style image path} --style-strength 1.0

Multiple style

Use more than two style images to transfer the multi-style to the content image.

python main.py configs/testing.yaml inference --ckpt ./checkpoints/{checkpoint} --imsize 512 --cropsize 512 --content_path {content image path} --style_path {style image 1 path} {style image 2 path} --interpolation-weights 0.5 0.5

Multiple style with Mask

Use more than two style images to transfer the corresponded style to the masked region from the original content image.

python main.py configs/testing.yaml inference --ckpt ./checkpoints/{checkpoint} --imsize 512 --cropsize 512 --content_path {content image path} --style_path {style image 1 path} {style image 2 path} --mask_path {mask image 1 path} {mask image 2 path} --interpolation-weights 1.0 1.0

Appendix and FAQ

:::info Find this document incomplete? Leave a comment! :::

tags: AvatarNet Style Transfer Multi-style transfer

dip_final's People

Contributors

jiafong avatar lluma avatar peter0749 avatar

Watchers

 avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.