
mmnet's Introduction

Towards Real-Time Automatic Portrait Matting on Mobile Devices

We tackle the problem of automatic portrait matting on mobile devices. The proposed model is aimed at attaining real-time inference on mobile devices with minimal degradation of model performance. Our model MMNet, based on multi-branch dilated convolution with linear bottleneck blocks, outperforms the state-of-the-art model and is orders of magnitude faster. The model can be accelerated four times to attain 30 FPS on a Xiaomi Mi 5 device with a moderate increase in the gradient error. Under the same conditions, our model has an order of magnitude fewer parameters and is faster than Mobile DeepLabv3 while maintaining comparable performance.

The trade-off between gradient error and latency on a mobile device. Latency is measured using a Qualcomm Snapdragon 820 MSM8996 CPU. The size of each circle is proportional to the logarithm of the number of parameters used by the model. The different circles for Mobile DeepLabv3 are created by varying the output stride and width multiplier, and each circle is marked with its width multiplier. Results using 128 x 128 inputs are marked with *; otherwise, inputs are 256 x 256. Notice that MMNet outperforms all other models, forming a Pareto front. The number of parameters for LDN+FB is not reported in their paper.

Requirements

  • Python 3.6+
  • TensorFlow 1.6

Installation

git clone --recursive https://github.com/hyperconnect/MMNet.git
pip3 install -r requirements/py36-gpu.txt

Dataset

The dataset for training and evaluation has to follow the directory structure depicted below. To use names other than train and test, use the --dataset_split_name argument of train.py or evaluate.py.

dataset_directory
  |___ train
  |   |__ mask
  |   |__ image
  |
  |___ test
      |__ mask
      |__ image
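For reference, the layout above can be created or verified with a short script (a sketch; the function names below are just for illustration and are not part of the repository):

```python
from pathlib import Path

# Expected layout: <dataset_directory>/<split>/{mask,image}
def make_dataset_skeleton(root, splits=("train", "test")):
    """Create the empty directory skeleton shown above."""
    for split in splits:
        for sub in ("mask", "image"):
            (Path(root) / split / sub).mkdir(parents=True, exist_ok=True)

def verify_dataset(root, splits=("train", "test")):
    """Return True if every split has both a mask/ and an image/ subdirectory."""
    root = Path(root)
    return all((root / split / sub).is_dir()
               for split in splits for sub in ("mask", "image"))
```

Each image in image/ is expected to have a matching mask in mask/ under the same split.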

Training

In the scripts directory, you can find example scripts for training and evaluating MMNet and Mobile DeepLabv3. Training scripts accept two arguments: the dataset path and the train directory. The dataset path has to point to a directory with the structure described in the previous section.

MMNet

Training of MMNet with depth multiplier 1.0 and input image size 256.

./scripts/train_mmnet_dm1.0_256.sh /path/to/dataset /path/to/training/directory

Mobile DeepLabv3

Training of Mobile DeepLabv3 with output stride 16, depth multiplier 0.5 and input image size 256.

./scripts/train_deeplab_os16_dm0.5_256.sh /path/to/dataset /path/to/training/directory

Evaluation

Evaluation scripts, like the training scripts, accept two arguments: the dataset path and the train directory. If the train directory argument points to a specific checkpoint file, only that checkpoint is evaluated; otherwise, the latest checkpoint is evaluated. It is recommended to run the evaluation scripts alongside the training scripts in order to get evaluation metrics for every checkpoint.
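The checkpoint-selection behavior described above can be sketched as follows (a sketch assuming TensorFlow's standard model.ckpt-&lt;step&gt; checkpoint naming; the scripts' actual logic may differ):

```python
import os
import re

def resolve_checkpoint(path):
    """If `path` is a checkpoint prefix, use it directly; if it is a training
    directory, pick the checkpoint with the highest global step.
    Assumes standard TensorFlow naming: model.ckpt-<step>.index"""
    if not os.path.isdir(path):
        return path  # treat as an explicit checkpoint prefix
    steps = []
    for name in os.listdir(path):
        m = re.match(r"(model\.ckpt-(\d+))\.index$", name)
        if m:
            steps.append((int(m.group(2)), m.group(1)))
    if not steps:
        return None  # no checkpoint written yet
    _, latest = max(steps)
    return os.path.join(path, latest)
```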

MMNet

./scripts/valid_mmnet_dm1.0_256.sh /path/to/dataset /path/to/training/directory

Mobile DeepLabv3

./scripts/valid_deeplab_os16_dm0.5_256.sh /path/to/dataset /path/to/training/directory

Demo

Refer to demo/demo.mp4.

License

Apache License 2.0

mmnet's People

Contributors

shurain


mmnet's Issues

How to continue training?

Is there an option to resume training from the last checkpoint if training is interrupted?

TFLite not giving output

We tried out your model and the output quality is good. Your paper mentions that you got good results at the TFLite level. The frozen .pb graph produces good output, but after TFLite conversion, the converted model's output is not correct. The weights of the convolution operations change substantially between the frozen graph and the converted TFLite model. We are using the fused batch norm approach. We need your help.

Where is the model?

Can you tell me where the model is if I want to run the demo?

Pre-trained model

We went through your model and the outputs were surprising. Could you share your pre-trained model with us? Does the model produce these results on a mobile CPU? You have handled quantized models; how did you manage to preserve the accuracy to such a good extent?

Model performance on other datasets

Hi,
Thank you so much for such a wonderful work.

Can this model be used to get real-time alpha mattes for images other than portraits? For example, would the outputs be good if we trained it on the Deep Image Matting dataset?

what is depth_multiplier?

Hi,
I see in the paper that a higher depth multiplier (0.5, 0.75, 1.0) leads to lower gradient error. My question is: what is the depth multiplier, and how much can I increase it?
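For context: in MobileNet-style networks, the depth (width) multiplier scales the number of channels in every layer, trading accuracy for compute. A minimal sketch of the common rounding convention (this is the MobileNet convention; MMNet's exact rounding rule may differ):

```python
def scaled_channels(base_channels, depth_multiplier, divisor=8):
    """Scale a layer's channel count by the depth multiplier, rounding to the
    nearest multiple of `divisor` and never going below `divisor`."""
    scaled = int(base_channels * depth_multiplier + divisor / 2) // divisor * divisor
    return max(divisor, scaled)
```

There is no hard upper bound on the multiplier; values above 1.0 widen the network and increase both accuracy and latency.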

Is there a log_file for reviewing?

When I train, the log is printed to the terminal, but I can't find a log file in the project path. Is there a log file for reviewing?
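For context: TF 1.x prints its logs to the terminal and does not write a log file by default. One possible workaround, assuming the standard "tensorflow" logger is used (the log path below is a placeholder), is to attach a FileHandler:

```python
import logging

# tf.logging in TF 1.x is a thin wrapper around the standard `logging` module,
# so attaching a FileHandler to the "tensorflow" logger captures its output.
def log_to_file(path, logger_name="tensorflow"):
    logger = logging.getLogger(logger_name)
    handler = logging.FileHandler(path)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return handler
```

Alternatively, redirecting stderr in the shell (e.g. `2>&1 | tee train.log`) captures the same output without code changes.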

tf.contrib.slim evaluation

Authors, thanks for open-sourcing your code.
I would like to know why you chose tf.contrib.slim instead of tf.layers or plain tf.keras.
I'm trying to reimplement your model using tf.keras, which could be simpler for experimentation, with tf.eager_execution for easier debugging. What implications should I keep in mind when using tf.keras?
Thanks!!

about quantizable_separable_convolution2d

Why was the op ordering changed from

Before: [SpaceToBatchND] -> [DepthwiseConv2dNative] -> [BatchToSpaceND] -> [normalize] -> [activate]

to

After : [SpaceToBatchND] -> [DepthwiseConv2dNative] -> [normalize] -> [activate] -> [BatchToSpaceND]

dataset

@shurain Hello, and thank you for releasing this great source code.
I want to train this model; how should the dataset be organized?

I understand the proposed directory structure.

dataset_directory
|___ train
| |__ mask
| |__ image
|
|___ test
|__ mask
|__ image

In that case, how should the original images and mask images be composed?

  1. What image format should be used? Does it matter whether it is png or jpg?
  2. I want to confirm how the mask regions are composed. I painted the mask in grayscale, with the background as 0 and the subject as 1; is that correct?

Also, is there a stopping condition for training? It is taking very long.

The GPU is a Tesla K80.

image
00001_top

mask
00001_top

Also, please tell me the input/output node names; I want to run freeze_graph. Please also check whether the covert_to_pb setting is usable.

Thank you.
