
mobile-semantic-segmentation's Introduction

Real-Time Semantic Segmentation on Mobile Devices

This project is an example of real-time semantic segmentation for a mobile app.

The architecture is inspired by MobileNetV2 and U-Net.

LFW (Labeled Faces in the Wild) is used as the dataset.

The goal of this project is to detect hair segments with reasonable accuracy and speed on mobile devices. Currently, it achieves 0.89 IoU.

More details about the speed vs. accuracy trade-off are available in my post.

Example of a predicted image.

Example application

  • iOS
  • Android (TODO)

Requirements

  • Python 3.8
  • pip install -r requirements.txt -f https://download.pytorch.org/whl/torch_stable.html
  • CoreML for iOS app.

About Model

At this time, there is only one model in this repository, MobileNetV2_unet. Like a typical U-Net architecture, it has encoder and decoder parts, which consist of the depthwise conv blocks proposed by MobileNets.

The input image is encoded down to 1/32 of its size and then decoded back up to 1/2. Finally, a scoring layer produces the mask, which is resized to the original resolution.
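For orientation, here is a schematic PyTorch sketch of that encoder-decoder flow. The module layout and channel counts are illustrative assumptions, not the actual MobileNetV2_unet (which uses MobileNetV2 blocks and U-Net style concatenation; additive skips are used below only to keep the sketch short):

    # Schematic sketch of the encode-to-1/32, decode-to-1/2, upsample-to-full flow.
    # Channel counts and the additive skip connections are simplifying assumptions.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class UNetFlowSketch(nn.Module):
        def __init__(self):
            super().__init__()
            # Encoder: five stride-2 stages take H x W down to H/32 x W/32.
            self.enc = nn.ModuleList([
                nn.Conv2d(ci, co, 3, stride=2, padding=1)
                for ci, co in [(3, 16), (16, 32), (32, 64), (64, 128), (128, 256)]
            ])
            # Decoder: four stride-2 transposed convs bring it back to H/2 x W/2.
            self.dec = nn.ModuleList([
                nn.ConvTranspose2d(ci, co, 2, stride=2)
                for ci, co in [(256, 128), (128, 64), (64, 32), (32, 16)]
            ])
            self.score = nn.Conv2d(16, 1, 1)  # 1-channel mask logits

        def forward(self, x):
            full_size = x.shape[2:]
            skips = []
            for stage in self.enc:
                x = F.relu(stage(x))
                skips.append(x)
            for stage, skip in zip(self.dec, reversed(skips[:-1])):
                x = F.relu(stage(x)) + skip   # skip connection (U-Net concatenates)
            x = self.score(x)                 # logits at H/2 x W/2
            return F.interpolate(x, size=full_size, mode='bilinear', align_corners=False)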

Steps for training

Data Preparation

Data is available at LFW. To get the mask images, refer to issue #11 for more details. Once you have the images and masks, put the face images and masks as shown below.

data/
  lfw/
    raw/
      images/
        0001.jpg
        0002.jpg
      masks/
        0001.ppm
        0002.ppm
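A quick sanity check for this layout, sketched here as a convenience (not a script from the repository), verifies that every image has a matching mask:

    # Sketch: verify every image under data/lfw/raw/images has a .ppm mask.
    from pathlib import Path

    img_dir = Path('data/lfw/raw/images')
    mask_dir = Path('data/lfw/raw/masks')

    for img_path in sorted(img_dir.glob('*.jpg')):
        mask_path = mask_dir / (img_path.stem + '.ppm')
        if not mask_path.exists():
            print(f'missing mask for {img_path.name}')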

Training

If you use 224 x 224 as the input size, pre-trained weights of MobileNetV2 are available. They will be downloaded automatically when you train the model with the following command.

cd src
python run_train.py params/002.yaml

The Dice coefficient is used as the loss function.
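For reference, a common Dice-based loss looks like the PyTorch sketch below; the repository's exact formulation may differ:

    # Illustrative Dice loss; the repository's exact implementation may differ.
    import torch

    def dice_loss(pred, target, eps=1e-7):
        # pred, target: (N, 1, H, W); pred holds probabilities in [0, 1].
        pred = pred.flatten(1)
        target = target.flatten(1)
        inter = (pred * target).sum(dim=1)
        dice = (2 * inter + eps) / (pred.sum(dim=1) + target.sum(dim=1) + eps)
        return 1 - dice.mean()  # perfect overlap gives loss 0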

Pretrained model

Input size  IoU   Download
224         0.89  Google Drive

Converting

Since the purpose of this project is to run the model on mobile devices, this repository contains some scripts to convert the models for iOS and Android.
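For the iOS side, a typical PyTorch-to-Core ML conversion with coremltools looks roughly like the sketch below. The stand-in model and file name are assumptions; the repository's own conversion scripts are the authoritative reference:

    # Rough Core ML conversion sketch; substitute a trained MobileNetV2_unet
    # for the stand-in model. See the repository's scripts for the real flow.
    import torch
    import torch.nn as nn
    import coremltools as ct

    model = nn.Sequential(nn.Conv2d(3, 1, 3, padding=1), nn.Sigmoid()).eval()
    example = torch.rand(1, 3, 224, 224)
    traced = torch.jit.trace(model, example)

    mlmodel = ct.convert(traced, inputs=[ct.TensorType(name='input', shape=example.shape)])
    mlmodel.save('Sketch.mlmodel')  # newer coremltools may require an .mlpackage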

TBD

  • Report speed vs. accuracy on mobile devices.
  • Convert PyTorch to Android using TensorFlow Lite.

mobile-semantic-segmentation's People

Contributors

akirasosa, chnghia


mobile-semantic-segmentation's Issues

Script to evaluate MobileUNet

Hi, could you provide a script to evaluate MobileUNet? I have trained the model and want to evaluate it. Thanks.

About the versions of Keras and TensorFlow

python train_full.py --img_file=data/images-128.npy --mask_file=data/masks-128.npy

... ...
File "/home/.local/lib/python2.7/site-packages/tensorflow/python/layers/base.py", line 1376, in init
layer.outbound_nodes.append(self)
AttributeError: 'Activation' object has no attribute 'outbound_nodes'

Hi, I ran the training code above and got the error shown. Could you tell me which versions of Keras and TensorFlow you used? Thank you!
My versions: Keras 2.2.0, TensorFlow 1.8.0

Training Issue

When I run this command:

python train_full.py \
  --img_file=/path/to/images.npy \
  --mask_file=/path/to/masks.npy

I got this error:

Traceback (most recent call last):
  File "train_full.py", line 96, in <module>
    train(**vars(args))
  File "train_full.py", line 21, in train
    train_gen, validation_gen, img_shape = load_data(img_file, mask_file)
  File "/home/lafi/Desktop/HairSegmentation/mobile-semantic-segmentation-master/data.py", line 60, in load_data
    train_img_gen.fit(images)
  File "/usr/local/lib/python2.7/dist-packages/keras/preprocessing/image.py", line 675, in fit
    'Got array with shape: ' + str(x.shape))
ValueError: Input to .fit() should have rank 4. Got array with shape: (0,)
Please, how can I fix it?
Thank you,
Lafi

How to calculate the mean and std value?

Hello, in the data.py script, the standardize function is as below:
def standardize(images, mean=None, std=None):
    if mean is None:
        # These values are available from all images.
        mean = [[[29.24429131, 29.24429131, 29.24429131]]]
    if std is None:
        # These values are available from all images.
        std = [[[69.8833313, 63.37436676, 61.38568878]]]
    x = (images - np.array(mean)) / (np.array(std) + 1e-7)
    return x
But when I calculate the per-channel mean and std of all the images with np.mean() and np.std(), the results do not match the values above. Does anyone know how these mean and std values were calculated?
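For reference, per-channel statistics over a stack of images shaped (N, H, W, 3) are usually computed as in this sketch; whether this reproduces the hard-coded values in data.py is unknown:

    # Per-channel mean/std over images shaped (N, H, W, 3); whether this matches
    # the values hard-coded in data.py is unknown.
    import numpy as np

    images = np.random.randint(0, 256, size=(10, 128, 128, 3)).astype(np.float64)
    mean = images.mean(axis=(0, 1, 2))  # one value per channel
    std = images.std(axis=(0, 1, 2))
    print(mean, std)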

ImportError: cannot import name 'relu6'

Hi,

with Keras 2.2.2, there is a problem importing DepthwiseConv2D and relu6 in MobileUNet.py.

DepthwiseConv2D can probably be imported from keras.layers, but I don't know where to find relu6.
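One common workaround, sketched here and untested against this repository, is to define relu6 via the Keras backend and pass it as a custom object when loading models:

    # Workaround sketch: define relu6 yourself instead of importing it.
    from keras import backend as K
    from keras.layers import DepthwiseConv2D  # lives in keras.layers in Keras 2.2.x

    def relu6(x):
        return K.relu(x, max_value=6)

    # When loading a saved model, pass it along:
    # load_model(path, custom_objects={'relu6': relu6, 'DepthwiseConv2D': DepthwiseConv2D})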

Face segmentation

Hi

The .ppm files of the LFW dataset have the face in green, the hair in red, and the remaining area in blue.

Suppose I want to train the model for face segmentation; what changes would need to be made?
Please let me know.

Regards
Gopi. J
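As a sketch (not code from the repository), one way to retarget training at the face class is to build binary masks from the green channel instead of the red one:

    # Sketch: binary face mask from an LFW part-labels .ppm; channel 0 = hair
    # (red), 1 = face (green), 2 = background (blue).
    import numpy as np
    from PIL import Image

    label = np.array(Image.open('data/lfw/raw/masks/0001.ppm'))
    face_mask = (label.argmax(axis=-1) == 1).astype(np.uint8)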

I trained with my own dataset; the input size is 224x224, but the output is 112x112

Hi, I would like to use your training method on my own dataset and replace the Hair mlmodel in the iOS sample with my own mlmodel.
However, after training and converting to an mlmodel, the input is 224 x 224 color but the output is 112 x 112 color. May I ask how to make the output the same as the iOS sample hair mlmodel (MultiArray (Double 1 x 224 x 224))?

Have you tried the model on real-world images?

Hi,
I have tried running the model produced by training on sample images of normal real-world scenes, and I'm getting very bad results. Any idea why? The training and validation scores seem alright.

pred.py question

When I run pred.py and save the plot, I get predicted images that are not clean like the ones in your GitHub.

Any idea?

Why are the predicted values strange?

I trained a model with my own data and tested it with pred.py.

It seems there is something wrong. What could be the reason?

I trained with my own data; my masks' pixel values are 0, 1, 2, 3, 4.

dataset transform?

Hi,
why is the same color jitter transform applied to the mask? As far as I know, the mask is just the label that stands for per-pixel classification.

README Training Instructions Broken

@akirasosa I tried to follow the instructions in the repo README and received the following error:
AttributeError: Can't pickle local object 'dice_loss.<locals>.fn'
Am I doing something incorrectly, or is this a common error?

image size not correct

#1 Documentation issue
The correct command line to set the image size is

python data.py --img_size=128

not

python data.py --image_size=128

#2 Training issue
When the data is generated with the default size of 192 and I start training, the training process gets killed after this log:

lr: 0.100000
Epoch 1/250

Variable classes

I can't see any way to define the number of classes detected. How would I increase the number of classes?

EDIT: It seems to me it should be the last Conv2D layer in the model; that is, changing Conv2D(1, (1,1) ...)(b18) to Conv2D(num_classes, (1,1) ...)(b18). Of course, this gives me a shape mismatch error. Ideally, I'd like my output to be a (1, height, width) matrix where each value indicates the inferred class for that pixel (i.e., the most highly activated channel for that pixel). I'm pretty new to Keras, so any thoughts are appreciated.
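For illustration, a multi-class head along the lines described in the EDIT might look like this sketch; b18's shape, num_classes, and the training details are assumptions, untested against this repository:

    # Sketch of a multi-class segmentation head; the Input stands in for b18.
    import numpy as np
    from keras.layers import Conv2D, Input
    from keras.models import Model

    num_classes = 5
    feat = Input(shape=(128, 128, 16))  # stand-in for the last decoder features
    probs = Conv2D(num_classes, (1, 1), activation='softmax')(feat)
    model = Model(feat, probs)

    # Per-pixel class = most activated channel, giving a (1, height, width) map.
    out = model.predict(np.zeros((1, 128, 128, 16)))
    classes = out.argmax(axis=-1)

The masks would then need to be one-hot encoded (or the model trained with sparse categorical cross-entropy) rather than being a single binary channel.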

integrate with Lightning ecosystem CI

Hello, and we're so happy to see you use PyTorch Lightning! 🎉
Just wondering if you have already heard about the quite new PyTorch Lightning (PL) ecosystem CI, which we would like to invite you to. You can check out our blog post about it: Stay Ahead of Breaking Changes with the New Lightning Ecosystem CI.
As you use the PL framework for your cool project, we would like to enhance your experience and offer you safe updates to our future releases. At the moment you run tests with a particular PL version, but it may accidentally happen that the next version is incompatible with your project... 😕 We do not intend to change anything on our project side, but we do have a solution: the ecosystem CI tests both your latest development head and ours, so we can find incompatibilities very early and prevent eventually releasing a bad version. 👍

What needs to be done?

What will you get?

  • scheduled nightly testing configured for development/stable versions
  • Slack notification if something goes wrong, so you can investigate
  • testing also on multi-GPU machine as our gift to you 🐰

cc: @Borda

Android app

Hello,
I'm asking about the Android app because I want to develop it. Could you please give me some details about what I have to do and which scripts need to be ported from Python to Java?
Thanks
Lafi

About the edge

Hi @akirasosa ,
Your results look good. How did you process the edges? Bilinear interpolation will smooth out the edge detail, right?

Negative Loss

Thank you for posting this repo; it's been very helpful.
I am trying to run full training on the LFW dataset, and I'm seeing a negative loss value.

[==============================] - 17s 2s/step - loss: -0.0873

Is this expected?

Matching shapes

First of all thanks for this great repo and project.

I'm using images of size 240 instead of 128, and when running the train_full script (or any other training method) I receive the following error during net instantiation (building):

ValueError: Concatenate layer requires inputs with matching shapes except for the concat axis. Got inputs shapes: [(None, 16, 16, 512), (None, 15, 15, 512)]

This error happens when the following layer is constructed:

up1 = concatenate([
        Conv2DTranspose(filters, (2, 2), strides=(2, 2), padding='same')(b13),
        b11,
    ], axis=3)

Any idea why? or how to solve it?

Furthermore, how would you suggest using the net for an image of arbitrary size (w, h) where w != h?
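For context, this kind of mismatch typically appears when the input size is not a multiple of the network's total downsampling factor (240 is not divisible by 32, so one path rounds up and upsamples back to 16 while the skip at 1/16 resolution is 15). One common workaround, sketched under the assumption that the effective stride is 32, is to pad the input up to the next multiple:

    # Sketch: pad an image to the next multiple of 32 so encoder/decoder
    # feature maps align; the factor 32 is an assumption about this network.
    import numpy as np

    def pad_to_multiple(img, multiple=32):
        h, w = img.shape[:2]
        ph, pw = (-h) % multiple, (-w) % multiple
        return np.pad(img, ((0, ph), (0, pw), (0, 0)), mode='reflect')

    print(pad_to_multiple(np.zeros((240, 240, 3))).shape)  # (256, 256, 3)

This also suggests an answer for non-square (w, h) inputs: pad each side independently to a multiple of the stride.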

Please consider Hydra

Hi,
I am the author of Hydra and OmegaConf, which you are already using.
I think Hydra is much better suited for a project like this one than OmegaConf alone; please take a look.

How to run the model?

Hi, what are the steps to run the file? I tried to load the model in run_eval.py, but it's throwing an error. Is there any other way to run it?

How long to train the model?

I'm planning on mimicking your architecture on the Cityscapes dataset. Do you mind sharing some details on the training phase (hardware, and how long it took)?

how to fine-tune

Hey, after running train_full.py I got a good result, but not good enough.
When I tried

python train_top_model.py \
  --img_file=/tmp/images.npy \
  --mask_file=/tmp/masks.npy

python train_fine_tune.py \
  --img_file=/tmp/images.npy \
  --mask_file=/tmp/masks.npy

I got a terrible result... I want to know the correct training method.

CoreML model crashes in real device.

It generates a non-trained CoreML model.

python coreml-converter-bench.py

If I run the generated model in the iOS simulator, it works. But if I run it on a real device, it crashes.

I use the following Swift unit test code to run it.

    func testConvExample() {
        guard let input_data = try? MLMultiArray(shape:[3, 128, 128], dataType:MLMultiArrayDataType.float32) else {
            fatalError("Unexpected runtime error. MLMultiArray")
        }
        let model = mu_128_1_025()
        do {
            try model.prediction(data: input_data)
            print("predicted------------")
        } catch  {
            print("error------------")
        }
    }

LFW related

Hi

I see the comments below in the README:

Data is available at LFW. Put the images of faces and masks as shown below.

data/
  raw/
    images/
      0001.jpg
      0002.jpg
    masks/
      0001.ppm
      0002.ppm

But from the link http://vis-www.cs.umass.edu/lfw/part_labels/#download, I can download files like lfw-funneled.tgz. If I extract it, I get folders with person names and the corresponding jpg files.
Do I need to manually flatten the folder structure and rearrange those files as *0001.jpg, *0002.jpg, etc., similar to the ppm file list?

FYI, I got the ppm files from https://drive.google.com/file/d/1TbQ24nIc3GGNWzV_GGX_D-1WpI2KOGii/view (it would be great if you could share the method/code used to generate the ppm files).

Regards
Gopi. J
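A sketch of that flattening step (paths and naming are assumptions drawn from the question, not instructions from the author):

    # Sketch: flatten the per-person LFW folders into a single images/ directory.
    import shutil
    from pathlib import Path

    src = Path('lfw_funneled')         # extracted from lfw-funneled.tgz
    dst = Path('data/lfw/raw/images')
    dst.mkdir(parents=True, exist_ok=True)

    for jpg in src.rglob('*.jpg'):
        shutil.copy(jpg, dst / jpg.name)  # names like Aaron_Eckhart_0001.jpg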

Loading model with custom loss in coreml

I use the following loss in Keras:

def matting_loss(x):
    def loss(y_true, y_pred):
        La = dice_coef_loss(y_true, y_pred)
        mul_true = K.concatenate([y_true, y_true, y_true], axis=-1) * x
        mul_pred = K.concatenate([y_pred, y_pred, y_pred], axis=-1) * x
        Lcolor = K.mean(K.sqrt(K.square(mul_true - mul_pred) + K.epsilon()))
        return La + 0.5 * Lcolor
    return loss

but I cannot convert it to CoreML. Can anyone help me?

output size of the converted model is wrong

Hi,
I tried your coreml_converter.py script and the output layer (identifier 597) is:
tensor_type {
  elem_type: FLOAT
  shape {
    dim { dim_value: 1 }
    dim { dim_value: 3 }
    dim { dim_value: 224 }
    dim { dim_value: 224 }
  }
}

The identifier is 597 and not 595 as you said, because I enabled your interpolate layer (it was commented out in your original code):
x = interpolate(x, scale_factor=2, mode='bilinear', align_corners=False)

Now, I have two questions:
(1) Why is the output 3x224x224 and not 1x224x224?
(2) The interpolate layer raises an error when I try to convert. Is there another way to upsample from 112x112 to 224x224?

Thanks.

How to verify a check point model?

I am trying to verify a checkpoint model by first converting it to pb format.

python tf-converter.py --input_model_path artifacts/checkpoint_weights.10--0.51.h5

but I got an error like the one below:

Using TensorFlow backend.
Traceback (most recent call last):
  File "tf-converter.py", line 58, in <module>
    main(**vars(args))
  File "tf-converter.py", line 24, in main
    model = load_model(input_model_path, custom_objects=custom_objects())
  File "/home/ubuntu/.pyenv/versions/keras/lib/python3.5/site-packages/keras/models.py", line 238, in load_model
    raise ValueError('No model found in config file.')
ValueError: No model found in config file.

Do you know what could be wrong?

Is there any easy way to verify the h5 model directly on the PC?
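For what it's worth, "No model found in config file" usually means the .h5 file contains only weights, not the architecture. The usual workaround, sketched below with a hypothetical import path and constructor, is to rebuild the model in code and load the weights into it:

    # Sketch: rebuild the architecture, then load a weights-only .h5 file.
    # The import path and constructor arguments are hypothetical.
    import numpy as np
    from nets.MobileUNet import MobileUNet

    model = MobileUNet(input_shape=(128, 128, 3))
    model.load_weights('artifacts/checkpoint_weights.10--0.51.h5')

    # Quick PC-side check: predict on a dummy batch and inspect the shape.
    print(model.predict(np.zeros((1, 128, 128, 3))).shape)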

Any Inference Speed Record in CPU Mode

I am looking for a semantic segmentation model that can run fast in CPU mode.
I appreciate that your post offers a very detailed analysis of the runtime and accuracy of MobileNetV2-UNet variants.
However, the mobile devices you experimented on all have an embedded GPU. I am more concerned with runtime and accuracy in CPU mode (hopefully the model can run at >= 10 FPS).
Did you do any similar analysis before? Or are you confident that the model (MobileNetV2-UNet) can reach at least 10 FPS on a normal MacBook Air (8 GB memory, i7 CPU)?

FYI, I tried your model (image size 224 x 224) on my laptop and found that inference takes ~0.3 s. I'm not sure how I could optimize the speed.
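For reproducible numbers, a simple timing loop like the sketch below is one way to measure per-frame CPU latency; the stand-in model is an assumption, so substitute the actual MobileNetV2_unet:

    # Sketch: average CPU inference latency over repeated runs.
    import time
    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Conv2d(3, 1, 3, padding=1)).eval()  # stand-in
    x = torch.rand(1, 3, 224, 224)

    with torch.no_grad():
        for _ in range(5):                # warm-up runs
            model(x)
        t0 = time.perf_counter()
        n = 50
        for _ in range(n):
            model(x)
        print(f'{(time.perf_counter() - t0) / n * 1000:.1f} ms / frame')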

the last depthwise_conv_block(b18)

I noticed that the last depthwise_conv_block (b18) is commented out, and you used a standard conv layer to replace it. This standard conv layer costs about 20% of the computation. So why not use a depthwise_conv_block like the layers above?
