
Semantic Segmentation

This project is about labeling the pixels of a road in images using a Fully Convolutional Network (FCN). The project is written in Python 3.6 with TensorFlow as the DNN framework; monitoring and debugging are done with TensorBoard and Visual Studio Code.

Results

Test results produced by the project's testing function in helper.py

You can see that some results are really good and some are not. There is, of course, much more work to do to improve the results, but I would start with a larger dataset.

[Test result images: test_result_0 .. test_result_5]

Images and label masks fed into the NN, plotted in TensorBoard. Each window shows three images:

  1. Augmented input image (in these examples you can spot brightness reduction, rotations and a slight blur)
  2. Label mask
  3. NN output

[Screenshot: tensorboard_images]

Training loss during one of the training sessions, plotted in TensorBoard

[Screenshot: training_loss]

Project requirements

Make sure you are using Python 3.x

Install the Python dependencies using requirements.txt. If you are using TensorFlow compiled from source, remove the tensorflow line from the requirements first.

pip install --upgrade -r requirements.txt

Or install the Python dependencies manually:
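
The authoritative list is requirements.txt; purely as an educated guess based on the libraries referenced in this README (TensorFlow and OpenCV for cv2 are mentioned, NumPy is assumed), a manual installation would look something like:

pip install tensorflow numpy opencv-python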

Dataset

Download the Kitti Road dataset from here. Extract the dataset into the data folder. This will create the folder data_road with all the training and test images.
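
Assuming the downloaded archive is named data_road.zip (check the actual file name), the extraction could look like:

    mkdir -p data
    unzip data_road.zip -d data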

Running the project

Run python main.py --help to see the project options. The output should look like this:

usage: main.py [-h] [--image_shape IMAGE_SHAPE [IMAGE_SHAPE ...]]
               [--num_classes NUM_CLASSES] [--epochs EPOCHS]
               [--batch_size BATCH_SIZE] [--learning_rate LEARNING_RATE]
               [--data_dir DATA_DIR] [--runs_dir RUNS_DIR]
               [--test_name TEST_NAME] [--chk_path CHK_PATH]
               [--pb_path PB_PATH] [--mode MODE]

optional arguments:
  -h, --help            show this help message and exit
  --image_shape IMAGE_SHAPE [IMAGE_SHAPE ...]
                        Resized image shape which will be used as input for
                        neural net.
  --num_classes NUM_CLASSES
                        Number of classes.
  --epochs EPOCHS       Number of epochs.
  --batch_size BATCH_SIZE
                        Number of batches.
  --learning_rate LEARNING_RATE
                        Optimizer initial learning rate.
  --data_dir DATA_DIR   Data directory path.
  --runs_dir RUNS_DIR   Runs directory path.
  --test_name TEST_NAME
                        Test name, used when create log dir with summaries as
                        prefix
  --chk_path CHK_PATH   Re-save checkpoint path for optimization. If not set
                        then won't save anything.
  --pb_path PB_PATH     Path to optimized FCN model for inferece.
  --mode MODE           Run code in possible modes:
                        --mode train : Will train and save mode. Afterwards test and save results.
                        --mode inference_model : Will re-save checkpoint path for optimization. For this --chk_path must be provided
                        --mode inference_test : Will run inference model on test video. --pb_path must be provided
                        --mode project_test : Will only perform project unit tests.

The descriptions should be clear enough to start working with the code; however, I recommend first running main.py in test mode, which will only perform the project unit tests and, if necessary, download some required files:

python main.py --mode project_test

Training

Run

To train the FCN using custom parameters, run:

python main.py --mode train --test_name MyFirstTest --learning_rate 1e-6 --batch_size 10 --epochs 25 --num_classes 2 --image_shape 160 576 3

Note that --image_shape is not the original image shape but the shape the original will be resized to before being fed to the NN.

Once training is completed, the FCN model will be saved in the ./data/vgg_fcn/ directory, the script will run the helper.py gen_test_output test, and the results will be saved in the ./runs directory.

Monitoring

While training, the terminal will output the loss and some other useful information, but you may want more insight into what is going on. For this I've created TensorFlow summaries that are updated while the model is being trained.

Run tensorboard from any directory by providing the full path to the logdir (created by main.py):

tensorboard --logdir=/full/path/to/logdir/

TensorBoard will then print a link that you need to open in a browser. That's it: you can now check how the loss changes over time, inspect the images that are fed into the NN, and more (just add more summaries).
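
Adding a summary only takes a couple of lines of TF 1.x code. The snippet below is an illustrative sketch, not the exact code in main.py; the tensor names and the loss expression are placeholders:

    import tensorflow as tf

    # Illustrative placeholders; in main.py these would be the real graph tensors.
    input_images = tf.placeholder(tf.float32, [None, 160, 576, 3], name="image_input")
    loss = tf.reduce_mean(tf.square(input_images))  # stand-in for the real cross-entropy loss

    # Summaries picked up by TensorBoard.
    tf.summary.scalar("loss", loss)
    tf.summary.image("augmented_input", input_images, max_outputs=3)

    merged = tf.summary.merge_all()
    writer = tf.summary.FileWriter("/full/path/to/logdir", tf.get_default_graph())
    # In the training loop: summary = sess.run(merged, feed_dict=...); writer.add_summary(summary, step)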

Note that summaries are flushed to the logdir every 60 seconds, so you may need to wait a little bit for the loss and images to appear in TensorBoard.

Optimization

Before optimization we must create a .pbtxt model description. Run:

python main.py --mode inference_model --chk_path path-to-chk

Once that is done we can optimize the trained model for inference. Edit all the paths in the optimization script and run bash optimize_for_inference.sh. This script will create new models in protobuf (.pb) format.
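
For reference, the TF 1.x command-line tools that such a script typically wraps look like this; the paths and node names below are placeholders, not the project's actual values:

    python -m tensorflow.python.tools.freeze_graph \
        --input_graph=path/to/model.pbtxt \
        --input_checkpoint=path/to/model.ckpt \
        --output_node_names=logits \
        --output_graph=path/to/frozen.pb

    python -m tensorflow.python.tools.optimize_for_inference \
        --input=path/to/frozen.pb \
        --output=path/to/optimized.pb \
        --frozen_graph=True \
        --input_names=image_input \
        --output_names=logits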

NB! Currently the generated .pbtxt file ends up being huge because we use tf.saved_model.loader.load and all the variables there are constants, so model.pbtxt contains the model description as well as all the constant weights. Because of this I wasn't able to run the optimization; it failed with a memory error (consumed all 64 GB of RAM).

TODO: Restore the model with TF code and all weights as variables, not constants.
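
One way to approach that TODO is to rebuild the graph from the checkpoint's .meta file, restore the weights as variables, and only fold them into constants at export time. A hedged sketch, with the checkpoint path and output node name as placeholders:

    import tensorflow as tf

    with tf.Session() as sess:
        # Rebuild the graph and restore the weights as variables
        # (instead of loading them as constants via tf.saved_model.loader.load).
        saver = tf.train.import_meta_graph("path-to-chk.meta")
        saver.restore(sess, "path-to-chk")

        # Only at export time are the variables folded into constants.
        frozen = tf.graph_util.convert_variables_to_constants(
            sess, sess.graph.as_graph_def(), ["logits"])  # placeholder output node name

        with tf.gfile.GFile("frozen_model.pb", "wb") as f:
            f.write(frozen.SerializeToString())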

Run Inference on Video

TODO: Run the inference model on a video and monitor timing and visual results.

python main.py --mode inference_test --pb_path path-to-optimized-protobuf-model
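
A rough sketch of what that inference loop could look like with OpenCV and a frozen graph; the tensor names, model path and video path are placeholders, not the project's exact ones:

    import cv2
    import numpy as np
    import tensorflow as tf

    # Load the optimized protobuf model (path is a placeholder).
    graph_def = tf.GraphDef()
    with tf.gfile.GFile("optimized_model.pb", "rb") as f:
        graph_def.ParseFromString(f.read())

    with tf.Graph().as_default() as graph:
        tf.import_graph_def(graph_def, name="")
        image_in = graph.get_tensor_by_name("image_input:0")  # placeholder tensor name
        logits = graph.get_tensor_by_name("logits:0")          # placeholder tensor name

    cap = cv2.VideoCapture("test_video.mp4")  # placeholder video path
    with tf.Session(graph=graph) as sess:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            resized = cv2.resize(frame, (576, 160))  # width x height expected by the net
            out = sess.run(logits, feed_dict={image_in: resized[np.newaxis]})
            # ... colour the road pixels and write/display the frame here
    cap.release()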

Description

[Figures: VGG16 in numbers; VGG16 / FCN-8s net architecture]
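
The network follows the FCN-8s pattern on top of a VGG16 encoder: 1x1 convolutions on the encoder outputs, transposed convolutions for upsampling, and skip connections from the earlier pooling layers. A minimal TF 1.x sketch of that wiring (layer names and kernel sizes are illustrative, not taken from this project's code):

    import tensorflow as tf

    def fcn8s_decoder(vgg_layer3_out, vgg_layer4_out, vgg_layer7_out, num_classes):
        # 1x1 convolutions reduce each VGG output to num_classes channels.
        l7 = tf.layers.conv2d(vgg_layer7_out, num_classes, 1, padding='same')
        l4 = tf.layers.conv2d(vgg_layer4_out, num_classes, 1, padding='same')
        l3 = tf.layers.conv2d(vgg_layer3_out, num_classes, 1, padding='same')

        # Upsample x2 and add the layer 4 skip connection.
        up1 = tf.layers.conv2d_transpose(l7, num_classes, 4, strides=2, padding='same')
        up1 = tf.add(up1, l4)

        # Upsample x2 again and add the layer 3 skip connection.
        up2 = tf.layers.conv2d_transpose(up1, num_classes, 4, strides=2, padding='same')
        up2 = tf.add(up2, l3)

        # Final x8 upsampling back to the input resolution.
        return tf.layers.conv2d_transpose(up2, num_classes, 16, strides=8, padding='same')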

Input image augmentation

In our case we have a small dataset, so we need to deal with overfitting. One effective way to do that is to augment the images before they reach the NN. For example, flipping an image vertically gives the NN a completely new input; this alone doubles the dataset size. You can check all the augmentation functions and their descriptions in augmentation.py and how they are used in helper.py (a short sketch of two of them follows the list below).

  • random_brightness: randomly adds or subtracts pixel values in the range -50 .. 40, applied to the whole batch
  • random_noise: 50% chance that random Gaussian noise will be applied to the batch
  • random_blur: randomly blurs a single image with cv2.GaussianBlur() in the range 0 .. 5
  • random_flip: 50% chance that a single image and its corresponding mask will be flipped vertically
  • random_shifts: randomly shifts a single image and its corresponding mask up or down and to the left or right. Horizontal shifts -20 .. 20 px, vertical shifts -35 .. 35 px
  • random_rotations: randomly rotates a single image and its corresponding mask in the range -6 .. 6 degrees
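
The actual implementations live in augmentation.py; purely as an illustration of the style (not the project's exact code), two of them might look roughly like this:

    import cv2
    import numpy as np

    def random_brightness(batch, low=-50, high=40):
        # Shift all pixel values of the batch by one random amount, then clip to [0, 255].
        shift = np.random.randint(low, high + 1)
        return np.clip(batch.astype(np.int16) + shift, 0, 255).astype(np.uint8)

    def random_blur(image):
        # Blur a single image with a randomly sized Gaussian kernel
        # (kernel size must be odd; 1 means effectively no blur).
        k = int(np.random.choice([1, 3, 5]))
        return cv2.GaussianBlur(image, (k, k), 0)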
