
deeplabv3-tensorflow's Introduction

DeepLabV3 Semantic Segmentation

Reimplementation of DeepLabV3 Semantic Segmentation

This is a (re-)implementation of DeepLabv3 (Rethinking Atrous Convolution for Semantic Image Segmentation) in TensorFlow, for semantic image segmentation on the PASCAL VOC dataset. The implementation is based on DrSleep's implementation of DeepLabV2 and CharlesShang's tfrecord implementation.

Features

  • Tensorflow support
  • Multi-GPUs on single machine (synchronous update)
  • Multi-GPUs on multi servers (asynchronous update)
  • ImageNet pre-trained weights
  • Pre-training on MS COCO
  • Evaluation on VOC 2012
  • Multi-scale evaluation on VOC 2012

Requirements

Tensorflow 1.4

python 3.5
tensorflow 1.4
CUDA  8.0
cuDNN 6.0

Tensorflow 1.2

python 3.5
tensorflow 1.2
CUDA  8.0
cuDNN 5.1

Code written for TensorFlow 1.4 is compatible with TensorFlow 1.2; this was tested on a single-GPU machine.

Installation

sh setup.sh

Train

  1. Configure config.py.
  2. Run python3 convert_voc12.py --split-name=SPLIT_NAME. This generates a tfrecord file in $DATA_DIRECTORY/records.
  3. Single GPU: run python3 train_voc12.py (validation mIOU is computed every SAVE_PRED_EVERY steps).
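
For readers unfamiliar with the validation metric reported in step 3, here is a minimal pure-Python sketch of mean intersection-over-union (mIOU). It is not the repository's code: the function names `confusion_matrix` and `mean_iou` are hypothetical, and treating label 255 as an ignored "void" label is an assumption based on the usual PASCAL VOC convention.

```python
def confusion_matrix(labels, preds, num_classes, ignore_label=255):
    """Count (ground truth, prediction) pairs over flattened pixel lists.

    matrix[i][j] is the number of pixels with ground-truth class i that
    were predicted as class j. Pixels labelled `ignore_label` are skipped
    (assumption: 255 marks void pixels, as is common for PASCAL VOC).
    """
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for gt, pred in zip(labels, preds):
        if gt == ignore_label:
            continue
        matrix[gt][pred] += 1
    return matrix


def mean_iou(matrix):
    """Average per-class IoU = TP / (TP + FP + FN) over classes that occur."""
    ious = []
    for c in range(len(matrix)):
        true_pos = matrix[c][c]
        gt_total = sum(matrix[c])                    # row sum: TP + FN
        pred_total = sum(row[c] for row in matrix)   # column sum: TP + FP
        union = gt_total + pred_total - true_pos
        if union > 0:                                # skip absent classes
            ious.append(true_pos / union)
    return sum(ious) / len(ious) if ious else 0.0
```

In practice the confusion matrix would be accumulated over the whole validation set before calling `mean_iou` once, rather than averaging per-image scores.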

Performance

This repository implements only MG(1, 2, 4), ASPP and Image Pooling. Training starts from scratch and took almost 2 days on a single GTX 1080 Ti. I changed the learning rate policy from the paper: instead of the 'poly' policy, I started with a learning rate of 0.01, lowered it to a fixed 0.005 and then 0.001 when seg_loss stopped decreasing, and used 0.001 for the rest of training.
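
To make the difference between the two learning rate policies concrete, the sketch below compares them in plain Python. It is illustrative only: `poly_lr` follows the 'poly' formula from the paper with power 0.9, while the step boundaries in `stepwise_lr` are hypothetical, since the rates in this repo were switched by hand when seg_loss plateaued rather than at fixed steps.

```python
def poly_lr(step, max_steps, base_lr=0.01, power=0.9):
    """'poly' policy: base_lr * (1 - step / max_steps) ** power."""
    step = min(step, max_steps)  # clamp so the rate never goes negative
    return base_lr * (1.0 - step / max_steps) ** power


def stepwise_lr(step, boundaries, values):
    """Piecewise-constant policy: values[i] is used while step < boundaries[i].

    `values` must have one more entry than `boundaries`; the last value is
    used for every step at or beyond the final boundary.
    """
    for boundary, value in zip(boundaries, values):
        if step < boundary:
            return value
    return values[-1]


# Hypothetical boundaries, chosen only for illustration.
BOUNDARIES = [10000, 20000]
VALUES = [0.01, 0.005, 0.001]

if __name__ == "__main__":
    for step in (0, 5000, 10000, 20000, 30000):
        print(step, poly_lr(step, 30000), stepwise_lr(step, BOUNDARIES, VALUES))
```

The 'poly' policy decays smoothly to zero at `max_steps`, whereas the stepwise policy holds each rate until a boundary is crossed.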

Updated 1/11/2018

I continued training with a learning rate of 0.0001, which gave a large increase in validation mIOU.

Updated 2/05/2018

The Multi-grid implementation has been improved, thanks to @howard-mahe. The new validation results should be updated soon.
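
As a reminder of what Multi-grid means, the sketch below computes the atrous rates of the three convolutions in block4 as described in the paper: each Multi-grid value is multiplied by the block's unit rate. The function name `block4_rates` is hypothetical, and deriving the unit rate as 32 divided by the output stride is an assumption taken from the paper's setup, not a description of this repository's code.

```python
def block4_rates(multi_grid=(1, 2, 4), output_stride=16):
    """Atrous rates for the three convolutions of block4.

    The final rate of each convolution is unit_rate * multi_grid[i].
    Assumption: the backbone has a nominal stride of 32, so block4 needs
    a unit rate of 32 // output_stride (2 for stride 16, 4 for stride 8).
    """
    if output_stride not in (8, 16):
        raise ValueError("output_stride must be 8 or 16")
    unit_rate = 32 // output_stride
    return tuple(unit_rate * grid for grid in multi_grid)
```

With the defaults, MG(1, 2, 4) at output stride 16 gives rates (2, 4, 8), which matches the example in the paper.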

Updated 2/11/2018

The new validation result comes from a model trained from scratch. I didn't implement the two-stage training policy (fixing BN and changing the output stride from 16 to 8). I may try a few more runs to see whether performance improves, but I consider that fine-tuning work.

mIOU     Validation
paper    77.21%
repo     70.63%

The validation mIOU for this repo was obtained without multi-scale inputs or left-right flipping.
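
For context, left-right flipping at evaluation time usually means averaging the class scores predicted for an image and for its mirror image. The sketch below shows that idea in pure Python on nested lists; it is not part of this repository, and `predict_fn` is a hypothetical callable that maps an H x W image to H x W x C class scores. Multi-scale evaluation, which also needs resizing, is not shown.

```python
def flip_lr(rows):
    """Mirror an H x W (x anything) nested list along the width axis."""
    return [list(reversed(row)) for row in rows]


def predict_with_flip(image, predict_fn):
    """Average per-pixel class scores of an image and its mirror image.

    `predict_fn` is a hypothetical model wrapper: it takes an H x W nested
    list and returns H x W x C scores. The scores for the mirrored input
    are flipped back so that both score maps are pixel-aligned.
    """
    scores = predict_fn(image)
    mirrored = flip_lr(predict_fn(flip_lr(image)))
    return [
        [[(a + b) / 2.0 for a, b in zip(px, px_m)]
         for px, px_m in zip(row, row_m)]
        for row, row_m in zip(scores, mirrored)
    ]
```

The predicted label for each pixel would then be the index of the largest averaged score.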

Further improvement can be achieved by tuning hyperparameters such as the learning rate, batch size, optimizer, initializer and batch normalization. I didn't spend much time on training, and the results are preliminary.

You are welcome to try it and report your numbers.

Contributors

ndong-petuum, zl1446
