plasticbag-faster-rcnn

This fork is a TensorFlow implementation of Faster RCNN that aims to accurately detect plastic bags on streets in Vietnam. I conducted the project during my time as an intern at VinAI Research Lab.

Note: This fork is mostly based on the implementation of tf-faster-rcnn. If you have any problems running the code, please refer to the Issues in the original repository first. Also, check out the technical report An Implementation of Faster RCNN with Study for Region Sampling if needed. For details about the Faster RCNN architecture, please refer to Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks.

My work integrates the PASCAL VOC 2007+2012 dataset with the custom PlasticVNOI dataset (short for Plasticbags in Vietnam + OpenImages + ImageNet). PlasticVNOI draws many of its images from the well-known OpenImages and ImageNet datasets; the remaining images were collected from various online sources in Vietnam and were taken on polluted streets across the country. Most of the unannotated images were annotated manually, while the rest were machine-annotated and human-verified. At the moment, the PlasticVNOI dataset contains over 1000 annotated images.

Detection Performance

The current code supports VGG16, Resnet V1, and Mobilenet V1 models. I mainly tested the Resnet architecture, as it seemed to work best with Faster RCNN compared to the others. The plastic bag detection model is very accurate on high-resolution images and on close-up objects.

  • With VGG16, AP for plasticbag = 52.24.
  • With Resnet101, AP for plasticbag = 62.04.

Some of the results: (plasticbag only)

Prerequisites

Please follow the instructions in the original repository to install all prerequisites.

Installation

  1. Clone the repository
git clone https://github.com/ngthanhvinh/plasticbag-faster-rcnn.git
  2. Update your -arch in the setup script to match your GPU
cd plasticbag-faster-rcnn/lib
# Change the GPU architecture (-arch) if necessary
vim setup.py
GPU model                      Architecture
TitanX (Maxwell/Pascal)        sm_52
GTX 960M                       sm_50
GTX 1080 (Ti)                  sm_61
Grid K520 (AWS g2.2xlarge)     sm_30
Tesla K80 (AWS p2.xlarge)      sm_37
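
For reference, the flag to edit in lib/setup.py looks roughly like this (a sketch based on the upstream tf-faster-rcnn layout; the surrounding code in this fork may differ slightly):

# lib/setup.py (sketch; upstream tf-faster-rcnn layout assumed)
# In the CUDA extension definition, change the '-arch' value to match your GPU:
extra_compile_args={'gcc':  ["-Wno-unused-function"],
                    'nvcc': ['-arch=sm_52']}   # e.g. '-arch=sm_61' for a GTX 1080 (Ti)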

Note: You are welcome to contribute your settings if you have made the code work properly on other GPUs. Also, even if you are using CPU-only TensorFlow, the GPU-based code (for NMS) is used by default, so please set USE_GPU_NMS to False to get the correct output.
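
In the upstream tf-faster-rcnn code the NMS switch lives in the model configuration; a minimal sketch, assuming this fork keeps the same config file:

# lib/model/config.py (location assumed from upstream tf-faster-rcnn)
# Use the GPU implementation of non-maximum suppression
__C.USE_GPU_NMS = False   # set to False when running CPU-only TensorFlow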

  3. Build the Cython modules
make clean
make
cd ..

Setup data

Setup the PASCAL VOC 2007+2012

Please follow the instructions below to set up the VOC2007 dataset.

  1. Download the training, validation, test data and VOCdevkit
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
  2. Extract all of these tars into one directory named VOCdevkit
tar xvf VOCtrainval_06-Nov-2007.tar
tar xvf VOCtest_06-Nov-2007.tar
tar xvf VOCdevkit_08-Jun-2007.tar
  3. It should have this basic structure
VOCdevkit/                           # development kit
VOCdevkit/VOCcode/                   # VOC utility code
VOCdevkit/VOC2007                    # image sets, annotations, etc.
# ... and several other directories ...
  4. Rename the directory to VOCdevkit2007
mv VOCdevkit VOCdevkit2007
  5. Create a symlink for the PASCAL VOC 2007 dataset
cd data
ln -s ../VOCdevkit2007 VOCdevkit2007

After setting up VOCdevkit2007, repeat a similar process to set up VOCdevkit2012 (a command sketch follows the notes below). Some useful links:

Note:

  • You only have to take the two links above into account, because you will use voc_2007_trainval or voc_2007_trainval+voc_2012_trainval to train/validate the model and voc_2007_test to test the result.
  • After downloading the train/val data and the devkit, remember to move the VOC2012 folder of train/val data into the devkit folder.
  • The devkit folder should be placed in the repository root. In addition, remember to name it VOCdevkit2012 and create a symlink for the PASCAL VOC 2012 dataset in data.
  • Finally, the VOCdevkit2012 folder should have the exact same structure as the VOCdevkit2007 folder.
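
A minimal command sketch of the VOC2012 setup, assuming the standard PASCAL VOC server URLs and file names (verify them against the official VOC2012 download page) and the same layout as VOCdevkit2007:

# run from the repository root
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCdevkit_18-May-2011.tar
# extract both into the same VOCdevkit directory, then rename and link it
tar xvf VOCdevkit_18-May-2011.tar
tar xvf VOCtrainval_11-May-2012.tar
mv VOCdevkit VOCdevkit2012
cd data
ln -s ../VOCdevkit2012 VOCdevkit2012
cd ..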

Setup the custom dataset for plasticbags (PlasticVNOI) and integrate it with the PASCAL VOC dataset

Download the PlasticVNOI dataset here, and then save it into the plasticbag_dataset folder.

Then, extract the dataset

cd plasticbag_dataset
tar xvf plasticVNOI.tar.xz
cd ..

After that, the plasticbag_dataset folder should have the following structure:

/plasticbag_dataset
  /annotations
    0a553ce06f26e637.xml
    00a76046606aa888.xml
    ...
  /images
    0a553ce06f26e637.jpg
    00a76046606aa888.jpg
    ...
  integrate_pascal_voc.py
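
The annotations are one XML file per image. Since the dataset is merged into the VOC2007 devkit, they presumably follow the standard PASCAL VOC annotation format; a minimal sketch with illustrative values:

<annotation>
  <folder>images</folder>
  <filename>0a553ce06f26e637.jpg</filename>
  <size>
    <width>1024</width>
    <height>768</height>
    <depth>3</depth>
  </size>
  <object>
    <name>plasticbag</name>
    <difficult>0</difficult>
    <bndbox>
      <xmin>120</xmin>
      <ymin>200</ymin>
      <xmax>340</xmax>
      <ymax>415</ymax>
    </bndbox>
  </object>
</annotation>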

The integration of PlasticVNOI with PASCAL VOC adds, removes, and rewrites a number of files in the VOCdevkit2007/VOC2007 folder so that it accommodates both the original and the custom datasets. To avoid messing anything up, it is best to back up the VOC2007 data first:

cd VOCdevkit2007
cp -r VOC2007 VOC2007_backup
cd ..

Finally, run:

cd plasticbag_dataset
python3 integrate_pascal_voc.py
cd ..

Now the PlasticVNOI dataset is integrated into the VOCdevkit2007/VOC2007 folder. You may want to look through the folder to see what changed.
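
For intuition, here is a rough sketch of the kind of operations such an integration performs. This is not the actual integrate_pascal_voc.py; paths and split file names are illustrative:

# integration sketch (hypothetical, not the real integrate_pascal_voc.py)
import os
import shutil

VOC = "../VOCdevkit2007/VOC2007"   # target devkit (path assumed)

# copy images and VOC-style annotations into the devkit
for f in os.listdir("images"):
    shutil.copy(os.path.join("images", f), os.path.join(VOC, "JPEGImages", f))
for f in os.listdir("annotations"):
    shutil.copy(os.path.join("annotations", f), os.path.join(VOC, "Annotations", f))

# register the new image ids in the train/val split
ids = [os.path.splitext(f)[0] for f in os.listdir("images")]
with open(os.path.join(VOC, "ImageSets", "Main", "trainval.txt"), "a") as split:
    split.write("\n".join(ids) + "\n")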

Demo and Test with pre-trained models

  1. Download pre-trained model

You can download the pre-trained model here. Save it into the data folder.

  2. Extract the downloaded model
cd data
tar xvf voc_0712_80k-200k.tar.xz
cd ..
  3. Create a folder and a soft link to use the pre-trained model
NET=res101
TRAIN_IMDB=voc_2007_trainval+voc_2012_trainval
mkdir -p output/${NET}/${TRAIN_IMDB}
cd output/${NET}/${TRAIN_IMDB}
ln -s ../../../data/voc_2007_trainval+voc_2012_trainval ./default
cd ../../..
  4. Demo for testing on custom images
# at repository root
GPU_ID=0
CUDA_VISIBLE_DEVICES=${GPU_ID} python3 tools/demo.py

Note: Resnet101 testing probably requires several gigabytes of memory, so if you encounter memory capacity issues, consider installing TensorFlow with CPU support only. Refer to Issue 25 in the original repository.

  5. Test with pre-trained Resnet101 models
GPU_ID=0
./experiments/scripts/test_faster_rcnn.sh $GPU_ID pascal_voc_0712 res101

Train your own model

  1. Download pre-trained models and weights. The current code supports VGG16 and Resnet V1 models. Pre-trained models are provided by slim; you can get them here and place them in the data/imagenet_weights folder. For example, for the VGG16 model, you can set it up like this:

    mkdir -p data/imagenet_weights
    cd data/imagenet_weights
    wget -v http://download.tensorflow.org/models/vgg_16_2016_08_28.tar.gz
    tar -xzvf vgg_16_2016_08_28.tar.gz
    mv vgg_16.ckpt vgg16.ckpt
    cd ../..

    For Resnet101, you can set it up like this:

    mkdir -p data/imagenet_weights
    cd data/imagenet_weights
    wget -v http://download.tensorflow.org/models/resnet_v1_101_2016_08_28.tar.gz
    tar -xzvf resnet_v1_101_2016_08_28.tar.gz
    mv resnet_v1_101.ckpt res101.ckpt
    cd ../..
  2. Train (and test, evaluation)

./experiments/scripts/train_faster_rcnn.sh [GPU_ID] [DATASET] [NET]
# GPU_ID is the GPU you want to test on
# NET in {vgg16, res50, res101, res152} is the network arch to use
# DATASET {pascal_voc, pascal_voc_0712} is defined in train_faster_rcnn.sh. {coco} has not been supported yet.
# Examples:
./experiments/scripts/train_faster_rcnn.sh 0 pascal_voc vgg16
./experiments/scripts/train_faster_rcnn.sh 1 pascal_voc_0712 res101
  3. Visualization with Tensorboard
tensorboard --logdir=tensorboard/vgg16/voc_2007_trainval+voc_2012_trainval/ --port=7001 &
  4. Test and evaluate
./experiments/scripts/test_faster_rcnn.sh [GPU_ID] [DATASET] [NET]
# GPU_ID is the GPU you want to test on
# NET in {vgg16, res50, res101, res152} is the network arch to use
# DATASET {pascal_voc, pascal_voc_0712} is defined in test_faster_rcnn.sh. {coco} has not been supported yet.
# Examples:
./experiments/scripts/test_faster_rcnn.sh 0 pascal_voc vgg16
./experiments/scripts/test_faster_rcnn.sh 1 pascal_voc_0712 res101
  5. You can use tools/reval.sh for re-evaluation

By default, trained networks are saved under:

output/[NET]/[DATASET]/default/

Test outputs are saved under:

output/[NET]/[DATASET]/default/[SNAPSHOT]/

Tensorboard information for train and validation is saved under:

tensorboard/[NET]/[DATASET]/default/
tensorboard/[NET]/[DATASET]/default_val/
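
For example, with the res101 model trained on voc_2007_trainval+voc_2012_trainval as in the earlier sections, these placeholders expand to paths such as:

output/res101/voc_2007_trainval+voc_2012_trainval/default/              # trained snapshots
tensorboard/res101/voc_2007_trainval+voc_2012_trainval/default/         # training curves
tensorboard/res101/voc_2007_trainval+voc_2012_trainval/default_val/     # validation curves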

Scope of Improvement

  • Python 3 adaptation
  • Save every snapshot during the training process
  • PASCAL VOC 2007 integration
  • PASCAL VOC 2007+2012 integration
  • COCO integration
  • Collect more data with annotations for plastic bags
