pva-faster-rcnn's Introduction

PVANET: Deep but Lightweight Neural Networks for Real-time Object Detection

by Yeongjae Cheon, Sanghoon Hong, Kye-Hyeon Kim, Minje Park, Byungseok Roh (Intel Imaging and Camera Technology)

Introduction

This repository is a fork of py-faster-rcnn and demonstrates the performance of PVANET.

You can refer to py-faster-rcnn README.md and faster-rcnn README.md for more information.

Disclaimer

Please note that this repository doesn't contain our in-house runtime code used in the published article.

  • The original py-faster-rcnn is quite slow and contains many inefficient code blocks.
  • We improved some of them by 1) replacing the Caffe backend with its latest version (Sep 1, 2016), and 2) porting our implementation of the proposal layer.
  • However, it is still slower than our in-house runtime code due to the image pre-processing code written in Python (+9 ms) and some poorly implemented parts in Caffe (+5 ms).
  • PVANET was trained with our in-house deep learning library, not with this implementation.
  • There might be a tiny difference in VOC2012 test results, because some hidden parameters in py-faster-rcnn may be set differently from ours.
  • PVANET-lite (76.3% mAP on VOC2012, 10th place) was originally designed to verify the effectiveness of multi-scale features for object detection, so it uses only Inception and hyper features. Further improvement may be achievable by adding C.ReLU, residual connections, etc.

Citing PVANET

If you find PVANET useful in your research, please consider citing:

@article{KimKH2016arXivPVANET,
    author = {Yeongjae Cheon and Sanghoon Hong and Kye-Hyeon Kim and Minje Park and Byungseok Roh},
    title = {{PVANET}: Deep but Lightweight Neural Networks for Real-time Object Detection},
    journal = {arXiv preprint arXiv:1608.08021},
    year = {2016}
}

Installation

Clone the Faster R-CNN repository first. We'll call the directory that you cloned Faster R-CNN into FRCN_ROOT.

    # Make sure to clone with --recursive
    git clone --recursive https://github.com/sanghoon/pva-faster-rcnn.git

Then complete the following steps:

  1. Build the Cython modules

    cd $FRCN_ROOT/lib
    make
  2. Build Caffe and pycaffe

    cd $FRCN_ROOT/caffe-fast-rcnn
    # Now follow the Caffe installation instructions here:
    #   http://caffe.berkeleyvision.org/installation.html
    # For your Makefile.config:
    #   Do NOT uncomment `USE_CUDNN := 1` (for running PVANET, cuDNN is slower than Caffe native implementation)
    #   Uncomment `WITH_PYTHON_LAYER := 1`
    
    cp Makefile.config.example Makefile.config
    make -j8 && make pycaffe
  3. Download PVANET caffemodels

    cd $FRCN_ROOT
    ./models/pvanet/download_models.sh
  4. (Optional) Download original caffemodels (without merging batch normalization and scale layers)

    cd $FRCN_ROOT
    ./models/pvanet/download_original_models.sh
  5. (Optional) Download ImageNet pretrained models

    cd $FRCN_ROOT
    ./models/pvanet/download_imagenet_models.sh
  6. (Optional) Download PVANET-lite models

    cd $FRCN_ROOT
    ./models/pvanet/download_lite_models.sh
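
The two Makefile.config settings named in the "Build Caffe and pycaffe" step can also be applied with a script instead of by hand. The following is a minimal sketch, not part of this repository: the helper name `configure_makefile` is hypothetical, and it assumes the stock Makefile.config.example contains the commented lines `# USE_CUDNN := 1` and `# WITH_PYTHON_LAYER := 1`. It works on text only, so it can be tried without touching any file.

```python
import re


def configure_makefile(text):
    """Return Makefile.config text with the two settings used for PVANET.

    - `WITH_PYTHON_LAYER := 1` is uncommented.
    - `USE_CUDNN := 1` is commented out if it was enabled.
    All other lines are returned unchanged.
    """
    out = []
    for line in text.splitlines():
        if re.match(r"^\s*#\s*WITH_PYTHON_LAYER\s*:=\s*1\s*$", line):
            # Enable Python layers (needed by the Faster R-CNN Python code).
            line = "WITH_PYTHON_LAYER := 1"
        elif re.match(r"^\s*USE_CUDNN\s*:=\s*1\s*$", line):
            # Keep cuDNN disabled, as recommended above for PVANET.
            line = "# USE_CUDNN := 1"
        out.append(line)
    return "\n".join(out) + "\n"


# Assumed excerpt of Makefile.config.example, for illustration only.
sample = "# USE_CUDNN := 1\n# WITH_PYTHON_LAYER := 1\n"
print(configure_makefile(sample), end="")
```

To use it on a real checkout, read $FRCN_ROOT/caffe-fast-rcnn/Makefile.config, pass its contents through the function, and write the result back before running make.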

Models

  1. PVANET
    • ./models/pvanet/full/test.pt: For testing-time efficiency, batch normalization (w/ its moving averaged mini-batch statistics) and scale (w/ its trained parameters) layers are merged into the corresponding convolutional layer.
    • ./models/pvanet/full/original.pt: Original network structure.
  2. PVANET (compressed)
    • ./models/pvanet/comp/test.pt: Compressed network w/ merging batch normalization and scale.
    • ./models/pvanet/comp/original.pt: Original compressed network structure.
  3. PVANET (ImageNet pretrained model)
    • ./models/pvanet/imagenet/test.pt: Classification network w/ merging batch normalization and scale.
    • ./models/pvanet/imagenet/original.pt: Original classification network structure.
  4. PVANET-lite
    • ./models/pvanet/lite/test.pt: Compressed network w/ merging batch normalization and scale.
    • ./models/pvanet/lite/original.pt: Original compressed network structure.
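
The merging of batch normalization and scale layers into a convolutional layer, as described for the test.pt files, relies on the fact that both are per-channel affine transforms at test time. The following is a minimal pure-Python sketch of that arithmetic, not the conversion script used by the authors. All function names and toy numbers are hypothetical, the statistics are assumed to be already normalized, and `eps=1e-5` is an assumed value. It checks the merge at a single output position.

```python
import math


def fold_bn_scale(weights, biases, mean, var, gamma, beta, eps=1e-5):
    """Fold per-channel batch norm and scale into conv weights and biases.

    weights: one flat list of filter coefficients per output channel
    biases, mean, var, gamma, beta: one value per output channel
    """
    new_w, new_b = [], []
    for w, b, m, v, g, bt in zip(weights, biases, mean, var, gamma, beta):
        s = g / math.sqrt(v + eps)          # combined per-channel multiplier
        new_w.append([wi * s for wi in w])  # W' = W * s
        new_b.append((b - m) * s + bt)      # b' = (b - mean) * s + beta
    return new_w, new_b


def conv_at_one_position(weights, biases, patch):
    """Apply each filter to one flattened input patch (one output pixel)."""
    return [sum(wi * xi for wi, xi in zip(w, patch)) + b
            for w, b in zip(weights, biases)]


def bn_scale(values, mean, var, gamma, beta, eps=1e-5):
    """Test-time batch norm followed by scale, one value per channel."""
    return [g * (y - m) / math.sqrt(v + eps) + bt
            for y, m, v, g, bt in zip(values, mean, var, gamma, beta)]


# Toy parameters for two output channels (illustrative values only).
W = [[0.5, -1.0, 2.0], [1.5, 0.25, -0.75]]
b = [0.1, -0.2]
mean, var = [0.3, -0.1], [4.0, 0.25]
gamma, beta = [1.2, 0.8], [0.05, -0.3]
patch = [1.0, 2.0, 3.0]

reference = bn_scale(conv_at_one_position(W, b, patch), mean, var, gamma, beta)
W_merged, b_merged = fold_bn_scale(W, b, mean, var, gamma, beta)
merged = conv_at_one_position(W_merged, b_merged, patch)

# The two paths should agree up to floating-point rounding.
print(max(abs(r - m) for r, m in zip(reference, merged)))
```

Because the merged layer yields the same outputs with one operation instead of three, the merged test.pt networks can skip the separate normalization and scale passes at test time.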

How to run the demo

  1. Download PASCAL VOC 2007 and 2012

  2. PVANET+ on PASCAL VOC 2007

    cd $FRCN_ROOT
    ./tools/test_net.py --gpu 0 --def models/pvanet/full/test.pt --net models/pvanet/full/test.model --cfg models/pvanet/cfgs/submit_160715.yml
  3. PVANET+ (compressed)

    cd $FRCN_ROOT
    ./tools/test_net.py --gpu 0 --def models/pvanet/comp/test.pt --net models/pvanet/comp/test.model --cfg models/pvanet/cfgs/submit_160715.yml
  4. (Optional) ImageNet classification

    cd $FRCN_ROOT
    ./caffe-fast-rcnn/build/tools/caffe test -gpu 0 -model models/pvanet/imagenet/test.pt -weights models/pvanet/imagenet/test.model -iterations 1000
  5. (Optional) PVANET-lite

    cd $FRCN_ROOT
    ./tools/test_net.py --gpu 0 --def models/pvanet/lite/test.pt --net models/pvanet/lite/test.model --cfg models/pvanet/cfgs/submit_160715.yml

Expected results

  • PVANET+: 83.85% mAP
  • PVANET+ (compressed): 82.90% mAP
  • ImageNet classification: 68.998% top-1 accuracy, 88.8902% top-5 accuracy, 1.28726 loss
  • PVANET-lite: 79.10% mAP
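
Since small deviations from these numbers are possible, a reproduced result can be compared against the expected mAP with a tolerance. The following is a minimal sketch: the helper name `check_map` and the tolerance of 0.1 percentage points are arbitrary assumptions, not values given by the authors, and the measured value in the example call is a placeholder rather than a real measurement.

```python
# Expected mAP values (in percent) as listed above.
EXPECTED_MAP = {
    "PVANET+": 83.85,
    "PVANET+ (compressed)": 82.90,
    "PVANET-lite": 79.10,
}


def check_map(name, measured, tolerance=0.1):
    """Compare a measured mAP (percent) with the expected value.

    Returns (within_tolerance, difference), where difference is
    measured minus expected. The default tolerance is an assumption.
    """
    expected = EXPECTED_MAP[name]
    diff = measured - expected
    return abs(diff) <= tolerance, diff


# Placeholder value for illustration only, not a real measurement.
ok, diff = check_map("PVANET+", 83.80)
print(ok, round(diff, 2))
```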

