APQ: Joint Search for Network Architecture, Pruning and Quantization Policy

@inproceedings{Wang2020APQ,
  title={APQ: Joint Search for Network Architecture, Pruning and Quantization Policy},
  author={Tianzhe Wang and Kuan Wang and Han Cai and Ji Lin and Zhijian Liu and Song Han},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2020}
}

Overview

We release the PyTorch code for APQ. [Paper|Video|Competition]

Jointly Search for Optimal Model

Save Orders of Magnitude Searching Cost

Better Performance than Sequential Design

How to Use

Prerequisites

- PyTorch version >= 1.0
- Python version >= 3.6
- Progress >= 1.5
- An NVIDIA GPU, if you want to search for or fine-tune new models
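
The version requirements above can be checked before running anything else. The following is a minimal sketch, not part of the repository: it assumes the packages are installed under the PyPI names `torch` and `progress`, and it uses `importlib.metadata`, which itself needs Python 3.8 or newer. The helper names (`parse_version`, `meets_minimum`, `check_environment`) are illustrative.

```python
import sys
from importlib import metadata


def parse_version(text):
    """Turn '1.13.1+cu117' into (1, 13, 1); stop at the first non-numeric part."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, required):
    """Return True if the installed version string is >= the required one."""
    a, b = parse_version(installed), parse_version(required)
    width = max(len(a), len(b))
    # Pad with zeros so that '1' and '1.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# Minimum versions from the prerequisites list; keys are assumed PyPI names.
REQUIREMENTS = {"torch": "1.0", "progress": "1.5"}


def check_environment():
    """Map each requirement to 'ok', 'too old' or 'missing'."""
    py = "{}.{}".format(*sys.version_info[:2])
    report = {"python": "ok" if meets_minimum(py, "3.6") else "too old"}
    for name, minimum in REQUIREMENTS.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if meets_minimum(installed, minimum) else "too old"
    return report


if __name__ == "__main__":
    for name, status in check_environment().items():
        print(name, status)
```

The script only inspects installed package metadata, so it does not verify that a GPU is available.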

Dataset and Model Preparation

Codebase Structure

apq
- dataset (imagenet data path)
- elastic_nn (super network builder, w/ or w/o quantization)
    - modules (define the layers, w/ or w/o quantization)
    - networks (define the networks, w/ or w/o quantization)
    utils.py (some utility functions for elastic_nn folder)
- models (quantization-aware predictor and once-for-all network checkpoint path)
- imagenet_codebase (training codebase for imagenet)
- lut (latency lookup table path)
- methods (methods to find the mixed-precision network)
    - evolution (evolution search code)
- utils (some utility functions, including converter)
    accuracy_predictor.py (construction of accuracy predictor)
    latency_predictor.py (construction of latency predictor)
    converter.py (encode a subnetwork into a one-hot vector)
    quant-aware.py (code for quantization-aware training)
main.py
Readme.md
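
To illustrate what "encode a subnetwork into a one-hot vector" means, here is a minimal sketch. It is not the code in converter.py: the per-layer choices (`KERNEL_SIZES`, `EXPAND_RATIOS`, `BIT_WIDTHS`), the dictionary keys, and the function names are all hypothetical, chosen only to show the idea of concatenating one one-hot segment per design decision.

```python
# Hypothetical per-layer choices; the real search space is defined by the repository.
KERNEL_SIZES = (3, 5, 7)
EXPAND_RATIOS = (4, 6)
BIT_WIDTHS = (4, 6, 8)


def one_hot(value, choices):
    """Return a list with a single 1 at the position of `value` in `choices`."""
    if value not in choices:
        raise ValueError(f"{value!r} is not one of {choices}")
    return [1 if value == c else 0 for c in choices]


def encode_layer(layer):
    """Concatenate one-hot segments for kernel, expand ratio, weight and activation bits."""
    return (one_hot(layer["kernel"], KERNEL_SIZES)
            + one_hot(layer["expand"], EXPAND_RATIOS)
            + one_hot(layer["w_bits"], BIT_WIDTHS)
            + one_hot(layer["a_bits"], BIT_WIDTHS))


def encode_subnet(layers):
    """Encode a whole subnetwork as the concatenation of its layer encodings."""
    vec = []
    for layer in layers:
        vec.extend(encode_layer(layer))
    return vec


if __name__ == "__main__":
    subnet = [
        {"kernel": 5, "expand": 6, "w_bits": 4, "a_bits": 8},
        {"kernel": 3, "expand": 4, "w_bits": 8, "a_bits": 6},
    ]
    print(encode_subnet(subnet))
```

A fixed-length vector of this kind is the sort of input an accuracy predictor can consume, since every subnetwork in the search space maps to the same number of features.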

Testing

For instance, to test the model under the exps/test folder, run the following command:

CUDA_VISIBLE_DEVICES=0,1,2,3 python test.py \
    --exp_dir=exps/test

This reports the model's latency and energy on the BitFusion platform, along with its ImageNet Top-1 accuracy.

Example

Evolution search

For instance, to search for a model under a 12.80ms latency constraint, run the following command:

CUDA_VISIBLE_DEVICES=0 python search.py \
    --mode=evolution \
    --acc_predictor_dir=models \
    --exp_name=test \
    --constraint=12.80 \
    --type=latency

You will get a candidate that satisfies the resource constraint (latency or energy); it is stored in the exps/test folder.
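
The general shape of a constrained evolutionary search can be sketched as follows. This is a toy illustration and not the repository's search.py: the candidate encoding (one bit-width per layer), the two predictor functions, and all hyperparameters are hypothetical stand-ins. What it does show is the pattern of keeping only candidates whose predicted latency is within the constraint and ranking them by predicted accuracy.

```python
import random

CHOICES = (4, 6, 8)   # hypothetical per-layer bit-width choices
NUM_LAYERS = 8        # hypothetical network depth


def predict_accuracy(cand):
    """Toy stand-in for an accuracy predictor: more bits give a higher score."""
    return 60.0 + sum(cand) / len(cand)


def predict_latency(cand):
    """Toy stand-in for a latency lookup: latency (ms) grows with bit-width."""
    return 0.25 * sum(cand)


def random_candidate(rng):
    return tuple(rng.choice(CHOICES) for _ in range(NUM_LAYERS))


def mutate(cand, rng, prob=0.2):
    """Resample each gene independently with probability `prob`."""
    return tuple(rng.choice(CHOICES) if rng.random() < prob else g for g in cand)


def crossover(a, b, rng):
    """Pick each gene from one of the two parents at random."""
    return tuple(rng.choice(pair) for pair in zip(a, b))


def sample_feasible(make, constraint, max_tries=1000):
    """Call `make` until it yields a candidate within the latency constraint."""
    for _ in range(max_tries):
        cand = make()
        if predict_latency(cand) <= constraint:
            return cand
    raise RuntimeError("no feasible candidate found; constraint may be too tight")


def evolution_search(constraint, population=50, parents=10, generations=20, seed=0):
    rng = random.Random(seed)
    pop = [sample_feasible(lambda: random_candidate(rng), constraint)
           for _ in range(population)]
    for _ in range(generations):
        # Keep the top candidates and refill the population from them.
        pop.sort(key=predict_accuracy, reverse=True)
        elite = pop[:parents]
        children = []
        while len(children) < population - parents:
            if rng.random() < 0.5:
                make = lambda: mutate(rng.choice(elite), rng)
            else:
                make = lambda: crossover(rng.choice(elite), rng.choice(elite), rng)
            children.append(sample_feasible(make, constraint))
        pop = elite + children
    return max(pop, key=predict_accuracy)


if __name__ == "__main__":
    best = evolution_search(constraint=12.80)
    print(best, predict_latency(best), predict_accuracy(best))
```

Because every candidate is scored by predictors rather than by training and measuring a network, each generation is cheap to evaluate in a loop like this.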

Quantization-aware finetune on imagenet

For instance, to run quantization-aware finetuning on the model under the exps/test folder, run the following command:

CUDA_VISIBLE_DEVICES=0,1,2,3 python quant_aware.py \
    --exp_name=test

You will get a finetuned mixed-precision model that satisfies the resource constraint (latency or energy) with competitive accuracy.
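
As background on what a bit-width means for a layer, the sketch below simulates uniform quantization of a list of weights: values are rounded to one of 2^bits evenly spaced levels and mapped back to floats. This is a generic min-max scheme written in plain Python for illustration; it is an assumption, not the quantizer used by quant_aware.py, and the function names are hypothetical.

```python
def fake_quantize(values, bits):
    """Round floats to 2**bits uniform levels over their range, then dequantize."""
    if bits < 1:
        raise ValueError("bits must be >= 1")
    lo, hi = min(values), max(values)
    if hi == lo:
        return list(values)  # a constant tensor needs no quantization
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels
    return [lo + round((v - lo) / scale) * scale for v in values]


def max_error(values, bits):
    """Largest absolute difference between the original and quantized values."""
    quantized = fake_quantize(values, bits)
    return max(abs(v - q) for v, q in zip(values, quantized))


if __name__ == "__main__":
    weights = [-1.0, -0.3, 0.0, 0.42, 1.0]
    for bits in (4, 6, 8):
        print(bits, max_error(weights, bits))
```

Lower bit-widths leave fewer levels and therefore a larger rounding error, which is the accuracy loss that quantization-aware finetuning is meant to recover.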

Models

We provide the checkpoints for our APQ reported in the paper:

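| Latency | Energy  | BitOps | Accuracy | Model    |
|---------|---------|--------|----------|----------|
| 6.11ms  | 9.14mJ  | 12.7G  | 72.8%    | download |
| 8.45ms  | 11.81mJ | 14.6G  | 73.8%    | download |
| 8.40ms  | 12.18mJ | 16.5G  | 74.1%    | download |
| 12.17ms | 14.14mJ | 23.6G  | 75.1%    | download |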

You can download the models and put them into the exps folder to test their performance. Note that the bold item in each row indicates the constraint under which that model was searched.
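
When choosing among the released checkpoints for a given budget, the numbers above can be compared directly. The following small sketch copies the four rows as data and picks the most accurate one under a budget; the function name `best_under` and the tuple layout are illustrative and not part of the repository.

```python
# Rows copied from the reported results: (latency ms, energy mJ, BitOps G, top-1 %).
CHECKPOINTS = [
    (6.11, 9.14, 12.7, 72.8),
    (8.45, 11.81, 14.6, 73.8),
    (8.40, 12.18, 16.5, 74.1),
    (12.17, 14.14, 23.6, 75.1),
]


def best_under(budget, metric="latency"):
    """Return the most accurate checkpoint whose `metric` is within `budget`."""
    index = {"latency": 0, "energy": 1, "bitops": 2}[metric]
    feasible = [row for row in CHECKPOINTS if row[index] <= budget]
    if not feasible:
        return None  # no released checkpoint fits this budget
    return max(feasible, key=lambda row: row[3])


if __name__ == "__main__":
    print(best_under(10.0, "latency"))
    print(best_under(12.0, "energy"))
```

For budgets that fall between the released models, running the evolution search with that constraint is the way to get a model specialized to it.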

Related work on automated model compression and acceleration:

Once for All: Train One Network and Specialize it for Efficient Deployment (ICLR'20, code)

ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware (ICLR’19)

AMC: AutoML for Model Compression and Acceleration on Mobile Devices (ECCV’18)

HAQ: Hardware-Aware Automated Quantization (CVPR’19, oral)

Defensive Quantization: When Efficiency Meets Robustness (ICLR'19)
