# DeepShift
This project is the implementation of the DeepShift: Towards Multiplication-Less Neural Networks paper, which aims to replace multiplications in neural networks with bitwise shifts (and sign changes).
This research project was done at Huawei Technologies.
## Table of Contents
- Overview
- Important Notes
- Results
- Running the Code
- Code Walkthrough
- Running the Bitwise Shift CUDA Kernels
- Binary Files of Trained Models
## Overview
The main idea of DeepShift is to test the ability to train and infer using bitwise shifts.
We present two approaches:
- DeepShift-Q: the parameters are floating point weights just like regular networks, but the weights are rounded to powers of 2 during the forward and backward passes
- DeepShift-PS: the parameters are signs and shift values
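The core idea behind both approaches can be illustrated with a minimal sketch: rounding a weight to the nearest signed power of two means that multiplying by it reduces to a bitwise shift plus a sign change. The helper below is illustrative only (it is not part of the DeepShift codebase), and the values are made up for the example.

```python
import math

def round_to_power_of_2(w):
    """Round a weight to the nearest power of two, keeping its sign.

    Illustrative helper, not the repo's actual implementation.
    """
    if w == 0:
        return 0.0
    sign = 1.0 if w > 0 else -1.0
    shift = round(math.log2(abs(w)))  # the integer "shift" parameter
    return sign * (2.0 ** shift)

# Multiplying by the rounded weight is equivalent to shifting the
# activation by `shift` bits and applying the sign:
w = -0.30
x = 8  # integer activation, for illustration
w_rounded = round_to_power_of_2(w)  # -0.25 == -(2 ** -2)
print(w_rounded * x)                # -2.0, i.e. -(8 >> 2)
```

DeepShift-Q applies this rounding to ordinary floating-point weights during the forward and backward passes, while DeepShift-PS trains the sign and shift values directly.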
## Important Notes
- To use DeepShift-PS, the `--optimizer` option must be set to `radam` in order to obtain good results.
- DeepShift-PS is currently much slower to train than DeepShift-Q, because it requires twice the number of parameters and twice the number of gradients. We are working on optimizing that by using a lower-precision type to represent its parameters and gradients.
## Results
TBD
## Running the Code
- Clone the repo:
  ```
  git clone https://github.com/mostafaelhoushi/DeepShift
  ```
- Change directory:
  ```
  cd DeepShift
  ```
- Create a virtual environment:
  ```
  virtualenv venv --prompt="(DeepShift) " --python=/usr/bin/python3.6
  ```
- (Needs to be done every time you run the code) Source the environment:
  ```
  source venv/bin/activate
  ```
- Install the required packages and build the spfpm package for fixed-point arithmetic:
  ```
  pip install -r requirements.txt
  ```
- `cd` into the `pytorch` directory:
  ```
  cd pytorch
  ```
- Now you can run the different scripts with different options, e.g.:

  a) Train a DeepShift simple fully-connected model on the MNIST dataset, using the PS approach:
  ```
  python mnist.py --shift-depth 3 --shift-type PS --optimizer radam
  ```
  b) Train a DeepShift simple convolutional model on the MNIST dataset, using the Q approach:
  ```
  python mnist.py --type conv --shift-depth 3 --shift-type Q
  ```
  c) Train a DeepShift ResNet20 model on the CIFAR10 dataset from scratch:
  ```
  python cifar10.py --arch resnet20 --pretrained False --shift-depth 1000 --shift-type Q
  ```
  d) Train a DeepShift ResNet18 model on the ImageNet dataset using converted pretrained weights for 5 epochs with a learning rate of 0.001:
  ```
  python imagenet.py <path to imagenet dataset> --arch resnet18 --pretrained True --shift-depth 1000 --epochs 5 --lr 0.001
  ```
  e) Train a DeepShift ResNet18 model on the ImageNet dataset from scratch with an initial learning rate of 0.01:
  ```
  python imagenet.py <path to imagenet dataset> --arch resnet18 --pretrained False --shift-depth 1000 --lr 0.01
  ```
## Code Walkthrough
TBD
## Running the Bitwise Shift CUDA Kernels
TBD
## Binary Files of Trained Models
TBD