
deepshift's Issues

shift_kernel & shift_cuda_kernel compiled but cannot be imported

I successfully set up everything and compiled shift_kernel, but when importing shift_kernel the following error appeared:

ImportError: /home/grant/venv/lib/python3.6/site-packages/shift_kernel-0.0.0-py3.6-linux-x86_64.egg/shift_kernel.cpython-36m-x86_64-linux-gnu.so: undefined symbol: _ZN6caffe26detail36_typeMetaDataInstance_preallocated_4E

For shift_cuda_kernel, the error message is:
Segmentation fault (core dumped)

I am working on Ubuntu 18.04; everything else is set up as required.
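
A diagnostic sketch based on our own assumption, not on anything in the repository: this particular undefined caffe2 symbol usually indicates that the extension was built against a different PyTorch version than the one imported at runtime, so it may be worth comparing the runtime version with the one used for the build.

import torch
import torch.utils.cpp_extension as cpp_ext

# The PyTorch that is actually imported when loading shift_kernel ...
print(torch.__version__)
# ... and the headers a freshly rebuilt kernel would compile against.
print(cpp_ext.include_paths()[0])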

About trained models

Thank you for providing your code :)
I tried training with your code because there are no pretrained models available.
However, the accuracy is not as high as you reported.
I really want to run inference with DeepShift.
Could you provide your trained model files?
Thank you

Round to Fixed to Deal with Unsigned Tensors

We need to deal with unsigned tensors, where there is no sign bit, which is the case for activations into a convolution that are usually the output of a ReLU layer. This will save us one bit when we want to quantize activations to lower bitwidths.
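
A minimal sketch of what an unsigned variant could look like (our own assumption, not the repository's implementation), following the same structure as the signed round_to_fixed in deepshift/utils.py:

import math
import torch

def round_to_fixed_unsigned(input, integer_bits=16, fraction_bits=16):
    # Hypothetical unsigned variant: with no sign bit, the whole integer_bits
    # budget covers the non-negative range [0, 2**integer_bits - delta].
    delta = math.pow(2.0, -fraction_bits)
    bound = math.pow(2.0, integer_bits)
    rounded = torch.floor(input / delta) * delta
    return torch.clamp(rounded, 0.0, bound - delta)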


The Number of Shifted Bits

The weight value is clipped to [-1, 1] in the code:

self.shift_range = (-1 * (2**(weight_bits - 1) - 2), 0)

But the weights should be allowed to take a few values exceeding 1. For example, 2 is easily obtained by shifting: << 1.
Would it be better to rewrite the code as follows?

self.shift_range = (-1 * (2**(weight_bits - 1) - 2), 2**(weight_bits))
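
For illustration only (our own sketch, not repository code), here is what each range makes representable, e.g. for weight_bits = 5:

weight_bits = 5
current_range = (-1 * (2**(weight_bits - 1) - 2), 0)                  # (-14, 0)
print([2.0 ** s for s in range(current_range[0], current_range[1] + 1)])
# 2**-14 ... 1.0: no representable magnitude exceeds 1

proposed_upper = 2   # any positive upper bound, e.g. allowing shifts up to << 2
print([2.0 ** s for s in range(current_range[0], proposed_upper + 1)])
# now also 2.0 and 4.0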

Thank you. @mostafaelhoushi

Error When Shifting Twice

Copying the question by @mengjingyouling from this issue to create a new issue:

We also want to discuss a problem with you. In your paper, the shift network is applied to classification networks, not object detection. What do you think? Would there be a decline in accuracy?

Because shifting by a single power of two leads to some accuracy loss, we want to shift twice to reduce it. For example: 10 = 8 + 2 (shift 3 bits + shift 1 bit). Therefore, we modified the code as follows:

import numpy as np
import torch

def get_shift_and_sign(x, rounding='deterministic'):
    # `round` here is the repository's rounding helper, not Python's builtin.
    sign = torch.sign(x)
    x_abs = torch.abs(x)
    shift1 = round(torch.log(x_abs) / np.log(2), rounding)   # first power-of-two exponent
    wr1 = 2 ** shift1
    w1 = x_abs - wr1                                         # residual after the first shift
    shift2 = round(torch.log(w1) / np.log(2), rounding)      # second power-of-two exponent
    return shift1, shift2, sign

def round_power_of_2(x, rounding='deterministic'):
    shift1, shift2, sign = get_shift_and_sign(x, rounding)
    x_rounded = (2.0 ** shift1 + 2.0 ** shift2) * sign
    return x_rounded

However, the input in the forward function of class Conv2dShiftQ(_ConvNdShiftQ) becomes NaN, which we think is caused by data overflow:

class Conv2dShiftQ(_ConvNdShiftQ):
    # ... ...

    #@weak_script_method
    def forward(self, input):
        print("--------------------------------------forward---------------------------------------------------")
        print("input======", input)

Can you give some suggestions to solve it? Thank you very much.
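
One possible direction (a sketch under our own assumptions, not a confirmed fix): the NaN most likely comes from taking the log of a non-positive residual w1 = x_abs - wr1, which happens whenever the first shift rounds up; masking out those entries before computing the second shift avoids it. The repository's round helper is assumed to be available, as in the snippet above.

import numpy as np
import torch

def get_two_shifts_and_sign(x, rounding='deterministic', eps=1e-12):
    sign = torch.sign(x)
    x_abs = torch.abs(x)
    shift1 = round(torch.log(x_abs) / np.log(2), rounding)
    residual = x_abs - 2.0 ** shift1
    # Only apply a second shift where the residual is strictly positive;
    # the log of a zero or negative residual is what produces the NaNs.
    valid = residual > eps
    shift2 = torch.full_like(x_abs, float('-inf'))   # 2 ** -inf == 0, i.e. no correction
    shift2[valid] = round(torch.log(residual[valid]) / np.log(2), rounding)
    return shift1, shift2, sign

def round_power_of_2_twice(x, rounding='deterministic'):
    shift1, shift2, sign = get_two_shifts_and_sign(x, rounding)
    return (2.0 ** shift1 + 2.0 ** shift2) * sign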

Loading Weights from `weights.pth` Not Working

Passing a weights.pth file to --weights does not work properly; it probably doesn't load the weights.

Passing a checkpoint.pth.tar file to --weights, however, works properly.
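
A hedged guess at the cause (our own sketch, not confirmed against the repository): checkpoint.pth.tar files typically wrap the weights in a dictionary (e.g. under a 'state_dict' key), whereas weights.pth may hold a bare state_dict, so loading code that always indexes into the checkpoint dictionary would silently skip the latter. Something like the following handles both:

import torch

def load_weights(model, path):
    # Hypothetical loader: accept either a wrapped checkpoint or a bare state_dict.
    obj = torch.load(path, map_location='cpu')
    if isinstance(obj, dict) and 'state_dict' in obj:
        obj = obj['state_dict']
    model.load_state_dict(obj)
    return model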

round_to_fixed unsigned tensors

hi, @mostafaelhoushi

The input x is converted to 32-bit fixed point in your paper, as follows:
def round_to_fixed(input, integer_bits=16, fraction_bits=16):
    assert integer_bits >= 1, integer_bits
    # TODO: Deal with unsigned tensors where there is no sign bit
    # which is the case with activations to convolution that
    # are usually the output of a Relu layer
    if integer_bits == 1:
        return torch.sign(input) - 1
    delta = math.pow(2.0, -(fraction_bits))
    bound = math.pow(2.0, integer_bits - 1)
    min_val = -bound
    max_val = bound - 1
    rounded = torch.floor(input / delta) * delta

    clipped_value = torch.clamp(rounded, min_val, max_val)
    return clipped_value

The comment in this function says it should handle unsigned tensors, but we think the current implementation actually handles signed tensors. For example, signed int8 covers [-128, 127], and:

round_to_fixed(-128, integer_bits=8, fraction_bits=8) = -128

Are we right? Thank you.
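
A quick numeric check of the example (our own snippet; the import path is an assumption based on deepshift/utils.py):

import torch
from deepshift.utils import round_to_fixed   # assumed import path

# integer_bits=8 gives bound = 2**7 = 128, so values are clamped to [-128, 127],
# i.e. exactly the signed int8 range described above.
x = torch.tensor([-200.0, -128.0, 0.25, 300.0])
print(round_to_fixed(x, integer_bits=8, fraction_bits=8))
# tensor([-128.0000, -128.0000,    0.2500,  127.0000])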

round_to_fixed

Hi, thanks for the awesome work!

I am very interested in this work. However, I am new to the area of quantization and have some questions about the round_to_fixed function in deepshift/utils.py.

This function aims to convert the input from FP32 to a fixed-point format (e.g., fix16), to mimic the shift operation and precision of fixed-point input.

While the range of FP32 is very large, I don't see how this round_to_fixed function can convert the input to merely 16 bits. In my opinion, delta should be chosen together with the range of the input. If the input is in [-1, 1], this function works fine (although the bound should then also be 1), so is there an implicit assumption that the input is in [-1, 1]? Or how should I set the default parameters (fraction_bits and integer_bits) if I want to convert the input to fix16?
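
For what it's worth, our own reading of the parameters (an assumption, not an authoritative answer): integer_bits and fraction_bits directly set the clamp range and the step size, so the defaults (16, 16) correspond to a 32-bit fixed-point format, and a 16-bit target would mean splitting the budget, e.g. 8 integer and 8 fraction bits:

import math

for integer_bits, fraction_bits in [(16, 16), (8, 8)]:
    delta = math.pow(2.0, -fraction_bits)          # step size (resolution)
    bound = math.pow(2.0, integer_bits - 1)        # clamp magnitude
    print(f"{integer_bits + fraction_bits} total bits: "
          f"range [{-bound:g}, {bound - 1:g}], step {delta:g}")
# 32 total bits: range [-32768, 32767], step 1.52588e-05
# 16 total bits: range [-128, 127], step 0.00390625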

Could you give me some comments about the difference of these two implementations? Thanks!!

Does wx+b or wx not need to be quantized?

hi @mostafaelhoushi

In your excellent work DeepShift, where the input x and the activations are quantized, does wx+b or wx not need to be quantized? If they are not quantized, they would occupy 64 bits of memory, right? Does this affect model acceleration?

best wishes

thank you!

A bug in shift.cu caused me to fail to compile deepshift-gpu

When I compile shift.cu, I get an error:

error: no instance of function template "DEEP_SHIFT_GEMM_GPU_KERNEL" matches the argument list
argument types are: (int *, int *, int *, int *, int, int, int, int, int, int)

The error occurs in DEEP_SHIFT_LINEAR_GPU and DEEP_SHIFT_CONV_GPU when bits == 7.

According to my understanding, the template of DEEP_SHIFT_GEMM_GPU_KERNEL is:

template <int num, int bits, char mask_shift, char mask_sign, bool zero_base>

Where the error is reported, it is instantiated as:

DEEP_SHIFT_GEMM_GPU_KERNEL<NUM_4, BIT_6, 0x7f,0x80, NON_ZERO_BASE><<<gridDim, blockDim>>>

Since char can only represent numbers from -0x80 to 0x7f, my compiler raised an error.

So I suggest changing the template of DEEP_SHIFT_GEMM_GPU_KERNEL to:

template <int num, int bits, unsigned char mask_shift, unsigned char mask_sign, bool zero_base>

This change really solved my problem.

However, for my colleagues this issue did not cause an error, so I guess it may be related to the compiler version.

test mAP

The mnist.py code implements the training process of the DeepShift method. As it is a complete process, the model is trained and then tested; the test model is obtained via model.eval(). After training, we save the model weights to a .pth file. This training process achieves high accuracy (train_log.csv).

However, we found that loading the generated weight file (weights.pth) for inference (the test function) reduces the accuracy.

Do you have any suggestions?

Thanks!
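
One thing worth checking (a sketch based on our own assumptions, not a confirmed cause): the drop can happen if the state_dict is loaded into a freshly constructed model that is evaluated without switching to eval mode, or into a model definition that does not match the converted DeepShift model that was trained. A minimal reload-for-inference flow would look like this, where build_model is a hypothetical stand-in for however mnist.py constructs the converted model:

import torch

model = build_model()                                  # hypothetical constructor for the converted model
state = torch.load('weights.pth', map_location='cpu')
model.load_state_dict(state)
model.eval()   # use running BN statistics and disable dropout during the test function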

CPU kernel acceleration

A CPU kernel was implemented in the project. We want to know which CPUs can support it, and what the acceleration efficiency is.
Thank you very much
