Comments (3)
I noticed some issues with the latest version (0.7.0), but not with the previous one (0.6.2).
Grouped convolutions (FP or binary) are converted as custom ops in the latest version.
Example:
Converter output for a grouped (g=2) convolution:
2022-07-26 13:06:17.469686: W external/org_tensorflow/tensorflow/compiler/mlir/lite/flatbuffer_export.cc:1903] The following operation(s) need TFLite custom op implementation(s):
Custom ops: Conv2D
Details:
tf.Conv2D(tensor<1x32x32x64xf32>, tensor<5x5x32x32xf32>) -> (tensor<1x11x11x32xf32>) : {data_format = "NHWC", dilations = [1, 1, 1, 1], explicit_paddings = [], padding = "SAME", strides = [1, 3, 3, 1], use_cudnn_on_gpu = true}
See instructions: https://www.tensorflow.org/lite/guide/ops_custom
2022-07-26 13:06:17.469772: I external/org_tensorflow/tensorflow/compiler/mlir/lite/flatbuffer_export.cc:1963] Estimated count of arithmetic ops: 5792 ops, equivalently 2896 MACs
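For reference, here is a minimal sketch of the kind of model that triggers this (an assumed repro, not my exact model), with shapes matching the warning above:

```python
# Minimal sketch (assumed repro) of a full-precision grouped convolution that
# falls back to a custom op when converted with LCE 0.7.0 / TF 2.8.
import tensorflow as tf
import larq_compute_engine as lce

model = tf.keras.Sequential([
    tf.keras.Input(shape=(32, 32, 64)),
    # 64 input channels, 32 filters, groups=2 -> filter shape [5, 5, 32, 32],
    # matching the tf.Conv2D signature in the converter warning above.
    tf.keras.layers.Conv2D(32, 5, strides=3, padding="same", groups=2),
])

flatbuffer = lce.convert_keras_model(model)  # emits the "Custom ops: Conv2D" warning on 0.7.0
with open("grouped_conv.tflite", "wb") as f:
    f.write(flatbuffer)
```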
Small quantizer example (two quantized convolution layers):
Example with ste_sign mode="weights":
2022-07-26 13:14:57.680246: I external/org_tensorflow/tensorflow/compiler/mlir/lite/flatbuffer_export.cc:1963] Estimated count of arithmetic ops: 1.164 M ops, equivalently 0.582 M MACs
Changing to DoReFa mode="weights":
2022-07-26 13:16:05.771057: I external/org_tensorflow/tensorflow/compiler/mlir/lite/flatbuffer_export.cc:1963] Estimated count of arithmetic ops: 1.663 M ops, equivalently 0.831 M MACs
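Roughly, the comparison looks like this (an assumed two-layer architecture for illustration; layer sizes and quantizer parameters are made up, so the op counts will not match the numbers above exactly):

```python
# Rough sketch (assumed architecture) comparing ste_sign and DoReFa as the
# kernel quantizer of two QuantConv2D layers before conversion.
import tensorflow as tf
import larq as lq
import larq_compute_engine as lce

def build(kernel_quantizer):
    return tf.keras.Sequential([
        tf.keras.Input(shape=(32, 32, 3)),
        lq.layers.QuantConv2D(32, 3, padding="same",
                              input_quantizer="ste_sign",
                              kernel_quantizer=kernel_quantizer,
                              kernel_constraint="weight_clip"),
        lq.layers.QuantConv2D(32, 3, padding="same",
                              input_quantizer="ste_sign",
                              kernel_quantizer=kernel_quantizer,
                              kernel_constraint="weight_clip"),
    ])

fb_ste = lce.convert_keras_model(build("ste_sign"))
fb_dorefa = lce.convert_keras_model(
    build(lq.quantizers.DoReFaQuantizer(k_bit=1, mode="weights")))
```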
I was able to successfully benchmark my model (DoReFa with grouped convolutions) when converted with version 0.6.2, with better-than-expected efficiency, but not the model converted with version 0.7.0.
I am using TensorFlow 2.8.0 and larq 0.12.2.
Sorry for the late reply.
> I noticed some issues with the latest version (0.7.0), but not with the previous one (0.6.2).
> Grouped convolutions (FP or binary) are converted as custom ops in the latest version.
Unfortunately this was an issue with TensorFlow 2.8 which LCE 0.7.0 uses under the hood. This has been fixed on master since we upgraded to 2.9, but we haven't published a new release with it yet. Sorry about that. For now, I'd recommend sticking with 0.6.2 if grouped convolution support is required.
Is the "ste_sign" quantizer the only viable option for efficient inference?
For binarised convolutions it is the recommended activation quantiser. You can also use custom activation quantisers, but to make sure they convert correctly they should be implemented with larq.math.sign, which unfortunately is not the case for DoReFa. Regarding weight quantisation, other quantisers should work fine as long as they binarise to {-1, 1} or {-alpha, alpha}.
I recommend looking at the converted model in Netron to make sure the conversion worked as intended.
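As an illustration, a custom activation quantiser built on larq.math.sign with a straight-through gradient could look roughly like this (a hypothetical example, not code from this thread):

```python
# Sketch of a hypothetical custom activation quantiser implemented with
# larq.math.sign, so the output is exactly {-1, +1}.
import tensorflow as tf
import larq as lq

@tf.custom_gradient
def my_sign(x):
    def grad(dy):
        # Straight-through estimator: pass gradients through for |x| <= 1.
        return tf.where(tf.abs(x) <= 1.0, dy, tf.zeros_like(dy))
    return lq.math.sign(x), grad  # lq.math.sign maps 0 to +1

layer = lq.layers.QuantConv2D(
    64, 3, padding="same",
    input_quantizer=my_sign,          # custom quantiser built on larq.math.sign
    kernel_quantizer="ste_sign",      # weights binarised to {-1, 1}
    kernel_constraint="weight_clip",
)
```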
> I noticed some issues with the latest version (0.7.0), but not with the previous one (0.6.2).
> Grouped convolutions (FP or binary) are converted as custom ops in the latest version.
@lluevano sorry for the delay. We just released v0.8.0, which includes a fix for this. Let me know if that works for you.