Comments (13)
Indeed, if I run your snippet I get a model that has a float layer somewhere in the middle. You can see this in Netron:
@Tombana , do you perhaps have any idea why this might happen?
I think the problem occurs if we specify input_shape or input_tensor when creating the model object lqz.literature.BinaryDenseNet28.
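For reference, here is a sketch contrasting the two ways the model gets created in this thread (the labels describe what was observed with the converter; the arguments are taken from the snippets further down):

import larq_zoo as lqz

# converted to a fully-quantized model in this thread:
# default ImageNet input and pretrained weights
model_ok = lqz.literature.BinaryDenseNet28(weights="imagenet")

# left a float layer after conversion in this thread:
# custom input shape and freshly initialized weights
model_float = lqz.literature.BinaryDenseNet28(
    input_shape=(32, 32, 3), weights=None, include_top=True, num_classes=10
)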
My colleague @lgeiger pointed me to a similar LCE issue: #421, which in turn points to an unresolved TensorFlow issue: tensorflow/tensorflow#40055. I believe your issue might be the same.
Let's look at the model produced by the following code in Netron:
image_input = tf.keras.layers.Input(shape=(32, 32, 3))
x = tf.keras.layers.Conv2D(64, kernel_size=3)(image_input)
x = tf.keras.layers.MaxPool2D(3)(x)
x = tf.keras.layers.BatchNormalization()(x)
keras_model = tf.keras.Model(inputs=image_input, outputs=x)
I see two tensors of the same size and dimensions: one for the bias of the Conv2D layer (size 64) and one for the batch-normalisation mean values (size 64). They also have the same value (all zeros), which triggers tensorflow/tensorflow#40055.
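As a quick sanity check (a minimal sketch, reusing keras_model from the snippet above; the layer indices are an assumption based on that layer order), you can confirm that the two tensors really do start out identical:

import numpy as np

conv = keras_model.layers[1]  # the Conv2D layer
bn = keras_model.layers[3]    # the BatchNormalization layer

# both are all-zero vectors of length 64 at initialization
print(np.array_equal(conv.bias.numpy(), bn.moving_mean.numpy()))  # True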
Until this is solved in TensorFlow (which might take a very long time, if ever), the workaround is to make sure this situation doesn't occur. There are several options. One option is to train the model for one step, since that already changes both tensors to non-zero values, and the chance that they remain equal is minimal. Alternatively, do a full training run or load pre-trained weights. Another option is to initialize your model such that this doesn't happen, e.g. as follows:
x = tf.keras.layers.Conv2D(64, kernel_size=3, bias_initializer=tf.keras.initializers.Constant(0.1))(x)
x = tf.keras.layers.MaxPool2D(3)(x)
x = tf.keras.layers.BatchNormalization()(x)
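The train-for-one-step option could look like this (a minimal sketch, assuming the 3-layer keras_model from above and random data; the target shape follows from the layer arithmetic: 32x32 input, 3x3 valid convolution, then 3x3 max-pooling gives 10x10x64):

import numpy as np

# one optimizer step perturbs the Conv2D bias, and the batch statistics
# update the BatchNormalization moving mean, so the tensors no longer match
keras_model.compile(optimizer="adam", loss="mse")
dummy_x = np.random.rand(4, 32, 32, 3).astype(np.float32)
dummy_y = np.random.rand(4, 10, 10, 64).astype(np.float32)
keras_model.fit(dummy_x, dummy_y, epochs=1, verbose=0)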
So I propose to close this issue if @aqibsaeed agrees, and keep the other LCE issue open to track the bug in TensorFlow.
Got it! I just double-checked: if I load model weights, conversion works fine. Thanks again for looking into this. Closing this now.
Any pointers on how to solve this?
How did you conclude that it does not result in a quantized model? I got an INT8/BNN model by running the code snippet below:
from pathlib import Path

import larq_zoo
import larq_compute_engine as lce
import tensorflow as tf

keras_model = larq_zoo.literature.BinaryDenseNet28(
    input_shape=None,
    input_tensor=None,
    weights="imagenet",
    include_top=True,
    num_classes=1000,
)
tflite_model = lce.convert_keras_model(
    keras_model,
    inference_input_type=tf.int8,
    inference_output_type=tf.int8,
    experimental_default_int8_range=(-3, 3),
)
Path("model.tflite").write_bytes(tflite_model)
I inspected the resulting .tflite file in Netron.
I tried the following:
from pathlib import Path

import larq as lq
import larq_zoo as lqz
import larq_compute_engine as lqce
import tensorflow as tf

model_a = lqz.literature.BinaryDenseNet28(
    input_shape=(32, 32, 3),
    weights=None,
    include_top=True,
    num_classes=10,
)
with lq.context.quantized_scope(True):
    weights = model_a.get_weights()
    model_a.set_weights(weights)
tflite_model = lqce.convert_keras_model(
    model_a,
    inference_input_type=tf.int8,
    inference_output_type=tf.int8,
    experimental_default_int8_range=(-3, 3),
)
Path("model.tflite").write_bytes(tflite_model)
Now if I submit it to https://plumerai.com/benchmark, I get the following response back: "Your model contains layers that aren't INT8 quantized but instead use floating-point values. This is not suitable for microcontrollers. You can find information on how to quantize your model in the TensorFlow documentation."
I do not get the above-mentioned error when I use your snippet. Maybe I am missing something here with experimental_default_int8_range.
There's also a difference at the start of the two graphs in Netron (screenshots: "Custom CIFAR-10 Model" vs. "IN-1K Model").
I'm not sure; this sounds like a bug in the converter, possibly in the TensorFlow converter itself. From what I see in the two code snippets, the only difference is in

with lq.context.quantized_scope(True):
    weights = model_a.get_weights()
    model_a.set_weights(weights)

That shouldn't affect the outcome of the converter, though: the scope only changes get_weights() to return the quantized values, so this round-trip just bakes the binarized weights into the model.
Which version of larq-compute-engine was used for this?
I tested converting after removing this:

with lq.context.quantized_scope(True):
    weights = model_a.get_weights()
    model_a.set_weights(weights)

but it still does not work. Versions: flatbuffers-1.12, larq-compute-engine-0.7.0.
I managed to reproduce the issue with a minimal example, without the zoo and without binary layers, using just 3 layers:
from pathlib import Path

import larq_compute_engine as lce
import tensorflow as tf

image_input = tf.keras.layers.Input(shape=(32, 32, 3))
x = image_input
x = tf.keras.layers.Conv2D(64, kernel_size=3)(x)
x = tf.keras.layers.MaxPool2D(3)(x)
x = tf.keras.layers.BatchNormalization()(x)
keras_model = tf.keras.Model(inputs=image_input, outputs=x)

tflite = lce.convert_keras_model(
    keras_model,
    inference_input_type=tf.int8,
    inference_output_type=tf.int8,
    experimental_default_int8_range=(-3, 3),
)
Path("model.tflite").write_bytes(tflite)
With larq-compute-engine==0.7.0, tensorflow==2.8.0, and keras==2.8.0 this gives:
If I remove any of the three layers (Conv2D, MaxPool2D, or BatchNormalization), then it does produce the proper int8 model.
If I downgrade to larq-compute-engine==0.6.2, tensorflow==2.6.1, and keras==2.6.0, then it looks better, but the first Conv2D layer is still in float:
If I use the standard TFLite post-training quantizer instead of LCE, then everything becomes INT8 as expected:
from pathlib import Path

import numpy as np
import tensorflow as tf

def representative_dataset():
    for _ in range(10):
        data = np.random.rand(1, 32, 32, 3)
        yield [data.astype(np.float32)]

# "test_keras_model" is the SavedModel directory the Keras model was saved to
converter = tf.lite.TFLiteConverter.from_saved_model("test_keras_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
tflite = converter.convert()
Path("model.tflite").write_bytes(tflite)
So in conclusion this seems to be an LCE issue, and we will investigate further.
Interesting. Thank you for looking into it.