
utensor_cgen's Introduction

uTensor - Test Release

Note: if you are looking for stable releases, check out the master branch.

Tutorials

Building Tutorial Examples

Make sure cmake is available on your system and run the following commands:

$ mkdir build
$ cd build
$ cmake -DPACKAGE_TUTORIALS=ON ..
$ make

After the build finishes, you should find the tutorial executables under the build/tutorials/ directory.

Follow the instructions in the README.md in each tutorial directory to learn how to use uTensor.

Here are the links to the tutorials:

  1. Error Handling with uTensor
  2. Custom Operator

Introduction

What is it?

uTensor is an extremely lightweight machine learning inference framework built on TensorFlow and optimized for Arm targets. It consists of a runtime library and an offline tool that handles most of the model-translation work. This repo holds the core runtime and some example implementations of operators, memory managers/schedulers, and more. The size of the core runtime is only ~2KB!

Module                       .text        .data  .bss
uTensor/src/uTensor/core     1275(+1275)  4(+4)   28(+28)
uTensor/src/uTensor/tensors  791(+791)    0(+0)   0(+0)

How does the uTensor workflow work?

A model is constructed and trained in TensorFlow. uTensor takes the model and produces a .cpp and .hpp file containing the generated C++11 code needed for inference. Working with uTensor on the embedded side is as easy as copy-and-paste.
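
As a rough illustration of the copy-and-paste step, here is a hedged sketch of driving a generated model with the legacy Context API that appears in the issues section below; the header name, the get_deep_mlp_ctx entry point, the tensor names, and the shapes are all illustrative and model-specific (the rearchitected high-level API is shown later in this README):

#include <cstdio>
#include "deep_mlp.hpp"  // the generated header -- name is illustrative

int main() {
  Context ctx;

  // Wrap a raw input buffer in a tensor handle ({1, 784} is illustrative)
  float input_data[784] = {0};
  Tensor* input_x = new WrappedRamTensor<float>({1, 784}, input_data);

  get_deep_mlp_ctx(ctx, input_x);        // generated, model-specific entry point
  S_TENSOR pred = ctx.get("y_pred:0");   // handle to the output tensor
  ctx.eval();                            // run the graph
  float y = *(pred->read<float>(0, 0));  // read back the result

  printf("prediction: %f\n", y);
  return 0;
}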

How does the uTensor runtime work?

Check out the detailed description here

Release Note

The rearchitecture is fundamentally centered on a few key ideas, and the structure of the code base and build tools follows from them. The original key points:

  • Tensors describe how data is accessed and where it comes from
    • The performance of ops depends on which tensors are used
  • Operators are tensor-agnostic
    • High-performance ops can fetch blocks of data at once
  • Strive for low total power in execution
  • Low static and dynamic footprint; be small
    • Low cost per Tensor throughout the entire system; most generated models have 100+ tensors including intermediates, so this also impacts the dynamic footprint
    • Lightweight class hierarchy
    • Duh

New additional key ideas:

  • System safety
    • All tensor metadata and actual data are owned by dedicated memory regions
      • These can either be user-provided or created by the runtime
    • We can guarantee, at code-gen time or at compile time, that the runtime will use no more than N bytes of RAM! (see the sketch after this list)
    • The runtime generally should not collide with userspace or system-space memory, i.e. don't share heaps
    • General implication: a safe runtime means we can safely update models remotely
    • As many compile-time errors as possible!
      • Mismatched inputs, outputs, or their counts
      • Wrong sizes used
      • Impossible memory accesses
      • etc.
  • Clear, concise, and debuggable
    • The previous iteration of uTensor relied too heavily on codegen; making changes to a model for any reason was nearly impossible
    • A developer should be able to make changes to the model without relying on code gen
    • A developer should be able to look at a model file and immediately understand what the graph looks like, without a massive amount of jumping around
    • The default tensor interface should behave like a higher-level language, but exploit the speed of C++
      • Generally: No more pointer bullshit! C is super error prone, fight me
        • Only specialized operators have access to raw data blocks, and these ops will be wicked fast
    • Extensible, configurable, and optimize-outable error handling
    • GDB debugging IS NOW TRIVIAL
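
To make the RAM guarantee concrete, here is a minimal sketch, assuming only the localCircularArenaAllocator/Context API used in the High level API example below (arena sizes are illustrative):

using namespace uTensor;

// The arena capacity is a template parameter, so the ceiling on tensor RAM
// is fixed at compile time: at most 512 + 2048 bytes here, owned entirely by
// these two objects -- nothing is shared with the system heap.
static localCircularArenaAllocator<512>  meta_allocator;  // tensor metadata
static localCircularArenaAllocator<2048> ram_allocator;   // temporary tensor data

void bind_allocators() {
  Context::get_default_context()->set_metadata_allocator(&meta_allocator);
  Context::get_default_context()->set_ram_data_allocator(&ram_allocator);
}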

As mentioned before, these key ideas need to be reflected not only in the code, but in the code structure, in a way that is maintainable, hackable, and user-extensible. Pretty much everything in the uTensor runtime can be divided into two components: core, and everything else. The core library contains all the deep, low-level functionality needed for the runtime to make the above guarantees, as well as the interfaces required for concrete implementations. Furthermore, the overhead of this core engine should be negligible relative to the system operation. Everything not in the core library should be thought of as a reasonable default: for example, tensor implementations, default operators, example memory allocators, or even possible logging systems and error handlers. These modules should be the primary area for future optimization, especially before model deployment.

High level API

using namespace uTensor;

const uint8_t s_a[4] = {1, 2, 3, 4};
const uint8_t s_b[4] = {5, 6, 7, 8};
const uint8_t s_c_ref[4] = {19, 22, 43, 50};

// These can also be embedded in models
// Recommended: do not put these on the heap or stack directly, as they can be large
localCircularArenaAllocator<256> meta_allocator; // All tensor metadata gets stored here automatically, even when new is called
localCircularArenaAllocator<256> ram_allocator;  // All temporary storage gets allocated here

void foo() {
  // Tell the uTensor context which allocators to use
  Context::get_default_context()->set_metadata_allocator(&meta_allocator);
  Context::get_default_context()->set_ram_data_allocator(&ram_allocator);

  // Tensors are simply handles for accessing data as necessary; they are no larger than a pointer
  // RomTensor(TensorShape, data_type, data*);
  Tensor a = new /*const*/ RomTensor({2, 2}, u8, s_a);
  Tensor b = new /*const*/ RomTensor({2, 2}, u8, s_b);
  Tensor c_ref = new RomTensor({2,2}, u8, s_c_ref);
  // RamTensors are held internally and can be moved or cleared depending on the memory schedule (optional)
  Tensor c = new RamTensor({2, 2}, u8);

  // Operators take a fixed-size map of (input_name -> parameter), which gives compile-time errors on input mismatches
  // Also, the name binding + lack of parameter ordering make ctag jumping and GDB sessions significantly more intuitive
  MatrixMultOperator<uint8_t> mult_AB;
  mult_AB
      .set_inputs({{MatrixMultOperator<uint8_t>::a, a}, {MatrixMultOperator<uint8_t>::b, b}})
      .set_outputs({{MatrixMultOperator<uint8_t>::c, c}})
      .eval();

  // Compare results
  TensorShape& c_shape = c->get_shape();
  for (int i = 0; i < c_shape[0]; i++) {
    for (int j = 0; j < c_shape[1]; j++) {
      // Just need to cast the access to the expected type
      if( static_cast<uint8_t>(c(i, j)) != static_cast<uint8_t>(c_ref(i, j)) ) {
        printf("Oh crap!\n");
        exit(-1);
      }
    }
  }
}

Building and testing locally

git clone git@github.com:uTensor/uTensor.git
cd uTensor/
git checkout proposal/rearch
git submodule init
git submodule update
mkdir build
cd build/
cmake -DPACKAGE_TESTS=ON -DCMAKE_BUILD_TYPE=Debug ..
make
make test

Building and running on Arm Mbed OS

The uTensor core library is configured as an Mbed library out of the box, so we just need to import it into our project and build as normal.

mbed new my_project
cd my_project
mbed import https://github.com/uTensor/uTensor.git
# Create main file
# Run uTensor-cli workflow and copy model directory here
mbed compile # as normal
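
What goes in the main file is up to you; here is a minimal hedged sketch that reuses the high-level API from the Introduction (the umbrella header name and arena sizes are placeholders, and the model hook-up is elided):

// main.cpp -- the uTensor calls mirror the High level API example above
#include "mbed.h"
#include "uTensor.h"  // assumed umbrella header name

using namespace uTensor;

static localCircularArenaAllocator<1024> meta_allocator;
static localCircularArenaAllocator<4096> ram_allocator;

int main() {
  Context::get_default_context()->set_metadata_allocator(&meta_allocator);
  Context::get_default_context()->set_ram_data_allocator(&ram_allocator);

  // ... declare tensors, bind them to your generated model's operators,
  //     and call eval() as in the High level API example ...

  printf("uTensor inference complete\n");
}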

Building and running on Arm systems

Note (TODO): CMake support for Arm is currently experimental; see https://stackoverflow.com/questions/46916611/cross-compiling-googletest-for-arm64

Default build

mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Debug -DCMAKE_TOOLCHAIN_FILE=../extern/CMSIS_5/CMSIS/DSP/gcc.cmake  ..

With CMSIS optimized kernels

mkdir build && cd build
cmake -DARM_PROJECT=1 -DCMAKE_BUILD_TYPE=Debug -DCMAKE_TOOLCHAIN_FILE=../extern/CMSIS_5/CMSIS/DSP/gcc.cmake  ..

utensor_cgen's People

Contributors

aabadie, dboyliao, edoffagne, jakubvalis88, janjongboom, knight-x, mbartling, mlubinsky-arm, neil-tan, salmonfresh, tung7970


utensor_cgen's Issues

Attribute error module has no attribute

I installed with pip, but I am getting the AttributeError in the title from tensorflow.

I tried upgrading protobuf. If I install from the specified .whl, I get an import error instead, so I upgraded tensorflow to 1.13.

Pre/Post Transform/Apply Observer

Implement an interface for users to register pre/post transformation/apply observers for BackendPart and Transformer.

@classmethod
def register_prior_observer(cls, callback):
    cls.prior_observer.append(callback)

@classmethod
def register_post_observer(cls, callback):
    cls.post_observer.append(callback)

def apply(self, ugraph):
    for callback in type(self).prior_observer:
        callback(ugraph)
    # do work with ugraph
    for callback in type(self).post_observer:
        callback(ugraph)

This unifies the interface for users to inspect the graph, e.g. injecting a pdb breakpoint in a callback.
Note that these observers should not introduce side effects to the graph.

Naming issue in utensor_cgen

The generated model code had around 51 errors: instead of writing, for example, t_sequential_1dense_9BiasAddReadVariableOpresource0, it would write t_sequential_1dense_9;BiasAddReadVariableOpresource0, so I had to remove the ; from each tensor name by hand.

Using list.find instead of list.index()

Pretty sure this is a simple mix-up between MATLAB and Python: list has no find method, so base.py line 261 on the click-cli branch should use list.index() instead.

(venv) micbar02 @ C02SP20YFVH4 ~/git/tf/rom-demo/utensor-mnist-demo (f/rom-tensor)
└─ $ ▶ utensor-cli convert tensorflow-models/mnist_model/deep_mlp.pb --output-nodes y_pred
/Users/micbar02/git/tf/rom-demo/utensor-mnist-demo/venv/lib/python3.6/importlib/_bootstrap.py:219: RuntimeWarning: compiletime version 3.5 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.6
  return f(*args, **kwds)
2018-09-12 11:56:54.910455: I tensorflow/tools/graph_transforms/transform_graph.cc:318] Applying sort_by_execution_order
Traceback (most recent call last):
  File "/Users/micbar02/git/tf/rom-demo/utensor-mnist-demo/venv/bin/utensor-cli", line 11, in <module>
    load_entry_point('utensor-cgen', 'console_scripts', 'utensor-cli')()
  File "/Users/micbar02/git/tf/rom-demo/utensor-mnist-demo/venv/lib/python3.6/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/Users/micbar02/git/tf/rom-demo/utensor-mnist-demo/venv/lib/python3.6/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/Users/micbar02/git/tf/rom-demo/utensor-mnist-demo/venv/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/micbar02/git/tf/rom-demo/utensor-mnist-demo/venv/lib/python3.6/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/micbar02/git/tf/rom-demo/utensor-mnist-demo/venv/lib/python3.6/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/Users/micbar02/git/tf/rom-demo/utensor-mnist-demo/venv/src/utensor-cgen/utensor_cgen/cli.py", line 85, in convet_graph
    generator.generate(model_path)
  File "/Users/micbar02/git/tf/rom-demo/utensor-mnist-demo/venv/src/utensor-cgen/utensor_cgen/code_generator.py", line 48, in generate
    self._generate_from_pb(src_fname)
  File "/Users/micbar02/git/tf/rom-demo/utensor-mnist-demo/venv/src/utensor-cgen/utensor_cgen/code_generator.py", line 71, in _generate_from_pb
    ugraph = uTensorGraph(graph_def, self.output_nodes)
  File "/Users/micbar02/git/tf/rom-demo/utensor-mnist-demo/venv/src/utensor-cgen/utensor_cgen/ir/base.py", line 218, in __init__
    self._init_from_graph_def(graph, output_nodes)
  File "/Users/micbar02/git/tf/rom-demo/utensor-mnist-demo/venv/src/utensor-cgen/utensor_cgen/ir/base.py", line 348, in _init_from_graph_def
    ugraph=self)
  File "<attrs generated init 2c40c8c78e6d0c7eddf61ca4dcbd323e4c642a3c>", line 17, in __init__
  File "/Users/micbar02/git/tf/rom-demo/utensor-mnist-demo/venv/src/utensor-cgen/utensor_cgen/ir/base.py", line 178, in __attrs_post_init__
    self.ugraph.add_op(self)
  File "/Users/micbar02/git/tf/rom-demo/utensor-mnist-demo/venv/src/utensor-cgen/utensor_cgen/ir/base.py", line 261, in add_op
    op_idx = self.topo_order.find(op_name)
AttributeError: 'list' object has no attribute 'find'

utensor-cgen does not work with latest tensorflow (2.2.0)

I am getting this error when trying to run utensor_cgen with tensorflow 2.2.0:
pkg_resources.DistributionNotFound: The 'tensorflow==2.1.0' distribution was not found and is required by utensor-cgen
Is this a hard requirement, or can it be updated to support tf 2.2.0?

The snippet output is different from what quantized conv expects

in the conv snippet,

ctx.add(new RamTensor(), "out_conv_eightbit_quantized_conv:0", 2);
ctx.add(new RamTensor(), "out_conv_eightbit_quantized_conv:1", 2); // output min
ctx.add(new RamTensor(), "out_conv_eightbit_quantized_conv:2", 2); // output max
ctx.push(new QntConvOp<uint8_t, uint8_t, float>({ 1, 2, 2, 1 }, VALID),
         { "x_quint8_const:0", "w_filter_quint8_const:0", "x_min:0", "x_max:0", "w_filter_min:0", "w_filter_max:0" },
         { "out_conv_eightbit_quantized_conv:0", "out_conv_eightbit_quantized_conv:1", "out_conv_eightbit_quantized_conv:2" });

in the matmul snippet,

ctx.add(new RamTensor(), "z_eightbit_quantized_mat_mul:0", 2);
ctx.add(new RamTensor({1}), "z_eightbit_quantized_mat_mul:1", 2); // output min
ctx.add(new RamTensor({1}), "z_eightbit_quantized_mat_mul:2", 2); // output max
ctx.push(new QntMatMulOp<uint8_t, uint8_t, int>(),
         { "w_quint8_const:0", "w_min:0", "w_max:0", "x_quint8_const:0", "x_min:0", "x_max:0" },
         { "z_eightbit_quantized_mat_mul:0", "z_eightbit_quantized_mat_mul:1", "z_eightbit_quantized_mat_mul:2" });

Codegen gives the output min and max of quantized MatMul a shape ({1}), and quantized Conv2d should be the same; however, in the conv snippet the second and third outputs are created without a shape, which is not what the quantized conv op expects.
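
If that reading is right, a hedged guess at the intended conv snippet is to match the matmul form (assuming the quantized conv op expects {1}-shaped min/max outputs):

ctx.add(new RamTensor(), "out_conv_eightbit_quantized_conv:0", 2);
ctx.add(new RamTensor({1}), "out_conv_eightbit_quantized_conv:1", 2); // output min
ctx.add(new RamTensor({1}), "out_conv_eightbit_quantized_conv:2", 2); // output max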

Move snippets_cfg into Snippet objects

snippets_cfg is redundant, as these header dependencies can be encoded in the snippet objects directly. The header includes for a context are then the union of the includes of each of these objects.

Codegen Unit tests

Update the tests to cover the recent changes to the code generator: the IR pipeline.

High-Level Graph Builder

High-level API for declaring IR-space graphs.

Use cases:

  • pattern-graph declaration for the graph matcher
  • user-defined graphs and code generation

Hyphens in TF operation names being written into generated C++ code

using : utensor-cli, version 0.3.5

I've just found that if TensorFlow operation names contain hyphens, these are translated directly into C++ identifiers in the generated code, resulting in invalid code.

I couldn't spot an obvious C identifier sanitiser function in the code, but hopefully this will be an easy fix.

CLI list-support-ops fails

I've noticed that the command utensor-cli list-support-ops fails as the MissingOperator is incorrectly registered without a namespace.

(utensor_cgen_vr)ian@henry:~/Documents/source/utensor_cgen_vr_fix$ utensor-cli list-support-ops
2020-11-27 10:44:10.040446: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
[WARNING quantize.py <module> @ 12] trying to import deprecated quantization transformer
Traceback (most recent call last):
  File "/home/ian/Documents/source/utensor_cgen_vr/.venv/bin/utensor-cli", line 11, in <module>
    load_entry_point('utensor-cgen', 'console_scripts', 'utensor-cli')()
  File "/home/ian/Documents/source/utensor_cgen_vr/.venv/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/ian/Documents/source/utensor_cgen_vr/.venv/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/ian/Documents/source/utensor_cgen_vr/.venv/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/ian/Documents/source/utensor_cgen_vr/.venv/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ian/Documents/source/utensor_cgen_vr/.venv/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/ian/Documents/source/utensor_cgen_vr/.venv/lib/python3.8/site-packages/utensor_cgen/cli/backend.py", line 55, in list_support_ops
    pformat(backend.support_ops),
  File "/home/ian/Documents/source/utensor_cgen_vr/.venv/lib/python3.8/site-packages/utensor_cgen/backend/utensor/_backend_impl.py", line 84, in support_ops
    return OperatorFactory.support_op_types()
  File "/home/ian/Documents/source/utensor_cgen_vr/.venv/lib/python3.8/site-packages/utensor_cgen/backend/utensor/code_generator/rearch/_operators/_base.py", line 48, in support_op_types
    return set([
  File "/home/ian/Documents/source/utensor_cgen_vr/.venv/lib/python3.8/site-packages/utensor_cgen/backend/utensor/code_generator/rearch/_operators/_base.py", line 50, in <listcomp>
    for namespaces, op_type in cls._operators.keys()
ValueError: too many values to unpack (expected 2)

A fix is possible by supplying the namespaces:

diff --git a/utensor_cgen/backend/utensor/code_generator/rearch/_operators/_impls.py b/utensor_cgen/backend/utensor/code_generator/rearch/_operators/_impls.py
index a6f65bc..70b3c1e 100644
--- a/utensor_cgen/backend/utensor/code_generator/rearch/_operators/_impls.py
+++ b/utensor_cgen/backend/utensor/code_generator/rearch/_operators/_impls.py
@@ -822,8 +822,8 @@ class _FullyConnectedOperator(_CommonParams):
       nested_namespaces=type(self).namespaces,
     )
 
-
 class _MissingOperator(_Operator):
+  namespaces = ["ReferenceOperators","TflmSymQuantOps"]
   op_type = "_MissingOperator"
 
   def get_declare_snippet(self, op_var_name, with_const_params=True):
@@ -835,5 +835,5 @@ class _MissingOperator(_Operator):
   def get_construct_snippet(self, op_var_name):
     return None
 
-
-OperatorFactory._operators[_MissingOperator.op_type] = _MissingOperator
+for namespace in _MissingOperator.namespaces:
+  OperatorFactory._operators[((namespace,), _MissingOperator.op_type)] = _MissingOperator

ONNX frontend bug with MissingOperator

In #133, _MissingOperator was found to be using an incorrect dict key, due to the missing namespace term.
utensor_cgen/backend/utensor/code_generator/rearch/_operators/_impls.py:

class _MissingOperator(_Operator):
  op_type = "_MissingOperator"

  def get_declare_snippet(self, op_var_name, with_const_params=True):
    return None

  def get_eval_snippet(self, op_var_name, op_info, tensor_var_map):
    return MissingOpEvalSnippet(op_info, op_var_name, tensor_var_map)

  def get_construct_snippet(self, op_var_name):
    return None


OperatorFactory._operators[_MissingOperator.op_type] = _MissingOperator # <== key should be a 2-element tuple: (namespace, op_type)

'NoneType' object is not callable when converting a graph

I am following the demo here, but when I try to generate the graph I get the following error:

Traceback (most recent call last):
  File "/home/siraj/.local/bin/utensor-cli", line 11, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python2.7/dist-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/home/siraj/.local/lib/python2.7/site-packages/utensor_cgen/cli.py", line 60, in convet_graph
    from utensor_cgen.code_generator import CodeGenerator
  File "/home/siraj/.local/lib/python2.7/site-packages/utensor_cgen/code_generator.py", line 11, in <module>
    from .operators import OperatorFactory
  File "/home/siraj/.local/lib/python2.7/site-packages/utensor_cgen/operators.py", line 9, in <module>
    from utensor_cgen.transformer.optimizer import RefCntOptimizer
  File "/home/siraj/.local/lib/python2.7/site-packages/utensor_cgen/transformer/__init__.py", line 2, in <module>
    from .ns_transformer import *
  File "/home/siraj/.local/lib/python2.7/site-packages/utensor_cgen/transformer/ns_transformer.py", line 11, in <module>
    from utensor_cgen.ir import OperationInfo, uTensorGraph
  File "/home/siraj/.local/lib/python2.7/site-packages/utensor_cgen/ir/__init__.py", line 1, in <module>
    from .base import *
  File "/home/siraj/.local/lib/python2.7/site-packages/utensor_cgen/ir/base.py", line 23, in <module>
    from .converter import AttrValueConverter, ConverterFactory
  File "/home/siraj/.local/lib/python2.7/site-packages/utensor_cgen/ir/converter.py", line 149, in <module>
    class GenericTensorShapeMixin(Converter):
  File "/home/siraj/.local/lib/python2.7/site-packages/utensor_cgen/ir/converter.py", line 151, in GenericTensorShapeMixin
    class GenericType(object):
  File "/home/siraj/.local/lib/python2.7/site-packages/utensor_cgen/ir/converter.py", line 154, in GenericType
    @list_view.validator
TypeError: 'NoneType' object is not callable

I had this error before the current version, when weights were generated as .idx files. I thought it would be solved by now, but no luck.
Any idea what might be the cause?

Conversion from ONNX model fails with Relu

Simple models with Relu fail when converting via the ONNX frontend, as the Relu shape and dtype information is missing (None).

Can you please confirm whether ONNX is intended to be supported?

frontend/onnx.py (around line 277):

...
  @staticmethod
  def _handle_relu(op_info):
    input_tensor = op_info.input_tensors[0]
    op_info.output_tensors[0].dtype = input_tensor.dtype
    op_info.output_tensors[0].shape = input_tensor.shape[:]
...

test make error

cp: cannot create regular file 'idx_data/': Not a directory

OS: Ubuntu 16

Python2 broken

Problem:

File "/usr/local/lib/python2.7/site-packages/utensor_cgen/_snippets_base.py", line 2, in
  from abc import ABC

Most likely this needs to be changed to ABCMeta (abc.ABC was only added in Python 3.4, so the import fails on Python 2).

How to re-use a WrappedRamTensor and provide new input data

Hi,

I am a beginner with uTensor and embedded C/C++. I have a little experience with Python and wanted to study the development of intelligence at the edge by building models in Python and deploying them on Cortex boards. @neil-tan helped me understand the basics, and I used his tutorial to begin this understanding.

Passing the input data wrapped in a WrappedRamTensor works great the first time. When I try to provide another instance of input data and do a second pass, it gives me an error. What could I be doing wrong? Does the input data tensor have to be thread-safe?

Output with the error

[1] First instance of prediction: For input 10.000
 Input: 10.000 | Expected: 72.999 | Predicted: 71.871

 [2] Second instance of prediction: For input 40.000
[Error] lib\uTensor\core\context.cpp:96 @push Tensor "Placeholder:0" not found

Source code

  // A single value is being used so Tensor shape is {1, 1} 
  float input_data[1] = {10.0}; 
  Tensor* input_x = new WrappedRamTensor<float>({1, 1}, (float*) &input_data);

  // Value predicted by LR model
  S_TENSOR pred_tensor;         
  float pred_value;             
  
  // Compute model value for comparison
  float W = 6.968;
  float B = 3.319;
  float y;

  // First pass: Constant value 10.0 and evaluate first time:
  printf("\n [1] First instance of prediction: For input %4.3f", input_data[0]);
  get_LR_model_ctx(ctx, input_x);                   // Pass the 'input' data tensor to the context
  pred_tensor = ctx.get("y_pred:0");                // Get a reference to the 'output' tensor
  ctx.eval();                                       // Trigger the inference engine
  pred_value = *(pred_tensor->read<float>(0, 0));   // Get the result back

  y = W * input_data[0] + B;                        // Expected output

  printf("\n Input: %04.3f | Expected: %04.3f | Predicted: %04.3f", input_data[0], y, pred_value);
  
  // Second pass: Change input data and re-evaluate:
  input_data[0] = 40.0;
  printf("\n\n [2] Second instance of prediction: For input %4.3f\n", input_data[0]);
  get_LR_model_ctx(ctx, input_x);                   // Pass the 'input' data tensor to the context
  pred_tensor = ctx.get("y_pred:0");                // Get a reference to the 'output' tensor
  ctx.eval();                                       // Trigger the inference engine
  pred_value = *(pred_tensor->read<float>(0, 0));   // Get the result back

  y = W * input_data[0] + B;                        // Expected output

  printf("\n Input: %04.3f | Expected: %04.3f | Predicted: %04.3f", input_data[0], y, pred_value);
  
  printf("\n -------------------------------------------------------------------\n");
  return 0;
}

Allow inline optimizer to spit out a single variable

To do weight updates, it's much easier if a single block of memory is responsible for all the weights. One way of doing this would be to concatenate all the variables that the inline optimizer spits out and reference that memory directly from the trained.cpp file. Bonus points if you can pass a block of memory into trained.cpp, so you can place the weights in memory-mapped flash (on many ST boards the QSPI is mapped at 0x90000000).

I figured at first that using the existing file system functions would help with the latter, but LittleFS on (at least my) QSPI boards is extremely slow: inference time goes from 320 ms to 2000 ms on the DISCO-L475VG-IOT01A because of the slow fopen() calls (see ARMmbed/mbed-os#11085).
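
To make the idea concrete, here is a hedged sketch of what referencing one concatenated weight block in memory-mapped flash could look like (the address is the QSPI mapping mentioned above; the block name, accessor, and layout are purely illustrative):

#include <cstddef>
#include <cstdint>

// One contiguous block holding all weights, placed at the memory-mapped QSPI
// base address (0x90000000 on many ST boards). A generated trained.cpp would
// index into this block instead of owning a separate array per weight tensor.
static const uint8_t* const g_weight_block =
    reinterpret_cast<const uint8_t*>(0x90000000u);

// Illustrative accessor: each weight tensor becomes a byte offset into the block
inline const float* weight_at(std::size_t byte_offset) {
  return reinterpret_cast<const float*>(g_weight_block + byte_offset);
}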

pipenv forcing python 2 breaks cli

Steps to reproduce

pipenv --two install
pipenv shell
utensor-cli --help

Results in:

pkg_resources.DistributionNotFound: The 'backports.weakref>=1.0rc1' distribution was not found and is required by tensorflow
