onnx / onnx Goto Github PK

Open standard for machine learning interoperability

License: Apache License 2.0

Shell 0.04% Python 55.44% C++ 43.15% Jupyter Notebook 0.62% C 0.04% CMake 0.68% Batchfile 0.01% PowerShell 0.02%

deep-learning deep-neural-networks neural-network onnx pytorch mxnet tensorflow keras scikit-learn ml

onnx's Introduction

Open Neural Network Exchange (ONNX) is an open ecosystem that empowers AI developers to choose the right tools as their project evolves. ONNX provides an open source format for AI models, both deep learning and traditional ML. It defines an extensible computation graph model, as well as definitions of built-in operators and standard data types. Currently we focus on the capabilities needed for inferencing (scoring).

ONNX is widely supported and can be found in many frameworks, tools, and hardware. Enabling interoperability between different frameworks and streamlining the path from research to production helps increase the speed of innovation in the AI community. We invite the community to join us and further evolve ONNX.

Use ONNX

Learn about the ONNX spec

Overview
ONNX intermediate representation spec
Versioning principles of the spec
Operators documentation
Operators documentation (latest release)
Python API Overview

Programming utilities for working with ONNX Graphs

Contribute

ONNX is a community project and the open governance model is described here. We encourage you to join the effort and contribute feedback, ideas, and code. You can participate in the Special Interest Groups and Working Groups to shape the future of ONNX.

Check out our contribution guide to get started.

If you think some operator should be added to ONNX specification, please read this document.

Community meetings

The schedules of the regular meetings of the Steering Committee, the working groups and the SIGs can be found here

Community Meetups are held at least once a year. Content from previous community meetups are at:

Discuss

We encourage you to open Issues, or use Slack (If you have not joined yet, please use this link to join the group) for more real-time discussion.

Stay up to date with the latest ONNX news. [Facebook] [Twitter]

Roadmap

A roadmap process takes place every year. More details can be found here

Installation

Official Python packages

ONNX released packages are published in PyPi.

pip install onnx  # or pip install onnx[reference] for optional reference implementation dependencies

ONNX weekly packages are published in PyPI to enable experimentation and early testing.

vcpkg packages

onnx is in the maintenance list of vcpkg, you can easily use vcpkg to build and install it.

git clone https://github.com/microsoft/vcpkg.git
cd vcpkg
./bootstrap-vcpkg.bat # For powershell
./bootstrap-vcpkg.sh # For bash
./vcpkg install onnx

Conda packages

A binary build of ONNX is available from Conda, in conda-forge:

conda install -c conda-forge onnx

Build ONNX from Source

Before building from source uninstall any existing versions of onnx pip uninstall onnx.

c++17 or higher C++ compiler version is required to build ONNX from source. Still, users can specify their own CMAKE_CXX_STANDARD version for building ONNX.

If you don't have protobuf installed, ONNX will internally download and build protobuf for ONNX build.

Or, you can manually install protobuf C/C++ libraries and tools with specified version before proceeding forward. Then depending on how you installed protobuf, you need to set environment variable CMAKE_ARGS to "-DONNX_USE_PROTOBUF_SHARED_LIBS=ON" or "-DONNX_USE_PROTOBUF_SHARED_LIBS=OFF". For example, you may need to run the following command:

Linux:

export CMAKE_ARGS="-DONNX_USE_PROTOBUF_SHARED_LIBS=ON"

Windows:

set CMAKE_ARGS="-DONNX_USE_PROTOBUF_SHARED_LIBS=ON"

The ON/OFF depends on what kind of protobuf library you have. Shared libraries are files ending with *.dll/*.so/*.dylib. Static libraries are files ending with *.a/*.lib. This option depends on how you get your protobuf library and how it was built. And it is default OFF. You don't need to run the commands above if you'd prefer to use a static protobuf library.

Windows

If you are building ONNX from source, it is recommended that you also build Protobuf locally as a static library. The version distributed with conda-forge is a DLL, but ONNX expects it to be a static library. Building protobuf locally also lets you control the version of protobuf. The tested and recommended version is 3.21.12.

The instructions in this README assume you are using Visual Studio. It is recommended that you run all the commands from a shell started from "x64 Native Tools Command Prompt for VS 2019" and keep the build system generator for cmake (e.g., cmake -G "Visual Studio 16 2019") consistent while building protobuf as well as ONNX.

You can get protobuf by running the following commands:

git clone https://github.com/protocolbuffers/protobuf.git
cd protobuf
git checkout v21.12
cd cmake
cmake -G "Visual Studio 16 2019" -A x64 -DCMAKE_INSTALL_PREFIX=<protobuf_install_dir> -Dprotobuf_MSVC_STATIC_RUNTIME=OFF -Dprotobuf_BUILD_SHARED_LIBS=OFF -Dprotobuf_BUILD_TESTS=OFF -Dprotobuf_BUILD_EXAMPLES=OFF .
msbuild protobuf.sln /m /p:Configuration=Release
msbuild INSTALL.vcxproj /p:Configuration=Release

Then it will be built as a static library and installed to <protobuf_install_dir>. Please add the bin directory(which contains protoc.exe) to your PATH.

set CMAKE_PREFIX_PATH=<protobuf_install_dir>;%CMAKE_PREFIX_PATH%

Please note: if your protobuf_install_dir contains spaces, do not add quotation marks around it.

Alternative: if you don't want to change your PATH, you can set ONNX_PROTOC_EXECUTABLE instead.

set CMAKE_ARGS=-DONNX_PROTOC_EXECUTABLE=<full_path_to_protoc.exe>

Then you can build ONNX as:

git clone https://github.com/onnx/onnx.git
cd onnx
git submodule update --init --recursive
# prefer lite proto
set CMAKE_ARGS=-DONNX_USE_LITE_PROTO=ON
pip install -e .

Linux

First, you need to install protobuf. The minimum Protobuf compiler (protoc) version required by ONNX is 3.6.1. Please note that old protoc versions might not work with CMAKE_ARGS=-DONNX_USE_LITE_PROTO=ON.

Ubuntu 20.04 (and newer) users may choose to install protobuf via

apt-get install python3-pip python3-dev libprotobuf-dev protobuf-compiler

In this case, it is required to add -DONNX_USE_PROTOBUF_SHARED_LIBS=ON to CMAKE_ARGS in the ONNX build step.

A more general way is to build and install it from source. See the instructions below for more details.

Installing Protobuf from source

Debian/Ubuntu:

  git clone https://github.com/protocolbuffers/protobuf.git
  cd protobuf
  git checkout v21.12
  git submodule update --init --recursive
  mkdir build_source && cd build_source
  cmake ../cmake -Dprotobuf_BUILD_SHARED_LIBS=OFF -DCMAKE_INSTALL_PREFIX=/usr -DCMAKE_INSTALL_SYSCONFDIR=/etc -DCMAKE_POSITION_INDEPENDENT_CODE=ON -Dprotobuf_BUILD_TESTS=OFF -DCMAKE_BUILD_TYPE=Release
  make -j$(nproc)
  make install

CentOS/RHEL/Fedora:

  git clone https://github.com/protocolbuffers/protobuf.git
  cd protobuf
  git checkout v21.12
  git submodule update --init --recursive
  mkdir build_source && cd build_source
  cmake ../cmake  -DCMAKE_INSTALL_LIBDIR=lib64 -Dprotobuf_BUILD_SHARED_LIBS=OFF -DCMAKE_INSTALL_PREFIX=/usr -DCMAKE_INSTALL_SYSCONFDIR=/etc -DCMAKE_POSITION_INDEPENDENT_CODE=ON -Dprotobuf_BUILD_TESTS=OFF -DCMAKE_BUILD_TYPE=Release
  make -j$(nproc)
  make install

Here "-DCMAKE_POSITION_INDEPENDENT_CODE=ON" is crucial. By default static libraries are built without "-fPIC" flag, they are not position independent code. But shared libraries must be position independent code. Python C/C++ extensions(like ONNX) are shared libraries. So if a static library was not built with "-fPIC", it can't be linked to such a shared library.

Once build is successful, update PATH to include protobuf paths.

Then you can build ONNX as:

git clone https://github.com/onnx/onnx.git
cd onnx
git submodule update --init --recursive
# Optional: prefer lite proto
export CMAKE_ARGS=-DONNX_USE_LITE_PROTO=ON
pip install -e .

Mac

export NUM_CORES=`sysctl -n hw.ncpu`
brew update
brew install autoconf && brew install automake
wget https://github.com/protocolbuffers/protobuf/releases/download/v21.12/protobuf-cpp-3.21.12.tar.gz
tar -xvf protobuf-cpp-3.21.12.tar.gz
cd protobuf-3.21.12
mkdir build_source && cd build_source
cmake ../cmake -Dprotobuf_BUILD_SHARED_LIBS=OFF -DCMAKE_POSITION_INDEPENDENT_CODE=ON -Dprotobuf_BUILD_TESTS=OFF -DCMAKE_BUILD_TYPE=Release
make -j${NUM_CORES}
make install

Once build is successful, update PATH to include protobuf paths.

Then you can build ONNX as:

git clone --recursive https://github.com/onnx/onnx.git
cd onnx
# Optional: prefer lite proto
set CMAKE_ARGS=-DONNX_USE_LITE_PROTO=ON
pip install -e .

Verify Installation

After installation, run

python -c "import onnx"

to verify it works.

Common Build Options

For full list refer to CMakeLists.txt

Environment variables

USE_MSVC_STATIC_RUNTIME should be 1 or 0, not ON or OFF. When set to 1 onnx links statically to runtime library. Default: USE_MSVC_STATIC_RUNTIME=0
DEBUG should be 0 or 1. When set to 1 onnx is built in debug mode. or debug versions of the dependencies, you need to open the CMakeLists file and append a letter d at the end of the package name lines. For example, NAMES protobuf-lite would become NAMES protobuf-lited. Default: Debug=0

CMake variables

ONNX_USE_PROTOBUF_SHARED_LIBS should be ON or OFF. Default: ONNX_USE_PROTOBUF_SHARED_LIBS=OFF USE_MSVC_STATIC_RUNTIME=0 ONNX_USE_PROTOBUF_SHARED_LIBS determines how onnx links to protobuf libraries.
- When set to ON - onnx will dynamically link to protobuf shared libs, PROTOBUF_USE_DLLS will be defined as described here, Protobuf_USE_STATIC_LIBS will be set to OFF and USE_MSVC_STATIC_RUNTIME must be 0.
- When set to OFF - onnx will link statically to protobuf, and Protobuf_USE_STATIC_LIBS will be set to ON (to force the use of the static libraries) and USE_MSVC_STATIC_RUNTIME can be 0 or 1.
ONNX_USE_LITE_PROTO should be ON or OFF. When set to ON onnx uses lite protobuf instead of full protobuf. Default: ONNX_USE_LITE_PROTO=OFF
ONNX_WERROR should be ON or OFF. When set to ON warnings are treated as errors. Default: ONNX_WERROR=OFF in local builds, ON in CI and release pipelines.

Common Errors

Note: the import onnx command does not work from the source checkout directory; in this case you'll see ModuleNotFoundError: No module named 'onnx.onnx_cpp2py_export'. Change into another directory to fix this error.
If you run into any issues while building Protobuf as a static library, please ensure that shared Protobuf libraries, like libprotobuf, are not installed on your device or in the conda environment. If these shared libraries exist, either remove them to build Protobuf from source as a static library, or skip the Protobuf build from source to use the shared version directly.
If you run into any issues while building ONNX from source, and your error message reads, Could not find pythonXX.lib, ensure that you have consistent Python versions for common commands, such as python and pip. Clean all existing build files and rebuild ONNX again.

Testing

ONNX uses pytest as test driver. In order to run tests, you will first need to install pytest:

pip install pytest nbval

After installing pytest, use the following command to run tests.

pytest

Development

Check out the contributor guide for instructions.

License

Apache License v2.0

Code of Conduct

ONNX Open Source Code of Conduct

onnx's People

Contributors

Stargazers

Watchers

Forkers

ezyang doneladams codeaudit shyamalschandra raonyguimaraes lihua213 sensorbotcluster qwliu honghuac sarangan live0717 superliujian jason124957 tangyiyong limberc diz-vara world2005 stoneyang shizihao123 caomw zgsxwsdxg micseb kfriesth lcskrishna zareefahmed rahulsherwan366 peterbai624 andreh7 intfrr alabazatam zdevito kgl-prml geoffreyporto frankatmech eliver8801 little1tow robertomalatesta deep-learning-fun cometyang rewathkafley skyformat99 mutual-ai alxsoares houseroad nakobi bddppq ifarhankhan star-python paprapara jammyzhou delta2323 vietkungfu jassonvia caozhengquan missfall itsarbit brettkoonce annielytix donbox ml-lab linkerzhang ahmedfadhil lxq2537664558 arnablegend shamoya onisimchukv zhixinshu neuroidss kekedan tony32769 dharmateja22 ctk0418 b2220333 briorig rosieb9 windyjune hbcbh1999 maniacs-oss aromazyl sibylfiresoul qeai babyformula hulalazz oztc moyanzitto krpopo imai-lm michaellee1994 agoila 2php tjingrant ankcrimson eachsaj scholltan mkolod jamesr66a dzhulgakov sagafav seanhsieh 10imaging

onnx's Issues

Protobuf naming consistency / nested protos

We've been keeping an eye on annoyances when we adjust PyTorch for latest changes in the ONNX proto, and we noticed some for #51

Not all protobuf messages have the Proto suffix anymore; in particular, the nested protobuf Dimension doesn't have the suffix. This makes it look strange when compared with all the other protobuf messages. Can we make this consistent?
Additionally, TypeProto defines nested protobufs. I wasn't sure how this was going to look in tooling, but we have discovered what happens with nanopb: you now get extremely ugly protobuf names like TypeProto_TensorTypeProto. This is especially ugly because Type is redundantly specified twice. Can we just put it at the top level of the protobuf?

ir_version, publisher_tag, and publisher_version should appear in as the initial fields in GraphProto

It's typical to put versioning fields at the beginning of a file/message format to allow implementations to efficiently sniff the version prior to parsing the entire buffer.

Doing so would a breaking change, but it's still quite early and we may have time.

ONNX design/larger feature process

As seen in the discussion in #9, I think there is some ambiguity when it comes to larger design/feature discussions and how to proceed. Eg:

We've definitely discussed about functions and subgraphs in the design discussions.

...

It was discussed quite a bit in our design meetings, and we even had some preliminary design. In the end, we decided to release only the parts that we felt safe and essential to get things started.

Is the plan to continue these design discussions in a public forum now that the project is public? My $0.02 is that we can instead do something like the Kubernetes proposal doc or the Rust language RFC process for larger features/IR design decisions.

Does a proposal doc PR as a concrete starting point sound like a reasonable idea for larger ideas/discussions for ONNX?

Are there any plans to add MXNet support?

Are there any plans to add Apache MXNet support and if so by when?

Thanks,
Jasmeet Bhatia

When will ncnn be supported?

hi
ncnn is an inference framework optimized for mobile platform, and is suitable to be implemented as onnx backend I think.
So when will ncnn be supported ?
Are there any documents on how to implement a backend?
http://github.com/tencent/ncnn

Unidentified "legacy functions"

I'm having some trouble exporting a model from Pytorch - I don't think it contains any legacy functions, but some code added to Onnx a few days ago seems to think it does:

(Pdb) torch.onnx.export(model, image, 'model.proto', verbose=True)
*** RuntimeError: Tracing of legacy functions is not supported
(Pdb) torch.onnx.export(model, image, 'model.proto')
*** RuntimeError: /mnt/nfs/users/dacart/pytorch/torch/csrc/jit/tracer.h:84: getTracingState: Assertion var_state == state failed.

To allow me to debug this, would it be possible to print some informative details of the function(s) that are causing the trouble? Thanks.

"conda install -c ezyang onnx" fails

The readme.md instruction for installing using Conda errors out with:

PackageNotFoundError: Packages missing in current channels:

  - onnx

We have searched for the packages in the following channels:

  - https://conda.anaconda.org/ezyang/osx-64
  - https://conda.anaconda.org/ezyang/noarch
  - https://repo.continuum.io/pkgs/main/osx-64
  - https://repo.continuum.io/pkgs/main/noarch
  - https://repo.continuum.io/pkgs/free/osx-64
  - https://repo.continuum.io/pkgs/free/noarch
  - https://repo.continuum.io/pkgs/r/osx-64
  - https://repo.continuum.io/pkgs/r/noarch
  - https://repo.continuum.io/pkgs/pro/osx-64
  - https://repo.continuum.io/pkgs/pro/noarch

support for locally connected layer (SpatialConvolutionLocal in pytorch)

are there any plans to support spatial convolution from pyTorch?
I saved my network using onnx from pytorch, and it includes SpatialConvolutionLocal node. When loading it from onnx, it throws the following error:

File ".../python2.7/site-packages/onnx/checker.py", line 36, in check_node 'Node op_type {} not recognized by onnx.'.format(node.op_type)) NameError: Node op_type SpatialConvolutionLocal not recognized by onnx.

The forward function of my network definition is like this:

   def forward(self, x):

    local = torch.nn.backends.thnn.backend.SpatialConvolutionLocal.apply

    x = local(x, self.W1, self.B1, 7, 7, 1, 1, 0, 0, 21, 29, 15,23)

    ...

    return x

Thanks!

Table of contents for operators

It can be sometimes tricky to jump to the definition of an operator like "Pad", where the string "pad" occurs multiple times in Operators.md. A modest usability improvement would be to add a table of contents, so that string search first discovers operator names, before a full text search.

Operator: Softmax

Requested in pytorch/pytorch#2747

Is Dlib support in the plans?

Is a to-Dlib support in the plans?
I find it very effective for pre-trained models deployment, especially in resource-constrained environments, so a Dlib ouputputter wold be really handy.

Thank you for this project.

-- Rob

Bundle TensorShapeProto with Tensor protos

Shape is a property of tensor types, one that no other type in the type system has. Rather than placing TensorShapeProto everywhere TypeProto appears, it would make more sense to move TensorShapeProto inside the Tensor messages.

This is a syntactic change, and should have any impact on the semantics or processing of shapes for inferencing or otherwise.

Element wise binary OPs / broadcasting

Why in element wise binary OP, we need a broadcast flag and axis attribute? Should we just follow numpy broadcast semantic by default?

Plans for Supporting Torch

@soumith: I know that there are plans for supporting PyTorch. Are there any plans for supporting Torch? Or will it be supported indirectly via a Torch to PyTorch converter?

torch->pytorch->caffe2

Is it possible using torch model and load it by pytorch and then turn it into caffe2 model?

Differences between BackendRep and Backend?

As the title said, what's the concrete meaning between them?
In this webpage,
https://github.com/onnx/onnx-caffe2/blob/master/onnx_caffe2/backend.py
It seems that Caffe2Backend extends more functions than BackendRep. th
what is the meaning of "Rep"?
Thank you in advance.

Composable Operators

There was an off-topic discussion in #3 about what the right level of abstraction is regarding ONNX operators. I'd like to further that discussion here and propose a potential solution.

This classic tradeoff was summarized by @soumith as:

Side 1: a mathematical IR, so that things are extremely expressive (also requested by the compiler efforts)
Side 2: vendors (particularly mobile vendors) implementing the operators in the IR efficiently

This tension between higher/coarser (side 2) and lower/finer (side 1) levels of abstraction is real and valid, but why not enable both levels to be represented simultaneously in a single data flow graph representation? By enabling operators to (optionally) be defined by other operators, it gives the mobile vendor the choice to match on the coarse operators (eg: SELU) and ignore the implementation of SELU provided for nascent backends.

One possibility then is to distribute the set of non-primitive operators in a 'standard library' of reference implementations using primitive operators (in addition to or in lieu of the C++ reference implementations), which can potentially grow/evolve at a more rapid pace from the primitive operators (mirroring something like libc++/libc and syscall interfaces).

Additionally, this enables a lifecycle for operators to make their way from researchers to hand tuned kernels: A researcher creates a new operator and defines it using ONNX primitives, this allows for all frameworks to implement this operator at some (potential) loss of performance immediately, but then once/if this operator becomes popularized, it can be matched by name directly to an optimized kernel.

This parallels the transition from assembly programming to higher (C) languages that enabled procedural programming and libraries, only in this case the domain of operators/functions is somewhat more constrained, so it is feasible for backends to pattern match on functions as well as instructions/blocks.

Another nice aspect of this design is that these composed operators are also useful for representing RNN cells, fused kernels, and managing the increasing levels of abstraction that we see in DL topologies (fprop, bprop, critic, generator, discriminator, etc), that will potentially be valuable for selecting/addressing regions of computation for manipulation (take all weights from 'generator' and quantize them for this backend). Or loading in a large graph and then subselecting a 'chunk' of it.

Dimension mismatch - did you forget to set broadcast=1?

I exported a simple model for MNIST from PyTorch 0.2.0+803afd5 (a commit from yesterday, Oct 9, 2017, self-built) and imported it with onnx 0.2 (py27hac7d9f4_1 from conda channel ezyang) into Caffe2 (commit cbe0d2821230a451c9dee9414abbace4a4112329 built from source).

Originally, the model tried to scale all pixel values from the range [0, 255] to [0, 1] by executing x = x / 255 but this causes an error during the export on the PyTorch side:

RuntimeError: Couldn't export DivConstant function - maybe it doesn't implement a symbolic definition?

Instead, I settled for

        c_np = np.array( [1.0/255], dtype=np.float32 )
        c_t = torch.Tensor(c_np)
        c = torch.autograd.Variable(c_t, requires_grad=False)

        x = x * c

When I try to run the imported ONNX model, I get the following output:

christoph:~/caffe2-demo$ python run-onnx.py MNIST.fc.protoWARNING:root:This caffe2 python run does not have GPU support. Will run in CPU only mode.
WARNING:root:Debug message: No module named caffe2_pybind11_state_gpu
Traceback for operator 1 in network torch-jit-export
Traceback (most recent call last):
  File "run-onnx.py", line 50, in <module>
    sys.exit( main() )
  File "run-onnx.py", line 43, in main
    output = model.run(x)
  File "/home/christoph/anaconda2/lib/python2.7/site-packages/onnx_caffe2-0.2-py2.7.egg/onnx_caffe2/backend_rep.py", line 50, in run
  File "/home/christoph/caffe2.x86_64-linux-gnu/caffe2/python/workspace.py", line 221, in RunNet
    StringifyNetName(name), num_iter, allow_fail,
  File "/home/christoph/caffe2.x86_64-linux-gnu/caffe2/python/workspace.py", line 186, in CallWithExceptionIntercept
    return func(*args, **kwargs)
RuntimeError: [enforce fail at elementwise_op.h:187] A.dims() == B.dims(). 1 784 vs 1. Dimension mismatch - did you forget to set broadcast=1? Error from operator: 
input: "1" input: "8" output: "9" name: "" type: "Mul" device_option { device_type: 0 cuda_gpu_id: 0 }

Please provide advice.

Is there any plans to Consider Predictive Model Markup Language (PMML) ?

Hi everyone,

Though this project is a start up , the Idea of having a Standard Exchange format for Deep learning model was one of my interest, since majority similar Concepts and Layers.

Here I will share my two cents of what know and what I think it should be considered , or at least as references:

Predictive Model Markup Language (PMML) is an XML-based predictive model interchange format. it provide a way for analytic applications to describe and exchange predictive models produced by data mining and machine learning algorithms, as well neural networks. Other details , you can refer to Wikipedia.
There are many GitHub Repositories (142 til this post date) that use it as exchange format between different programming platforms (Python/ Numpy , R , Julia , DL4J, etc...) as well between High-level and low-level frameworks.
Here are some repositories to consider :
- keras2pmml Simple PMML exporter for Keras Deep Learning models.
- TensorFlowPMML takes a PMML, build and runs tensorflow model.
- jpmml-tensorflow converting TensorFlow models to PMML .

These is what i can share , excluding other machine learning frameworks such Spark, flink , weka and scikit-learn. I hope these will give you some new thoughts.

Take care and cheers,

Apply attribute to input transform uniformly / outline best practices

In #63 we came to the conclusion that, whenever possible, we should prefer representing operators as taking a zero-dim tensor input (i.e., scalar) rather than an attribute. The motivation is that scalars as tensor inputs is strictly more expressive than attributes, because their values can vary at runtime (as opposed to attributes, which are fixed), so you're going to need the tensor input variants anyway, and so you might as well drop the attribute.

However, the PR only handles Pow and Slice, but there are a number of other operators (e.g., AddConstant) which take scalars, and need to be harmonized by removing the AddConstant version and deferring it to Add. We may also need to determine a coherent story for broadcasting, because the most parsimonious way to give meaning to addition of a tensor with a zero-dim tensor is through broadcasting, but may operators today (e.g., Gemm) explicitly specify whether or not broadcasting on a particular argument takes place. (Though, we may be able to sidestep this issue by special casing scalar.)

Furthermore, writing ONNX in this style needs some logic from frontends-backends, which we should document, to help frontend/backend writers.

If you are a frontend, and you convert from an attribute as opposed to an input, you have to allocate a Scalar to pass into the operator. We should make sure it's clear and easy how to allocate a zero-dim tensor in this case.
If you are a backend, and you convert to a scalar to an attribute as opposed to an input, because, perhaps, you don't support broadcasting, or you have a more efficient non-broadcasting implementation. In this case, you need to implement a simple constant propagation analysis on ONNX, so that at any point in time, you know if tensor inputs are constants or not. This is perhaps best prototyped in onnx-caffe2.

List of operators to look at:

Split (split input/attribute, is a list; this is particularly perplexing because it looks like PyTorch is emitting slices as non-attributes right now)
AddConstant/SubConstant (which are experimental)
(probably more experimental ops)

TensorProto, SparseTensorProto, and their names

Currently, a TensorProto represents an optionally-named Tensor value (dense representation), while a SparseTensorProto does not have a name (though it contains two nested TensorProtos which could have names). It would help clarify or clean this up.

(a) Option 1: Distinguish values from named values by removing the name from TensorProto and creating a separate message/type for "NamedValue" or "NamedTensor".

(b) Option 2: Add an optional name to SparseTensorProto

I prefer option (a).

Versioning policies

We are currently reluctant to target ONNX directly for our more critical serialized model paths given the frequency of breaking changes currently being made to ONNX. We agree these breaking changes are needed for the health of this project, but it would be nice to have:

Some indication of how long this unstable period will last.
An idea of what the versioning policies and guarantees will look like in the future 'stable' era.

I realize there is already an IR version number in onnx.proto so the thrust of this issue is to discuss what that field means, when it should change, and if it relates at all to the set of operators (or if that is versioned separately).

Reference exports (Model zoo)

It may be a good idea to add a set of models serialized in the onnx.proto format to the onnx/examples directory. This could serve as a reference for people testing ONNX or writing their own import/export functions. What do you think?

I would propose starting with a set of very simple models, getting progressively more and more complex: a simple addition (a+b+c), linear regression, logistic regression, MLP and then more advanced models such as AlexNet, VGG, ResNet, etc.

Resolve matrix multiply / fully connected / gemm operator issues

Right now, onnx has a single Dot operator which does both matrix multiply and 1D dot product. I would at least like to split these into two concepts. @zdevito has also proposed that we should exposed a gemm, although @ebarsoum thinks such an operator is too low level.

Interpretation of graph-valued and tensor-valued attributes underspecified

For graph-valued attributes:

Does the name of the graph have significance, and if so, how is it referenced?
Do the nodes in the graph valued attribute have access to the same Value namespace as the parent graph?
What scenarios are we trying to satisfy with graph-valued attributes? Are they meant to be a poor-man's function definition? Are they meant to allow operators to work as scheme/lisp-style special forms?

For tensor-valued attributes:

Is the tensor value available as a Node/Graph input or output?
If so, is it referenced by TensorProto.name or AttributeProto.name?

ValueError: The graph does not have an ir_version set properly.

I am trying to load a model that I exported from PyTorch. This is the code that I use to load the ONNX model from disk:

    graph = onnx.load(filename)
    onnx.checker.check_graph(graph)

This is the error message:

Traceback (most recent call last):
  File "run-onnx.py", line 51, in <module>
    sys.exit( main() )
  File "run-onnx.py", line 38, in main
    onnx.checker.check_graph(graph)
  File "/home/christoph/anaconda2/lib/python2.7/site-packages/onnx/checker.py", line 52, in check_graph
    raise ValueError('The graph does not have an ir_version set properly.')
ValueError: The graph does not have an ir_version set properly.

Please advise me how I can resolve or help resolve this error.

I am using conda 4.3.27 for Python 2 with onnx 0.1 (py27hd76de5e_1). The model file was created with self-compiled PyTorch 0.2.0+803afd5.

Generate docs for experimental operators

We could put them in Experimental.md or something.

cannot export Mnist network

The network is from https://github.com/pytorch/examples/blob/master/mnist/main.py

Traceback (most recent call last):
  File "onnx_graph.py", line 77, in <module>
    torch.onnx.export(model, dummy_input, "test.proto", verbose=True)
  File "/Users/dexter/anaconda3/lib/python3.6/site-packages/torch/onnx.py", line 72, in export
    _export(model, args, f, export_params, verbose, training)
  File "/Users/dexter/anaconda3/lib/python3.6/site-packages/torch/onnx.py", line 96, in _export
    torch._C._jit_pass_onnx(trace)
  File "/Users/dexter/anaconda3/lib/python3.6/site-packages/torch/onnx.py", line 121, in run_symbolic
    return symbolic_fn(*args)
  File "/Users/dexter/anaconda3/lib/python3.6/site-packages/torch/nn/_functions/thnn/pooling.py", line 105, in symbolic
    .is_("strides", _pair(stride)))
TypeError: is_(): incompatible function arguments. The following argument types are supported:
    1. (self: torch._C.Node, arg0: str, arg1: List[int]) -> torch._C.Node

Invoked with: %13 : UNKNOWN_TYPE = MaxPool[kernel_shape=[2, 2], pads=[0, 0], dilations=[1, 1]](%12), uses = [];
, 'strides', (None, None) (occurred when translating MaxPool2d)

ps. alexnet, vgg, resnet, densenet from torchvision works fine. And my pytorch and onnx are built from source on Oct 9.

Should ONNX have Dropout / request for FeatureDropout

If ONNX is intended to be an inference oriented framework, arguably it should not have Dropout at all (since this is a purely training-time construct).

If we decide we should keep Dropout, then I'd also like to request adding FeatureDropout to ONNX https://github.com/pytorch/pytorch/blob/master/torch/nn/_functions/dropout.py#L58

Index operator? Reconsider slice?

Currently, indexing is implemented using the Slice operator, which is more general. But the conversion is not very nice: quite disgusting: because slice as implemented only supports slicing in 1D tensors, we have to first reshape a tensor into 1D, convert the indices into a scalar representation, and then shape back to the correct thing. This is a bit awful. Pave the cowpaths. we have to compute a dim-size array specifying what our slice is going to be, and the squeeze away the dimension we want to remove. Should we make it easier to do this?

ConvTranspose def

I feel attributes of ConvTranspose is not really well defined. Specifically why do we have "output_shape" as an argument, I think it should be something we infer from the size of input right? Did we discuss about it before?
https://github.com/onnx/onnx/blob/master/onnx/defs/nn/defs.cc#L145

Operator: SubConstant/AddConstant

Add/subtract a constant from all elements of a tensor

Note that MulConstant is handled as "Scale" atm.

ImportError: No module named version

With the latest __version__ support commits, if you pip install -e onnx from source:

>>> import onnx
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "onnx/__init__.py", line 9, in <module>
    from .version import version as __version__
ImportError: No module named version

I think this is because we are not generating version.py for develop builds.

Fix here probably must be applied to onnx-caffe2 as well.

Add some typedef and type reference mechanism

If we add record types to the type system, using them as anonymous types will be space inefficient and could also be a source of model generation errors.

Since records consist of fields with heterogenous types, each field needs a name and a TypeProto, which takes up space, and repeating them at every location they are used adds no information to the system.

If we add a top-level map from string to TypeProto, and a 'type reference' type, model generators would have the option of placing record types (and others, for that matter) definition in that map, and then refer to them using the type reference, much like typedef works in C/C++.

The name for the type is an alias only, and has no impact on semantics of its use.

Human-Readable (JSON) Output

Hi,

First of all, this is the project that I was waiting for.
However, I have often to do with many different layouts.
Would it be possible to add a human-readable output for the network structure? Similar to this project:
CaffeModel2Json

Thanks a lot!

Feature request: LogSoftMax

While in principle, LogSoftMax can be implemented with a Log and Softmax, in practice, it's important to implement the fused operator because the backwards is numerically unstable otherwise.

Annoyances while writing frontend for TensorFlow

As reported by @fmassa:

Here are some differences I found that were a bit annoying

need to create extra graph nodes because Reshape and Transpose requires the shape/permutation as input, and not attributes. Also, due to the default value of Transpose in ONNX (revert all dimensions), I need to get the information of the dimension of the node that is transposed to be able to recreate this node
TF doesn’t accept NCHW for MaxPool2d on the CPU, need to add transposes there or always use NHWC (edited)

Redundant Transpose

It seems that each of the weight matrices of the torch.nn.Linear layers are transposed two times. Is this an intended behavior?

This is what I tried

import torch
import onnx
from torch.autograd import Variable
model = torch.nn.Sequential(torch.nn.Linear(784, 1024), torch.nn.ReLU(), torch.nn.Linear(1024, 10))
x = Variable(torch.randn(128, 784), requires_grad=True)
torch_out = torch.onnx._export(model, x, "sequential.onnx", export_params=True)
graph = onnx.load("sequential.onnx")
for node in graph.node:
    print(node)

which gave me

input: "1"
output: "6"
op_type: "Transpose"
attribute {
  name: "perm"
  ints: 1
  ints: 0
}

input: "6"
output: "7"
op_type: "Transpose"

input: "5"
input: "7"
input: "2"
output: "8"
op_type: "FC"

input: "8"
output: "9"
op_type: "Relu"

input: "3"
output: "10"
op_type: "Transpose"
attribute {
  name: "perm"
  ints: 1
  ints: 0
}

input: "10"
output: "11"
op_type: "Transpose"

input: "9"
input: "11"
input: "4"
output: "12"
op_type: "FC"

Bias term in Conv operator

Is there a possible bug in Conv operator defined in ConvOpSchemaGenerator?

schema.NumInputs(2, 3);
            schema.NumOutputs(1);
            schema.Input(0,
                         "X",
                         "Input data tensor from previous layer; has size (N x C x H x W)"
                         ", where N is the batch size, C is the number of channels, and"
                         " H and W are the height and width. Note that this is for the 2D image."
                         "Otherwise the size is (N x D1 x D2 ... x Dn)");
            schema.Input(1,
                         "weights",
                         "The weight tensor that will be used in the convolutions; "
                         "has size (M x C x kH x kW), where C is the number of channels, "
                         "and kH and kW are the height and width of the kernel, and M is the number "
                         "of feature maps. For more than 2 dimensions, the kernel shape will be "
                         "(M x C x k1 x k2 x ... x kn), where is the dimension of the kernel");
            schema.Output(0,
                          "Y",
                          "Output data tensor that contains the result of the convolution. The "
                          "output dimensions are functions of the kernel size, stride size, "
                          "and pad lengths.");

Input 2 as bias missing here?

BTW, there's no 'num_output' attribute for conv, conv_transpose, fullyconnected, and no 'use_bias' attribute as well.
In other words, these attributes are not retrievable in graph, instead, we have to parse these attributes from the initializers, this design is obviously not friendly to frameworks other than caffe2.

Am I missing something?

Define onnx.version

Self-explanatory.

Get rid of non-experimental "mutable" operators

ONNX doesn't support any inplace operations... except BatchNormalization in training mode, due to Caffe2 implementation vagaries. Since inplace operations are not coming to ONNX any time soon, we should drop the training mode version of BatchNorm and support inference only, with an explicitly specified mean/variance. I don't believe there are any other mutable operators in ONNX.

Installation OSX

(You should add the requirement of installing protobuf so that protoc is available to the readme).

Trying to install on OSX, any idea what I'm doing wrong?

Ich:onnx mh$ pip install onnx
Requirement already satisfied: onnx in /usr/local/lib/python2.7/site-packages
Requirement already satisfied: numpy in /usr/local/lib/python2.7/site-packages (from onnx)
Requirement already satisfied: protobuf in /usr/local/lib/python2.7/site-packages (from onnx)
Requirement already satisfied: six>=1.9 in /usr/local/lib/python2.7/site-packages (from protobuf->onnx)
Requirement already satisfied: setuptools in /usr/local/lib/python2.7/site-packages (from protobuf->onnx)
Ich:onnx mh$ pip3 install onnx
Requirement already satisfied: onnx in /usr/local/lib/python3.6/site-packages
Requirement already satisfied: protobuf in /usr/local/lib/python3.6/site-packages (from onnx)
Requirement already satisfied: numpy in /usr/local/lib/python3.6/site-packages (from onnx)
Requirement already satisfied: setuptools in /usr/local/lib/python3.6/site-packages (from protobuf->onnx)
Requirement already satisfied: six>=1.9 in /usr/local/lib/python3.6/site-packages (from protobuf->onnx)
Ich:onnx mh$ python -c 'import onnx'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "onnx/__init__.py", line 7, in <module>
    from . import checker, helper
  File "onnx/checker.py", line 14, in <module>
    from onnx import defs
  File "onnx/defs/__init__.py", line 6, in <module>
    import onnx.onnx_cpp2py_export as C
ImportError: No module named onnx_cpp2py_export

Support ReflectionPad2d (and/or ConstantPad2d)

Running the command torch_out = torch.onnx._export(torch_model, x, "saved_model.onnx", export_params=False) for custom model architecture produces following error:

Traceback (most recent call last):

File "pytorch_to_caffe2.py", line 133, in
torch_out = torch.onnx._export(torch_model, x, "saved_model.onnx", export_params=False
File "/opt/conda/lib/python2.7/site-packages/torch/onnx.py", line 54, in _export
proto = trace.export(verbose)
File "/opt/conda/lib/python2.7/site-packages/torch/nn/_function/thnn/auto.py", line 154, in symbolic
return symbolic_fn(*args, **kwargs)
TypeError: 'NoneType' object is not callable

I went into auto.py and print out symbolic_fn and indeed it is None when I get this error. For tutorial (which worked fine) this function is <function threshold_symoblic at 0x>.

When I print with verbose=True, I get the the following Error:

Error occured while handling:
%11 : Float(1, 3, 264, 264), %12 : Handle = ^ReflectionPad2d(4, 4, 4, 4)(%9), uses = [[%13.i0], []];

Exported graph so far:
graph(%1 : Float(32, 3, 9, 9)
%2 : Float(32)
%3 : Float(64, 32, 3, 3)
%4 : Float(64)
%5 : Float(32, 64, 3, 3)
%6 : Float(32)
%7 : Float(3, 32, 9, 9)
%8 : Float(3)
%9 : Float(1, 3, 256, 256)) {
return ();

Any ideas of what might be going on and how to successfully export my pytorch model to caffe2?

Thank you.

ONNX format documentation

Did I miss where the documentation for the ONNX format? I am hoping to be doing some work in the R language for some neural network/deep learning and would like to help support this for that framework as well. But I am struggling to find clear documentation of how the 'saved' model objects should be stored.

Resolve MaxPool channels issue

Reported at pytorch/pytorch#2717, apparently not all maxpool dimensions support channels. This might be just an onnx-caffe2 problem.

NNVM Compiler and Interpolation

Last month we saw the splashing release of onnx, which I think personally think is a great step toward common model exchange format of deep learning frameworks. I am glad that the community has been moving toward more interpolation and collaboration. First in case of dlpack for tensor structure and now in terms of model exchange. I also see a lot of common spirits when we develop NNVM.

We did not immediately jump into discussion issues because we think we should bring something to the community that can work with common exchange formats and focus on graph and tensor level optimization, thus the NNVM Compiler.

We would like now to jump in and discuss potential ways to work together. In the very least, now nnvm compiler provides an end to end solution to bring onnx to bare metal. As I mentioned to @soumith and @Yangqing personally before, I would like to see more interpolations happen and make nnvm compiler and onnx to work together to benefit the users.

Technically there is a barrier to code migration, barrier of the institutions or barrier of pride. But I think at least we can have an openminded discussion on what each member of the community think the exchange layer and compilation layer should look like, for example

Serialized IR(onnx) for exchange v.s. in memory IR(nnvm or pytorch's jit ir) for graph optimization
What can nnvm learn from onnx, vice versa
What should be good in-memory structure for graph IR
What is a good strategy for operator extension
Should numpy be supported as a basic primitive or lower level op for ease of vendor implementation

Through the discussion maybe there would be more interesting ways of interpolation and lessons learned.

Validate that raw_data XOR floats/etc field is used in TensorProto

In the onnx model (from the zoo), I'm seeing tensor data stored in raw_data field when it's "FLOAT" type. Shouldn't it be in float_data please? if not, do we want to create (or we already have) rules which field should be used for which type please? or we just have to go thru all fields to fetch out the data regardless of its type?

install failed. conda install -c ezyang onnx

Hit the this error on Windows 10 64bit (build 16299). Could someone please take a look? Thanks!

(C:\Users\swhsu\AppData\Local\Continuum\anaconda3) C:\Users\swhsu\Documents>conda install -c ezyang onnx
Fetching package metadata ...............

PackageNotFoundError: Packages missing in current channels:

onnx

We have searched for the packages in the following channels:

No support for map/dictionary or tuple/record values in attributes.

In looking at other frameworks, there are examples of operator attributes that are either maps (e.g., CategoricalMapping) or tuple/records (see most of CoreML's node definitions).

We may also want these as input/output values for operators.

onnx / onnx Goto Github PK

onnx's Introduction

Use ONNX

Learn about the ONNX spec

Programming utilities for working with ONNX Graphs

Contribute

Community meetings

Discuss

Follow Us

Roadmap

Installation

Official Python packages

vcpkg packages

Conda packages

Build ONNX from Source

Windows

Linux

Mac

Verify Installation

Common Build Options

Environment variables

CMake variables

Common Errors

Testing

Development

License

Code of Conduct

onnx's People

Contributors

Stargazers

Watchers

Forkers

onnx's Issues

Recommend Projects

Recommend Topics

Recommend Org