onednn's Issues

Access violation in JIT-compiled code

Summary

Disclaimer: I'm not very familiar with this library, or with neural network processing in general. I'm working with a model developed by a sister company (created using PyTorch, converted to OpenVINO IR from ONNX) and we are getting a persistent crash when running on Windows with a large number of concurrent streams. With some effort I narrowed down the cause and identified a fix that worked for us, but there are probably much more elegant ways to fix this, hence this is an issue report instead of a pull request.

The cause of the crash seems to be this instruction:

00000179D623120F C4 E3 6D 4A 18 00    vblendvps   ymm3,ymm2,ymmword ptr [rax],ymm0  

which is part of a JIT routine called by

template <cpu_isa_t isa, data_type_t acc_type, data_type_t dst_type>
void jit_pp_kernel_t<isa, acc_type, dst_type>::operator()(dst_data_t *dst,
        const acc_data_t *acc, const char *bias, const float *scales,
        size_t start, size_t end, size_t runtime_oc,
        const float *dst_zero_points) const

in jit_gemm_inner_product_utils.cpp. At the time of the crash RAX is pointing less than 32 bytes from the end of an allocated memory page, and the addresses following this page are invalid. The YMM instructions operate on 32 bytes (256 bits) at a time, so this causes an access violation.

For example, in one of my crash dumps RAX has the value 00000179EC8A6FE4, which is only 28 bytes before the page boundary at 00000179EC8A7000, and the memory there looks like this:

0x00000179EC8A6FE4  06 6f 8f 3d fd fd fd fd  .o.=ýýýý
0x00000179EC8A6FEC  dd dd dd dd dd dd dd dd  ÝÝÝÝÝÝÝÝ
0x00000179EC8A6FF4  dd dd dd dd dd dd dd dd  ÝÝÝÝÝÝÝÝ
0x00000179EC8A6FFC  dd dd dd dd ?? ?? ?? ??  ÝÝÝÝ....
0x00000179EC8A7004  ?? ?? ?? ?? ?? ?? ?? ??  ........
...
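The arithmetic behind that dump can be checked with a small sketch. It assumes 4 KiB pages (the usual page size on x64 Windows); the helper names `bytes_to_page_end` and `load_crosses_page` are made up for illustration and are not part of oneDNN.

```cpp
#include <cstdint>

// Assumption: 4 KiB pages, the usual page size on x64 Windows.
constexpr std::uint64_t kPageSize = 4096;

// Number of bytes from addr up to the end of the page that contains it.
constexpr std::uint64_t bytes_to_page_end(std::uint64_t addr) {
    return kPageSize - (addr % kPageSize);
}

// True if a load of `width` bytes starting at addr reaches into the next page.
constexpr bool load_crosses_page(std::uint64_t addr, std::uint64_t width) {
    return width > bytes_to_page_end(addr);
}

// RAX from the crash dump: only 28 bytes remain in the page, so a
// 32-byte YMM load touches the following (unmapped) page.
static_assert(bytes_to_page_end(0x00000179EC8A6FE4ULL) == 28,
        "28 bytes left before the page boundary");
static_assert(load_crosses_page(0x00000179EC8A6FE4ULL, 32),
        "a 32-byte load crosses the boundary");
```

A 32-byte load from an address with 32 or more bytes left in its page stays inside that page, which would explain why the crash is intermittent: it only occurs when the last elements of a buffer happen to sit right at the end of a mapped page.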

I seem to have fixed this by adding 32 bytes of padding to all node/edge memory allocations, using the following patch:

diff --git a/src/common/memory_desc_wrapper.hpp b/src/common/memory_desc_wrapper.hpp
index 4017db3dc..158120ad5 100644
--- a/src/common/memory_desc_wrapper.hpp
+++ b/src/common/memory_desc_wrapper.hpp
@@ -162,7 +162,7 @@ struct memory_desc_wrapper : public c_compatible {
                 max_size = utils::array_product(bd.inner_blks, bd.inner_nblks);
             }

-            return max_size * data_type_size() + additional_buffer_size();
+            return max_size * data_type_size() + additional_buffer_size() + 32;
         }
     }

This is enough to allow us to move forward, but I'm sure someone with a better understanding of the code can do better.
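For what it's worth, one generic alternative to padding every allocation is to avoid the full-width read on the tail: copy only the remaining elements into a 32-byte staging buffer and do the vector load from there. The sketch below is plain C++ and only illustrates the idea; `load_tail_padded` and `kVecFloats` are hypothetical names, and I have not checked how this would fit into the JIT kernel.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// One YMM register holds 8 single-precision floats (32 bytes).
constexpr std::size_t kVecFloats = 8;

// Hypothetical helper: fills a 32-byte staging buffer with zeros, then
// copies at most `remaining` floats from src into it. Only bytes that
// belong to the source buffer are ever read.
inline void load_tail_padded(const float *src, std::size_t remaining,
        float (&staging)[kVecFloats]) {
    std::fill(staging, staging + kVecFloats, 0.0f);
    const std::size_t n = std::min(remaining, kVecFloats);
    std::memcpy(staging, src, n * sizeof(float));
}
```

A full-width load from `staging` is then always safe regardless of where `src` ends. The trade-off is an extra copy on the tail iteration, whereas the +32 padding patch above costs memory on every allocation instead.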

Version

Git hash is e0381c3. This is the version referenced by OpenVINO 2021.4.

Environment

CPU: Intel(R) Core(TM) i7-9700 CPU @ 3.00GHz
OS version: Windows 10 Enterprise LTSC (10.0.17763)
Compiler version: Microsoft (R) C/C++ Optimizing Compiler Version 19.29.30133 for x64
CMake version: 3.16.2

Steps to reproduce

I wish I could give you something better to go on here, but we were only able to reproduce this in a system that was live streaming from 64 network cameras and running them all through the model in real time. I was not able to reproduce using a test harness that read the data from disk instead of over a network.

Observed behavior

The program crashes intermittently (see summary). This can happen after a couple of minutes or a couple of hours.

Expected behavior

The program does not crash.
