
nnlib's People

Contributors

lee-bin, lu229

nnlib's Issues

What is d32 format?

Sorry for posting my question as an issue.

Many quantized hexagon_nn ops have a _d32 variant. What is the d32 format, and how is it different from the "flat" format?
e.g.

QuantizedAdd_8p8to8 - adds Input A and Input B together element-wise (flat format)
0: Input A data (quint8 tensor)
1: Input B data (quint8 tensor)
...
0: Output data (quint8 tensor)

QuantizedAdd_8p8to8_d32 - Elementwise Add; inputs and output are in d32 format
0: Input A data (qint8)
1: Input B data (qint8)
...
0: Output data (quint8)
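
Not an authoritative answer, but as a purely illustrative sketch (the functions flat_nhwc_offset and d32_offset below are made up for this comment, and hexagon_nn's real padding and chunk ordering may differ): the general idea behind a depth-32 ("d32") layout is that the depth axis is split into groups of 32, so that each HVX vector operation touches 32 consecutive depth values instead of striding across a flat NHWC buffer.

#include <stddef.h>

/* Flat NHWC offset: element (b, h, w, d) in a [B, H, W, D] tensor. */
static inline size_t flat_nhwc_offset(size_t H, size_t W, size_t D,
                                       size_t b, size_t h, size_t w, size_t d)
{
        return ((b * H + h) * W + w) * D + d;
}

/* Hypothetical depth-32 chunked offset, layout [B][ceil(D/32)][H][W][32].
 * Depth is padded up to a multiple of 32; the real d32 format also applies
 * padding rules on the other dimensions, which are omitted here. */
static inline size_t d32_offset(size_t H, size_t W, size_t D,
                                size_t b, size_t h, size_t w, size_t d)
{
        size_t d_chunks = (D + 31) / 32;   /* number of 32-wide depth slices */
        size_t chunk = d / 32;             /* which depth slice */
        size_t lane  = d % 32;             /* position inside the slice */
        return (((b * d_chunks + chunk) * H + h) * W + w) * 32 + lane;
}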

make error: hexagon-clang: error: clang frontend command failed with exit code 70 (use -v to see invocation)

Ubuntu:
Linux version 4.15.0-43-generic (buildd@lcy01-amd64-007)
(gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.10))
SDK:
Qualcomm-Hexagon_SDK-3.3.3

first: source setup_sdk_env.source
then: python install_dependencies.py
then: git clone nnlib
then: make tree VERBOSE=1 V=hexagon_Release_dynamic_toolv81_v60
build error:
fatal error: error in backend: Cannot select: t29: v32i32 = select_cc t4, Constant:i32<0>, t9, t11, setne:ch
t4: i32 = AssertSext t2, ValueType:ch:i8
t2: i32,ch = CopyFromReg t0, Register:i32 %vreg6
t1: i32 = Register %vreg6
t5: i32 = Constant<0>
t9: v32i32,ch = CopyFromReg t0, Register:v32i32 %vreg64
t8: v32i32 = Register %vreg64
t11: v32i32,ch = CopyFromReg t0, Register:v32i32 %vreg63
t10: v32i32 = Register %vreg63
In function: core_add_d32_mixed_hvx
Can you help me solve this problem?

Are there any benchmark results compared with MACE?

Hi, all!

Are there any benchmark results for nnlib compared with MACE, NCNN, etc.? Also, which performs better, nnlib or SNPE?

Thanks in advance.

How to use hexagon_nn with unsigned PD?

I apologize for posting my question as an issue.

I'm trying to run Qualcomm/Hexagon_SDK/3.4.3/examples/hexagon_nn/tutorials on a Xiaomi Mi 9 phone (SM8150 / SDM855).

Example: 001-nop.c (using libs/hexagon_nn/2.6)

I added #pragma weak remote_session_control and hexnn_controller_request_unsigned_pd() to 001-nop.c.

Tutorial execution in examples/hexagon_nn/tutorials shows:

python tutorials_walkthrough.py -T sm8150 -N
...
---- Run Examples on cDSP ----
---- Runing 001-nop		----
adb wait-for-device push /root/3.4.3/libs/hexagon_nn/2.6/hexagon_Debug_dynamic_toolv83_v66/ship/libhexagon_nn_skel.so /data/local/tmp/vendor/lib/rfsa/adsp/
/root/3.4.3/libs/hexagon_nn/2.6/hexagon_Debug_dynamic_toolv83_v66/ship/libhexagon_nn_skel.so: 1 file pushed. 1.4 MB/s (5007704 bytes in 3.443s)
adb wait-for-device push /root/3.4.3/examples/hexagon_nn/tutorials/android_Debug_aarch64/ship/001-nop /data/local/tmp/vendor/bin
/root/3.4.3/examples/hexagon_nn/tutorials/android_Debug_aarch64/ship/001-nop: 1 file pushed. 0.6 MB/s (118624 bytes in 0.176s)
adb wait-for-device shell chmod 777 /data/local/tmp/vendor/bin/001-nop
adb wait-for-device shell ADSP_LIBRARY_PATH="/data/local/tmp/vendor/lib/rfsa/adsp" /data/local/tmp/vendor/bin/001-nop
***************** remote_session_control is TRUE ****************
***************** remote_session_control returned 0 ****************
fastrpc_setup Done
Trying to hexagon_nn_config
hexagon_nn_config Done
hexagon_nn_init Done
hexagon_nn_append_node Done
hexagon_nn_prepare Done
Trying hexagon_nn_execute
Whoops... run failed: -1
Test Failed, err=-1

logcat shows that remote_handle_open for libhexagon_nn_skel.so was successful, but remote_handle_invoke failed:

12-21 23:50:18.501  3565  3565 V /data/local/tmp/vendor/bin/001-nop: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/fastrpc_apps_user.c:1724: Successfully opened fastrpc_shell_unsigned_3
12-21 23:50:18.521  3565  3565 V /data/local/tmp/vendor/bin/001-nop: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/fastrpc_apps_user.c:1870: Successfully created user PD on domain 3 (attrs 0x8)
12-21 23:50:18.545  3565  3565 V /data/local/tmp/vendor/bin/001-nop: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/fastrpc_apps_user.c:1006: remote_handle_open: Successfully opened handle 0xed2620 for hexagon_nn on domain 3
12-21 23:50:18.557  3565  3565 D /data/local/tmp/vendor/bin/001-nop: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/fastrpc_apps_user.c:925: Error 0xffffffff: remote_handle_invoke failed for handle 0xed2620, method 12 on domain 3 (sc 0xc020200)
12-21 23:50:18.557  3565  3566 D /data/local/tmp/vendor/bin/001-nop: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/fastrpc_apps_user.c:925: Error 0xffffffff: remote_handle_invoke failed for handle 0x3, method 4 on domain 3 (sc 0x4020200)
12-21 23:50:18.558  3565  3566 E /data/local/tmp/vendor/bin/001-nop: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/listener_android.c:244:listener protocol failure ffffffff
12-21 23:50:18.558  3565  3565 D /data/local/tmp/vendor/bin/001-nop: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/fastrpc_apps_user.c:925: Error 0x27: remote_handle_invoke failed for handle 0xed2620, method 13 on domain 3 (sc 0xd010000)
12-21 23:50:18.558  3565  3566 D /data/local/tmp/vendor/bin/001-nop: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/fastrpc_apps_user.c:925: Error 0x27: remote_handle_invoke failed for handle 0x3, method 4 on domain 3 (sc 0x4020200)
12-21 23:50:18.558  3565  3566 E /data/local/tmp/vendor/bin/001-nop: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/listener_android.c:251::error: 39: 0 == (nErr = __QAIC_HEADER(adsp_listener_next2)( ctx, nErr, 0, 0, &ctx, &handle, &sc, inBufs, inBufsLen, &inBufsLenReq))
12-21 23:50:18.558  3565  3566 E /data/local/tmp/vendor/bin/001-nop: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/listener_android.c:333:Error 0x27: listener2 thread exited

Full example code 001-nop.c:

#include "remote.h"

#pragma weak remote_session_control

// hexagon_nn.h includes most of the things you'll need to create and run graphs.
// Its most important includes are nn_graph.h, which includes nn_graph_if.h.
// Together, these provide the data-types for input/output tensors,
//   and the API you'll use for initializing, building, preparing and running
//   your graphs.
// NOTE: hexagon_nn.h redefines malloc(), alloc(), etc. so
//   they become a compile-error "OOPS MALLOC".  This is because you should
//   use rpcmem_alloc() instead.
#include <hexagon_nn.h>

// hexagon_nn_ops.h defines the various graph operations (e.g. "MatMul", "NOP"
//   and "Relu") which you can do.  Internally, it just expands
//   interface/ops.def into a usable format.  ops.def contains the list of
//   all implemented ops.
#include "hexagon_nn_ops.h"

// For printf, etc.
#include <stdio.h>

// If you're already familiar with SDK programming for the DSP,
//   you've probably used fastRPC.  There's already lots of examples
//   documenting its use, and the purpose of this tutorial is to
//   expose the hexagon_nn_* API, so we'll ignore the fastRPC details.
// For these tutorials, we create a couple functions
//   fastrpc_setup() and fastrpc_teardown(), and some required includes.
// FastRPC allows our code running on the ARM to call functions located
//   on the DSP, quite seamlessly.
// To enable this ARM/DSP communication, we need to open a channel.
//   We'll also need to be careful later how we call functions that cross
//   the ARM/DSP partition, e.g. sending pointers, to ensure the ARM and
//   DSP see the same data.
#include "sdk_fastrpc.h"




// The structure of our NOP network looks like this.
//   It's really just a NOP floating in space, with no inputs or outputs.
//
//
//                   ==============
//    ?????????      ||    NOP   ||      ?????????
//   ??nothing??     || id=0x1000||     ??nothing??
//    ?????????      ==============      ?????????
//

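// hexnn_controller_request_unsigned_pd():
//   Asks FastRPC to create the user protection domain on the cDSP as an
//   *unsigned* PD (DSPRPC_CONTROL_UNSIGNED_MODULE with enable=1), so that
//   an unsigned libhexagon_nn_skel.so can be loaded without being signed
//   for the device.  Because remote_session_control is declared weak above,
//   the request is simply skipped (ret stays -1) on SDKs/targets that do
//   not provide that symbol.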
int hexnn_controller_request_unsigned_pd() {
  int ret = -1;
  if (remote_session_control) {
    printf("***************** remote_session_control is TRUE ****************\n");
    struct remote_rpc_control_unsigned_module data;
    data.enable = 1;
    data.domain = CDSP_DOMAIN_ID;
    ret = remote_session_control(DSPRPC_CONTROL_UNSIGNED_MODULE, (void *) &data, sizeof(data));
    printf("***************** remote_session_control returned %d ****************\n", ret);
  } else {
    printf("***************** remote_session_control is FALSE ****************\n");
  }
  return ret;
}


int main(int argc, char **argv) {
        int err = 0;
        hexnn_controller_request_unsigned_pd();
        // Start the ARM/DSP communications channel so we can call
        //   library functions that execute on the dsp.
        if (fastrpc_setup() != 0) return 1;
        printf("fastrpc_setup Done\n");
        // The nnlib API consists of functions that begin "hexagon_nn_*"
        // This prefix indicates that the function will actually run on the DSP.
        // To run a neural network we'll use this basic API:
        //   1) hexagon_nn_config            - Start nnlib, preparing globals
        //   2) hexagon_nn_init              - Initialize a new graph
        //   3) hexagon_nn_set_debug_level   - Enable debug
        //   4) hexagon_nn_append_node       - Add nodes to the graph
        //   5) hexagon_nn_append_const_node - Add constants (pure data, not ops)
        //                                     (we won't need any for now)
        //   6) hexagon_nn_prepare           - Allocate memory, strategize,
        //                                     and optimize the graph for speed
        //   7) hexagon_nn_execute           - Run an inference
        //   8) hexagon_nn_teardown          - Destroy the graph, free resources

        // Ensures that nnlib is ready to start working.
        printf("Trying to hexagon_nn_config\n");
        hexagon_nn_config();
        printf("hexagon_nn_config Done\n");

        // Initialize a fresh, empty graph.  Return a graph-handle by reference.
        hexagon_nn_nn_id graph_id;
        if (hexagon_nn_init(&graph_id)) {
                printf("Whoops... Cannot init\n");
                return 2;
        }
        printf("hexagon_nn_init Done\n");

        // Set power level (to max/turbo)
        if ((err = hexagon_nn_set_powersave_level(0)) != 0) {
                printf("Whoops... Cannot set power level: %d\n", err);
                goto TEARDOWN;
        }

        // Select our debug level.  0=none, >4=max
        // When creating new graphs, it's nice to have max debug
        //   even if you don't think you need it.
        //hexagon_nn_set_debug_level(graph_id, 100);

        // Append a node to the graph.
        // We need to provide a unique-id so other nodes can connect.
        // The operation can be any of the ops found in interface/ops.def,
        //   prefixed with "OP_" (e.g. OP_MatMul_f, OP_Relu_f, OP_MaxPool_f)
        // Our NOP node doesn't need any padding, because it won't do anything.
        // Our input/output lists will be NULL in this example,
        //   but for real graphs we'll need to connect nodes using these lists.
        hexagon_nn_append_node(
                graph_id,           // Graph handle we're appending into
                0x1000,             // Node identifier (any unique uint32)
                OP_Nop,             // Operation of this node (e.g. Concat, Relu)
                NN_PAD_NA,          // Padding type for this node
                NULL,               // The list of inputs to this node
                0,                  //   How many elements in input list?
                NULL,               // The list of outputs from this node
                0                   //   How many elements in output list?
                );
        printf("hexagon_nn_append_node Done\n");
        // Prepare the graph for execution by optimizing it, allocating storage,
        //   connecting all the input/output pointers between nodes, and
        //   doing some basic checks, like number of input/output tensors and
        //   sizing for each node.
        if (hexagon_nn_prepare(graph_id)) {
                printf("Whoops... Cannot prepare\n");
        }
        printf("hexagon_nn_prepare Done\n");


        // Execute an inference on our input data.
        // Real graphs require input and output buffers, but we'll
        //   just use zero-size NULL pointers for this NOP example.
        uint32_t out_batches, out_height, out_width, out_depth, out_data_size;
        printf("Trying hexagon_nn_execute\n");
        if ((err = hexagon_nn_execute(
                     graph_id,
                     0, 0, 0, 0,             // Our input has 0-dimension
                     NULL,                   // Pointer to input data
                     0,                      // How many total bytes of input?
                     (unsigned int *) &out_batches,
                     (unsigned int *) &out_height,
                     (unsigned int *) &out_width,
                     (unsigned int *) &out_depth,
                     (uint8_t *)NULL,        // Pointer to output buffer
                     0,                      // Max size of output buffer
                     (unsigned int*) &out_data_size)         // Actual size used for output
                    ) != 0) {

                printf("Whoops... run failed: %d\n",err);
        }
        

TEARDOWN:
    // Free the memory, especially if we want to build subsequent graphs
    hexagon_nn_teardown(graph_id);

    // Stop fastRPC
    fastrpc_teardown();

    if (!err) printf("Test Passed!\n");
    else printf ("Test Failed, err=%d\n", err);

    return err;
}

Is the FullyConnected operator missing?

I have a question about the FullyConnected operator.

I found that NNLib has some draft op code in https://github.com/XiaoMi/nnlib/blob/master/hexagon/ops/src/op_fully_connected.c

It has a FullyConnected_u8 op and a couple of other commented-out ops such as QuantizedFC_8x8p8to8.

ops.def has only FullyConnected_u8
https://github.com/XiaoMi/nnlib/blob/master/interface/ops.def#L193

It looks like FullyConnected_u8 was designed for uint8 inputs, but I do not see input min/max tensors, weight min/max, or bias min/max. So it works with uint8 but non-quantized data? That seems strange.

What about QuantizedFC_8x8p8to8? It references two functions that do not exist, fc_layer_execute_opt and fc_layer_execute_ref. Other functions, fc_layer_execute and fc_layer_execute_hvx, are present instead, but they are not used by any operator.

supported_ops.txt does not have any info about a FullyConnected op.
https://github.com/XiaoMi/nnlib/blob/master/docs/supported_ops.txt

So, what should I do if my model has a FullyConnected layer? How do I convert it to a Hexagon nnlib graph?
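
One possible workaround, sketched purely as an illustration (the helper append_fc_from_existing_ops, the node ids 0x3000..0x3002, and the caller-supplied descriptor arrays below are hypothetical; the per-op tensor counts follow the usual data-plus-min/max convention and should be checked against docs/ops and ops.def in your tree): a fully connected layer can be assembled from ops that are documented, e.g. a quantized matmul, a requantize, and a quantized add for the bias.

#include <hexagon_nn.h>
#include "hexagon_nn_ops.h"

// Illustration only: build FC = matmul -> requantize -> bias add from
// existing ops instead of the draft FullyConnected_u8.  The input/output
// descriptor arrays are wired up by the caller, following the same
// append_node pattern as the SDK tutorial graphs.
static int append_fc_from_existing_ops(hexagon_nn_nn_id graph_id,
        hexagon_nn_input *matmul_in,  hexagon_nn_output *matmul_out,
        hexagon_nn_input *requant_in, hexagon_nn_output *requant_out,
        hexagon_nn_input *biasadd_in, hexagon_nn_output *biasadd_out)
{
        int err = 0;
        // activations x weights -> int32 accumulator (plus its min/max)
        err |= hexagon_nn_append_node(graph_id, 0x3000,
                OP_QuantizedMatMul_8x8to32, NN_PAD_NA,
                matmul_in, 6, matmul_out, 3);
        // int32 accumulator -> quint8 with a recomputed min/max range
        err |= hexagon_nn_append_node(graph_id, 0x3001,
                OP_QuantizeDownAndShrinkRange_32to8, NN_PAD_NA,
                requant_in, 3, requant_out, 3);
        // element-wise add of the (quantized) bias vector
        err |= hexagon_nn_append_node(graph_id, 0x3002,
                OP_QuantizedAdd_8p8to8, NN_PAD_NA,
                biasadd_in, 6, biasadd_out, 3);
        return err;
}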

Where does the PPrint op output the result?

I'm running the hexagon_nn example 004-xor-graph.c. It uses three PPrint_f operators,
but I do not see it dump the tensor contents to the terminal or to logcat.
Where does PPrint_f output the tensor contents? Has anyone tried to use it?

//                  ACTUAL GRAPH
//         [ 1,-1]                 [1]
//         [-1, 1]                 [1]
//         "layer1a"            "layer3a"
//          0x1001a              0x1003a
//               \                   \.
//   [1,0] -----> MatMul --> Relu --> MatMul ----.
//   Const       "layer1"  "layer2"  "layer3"     \.
//  0x1001b       0x1001    0x1002    0x1003       \.
//                     \         \         \        \.
//                      PPrint    PPrint    PPrint   Close
//                      0x2000    0x2001    0x2002   0x2003
//                                                   /
//                                                  /
//                                                [1]
//                                               Const
//                                              0x2003a
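
A hedged note rather than a confirmed answer (this assumes the hexagon_nn_set_debug_level() and hexagon_nn_getlog() calls declared in hexagon_nn.h, and the dump_dsp_log helper below is made up for this sketch): the PPrint ops go through the library's internal logging rather than printing directly on the host, so their output usually has to be fetched from the graph's DSP-side log buffer; if nothing shows up there, the messages may only be visible in the DSP diagnostic log (mini-dm).

#include <stdio.h>
#include <stdlib.h>
#include <hexagon_nn.h>

// Sketch only: raise the debug level before preparing/executing the graph,
// then pull the DSP-side log buffer back to the host.  The 64 KB buffer
// size is an arbitrary choice.
static void dump_dsp_log(hexagon_nn_nn_id graph_id)
{
        unsigned char *logbuf = malloc(65536);
        if (logbuf == NULL) return;
        logbuf[0] = '\0';
        if (hexagon_nn_getlog(graph_id, logbuf, 65536) == 0) {
                logbuf[65535] = '\0';          /* defensive termination */
                printf("---- DSP log ----\n%s\n", (char *)logbuf);
        }
        free(logbuf);
}

// Usage, around the existing calls in 004-xor-graph.c:
//   hexagon_nn_set_debug_level(graph_id, 2);
//   ... hexagon_nn_prepare(graph_id); hexagon_nn_execute(...); ...
//   dump_dsp_log(graph_id);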

pmu_adsp.h not found

Hi. I am currently trying to run the nnlib example, but I encounter the following error:

test/graphmain.c:436:22: fatal error: pmu_adsp.h: No such file or directory
 #include "pmu_adsp.h"
                      ^
compilation terminated.

I have searched for the header file online, but can't seem to find it:

  1. Is this something that has been encountered on your system?
  2. If so, how did you resolve it?

CreateProcess fails

When I use the "make tree VERBOSE=1 V=hexagon_Release_dynamic_toolv83_v66" command to compile, the following error occurs:

make[1]: Entering directory 'E:/Tools/Hexagon_SDK_3.4.3/Hexagon_SDK/3.4.3/libs/hexagon_nn/nnlib-master_20191101/nnlib-master'
interface/proto_to_idl.pl
process_begin: CreateProcess(NULL, perl E:\Tools\Hexagon_SDK_3.4.3\Hexagon_SDK\3.4.3\libs\hexagon_nn\nnlib-master_20191101\nnlib-master\interface\proto_to_idl.pl, ...) failed.
make (e=2): The system cannot find the file specified.
make[1]: *** [interface/hexagon_nn.idl] Error 2
make[1]: Leaving directory 'E:/Tools/Hexagon_SDK_3.4.3/Hexagon_SDK/3.4.3/libs/hexagon_nn/nnlib-master_20191101/nnlib-master'
make: *** [tree] Error 2

graph_app executable file is not generated

Hello, I have compiled hexagon nnlib after:
0) Download and install the Hexagon SDK 3.3 (3.2 may also work)

  1. Source the Hexagon SDK setup_sdk_env.sh script.
  2. make tree VERBOSE=1 V=hexagon_Release_dynamic_toolv81_v60
  3. make tree VERBOSE=1 V=android_Release

However, I did not get a graph_app executable file under android_Release/ship; I only got the dynamic library file.

Usage of this repository and related questions

Hi, I'd like to ask:

  1. Is this nnlib a wrapper layer on top of the Hexagon DSP SDK, used to bridge it to Android? I'm unclear about the underlying call relationships.
  2. Who maintains the code in this repository? Is it the official Android project? I see the repo also links to https://source.codeaurora.org/quic/hexagon_nn/nnlib, and the code at that link was updated 11 days ago; are Xiaomi's commits here kept in sync with it?
  3. What is the relationship between this nnlib, the SNPE DSP/AIP runtimes, the Hexagon DSP SDK, and the Android NN API?

Sorry for so many questions, and thanks!
