xiaomi / nnlib
Fork of https://source.codeaurora.org/quic/hexagon_nn/nnlib
License: BSD 3-Clause Clear License
Sorry for posting my question as an issue
Many quantized hexagon_nn ops have a _d32 variant. What is the D32 format, and how is it different from "flat"?
e.g.
QuantizedAdd_8p8to8 - elementwisely adds Input A and Input B together. (flat format)
0: Input A data (quint8 tensor)
1: Input B data (quint8 tensor)
...
0: Output data (quint8 tensor)
QuantizedAdd_8p8to8_d32 - Elementwise Add; inputs and output are in d32 format
0: Input A data (qint8)
1: Input B data (qint8)
...
0: Output data (quint8)
ubuntu:
Linux version 4.15.0-43-generic (buildd@lcy01-amd64-007)
(gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.10))
SDK:
Qualcomm-Hexagon_SDK-3.3.3
first: source setup_sdk_env.source
then: python install_dependencies.py
then: git clone nnlib
then: make tree VERBOSE=1 V=hexagon_Release_dynamic_toolv81_v60
build error:
fatal error: error in backend: Cannot select: t29: v32i32 = select_cc t4, Constant:i32<0>, t9, t11, setne:ch
t4: i32 = AssertSext t2, ValueType:ch:i8
t2: i32,ch = CopyFromReg t0, Register:i32 %vreg6
t1: i32 = Register %vreg6
t5: i32 = Constant<0>
t9: v32i32,ch = CopyFromReg t0, Register:v32i32 %vreg64
t8: v32i32 = Register %vreg64
t11: v32i32,ch = CopyFromReg t0, Register:v32i32 %vreg63
t10: v32i32 = Register %vreg63
In function: core_add_d32_mixed_hvx
Can you help me solve this problem?
Hi, all!
Are there any benchmark results comparing nnlib with MACE, ncnn, etc.? Also, which performs better, nnlib or SNPE?
Thanks in advance.
I apologize for posting my question as an issue.
I'm trying to run Qualcomm/Hexagon_SDK/3.4.3/examples/hexagon_nn/tutorials
on a Xiaomi Mi9 phone, SM8150 (SDM855).
Example: 001-nop.c
(using libs/hexagon_nn/2.6)
I added #pragma weak remote_session_control and hexnn_controller_request_unsigned_pd() to 001-nop.c.
Tutorial execution in examples/hexagon_nn/tutorials shows:
python tutorials_walkthrough.py -T sm8150 -N
...
---- Run Examples on cDSP ----
---- Runing 001-nop ----
adb wait-for-device push /root/3.4.3/libs/hexagon_nn/2.6/hexagon_Debug_dynamic_toolv83_v66/ship/libhexagon_nn_skel.so /data/local/tmp/vendor/lib/rfsa/adsp/
/root/3.4.3/libs/hexagon_nn/2.6/hexagon_Debug_dynamic_toolv83_v66/ship/libhexagon_nn_skel.so: 1 file pushed. 1.4 MB/s (5007704 bytes in 3.443s)
adb wait-for-device push /root/3.4.3/examples/hexagon_nn/tutorials/android_Debug_aarch64/ship/001-nop /data/local/tmp/vendor/bin
/root/3.4.3/examples/hexagon_nn/tutorials/android_Debug_aarch64/ship/001-nop: 1 file pushed. 0.6 MB/s (118624 bytes in 0.176s)
adb wait-for-device shell chmod 777 /data/local/tmp/vendor/bin/001-nop
adb wait-for-device shell ADSP_LIBRARY_PATH="/data/local/tmp/vendor/lib/rfsa/adsp" /data/local/tmp/vendor/bin/001-nop
***************** remote_session_control is TRUE ****************
***************** remote_session_control returned 0 ****************
fastrpc_setup Done
Trying to hexagon_nn_config
hexagon_nn_config Done
hexagon_nn_init Done
hexagon_nn_append_node Done
hexagon_nn_prepare Done
Trying hexagon_nn_execute
Whoops... run failed: -1
Test Failed, err=-1
logcat shows that remote_handle_open for libhexagon_nn_skel.so was successful, but remote_handle_invoke failed:
12-21 23:50:18.501 3565 3565 V /data/local/tmp/vendor/bin/001-nop: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/fastrpc_apps_user.c:1724: Successfully opened fastrpc_shell_unsigned_3
12-21 23:50:18.521 3565 3565 V /data/local/tmp/vendor/bin/001-nop: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/fastrpc_apps_user.c:1870: Successfully created user PD on domain 3 (attrs 0x8)
12-21 23:50:18.545 3565 3565 V /data/local/tmp/vendor/bin/001-nop: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/fastrpc_apps_user.c:1006: remote_handle_open: Successfully opened handle 0xed2620 for hexagon_nn on domain 3
12-21 23:50:18.557 3565 3565 D /data/local/tmp/vendor/bin/001-nop: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/fastrpc_apps_user.c:925: Error 0xffffffff: remote_handle_invoke failed for handle 0xed2620, method 12 on domain 3 (sc 0xc020200)
12-21 23:50:18.557 3565 3566 D /data/local/tmp/vendor/bin/001-nop: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/fastrpc_apps_user.c:925: Error 0xffffffff: remote_handle_invoke failed for handle 0x3, method 4 on domain 3 (sc 0x4020200)
12-21 23:50:18.558 3565 3566 E /data/local/tmp/vendor/bin/001-nop: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/listener_android.c:244:listener protocol failure ffffffff
12-21 23:50:18.558 3565 3565 D /data/local/tmp/vendor/bin/001-nop: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/fastrpc_apps_user.c:925: Error 0x27: remote_handle_invoke failed for handle 0xed2620, method 13 on domain 3 (sc 0xd010000)
12-21 23:50:18.558 3565 3566 D /data/local/tmp/vendor/bin/001-nop: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/fastrpc_apps_user.c:925: Error 0x27: remote_handle_invoke failed for handle 0x3, method 4 on domain 3 (sc 0x4020200)
12-21 23:50:18.558 3565 3566 E /data/local/tmp/vendor/bin/001-nop: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/listener_android.c:251::error: 39: 0 == (nErr = __QAIC_HEADER(adsp_listener_next2)( ctx, nErr, 0, 0, &ctx, &handle, &sc, inBufs, inBufsLen, &inBufsLenReq))
12-21 23:50:18.558 3565 3566 E /data/local/tmp/vendor/bin/001-nop: vendor/qcom/proprietary/commonsys-intf/adsprpc/src/listener_android.c:333:Error 0x27: listener2 thread exited
Full example code, 001-nop.c:
#include "remote.h"
#pragma weak remote_session_control
// hexagon_nn.h includes most of the things you'll need to create and run graphs.
// Its most important includes are nn_graph.h, which includes nn_graph_if.h.
// Together, these provide the data-types for input/output tensors,
// and the API you'll use for initializing, building, preparing and running
// your graphs.
// NOTE: hexagon_nn.h redefines malloc(), alloc(), etc. so
// they become a compile-error "OOPS MALLOC". This is because you should
// use rpcmem_alloc() instead.
#include <hexagon_nn.h>
// hexagon_nn_ops.h defines the various graph operations (e.g. "MatMul", "NOP"
// and "Relu") which you can do. Internally, it just expands
// interface/ops.def into a usable format. ops.def contains the list of
// all implemented ops.
#include "hexagon_nn_ops.h"
// For printf, etc.
#include <stdio.h>
// If you're already familiar with SDK programming for the DSP,
// you've probably used fastRPC. There's already lots of examples
// documenting its use, and the purpose of this tutorial is to
// expose the hexagon_nn_* API, so we'll ignore the fastRPC details.
// For these tutorials, we create a couple functions
// fastrpc_setup() and fastrpc_teardown(), and some required includes.
// FastRPC allows our code running on the ARM to call functions located
// on the DSP, quite seamlessly.
// To enable this ARM/DSP communication, we need to open a channel.
// We'll also need to be careful later how we call functions that cross
// the ARM/DSP partition, e.g. sending pointers, to ensure the ARM and
// DSP see the same data.
#include "sdk_fastrpc.h"
// The structure of our NOP network looks like this.
// It's really just a NOP floating in space, with no inputs or outputs.
//
//
// ==============
// ????????? || NOP || ?????????
// ??nothing?? || id=0x1000|| ??nothing??
// ????????? ============== ?????????
//
int hexnn_controller_request_unsigned_pd() {
    int ret = -1;
    if (remote_session_control) {
        printf("***************** remote_session_control is TRUE ****************\n");
        struct remote_rpc_control_unsigned_module data;
        data.enable = 1;
        data.domain = CDSP_DOMAIN_ID;
        ret = remote_session_control(DSPRPC_CONTROL_UNSIGNED_MODULE, (void *) &data, sizeof(data));
        printf("***************** remote_session_control returned %d ****************\n", ret);
    } else {
        printf("***************** remote_session_control is FALSE ****************\n");
    }
    return ret;
}
int main(int argc, char **argv) {
    int err = 0;
    hexnn_controller_request_unsigned_pd();
    // Start the ARM/DSP communications channel so we can call
    // library functions that execute on the dsp.
    if (fastrpc_setup() != 0) return 1;
    printf("fastrpc_setup Done\n");
    // The nnlib API consists of functions that begin "hexagon_nn_*"
    // This prefix indicates that the function will actually run on the DSP.
    // To run a neural network we'll use this basic API:
    //   1) hexagon_nn_config            - Start nnlib, preparing globals
    //   2) hexagon_nn_init              - Initialize a new graph
    //   3) hexagon_nn_set_debug_level   - Enable debug
    //   4) hexagon_nn_append_node       - Add nodes to the graph
    //   5) hexagon_nn_append_const_node - Add constants (pure data, not ops)
    //                                     (we won't need any for now)
    //   6) hexagon_nn_prepare           - Allocate memory, strategize,
    //                                     and optimize the graph for speed
    //   7) hexagon_nn_execute           - Run an inference
    //   8) hexagon_nn_teardown          - Destroy the graph, free resources
    // Ensures that nnlib is ready to start working.
    printf("Trying to hexagon_nn_config\n");
    hexagon_nn_config();
    printf("hexagon_nn_config Done\n");
    // Initialize a fresh, empty graph. Return a graph-handle by reference.
    hexagon_nn_nn_id graph_id;
    if (hexagon_nn_init(&graph_id)) {
        printf("Whoops... Cannot init\n");
        return 2;
    }
    printf("hexagon_nn_init Done\n");
    // Set power level (to max/turbo)
    if ((err = hexagon_nn_set_powersave_level(0)) != 0) {
        printf("Whoops... Cannot set power level: %d\n", err);
        goto TEARDOWN;
    }
    // Select our debug level. 0=none, >4=max
    // When creating new graphs, it's nice to have max debug
    // even if you don't think you need it.
    //hexagon_nn_set_debug_level(graph_id, 100);
    // Append a node to the graph.
    // We need to provide a unique-id so other nodes can connect.
    // The operation can be any of the ops found in interface/ops.def,
    // prefixed with "OP_" (e.g. OP_MatMul_f, OP_Relu_f, OP_MaxPool_f)
    // Our NOP node doesn't need any padding, because it won't do anything.
    // Our input/output lists will be NULL in this example,
    // but for real graphs we'll need to connect nodes using these lists.
    hexagon_nn_append_node(
        graph_id,   // Graph handle we're appending into
        0x1000,     // Node identifier (any unique uint32)
        OP_Nop,     // Operation of this node (e.g. Concat, Relu)
        NN_PAD_NA,  // Padding type for this node
        NULL,       // The list of inputs to this node
        0,          // How many elements in input list?
        NULL,       // The list of outputs from this node
        0           // How many elements in output list?
    );
    printf("hexagon_nn_append_node Done\n");
    // Prepare the graph for execution by optimizing it, allocating storage,
    // connecting all the input/output pointers between nodes, and
    // doing some basic checks, like number of input/output tensors and
    // sizing for each node.
    if (hexagon_nn_prepare(graph_id)) {
        printf("Whoops... Cannot prepare\n");
    }
    printf("hexagon_nn_prepare Done\n");
    // Execute an inference on our input data.
    // Real graphs require input and output buffers, but we'll
    // just use zero-size NULL pointers for this NOP example.
    uint32_t out_batches, out_height, out_width, out_depth, out_data_size;
    printf("Trying hexagon_nn_execute\n");
    if ((err = hexagon_nn_execute(
             graph_id,
             0, 0, 0, 0,                      // Our input has 0-dimension
             NULL,                            // Pointer to input data
             0,                               // How many total bytes of input?
             (unsigned int *) &out_batches,
             (unsigned int *) &out_height,
             (unsigned int *) &out_width,
             (unsigned int *) &out_depth,
             (uint8_t *)NULL,                 // Pointer to output buffer
             0,                               // Max size of output buffer
             (unsigned int*) &out_data_size)  // Actual size used for output
        ) != 0) {
        printf("Whoops... run failed: %d\n",err);
    }
TEARDOWN:
    // Free the memory, especially if we want to build subsequent graphs
    hexagon_nn_teardown(graph_id);
    // Stop fastRPC
    fastrpc_teardown();
    if (!err) printf("Test Passed!\n");
    else printf ("Test Failed, err=%d\n", err);
    return err;
}
I have a question about the FullyConnected operator.
I found that NNLib has some draft op code in https://github.com/XiaoMi/nnlib/blob/master/hexagon/ops/src/op_fully_connected.c
It has a FullyConnected_u8 op and a couple of other commented-out ops, such as QuantizedFC_8x8p8to8.
ops.def has only FullyConnected_u8:
https://github.com/XiaoMi/nnlib/blob/master/interface/ops.def#L193
It looks like FullyConnected_u8 was designed for uint8 inputs, but I do not see input min/max tensors, weight min/max, or bias min/max. So it works with uint8 but non-quantized data? That is strange.
What about QuantizedFC_8x8p8to8? It uses two nonexistent functions, fc_layer_execute_opt and fc_layer_execute_ref. Two other functions, fc_layer_execute and fc_layer_execute_hvx, exist instead, but they are not used by any operator.
supported_ops.txt does not have any info about the FullyConnected op:
https://github.com/XiaoMi/nnlib/blob/master/docs/supported_ops.txt
So, what should I do if my model has a FullyConnected layer? How do I convert it to a Hexagon NNLib graph?
I'm running the hexagon_nn example 004-xor-graph.c. It uses three PPrint_f operators.
But I do not see it dump tensor contents to the terminal or to logcat.
Where does PPrint_f output the tensor contents? Has anyone tried to use it?
// ACTUAL GRAPH
// [ 1,-1] [1]
// [-1, 1] [1]
// "layer1a" "layer3a"
// 0x1001a 0x1003a
// \ \.
// [1,0] -----> MatMul --> Relu --> MatMul ----.
// Const "layer1" "layer2" "layer3" \.
// 0x1001b 0x1001 0x1002 0x1003 \.
// \ \ \ \.
// PPrint PPrint PPrint Close
// 0x2000 0x2001 0x2002 0x2003
// /
// /
// [1]
// Const
// 0x2003a
Thanks in advance.
I read intro.txt in the docs, but I am still not sure.
Hi. I am currently trying to run the nnlib example, but I encounter the following error:
test/graphmain.c:436:22: fatal error: pmu_adsp.h: No such file or directory
#include "pmu_adsp.h"
^
compilation terminated.
I have searched for the header file online, but can't seem to find it.
When I use the "make tree VERBOSE=1 V=hexagon_Release_dynamic_toolv83_v66" command to compile, the following error occurs:
make[1]: Entering directory 'E:/Tools/Hexagon_SDK_3.4.3/Hexagon_SDK/3.4.3/libs/hexagon_nn/nnlib-master_20191101/nnlib-master'
interface/proto_to_idl.pl
process_begin: CreateProcess(NULL, perl E:\Tools\Hexagon_SDK_3.4.3\Hexagon_SDK\3.4.3\libs\hexagon_nn\nnlib-master_20191101\nnlib-master\interface\proto_to_idl.pl, ...) failed.
make (e=2): The system cannot find the file specified.
make[1]: *** [interface/hexagon_nn.idl] Error 2
make[1]: Leaving directory 'E:/Tools/Hexagon_SDK_3.4.3/Hexagon_SDK/3.4.3/libs/hexagon_nn/nnlib-master_20191101/nnlib-master'
make: *** [tree] Error 2
Where is the function "hexagon_nn_remove_clocks" implemented? Thanks.
I followed the README and ran all the commands, but I get the "cannot link ..." error when I run adb shell "LD_LIBRARY_PATH=/data/local/tmp/hexagon_nn /data/local/tmp/hexagon_nn/controller_test". What could be causing this, and how can I fix it? Thanks.
Hello, I have compiled hexagon_nnlib after following step
0) Download and install the Hexagon SDK 3.3 (3.2 may also work)
However, I did not get a graph_app executable file under android_Release/ship. I only got a dynamic file.
Hello, I would like to ask
That is quite a few questions, thank you.