
csi-nn2's Introduction


SHL (Structure of Heterogeneous Library, Chinese name: ShiHulan) is a high-performance heterogeneous computing library provided by T-HEAD. Its interface is CSI-NN2, the T-HEAD neural network library API for the XuanTie CPU platform, and it comes with a series of optimized binary libraries.

Features for SHL:

  • Reference implementation in C
  • Assembly-optimized implementation for XuanTie CPUs
  • Supports symmetric and asymmetric quantization
  • Supports 8-bit, 16-bit, and f16 data types
  • Compatible with NCHW and NHWC formats
  • Can be used with HHB to call the API automatically
  • Covers different architectures, such as CPU and NPU
  • Reference implementation of heterogeneous scheduling
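
To make the difference between symmetric and asymmetric quantization concrete, here is a minimal sketch of the usual affine scheme for 8-bit values. It is a generic illustration, not SHL's actual implementation: the function names, the signed int8 range, and the rounding mode are all assumptions made for this example.

```python
def asymmetric_params(x_min, x_max, qmin=-128, qmax=127):
    """Scale and zero point that map [x_min, x_max] onto [qmin, qmax]."""
    # Widen the range to include 0.0 so that zero is exactly representable.
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    scale = (x_max - x_min) / (qmax - qmin)
    if scale == 0.0:
        scale = 1.0  # degenerate all-zero range
    zero_point = round(qmin - x_min / scale)
    return scale, zero_point


def symmetric_params(x_min, x_max, qmax=127):
    """Scale for a range centred on zero; the zero point is always 0."""
    scale = max(abs(x_min), abs(x_max)) / qmax
    if scale == 0.0:
        scale = 1.0
    return scale, 0


def quantize(x, scale, zero_point, qmin=-128, qmax=127):
    """Map a float to an integer code, saturating at the range limits."""
    # Python's round() rounds halves to even; a real kernel may differ.
    q = round(x / scale) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map an integer code back to its approximate float value."""
    return (q - zero_point) * scale
```

For a tensor whose values lie in [-1.0, 3.0], the asymmetric variant uses the full int8 range (zero point -64), while the symmetric variant keeps the zero point at 0 and leaves the codes below -42 unused.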

In principle, SHL provides only the reference implementation for the XuanTie CPU platform; the optimization for each NPU target platform is done by the vendor of that platform.
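
The NCHW and NHWC formats named in the feature list differ only in the order in which a tensor's elements are laid out in memory. The sketch below shows the flat-offset arithmetic for both and a naive conversion between them. The helper names are hypothetical and the code is not taken from SHL; it only illustrates the indexing.

```python
def nchw_offset(n, c, h, w, C, H, W):
    """Flat offset of element (n, c, h, w) when W varies fastest."""
    return ((n * C + c) * H + h) * W + w


def nhwc_offset(n, c, h, w, C, H, W):
    """Flat offset of the same element when C varies fastest."""
    return ((n * H + h) * W + w) * C + c


def nchw_to_nhwc(data, N, C, H, W):
    """Return a new flat list holding `data` reordered from NCHW to NHWC."""
    out = [None] * (N * C * H * W)
    for n in range(N):
        for c in range(C):
            for h in range(H):
                for w in range(W):
                    src = nchw_offset(n, c, h, w, C, H, W)
                    dst = nhwc_offset(n, c, h, w, C, H, W)
                    out[dst] = data[src]
    return out
```

In NCHW each channel is stored as a contiguous plane, whereas in NHWC all channels of one pixel sit next to each other.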

Use SHL

Installation

Official Python packages

SHL release packages are published on PyPI and are installed together with hhb.

pip3 install hhb

The binary libraries are installed at /usr/local/lib/python3.6/dist-packages/tvm/install_nn2/
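
The path above depends on the Python version and on where pip installs packages. As a rough sketch, the directory can be looked up from Python instead of being hard-coded. This assumes, based only on the path given here, that install_nn2 sits inside the installed tvm package directory; `find_install_dir` is a hypothetical helper, not part of hhb.

```python
import importlib.util
from pathlib import Path


def find_install_dir(package="tvm", subdir="install_nn2"):
    """Return the path of `subdir` inside an installed package, or None."""
    # find_spec locates a top-level package without importing it.
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    for location in spec.submodule_search_locations:
        candidate = Path(location) / subdir
        if candidate.is_dir():
            return candidate
    return None


if __name__ == "__main__":
    print(find_install_dir() or "install_nn2 not found")
```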

Build SHL from Source

The following example builds the C906 library.

First install T-HEAD RISC-V GCC 2.6, which is available from T-HEAD OCC: download it, decompress it, and add it to the PATH environment variable.

wget https://occ-oss-prod.oss-cn-hangzhou.aliyuncs.com/resource//1663142514282/Xuantie-900-gcc-linux-5.10.4-glibc-x86_64-V2.6.1-20220906.tar.gz
tar xf Xuantie-900-gcc-linux-5.10.4-glibc-x86_64-V2.6.1-20220906.tar.gz
export PATH=${PWD}/Xuantie-900-gcc-linux-5.10.4-glibc-x86_64-V2.6.1/bin:$PATH

Download the source code

git clone -b 2.4 https://github.com/T-head-Semi/csi-nn2.git
cd csi-nn2
git submodule update --init --recursive

Compile for C906

make nn2_c906

Install for C906

make install_nn2

Quick Start Example

The following example runs mobilenetv1 on a XuanTie C906. It shows how to call the SHL API to run inference on the whole model.

Compile command:

cd example
make c906_m1_f16

c906_mobilenetv1_f16.elf is generated when the build completes. Copy it to a development board with a C906 CPU (such as the D1), then execute:

./c906_mobilenetv1_f16.elf

NOTE: In the original mobilenetv1, every conv2d is followed by a BN (batch norm) layer, but the example assumes that BN has already been fused into conv2d. For how to use deployment tools to fuse BN and emit the correct float16 weight values, refer to HHB.
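
For readers unfamiliar with BN fusion, the following sketch shows the standard arithmetic for folding a batch norm layer into the preceding convolution, one output channel at a time, and for rounding a value to float16. It is not how HHB does it internally; the function names, the flat per-channel weight layout, and the default epsilon are assumptions chosen for the illustration.

```python
import math
import struct


def fuse_bn(weights, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold BN parameters into conv weights and bias.

    weights[o] is the flat list of weights for output channel o; the other
    arguments hold one value per output channel.
    """
    fused_w, fused_b = [], []
    for o, w_o in enumerate(weights):
        # BN computes gamma * (y - mean) / sqrt(var + eps) + beta on the
        # conv output y, so the whole channel is scaled by k.
        k = gamma[o] / math.sqrt(var[o] + eps)
        fused_w.append([w * k for w in w_o])
        fused_b.append((bias[o] - mean[o]) * k + beta[o])
    return fused_w, fused_b


def to_float16_bits(x):
    """Round x to IEEE 754 binary16 and return the 16-bit pattern."""
    # struct's "e" format raises OverflowError for out-of-range values.
    return struct.unpack("<H", struct.pack("<e", x))[0]
```

With a single weight 2.0, bias 1.0, gamma 3.0, beta 0.5, mean 1.0, variance 4.0, and eps 0, the fused weight is 3.0 and the fused bias is 0.5, so an input of 2.0 yields 6.5 both with and without fusion.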

Resources

Acknowledgement

SHL refers to the following projects:

Contributors

zhangwm-pt, alter-xp
