
trt-lightnet's Introduction

TensorRT-LightNet: High-Efficiency and Real-Time CNN Implementation on Edge AI

trt-lightnet is a CNN implementation optimized for edge AI devices that combines the advantages of LightNet [1] and TensorRT [2]. LightNet is a lightweight, high-performance neural network framework designed for edge devices, while TensorRT is a high-performance deep learning inference engine developed by NVIDIA for optimizing and running deep learning models on GPUs. trt-lightnet uses the Network Definition API provided by TensorRT to integrate LightNet into TensorRT, allowing it to run efficiently and in real time on edge devices. This is a reproduction of lightNet-TR [6], which generates a TensorRT engine from the ONNX format.

Key Improvements

2:4 Structured Sparsity

trt-lightnet utilizes 2:4 structured sparsity [3] to further optimize the network. 2:4 structured sparsity means that two values must be zero in each contiguous block of four values, resulting in a 50% reduction in the number of weights. This technique allows the network to use fewer weights and computations while maintaining accuracy.
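The 2:4 constraint can be illustrated with a small NumPy sketch (this demonstrates the pruning pattern only; it is not the actual TensorRT sparsity tooling, and `prune_2_4` is a hypothetical helper that assumes the weight count is a multiple of four):

```python
import numpy as np

def prune_2_4(weights: np.ndarray) -> np.ndarray:
    """Zero the two smallest-magnitude values in each contiguous
    block of four, yielding a 2:4 structured-sparse tensor."""
    flat = weights.reshape(-1, 4).copy()
    # indices of the two smallest |w| in each block of four
    drop = np.argsort(np.abs(flat), axis=1)[:, :2]
    np.put_along_axis(flat, drop, 0.0, axis=1)
    return flat.reshape(weights.shape)

w = np.array([[0.9, -0.1, 0.05, -0.7],
              [0.2,  0.8, -0.3,  0.01]])
pruned = prune_2_4(w)
# each block of four keeps only its two largest-magnitude values,
# e.g. the first row becomes [0.9, 0.0, 0.0, -0.7]
```

In practice, the model is fine-tuned after pruning so the remaining weights compensate for the removed ones.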

[Figure: Sparsity]

NVDLA Execution

trt-lightnet also supports execution of the neural network on the NVIDIA Deep Learning Accelerator (NVDLA) [4], a free and open architecture that provides high performance and low power consumption for deep learning inference on edge devices. By using NVDLA, trt-lightnet can further improve the efficiency and performance of the network on edge devices.

[Figure: NVDLA]

Multi-Precision Quantization

In addition to post-training quantization [5], trt-lightnet also supports multi-precision quantization, which allows the network to use different precisions for weights and activations. By using mixed precision, trt-lightnet can further reduce the memory usage and computational requirements of the network while maintaining accuracy. The precision of each layer of the CNN can be set in the CFG file.
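As a rough illustration of what post-training INT8 quantization does, here is a minimal NumPy sketch of symmetric per-tensor quantization (this is a conceptual model, not trt-lightnet's actual calibrator):

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor INT8 quantization: map the observed
    dynamic range [-amax, amax] onto the integers [-127, 127]."""
    amax = np.abs(x).max()
    scale = amax / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

acts = np.random.randn(1000).astype(np.float32)
q, scale = quantize_int8(acts)
recovered = dequantize(q, scale)
err = np.abs(acts - recovered).max()  # bounded by scale / 2
```

Mixed precision means some layers skip this step entirely and stay in FP16 or FP32, trading a little memory for accuracy on quantization-sensitive layers.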

[Figure: Quantization]

Multitask Execution (Detection/Segmentation)

trt-lightnet also supports multitask execution, allowing the network to perform both object detection and segmentation tasks simultaneously. This enables the network to perform multiple tasks efficiently on edge devices, saving computational resources and power.
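The idea can be sketched in Python with NumPy stand-ins for the real CNN (all function names and shapes here are hypothetical): the shared backbone runs once per frame, and both task heads consume the same feature map, which is where the compute saving comes from.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.standard_normal((3, 64, 64)).astype(np.float32)

def backbone(x):
    # stand-in for the shared CNN trunk (runs once per frame)
    return np.maximum(x.mean(axis=0, keepdims=True), 0.0)  # (1, 64, 64)

def detection_head(feat):
    # stand-in detection head: one (x, y, w, h, score) box
    return np.array([32.0, 32.0, 8.0, 8.0, float(feat.max())])

def segmentation_head(feat):
    # stand-in segmentation head: a per-pixel binary mask
    return (feat[0] > feat.mean()).astype(np.uint8)  # (64, 64)

feat = backbone(image)            # shared features, computed once
boxes = detection_head(feat)      # task 1: detection
mask = segmentation_head(feat)    # task 2: segmentation
```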

Installation

Requirements

For Local Installation

  • CUDA 11.0 or later

  • TensorRT 8.5 or 8.6

  • cnpy for debugging tensors

This repository has been tested with the following environments:

  • CUDA 11.7 + TensorRT 8.5.2 on Ubuntu 22.04

  • CUDA 12.2 + TensorRT 8.6.0 on Ubuntu 22.04

  • CUDA 11.4 + TensorRT 8.6.0 on Jetson JetPack 5.1

  • CUDA 11.8 + TensorRT 8.6.1 on Ubuntu 22.04

For Docker Installation

  • Docker
  • NVIDIA Container Toolkit

This repository has been tested with the following environments:

  • Docker 24.0.7 + NVIDIA Container Toolkit 1.14.3 on Ubuntu 20.04

Steps for Local Installation

  1. Clone the repository.
$ git clone git@github.com:tier4/trt-lightnet.git
$ cd trt-lightnet
  2. Install libraries.
$ sudo apt update
$ sudo apt install libgflags-dev
$ sudo apt install libboost-all-dev
$ sudo apt install libopencv-dev

Install cnpy from the following repository:

https://github.com/rogersce/cnpy

  3. Compile the TensorRT implementation.
$ mkdir build && cd build
$ cmake ../
$ make -j

Steps for Docker Installation

  1. Clone the repository.
$ git clone git@github.com:tier4/trt-lightnet.git
$ cd trt-lightnet
  2. Build the Docker image.
$ docker build -t trt-lightnet:latest .
  3. Run the Docker container.
$ docker run -it --gpus all trt-lightnet:latest

Model

T.B.D

Usage

Converting a LightNet model to a TensorRT engine

Build FP32 engine

$ ./trt-lightnet --flagfile ../configs/CONFIGS.txt --precision fp32

Build FP16(HALF) engine

$ ./trt-lightnet --flagfile ../configs/CONFIGS.txt --precision fp16

Build INT8 engine
(Prepare a list of calibration images in "models/calibration_images.txt".)

$ ./trt-lightnet --flagfile ../configs/CONFIGS.txt --precision int8 --first true

The first layer is much more sensitive to quantization. Therefore, it is left unquantized when "--first true" is specified.
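As an illustration, the calibration list could be generated with a short Python script. The directory name `calib_images` and the one-path-per-line file format are assumptions here; check what the repository's calibrator actually expects.

```python
from pathlib import Path

# hypothetical layout: representative JPEGs collected under calib_images/
image_dir = Path("calib_images")
image_dir.mkdir(exist_ok=True)

# assumption: the INT8 calibrator reads one absolute image path per line
paths = sorted(str(p.resolve()) for p in image_dir.glob("*.jpg"))
Path("models").mkdir(exist_ok=True)
out = Path("models/calibration_images.txt")
out.write_text("\n".join(paths) + "\n")
```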

Build DLA engine (supported only on Xavier and Orin)

$ ./trt-lightnet --flagfile ../configs/CONFIGS.txt --precision int8 --first true --dla [0/1]

Inference with the TensorRT engine

Inference from images

$ ./trt-lightnet --flagfile ../configs/CONFIGS.txt --precision [fp32/fp16/int8] --first true {--dla [0/1]} --d DIRECTORY

Inference from a video

$ ./trt-lightnet --flagfile ../configs/CONFIGS.txt --precision [fp32/fp16/int8] --first true {--dla [0/1]} --v VIDEO

Implementation

trt-lightnet is built on the LightNet framework and integrates with TensorRT using the Network Definition API. The implementation is based on the following repositories:

Conclusion

trt-lightnet is a powerful and efficient implementation of CNNs using Edge AI. With its advanced features and integration with TensorRT, it is an excellent choice for real-time object detection and semantic segmentation applications on edge devices.

References

[1]. LightNet
[2]. TensorRT
[3]. Accelerating Inference with Sparsity Using the NVIDIA Ampere Architecture and NVIDIA TensorRT
[4]. NVDLA
[5]. Achieving FP32 Accuracy for INT8 Inference Using Quantization Aware Training with NVIDIA TensorRT
[6]. lightNet-TR


trt-lightnet's Issues

[feature request] save detection-overlaid image with `--dont_show` option

What?

  • When running with the --dont_show option, bounding boxes should still be overlaid on the images saved under outputs/detections.
     # example command
     $ ./trt-lightnet --flagfile ../configs/lightNetV2-T4XP2-960x960-autolabel.txt --precision fp16 --v ../samples/sakae_night_00-00-00_00-00-10.mp4 --save_detections true --dont_show true --save_detections_path outputs/sakae_night_00-00-00_00-00-10

Why?

  • With the --dont_show option, bounding boxes are not overlaid on the images under outputs/detections.
  • Without the option there is no problem, but in an SSH CLI environment the X-window forwarding takes longer than the detection itself and becomes a bottleneck.
