
pelee-tensorrt's Introduction

Pelee-TensorRT using Caffe Parser

Accelerate Pelee with TensorRT.
Pelee: A Real-Time Object Detection System on Mobile Devices (NeurIPS 2018)

TensorRT-Pelee can run at over 70 FPS (11 ms) on a Jetson TX2 (FP32).


Performance (FP32)

  1. NVIDIA Jetson TX2: 72 FPS (13.2~11 ms)
  2. Titan V: 200 FPS (5 ms)
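
Frame rate and per-frame time in the list above are related by latency (ms) = 1000 / FPS. A minimal sketch of that conversion follows; the helper names fpsToMs and msToFps are illustrative and not part of this repository.

```cpp
// Convert between throughput (frames per second) and per-frame latency (ms).
// Helper names are illustrative only; they do not come from this repository.
constexpr double fpsToMs(double fps) { return 1000.0 / fps; }
constexpr double msToFps(double ms)  { return 1000.0 / ms; }
```

For example, fpsToMs(200.0) gives 5 ms, which matches the Titan V entry, while fpsToMs(72.0) gives roughly 13.9 ms and msToFps(11.0) gives roughly 90.9 FPS.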

Requirements:

  1. TensorRT 4.x (Jetpack 3.3)
  2. CUDA 9.0
  3. cudnn 7.

Run:

cmake .
make
./build/bin/pelee

TODO:

  • FP16 Implementation
  • Change custom layers from IPlugin to IPluginExt

The bug has been fixed


pelee-tensorrt's People

Contributors

cathy-kim, seojink

pelee-tensorrt's Issues

Crash problem

Thanks for your great work.
I have a problem when running the demo on a Jetson Nano. The error message is shown below. Is this normal?

OpenCV Error: Assertion failed (ssize.width > 0 && ssize.height > 0) in resize, file /home/nvidia/build_opencv/opencv/modules/imgproc/src/resize.cpp, line 3289
terminate called after throwing an instance of 'cv::Exception'
what(): /home/nvidia/build_opencv/opencv/modules/imgproc/src/resize.cpp:3289: error: (-215) ssize.width > 0 && ssize.height > 0 in function resize

Aborted (core dumped)
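
This OpenCV assertion is raised when resize receives an empty image, which typically means the input image or video frame could not be loaded. A minimal sketch of a pre-flight check follows, assuming the demo reads its input from a file path; inputIsReadable is a hypothetical helper and not part of this repository.

```cpp
#include <fstream>
#include <string>

// Returns true if the file at `path` can be opened and holds at least one byte.
// Hypothetical helper: call it on the image or video path before passing the
// path to OpenCV, and report a clear error instead of crashing inside resize.
bool inputIsReadable(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in.is_open()) return false;
    return in.peek() != std::ifstream::traits_type::eof();
}
```

Inside the OpenCV code itself, checking cv::Mat::empty() on the loaded frame before calling cv::resize catches the same condition.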

Performance

The frame time on my Jetson TX2 is about 30 ms, so I cannot reach 72 FPS.

fatal error: glog/logging.h: No such file or directory

[ 5%] Building NVCC (Device) object CMakeFiles/inferLib.dir/util/cuda/inferLib_generated_cudaRGB.cu.o
[ 11%] Building NVCC (Device) object CMakeFiles/inferLib.dir/inferLib_generated_mathFunctions.cu.o
In file included from /home/nvidia/TRT-Pelee/mathFunctions.cu:1:0:
/home/nvidia/TRT-Pelee/mathFunctions.h:17:26: fatal error: glog/logging.h: No such file or directory
compilation terminated.
CMake Error at inferLib_generated_mathFunctions.cu.o.cmake:207 (message):
Error generating
/home/nvidia/TRT-Pelee/CMakeFiles/inferLib.dir//./inferLib_generated_mathFunctions.cu.o

CMakeFiles/inferLib.dir/build.make:63: recipe for target 'CMakeFiles/inferLib.dir/inferLib_generated_mathFunctions.cu.o' failed
make[2]: *** [CMakeFiles/inferLib.dir/inferLib_generated_mathFunctions.cu.o] Error 1
CMakeFiles/Makefile2:67: recipe for target 'CMakeFiles/inferLib.dir/all' failed
make[1]: *** [CMakeFiles/inferLib.dir/all] Error 2
Makefile:127: recipe for target 'all' failed
make: *** [all] Error 2

Do you parse the Caffe model directly?

I see that you use this code:

ICaffeParser* parser = createCaffeParser();
parser->setPluginFactory(&pluginFactory);

Why not use the Caffe parser directly to convert the Caffe model to a TensorRT model?

I would appreciate a quick reply. Thanks.

I cannot get 70+ FPS on the TX2 board

Hi, thank you for sharing your project. It is useful to me.
However, I can only get 40 FPS on my TX2 board. Here is my configuration:
1. I use your model.
2. I enabled high-performance mode with sudo ~/jetson_clocks.sh

I do not know why. Could you please help me?
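
When comparing against a published FPS figure, it helps to know exactly what is being timed: the inference call alone, or the whole loop including image decoding, preprocessing, and display. A minimal timing sketch follows, assuming the step of interest can be wrapped in a callable; meanMillis is a hypothetical helper and not part of this repository.

```cpp
#include <chrono>

// Runs `fn` `warmup` times without timing, then `iters` times with timing,
// and returns the mean wall-clock time per timed call in milliseconds.
// `iters` must be greater than zero.
template <typename Fn>
double meanMillis(Fn&& fn, int warmup, int iters) {
    for (int i = 0; i < warmup; ++i) fn();
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iters; ++i) fn();
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / iters;
}
```

GPU work is launched asynchronously, so the callable should wait for the device to finish (for example by synchronizing the CUDA stream) before returning; otherwise the measured time only covers the launch.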

About the BatchNorm and Scale layers

I have a question about the layer types.
The original Pelee model contains BatchNorm and Scale layers, but deploy_iplugin.prototxt in this TensorRT project does not seem to have those layers.
I want to train on my own dataset using pre-trained weights from the COCO-pretrained Pelee model, but the model that can be downloaded from https://github.com/Robert-JunWang/Pelee has BatchNorm and Scale layers, which differs from the TensorRT model.
If you trained a model without the BatchNorm and Scale layers, could you please share the COCO-pretrained caffemodel (COCO only, not VOC)?
I'm looking forward to your answer.
Thank you so much :)

Assertion `C2 == inputDims[param.inputOrder[1]].d[0]' failed.

Since my TensorRT version is 6.0, I removed pelee_merged.caffemodel.cache and ran ./build/bin/pelee, but it did not succeed.
The log shows:
pelee: nvPluginsLegacy.cpp:1026: virtual void nvinfer1::plugin::DetectionOutputLegacy::configure(const nvinfer1::Dims*, int, const nvinfer1::Dims*, int, int): Assertion `C2 == inputDims[param.inputOrder[1]].d[0]' failed.
Aborted (core dumped)

Please give me some hints on what to modify. Sincere thanks!

The Project Error

Hi @ginn24
I have some questions.
First, I trained my own model using https://github.com/eric612/MobileNet-YOLO/tree/master/models/pelee.
Second, I want to run my model with your project, so I replaced your model definition and weights with https://github.com/eric612/MobileNet-YOLO/blob/master/models/pelee/deploy.prototxt and my own model.
However, creating the .tensorcache file failed with this error:
//------------------------------------------------------------
detection_out
Plugin layer output count is not equal to caffe output count
Pelee_Object_Road: ../tensorNet.cpp:118: bool TensorNet::caffeToTRTModel(const char*, const char*, const std::vector<std::__cxx11::basic_string >&, unsigned int, std::ostream&): Assertion `blobNameToTensor != nullptr' failed.
//------------------------------------------------------------------------------------
Third, I used your .prototxt file to create the .tensorcache file and ran into these problems:
//-----------------------------------------------------
detection_out
Weights for layer seg_conv1 doesn't exist
CaffeParser: ERROR: Attempting to access NULL weights
Weights for layer seg_conv2 doesn't exist
CaffeParser: ERROR: Attempting to access NULL weights
//----------------------------------------------------
Even though the .tensorcache file was created, the program does not detect objects or the road.

I hope you can give me some advice.
I would also like to know how you trained your model.
Thank you very much.

Segmentation fault problem

Hi, thank you for your good work! When I run the binary ./build/bin/pelee, I get a segmentation fault. While debugging, I traced it to the tensorNet.imageInference() function. I am running the code on a cloud server. Could you give me some help?

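How can I make the tensorcache file?

I'm Korean...
I'm not very familiar with TensorRT. It seems that to use a model I have newly trained,
I need both the caffemodel and the caffemodel.tensorcache file.
How do I create this tensorcache? Is there a site I can refer to?
Oh, and one more thing: is the model pretrained on COCO, or trained on VOC only?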
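
Another report in this list describes deleting the cache file and rerunning the binary, which suggests the program regenerates the cache when it is missing; that is an inference, not something confirmed here. In TensorRT, such a cache is commonly the byte blob returned by ICudaEngine::serialize(), later restored with IRuntime::deserializeCudaEngine(). A minimal, TensorRT-free sketch of the file handling around those calls follows; saveBlob and loadBlob are hypothetical helpers, and how this repository actually implements its cache is an assumption.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Writes a serialized engine blob to disk. Returns false on any I/O failure.
// In a TensorRT program the bytes would come from ICudaEngine::serialize().
bool saveBlob(const std::string& path, const void* data, std::size_t size) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    out.write(static_cast<const char*>(data), static_cast<std::streamsize>(size));
    return out.good();
}

// Reads the whole file back. An empty vector means "no usable cache",
// in which case the engine has to be rebuilt from the prototxt and caffemodel.
std::vector<char> loadBlob(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return {};
    return std::vector<char>(std::istreambuf_iterator<char>(in),
                             std::istreambuf_iterator<char>());
}
```

A serialized engine is tied to the TensorRT version and GPU it was built on, so a cache should be regenerated after changing either.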

I tried to run your code on a PC with a 1080 Ti, but it core dumped

I then ran it in gdb and took a backtrace. Here is the output:

Thread 1 "pelee" received signal SIGSEGV, Segmentation fault.
0x00007fffecedc970 in nvinfer1::cudnn::ExecutionContext::setDeviceMemoryInternal(void*) () from /home/senjary/downloads/TensorRT-4.0.1.6/lib/libnvinfer.so.4
(gdb) bt
#0 0x00007fffecedc970 in nvinfer1::cudnn::ExecutionContext::setDeviceMemoryInternal(void*) ()
from /home/senjary/downloads/TensorRT-4.0.1.6/lib/libnvinfer.so.4
#1 0x00007fffecee1b5e in nvinfer1::cudnn::ExecutionContext::ExecutionContext(nvinfer1::cudnn::Engine const&, bool) ()
from /home/senjary/downloads/TensorRT-4.0.1.6/lib/libnvinfer.so.4
#2 0x00007fffecee1e16 in nvinfer1::cudnn::Engine::createExecutionContext() ()
from /home/senjary/downloads/TensorRT-4.0.1.6/lib/libnvinfer.so.4
#3 0x00007ffff7af21e0 in TensorNet::imageInference(void**, int, int) ()
from /home/senjary/caffe/Pelee-TensorRT-master/build/lib/libinferLib.so
#4 0x000000000041ed01 in main ()

I also ran the code on a TX2 with JetPack 3.3, and it succeeded.
How can I solve the problem on the PC?

What should I do if I want to test with BATCH_SIZE=n?

First of all, thanks for sharing this awesome project. It has helped me a lot.
In main.cpp I saw that there is a BATCH_SIZE parameter, and it is set to 1. Can I run inference on 2 or 3 images at the same time? If that is possible, which parts should I change to make it work?

Thanks
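
In general, batched inference with TensorRT needs an engine built with a maximum batch size of at least n, input and output buffers sized for n images, and the n preprocessed images laid out back to back in the input buffer. A minimal sketch of the buffer side follows, assuming CHW float input; imageVolume and packBatch are hypothetical helpers, and the dimensions passed to them are placeholders rather than values taken from this repository.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Number of floats in one CHW image.
constexpr std::size_t imageVolume(std::size_t c, std::size_t h, std::size_t w) {
    return c * h * w;
}

// Packs `images` (each already in CHW float layout) into one contiguous
// NCHW buffer. Throws if any image does not have exactly c * h * w elements.
std::vector<float> packBatch(const std::vector<std::vector<float>>& images,
                             std::size_t c, std::size_t h, std::size_t w) {
    const std::size_t volume = imageVolume(c, h, w);
    std::vector<float> batch;
    batch.reserve(images.size() * volume);
    for (const auto& img : images) {
        if (img.size() != volume) throw std::invalid_argument("image size mismatch");
        batch.insert(batch.end(), img.begin(), img.end());
    }
    return batch;
}
```

With TensorRT's implicit-batch API, the batch size passed to IExecutionContext::execute() must not exceed the value given to IBuilder::setMaxBatchSize() when the engine was built, so a cached engine built for batch size 1 would have to be regenerated. The output parsing also has to step through the results one image at a time.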

How can I train my own model?

@ginn24
Hi, I want to train my own model, covering both object detection and lane lines, but I do not know how to train it. I have collected 2000 pictures. Can you give me some advice? Thank you.
