
Comments (22)

lucasjinreal commented on April 28, 2024

@makaveli10 I already converted the model to ONNX and ran inference on it with TensorRT.

[screenshot]

However, this involved some special operations different from what this repo does, and correspondingly some special handling on the TensorRT side. Overall, TensorRT-accelerated inference takes about 38 ms at a 1280x768 input resolution; the performance is quite good:
[screenshot: detection results]

You can add me on WeChat (jintianiloveu) if you are interested in this acceleration technique.


glenn-jocher commented on April 28, 2024

Output should look like this. You might also want to git pull; we've made recent changes to ONNX export.

...
  %416 = Shape(%285)
  %417 = Constant[value = <Scalar Tensor []>]()
  %418 = Gather[axis = 0](%416, %417)
  %419 = Shape(%285)
  %420 = Constant[value = <Scalar Tensor []>]()
  %421 = Gather[axis = 0](%419, %420)
  %422 = Shape(%285)
  %423 = Constant[value = <Scalar Tensor []>]()
  %424 = Gather[axis = 0](%422, %423)
  %427 = Unsqueeze[axes = [0]](%418)
  %430 = Unsqueeze[axes = [0]](%421)
  %431 = Unsqueeze[axes = [0]](%424)
  %432 = Concat[axis = 0](%427, %439, %440, %430, %431)
  %433 = Reshape(%285, %432)
  %434 = Transpose[perm = [0, 1, 3, 4, 2]](%433)
  return %output, %415, %434
}
Export complete. ONNX model saved to ./weights/yolov5s.onnx
View with https://github.com/lutzroeder/netron


glenn-jocher commented on April 28, 2024

@jinfagang you should be able to export to ONNX like this. Run this command from the /yolov5 directory.

export PYTHONPATH="$PWD" 
python models/onnx_export.py --weights ./weights/yolov5s.pt --img 640 640 --batch 1


lucasjinreal commented on April 28, 2024

@glenn-jocher Turns out my model was trained on GPU and serialized with the cuda device, so an input that is not on cuda throws this error.

However, when I force the input to cuda, I get the opposite error; it seems some code inside the model is still using CPU tensors.

Is there any special reason for using CPU tensors there?


lucasjinreal commented on April 28, 2024

[screenshot: netron graph]

The generated ONNX model seems to have augmentation enabled by default.

How do I obtain boxes, scores, and classes from these outputs?

[screenshot]


glenn-jocher commented on April 28, 2024

@jinfagang ONNX export should only be done when the model is on CPU.
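
For reference, here's a minimal sketch of forcing everything onto CPU before export (this assumes the checkpoint stores the full model under the 'model' key, as yolov5 checkpoints do; the opset choice is illustrative):

  import torch

  # load the checkpoint onto CPU regardless of the device it was saved from
  ckpt = torch.load("./weights/yolov5s.pt", map_location="cpu")
  model = ckpt["model"].float().eval()

  # dummy input matching the export command above (batch 1, 3x640x640)
  dummy = torch.zeros(1, 3, 640, 640)
  torch.onnx.export(model, dummy, "./weights/yolov5s.onnx", opset_version=11)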

The netron image you show is correct. The boxes are part of the v5 architecture, they are not related to image augmentation during training.

At the moment ONNX export stops at the output features. This is an example P3 output (smallest boxes) for 3 anchors with a 40x24 grid. The 85 features are xywh, objectness, and 80 class confidences.

[screenshot: example P3 output]
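
As a rough sketch of how those raw features could be decoded outside the graph, e.g. on the TensorRT side (this assumes the standard v5 decode with sigmoid activations, grid-cell offsets, and per-head anchors; the function name and threshold are illustrative, and for the P3 head the stride is 8):

  import numpy as np

  def decode_head(raw, anchors, stride, conf_thres=0.4):
      # raw: (1, 3, ny, nx, 85) output of one head; anchors: (3, 2) in pixels
      _, na, ny, nx, _ = raw.shape
      p = 1.0 / (1.0 + np.exp(-raw))  # sigmoid over all 85 features

      # grid of cell offsets, broadcastable against (1, 3, ny, nx, 2)
      yv, xv = np.meshgrid(np.arange(ny), np.arange(nx), indexing="ij")
      grid = np.stack((xv, yv), axis=-1).reshape(1, 1, ny, nx, 2)

      xy = (p[..., 0:2] * 2.0 - 0.5 + grid) * stride                   # centers, pixels
      wh = (p[..., 2:4] * 2.0) ** 2 * anchors.reshape(1, na, 1, 1, 2)  # sizes, pixels
      scores = p[..., 4:5] * p[..., 5:]                                # obj * class conf
      keep = scores.max(axis=-1) > conf_thres
      return np.concatenate((xy, wh), axis=-1)[keep], scores[keep]

Boxes come out as xywh in input-image pixels; NMS still has to run on top of this.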


glenn-jocher commented on April 28, 2024

@jinfagang I ran into a CUDA issue with an ONNX export today and pushed a fix, 1e2cb6b, for this. It may or may not solve your original issue.


lucasjinreal commented on April 28, 2024

@glenn-jocher So the output is the same as yolov3 in your previous repo? I want to access the outputs and accelerate it with TensorRT.


lucasjinreal commented on April 28, 2024

@glenn-jocher Can the anchor decode process also be exported into ONNX, so that it is more end-to-end when transferring to other platforms for inference?


glenn-jocher commented on April 28, 2024

@jinfagang yes, this would be more useful. It is more complicated to implement, though, especially if you want a clean ONNX graph. We will try to add this in the future.


lucasjinreal commented on April 28, 2024

@glenn-jocher I ran a small experiment on this; it ends up involving a ScatterND op, which is hard to convert to other platforms. If we want to eliminate this op, the postprocess script (the Detect layer here) needs to be rewritten, only for export mode and in a more complicated way, but it then exports and works perfectly. A minimal sketch of the idea is below.
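
For illustration, a sketch of such an export-mode decode without in-place writes (the in-place slice assignments in the stock Detect layer are what trace to ScatterND; building the output with torch.cat avoids them, and the function name here is illustrative):

  import torch

  def export_friendly_decode(x, grid, anchor_grid, stride):
      # x: (bs, na, ny, nx, 85) raw head output; grid and anchor_grid precomputed
      y = x.sigmoid()
      xy = (y[..., 0:2] * 2.0 - 0.5 + grid) * stride  # no in-place write
      wh = (y[..., 2:4] * 2.0) ** 2 * anchor_grid
      # same values as the stock decode, but the graph contains only
      # Sigmoid/Mul/Add/Pow/Concat nodes instead of ScatterND
      return torch.cat((xy, wh, y[..., 4:]), dim=-1)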


makaveli10 commented on April 28, 2024

@jinfagang I also ran into this issue. I resolved it by converting the model to cuda and then saving the weights. I used those weights to convert to an ONNX model, but I ran into some issues converting the ONNX model to TensorRT.

If you successfully converted the model to TensorRT, please let me know how you did it.
Thanks


glenn-jocher commented on April 28, 2024

@jinfagang great work! What is the speedup compared to using detect.py? What GPU are you using?


lucasjinreal commented on April 28, 2024

@glenn-jocher I am using a GTX 1080 Ti; speed was tested on this. The measured speed includes postprocessing time (from engine forward to NMS and copying data back to CPU, etc.). I think the speed is almost the same as the darknet yolov4 converted to TensorRT (I previously tested with 800x800 input).

The speed can still be improved by moving all postprocessing into a CUDA kernel and by fp16 or int8 quantization; a sketch of the fp16 build switch follows.
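
For example, a minimal sketch of requesting fp16 when building the engine (this uses the TensorRT 7+ builder-config API; on TensorRT 6 the equivalent switch is builder.fp16_mode = True, and the file name here is illustrative):

  import tensorrt as trt

  logger = trt.Logger(trt.Logger.WARNING)
  builder = trt.Builder(logger)
  # explicit-batch network, as required for ONNX parsing
  network = builder.create_network(1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
  parser = trt.OnnxParser(network, logger)

  with open("yolov5s.onnx", "rb") as f:
      if not parser.parse(f.read()):
          raise RuntimeError(parser.get_error(0))

  config = builder.create_builder_config()
  config.set_flag(trt.BuilderFlag.FP16)  # run layers in half precision where supported
  engine = builder.build_engine(network, config)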


kingardor commented on April 28, 2024

@jinfagang Amazing to see you got it running in such a short time. I'm able to convert the pth files to ONNX format, but I keep getting this error when I try to convert to TensorRT 6:

(Unnamed Layer* 0) [Slice]: slice is out of input range
While parsing node number 9 [Slice]:

If you have some pointers for me, I would really appreciate it. Connecting on WeChat is difficult for me because I don't have an account and don't have a friend who can validate a new account.
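
From what I can tell, the error seems to come from the Focus layer's strided slicing (x[..., ::2, ::2] and friends), which exports as strided ONNX Slice nodes that TensorRT 6's parser cannot handle. A sketch of an export-only equivalent built from reshape/permute instead (the class name is illustrative; it assumes even input height and width):

  import torch
  import torch.nn as nn

  class FocusSlicing(nn.Module):
      # Rearranges (b, c, h, w) -> (b, 4c, h/2, w/2) in the same channel
      # order as the four strided slices concatenated in yolov5's Focus
      # layer, but with view/permute only, so no Slice nodes are exported.
      def forward(self, x):
          b, c, h, w = x.shape
          x = x.view(b, c, h // 2, 2, w // 2, 2)        # split row/col phases
          x = x.permute(0, 5, 3, 1, 2, 4).contiguous()  # (b, pw, ph, c, h/2, w/2)
          return x.view(b, 4 * c, h // 2, w // 2)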


makaveli10 commented on April 28, 2024

@jinfagang I don't have a WeChat account, nor a friend who can verify a new one. Could you please share your code for ONNX inference on TensorRT somehow? I am getting incorrect outputs from the engine I generated from the ONNX model.


kingardor commented on April 28, 2024

@makaveli10 mind sharing how you were able to generate an ONNX model that worked with TensorRT? Also, which version of TRT did you use?


makaveli10 commented on April 28, 2024

@aj-ames https://github.com/TrojanXu/yolov5-tensorrt
Let me know if you make any sort of progress please!


kingardor commented on April 28, 2024

@makaveli10 thanks. I will update my findings here.


yushanshan05 commented on April 28, 2024

> @aj-ames https://github.com/TrojanXu/yolov5-tensorrt
> Let me know if you make any sort of progress please!

I use this project, but I encounter the same error when I try to convert to TensorRT 6:
[TensorRT] ERROR: (Unnamed Layer* 0) [Slice]: slice is out of input range
ERROR: Failed to parse the ONNX file.

If you have some pointers for me, I would really appreciate it.


yushanshan05 commented on April 28, 2024

> @glenn-jocher I am using a GTX 1080 Ti; speed was tested on this. The measured speed includes postprocessing time (from engine forward to NMS and copying data back to CPU, etc.). I think the speed is almost the same as the darknet yolov4 converted to TensorRT (I previously tested with 800x800 input).
>
> The speed can still be improved by moving all postprocessing into a CUDA kernel and by fp16 or int8 quantization.

When I convert the ONNX model to TensorRT 6, I encounter the same error:
[TensorRT] ERROR: (Unnamed Layer* 0) [Slice]: slice is out of input range
ERROR: Failed to parse the ONNX file.

I use TensorRT 6.0 with ONNX 1.5.0 or 1.6.0; neither works.
If you have some pointers for me, I would really appreciate it.


github-actions commented on April 28, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

