
jkjung-avt commented on June 26, 2024

The score_threshold in "batch_non_max_suppression" only affects the inference side of the model. You could also refer to this for what this value means. Changing this value makes no difference to the training of the model.
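To illustrate why the threshold only matters at inference time, here is a minimal sketch of score filtering as a pure post-processing step. The `Detection` tuple and `filter_by_score` helper are hypothetical names made up for this example, not the Object Detection API's own code, and the real op does more (per-class NMS, IoU checks).

```python
from typing import List, Tuple

# Hypothetical detection record: (class_id, score, (ymin, xmin, ymax, xmax)).
Detection = Tuple[int, float, Tuple[float, float, float, float]]


def filter_by_score(detections: List[Detection],
                    score_threshold: float) -> List[Detection]:
    """Drop detections whose confidence is below score_threshold.

    This runs on the model's outputs only; no weights are involved,
    so changing the threshold cannot change what the model learned.
    """
    return [d for d in detections if d[1] >= score_threshold]


raw = [
    (1, 0.92, (0.10, 0.10, 0.40, 0.40)),
    (1, 0.25, (0.50, 0.50, 0.70, 0.70)),
    (1, 0.000001, (0.00, 0.00, 0.05, 0.05)),
]

# A tiny threshold such as 1e-8 lets nearly everything through,
# which means more boxes to post-process per frame.
print(len(filter_by_score(raw, 1e-8)))  # 3
print(len(filter_by_score(raw, 0.3)))   # 1
```

With the higher threshold, far fewer candidate boxes reach the later stages, which is presumably why it helps inference speed without touching training.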

I think it should be possible to convert your trained ssd_inception_v2 model (pb) to UFF and then to an optimized TensorRT engine. I might try it on my own custom ssd_inception_v2_egohands model. I'll let you know when I find time to make that work.

from hand-detection-tutorial.

jkjung-avt commented on June 26, 2024

So you want to optimize your own custom ssd_inception_v2 model with TensorRT and run inference on Jetson Xavier, right?

In that case, I think you just need to make necessary modifications to my build_engine.py. You could also reference NVIDIA's model_ssd_inception_v2_coco_2017_11_17.py.


Kmarconi commented on June 26, 2024

Yeah that's exactly what I want to do :) Thanks !


Kmarconi commented on June 26, 2024

So I've adapted your build_engine.py script for my purpose, but I'm stuck with this TensorRT error:
[TensorRT] ERROR: UffParser: Graph error: Cycle graph detected
[TensorRT] ERROR: Network must have at least one output
I've been searching for this "Cycle graph detected" error, but no one seems to have solved it except in this thread:

https://forums.developer.nvidia.com/t/tensorrt-error-uffparser-graph-error-cycle-graph-detected/68407/6

He said that he has removed the bug by removing the "map_fn" operation from tensorflow. Is it a good idea in your opinion ? If yes, where can I find this operation ?

If you have any other idea, please let me know :)

And once again, thank you for your time. I'm asking a lot of questions on your repos, but I'm very interested in them.


Kmarconi commented on June 26, 2024

PS: This is the MODEL_SPEC for my model:

'ssd_inception_v2_boats': {
    'input_pb': os.path.abspath(os.path.join(
        DIR_NAME, 'Frozen_trt_graph.pb')),
    'tmp_uff': os.path.abspath(os.path.join(
        DIR_NAME, 'tmp_v2_boats.uff')),
    'output_bin': os.path.abspath(os.path.join(
        DIR_NAME, 'TRT_ssd_inception_v2_boats.bin')),
    'num_classes': 1,
    'min_size': 0.1,
    'max_size': 0.9,
    'input_order': [0, 2, 1],  # order of loc_data, conf_data, priorbox_data
}

min_size and max_size are the same as in my config file, so that's not the problem either ...


jkjung-avt commented on June 26, 2024

Could you try to use the following to convert your pb to uff? Save it as "config.py". Then do python3 /usr/lib/python3.6/dist-packages/uff/bin/convert_to_uff.py -O NMS -t --input-file Frozen_trt_graph.pb -p config.py --output tmp_v2_boats.

"""config-ssd_inception_v2_coco.py
This is the config file for converting ssd_inception_v2_coco.pb to UFF.
This file was modified from NVIDIA's original "sampleUffSSD" according to:
https://devtalk.nvidia.com/default/topic/1037256/tensorrt/sampleuffssd-conversion-fails-keyerror-image_tensor-/post/5270435/#5270435
"""


import graphsurgeon as gs
import tensorflow as tf


Input = gs.create_node(
    "Input",
    op="Placeholder",
    dtype=tf.float32,
    shape=[1, 3, 300, 300])

PriorBox = gs.create_node(
    "PriorBox",
    numLayers=6,
    minScale=0.2,
    maxScale=0.95,
    aspectRatios=[1.0, 2.0, 0.5, 3.0, 0.33],
    layerVariances=[0.1, 0.1, 0.2, 0.2],
    featureMapShapes=[19, 10, 5, 3, 2, 1])

NMS = gs.create_node(
    "NMS",
    scoreThreshold=0.3,  # was 1e-8
    iouThreshold=0.6,
    maxDetectionsPerClass=100,
    maxTotalDetections=100,
    numClasses=2,  # only 1 class + 'background'
    inputOrder=[0, 2, 1],
    scoreConverter="SIGMOID")

concat_priorbox = gs.create_node(
    "concat_priorbox",
    dtype=tf.float32,
    axis=2)

concat_box_loc = gs.create_node(
    "concat_box_loc")

concat_box_conf = gs.create_node(
    "concat_box_conf")

namespace_for_removal = [
    "ToFloat",
    "image_tensor",
    "Preprocessor/map/TensorArrayStack_1/TensorArrayGatherV3",
]

namespace_plugin_map = {
    "MultipleGridAnchorGenerator": PriorBox,
    "Postprocessor": NMS,
    "Preprocessor": Input,
    "ToFloat": Input,
    "image_tensor": Input,
    "MultipleGridAnchorGenerator/Concatenate": concat_priorbox,
    "concat": concat_box_loc,
    "concat_1": concat_box_conf
}


def preprocess(dynamic_graph):
    # remove the unrelated or error layers
    dynamic_graph.remove(
        dynamic_graph.find_nodes_by_path(namespace_for_removal),
        remove_exclusive_dependencies=False)

    # Now create a new graph by collapsing namespaces
    dynamic_graph.collapse_namespaces(
        namespace_plugin_map)

    # Remove the outputs, so we just have a single output node (NMS).
    dynamic_graph.remove(
        dynamic_graph.graph_outputs,
        remove_exclusive_dependencies=False)

    # Remove the Squeeze to avoid "Assertion 'isPlugin(layerName)' failed"
    Squeeze = dynamic_graph.find_node_inputs_by_name(
        dynamic_graph.graph_outputs[0],
        'Squeeze')
    dynamic_graph.forward_inputs(Squeeze)


jkjung-avt commented on June 26, 2024

After you've done the UFF conversion described above, you should have "tmp_v2_boats.uff" in the same directory.

Then you could comment out lines 192 to 201 and let build_engine.py do the UFF-to-TRT conversion directly.

#    dynamic_graph = add_plugin(
#        gs.DynamicGraph(spec['input_pb']),
#        model,
#        spec)
#    _ = uff.from_tensorflow(
#        dynamic_graph.as_graph_def(),
#        output_nodes=['NMS'],
#        output_filename=spec['tmp_uff'],
#        text=True,
#        debug_mode=DEBUG_UFF)

Let me know if it works.


Kmarconi commented on June 26, 2024

Thanks, will give it a try! Since the error comes from the TRT conversion, and the UFF conversion happens before the TRT step in the script, I already have the tmp UFF file. Do you mean that the UFF conversion you described will not give me the same UFF file? Will keep you updated, thanks!


jkjung-avt commented on June 26, 2024

The TRT conversion fails due to a bad/incompatible UFF file. You don't see an error in the UFF conversion phase, but that phase is actually causing the problem.


Kmarconi commented on June 26, 2024

Ok thanks :) It's late here in France but will give it a try tomorrow morning!


jkjung-avt commented on June 26, 2024

I've just updated #35 (comment) by modifying "numClasses" from 91 to 2. That should suit your case.


Kmarconi commented on June 26, 2024

Hi, so I tested the convert_to_uff.py script with the config.py file you gave me, and it writes the UFF file correctly, as you can see in the output below. I commented out the part of the build_engine.py script that was responsible for building the UFF file, but I still get the same error from TRT...

Output of the UFF conversion:

DEBUG [/usr/lib/python3.6/dist-packages/uff/bin/../../uff/converters/tensorflow/converter.py:96] Marking ['NMS'] as outputs
No. nodes: 801
UFF Output written to tmp_v2_boats.uff
UFF Text Output written to tmp_v2_boats.pbtxt

Output of TRT:

[TensorRT] ERROR: Could not register plugin creator: FlattenConcat_TRT in namespace:
[TensorRT] ERROR: UffParser: Graph error: Cycle graph detected
[TensorRT] ERROR: Network must have at least one output


Kmarconi commented on June 26, 2024

The num_classes parameter in the config file for training the model was set to 1 when I did the training. Is that wrong?


jkjung-avt commented on June 26, 2024

"numClasses" should be 2, since the SSD model would output (softmax) 2 classes: "background" and "your_target_object".

However, [TensorRT] ERROR: UffParser: Graph error: Cycle graph detected is another problem. After correcting "numClasses", you still need to solve that somehow.
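As a quick illustration of the "numClasses" rule above, here is a minimal sketch, assuming the training config's num_classes counts only the real object classes. The helper name `nms_num_classes` is made up for this example.

```python
BACKGROUND_CLASSES = 1  # the extra "background" entry the SSD head outputs


def nms_num_classes(trained_num_classes: int) -> int:
    """Return the value for the NMS plugin's numClasses field.

    trained_num_classes is num_classes from the training pipeline config,
    which (by assumption here) excludes the background class.
    """
    if trained_num_classes < 1:
        raise ValueError("the model must be trained on at least one class")
    return trained_num_classes + BACKGROUND_CLASSES


print(nms_num_classes(1))   # 2, the single-class case discussed here
print(nms_num_classes(90))  # 91, matching the stock COCO setting
```

So keeping num_classes at 1 for training and setting "numClasses" to 2 in the UFF config are consistent with each other.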


Kmarconi commented on June 26, 2024

Yeah somehow ^^ I went on every website talking about it but it seems like nobody has a solution for it...


jkjung-avt commented on June 26, 2024

This post (sorry it's in Chinese): https://www.twblogs.net/a/5d5f17a2bd9eee541c32823c

The author basically says that he solved the "Cycle graph detected" problem by using "python2" to run convert_to_uff.py. I'm not sure if you feel like giving that a try...


Kmarconi commented on June 26, 2024

Yeah, I translated this post and saw the solution, and I will definitely give it a try if I find nothing else... I just postponed it because I would need to install protobuf, tensorflow, etc., and that takes a long time on the Xavier, to be honest. Thanks a lot, I will keep you updated on whether the python2 solution works.


jkjung-avt commented on June 26, 2024

Sorry, in my last post I actually meant "I'm not sure" (I've made the correction). I will probably look into this problem more when I have time during the weekend.


Kmarconi commented on June 26, 2024

Do you have any experience with using import tensorflow.contrib.tensorrt as trt instead of import tensorrt as trt? Maybe I could find a way to use this TensorRT library from tensorflow (TF-TRT) instead of NVIDIA's tensorrt library? 🤔


jkjung-avt commented on June 26, 2024

My jkjung-avt/tf_trt_models repository does exactly that. You need to make sure your tensorflow is built with TF-TRT support, though.

I have also tested my custom ssd egohands models with tf_trt_models. You could refer to my blog post: Deploying the Hand Detector onto Jetson TX2


Kmarconi commented on June 26, 2024

I remember now why I left TF-TRT aside... I'm working with virtual environments and I always have Python path problems with it. 10 minutes ago I had "no module named object_detection" and I solved it by deleting the "--user" in your install.sh script. Then I had "no module named nets", and now the same for protos... But my tensorflow installation is working perfectly fine, since I've been able to train, do inference, etc. Yet with every PYTHONPATH entry I add, another error pops up. Will tell you if I'm able to launch your camera_tf_trt script today.


Kmarconi commented on June 26, 2024

The problem is that everybody has a tensorflow/models/research/slim directory, which contains all the modules required by your camera_tf_trt script, but I do not have this directory. I installed tensorflow by following NVIDIA's guide for installing tensorflow (sudo pip3 install --pre --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v43 tensorflow==1.15.2+nv20.3) and since I need to work with virtualenvs, I'm kinda lost ...


jkjung-avt commented on June 26, 2024

My jkjung-avt/tf_trt_models code should work in a virtualenv. I think you just need to fix the pip3 installs (make sure to install packages into the virtualenv).
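One way to narrow down this kind of "no module named ..." problem is to ask the interpreter of the active virtualenv which packages it can actually see. The following is only a sketch using the standard library; `check_importable` is a made-up helper name, and the module names are the top-level ones mentioned in the errors above.

```python
import importlib.util
import sys
from typing import Dict, Iterable


def check_importable(names: Iterable[str]) -> Dict[str, bool]:
    """Map each top-level module name to whether this interpreter can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# sys.prefix shows which environment the running interpreter belongs to;
# inside an activated virtualenv it should point at the virtualenv directory.
print("interpreter prefix:", sys.prefix)

for name, found in check_importable(["object_detection", "nets"]).items():
    print(f"{name}: {'found' if found else 'MISSING'}")
```

Running it once inside and once outside the virtualenv should show whether the packages ended up in the system site-packages (for example because of "--user" or sudo installs) rather than in the virtualenv.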


Kmarconi commented on June 26, 2024

I ran the TRT building script outside of my virtualenv and I finally have my TRT-optimized model. I hope it will be very fast 👍


Kmarconi commented on June 26, 2024

Quite disappointed for the moment... Not going above 24 FPS on the Xavier... Will try to find out why :(
PS: The pb file is sooooo slow to load. I saw on your blog that it could be linked to the threshold, which should be at least 0.3, and that's the value of mine in the config file...

EDIT: It displays only 24 FPS, but the inference itself takes about 15 ms, so I'm at about 66 FPS, which is a good improvement but still far from the performance of the Jetson-inference GitHub project, with which I got up to 150 FPS with SSD-Inception-v2.
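The arithmetic behind those two numbers can be sketched as follows. This assumes the displayed 24 FPS is measured over the whole loop (capture, preprocessing, inference, drawing, display) while the 15 ms covers inference only; the function names are made up for this example.

```python
def fps_from_latency_ms(latency_ms: float) -> float:
    """Convert a per-frame time in milliseconds to frames per second."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


def latency_ms_from_fps(fps: float) -> float:
    """Convert frames per second to a per-frame time in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


inference_ms = 15.0   # measured inference time quoted above
displayed_fps = 24.0  # frame rate shown by the demo

inference_fps = fps_from_latency_ms(inference_ms)
loop_ms = latency_ms_from_fps(displayed_fps)
overhead_ms = loop_ms - inference_ms  # time spent outside inference

print(f"inference-only rate: {inference_fps:.1f} FPS")  # 66.7 FPS
print(f"full loop time:      {loop_ms:.1f} ms")         # 41.7 ms
print(f"non-inference time:  {overhead_ms:.1f} ms")     # 26.7 ms
```

Under that assumption, well over half of each frame is spent outside the model, so the gap to 150 FPS is not only about the engine itself.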


Kmarconi commented on June 26, 2024

Hi jkjung! Hope you had a great weekend. I'm about to retrain my model with the "nms_treshold" parameter value in the config file changed from 1e-08 to 0.3, hoping to improve my model's performance. Do you know any other way to reach the 150 FPS that the Jetson-inference SSD-Inception-v2 engine gave me?


Kmarconi commented on June 26, 2024

Yeah, but the pb-->UFF-->TRT engine workflow is what I was trying at the beginning of this issue, following your advice from 5 days ago, and I got stuck on it because of the "Cycle graph detected" error. Do you know another way to do the UFF-to-TRT conversion?



Kmarconi commented on June 26, 2024

A little update since my last message: updating protobuf by building it from source (v3.8.0) greatly reduced the loading time of my pb file.


jkjung-avt commented on June 26, 2024

Glad to hear this update.


Kmarconi commented on June 26, 2024

Hi jkjung! I hope you are doing fine. I was wondering if you could give me some tips about training ssd-inception with tensorflow. Since the last time we spoke, I did some dataset augmentation by adding the COCO classes that were interesting for me to the dataset. Yesterday I successfully converted the COCO JSON files to Pascal VOC format and created a tfrecord of my new dataset. The problem is that no matter what I do, the training is always stuck at a loss of around 6 and a mAP no better than 0.35, which is pretty bad. I first thought that it was related to my learning rate, but no matter what value I use for my decay steps, the result is always the same. See my config file in the attached zip and feel free to point out anything I did wrong; I still have a lot to learn.

PS: I'm doing 18000 steps because I'm only training on one class with 4100 images and a batch size of 24, and since steps = (1x4100x100)/24 = 17083, I rounded it up to 18000.
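For reference, the step count above can be reproduced with a few lines. This is only a sketch of the formula as written in the post; the argument names are made up, and in particular the post does not say what the factor of 100 stands for, so `multiplier` is a placeholder.

```python
def steps_from_formula(num_classes: int, num_images: int,
                       multiplier: int, batch_size: int) -> int:
    """Evaluate steps = (num_classes * num_images * multiplier) / batch_size.

    The result is truncated to a whole number of steps, which matches
    the 17083 quoted in the post.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return (num_classes * num_images * multiplier) // batch_size


steps = steps_from_formula(num_classes=1, num_images=4100,
                           multiplier=100, batch_size=24)
print(steps)  # 17083
```

Rounding that up to 18000 gives a small margin over the computed value.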

Thanks again !


jkjung-avt commented on June 26, 2024

Sorry, I don't have time to look into this. But from your description, I would suggest you double-check whether your tfrecord data is good: (a) make sure the annotations (bounding boxes and classes) are correct; (b) your model should be able to learn from the training set and do a good job at inference on the validation set.
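Check (a) could start with something like the sketch below. It works on plain dictionaries rather than on the tfrecord itself, so it assumes the annotations have already been decoded into a hypothetical `{"label": int, "box": (xmin, ymin, xmax, ymax)}` form, with coordinates normalized to [0, 1] and class ids starting at 1.

```python
from typing import Any, Dict, List


def find_bad_annotations(annotations: List[Dict[str, Any]],
                         num_classes: int) -> List[str]:
    """Return a human-readable message for every suspicious annotation."""
    problems = []
    for index, ann in enumerate(annotations):
        xmin, ymin, xmax, ymax = ann["box"]
        label = ann["label"]
        # Boxes must have positive width and height and stay inside the image.
        if not (0.0 <= xmin < xmax <= 1.0 and 0.0 <= ymin < ymax <= 1.0):
            problems.append(f"annotation {index}: invalid box {ann['box']}")
        # Assumed convention: ids run from 1 to num_classes (0 is background).
        if not (1 <= label <= num_classes):
            problems.append(
                f"annotation {index}: label {label} outside 1..{num_classes}")
    return problems


samples = [
    {"label": 1, "box": (0.10, 0.20, 0.50, 0.60)},  # looks fine
    {"label": 0, "box": (0.70, 0.20, 0.40, 0.60)},  # xmin > xmax, label 0
]

for message in find_bad_annotations(samples, num_classes=1):
    print(message)
```

Conversions between annotation formats (such as COCO JSON to Pascal VOC) are a common place for swapped coordinates or shifted class ids, so a check like this is worth running before suspecting the learning rate.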
