
tf_to_trt_image_classification's Introduction

TensorFlow->TensorRT Image Classification

landing graphic

This repository contains examples, scripts, and code for image classification using TensorFlow models (from the TensorFlow-Slim model zoo) converted to TensorRT. Converting TensorFlow models to TensorRT offers significant performance gains on the Jetson TX2, as shown below.

Models

The table below shows various details related to pretrained models ported from the TensorFlow slim model zoo.

| Model | Input Size | TensorRT (TX2 / Half) | TensorRT (TX2 / Float) | TensorFlow (TX2 / Float) | Input Name | Output Name | Preprocessing Fn. |
| --- | --- | --- | --- | --- | --- | --- | --- |
| inception_v1 | 224x224 | 7.98ms | 12.8ms | 27.6ms | input | InceptionV1/Logits/SpatialSqueeze | inception |
| inception_v3 | 299x299 | 26.3ms | 46.1ms | 98.4ms | input | InceptionV3/Logits/SpatialSqueeze | inception |
| inception_v4 | 299x299 | 52.1ms | 88.2ms | 176ms | input | InceptionV4/Logits/Logits/BiasAdd | inception |
| inception_resnet_v2 | 299x299 | 53.0ms | 98.7ms | 168ms | input | InceptionResnetV2/Logits/Logits/BiasAdd | inception |
| resnet_v1_50 | 224x224 | 15.7ms | 27.1ms | 63.9ms | input | resnet_v1_50/SpatialSqueeze | vgg |
| resnet_v1_101 | 224x224 | 29.9ms | 51.8ms | 107ms | input | resnet_v1_101/SpatialSqueeze | vgg |
| resnet_v1_152 | 224x224 | 42.6ms | 78.2ms | 157ms | input | resnet_v1_152/SpatialSqueeze | vgg |
| resnet_v2_50 | 299x299 | 27.5ms | 44.4ms | 92.2ms | input | resnet_v2_50/SpatialSqueeze | inception |
| resnet_v2_101 | 299x299 | 49.2ms | 83.1ms | 160ms | input | resnet_v2_101/SpatialSqueeze | inception |
| resnet_v2_152 | 299x299 | 74.6ms | 124ms | 230ms | input | resnet_v2_152/SpatialSqueeze | inception |
| mobilenet_v1_0p25_128 | 128x128 | 2.67ms | 2.65ms | 15.7ms | input | MobilenetV1/Logits/SpatialSqueeze | inception |
| mobilenet_v1_0p5_160 | 160x160 | 3.95ms | 4.00ms | 16.9ms | input | MobilenetV1/Logits/SpatialSqueeze | inception |
| mobilenet_v1_1p0_224 | 224x224 | 12.9ms | 12.9ms | 24.4ms | input | MobilenetV1/Logits/SpatialSqueeze | inception |
| vgg_16 | 224x224 | 38.2ms | 79.2ms | 171ms | input | vgg_16/fc8/BiasAdd | vgg |

The times recorded include data transfer to the GPU, network execution, and data transfer back from the GPU; they do not include preprocessing. See scripts/test_tf.py, scripts/test_trt.py, and src/test/test_trt.cu for implementation details.
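For intuition, here is a minimal sketch (an illustration under stated assumptions, not the project's exact code) of the kind of TensorFlow 1.x timing loop behind the TensorFlow column: one warm-up run, then the mean wall-clock time of repeated session runs, with preprocessing kept outside the timed region.

import time

import tensorflow as tf  # TensorFlow 1.x API, as used by this project


def time_frozen_graph(pb_path, input_name, output_name, image, num_runs=50):
    # num_runs and the overall structure are assumptions for illustration.
    graph_def = tf.GraphDef()
    with open(pb_path, 'rb') as f:
        graph_def.ParseFromString(f.read())
    with tf.Session() as sess:
        tf.import_graph_def(graph_def, name='')
        output = sess.graph.get_tensor_by_name(output_name + ':0')
        feed = {input_name + ':0': image[None, ...]}  # add batch dimension
        sess.run(output, feed)  # warm-up run, excluded from timing
        start = time.time()
        for _ in range(num_runs):
            sess.run(output, feed)
        return (time.time() - start) / num_runs * 1000.0  # mean ms per run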

Setup

  1. Flash the Jetson TX2 using JetPack 3.2. Be sure to install

    • CUDA 9.0
    • OpenCV4Tegra
    • cuDNN
    • TensorRT 3.0
  2. Install pip on Jetson TX2.

    sudo apt-get install python-pip
    
  3. Install TensorFlow on Jetson TX2.

    1. Download the TensorFlow 1.5.0 pip wheel from here. This build of TensorFlow is provided as a convenience for the purposes of this project.

    2. Install TensorFlow using pip

        sudo pip install tensorflow-1.5.0rc0-cp27-cp27mu-linux_aarch64.whl
      
  4. Install uff exporter on Jetson TX2.

    1. Download TensorRT 3.0.4 for Ubuntu 16.04 and CUDA 9.0 tar package from https://developer.nvidia.com/nvidia-tensorrt-download.

    2. Extract archive

        tar -xzf TensorRT-3.0.4.Ubuntu-16.04.3.x86_64.cuda-9.0.cudnn7.0.tar.gz
      
    3. Install uff python package using pip

        sudo pip install TensorRT-3.0.4/uff/uff-0.2.0-py2.py3-none-any.whl
      
  5. Clone and build this project

    git clone --recursive https://github.com/NVIDIA-Jetson/tf_to_trt_image_classification.git
    cd tf_to_trt_image_classification
    mkdir build
    cd build
    cmake ..
    make 
    cd ..
    

Download models and create frozen graphs

Run the following bash script to download all of the pretrained models.

source scripts/download_models.sh

If there are any models you don't want to use, simply remove the URL from the model list in scripts/download_models.sh.
Next, because the TensorFlow models are provided in checkpoint format, we must convert them to frozen graphs for optimization with TensorRT. Run the scripts/models_to_frozen_graphs.py script.

python scripts/models_to_frozen_graphs.py
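For context, a minimal sketch (assumptions only; the project's script drives this per model via the NETS metadata) of what checkpoint-to-frozen-graph conversion means in TensorFlow 1.x: restore the variables, then bake them into graph constants.

import tensorflow as tf  # TensorFlow 1.x API

with tf.Session() as sess:
    # Hypothetical checkpoint paths; the real ones live in scripts/model_meta.py.
    saver = tf.train.import_meta_graph('data/checkpoints/model.ckpt.meta')
    saver.restore(sess, 'data/checkpoints/model.ckpt')
    # Replace variables with constants so the graph is self-contained.
    frozen = tf.graph_util.convert_variables_to_constants(
        sess, sess.graph_def, ['InceptionV1/Logits/SpatialSqueeze'])
    with open('data/frozen_graphs/model.pb', 'wb') as f:
        f.write(frozen.SerializeToString())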

If you removed any models in the previous step, you must add 'exclude': True to the corresponding entry in the NETS dictionary located in scripts/model_meta.py (a sketch of such an entry follows the command below). If you are following the instructions for executing engines below, you will also need some sample images. Run the following script to download a few images from ImageNet.

source scripts/download_images.sh
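For reference, an excluded entry in the NETS dictionary of scripts/model_meta.py might look like the following (an abbreviated, hypothetical sketch; the real entries carry many more fields):

NETS = {
    'inception_v1': {
        # ... existing fields (input_name, output_names, input_width, ...) ...
        'exclude': True,  # skip this model in the batch conversion/benchmark scripts
    },
    # ... other models ...
}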

Convert frozen graph to TensorRT engine

Run the scripts/convert_plan.py script from the root directory of the project, referencing the models table for relevant parameters. For example, to convert the Inception V1 model, run the following:

python scripts/convert_plan.py data/frozen_graphs/inception_v1.pb data/plans/inception_v1.plan input 224 224 InceptionV1/Logits/SpatialSqueeze 1 0 float

The inputs to the convert_plan.py script are:

  1. frozen graph path
  2. output plan path
  3. input node name
  4. input height
  5. input width
  6. output node name
  7. max batch size
  8. max workspace size
  9. data type (float or half)

This script assumes single-input, single-output image models and may not work out of the box for models other than those in the table above.
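Under the hood, the frozen-graph-to-UFF half of the conversion looks roughly like this (a sketch assuming the TensorRT 3.x-era uff package installed in setup step 4; building the plan from the UFF file is then handled by the compiled uff_to_plan binary):

import uff  # installed from the TensorRT tar package in setup step 4

# Paths and node names follow the Inception V1 example above.
uff_model = uff.from_tensorflow_frozen_model(
    'data/frozen_graphs/inception_v1.pb',
    ['InceptionV1/Logits/SpatialSqueeze'],
    output_filename='data/tmp.uff',
    text=False,
)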

Execute TensorRT engine

Call the examples/classify_image program from the root directory of the project, referencing the models table for relevant parameters. For example, to run the Inception V1 model converted above:

./build/examples/classify_image/classify_image data/images/gordon_setter.jpg data/plans/inception_v1.plan data/imagenet_labels_1001.txt input InceptionV1/Logits/SpatialSqueeze inception

For reference, the inputs to the example program are:

  1. input image path
  2. plan file path
  3. labels file (one label per line, line number corresponds to index in output)
  4. input node name
  5. output node name
  6. preprocessing function (either vgg or inception)

We provide two image label files in the data folder. Some of the TensorFlow models were trained with an additional "background" class, giving those models 1001 outputs instead of 1000. To determine the number of outputs for each model, see the NETS variable in scripts/model_meta.py.
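For example, a quick way to check is to read num_classes out of NETS (a sketch; it assumes you run it from the scripts/ directory or have scripts/ on PYTHONPATH, and the 1000-class file name is inferred from the 1001-class one):

from model_meta import NETS

for name, net in NETS.items():
    print(name, '->', 'imagenet_labels_%d.txt' % net['num_classes'])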

Benchmark all models

To benchmark all of the models, first convert all of the models that you downloaded above into TensorRT engines. Run the following script to convert them all:

python scripts/frozen_graphs_to_plans.py

If you want to change parameters related to TensorRT optimization, just edit the scripts/frozen_graphs_to_plans.py file. Next, to benchmark all of the models, run the scripts/test_trt.py script:

python scripts/test_trt.py

Once finished, the timing results will be stored at data/test_output_trt.txt. If you also want to benchmark the TensorFlow models, simply run:

python scripts/test_tf.py

The results will be stored at data/test_output_tf.txt. This benchmarking script loads an example image as input, so make sure you have downloaded the sample images as described above.


tf_to_trt_image_classification's Issues

Build issues with modern libraries

Is this repository still maintained / a recommended way to do things?

Having installed the latest Jetpack (4.5.1), and other contemporary packages, I notice that this code doesn't build as-is any more.

There are a couple of references to OpenCV's cv::imread() that fail because newer OpenCV requires cv::IMREAD_COLOR instead of CV_LOAD_IMAGE_COLOR, and the call to IUffParser::registerInput fails because it now requires the input dims order to be specified (presumably nvuffparser::UffInputOrder::kNCHW).

I notice various other things are marked as deprecated (the use of DimsCHW, for example), so the code should probably be updated to match the latest NVIDIA interfaces.

TensorRT 5 has an issue when converting a model in half mode

Hello everyone,
I recently updated to TensorRT 5 and tried to convert the same model, but got the issue below. Has anybody hit the same issue?

UFFParser: Parser error: res_aspp_g/decoder/resnet/bn_conv1/batch_normalization/moving_variance: Weight 78177.664062 at index 4 is outside of [-65504.000000, 65504.000000]. Please try running the parser in a higher precision mode and setting the builder to fp16 mode instead.
Failed to parse UFF

Killed

On a brand new jetson while running the test:

ubuntu@tegra-ubuntu:~/Documents/Projects/tf_to_trt_image_classification$ python scripts/test_tf.py
Testing mobilenet_v1_0p25_128
2018-03-15 01:29:49.697714: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:865] ARM64 does not support NUMA - returning NUMA node zero                                                                                              
2018-03-15 01:29:49.697838: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1212] Found device 0 with properties:
name: NVIDIA Tegra X2 major: 6 minor: 2 memoryClockRate(GHz): 1.3005
pciBusID: 0000:00:00.0
totalMemory: 7.67GiB freeMemory: 6.14GiB
2018-03-15 01:29:49.697886: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1312] Adding visible gpu devices: 0
2018-03-15 01:29:50.859554: I tensorflow/core/common_runtime/gpu/gpu_device.cc:993] Creating TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5664 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus
id: 0000:00:00.0, compute capability: 6.2)
['Gordon setter\n', 'Rottweiler\n', 'Tibetan mastiff\n', 'black-and-tan coonhound\n', 'flat-coated retriever\n']
Testing resnet_v1_50
2018-03-15 01:29:57.964211: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1312] Adding visible gpu devices: 0
2018-03-15 01:29:57.964345: I tensorflow/core/common_runtime/gpu/gpu_device.cc:993] Creating TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4491 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus
id: 0000:00:00.0, compute capability: 6.2)
['Gordon setter\n', 'Irish setter, red setter\n', 'cocker spaniel, English cocker spaniel, cocker\n', 'black-and-tan coonhound\n', 'Rottweiler\n']                                                                                          
Testing mobilenet_v1_1p0_224
2018-03-15 01:30:06.583719: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1312] Adding visible gpu devices: 0
2018-03-15 01:30:06.583871: I tensorflow/core/common_runtime/gpu/gpu_device.cc:993] Creating TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2178 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus
id: 0000:00:00.0, compute capability: 6.2)
['Gordon setter\n', 'Irish setter, red setter\n', 'black-and-tan coonhound\n', 'cocker spaniel, English cocker spaniel, cocker\n', 'Tibetan mastiff\n']                                                                                     
Testing inception_v2
2018-03-15 01:30:11.103292: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1312] Adding visible gpu devices: 0
2018-03-15 01:30:11.103426: I tensorflow/core/common_runtime/gpu/gpu_device.cc:993] Creating TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2131 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus
id: 0000:00:00.0, compute capability: 6.2)
['Gordon setter\n', 'Irish setter, red setter\n', 'cocker spaniel, English cocker spaniel, cocker\n', 'Rottweiler\n', 'black-and-tan coonhound\n']                                                                                          
Testing inception_v3
2018-03-15 01:30:19.184979: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1312] Adding visible gpu devices: 0
2018-03-15 01:30:19.185114: I tensorflow/core/common_runtime/gpu/gpu_device.cc:993] Creating TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 2030 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus
id: 0000:00:00.0, compute capability: 6.2)
['Gordon setter\n', 'Rottweiler\n', 'Irish setter, red setter\n', 'black-and-tan coonhound\n', 'English setter\n']
Testing resnet_v2_152
2018-03-15 01:30:38.026162: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1312] Adding visible gpu devices: 0
2018-03-15 01:30:38.026349: I tensorflow/core/common_runtime/gpu/gpu_device.cc:993] Creating TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 27 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X2, pci bus id: 0000:00:00.0, compute capability: 6.2)
Killed

This is a new JetPack install. Any ideas why the Jetson is not reporting its full memory?
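Not an official fix, but one common mitigation on memory-constrained Jetson boards is to keep TensorFlow from reserving most of the free GPU memory in each session (a hedged sketch for the TF 1.x API used here; on the TX2 the CPU and GPU share physical memory):

import tensorflow as tf

# Allocate GPU memory on demand instead of grabbing nearly all free
# memory up front when the session is created.
config = tf.ConfigProto()
config.gpu_options.allow_growth = True

with tf.Session(config=config) as sess:
    pass  # build and run the benchmark graph here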

[Not an Issue] Modify for Running on 1080ti, failure on cmake

Hey, thanks for making this repo.

I was hoping to run this on a 1080 Ti to get benchmarks, but being new to TensorRT and these platforms, I'm having a bit of trouble building the project.

My main issue is that "make" returns an error of:
/usr/bin/ld: cannot find -lnvinfer
/usr/bin/ld: cannot find -lnvparsers


However, I did install CUDA and TensorRT:
which tensorrt: /usr/local/bin/tensorrt

In my CMake file I also added:

include_directories(/home/mikeliao/Downloads/TensorRT-3.0.4/include)
link_directories(/home/mikeliao/Downloads/TensorRT-3.0.4/include)

and commented out #add_subdirectory(examples).

Any help is appreciated, thank you for your hard work.

convert issue

Hello,

I have a TF .pb file and have run into many conversion issues. I removed '_ResizeBilinear', '_BatchToSpaceND', and '_SpaceToBatchND', but got the issue below:

xhz@xhz-omen:tf_to_trt_image_classification$ python scripts/convert_plan.py /home/xhz/Projects/models/xw_model/new/LsGan_model_test.pb /home/xhz/Projects/models/xw_model/LsGan_model.pb.plan input_img 360 640 sigmoid_logits 1 0 float
Using output node sigmoid_logits
Converting to UFF graph
WARNING: The UFF converter currently only supports 2D dilated convolutions
WARNING: The UFF converter currently only supports 2D dilated convolutions
WARNING: The UFF converter currently only supports 2D dilated convolutions
WARNING: The UFF converter currently only supports 2D dilated convolutions
WARNING: The UFF converter currently only supports 2D dilated convolutions
WARNING: The UFF converter currently only supports 2D dilated convolutions
WARNING: The UFF converter currently only supports 2D dilated convolutions
WARNING: The UFF converter currently only supports 2D dilated convolutions
WARNING: The UFF converter currently only supports 2D dilated convolutions
No. nodes: 570
UFF Output written to data/tmp.uff
UFFParser: parsing input_img
UFFParser: parsing res_aspp_g/decoder/resnet/conv1/weights
UFFParser: parsing res_aspp_g/decoder/resnet/conv1/conv1
UFFParser: Convolution: add Padding Layer to support asymmetric padding
UFFParser: Convolution: Left: 2
UFFParser: Convolution: Right: 3
UFFParser: Convolution: Top: 2
UFFParser: Convolution: Bottom: 3
UFFParser: parsing res_aspp_g/decoder/resnet/bn_conv1/BatchNorm/Const
UFFParser: parsing res_aspp_g/decoder/resnet/bn_conv1/BatchNorm/beta
UFFParser: parsing res_aspp_g/decoder/resnet/bn_conv1/BatchNorm/Const_1
UFFParser: parsing res_aspp_g/decoder/resnet/bn_conv1/BatchNorm/Const_2
UFFParser: parsing res_aspp_g/decoder/resnet/bn_conv1/BatchNorm/FusedBatchNorm
Parameter check failed at: Utils.cpp::reshapeWeights::71, condition: input.values != nullptr
UFFParser: Parser error: res_aspp_g/decoder/resnet/bn_conv1/BatchNorm/FusedBatchNorm: reshape weights failed!
Failed to parse UFF

So I removed the 'FusedBatchNorm' layer, and then got the issue below:

UFFParser: parsing res_aspp_g/decoder/resnet/res4b3_branch2a/weights
UFFParser: parsing res_aspp_g/decoder/resnet/res4b3_branch2a/Conv2D
res_aspp_g/decoder/resnet/res4b1_branch2b/convolution: at least three non-batch dimensions are required for input
UFFParser: parsing res_aspp_g/decoder/resnet/res4b3_branch2a/Relu
UFFParser: parsing res_aspp_g/decoder/resnet/res4b3_branch2b/weights
UFFParser: parsing res_aspp_g/decoder/resnet/res4b3_branch2b/convolution
UFFParser: parsing res_aspp_g/decoder/resnet/res4b3_branch2b/Relu
UFFParser: parsing res_aspp_g/decoder/resnet/conv/weights
UFFParser: parsing res_aspp_g/decoder/resnet/conv/Conv2D
res_aspp_g/decoder/resnet/res4b1_branch2b/Relu: at least one non-batch dimension is required for input
UFFParser: parsing res_aspp_g/decoder/resnet/conv/biases
UFFParser: parsing res_aspp_g/decoder/resnet/conv/BiasAdd
res_aspp_g/decoder/resnet/res4b1_branch2c/Conv2D: at least three non-batch dimensions are required for input
UFFParser: Parser error: res_aspp_g/decoder/resnet/conv/BiasAdd: The input to the Scale Layer is required to have a minimum of 3 dimensions.
Failed to parse UFF

ImportError: No module named 'graphsurgeon'

my platform is TX2,
dpkg -l | grep TensorRT
ii libnvinfer-dev 4.1.3-1+cuda9.0 arm64 TensorRT development libraries and headers
ii libnvinfer-samples 4.1.3-1+cuda9.0 arm64 TensorRT samples and documentation
ii libnvinfer4 4.1.3-1+cuda9.0 arm64 TensorRT runtime libraries
ii tensorrt 4.0.2.0-1+cuda9.0 arm64 Meta package of TensorRT

When I run

python scripts/convert_plan.py data/frozen_graphs/inception_v1.pb data/plans/inception_v1.plan input 224 224 InceptionV1/Logits/SpatialSqueeze 1 0 float

it shows:

Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/uff/converters/tensorflow/conversion_helpers.py", line 36, in from_tensorflow
import graphsurgeon as gs
ImportError: No module named 'graphsurgeon'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "scripts/convert_plan.py", line 71, in
data_type
File "scripts/convert_plan.py", line 22, in frozenToPlan
text=False,
File "/usr/local/lib/python3.5/dist-packages/uff/converters/tensorflow/conversion_helpers.py", line 149, in from_tensorflow_frozen_model
return from_tensorflow(graphdef, output_nodes, preprocessor, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/uff/converters/tensorflow/conversion_helpers.py", line 41, in from_tensorflow
https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/#python and click on the 'TensoRT Python API' link""".format(err))
ImportError: ERROR: Failed to import module (No module named 'graphsurgeon')
Please make sure you have graphsurgeon installed.
For installation instructions, see:
https://docs.nvidia.com/deeplearning/sdk/tensorrt-api/#python and click on the 'TensoRT Python API' link

So how can I fix it?
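For what it's worth, newer TensorRT tar packages ship a graphsurgeon wheel alongside the uff wheel, so an install analogous to setup step 4 may resolve this; the exact path is an assumption and depends on your TensorRT version:

sudo pip install TensorRT-<version>/graphsurgeon/graphsurgeon-*.whl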

convert pb to uff

I am trying to run inference for an image classification model on a Jetson Xavier.
Environment: TensorFlow 1.15, TensorRT 7, Python 3.6.
After training, I converted the ckpt files to a .pb file, and I can use this .pb file to predict images successfully. Does that mean I froze the model successfully?
When I try to convert the .pb to UFF, I have lots of problems.
Can this code work for my problem or not? Can you give me some advice?

models_to_frozen_graphs.py killed

Running TensorFlow 1.13, CUDA 10, and Python 3.

Jetson Nano

~/Desktop/tf_to_trt_image_classification$ python3 scripts/models_to_frozen_graphs.py
Converting vgg_16
{'model': <function vgg_16 at 0x7f2d7e81e0>, 'arg_scope': <function vgg_arg_scope at 0x7f78c67950>, 'num_classes': 1000, 'input_name': 'input', 'output_names': ['vgg_16/fc8/BiasAdd'], 'input_width': 224, 'input_height': 224, 'input_channels': 3, 'preprocess_fn': <function preprocess_vgg at 0x7f2d75cd90>, 'postprocess_fn': <function postprocess_vgg at 0x7f2d75ce18>, 'checkpoint_filename': 'data/checkpoints/vgg_16.ckpt', 'frozen_graph_filename': 'data/frozen_graphs/vgg_16.pb', 'trt_convert_status': 'works', 'plan_filename': 'data/plans/vgg_16.plan'}
2019-06-20 15:05:59.799985: W tensorflow/core/platform/profile_utils/cpu_utils.cc:98] Failed to find bogomips in /proc/cpuinfo; cannot determine CPU frequency
2019-06-20 15:05:59.802687: I tensorflow/compiler/xla/service/service.cc:161] XLA service 0x2c3b6ff0 executing computations on platform Host. Devices:
2019-06-20 15:05:59.802777: I tensorflow/compiler/xla/service/service.cc:168] StreamExecutor device (0): <undefined>, <undefined>
2019-06-20 15:05:59.947895: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:965] ARM64 does not support NUMA - returning NUMA node zero
2019-06-20 15:05:59.948185: I tensorflow/compiler/xla/service/service.cc:161] XLA service 0x2ad35380 executing computations on platform CUDA. Devices:
2019-06-20 15:05:59.948250: I tensorflow/compiler/xla/service/service.cc:168] StreamExecutor device (0): NVIDIA Tegra X1, Compute Capability 5.3
2019-06-20 15:05:59.949514: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1433] Found device 0 with properties:
name: NVIDIA Tegra X1 major: 5 minor: 3 memoryClockRate(GHz): 0.9216
pciBusID: 0000:00:00.0
totalMemory: 3.86GiB freeMemory: 1.51GiB
2019-06-20 15:05:59.949596: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-06-20 15:06:04.930047: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-06-20 15:06:04.930119: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0
2019-06-20 15:06:04.930152: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N
2019-06-20 15:06:04.930426: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 697 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X1, pci bus id: 0000:00:00.0, compute capability: 5.3)
2019-06-20 15:06:04.934580: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-06-20 15:06:04.934683: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-06-20 15:06:04.934720: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0
2019-06-20 15:06:04.934753: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0: N
2019-06-20 15:06:04.934868: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 697 MB memory) -> physical GPU (device: 0, name: NVIDIA Tegra X1, pci bus id: 0000:00:00.0, compute capability: 5.3)
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/op_def_library.py:263: colocate_with (from tensorflow.python.framework.ops) is deprecated and will be removed in a future version.
Instructions for updating: Colocations handled automatically by placer.
WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py:1266: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating: Use standard file APIs to check for files with this prefix.
Killed

convert_plan.py error

Hi,

I am using the following command:
python scripts/convert_plan.py data/frozen_graphs/inception_v1.pb data/plans/inception_v1.plan input 224 224 InceptionV1/Logits/SpatialSqueeze 1 0 float

to convert my network based on the pix2pix TensorFlow model, but I am met with the following error:

Using output node generate_output/output
Converting to UFF graph
Warning: No conversion function registered for layer: Cast yet.
Converting as custom op Cast generate_output/output
name: "generate_output/output"
op: "Cast"
input: "generate_output/output/Minimum"
attr {
key: "DstT"
value {
type: DT_UINT8
}
}
attr {
key: "SrcT"
value {
type: DT_FLOAT
}
}

Traceback (most recent call last):
File "scripts/convert_plan.py", line 71, in
data_type
File "scripts/convert_plan.py", line 22, in frozenToPlan
text=False,
File "/usr/local/lib/python2.7/dist-packages/uff/converters/tensorflow/conversion_helpers.py", line 103, in from_tensorflow_frozen_model
return from_tensorflow(graphdef, output_nodes, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/uff/converters/tensorflow/conversion_helpers.py", line 75, in from_tensorflow
name="main")
File "/usr/local/lib/python2.7/dist-packages/uff/converters/tensorflow/converter.py", line 64, in convert_tf2uff_graph
uff_graph, input_replacements)
File "/usr/local/lib/python2.7/dist-packages/uff/converters/tensorflow/converter.py", line 51, in convert_tf2uff_node
op, name, tf_node, inputs, uff_graph, tf_nodes=tf_nodes)
File "/usr/local/lib/python2.7/dist-packages/uff/converters/tensorflow/converter.py", line 28, in convert_layer
fields = cls.parse_tf_attrs(tf_node.attr)
File "/usr/local/lib/python2.7/dist-packages/uff/converters/tensorflow/converter.py", line 177, in parse_tf_attrs
for key, val in attrs.items()}
File "/usr/local/lib/python2.7/dist-packages/uff/converters/tensorflow/converter.py", line 177, in
for key, val in attrs.items()}
File "/usr/local/lib/python2.7/dist-packages/uff/converters/tensorflow/converter.py", line 172, in parse_tf_attr_value
return cls.convert_tf2uff_field(code, val)
File "/usr/local/lib/python2.7/dist-packages/uff/converters/tensorflow/converter.py", line 146, in convert_tf2uff_field
return TensorFlowToUFFConverter.convert_tf2numpy_dtype(val)
File "/usr/local/lib/python2.7/dist-packages/uff/converters/tensorflow/converter.py", line 74, in convert_tf2numpy_dtype
return np.dtype(dt[dtype])
TypeError: list indices must be integers, not AttrValue

I believe it is due to some unsupported layers. Does anyone know of a workaround for this?

Error - While Creating Plan -convert_plan.py

Hi,

I am getting the following error when I try to create the engine with the command given below. Also attached is the error log; any info/help is appreciated.
(Using inception_v1.pb)
Command:
python scripts/convert_plan.py data/frozen_graphs/inception_v1.pb data/plans/inception_v1.plan input 224 224 InceptionV1/Logits/SpatialSqueeze 1 0 float

Error:
--------------- Timing InceptionV1/InceptionV1/Conv2d_1a_7x7/Conv2D + InceptionV1/InceptionV1/Conv2d_1a_7x7/Relu(14)
createConstants (243) - Cask Error in ../builder/caskConvolutionTraits.cpp: 0 (initDeviceReservedSpace)
createConstants (243) - Cask Error in ../builder/caskConvolutionTraits.cpp: 0 (initDeviceReservedSpace)

Error_log.txt

'nvinfer1::CudaError'

I am trying to execute the TensorRT engine, but:

./build/examples/classify_image/classify_image data/images/gordon_setter.jpg data/plans/inception_v1.plan data/imagenet_labels_1001.txt input InceptionV1/Logits/SpatialSqueeze inception
Loading TensorRT engine from plan file...
cudnnEngine.cpp (56) - Cuda Error in initializeCommonContext: 4
terminate called after throwing an instance of 'nvinfer1::CudaError'
what(): std::exception
Aborted (core dumped)

Any ideas?

Not able to run demo on Jetson TX2 with USB camera

Hello, I am trying to run the image classification demo using OpenCV and a USB camera on a Jetson TX2. I have tested the camera and it is working fine. But once I feed the camera frames to the modified classify_image demo, it throws a segmentation fault. My modified code:
/**
 * Copyright (c) 2018, NVIDIA CORPORATION. All rights reserved.
 * Full license terms provided in LICENSE.md file.
 */

#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
#include <NvInfer.h>
#include <opencv2/opencv.hpp>
#include "examples/classify_image/utils.h"

using namespace std;
using namespace cv;
using namespace nvinfer1;

class Logger : public ILogger
{
void log(Severity severity, const char * msg) override
{
if (severity != Severity::kINFO)
cout << msg << endl;
}
} gLogger;

/**
 * image_file: path to image
 * plan_file: path of the serialized engine file
 * label_file: file with <class_name> per line
 * input_name: name of the input tensor
 * output_name: name of the output tensor
 * preprocessing_fn: 'vgg' or 'inception'
 */
int main(int argc, char *argv[])
{

if (argc != 6)
{
cout << "Usage: classify_image <image_file> <plan_file> <label_file> <input_name> <output_name> <preprocessing_fn>\n";
return 0;
}

//string videoFilename = argv[1];
string planFilename = argv[1];
string labelFilename = argv[2];
string inputName = argv[3];
string outputName = argv[4];
string preprocessingFn = argv[5];

/* load the engine */
cout << "Loading TensorRT engine from plan file..." << endl;
ifstream planFile(planFilename);

if (!planFile.is_open())
{
cout << "Could not open plan file." << endl;
return 1;
}

stringstream planBuffer;
planBuffer << planFile.rdbuf();
string plan = planBuffer.str();
IRuntime *runtime = createInferRuntime(gLogger);
ICudaEngine *engine = runtime->deserializeCudaEngine((void *)plan.data(), plan.size(), nullptr);
IExecutionContext *context = engine->createExecutionContext();

/* get the input / output dimensions */
int inputBindingIndex, outputBindingIndex;
inputBindingIndex = engine->getBindingIndex(inputName.c_str());
outputBindingIndex = engine->getBindingIndex(outputName.c_str());

if (inputBindingIndex < 0)
{
cout << "Invalid input name." << endl;
return 1;
}

if (outputBindingIndex < 0)
{
cout << "Invalid output name." << endl;
return 1;
}

Dims inputDims, outputDims;
inputDims = engine->getBindingDimensions(inputBindingIndex);
outputDims = engine->getBindingDimensions(outputBindingIndex);
int inputWidth, inputHeight;
inputHeight = inputDims.d[1];
inputWidth = inputDims.d[2];

/* read image, convert color, and resize */
cout << "Preprocessing Video input..." << endl;
VideoCapture cap(1);
// Check if camera opened successfully
if(!cap.isOpened())
{
cout << "Error opening video stream or file" << endl;
return -1;
}
for(;;)
{

cv::Mat image;

cap >> image;

if (image.empty())
{
cout << "Could not read image from file." << endl;
return 1;
}
cv::imshow("Frame",image);
cv::cvtColor(image,image,cv::COLOR_BGR2RGB, 3);
cv::resize(image, image, cv::Size(inputWidth, inputHeight));

/* convert from uint8+NHWC to float+NCHW */
float *inputDataHost, *outputDataHost;
size_t numInput, numOutput;
numInput = numTensorElements(inputDims);
numOutput = numTensorElements(outputDims);
inputDataHost = (float *) malloc(numInput * sizeof(float));
outputDataHost = (float *) malloc(numOutput * sizeof(float));
cvImageToTensor(image, inputDataHost, inputDims);
if (preprocessingFn == "vgg")
preprocessVgg(inputDataHost, inputDims);
else if (preprocessingFn == "inception")
preprocessInception(inputDataHost, inputDims);
else
{
cout << "Invalid preprocessing function argument, must be vgg or inception. \n" << endl;
return 1;
}

/* transfer to device */
float *inputDataDevice, *outputDataDevice;
cudaMalloc(&inputDataDevice, numInput * sizeof(float));
cudaMalloc(&outputDataDevice, numOutput * sizeof(float));
cudaMemcpy(inputDataDevice, inputDataHost, numInput * sizeof(float), cudaMemcpyHostToDevice);
void *bindings[2];
bindings[inputBindingIndex] = (void *) inputDataDevice;
bindings[outputBindingIndex] = (void *) outputDataDevice;

/* execute engine */
cout << "Executing inference engine..." << endl;
const int kBatchSize = 1;
context->execute(kBatchSize, bindings);
/* transfer output back to host */
cudaMemcpy(outputDataHost, outputDataDevice, numOutput * sizeof(float), cudaMemcpyDeviceToHost);

/* parse output */
vector<size_t> sortedIndices = argsort(outputDataHost, outputDims);

cout << "\nThe top-5 indices are: ";
for (int i = 0; i < 5; i++)
cout << sortedIndices[i] << " ";

ifstream labelsFile(labelFilename);

if (!labelsFile.is_open())
{
cout << "\nCould not open label file." << endl;
return 1;
}

vector<string> labelMap;
string label;
while(getline(labelsFile, label))
{
labelMap.push_back(label);
}

cout << "\nWhich corresponds to class labels: ";
for (int i = 0; i < 5; i++)
cout << endl << i << ". " << labelMap[sortedIndices[i]];
cout << endl;

/* clean up */
runtime->destroy();
engine->destroy();
context->destroy();
free(inputDataHost);
free(outputDataHost);
cudaFree(inputDataDevice);
cudaFree(outputDataDevice);
char c=(char)waitKey(25);
if(c==27)
break;
}
cap.release();
// Closes all the frames

return 0;
}

Tensorrt problem with ArgMax node

Hi, I have studied this repo and it was very helpful. Now I am trying to convert Lanenet (https://github.com/MaybeShewill-CV/lanenet-lane-detection) to TensorRT. I was able to convert the model and write the code. However, when I ran inference, I got the following error:

[TRT] UffParser: Parser error: lanenet_model/vgg_backend/binary_seg/ArgMax: Reductions cannot be applied to the batch dimension.

Does anyone know why?

1080TI vs TITAN V

Hello,
I used a 1080 Ti and a TITAN V to test this sample code with TRT 4.0.9, and the measured times are the same on both; details below:

[screenshot: timing results]

Can anyone help me explain this?
Thanks a lot.

Converting face_classification-emotion_models (tensorflow) to TensorRT

Trying the first step tf_to_trt_image_classification/scripts/models_to_frozen_graphs.py

Please guide me on the second part of this question: I am trying to convert an emotion-detection model to TensorRT.

What should the output_names of the custom model be? (A node-listing sketch follows the snippet below.)


emotion_labels = get_labels('fer2013')
emotion_model_path = 'trained_models/emotion_models/fer2013_mini_XCEPTION.102-0.66.hdf5'
emotion_classifier = load_model(emotion_model_path, compile=False)
'emotions': {
		'model': emotion_classifier ,
		'arg_scope': emotion_arg_scope, # what should this be changed to
		'num_classes': 7,
		'input_name': 'input', # what should this be changed to
		'output_names': ['InceptionResnetV2/Logits/Logits/BiasAdd'], # what should this be changed to
		'input_width': 64,
		'input_height': 64,
		'input_channels': 1, 
		'preprocess_fn': preprocess_emotion, #preprocessing 
		'postprocess_fn': postprocess_emotion, #postprocessing
		'checkpoint_filename': CHECKPOINT_DIR + 'emotions.ckpt', 
		'frozen_graph_filename': FROZEN_GRAPHS_DIR + 'emotions.pb',
		'trt_convert_status': "works", # what should this be changed to
		'plan_filename': PLAN_DIR + 'inception_resnet_v2.plan' # what should this be changed to
	}}
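One way to answer the output_names question is to list the ops in your frozen graph and pick the final logits node; a minimal sketch (the .pb path is hypothetical):

import tensorflow as tf  # TensorFlow 1.x

graph_def = tf.GraphDef()
with open('data/frozen_graphs/emotions.pb', 'rb') as f:  # hypothetical path
    graph_def.ParseFromString(f.read())

# Print every node; the last BiasAdd/Softmax-style op is usually the output.
for node in graph_def.node:
    print(node.op, node.name)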

test_trt error "Mismatch between allocated memory size and expected size of serialized engine"

I'm trying to run the tf_to_trt_image_classification app, but it only runs the Inception_v1, vgg_16 and mobilenet_v1_0p5_160 NNs. I'm using a Jetson TX2 with JetPack 3.2.

I've followed the tutorial, but the test_trt binary crashes with the error:

test_trt: cudnnEngine.cpp:640: bool nvinfer1::cudnn::Engine::deserialize(const void*, std::size_t, nvinfer1::IPluginFactory*): Assertion 'size >= bsize && "Mismatch between allocated memory size and expected size of serialized engine."' failed.

I've tried running it with TensorFlow 1.5.0 pip wheel from the tutorial and with a TensorFlow 1.7 wheel from this topic.

The second line from src/test/test_trt.cu is failing:

IRuntime *runtime = createInferRuntime(gLogger);
ICudaEngine *engine = runtime->deserializeCudaEngine((void*)plan.data(), plan.size(), nullptr);

Any pointers??

Thanks a lot!!

can't find CV2 on testing with tensorflow

Hi guys,

I'm getting this error after being able to successfully run the rest. Python in my case points to /usr/bin/python3 (symlink).

nvidia@tegra-ubuntu:~/dev/tf_to_trt_image_classification$ python scripts/test_tf.py
Traceback (most recent call last):
File "scripts/test_tf.py", line 14, in
import cv2
ImportError: No module named 'cv2'
nvidia@tegra-ubuntu:~/dev/tf_to_trt_image_classification$

I haven't found a way to properly install the opencv-python package. Does that package need to be compiled for this? Am I missing something obvious? Thanks!

Inception Model Error

Trying the readme, I encounter:

UFFParser: Validator error: InceptionV1/Logits/SpatialSqueeze: Unsupported operation Squeeze
Failed to parse UFF

Convert frozen graph to TensorRT engine issue

Hi, during the frozen-graph-to-TensorRT-engine conversion step on a Jetson TX2, I get this error:
Using output node InceptionV1/Logits/SpatialSqueeze
Converting to UFF graph
No. nodes: 486
UFF Output written to data/tmp.uff
UFFParser: Validator error: InceptionV1/Logits/SpatialSqueeze: Unsupported operation Squeeze
Failed to parse UFF

Any suggestions?
Thanks.

Error while running python scripts/convert_plan.py

root@61dd12ecedc9:/home/tf_to_trt_image_classification# python scripts/convert_plan.py data/frozen_graphs/mobilenet_v1_0p25_128.pb data/plans/mobilenet_v1_0p25_128.plan input 128 128 MobilenetV1/Logits/SpatialSqueeze 1 0 float
Using output node MobilenetV1/Logits/SpatialSqueeze
Converting to UFF graph
No. nodes: 306
UFF Output written to data/tmp.uff
cudnnLayerUtils.cpp (288) - Cuda Error in smVersion: 35
terminate called after throwing an instance of 'nvinfer1::CudaError'
what(): std::exception

I am getting this error while calling convert_plan.py.

Please suggest.
Using the versions below:
CUDA 9.0
Ubuntu 16.04
TensorRT 3

How to resolve unsupported layer issue for my model

Hi,

I'm currently working with the facenet/mtcnn models and need to optimize them with TensorRT. I have a .pb file for the model; when I try to convert it to a UFF model I get an error like this:

uff_model = uff.from_tensorflow_frozen_model("20180518-115854.pb", ["embeddings"])

Warning: keepdims is ignored by the UFF Parser and defaults to True
Warning: No conversion function registered for layer: QueueDequeueUpToV2 yet.
Converting as custom op QueueDequeueUpToV2 batch_join
name: "batch_join"
op: "QueueDequeueUpToV2"
input: "batch_join/fifo_queue"
input: "batch_size"
attr {
key: "component_types"
value {
list {
type: DT_FLOAT
type: DT_INT64
}
}
}
attr {
key: "timeout_ms"
value {
i: -1
}
}

Warning: No conversion function registered for layer: FIFOQueueV2 yet.
Converting as custom op FIFOQueueV2 batch_join/fifo_queue
name: "batch_join/fifo_queue"
op: "FIFOQueueV2"
attr {
key: "capacity"
value {
i: 1440
}
}
attr {
key: "component_types"
value {
list {
type: DT_FLOAT
type: DT_INT64
}
}
}
attr {
key: "container"
value {
s: ""
}
}
attr {
key: "shapes"
value {
list {
shape {
dim {
size: 160
}
dim {
size: 160
}
dim {
size: 3
}
}
shape {
}
}
}
}
attr {
key: "shared_name"
value {
s: ""
}
}

Traceback (most recent call last):
File "", line 1, in
File "/usr/lib/python3.5/dist-packages/uff/converters/tensorflow/conversion_helpers.py", line 149, in from_tensorflow_frozen_model
return from_tensorflow(graphdef, output_nodes, preprocessor, **kwargs)
File "/usr/lib/python3.5/dist-packages/uff/converters/tensorflow/conversion_helpers.py", line 120, in from_tensorflow
name="main")
File "/usr/lib/python3.5/dist-packages/uff/converters/tensorflow/converter.py", line 76, in convert_tf2uff_graph
uff_graph, input_replacements)
File "/usr/lib/python3.5/dist-packages/uff/converters/tensorflow/converter.py", line 63, in convert_tf2uff_node
op, name, tf_node, inputs, uff_graph, tf_nodes=tf_nodes)
File "/usr/lib/python3.5/dist-packages/uff/converters/tensorflow/converter.py", line 38, in convert_layer
fields = cls.parse_tf_attrs(tf_node.attr)
File "/usr/lib/python3.5/dist-packages/uff/converters/tensorflow/converter.py", line 209, in parse_tf_attrs
for key, val in attrs.items()}
File "/usr/lib/python3.5/dist-packages/uff/converters/tensorflow/converter.py", line 209, in
for key, val in attrs.items()}
File "/usr/lib/python3.5/dist-packages/uff/converters/tensorflow/converter.py", line 204, in parse_tf_attr_value
return cls.convert_tf2uff_field(code, val)
File "/usr/lib/python3.5/dist-packages/uff/converters/tensorflow/converter.py", line 189, in convert_tf2uff_field
'type': 'dtype', 'list': 'list'}[code]
KeyError: 'shape'

I can't proceed further; I'm stuck on this. Please give me some suggestions on how to resolve this issue.

Error when running the convert_plan.py script

Hi,

To run the script convert_plan.py, I use the command

python scripts/convert_plan.py data/frozen_graphs/mobilenet_v1_1p0_224.pb data/plans/mobilenet_v1_1p0_224.plan input 224 224 MobilenetV1/Logits/SpatialSqueeze 1 0 float

The script ends with an error:

After conv-act fusion: 115 layers
After tensor merging: 115 layers
After concat removal: 115 layers
cudnnEngine.cpp (56) - Cuda Error in initializeCommonContext: 4
cudnnEngine.cpp (56) - Cuda Error in initializeCommonContext: 4

This is when I run it on Ubuntu 16.04 with an NVIDIA 1080 Ti, TensorFlow 1.4, TensorRT 3.0.4, and CUDA 8.
Are there any changes I need to make in the Makefile for this to work?

how to use tensorrt for rnn

Hello, I want to use TensorRT to speed up my RCNN model, but I cannot find any reference material for this. Can you give some advice?
Thanks a million.

Looking forward to your reply.

Wheel for TF 1.8?

Did you build a wheel file for Tensorflow 1.8? Or if you didn't, which instructions did you follow?

OSError while running convert_plan.py

Hi

While running
python scripts/convert_plan.py data/frozen_graphs/inception_v1.pb data/plans/inception_v1.plan input 224 224 InceptionV1/Logits/SpatialSqueeze 1 0 float

after following the instructions, I get the following error:

Using output node InceptionV1/Logits/SpatialSqueeze
Converting to UFF graph
No. nodes: 486
UFF Output written to data/tmp.uff
Traceback (most recent call last):
File "scripts/convert_plan.py", line 71, in
data_type
File "scripts/convert_plan.py", line 37, in frozenToPlan
subprocess.call([UFF_TO_PLAN_EXE_PATH] + args)
File "/usr/lib/python2.7/subprocess.py", line 523, in call
return Popen(*popenargs, **kwargs).wait()
File "/usr/lib/python2.7/subprocess.py", line 711, in init
errread, errwrite)
File "/usr/lib/python2.7/subprocess.py", line 1343, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory

Any idea what's wrong?

Using TensorRT 3.0

Documentation

Hi everyone! Great project! I think it would be advisable to add pip installation instructions for the project to the README. I'll make a PR soon.

inception_v3 produces different results for RGB input

When run on RGB inputs, the inception_v3 model produces results with slightly worse semantics than on BGR inputs. When run on lifeboat.jpg (from the sample images), the output is:

RGB Input:

  1. lifeboat
  2. backpack
  3. speedboat
  4. poncho
  5. hook, claw

BGR Input:

  1. lifeboat
  2. fireboat
  3. speedboat
  4. amphibious vehicle
  5. container ship

We must investigate which models this input format applies to, and whether there are other preprocessing discrepancies.
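For anyone reproducing this, a minimal sketch of feeding the same image in both channel orders (the resize size and inception-style scaling are assumptions; substitute the project's real preprocessing):

import cv2
import numpy as np

bgr = cv2.imread('data/images/lifeboat.jpg')  # OpenCV loads BGR
rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)

def preprocess_inception(img, size=299):
    img = cv2.resize(img, (size, size)).astype(np.float32)
    return img / 127.5 - 1.0  # scale to [-1, 1], inception-style

batch_bgr = preprocess_inception(bgr)[None]
batch_rgb = preprocess_inception(rgb)[None]
# Feed each batch to the frozen graph / engine and compare the top-5 labels.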

Could NOT find CUDA: Found unsuitable version "9.1", but required is exact version "7.5"

Upon running the cmake command, I encounter the following error:

CMake Error at /usr/share/cmake-3.5/Modules/FindPackageHandleStandardArgs.cmake:148 (message):
Could NOT find CUDA: Found unsuitable version "9.1", but required is exact
version "7.5" (found /usr/local/cuda)
Call Stack (most recent call first):
/usr/share/cmake-3.5/Modules/FindPackageHandleStandardArgs.cmake:386 (_FPHSA_FAILURE_MESSAGE)
/usr/share/cmake-3.5/Modules/FindCUDA.cmake:949 (find_package_handle_standard_args)
/usr/local/share/OpenCV/OpenCVConfig.cmake:86 (find_package)
/usr/local/share/OpenCV/OpenCVConfig.cmake:105 (find_host_package)
CMakeLists.txt:4 (find_package)

I have CUDA 9.0 and 9.1 installed. I would prefer a workaround, because I do not wish to install CUDA 7.5.

Wrong imports file 'scripts/models_to_frozen_graphs.py' and 'scripts/model_meta.py'

Hello,
When executing the models_to_frozen_graphs.py script, I receive the following error:

Traceback (most recent call last):
  File "scripts/models_to_frozen_graphs.py", line 12, in <module>
    import slim.nets as nets
ImportError: No module named slim.nets

I don't know where you got the 'slim' from but with my Jetson TX2 it is located in /usr/local/lib/python2.7/dist-packages/tensorflow/contrib/slim
So I changed the import to:
import tensorflow.contrib.slim.nets as nets

Then the same kind of error came for the nets.vgg import, so I located this file in the /usr/local/lib/python2.7/dist-packages/tensorflow/contrib/slim/python/slim/nets folder.
So I again changed that import to:
import tensorflow.contrib.slim.python.slim.nets.vgg

Then the errors came from the model_meta.py script. These were the same import errors as the others.
Only now it also imports inception, resnet_v1, resnet_v2, and mobilenet_v1.
These were also found in the /usr/local/lib/python2.7/dist-packages/tensorflow/contrib/slim/python/slim/nets folder, EXCEPT the mobilenet_v1 file.

It appears that I'm totally missing that file.
I followed the README completely, so how come these imports resolve to the wrong place, and why am I missing the mobilenet_v1 file? MobileNet is the reason I'm trying this in the first place.
What have I done wrong?

Error while running make: cannot find -lnvparsers

$ dpkg -l | grep tensorRT
ii tensorrt-2.1.2 3.0.2-1+cuda8.0 arm64 Meta package of TensorRT

While running make, I am getting the error below:
nvidia@tegra-ubuntu:~/Documents/tf_to_trt_models/tf_to_trt_image_classification/build$ cmake ..
-- Found CUDA: /usr/local/cuda-8.0 (found suitable exact version "8.0")
-- Found CUDA: /usr/local/cuda-8.0 (found version "8.0")
-- Configuring done
-- Generating done
-- Build files have been written to: /home/nvidia/Documents/tf_to_trt_models/tf_to_trt_image_classification/build

nvidia@tegra-ubuntu:~/Documents/tf_to_trt_models/tf_to_trt_image_classification/build$ make
[ 16%] Linking CXX executable classify_image
/usr/bin/ld: cannot find -lnvparsers
collect2: error: ld returned 1 exit status
examples/classify_image/CMakeFiles/classify_image.dir/build.make:513: recipe for target 'examples/classify_image/classify_image' failed
make[2]: *** [examples/classify_image/classify_image] Error 1
CMakeFiles/Makefile2:103: recipe for target 'examples/classify_image/CMakeFiles/classify_image.dir/all' failed
make[1]: *** [examples/classify_image/CMakeFiles/classify_image.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

Please help

Unsupported operation _FusedBatchNormV3 Failed to parse UFF when converting frozen to plan on Jetson Nano

ENV:
Jetson Nano board with JetPack4.2.1
cuda 10.0
cuDNN 7.5
TensorFlow with GPU 1.13.1

Following the README,

  1. clone git repository and checkout to trt_4plus, then build the uff_to_plan
  2. run "source scripts/download_models.sh", get the models, for example inrectpion_v1
  3. run "python scripts/models_to_frozen_graphs.py", convert models to frozen graphs, for exammple inception_v1.pb
  4. run "python3 scripts/convert_plan.py data/frozen_graphs/inception_v1.pb data/plans/inception_v1.plan input 224 224 InceptionV1/Logits/SpatialSqueeze 1 0 float"
    This failed with the warning and error messages below.
    How can I fix this issue?

nano@nano-2:~/work/tf_to_trt_image_classification$ python3 scripts/convert_plan.py data/frozen_graphs/inception_v1.pb data/plans/inception_v1.plan input 224 224 InceptionV1/Logits/SpatialSqueeze 1 0 float
......

Using output node InceptionV1/Logits/SpatialSqueeze
Converting to UFF graph
Warning: No conversion function registered for layer: FusedBatchNormV3 yet.
Converting InceptionV1/InceptionV1/Mixed_5c/Branch_3/Conv2d_0b_1x1/BatchNorm/FusedBatchNormV3 as custom op: FusedBatchNormV3
Warning: No conversion function registered for layer: FusedBatchNormV3 yet.
Converting InceptionV1/InceptionV1/Mixed_5b/Branch_3/Conv2d_0b_1x1/BatchNorm/FusedBatchNormV3 as custom op: FusedBatchNormV3
Warning: No conversion function registered for layer: FusedBatchNormV3 yet.
Converting InceptionV1/InceptionV1/Mixed_4f/Branch_3/Conv2d_0b_1x1/BatchNorm/FusedBatchNormV3 as custom op: FusedBatchNormV3
Warning: No conversion function registered for layer: FusedBatchNormV3 yet.
Converting InceptionV1/InceptionV1/Mixed_4e/Branch_3/Conv2d_0b_1x1/BatchNorm/FusedBatchNormV3 as custom op: FusedBatchNormV3
Warning: No conversion function registered for layer: FusedBatchNormV3 yet.
Converting InceptionV1/InceptionV1/Mixed_4d/Branch_3/Conv2d_0b_1x1/BatchNorm/FusedBatchNormV3 as custom op: FusedBatchNormV3
...
Warning: No conversion function registered for layer: FusedBatchNormV3 yet.
Converting InceptionV1/InceptionV1/Mixed_5c/Branch_0/Conv2d_0a_1x1/BatchNorm/FusedBatchNormV3 as custom op: FusedBatchNormV3
No. nodes: 486
UFF Output written to data/tmp.uff
UffParser: Validator error: InceptionV1/InceptionV1/Mixed_5c/Branch_0/Conv2d_0a_1x1/BatchNorm/FusedBatchNormV3: Unsupported operation _FusedBatchNormV3
Failed to parse UFF
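One workaround reported for this class of error (hedged; behavior differs across UFF/TensorRT versions) is to rewrite FusedBatchNormV3 nodes to the older FusedBatchNorm op with graphsurgeon before converting:

import graphsurgeon as gs  # ships with recent TensorRT packages

graph = gs.DynamicGraph('data/frozen_graphs/inception_v1.pb')
# Downgrade the op name so the UFF converter recognizes it.
for node in graph.find_nodes_by_op('FusedBatchNormV3'):
    node.op = 'FusedBatchNorm'
graph.write('data/frozen_graphs/inception_v1_fbn.pb')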

Vectorization of util functions

The functions in utils.h are inefficient.
I noticed that it takes about 9 ms to run cvImageToTensor() and preprocessInception() on the TX2. In contrast, the inference time of SSD (half precision) at batch size 1 is 27 ms.
There is a triple loop in cvImageToTensor(). Is it possible to let the compiler do the auto-vectorization, or do we need to manually reimplement these functions?
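To illustrate the point, the per-pixel work in cvImageToTensor() plus preprocessInception() amounts to an NHWC-uint8 to NCHW-float conversion with scaling, which can be written as whole-array operations; a NumPy sketch (scaling constants assumed to match the inception preprocessing):

import numpy as np

def image_to_inception_tensor(image_hwc_uint8):
    # HWC -> CHW, then scale uint8 [0, 255] to float [-1, 1] in bulk.
    chw = np.transpose(image_hwc_uint8, (2, 0, 1)).astype(np.float32)
    return chw / 127.5 - 1.0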

Using tf_to_trt_image_classification with DLA support

I have modified uff_to_plan.cpp to use DLA on Xavier Using the following function:

inline void enableDLA(IBuilder* b, int dlaID)
{
    b->allowGPUFallback(true);
    b->setFp16Mode(true);
    b->setDefaultDeviceType(static_cast<DeviceType>(dlaID));
}

The output is:

./build/examples/classify_image/classify_image data/images/gordon_setter.jpg data/plans/inception_v1.plan data/imagenet_labels_1001.txt input InceptionV1/Logits/SpatialSqueeze inception
Loading TensorRT engine from plan file...
Preprocessing input...
Executing inference engine...

The top-5 indices are: 215 235 166 214 213
Which corresponds to class labels:
0. Gordon setter

1. Rottweiler
2. black-and-tan coonhound
3. Irish setter, red setter
4. English setter

dla/eglUtils.cpp (121) - EGL Error in validateEglStream: 12289
terminate called after throwing an instance of 'nvinfer1::EglError'
what(): std::exception
Aborted (core dumped)

Do you have a clue how I can use DLA with this project?
Thanks!

A problem when make

I got an error when I make the project; who knows how I can solve it?
CUDA version: 8.0
TensorRT: 3.0.4
Env: Ubuntu

The following is the error information:
project/tf_to_trt_image_classification-master/examples/classify_image/classify_image.cu:10:21: fatal error: NvInfer.h: No such file or directory
#include <NvInfer.h>
^
compilation terminated.
CMake Error at classify_image_generated_classify_image.cu.o.cmake:207 (message):
Error generating
/project/tf_to_trt_image_classification-master/build/examples/classify_image/CMakeFiles/classify_image.dir//./classify_image_generated_classify_image.cu.o

make[2]: *** [examples/classify_image/CMakeFiles/classify_image.dir/classify_image_generated_classify_image.cu.o] Error 1
make[1]: *** [examples/classify_image/CMakeFiles/classify_image.dir/all] Error 2
make: *** [all] Error 2

frozen graph convert error

Hey Hi,

I'm running some TF-TRT samples; the project built and compiled successfully, but when I try to run
python scripts/models_to_frozen_graphs.py
I get an import error, even though I have downloaded all the models from the base source. Is there anything I missed?

Traceback (most recent call last):
File "scripts/models_to_frozen_graphs.py", line 12, in
import slim.nets as nets
ImportError: No module named 'slim.nets'

Please give me some suggestions to resolve this issue. Thanks in advance!

I got a big error

nvidia@tegra-ubuntu:~/demo/tf_to_trt_image_classification/build$ cmake ..
-- Configuring done
-- Generating done
-- Build files have been written to: /home/nvidia/demo/tf_to_trt_image_classification/build
nvidia@tegra-ubuntu:~/demo/tf_to_trt_image_classification/build$ make
[ 33%] Built target classify_image
[ 50%] Building CXX object src/CMakeFiles/uff_to_plan.dir/uff_to_plan.cpp.o
/home/nvidia/demo/tf_to_trt_image_classification/src/uff_to_plan.cpp: In function ‘int main(int, char**)’:
/home/nvidia/demo/tf_to_trt_image_classification/src/uff_to_plan.cpp:71:79: error: no matching function for call to ‘nvuffparser::IUffParser::registerInput(const char*, nvinfer1::DimsCHW)’
parser->registerInput(inputName.c_str(), DimsCHW(3, inputHeight, inputWidth));
^
In file included from /home/nvidia/demo/tf_to_trt_image_classification/src/uff_to_plan.cpp:12:0:
/usr/include/aarch64-linux-gnu/NvUffParser.h:182:18: note: candidate: virtual bool nvuffparser::IUffParser::registerInput(const char*, nvinfer1::Dims, nvuffparser::UffInputOrder)
virtual bool registerInput(const char* inputName, nvinfer1::Dims inputDims,
^
/usr/include/aarch64-linux-gnu/NvUffParser.h:182:18: note: candidate expects 3 arguments, 2 provided
src/CMakeFiles/uff_to_plan.dir/build.make:62: recipe for target 'src/CMakeFiles/uff_to_plan.dir/uff_to_plan.cpp.o' failed
make[2]: *** [src/CMakeFiles/uff_to_plan.dir/uff_to_plan.cpp.o] Error 1
CMakeFiles/Makefile2:160: recipe for target 'src/CMakeFiles/uff_to_plan.dir/all' failed
make[1]: *** [src/CMakeFiles/uff_to_plan.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2
nvidia@tegra-ubuntu:~/demo/tf_to_trt_image_classification/build$

how to convert my own slim model

Hello, I want to use TensorRT to speed up the facenet project (davidsandberg/facenet).

I froze the model into a .pb TensorFlow model, but I am confused about how to convert it into a .plan.
I also tried to modify the code to match the model zoo's inception-resnet-v2 and retrain, but it could not be converted into a .uff model either.

Converting to frozen graph failed on Jetson TX2

When I run the scripts/models_to_frozen_graphs.py script, I get the error below on a Jetson TX2. Any comments?
...
2018-03-13 18:52:01.449359: I tensorflow/core/common_runtime/bfc_allocator.cc:684] Sum Total of in-use chunks: 152.63MiB
2018-03-13 18:52:01.449608: I tensorflow/core/common_runtime/bfc_allocator.cc:686] Stats:
Limit: 374509568
InUse: 160042752
MaxInUse: 271571968
NumAllocs: 10753
MaxAllocSize: 107122688

2018-03-13 18:52:01.449952: W tensorflow/core/common_runtime/bfc_allocator.cc:277] __xxx__*************__________________________________________
Traceback (most recent call last):
File "scripts/models_to_frozen_graphs.py", line 63, in
sess=tf_sess
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/training/saver.py", line 1750, in restore
{self.saver_def.filename_tensor_name: save_path})
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 895, in run
run_metadata_ptr)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1128, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1344, in _do_run
options, run_metadata)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1363, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Dst tensor is not initialized.
[[Node: save/RestoreV2_27/_11 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device_incarnation=1, tensor_name="edge_78_save/RestoreV2_27", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"]]

raise UffException(str(name) + " was not found in the graph") error

python3 scripts/convert_plan.py data/frozen_graphs/mobilenet_v2.pb data/plans/mobilenet_v2.plan Placeholder 224 224 Softmax 1 0 float
NOTE: UFF has been tested with TensorFlow 1.12.0. Other versions are not guaranteed to work
WARNING: The version of TensorFlow installed on this system is not guaranteed to work with UFF.
UFF Version 0.6.3
=== Automatically deduced input nodes ===
[name: "Placeholder"
op: "Placeholder"
attr {
key: "dtype"
value {
type: DT_FLOAT
}
}
attr {
key: "shape"
value {
shape {
dim {
size: -1
}
dim {
size: 224
}
dim {
size: 224
}
dim {
size: 3
}
}
}
}
]

Using output node Softmax
Converting to UFF graph
Traceback (most recent call last):
File "scripts/convert_plan.py", line 71, in
data_type
File "scripts/convert_plan.py", line 22, in frozenToPlan
text=False,
File "/usr/lib/python3.6/dist-packages/uff/converters/tensorflow/conversion_helpers.py", line 233, in from_tensorflow_frozen_model
return from_tensorflow(graphdef, output_nodes, preprocessor, **kwargs)
File "/usr/lib/python3.6/dist-packages/uff/converters/tensorflow/conversion_helpers.py", line 181, in from_tensorflow
debug_mode=debug_mode)
File "/usr/lib/python3.6/dist-packages/uff/converters/tensorflow/converter.py", line 94, in convert_tf2uff_graph
uff_graph, input_replacements, debug_mode=debug_mode)
File "/usr/lib/python3.6/dist-packages/uff/converters/tensorflow/converter.py", line 62, in convert_tf2uff_node
raise UffException(str(name) + " was not found in the graph. Please use the -l option to list nodes in the graph.")

Please provide a solution to this.
