Comments (4)
The UFF file itself is not a TRT plan. You need to import the UFF file into TRT and from that generate a plan. You can then use that plan file in your model store. I think this sample shows what you are trying to do: https://docs.nvidia.com/deeplearning/sdk/tensorrt-developer-guide/index.html#mnist_uff_sample
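The conversion described above (parse the UFF file, build an engine, serialize it to a plan) can be sketched as follows. This is a minimal sketch, assuming the TensorRT 5.x Python API (`trt.UffParser`, `builder.build_cuda_engine`), which later releases deprecated. The helper names `plan_path` and `uff_to_plan`, and the node names and shapes you pass in, are hypothetical and depend on your model. The `<model>/<version>/model.plan` layout assumes the server's default plan file name.

```python
import os


def plan_path(model_store, model_name, version=1):
    """Hypothetical helper: <model_store>/<model_name>/<version>/model.plan."""
    return os.path.join(model_store, model_name, str(version), "model.plan")


def uff_to_plan(uff_file, input_name, input_shape, output_name, out_file,
                max_batch_size=1, workspace_bytes=1 << 28):
    """Parse a UFF file, build an engine, and write the serialized plan."""
    # Imported lazily so plan_path() works on machines without TensorRT.
    import tensorrt as trt

    logger = trt.Logger(trt.Logger.WARNING)
    with trt.Builder(logger) as builder, \
            builder.create_network() as network, \
            trt.UffParser() as parser:
        builder.max_batch_size = max_batch_size
        builder.max_workspace_size = workspace_bytes
        # input_shape excludes the batch dimension, e.g. (1, 28, 28) in CHW order.
        parser.register_input(input_name, input_shape)
        parser.register_output(output_name)
        if not parser.parse(uff_file, network):
            raise RuntimeError("UFF parsing failed: " + uff_file)
        engine = builder.build_cuda_engine(network)
        if engine is None:
            raise RuntimeError("engine build failed")
        out_dir = os.path.dirname(out_file)
        if out_dir:
            os.makedirs(out_dir, exist_ok=True)
        # The serialized engine is the "plan" the model store expects.
        with open(out_file, "wb") as f:
            f.write(engine.serialize())
    return out_file
```

A call might look like `uff_to_plan("model.uff", "Placeholder", (1, 28, 28), "out", plan_path("model_store", "mnist"))`, with the node names replaced by those of your own graph.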
from server.
Thanks! It works after serializing the engine.
We got the following error when converting a UFF file to a TensorRT engine (TensorRT 5.0.2.6):
"[TensorRT] ERROR: UFFParser: Parser error: add: The input to the Scale Layer is required to have a minimum of 3 dimensions."
Our model details are:
input nodes ==:
[name: "Placeholder"
op: "Placeholder"
attr {
  key: "_output_shapes"
  value {
    list {
      shape {
        dim {
          size: -1
        }
        dim {
          size: 1000
        }
      }
    }
  }
}
output nodes ==:
[name: "add"
op: "Add"
input: "MatMul_2"
input: "b"
attr {
  key: "T"
  value {
    type: DT_FLOAT
  }
}
attr {
  key: "_output_shapes"
  value {
    list {
      shape {
        dim {
          size: -1
        }
        dim {
          size: 2
        }
      }
    }
  }
}
]
UFF expects tensors in the network to be 3D. There will be some documentation on this limitation in the (future) 5.1 release of TensorRT.
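To make the 3D requirement concrete: the shape registered with the UFF parser excludes the batch dimension, so the `[-1, 1000]` placeholder above has a per-sample shape of `(1000,)`, which is only 1D. The sketch below pads such a shape with trailing unit dimensions. The helper `pad_to_3d` is hypothetical, and whether reshaping the graph this way resolves the Scale Layer error is an assumption that this thread does not confirm.

```python
def pad_to_3d(shape_without_batch):
    """Pad a per-sample shape with trailing 1s until it has at least 3 dims."""
    dims = tuple(shape_without_batch)
    if any(d <= 0 for d in dims):
        raise ValueError("all non-batch dimensions must be positive")
    # Unit dimensions do not change the number of elements in the tensor.
    return dims + (1,) * max(0, 3 - len(dims))


def num_elements(shape):
    """Product of all dimensions in a shape."""
    total = 1
    for d in shape:
        total *= d
    return total


# Per-sample shapes from the model in this thread (batch dimension dropped).
placeholder = pad_to_3d((1000,))
add_output = pad_to_3d((2,))
```

Under this assumption, the TensorFlow graph would need matching reshapes before export so that every tensor the parser sees already has three non-batch dimensions.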