Comments (2)
Instead of wrapping the server, can you place a reverse proxy in front of your server instance?
The endpoint restrictions were not designed to replace a full-featured solution with key management (revoke, change permissions, etc.) in mind. Our initial inclination is not to add additional features but to encourage the use of API gateways designed for this purpose.
Are there downsides to keeping such key management outside of the Triton server itself?
The only downside of keeping it outside of Triton is that every engineer has to account for endpoint changes themselves instead of it being built into the server. If an update introduces a new set of endpoints, each administrator has to update their own gateway after updating the server, rather than having it taken care of as part of the update.
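To make the trade-off concrete, here is a minimal sketch of the check a gateway would perform in front of Triton. The endpoint paths follow Triton's HTTP/REST API (e.g. `/v2/health/ready`, `/v2/models/<name>/infer`), but the key store and the per-key permission model are illustrative assumptions, not part of Triton itself:

```python
import fnmatch

# Hypothetical key store mapping API key -> endpoint patterns the key may
# reach. Revoking a key or changing its permissions is just an edit to this
# table -- no change to the Triton server is needed.
API_KEYS = {
    "team-a-key": ["/v2/models/resnet50/*", "/v2/health/*"],
    "readonly-key": ["/v2/health/*"],
}

def is_authorized(path: str, api_key: str) -> bool:
    """Return True if the key exists and is allowed to call this endpoint."""
    patterns = API_KEYS.get(api_key)
    if patterns is None:  # unknown or revoked key
        return False
    return any(fnmatch.fnmatch(path, p) for p in patterns)
```

This also illustrates the downside described above: if a Triton upgrade introduces new endpoint paths, the pattern table has to be updated by hand in every gateway, since the server has no knowledge of it.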