ghas-results / onnx-mlir-serving
This project is forked from ibm/onnx-mlir-serving.
ONNX Serving is a project written in C++ that serves onnx-mlir compiled models over gRPC and other protocols. Because it is implemented in C++, ONNX Serving adds very little latency overhead and sustains high throughput. ONNX Serving provides dynamic batch aggregation and a worker pool to fully utilize the AI accelerators on the machine.
License: Apache License 2.0
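
To make the idea of dynamic batch aggregation more concrete, the sketch below shows one common way such a queue can be built: requests are collected until either a maximum batch size is reached or a wait limit expires, and a worker then takes the whole batch at once. This is an illustration only and is not taken from the ONNX Serving code base; the class name `BatchQueue`, the method names `push` and `pop_batch`, and the parameters `max_batch` and `max_wait` are all hypothetical.

```cpp
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical helper for illustration; not part of ONNX Serving.
template <typename T>
class BatchQueue {
public:
    // Enqueue one request and wake a waiting worker.
    void push(T item) {
        {
            std::lock_guard<std::mutex> lock(mu_);
            items_.push_back(std::move(item));
        }
        cv_.notify_one();
    }

    // Wait until max_batch items are queued or max_wait elapses,
    // then return up to max_batch items (possibly an empty batch).
    std::vector<T> pop_batch(std::size_t max_batch,
                             std::chrono::milliseconds max_wait) {
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait_for(lock, max_wait,
                     [&] { return items_.size() >= max_batch; });
        std::vector<T> batch;
        while (!items_.empty() && batch.size() < max_batch) {
            batch.push_back(std::move(items_.front()));
            items_.pop_front();
        }
        return batch;
    }

private:
    std::mutex mu_;
    std::condition_variable cv_;
    std::deque<T> items_;
};
```

In a worker pool built on this sketch, each worker thread would call `pop_batch` in a loop and run one inference per returned batch. A larger `max_batch` favors throughput, while a shorter `max_wait` bounds the latency a batch adds to any single request.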