I'm using YOLO-NAS to detect handwritten numbers, and the inference time goes well locall

Speed inference Time about super-gradients HOT 6 OPEN

CrasCris commented on September 23, 2024

Speed inference Time

from super-gradients.

Comments (6)

BloodAxe commented on September 23, 2024 1

Can you elaborate on what "deploy it" means? Have you used our examples Export YoloNAS to ONNX and YoloNAS_Inference_using_TensorRT, or did you write your own code for this?

Based on previous similar issues, it is usually an error on the user side (e.g., missing image normalization or wrong-size preprocessing) that affects model accuracy. On our end, we have integration tests that verify that the exported ONNX model produces the same results as the regular PyTorch model.

CrasCris commented on September 23, 2024 1

> You should be using the same or nearly the same image resolution during training/validation and inference. These images look really different in size. Without seeing the full code you are using for export and inference, it is impossible for me to help you.
>
> Here is what can help to understand what is going on:
>
> - Export code
> - Result of the model.export(...) call, e.g. print(model.export(...))
> - Inference code

Thanks, I will post it as soon as I can :D

CrasCris commented on September 23, 2024

Oh, I see, I'll check out TensorRT. I followed the ONNX notebook to export the model, but maybe I did it wrong. I need the inference time to be under a second. I'm using the FastAPI framework so that users can call the model, but the response time increases with 50 concurrent users. This is a basic example of the code:
ml_models["num"] = models.get( 'yolo_nas_s', num_classes=10, checkpoint_path=f"newmodels/num.pth" ).eval()
await asyncio.to_thread(number, image, model_name, return_digits)
`number` is the inference function, and it is not async. I can't use a GPU, only the CPU of the Pod.
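
One way the setup described above could keep per-request latency bounded on a CPU-only Pod is to cap how many inferences run at once, since `asyncio.to_thread` by itself lets many CPU-heavy calls compete for the same cores. The following is a minimal sketch under stated assumptions, not the poster's actual code: `number` here is a hypothetical stand-in for the real inference function, and `MAX_PARALLEL_INFERENCES`, `limited_inference`, and `handle_requests` are illustrative names that do not come from the thread.

```python
import asyncio
import time

# Assumption: tune this to the number of CPU cores available to the Pod.
MAX_PARALLEL_INFERENCES = 2


def number(image, model_name, return_digits):
    """Hypothetical stand-in for the blocking inference function."""
    time.sleep(0.01)  # simulates CPU-bound model work
    return f"{model_name}:{len(image)}"


async def limited_inference(semaphore, image, model_name, return_digits=True):
    """Run the blocking call in a worker thread, but only while holding a slot."""
    async with semaphore:
        return await asyncio.to_thread(number, image, model_name, return_digits)


async def handle_requests(images):
    """Simulate several concurrent requests sharing one semaphore."""
    # Created inside the running event loop so it is bound to the right loop.
    semaphore = asyncio.Semaphore(MAX_PARALLEL_INFERENCES)
    tasks = [limited_inference(semaphore, image, "num") for image in images]
    return await asyncio.gather(*tasks)


if __name__ == "__main__":
    print(asyncio.run(handle_requests([b"a", b"bb", b"ccc"])))
```

In a FastAPI application the semaphore would typically be created once at startup and shared by all request handlers. Requests beyond the limit wait for a slot instead of slowing down every inference already in flight. Whether this brings the response time under a second depends on the model and the hardware, which the thread does not specify.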

CrasCris commented on September 23, 2024

This is how it looks normally:

This is how it looks with ONNX:

I'm already using preprocessing set to true, and postprocessing as well.

BloodAxe commented on September 23, 2024

You should be using the same or nearly the same image resolution during training/validation and inference. These images look really different in size. Without seeing the full code you are using for export and inference, it is impossible for me to help you.

Here is what can help to understand what is going on:

- Export code
- Result of the model.export(...) call, e.g. print(model.export(...))
- Inference code

CrasCris commented on September 23, 2024

@BloodAxe This is the export result:
`Model exported successfully to numb.onnx
Model expects input image of shape [1, 3, 640, 640]
Input image dtype is torch.uint8
Exported model already contains preprocessing (normalization) step, so you don't need to do it manually.
Preprocessing steps to be applied to input image are:
Sequential(
(0): CastTensorTo(dtype=torch.float32)
(1): ChannelSelect(channels_indexes=tensor([2, 1, 0]))
)

Exported model contains postprocessing (NMS) step with the following parameters:
num_pre_nms_predictions=1000
max_predictions_per_image=1000
nms_threshold=0.65
confidence_threshold=0.5
output_predictions_format=batch

Exported model is in ONNX format and can be used with ONNXRuntime
To run inference with ONNXRuntime, please use the following code snippet:

import onnxruntime
import numpy as np
session = onnxruntime.InferenceSession("numb.onnx", providers=["CUDAExecutionProvider", "CPUExecutionProvider"])
inputs = [o.name for o in session.get_inputs()]
outputs = [o.name for o in session.get_outputs()]
example_input_image = np.zeros((1, 3, 640, 640)).astype(np.uint8)
predictions = session.run(outputs, {inputs[0]: example_input_image})

Exported model has predictions in batch format:

num_detections, pred_boxes, pred_scores, pred_classes = predictions
for image_index in range(num_detections.shape[0]):
  for i in range(num_detections[image_index,0]):
    class_id = pred_classes[image_index, i]
    confidence = pred_scores[image_index, i]
    x_min, y_min, x_max, y_max = pred_boxes[image_index, i]
    print(f"Detected object with class_id={class_id}, confidence={confidence}, x_min={x_min}, y_min={y_min}, x_max={x_max}, y_max={y_max}")`
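
The export log above says the model expects a `uint8` tensor of shape `[1, 3, 640, 640]`, so any image passed to ONNX Runtime has to be brought to that size and layout first. The following is a minimal sketch of one way to do that with NumPy only. It makes several assumptions that are not from the thread: aspect-preserving nearest-neighbour resizing, padding at the bottom and right, and a padding value of 114. The helper name `to_model_input` is hypothetical. The resizing and padding should match whatever transforms were used during training and validation, and the channel order the caller must supply is not stated in the thread.

```python
import numpy as np

INPUT_SIZE = 640  # from the export log: input shape [1, 3, 640, 640]


def to_model_input(image_hwc: np.ndarray, pad_value: int = 114) -> np.ndarray:
    """Resize and pad an HxWx3 uint8 image to a 1x3x640x640 uint8 tensor.

    Assumption: nearest-neighbour resize and bottom/right padding; adapt this
    to the preprocessing used when the model was trained.
    """
    if image_hwc.ndim != 3 or image_hwc.shape[2] != 3:
        raise ValueError("expected an HxWx3 image")
    h, w = image_hwc.shape[:2]
    # Scale so the longer side becomes INPUT_SIZE while keeping aspect ratio.
    scale = min(INPUT_SIZE / h, INPUT_SIZE / w)
    new_h = max(1, round(h * scale))
    new_w = max(1, round(w * scale))
    # Nearest-neighbour resize via index lookup (no extra dependencies).
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    resized = image_hwc[rows][:, cols]
    # Paste onto a constant canvas, then convert HWC -> NCHW.
    canvas = np.full((INPUT_SIZE, INPUT_SIZE, 3), pad_value, dtype=np.uint8)
    canvas[:new_h, :new_w] = resized
    return np.ascontiguousarray(canvas.transpose(2, 0, 1)[None])


if __name__ == "__main__":
    dummy = np.zeros((100, 200, 3), dtype=np.uint8)
    print(to_model_input(dummy).shape, to_model_input(dummy).dtype)
```

The returned array can be passed as `example_input_image` in the snippet from the export log. Boxes predicted by the model are then in the 640x640 coordinate space, so under this sketch's assumptions they would need to be divided by `scale` to map back to the original image.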
