
Comments (7)

pderrenger commented on July 18, 2024

@ilanb hi there! πŸ™Œ

It sounds like you're experiencing inconsistencies in detection results between the Ultralytics preview and your custom Python application. A possible reason for this could be differences in preprocessing or configuration settings applied to the model before running predictions.

Since you’ve experimented with some parameters already, you might want to verify the following:

  • Image preprocessing: Ensure that the images are preprocessed in the same way in both environments (your app and the Ultralytics preview) before they are fed into the model.
  • Model configuration: Double-check that the model configuration (e.g., iou, imgsz, max_det, device, augment, agnostic_nms) mirrors the setup used in the Ultralytics preview as closely as possible.

Additionally, please ensure that you are using the same model version and weights in both your custom app and the preview for a fair comparison.
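
For instance, here's a minimal sketch of a prediction call with the configuration made explicit (the file names and parameter values below are placeholders; substitute the weights and settings your HUB preview actually uses):

```python
from ultralytics import YOLO
from PIL import Image

# Placeholder path: use the exact same weights file as the HUB preview
model = YOLO("your_model.pt")

# Open the image the same way in both environments
img = Image.open("test.jpg")

# Placeholder values: mirror the thresholds and image size from your preview
results = model.predict(img, conf=0.25, iou=0.45, imgsz=640, device="cpu")
print(results[0].boxes)
```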

For specific instructions on setting and matching configuration, please refer to the Ultralytics HUB Docs at https://docs.ultralytics.com/hub.

Hope this helps! If the issue persists, please provide more details about your image preprocessing steps and model setup in both environments for further diagnosis. 😊

ilanb commented on July 18, 2024

Thanks for the reply :-)

What do you mean by "differences in preprocessing"? I only upload an image from Postman or HTML, just like your preview in Ultralytics HUB.

What "image preprocessing" do you apply in the preview?

For the parameters, I already tested both with and without them.

The image used is the same image that served for training, validation, and testing.

Here is my simple app:

```python
from ultralytics import YOLO
from flask import request, Flask, jsonify
from flask_cors import CORS
from PIL import Image
import json

app = Flask(__name__)
CORS(app)

# YOLO model
yolo_model = None


@app.route("/")
def root():
    """
    Site main page handler function.
    :return: Content of index.html file
    """
    with open("index.html") as file:
        return file.read()


@app.route("/detect", methods=["POST"])
def detect():
    buf = request.files["image_file"]
    try:
        if yolo_model is None:
            raise Exception("Model has not been loaded.")
        boxes = detect_objects_on_image(buf.stream)
        return jsonify(boxes)
    except Exception as e:
        error_message = str(e)
        return jsonify({"error": error_message}), 500


@app.route("/detecthtml", methods=["POST"])
def detecthtml():
    confidence_threshold = float(request.args.get("confidence_threshold", 0.2))
    buf = request.files["image_file"]
    try:
        if yolo_model is None:
            raise Exception("Model has not been loaded.")
        boxes = detect_objects_on_image_html(buf.stream, confidence_threshold)
        return jsonify(boxes)
    except Exception as e:
        error_message = str(e)
        return jsonify({"error": error_message}), 500


def detect_objects_on_image(buf):
    try:
        confidence_threshold = float(request.form.get("confidence_threshold", "0.2"))
        results = yolo_model.predict(Image.open(buf), iou=0.45, imgsz=640, max_det=1, device="cpu", augment=True, agnostic_nms=True)
        # results = yolo_model.predict(Image.open(buf))
        result = results[0]
        output = []
        for box in result.boxes:
            class_id = box.cls[0].item()
            prob = round(box.conf[0].item(), 2)
            if prob >= confidence_threshold:
                output.append([result.names[class_id], prob])
        return output
    except Exception as e:
        raise Exception("Error during object detection: " + str(e))


def detect_objects_on_image_html(buf, confidence_threshold=0.2):
    try:
        results = yolo_model.predict(Image.open(buf), iou=0.45, imgsz=640, max_det=1, device="cpu", augment=True, agnostic_nms=True)
        result = results[0]
        output = []
        for box in result.boxes:
            x1, y1, x2, y2 = [round(x) for x in box.xyxy[0].tolist()]
            class_id = box.cls[0].item()
            prob = round(box.conf[0].item(), 2)
            if prob >= confidence_threshold:
                output.append([x1, y1, x2, y2, result.names[class_id], prob])
        return output
    except Exception as e:
        raise Exception("Error during object detection: " + str(e))


if __name__ == '__main__':
    yolo_model = YOLO("ponantyolo8.pt")
    app.run(debug=True, host='0.0.0.0', port=5080)
```

Thanks

pderrenger commented on July 18, 2024

Hi @ilanb, thanks for providing more details! 😊

When I mention "image preprocessing," I'm referring to how the image is prepared before it's input into the model. This includes resizing, normalization, and possibly other transformations to ensure the image is in the correct format for the model to process.

In the Ultralytics preview, images are typically resized and normalized to match the input expectations of the model. It's crucial to ensure that the same preprocessing steps are applied in your app as well.

From your code, it looks like you're directly using Image.open(buf) without explicitly resizing or normalizing the image. Here's a quick suggestion to ensure the image is resized correctly:

```python
from PIL import Image

def prepare_image(image_path):
    img = Image.open(image_path)
    img = img.resize((640, 640))  # Resize the image to the expected input size
    return img

# Then use this function to prepare your image before prediction
img = prepare_image(buf)
results = yolo_model.predict(img, iou=0.45, imgsz=640, max_det=1, device="cpu", augment=True, agnostic_nms=True)
```

Make sure that the image size (imgsz) and other parameters match those used during the model's training and in the Ultralytics preview. This consistency is key to achieving similar detection results.

Let me know if aligning these preprocessing steps helps or if there's anything else you'd like to explore! πŸš€

ilanb commented on July 18, 2024

Thank you,
I tried that too, but same problem: "no detection".

I also tried simplifying to results = yolo_model.predict(img), but same result.

With the same image, detection works in your preview but fails with my code...

```python
from ultralytics import YOLO
from flask import request, Flask, jsonify
from flask_cors import CORS
from PIL import Image
import json

app = Flask(__name__)
CORS(app)


@app.route("/")
def root():
    with open("index.html") as file:
        return file.read()


@app.route("/detect", methods=["POST"])
def detect():
    buf = request.files["image_file"]
    try:
        if yolo_model is None:
            raise Exception("Model has not been loaded.")
        boxes = detect_objects_on_image(buf.stream)
        return jsonify(boxes)
    except Exception as e:
        error_message = str(e)
        return jsonify({"error": error_message}), 500


@app.route("/detecthtml", methods=["POST"])
def detecthtml():
    confidence_threshold = float(request.args.get("confidence_threshold", 0.2))
    buf = request.files["image_file"]
    try:
        if yolo_model is None:
            raise Exception("Model has not been loaded.")
        boxes = detect_objects_on_image_html(buf.stream, confidence_threshold)
        return jsonify(boxes)
    except Exception as e:
        error_message = str(e)
        return jsonify({"error": error_message}), 500


def prepare_image(image_path):
    img = Image.open(image_path)
    img = img.resize((640, 640))  # Resize the image to the expected input size
    return img


def detect_objects_on_image(buf):
    try:
        confidence_threshold = float(request.form.get("confidence_threshold", "0.2"))
        img = prepare_image(buf)
        results = yolo_model.predict(img)
        result = results[0]
        output = []
        for box in result.boxes:
            class_id = box.cls[0].item()
            prob = round(box.conf[0].item(), 2)
            output.append([result.names[class_id], prob])
        return output
    except Exception as e:
        raise Exception("Error during object detection: " + str(e))


def detect_objects_on_image_html(buf, confidence_threshold=0.2):
    try:
        img = prepare_image(buf)
        results = yolo_model.predict(img)
        result = results[0]
        output = []
        for box in result.boxes:
            x1, y1, x2, y2 = [round(x) for x in box.xyxy[0].tolist()]
            class_id = box.cls[0].item()
            prob = round(box.conf[0].item(), 2)
            output.append([x1, y1, x2, y2, result.names[class_id], prob])
        return output
    except Exception as e:
        raise Exception("Error during object detection: " + str(e))


if __name__ == '__main__':
    yolo_model = YOLO("ponantyolo8.pt")
    app.run(debug=True, host='0.0.0.0', port=5080)
```

ilanb commented on July 18, 2024

The strange behaviour is that when I try the exported .onnx model in Unity Sentis (C#), all images are correctly detected too.

I don't understand what causes the problem with Python and the .pt model.

```csharp
using System.Collections.Generic;
using System.Collections;
using Unity.Sentis;
using UnityEngine;
using UnityEngine.UI;
using UnityEngine.Video;
using Lays = Unity.Sentis.Layers;
using MugHeadStudios;

public class RunYOLO8n : MonoBehaviour
{
const string modelName = "lastponant.sentis";
// Link the classes.txt here:
public TextAsset labelsAsset;
// Create a Raw Image in the scene and link it here:
public RawImage displayImage;
// Link to a bounding box texture here:
public Sprite boxTexture;
// Link to the font for the labels:
public Font font;

const BackendType backend = BackendType.CPU;

private Transform displayLocation;
private Model model;
private IWorker engine;
private string[] labels;
private RenderTexture targetRT;


//Image size for the model
private const int imageWidth = 640;
private const int imageHeight = 640;

//The number of classes in the model
private const int numClasses = 50;

private VideoPlayer video;

List<GameObject> boxPool = new List<GameObject>();

[SerializeField, Range(0, 1)] float iouThreshold = 0.5f;
[SerializeField, Range(0, 1)] float scoreThreshold = 0.5f;
int maxOutputBoxes = 64;

//For using tensor operators:
Ops ops;

//bounding box data
public struct BoundingBox
{
    public float centerX;
    public float centerY;
    public float width;
    public float height;
    public string label;
}

public Texture2D[] textures; // Assign this array in the Unity Editor.
public float delayInSeconds = 5f; // Time delay between textures.

void Start()
{
    Application.targetFrameRate = 60;
    Screen.orientation = ScreenOrientation.LandscapeLeft;

    ops = WorkerFactory.CreateOps(backend, null);

    //Parse neural net labels
    labels = labelsAsset.text.Split('\n');

    LoadModel();

    targetRT = new RenderTexture(imageWidth, imageHeight, 0);

    //Create image to display video
    displayLocation = displayImage.transform;

    //Create engine to run model
    engine = WorkerFactory.CreateWorker(backend, model);
    
    if (textures.Length > 0)
    {
    	SubRoutines.Repeat(10, textures.Length*delayInSeconds, r => { StartCoroutine(LoadTexturesOneByOne(delayInSeconds)); }, () => { Debug.Log("Restarted"); });
    }
}

void LoadModel()
{
    //Load model
    model = ModelLoader.Load(Application.streamingAssetsPath + "/" + modelName);

    //The classes are also stored here in JSON format:
    Debug.Log($"Class names: \n{model.Metadata["names"]}");

    //We need to add some layers to choose the best boxes with the NMSLayer
    
    //Set constants
    model.AddConstant(new Lays.Constant("0", new int[] { 0 }));
    model.AddConstant(new Lays.Constant("1", new int[] { 1 }));
    model.AddConstant(new Lays.Constant("4", new int[] { 4 }));


    model.AddConstant(new Lays.Constant("classes_plus_4", new int[] { numClasses + 4 }));
    model.AddConstant(new Lays.Constant("maxOutputBoxes", new int[] { maxOutputBoxes }));
    model.AddConstant(new Lays.Constant("iouThreshold", new float[] { iouThreshold }));
    model.AddConstant(new Lays.Constant("scoreThreshold", new float[] { scoreThreshold }));
   
    //Add layers
    model.AddLayer(new Lays.Slice("boxCoords0", "output0", "0", "4", "1")); 
    model.AddLayer(new Lays.Transpose("boxCoords", "boxCoords0", new int[] { 0, 2, 1 }));
    model.AddLayer(new Lays.Slice("scores0", "output0", "4", "classes_plus_4", "1")); 
    model.AddLayer(new Lays.ReduceMax("scores", new[] { "scores0", "1" }));
    model.AddLayer(new Lays.ArgMax("classIDs", "scores0", 1));

    model.AddLayer(new Lays.NonMaxSuppression("NMS", "boxCoords", "scores",
        "maxOutputBoxes", "iouThreshold", "scoreThreshold",
        centerPointBox: Lays.CenterPointBox.Center
    ));

    model.outputs.Clear();
    model.AddOutput("boxCoords");
    model.AddOutput("classIDs");
    model.AddOutput("NMS");
}

IEnumerator LoadTexturesOneByOne(float wait)
{
	foreach (var texture in textures)
	{
		displayImage.texture = texture;

		if (displayImage == null || texture == null)
		{
			Debug.LogError("Please assign the RawImage and testImage in the Inspector.");
		}

		// Set the test image as the texture of the Raw Image
		displayImage.texture = texture;

		// Calculate the aspect ratio of the test image
		float aspectRatio = (float)texture.width / texture.height;

		// Get the screen dimensions
		float screenWidth = Screen.width;
		float screenHeight = Screen.height;

		// Calculate the size of the RawImage to fit the screen while maintaining aspect ratio
		float imageWidth = screenWidth;
		float imageHeight = screenWidth / aspectRatio;

		if (imageHeight > screenHeight)
		{
			imageHeight = screenHeight;
			imageWidth = screenHeight * aspectRatio;
		}

		// Set the size of the RawImage
		RectTransform rawImageRect = displayImage.GetComponent<RectTransform>();
		rawImageRect.sizeDelta = new Vector2(imageWidth, imageHeight);
		// Perform ML inference on the test image
		ExecuteML(texture);
		
		yield return new WaitForSeconds(wait);		
	}
}

private void Update()
{
    if (Input.GetKeyDown(KeyCode.Escape))
    {
        Application.Quit();
    }
}

public void ExecuteML(Texture2D inputTexture)
{
    ClearAnnotations();

    // Process the input texture
    using var input = TextureConverter.ToTensor(inputTexture, imageWidth, imageHeight, 3);
    engine.Execute(input);

    var boxCoords = engine.PeekOutput("boxCoords") as TensorFloat;
    var NMS = engine.PeekOutput("NMS") as TensorInt;
    var classIDs = engine.PeekOutput("classIDs") as TensorInt;

    using var boxIDs = ops.Slice(NMS, new int[] { 2 }, new int[] { 3 }, new int[] { 1 }, new int[] { 1 });
    using var boxIDsFlat = boxIDs.ShallowReshape(new TensorShape(boxIDs.shape.length)) as TensorInt;
    using var output = ops.Gather(boxCoords, boxIDsFlat, 1);
    using var labelIDs = ops.Gather(classIDs, boxIDsFlat, 2);
    
    output.MakeReadable();
    labelIDs.MakeReadable();

    float displayWidth = displayImage.rectTransform.rect.width;
    float displayHeight = displayImage.rectTransform.rect.height;

    float scaleX = displayWidth / imageWidth;
    float scaleY = displayHeight / imageHeight;

    //Draw the bounding boxes
    for (int n = 0; n < output.shape[1]; n++)
    {
        var box = new BoundingBox
        {
            centerX = output[0, n, 0] * scaleX - displayWidth / 2,
            centerY = output[0, n, 1] * scaleY - displayHeight / 2,
            width = output[0, n, 2] * scaleX,
            height = output[0, n, 3] * scaleY,
            label = labels[labelIDs[0, 0,n]],
        };
        DrawBox(box, n);
    }
}

public void DrawBox(BoundingBox box , int id)
{
    //Create the bounding box graphic or get from pool
    GameObject panel;
    if (id < boxPool.Count)
    {
        panel = boxPool[id];
        panel.SetActive(true);
    }
    else
    {
        panel = CreateNewBox(Color.yellow);
    }
    //Set box position
    panel.transform.localPosition = new Vector3(box.centerX, -box.centerY);

    //Set box size
    RectTransform rt = panel.GetComponent<RectTransform>();
    rt.sizeDelta = new Vector2(box.width, box.height);
    
    //Set label text
    var label = panel.GetComponentInChildren<Text>();
    label.text = box.label;
}

public GameObject CreateNewBox(Color color)
{
    //Create the box and set image

    var panel = new GameObject("ObjectBox");
    panel.AddComponent<CanvasRenderer>();
    Image img = panel.AddComponent<Image>();
    img.color = color;
    img.sprite = boxTexture;
    img.type = Image.Type.Sliced;
    panel.transform.SetParent(displayLocation, false);

    //Create the label

    var text = new GameObject("ObjectLabel");
    text.AddComponent<CanvasRenderer>();
    text.transform.SetParent(panel.transform, false);
    Text txt = text.AddComponent<Text>();
    txt.font = font;
    txt.color = color;
    txt.fontSize = 40;
    txt.horizontalOverflow = HorizontalWrapMode.Overflow;

    RectTransform rt2 = text.GetComponent<RectTransform>();
    rt2.offsetMin = new Vector2(20, rt2.offsetMin.y);
    rt2.offsetMax = new Vector2(0, rt2.offsetMax.y);
    rt2.offsetMin = new Vector2(rt2.offsetMin.x, 0);
    rt2.offsetMax = new Vector2(rt2.offsetMax.x, 30);
    rt2.anchorMin = new Vector2(0, 0);
    rt2.anchorMax = new Vector2(1, 1);

    boxPool.Add(panel);
    return panel;
}

public void ClearAnnotations()
{
    foreach(var box in boxPool)
    {
        box.SetActive(false);
    }
}

private void OnDestroy()
{
    engine?.Dispose();
    ops?.Dispose();
}

}
```

pderrenger commented on July 18, 2024

Hi @ilanb,

It's intriguing that the .onnx model works well in Unity with Sentis but not the .pt model in Python. This could suggest a difference in how the models handle the input data or in the post-processing steps.

Here are a couple of things to consider:

  • Model Version Compatibility: Ensure that the .pt model and the .onnx model come from the same training run and share the same architecture (one way to verify this is sketched below).
  • Input Normalization: Check if there's a difference in how images are preprocessed and normalized in Unity vs. your Python setup. Sometimes, models expect inputs normalized in a specific way (e.g., scaled to [0,1] or mean-subtracted).
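
On the first point, one quick way to rule out a weights mismatch is to re-export the ONNX model from the same .pt checkpoint you use in Python, so both runtimes are guaranteed to share identical weights. A minimal sketch, assuming the checkpoint name from your code above:

```python
from ultralytics import YOLO

# Load the same .pt checkpoint your Flask app uses (name taken from your code)
model = YOLO("ponantyolo8.pt")

# Re-export to ONNX so Unity Sentis and Python share identical weights
model.export(format="onnx", imgsz=640)
```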

If you haven't already, you might also want to try running inference with a very simple setup in Python to rule out any issues with Flask or image handling:

```python
from ultralytics import YOLO
from PIL import Image

# Load model
model = YOLO("path_to_your_model.pt")

# Load image
img = Image.open("path_to_your_image.jpg")
img = img.resize((640, 640))

# Predict
results = model.predict(img)
print(results)
```

This minimal example can help isolate the problem by removing potential complications from web server code or image streaming.
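
Since the exported .onnx works in Sentis, another useful cross-check is to load that same .onnx directly in Python via Ultralytics, which separates a weights problem from an environment problem. A sketch, assuming the file name produced by your export:

```python
from ultralytics import YOLO
from PIL import Image

# Ultralytics can run exported ONNX weights directly (assumed file name)
onnx_model = YOLO("ponantyolo8.onnx")

img = Image.open("path_to_your_image.jpg")
results = onnx_model.predict(img, imgsz=640)
print(results[0].boxes)
```

If the .onnx detects correctly here but the .pt does not, that points at the checkpoint itself rather than your Flask code.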

Let me know how it goes! πŸš€

github-actions commented on July 18, 2024

πŸ‘‹ Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.

Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!

Thank you for your contributions to YOLO πŸš€ and Vision AI ⭐
