Comments (7)
@ilanb hi there!
It sounds like you're experiencing inconsistencies in detection results between the Ultralytics preview and your custom Python application. A possible reason for this could be differences in preprocessing or configuration settings applied to the model before running predictions.
Since you've experimented with some parameters already, you might want to verify the following:
- Image preprocessing: Ensure that the images are preprocessed in the same way in both environments (your app and the Ultralytics preview) before they are fed into the model.
- Model configuration: Double-check that the model configuration (e.g., `iou`, `imgsz`, `max_det`, `device`, `augment`, `agnostic_nms`) mirrors the setup used in the Ultralytics preview as closely as possible.
Additionally, please ensure that you are using the same model version and weights in both your custom app and the preview for a fair comparison.
For specific instructions on setting and matching configuration, please refer to the Ultralytics HUB Docs at https://docs.ultralytics.com/hub.
Hope this helps! If the issue persists, please provide more details about your image preprocessing steps and model setup in both environments for further diagnosis.
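As a concrete illustration of how a single setting can change the output, here is a small self-contained sketch of confidence-threshold filtering. The helper name and the sample detections are made up for illustration; they are not taken from your app or from HUB:

```python
def filter_detections(detections, conf_threshold):
    """Keep only (label, confidence) pairs at or above the threshold.

    A toy stand-in for the thresholding a detection pipeline applies;
    the sample data below is invented for illustration.
    """
    return [(label, conf) for label, conf in detections if conf >= conf_threshold]

raw = [("boat", 0.91), ("buoy", 0.22), ("boat", 0.18)]
print(filter_detections(raw, 0.2))   # two detections survive
print(filter_detections(raw, 0.25))  # a stricter threshold drops one more
```

If the preview and your app use different thresholds anywhere in the pipeline, the same image can yield different numbers of boxes.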
from hub.
Thanks for the reply :-)
What do you mean by "this could be differences in preprocessing"? I only upload an image from Postman or HTML, like your preview in Ultralytics HUB.
What "image preprocessing" do you apply in the preview?
For the parameters, I already tested both with and without.
The image used is the same image that served for training, valid, and test.
here is my simple app:
```python
from ultralytics import YOLO
from flask import request, Flask, jsonify
from flask_cors import CORS
from PIL import Image
import json

app = Flask(__name__)
CORS(app)

# YOLO model (loaded in the __main__ block below)
yolo_model = None

@app.route("/")
def root():
    """
    Site main page handler function.
    :return: Content of index.html file
    """
    with open("index.html") as file:
        return file.read()

@app.route("/detect", methods=["POST"])
def detect():
    buf = request.files["image_file"]
    try:
        if yolo_model is None:
            raise Exception("Model has not been loaded.")
        boxes = detect_objects_on_image(buf.stream)
        return jsonify(boxes)
    except Exception as e:
        error_message = str(e)
        return jsonify({"error": error_message}), 500

@app.route("/detecthtml", methods=["POST"])
def detecthtml():
    confidence_threshold = float(request.args.get("confidence_threshold", 0.2))
    buf = request.files["image_file"]
    try:
        if yolo_model is None:
            raise Exception("Model has not been loaded.")
        boxes = detect_objects_on_image_html(buf.stream, confidence_threshold)
        return jsonify(boxes)
    except Exception as e:
        error_message = str(e)
        return jsonify({"error": error_message}), 500

def detect_objects_on_image(buf):
    try:
        confidence_threshold = float(request.form.get("confidence_threshold", "0.2"))
        results = yolo_model.predict(Image.open(buf), iou=0.45, imgsz=640, max_det=1, device="cpu", augment=True, agnostic_nms=True)
        # results = yolo_model.predict(Image.open(buf))
        result = results[0]
        output = []
        for box in result.boxes:
            class_id = box.cls[0].item()
            prob = round(box.conf[0].item(), 2)
            if prob >= confidence_threshold:
                output.append([result.names[class_id], prob])
        return output
    except Exception as e:
        raise Exception("Error during object detection: " + str(e))

def detect_objects_on_image_html(buf, confidence_threshold=0.2):
    try:
        results = yolo_model.predict(Image.open(buf), iou=0.45, imgsz=640, max_det=1, device="cpu", augment=True, agnostic_nms=True)
        result = results[0]
        output = []
        for box in result.boxes:
            x1, y1, x2, y2 = [round(x) for x in box.xyxy[0].tolist()]
            class_id = box.cls[0].item()
            prob = round(box.conf[0].item(), 2)
            if prob >= confidence_threshold:
                output.append([x1, y1, x2, y2, result.names[class_id], prob])
        return output
    except Exception as e:
        raise Exception("Error during object detection: " + str(e))

if __name__ == '__main__':
    yolo_model = YOLO("ponantyolo8.pt")
    app.run(debug=True, host='0.0.0.0', port=5080)
```
Thanks
Hi @ilanb, thanks for providing more details!
When I mention "image preprocessing," I'm referring to how the image is prepared before it's input into the model. This includes resizing, normalization, and possibly other transformations to ensure the image is in the correct format for the model to process.
In the Ultralytics preview, images are typically resized and normalized to match the input expectations of the model. It's crucial to ensure that the same preprocessing steps are applied in your app as well.
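For intuition, "resizing" can mean different things: one common scheme preserves the aspect ratio by scaling the longer side and padding the rest ("letterboxing"), rather than stretching the image to a square. This is a simplified sketch of the arithmetic, not the exact library code:

```python
def letterbox_params(width, height, new_size=640):
    """Scale so the longer side fits new_size, then pad to a square.

    Simplified illustration of aspect-ratio-preserving resizing;
    not the exact Ultralytics implementation.
    """
    r = min(new_size / width, new_size / height)  # uniform scale factor
    new_w, new_h = round(width * r), round(height * r)
    pad_w, pad_h = new_size - new_w, new_size - new_h  # padding to reach a square
    return r, new_w, new_h, pad_w, pad_h

# A 1280x720 image scales by 0.5 to 640x360 plus 280 px of vertical padding,
# whereas stretching it directly to 640x640 would distort the objects.
print(letterbox_params(1280, 720))
```

Whether your app stretches or letterboxes can change where and how confidently objects are detected.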
From your code, it looks like you're directly using `Image.open(buf)` without explicitly resizing or normalizing the image. Here's a quick suggestion to ensure the image is resized correctly:
```python
from PIL import Image

def prepare_image(image_path):
    img = Image.open(image_path)
    img = img.resize((640, 640))  # Resize the image to the expected input size
    return img

# Then use this function to prepare your image before prediction
img = prepare_image(buf)
results = yolo_model.predict(img, iou=0.45, imgsz=640, max_det=1, device="cpu", augment=True, agnostic_nms=True)
```
Make sure that the image size (`imgsz`) and other parameters match those used during the model's training and in the Ultralytics preview. This consistency is key to achieving similar detection results.
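One related detail: YOLO-family models generally want `imgsz` to be a multiple of the network stride (commonly 32). A quick sketch of rounding a size up to a valid value; this helper is ours for illustration (Ultralytics performs a similar adjustment internally):

```python
def round_up_to_stride(imgsz, stride=32):
    """Round an inference size up to the nearest multiple of the model stride.

    Illustrative helper: 640 is already valid, 600 would be bumped to 608.
    """
    return -(-imgsz // stride) * stride  # ceiling division, scaled back up

print(round_up_to_stride(640))  # -> 640
print(round_up_to_stride(600))  # -> 608
```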
Let me know if aligning these preprocessing steps helps or if there's anything else you'd like to explore!
Thank you,
I tried that too, but same problem: "no detection".
I also tried simplifying to `results = yolo_model.predict(img)`, but same result.
With the same image, detection works in your preview but fails with my code...
```python
from ultralytics import YOLO
from flask import request, Flask, jsonify
from flask_cors import CORS
from PIL import Image
import json

app = Flask(__name__)
CORS(app)

# YOLO model (loaded in the __main__ block below)
yolo_model = None

@app.route("/")
def root():
    with open("index.html") as file:
        return file.read()

@app.route("/detect", methods=["POST"])
def detect():
    buf = request.files["image_file"]
    try:
        if yolo_model is None:
            raise Exception("Model has not been loaded.")
        boxes = detect_objects_on_image(buf.stream)
        return jsonify(boxes)
    except Exception as e:
        error_message = str(e)
        return jsonify({"error": error_message}), 500

@app.route("/detecthtml", methods=["POST"])
def detecthtml():
    confidence_threshold = float(request.args.get("confidence_threshold", 0.2))
    buf = request.files["image_file"]
    try:
        if yolo_model is None:
            raise Exception("Model has not been loaded.")
        boxes = detect_objects_on_image_html(buf.stream, confidence_threshold)
        return jsonify(boxes)
    except Exception as e:
        error_message = str(e)
        return jsonify({"error": error_message}), 500

def prepare_image(image_path):
    img = Image.open(image_path)
    img = img.resize((640, 640))  # Resize the image to the expected input size
    return img

def detect_objects_on_image(buf):
    try:
        confidence_threshold = float(request.form.get("confidence_threshold", "0.2"))
        img = prepare_image(buf)
        results = yolo_model.predict(img)
        result = results[0]
        output = []
        for box in result.boxes:
            class_id = box.cls[0].item()
            prob = round(box.conf[0].item(), 2)
            output.append([result.names[class_id], prob])
        return output
    except Exception as e:
        raise Exception("Error during object detection: " + str(e))

def detect_objects_on_image_html(buf, confidence_threshold=0.2):
    try:
        img = prepare_image(buf)
        results = yolo_model.predict(img)
        result = results[0]
        output = []
        for box in result.boxes:
            x1, y1, x2, y2 = [round(x) for x in box.xyxy[0].tolist()]
            class_id = box.cls[0].item()
            prob = round(box.conf[0].item(), 2)
            output.append([x1, y1, x2, y2, result.names[class_id], prob])
        return output
    except Exception as e:
        raise Exception("Error during object detection: " + str(e))

if __name__ == '__main__':
    yolo_model = YOLO("ponantyolo8.pt")
    app.run(debug=True, host='0.0.0.0', port=5080)
```
The strange behaviour is that when I try the .onnx exported model in Unity Sentis (C#), all images are correctly detected too...
I don't understand what causes the problem with Python and the .pt model.
```csharp
using System.Collections.Generic;
using System.Collections;
using Unity.Sentis;
using UnityEngine;
using UnityEngine.UI;
using UnityEngine.Video;
using Lays = Unity.Sentis.Layers;
using MugHeadStudios;

public class RunYOLO8n : MonoBehaviour
{
    const string modelName = "lastponant.sentis";
    // Link the classes.txt here:
    public TextAsset labelsAsset;
    // Create a Raw Image in the scene and link it here:
    public RawImage displayImage;
    // Link to a bounding box texture here:
    public Sprite boxTexture;
    // Link to the font for the labels:
    public Font font;

    const BackendType backend = BackendType.CPU;

    private Transform displayLocation;
    private Model model;
    private IWorker engine;
    private string[] labels;
    private RenderTexture targetRT;

    // Image size for the model
    private const int imageWidth = 640;
    private const int imageHeight = 640;

    // The number of classes in the model
    private const int numClasses = 50;

    private VideoPlayer video;

    List<GameObject> boxPool = new List<GameObject>();

    [SerializeField, Range(0, 1)] float iouThreshold = 0.5f;
    [SerializeField, Range(0, 1)] float scoreThreshold = 0.5f;
    int maxOutputBoxes = 64;

    // For using tensor operators:
    Ops ops;

    // Bounding box data
    public struct BoundingBox
    {
        public float centerX;
        public float centerY;
        public float width;
        public float height;
        public string label;
    }

    public Texture2D[] textures; // Assign this array in the Unity Editor.
    public float delayInSeconds = 5f; // Time delay between textures.

    void Start()
    {
        Application.targetFrameRate = 60;
        Screen.orientation = ScreenOrientation.LandscapeLeft;

        ops = WorkerFactory.CreateOps(backend, null);

        // Parse neural net labels
        labels = labelsAsset.text.Split('\n');

        LoadModel();

        targetRT = new RenderTexture(imageWidth, imageHeight, 0);

        // Create image to display video
        displayLocation = displayImage.transform;

        // Create engine to run model
        engine = WorkerFactory.CreateWorker(backend, model);

        if (textures.Length > 0)
        {
            SubRoutines.Repeat(10, textures.Length * delayInSeconds, r => { StartCoroutine(LoadTexturesOneByOne(delayInSeconds)); }, () => { Debug.Log("Restarted"); });
        }
    }

    void LoadModel()
    {
        // Load model
        model = ModelLoader.Load(Application.streamingAssetsPath + "/" + modelName);

        // The classes are also stored here in JSON format:
        Debug.Log($"Class names: \n{model.Metadata["names"]}");

        // We need to add some layers to choose the best boxes with the NMSLayer

        // Set constants
        model.AddConstant(new Lays.Constant("0", new int[] { 0 }));
        model.AddConstant(new Lays.Constant("1", new int[] { 1 }));
        model.AddConstant(new Lays.Constant("4", new int[] { 4 }));
        model.AddConstant(new Lays.Constant("classes_plus_4", new int[] { numClasses + 4 }));
        model.AddConstant(new Lays.Constant("maxOutputBoxes", new int[] { maxOutputBoxes }));
        model.AddConstant(new Lays.Constant("iouThreshold", new float[] { iouThreshold }));
        model.AddConstant(new Lays.Constant("scoreThreshold", new float[] { scoreThreshold }));

        // Add layers
        model.AddLayer(new Lays.Slice("boxCoords0", "output0", "0", "4", "1"));
        model.AddLayer(new Lays.Transpose("boxCoords", "boxCoords0", new int[] { 0, 2, 1 }));
        model.AddLayer(new Lays.Slice("scores0", "output0", "4", "classes_plus_4", "1"));
        model.AddLayer(new Lays.ReduceMax("scores", new[] { "scores0", "1" }));
        model.AddLayer(new Lays.ArgMax("classIDs", "scores0", 1));
        model.AddLayer(new Lays.NonMaxSuppression("NMS", "boxCoords", "scores",
            "maxOutputBoxes", "iouThreshold", "scoreThreshold",
            centerPointBox: Lays.CenterPointBox.Center
        ));

        model.outputs.Clear();
        model.AddOutput("boxCoords");
        model.AddOutput("classIDs");
        model.AddOutput("NMS");
    }

    IEnumerator LoadTexturesOneByOne(float wait)
    {
        foreach (var texture in textures)
        {
            displayImage.texture = texture;

            if (displayImage == null || texture == null)
            {
                Debug.LogError("Please assign the RawImage and testImage in the Inspector.");
            }

            // Set the test image as the texture of the Raw Image
            displayImage.texture = texture;

            // Calculate the aspect ratio of the test image
            float aspectRatio = (float)texture.width / texture.height;

            // Get the screen dimensions
            float screenWidth = Screen.width;
            float screenHeight = Screen.height;

            // Calculate the size of the RawImage to fit the screen while maintaining aspect ratio
            float imageWidth = screenWidth;
            float imageHeight = screenWidth / aspectRatio;
            if (imageHeight > screenHeight)
            {
                imageHeight = screenHeight;
                imageWidth = screenHeight * aspectRatio;
            }

            // Set the size of the RawImage
            RectTransform rawImageRect = displayImage.GetComponent<RectTransform>();
            rawImageRect.sizeDelta = new Vector2(imageWidth, imageHeight);

            // Perform ML inference on the test image
            ExecuteML(texture);

            yield return new WaitForSeconds(wait);
        }
    }

    private void Update()
    {
        if (Input.GetKeyDown(KeyCode.Escape))
        {
            Application.Quit();
        }
    }

    public void ExecuteML(Texture2D inputTexture)
    {
        ClearAnnotations();

        // Process the input texture
        using var input = TextureConverter.ToTensor(inputTexture, imageWidth, imageHeight, 3);
        engine.Execute(input);

        var boxCoords = engine.PeekOutput("boxCoords") as TensorFloat;
        var NMS = engine.PeekOutput("NMS") as TensorInt;
        var classIDs = engine.PeekOutput("classIDs") as TensorInt;

        using var boxIDs = ops.Slice(NMS, new int[] { 2 }, new int[] { 3 }, new int[] { 1 }, new int[] { 1 });
        using var boxIDsFlat = boxIDs.ShallowReshape(new TensorShape(boxIDs.shape.length)) as TensorInt;
        using var output = ops.Gather(boxCoords, boxIDsFlat, 1);
        using var labelIDs = ops.Gather(classIDs, boxIDsFlat, 2);

        output.MakeReadable();
        labelIDs.MakeReadable();

        float displayWidth = displayImage.rectTransform.rect.width;
        float displayHeight = displayImage.rectTransform.rect.height;

        float scaleX = displayWidth / imageWidth;
        float scaleY = displayHeight / imageHeight;

        // Draw the bounding boxes
        for (int n = 0; n < output.shape[1]; n++)
        {
            var box = new BoundingBox
            {
                centerX = output[0, n, 0] * scaleX - displayWidth / 2,
                centerY = output[0, n, 1] * scaleY - displayHeight / 2,
                width = output[0, n, 2] * scaleX,
                height = output[0, n, 3] * scaleY,
                label = labels[labelIDs[0, 0, n]],
            };
            DrawBox(box, n);
        }
    }

    public void DrawBox(BoundingBox box, int id)
    {
        // Create the bounding box graphic or get from pool
        GameObject panel;
        if (id < boxPool.Count)
        {
            panel = boxPool[id];
            panel.SetActive(true);
        }
        else
        {
            panel = CreateNewBox(Color.yellow);
        }

        // Set box position
        panel.transform.localPosition = new Vector3(box.centerX, -box.centerY);

        // Set box size
        RectTransform rt = panel.GetComponent<RectTransform>();
        rt.sizeDelta = new Vector2(box.width, box.height);

        // Set label text
        var label = panel.GetComponentInChildren<Text>();
        label.text = box.label;
    }

    public GameObject CreateNewBox(Color color)
    {
        // Create the box and set image
        var panel = new GameObject("ObjectBox");
        panel.AddComponent<CanvasRenderer>();
        Image img = panel.AddComponent<Image>();
        img.color = color;
        img.sprite = boxTexture;
        img.type = Image.Type.Sliced;
        panel.transform.SetParent(displayLocation, false);

        // Create the label
        var text = new GameObject("ObjectLabel");
        text.AddComponent<CanvasRenderer>();
        text.transform.SetParent(panel.transform, false);
        Text txt = text.AddComponent<Text>();
        txt.font = font;
        txt.color = color;
        txt.fontSize = 40;
        txt.horizontalOverflow = HorizontalWrapMode.Overflow;

        RectTransform rt2 = text.GetComponent<RectTransform>();
        rt2.offsetMin = new Vector2(20, rt2.offsetMin.y);
        rt2.offsetMax = new Vector2(0, rt2.offsetMax.y);
        rt2.offsetMin = new Vector2(rt2.offsetMin.x, 0);
        rt2.offsetMax = new Vector2(rt2.offsetMax.x, 30);
        rt2.anchorMin = new Vector2(0, 0);
        rt2.anchorMax = new Vector2(1, 1);

        boxPool.Add(panel);
        return panel;
    }

    public void ClearAnnotations()
    {
        foreach (var box in boxPool)
        {
            box.SetActive(false);
        }
    }

    private void OnDestroy()
    {
        engine?.Dispose();
        ops?.Dispose();
    }
}
```
Hi @ilanb,
It's intriguing that the .onnx model works well in Unity with Sentis but not the .pt model in Python. This could suggest a difference in how the models handle the input data or in the post-processing steps.
Here are a couple of things to consider:
- Model Version Compatibility: Ensure that the .pt model and the .onnx model are from the same training and have the same architecture.
- Input Normalization: Check if there's a difference in how images are preprocessed and normalized in Unity vs. your Python setup. Sometimes, models expect inputs normalized in a specific way (e.g., scaled to [0,1] or mean-subtracted).
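To make the normalization point concrete, here is a tiny sketch of the [0,1] scaling convention mentioned above. This is illustrative only; check what your export and your Python path actually expect:

```python
def normalize_pixels(pixels):
    """Scale 8-bit pixel values (0-255) to floats in [0, 1].

    One common normalization convention; other models instead expect
    mean-subtracted or [-1, 1] inputs, so mismatches here can silently
    wreck detection confidence.
    """
    return [p / 255.0 for p in pixels]

print(normalize_pixels([0, 128, 255]))
```

If one runtime feeds raw 0-255 values where the other feeds [0,1] floats, the same weights can produce "no detection" in one environment and good boxes in the other.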
If you haven't already, you might also want to try running inference with a very simple setup in Python to rule out any issues with Flask or image handling:
```python
from ultralytics import YOLO
from PIL import Image

# Load model
model = YOLO("path_to_your_model.pt")

# Load image
img = Image.open("path_to_your_image.jpg")
img = img.resize((640, 640))

# Predict
results = model.predict(img)
print(results)
```
This minimal example can help isolate the problem by removing potential complications from web server code or image streaming.
Let me know how it goes!
Hello there! We wanted to give you a friendly reminder that this issue has not had any recent activity and may be closed soon, but don't worry - you can always reopen it if needed. If you still have any questions or concerns, please feel free to let us know how we can help.
For additional resources and information, please see the links below:
- Docs: https://docs.ultralytics.com
- HUB: https://hub.ultralytics.com
- Community: https://community.ultralytics.com
Feel free to inform us of any other issues you discover or feature requests that come to mind in the future. Pull Requests (PRs) are also always welcomed!
Thank you for your contributions to YOLO and Vision AI!