Comments (11)
FYI, these are the YoloNAS image_processor params:
from super_gradients.training.processing import DetectionCenterPadding, StandardizeImage, NormalizeImage, ImagePermute, ComposeProcessing, DetectionLongestMaxSizeRescale
image_processor = ComposeProcessing(
    [
        DetectionLongestMaxSizeRescale(output_shape=(636, 636)),
        DetectionCenterPadding(output_shape=(640, 640), pad_value=114),
        StandardizeImage(max_value=255.0),
        ImagePermute(permutation=(2, 0, 1)),
    ]
)
ImagePermute, for instance, is mandatory for YoloNAS because it permutes the image axes from (H, W, C) to (C, H, W), which is the layout YoloNAS expects.
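To make the axis order concrete, here is a minimal NumPy sketch of what a (2, 0, 1) permutation does to an image array. It only illustrates the transpose on a dummy array and is not SuperGradients' own implementation; the array names and the 480x640 size are made up for the example.

```python
import numpy as np

# Dummy image in (H, W, C) layout, the usual layout after loading with PIL or OpenCV.
hwc_image = np.zeros((480, 640, 3), dtype=np.uint8)

# Same axis order as permutation=(2, 0, 1): new axis 0 is old axis 2 (channels),
# new axis 1 is old axis 0 (height), new axis 2 is old axis 1 (width).
chw_image = np.transpose(hwc_image, (2, 0, 1))

print(hwc_image.shape)  # (480, 640, 3)
print(chw_image.shape)  # (3, 480, 640)
```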
from super-gradients.
Great to hear it works. To give you some more info:
1. When you use predict on a model that was not trained using SG recipes, you get the following error:
Please call `model.set_dataset_processing_params(...)` first.
Why? Because if you trained the model yourself on a custom dataset, it has no way of knowing which processing functions you applied to the images before feeding them to it, so you need to tell it using set_dataset_processing_params.
2. In your case, you said you are loading pretrained weights from SG, so why didn't it work?
Because you loaded them using checkpoint_path, which is meant for loading a local checkpoint (i.e. usually one you trained yourself). The normal way of loading a pretrained model is simply: model = models.get(Models.YOLO_NAS_L, pretrained_weights="coco")
This way you should not need to call model.set_dataset_processing_params(...).
Also note that this way you don't need to set num_classes, because you are using the model's default. If you set it to another number, the head of the model will be replaced and your model won't be able to predict with it (you would need to fine-tune it on the new number of classes).
3. Tip/Fix
Besides, you are doing the following:
image = Image.open("./img.jpg")
# Prepare preprocess transformations
pre_proccess = transforms.Compose([
    transforms.Resize([640, 640]),
    transforms.ToTensor(),
    transforms.Normalize([.485, .456, .406], [.229, .224, .225])
])
# Run preprocess on image. unsqueeze for [Batch x Channels x Height x Width] format
transformed_image = pre_proccess(image).unsqueeze(0)
...
model.predict(transformed_image, conf=0.45).save("output")
But the simpler and more correct approach would be this:
model.predict("./img.jpg", conf=0.45).save("output")
With your approach, you apply transforms to the image before passing it to model.predict(...), which preprocesses the image in the right way anyway. So with your approach the image is resized twice (no problem), but it is also normalized by you and then standardized again by the model (with StandardizeImage).
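As a rough back-of-the-envelope check of that double scaling, the sketch below follows a single red-channel pixel value through both paths. It assumes that StandardizeImage(max_value=255.0) amounts to dividing by max_value (an assumption, not checked against the library source); the variable names are illustrative only.

```python
# One red-channel pixel with raw value 128 on the 0-255 scale.
raw = 128.0

# Path the model expects: raw 0-255 input, then division by max_value (assumed behavior).
expected = raw / 255.0

# Path taken with the manual torchvision pipeline in front of predict:
to_tensor = raw / 255.0                    # ToTensor scales to [0, 1]
normalized = (to_tensor - 0.485) / 0.229   # Normalize with the red-channel mean/std above
double_scaled = normalized / 255.0         # the model then standardizes a second time

print(f"value the model expects: {expected:.4f}")
print(f"value after both steps:  {double_scaled:.6f}")
```

The second value ends up orders of magnitude smaller than the first, which is why the predictions degrade even though no error is raised at this step.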
In the end, your code should work like this:
from super_gradients.common.object_names import Models
from super_gradients.training import models
model = models.get(Models.YOLO_NAS_L, pretrained_weights='coco')
model.predict("./img.jpg", conf=0.45).save("output")
Does that all make sense?
from super-gradients.
ToTensor applies some transformations which currently break compatibility with our predict, but there are workarounds:
1. model.predict(...) expects the input images to be "channel last". We will improve that in the near future, but currently you need to make sure the image is channel last (this is why we recommend, when possible, just passing the URL/path of the image(s) so that the model handles it for you).
Solution: add transforms.Lambda(lambda x: x.permute(1, 2, 0)) to the transform OR do it afterward:
transformed_image = pre_proccess(image).permute(1, 2, 0)
2. The input image is expected to be scaled to [0-255], whereas ToTensor scales it to [0-1].
Solution: add transforms.Lambda(lambda x: x*255) OR do it afterward:
transformed_image = pre_proccess(image) * 255
2.bis The image should be in [0-255], so you should either drop transforms.Normalize([.485, .456, .406], [.229, .224, .225]) or undo the transformation.
3. The image should be of type uint8 (this is not desired, and we need to fix it). We will support this natively very soon, so you won't have to do the conversion yourself.
Solution: model.predict(transformed_image.numpy().astype(np.uint8), conf=0.45)
4. This is a bug on our side, but passing a batch is not supported (by that, I mean a tensor of shape [batch_size, C, H, W]).
Solution: In your case, don't call .unsqueeze(0). If you do have a batch, you can simply do:
images = [image for image in images] # from [BS, C, H, W] to a List of BS Tensors, each of shape [C, H, W]
Note that we process multiple images in batches anyway; this does not affect performance and is simply due to how we parse the input images. We will resolve it soon.
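As an illustration of workarounds 1 and 4 combined, the NumPy sketch below turns a dummy [BS, C, H, W] batch into a list of channel-last images. The batch size and the 8x8 resolution are arbitrary, and the arrays are placeholders rather than real images; it only shows the reshaping, not a call into the library.

```python
import numpy as np

# Dummy batch in [BS, C, H, W] layout: 4 images, 3 channels, 8x8 pixels.
batch = np.zeros((4, 3, 8, 8), dtype=np.uint8)

# Iterating over the first axis yields one [C, H, W] array per image;
# each one is then permuted to channel-last [H, W, C].
images = [np.transpose(image, (1, 2, 0)) for image in batch]

print(len(images))      # 4
print(images[0].shape)  # (8, 8, 3)
```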
All together
from super_gradients.common.object_names import Models
from super_gradients.training import models
import torchvision.transforms as transforms
from PIL import Image
import numpy as np
model = models.get(Models.YOLO_NAS_L, pretrained_weights="coco")
image = Image.open("../../../../documentation/source/images/examples/countryside.jpg")
pre_proccess = transforms.Compose([
    transforms.Resize([640, 640]),
    transforms.ToTensor(),
    transforms.Lambda(lambda x: x.permute(1, 2, 0)),
    transforms.Lambda(lambda x: x * 255),
])
transformed_image = pre_proccess(image)
model.predict(transformed_image.numpy().astype(np.uint8), conf=0.45).show()
Of course, I encourage you to use model.predict(image_path) or model.predict(image_without_any_processing) whenever possible, but if for some reason you don't have access to the original image, this is how I would do it :)
from super-gradients.
I have tried the image_processor steps, but I still get similar results. Even though I pass the required parameters, it still says they are missing, e.g. mean and std in NormalizeImage.
['ComposeProcessing', 'ImagePermute', 'ReverseImageChannels', 'StandardizeImage', 'NormalizeImage', 'DetectionCenterPadding', 'DetectionBottomRightPadding', 'DetectionRescale', 'DetectionLongestMaxSizeRescale']
from super-gradients.
@Louis-Dupont could you kindly help me with this?
from super-gradients.
Hi @ajithkumarmcw, the syntax would be the following:
image_processor={"NormalizeImage": {"mean": [1.0, 1.0, 1.0], "std": [1.0, 1.0, 1.0]}}
(the values are per-channel)
This syntax is meant for our config files (the recipes), but you can also do all of this with regular Python objects:
from super_gradients.training.processing import NormalizeImage
model.set_dataset_processing_params(image_processor=NormalizeImage(mean=[1.0, 1.0, 1.0], std=[1.0, 1.0, 1.0]))
That being said, this might not be required if you are using a model fine-tuned with SG. Unless you are fine-tuning the model without using SG recipes, you can just ignore this parameter, since it is about "how to make sure the model gets the image in the right format", which SG handles automatically.
Besides, setting image_processor=NormalizeImage means that no image processing other than normalization is applied, which can be enough for some models but not for YoloNAS, where it would lead to an error.
What is your motivation for overriding image_processor? Is it out of curiosity or because you have a custom training?
Also, I saw you set checkpoint_path="./yolo_nas_l_coco.pth". Are these the default pretrained weights or the weights of a model you fine-tuned?
Feel free to let me know more about your use case if this doesn't answer your question :)
from super-gradients.
Hi @Louis-Dupont, thanks for the detailed reply. My goal is to run a fine-tuned model on a single image. For now I am using "./yolo_nas_l_coco.pth", which is a pretrained weights file provided by you.
from super_gradients.common.object_names import Models
from super_gradients.training import models
from super_gradients.training.processing import NormalizeImage
import torchvision.transforms as transforms
from PIL import Image
model = models.get(Models.YOLO_NAS_L,
                   checkpoint_path="./yolo_nas_l_coco.pth",
                   num_classes=80)
# Get PIL image
image = Image.open("./img.jpg")
# Prepare preprocess transformations
pre_proccess = transforms.Compose([
    transforms.Resize([640, 640]),
    transforms.ToTensor(),
    transforms.Normalize([.485, .456, .406], [.229, .224, .225])
])
# Run preprocess on image. unsqueeze for [Batch x Channels x Height x Width] format
transformed_image = pre_proccess(image).unsqueeze(0)
model.predict(transformed_image, conf=0.45).save("output")
Above is my script, but whenever I run it I get the error below:
Traceback (most recent call last):
File "/~/yolo_nas_predict_1.py", line 29, in <module>
model.predict(transformed_image, conf=0.45).save("output")
File "/~/super_gradients/training/models/detection_models/customizable_detector.py", line 174, in predict
pipeline = self._get_pipeline(iou=iou, conf=conf)
File "/~/super_gradients/training/models/detection_models/customizable_detector.py", line 151, in _get_pipeline
raise RuntimeError(
RuntimeError: You must set the dataset processing parameters before calling predict.
Please call `model.set_dataset_processing_params(...)` first.
That's why I added the `model.set_dataset_processing_params(...)` call.
Even after adding it, I get the following error:
model.set_dataset_processing_params( class_names=["0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26", "27", "28", "29", "30", "31", "32", "33", "34", "35", "36", "37", "38", "39", "40", "41", "42", "43", "44", "45", "46", "47", "48", "49", "50", "51", "52", "53", "54", "55", "56", "57", "58", "59", "60", "61", "62", "63", "64", "65", "66", "67", "68", "69", "70", "71", "72", "73", "74", "75", "76", "77", "78", "79"],
image_processor=NormalizeImage(mean=[0, 0, 0], std=[0, 0, 0]),
iou=0.35,conf=0.25,
)
Traceback (most recent call last):
File "/~/yolo_nas/yolo_nas_predict_1.py", line 28, in <module>
model.predict(transformed_image, conf=0.45).save("output")
File "/~/yolonas/lib/python3.9/site-packages/super_gradients/training/models/prediction_results.py", line 196, in save
for i, prediction in enumerate(self._images_prediction_lst):
File "~/super_gradients/training/pipelines/pipelines.py", line 133, in _generate_prediction_result
yield from self._generate_prediction_result_single_batch(batch_images)
File "/~/super_gradients/training/pipelines/pipelines.py", line 151, in _generate_prediction_result_single_batch
preprocessed_image, processing_metadata = self.image_processor.preprocess_image(image=image.copy())
File "/~super_gradients/training/processing/processing.py", line 162, in preprocess_image
return (image - self.mean) / self.std, None
ValueError: operands could not be broadcast together with shapes (1,3,640,640) (1,1,3)
from super-gradients.
It works now and I am able to predict. This is the script that worked, thanks for your help:
from super_gradients.common.object_names import Models
from super_gradients.training import models
from super_gradients.training.processing import NormalizeImage
import torchvision.transforms as transforms
from PIL import Image
from super_gradients.training.processing import DetectionCenterPadding, StandardizeImage, NormalizeImage, ImagePermute, ComposeProcessing, DetectionLongestMaxSizeRescale
model = models.get(Models.YOLO_NAS_L,
                   checkpoint_path="./yolo_nas_l_coco.pth",
                   num_classes=80)
# Get PIL image
image = Image.open("./img.jpg")
model.set_dataset_processing_params( class_names=["0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19", "20", "21", "22", "23", "24", "25", "26", "27", "28", "29", "30", "31", "32", "33", "34", "35", "36", "37", "38", "39", "40", "41", "42", "43", "44", "45", "46", "47", "48", "49", "50", "51", "52", "53", "54", "55", "56", "57", "58", "59", "60", "61", "62", "63", "64", "65", "66", "67", "68", "69", "70", "71", "72", "73", "74", "75", "76", "77", "78", "79"],
    image_processor=ComposeProcessing(
        [
            DetectionLongestMaxSizeRescale(output_shape=(636, 636)),
            DetectionCenterPadding(output_shape=(640, 640), pad_value=114),
            StandardizeImage(max_value=255.0),
            ImagePermute(permutation=(2, 0, 1)),
        ]
    ),
    iou=0.35, conf=0.25,
)
model.predict(image, conf=0.45).save("output")
from super-gradients.
Yes, thanks for the detailed explanation. Is there a reason for this error:
ValueError: operands could not be broadcast together with shapes (1,3,640,640) (1,1,3)
and how could it be fixed without removing the transforms?
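For what it's worth, the shape mismatch itself can be reproduced with plain NumPy. The sketch below assumes, based on the traceback above, that the processing step computes (image - mean) / std with a mean of shape (1, 1, 3), i.e. one laid out for a channel-last (H, W, C) image; the arrays are dummies and the variable names are made up.

```python
import numpy as np

# Per-channel mean laid out for a channel-last (H, W, C) image.
mean = np.zeros((1, 1, 3), dtype=np.float32)

# Channel-first batch, the shape produced by ToTensor() followed by unsqueeze(0).
batched_chw = np.zeros((1, 3, 640, 640), dtype=np.float32)
try:
    batched_chw - mean  # trailing axes are 640 vs 3, so broadcasting fails
    broadcast_failed = False
except ValueError as err:
    broadcast_failed = True
    print(err)

# A single channel-last image broadcasts fine: its trailing axis (3) matches the mean's.
hwc = np.transpose(batched_chw[0], (1, 2, 0))
centered = hwc - mean
print(centered.shape)  # (640, 640, 3)
```

This matches the workarounds given elsewhere in the thread: drop the batch dimension and pass a channel-last image.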
from super-gradients.
thanks a lot
from super-gradients.