
Comments (18)

qingqing01 commented on May 22, 2024

@lyw615 This operation is not supported yet. You could refer to the multi-scale evaluation used for face detection: https://github.com/PaddlePaddle/PaddleDetection/blob/release/0.1/tools/face_eval.py#L75

Unlike face detection, though, you can group the 512 x 512 crops into one batch and then run prediction on that batch.
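
As a rough illustration of the cropping idea above, the sketch below computes overlapping 512 x 512 windows over a large image. The function names, the 64-pixel overlap, and the 3600 x 2400 example size are assumptions made for illustration; none of this is part of PaddleDetection.

```python
def tile_starts(length, tile, overlap):
    """Start offsets so that windows of size `tile` cover [0, length)."""
    assert 0 <= overlap < tile, "overlap must be smaller than the tile size"
    if length <= tile:
        return [0]
    stride = tile - overlap
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # last window sits flush with the border
    return starts


def tile_windows(height, width, tile=512, overlap=64):
    """Return (x0, y0, x1, y1) windows that cover a height x width image."""
    return [(x0, y0, min(x0 + tile, width), min(y0 + tile, height))
            for y0 in tile_starts(height, tile, overlap)
            for x0 in tile_starts(width, tile, overlap)]


if __name__ == "__main__":
    windows = tile_windows(height=2400, width=3600)
    print(len(windows), windows[0], windows[-1])
```

Each window can then be cut out with image[y0:y1, x0:x1]. If all crops have the same size, they can be stacked (for example with np.stack) into one batch after preprocessing.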

from paddledetection.

qingqing01 commented on May 22, 2024

According to #39, you are using YOLOv3.

If you only need prediction, you can also export the model; see https://github.com/PaddlePaddle/PaddleDetection/blob/release/0.1/docs/EXPORT_MODEL.md . Then use the following code as a starting point for adding multi-scale prediction. I think this may be easier than modifying the data feed.

import paddle.fluid as fluid
import numpy as np
import cv2


def Permute(im, channel_first=True, to_bgr=False):
    if channel_first:
        im = np.swapaxes(im, 1, 2)
        im = np.swapaxes(im, 1, 0)
    if to_bgr:
        im = im[[2, 1, 0], :, :]
    return im


def DecodeImage(im_path):
    with open(im_path, 'rb') as f:
        im = f.read()
    data = np.frombuffer(im, dtype='uint8')
    im = cv2.imdecode(data, 1)  # BGR mode, but need RGB mode
    im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
    return im


def ResizeImage(im, target_size=608, max_size=0):
    if len(im.shape) != 3:
        raise ValueError('image is not 3-dimensional.')  # ImageError is not defined in this script
    im_shape = im.shape
    print(im_shape)
    im_size_min = np.min(im_shape[0:2])
    im_size_max = np.max(im_shape[0:2])
    if float(im_size_min) == 0:
        raise ZeroDivisionError('min size of image is 0')
    if max_size != 0:
        im_scale = float(target_size) / float(im_size_min)
        # Prevent the biggest axis from being more than max_size
        if np.round(im_scale * im_size_max) > max_size:
            im_scale = float(max_size) / float(im_size_max)
        im_scale_x = im_scale
        im_scale_y = im_scale
    else:
        im_scale_x = float(target_size) / float(im_shape[1])
        im_scale_y = float(target_size) / float(im_shape[0])
    
    im = cv2.resize(
             im,
             None,
             None,
             fx=im_scale_x,
             fy=im_scale_y,
             interpolation=2)
    return im


def NormalizeImage(im,mean = [0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225], is_scale=True):
    """Normalize the image.
    Operators:
        1.(optional) Scale the image to [0,1]
        2. Each pixel minus mean and is divided by std
    """
    im = im.astype(np.float32, copy=False)
    mean = np.array(mean)[np.newaxis, np.newaxis, :]
    std = np.array(std)[np.newaxis, np.newaxis, :]
    if is_scale:
        im = im / 255.0
    im -= mean
    im /= std
    return im


def Prepocess(img_path):
    test_img = DecodeImage(img_path)
    test_img = ResizeImage(test_img)
    test_img = NormalizeImage(test_img)
    test_img = Permute(test_img)
    test_img = test_img[np.newaxis,:]#.reshape(1, 3, 608, 608)
    np.save('infer_pre.npy', test_img)
    return test_img

def train():
    infer_prog = fluid.Program()
    startup_prog = fluid.Program()
    
    place = fluid.CUDAPlace(0)
    exe = fluid.Executor(place)
    exe.run(startup_prog)
    
    path = "output/yolov3_darknet/"
    img_path = "demo/1.jpg"
    
    test_img = Prepocess(img_path)
    print("shape of test_img:", test_img.shape)
    img_shape = np.array([608, 608]).reshape(1, 2)
    img_shape = img_shape.astype('int32')
    print(img_shape.dtype)
    #exit()
    [inference_program, feed_target_names, fetch_targets] = (fluid.io.load_inference_model(
        dirname=path, executor=exe, model_filename='__model__', params_filename='__params__'))
    outs = exe.run(inference_program,
              feed={feed_target_names[0]: test_img, feed_target_names[1]: img_shape},
              fetch_list=fetch_targets,
              return_numpy=False)
    res = [
             (np.array(v), v.recursive_sequence_lengths())
             for v in outs
          ]
    print(res[0][0])
    np.save('infer.npy', res[0][0])

if __name__ == '__main__':
    train()


lyw615 commented on May 22, 2024

Thank you very much. I'll try it and report back.


heavengate commented on May 22, 2024

FYI, the image shape input in the code above is hard-coded to 608 x 608:

img_shape = np.array([608, 608]).reshape(1, 2)

The detected bboxes will therefore be rescaled to a 608 x 608 image. If you want the predicted bboxes in the scale of the original image, set img_shape to the original image height and width and rewrite the code as follows:

import paddle.fluid as fluid
import numpy as np
import cv2


def Permute(im, channel_first=True, to_bgr=False):
    if channel_first:
        im = np.swapaxes(im, 1, 2)
        im = np.swapaxes(im, 1, 0)
    if to_bgr:
        im = im[[2, 1, 0], :, :]
    return im


def DecodeImage(im_path):
    with open(im_path, 'rb') as f:
        im = f.read()
    data = np.frombuffer(im, dtype='uint8')
    im = cv2.imdecode(data, 1)  # BGR mode, but need RGB mode
    im = cv2.cvtColor(im, cv2.COLOR_BGR2RGB)
    return im


def ResizeImage(im, target_size=608, max_size=0):
    if len(im.shape) != 3:
        raise ValueError('image is not 3-dimensional.')  # ImageError is not defined in this script
    im_shape = im.shape
    print(im_shape)
    im_size_min = np.min(im_shape[0:2])
    im_size_max = np.max(im_shape[0:2])
    if float(im_size_min) == 0:
        raise ZeroDivisionError('min size of image is 0')
    if max_size != 0:
        im_scale = float(target_size) / float(im_size_min)
        # Prevent the biggest axis from being more than max_size
        if np.round(im_scale * im_size_max) > max_size:
            im_scale = float(max_size) / float(im_size_max)
        im_scale_x = im_scale
        im_scale_y = im_scale
    else:
        im_scale_x = float(target_size) / float(im_shape[1])
        im_scale_y = float(target_size) / float(im_shape[0])
    
    im = cv2.resize(
             im,
             None,
             None,
             fx=im_scale_x,
             fy=im_scale_y,
             interpolation=2)
    return im


def NormalizeImage(im,mean = [0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225], is_scale=True):
    """Normalize the image.
    Operators:
        1.(optional) Scale the image to [0,1]
        2. Each pixel minus mean and is divided by std
    """
    im = im.astype(np.float32, copy=False)
    mean = np.array(mean)[np.newaxis, np.newaxis, :]
    std = np.array(std)[np.newaxis, np.newaxis, :]
    if is_scale:
        im = im / 255.0
    im -= mean
    im /= std
    return im


def Prepocess(img_path):
    test_img = DecodeImage(img_path)
    img_shape = test_img.shape[:2]
    test_img = ResizeImage(test_img)
    test_img = NormalizeImage(test_img)
    test_img = Permute(test_img)
    test_img = test_img[np.newaxis,:]#.reshape(1, 3, 608, 608)
    np.save('infer_pre.npy', test_img)
    return test_img, img_shape

def train():
    infer_prog = fluid.Program()
    startup_prog = fluid.Program()
    
    place = fluid.CUDAPlace(0)
    exe = fluid.Executor(place)
    exe.run(startup_prog)
    
    path = "output/yolov3_darknet/"
    img_path = "demo/1.jpg"
    
    test_img, img_shape = Prepocess(img_path)
    print("shape of test_img:", test_img.shape)
    img_shape = np.array(img_shape).reshape(1, 2)
    img_shape = img_shape.astype('int32')
    print(img_shape.dtype)
    #exit()
    [inference_program, feed_target_names, fetch_targets] = (fluid.io.load_inference_model(
        dirname=path, executor=exe, model_filename='__model__', params_filename='__params__'))
    outs = exe.run(inference_program,
              feed={feed_target_names[0]: test_img, feed_target_names[1]: img_shape},
              fetch_list=fetch_targets,
              return_numpy=False)
    res = [
             (np.array(v), v.recursive_sequence_lengths())
             for v in outs
          ]
    print(res[0][0])
    np.save('infer.npy', res[0][0])

if __name__ == '__main__':
    train()


heavengate commented on May 22, 2024

For the output format of this code, please refer to https://www.paddlepaddle.org.cn/documentation/docs/en/api/layers/multiclass_nms.html
Each output row has the format (label, confidence, xmin, ymin, xmax, ymax).
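
To make the row layout concrete, here is a minimal sketch that filters rows of the form (label, confidence, xmin, ymin, xmax, ymax) by score. The function name and the 0.5 threshold are assumptions for illustration. Rows that do not have six values are skipped as a precaution, on the assumption that an empty result may not follow the six-column layout.

```python
def filter_detections(rows, score_threshold=0.5):
    """Keep rows of (label, confidence, xmin, ymin, xmax, ymax) above a score."""
    kept = []
    for row in rows:
        if len(row) != 6:
            continue  # not a regular detection row
        label, score, xmin, ymin, xmax, ymax = row
        if score >= score_threshold:
            kept.append({
                "label": int(label),
                "score": float(score),
                "bbox": (float(xmin), float(ymin), float(xmax), float(ymax)),
            })
    return kept


if __name__ == "__main__":
    # Hand-written example rows, not real model output.
    example = [[0, 0.92, 10, 20, 110, 220], [3, 0.12, 5, 5, 50, 50]]
    for det in filter_detections(example):
        print(det)
```

With the script above, something like filter_detections(res[0][0]) would apply this to the first fetched output.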


lyw615 commented on May 22, 2024

The images I run prediction on are normally 3600 x 2400. Feeding them directly into the network gives worse results than expected. The existing model settings are good enough, so I don't need to export the model. I just want to explore cropping each large image into several smaller images and see whether that gives a better result.


lyw615 commented on May 22, 2024

Can I crop a large image into several smaller arrays for prediction, rather than into separate image files? That was my original intention.


lyw615 commented on May 22, 2024

[inference_program, feed_target_names, fetch_targets] = (fluid.io.load_inference_model(
    dirname=path, executor=exe, model_filename='__model__', params_filename='__params__'))

What does this mean? I set dirname to the path where the vehicle_yolov3_darknet model is saved, but I get this error: FileNotFoundError: [Errno 2] No such file or directory: 'G:\git_download\model\car\vehicle_yolov3_darknet\model'


qingqing01 commented on May 22, 2024

Regarding the fluid.io.load_inference_model call and the FileNotFoundError quoted above:

  • dirname: the directory that contains the exported model
  • model_filename: the name of the model file inside dirname
  • params_filename: the name of the parameter file inside dirname

No such file or directory: 'G:\git_download\model\car\vehicle_yolov3_darknet\model'

Please make sure the path exists.
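
One way to catch this before calling fluid.io.load_inference_model is to check that both files exist inside dirname. The helper below is a hypothetical sketch that uses only the standard library; its name is an assumption, and the default file names are taken from the script earlier in this thread.

```python
import os


def missing_model_files(dirname,
                        model_filename="__model__",
                        params_filename="__params__"):
    """Return the expected model/params paths that do not exist on disk."""
    paths = [os.path.join(dirname, name)
             for name in (model_filename, params_filename)]
    return [p for p in paths if not os.path.isfile(p)]


if __name__ == "__main__":
    # "output/yolov3_darknet/" is the directory used in the script above.
    for path in missing_model_files("output/yolov3_darknet/"):
        print("missing:", path)
```

An empty list means both files were found where load_inference_model will look for them.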


qingqing01 commented on May 22, 2024

Quote: "only crop a large size image into some smaller size array for prediction"

If you only want to test one cropped image and still want to use tools/infer.py, you may need to add a crop operator in https://github.com/PaddlePaddle/PaddleDetection/blob/release/0.1/ppdet/data/transform/operators.py

and then change YoloTestFeed in the yml config:

YoloTestFeed:
  batch_size: 1
  image_shape: [3, 608, 608]
  dataset:
    annotation: dataset/coco/annotations/instances_val2017.json
  sample_transforms:
    - !DecodeImage
      to_rgb: True
      with_mixup: False
    - !ResizeImage  # or change to crop
      interp: 2
      target_size: 608
    - !NormalizeImage
      mean:
      - 0.485
      - 0.456
      - 0.406
      std:
      - 0.229
      - 0.224
      - 0.225
      is_scale: False
      is_channel_first: False
    - !Permute
      to_bgr: False

I'm sorry if I have not fully understood what you mean.


lyw615 commented on May 22, 2024

Does dirname mean the absolute path to vehicle_yolov3_darknet, i.e. the weights path used by infer.py?
Does params mean the file vehicle_yolov3_darknet.yml?
What is model_filename?
It is not just a single image but many large images. After reading an image's pixel values as a numpy array, I want to crop the array into several smaller arrays for prediction and then compute the bbox coordinates on the large image.


lyw615 commented on May 22, 2024

# Pseudocode: crop() and model.predict_single_image() are placeholders.
for img in infer_images:
    result_bboxes = []
    image_array = cv2.imread(img)
    # crop() returns a list such as [small_array1, small_array2, small_array3, ...]
    small_arrays = crop(image_array)
    for array in small_arrays:
        bbox = model.predict_single_image(array)
        result_bboxes.append(bbox)
    for bbox in result_bboxes:
        # map the bbox back onto img and draw it
        pass
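
The last step of the pseudocode, mapping each tile's boxes back onto the large image, could be sketched as follows. This assumes each detection is a list [label, score, xmin, ymin, xmax, ymax] in tile-local pixels and that the top-left corner (x0, y0) of every tile is known. The function names and the greedy per-class suppression of duplicates from overlapping tiles are illustrative choices, not something PaddleDetection provides in this form.

```python
def to_full_image(detections, x0, y0):
    """Shift tile-local boxes by the tile's top-left corner (x0, y0)."""
    return [[label, score, xmin + x0, ymin + y0, xmax + x0, ymax + y0]
            for label, score, xmin, ymin, xmax, ymax in detections]


def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union


def merge_detections(detections, iou_threshold=0.5):
    """Greedy per-class suppression of duplicates from overlapping tiles."""
    kept = []
    for det in sorted(detections, key=lambda d: d[1], reverse=True):
        # Keep a box unless a higher-scoring box of the same class overlaps it.
        if all(det[0] != k[0] or iou(det[2:], k[2:]) < iou_threshold
               for k in kept):
            kept.append(det)
    return kept


if __name__ == "__main__":
    # Two overlapping tiles that both see the same object (made-up numbers).
    from_tile_a = to_full_image([[0, 0.9, 10, 20, 110, 120]], x0=448, y0=0)
    from_tile_b = to_full_image([[0, 0.8, 458, 20, 558, 120]], x0=0, y0=0)
    print(merge_detections(from_tile_a + from_tile_b))
```

Overlapping the tiles slightly helps objects that straddle a tile border, at the cost of needing a merge step like this one.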


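qingqing01 commented on May 22, 2024

Let me try to explain this more clearly.

Two different approaches were mentioned above:

  1. Use tools/infer.py directly. I think it is hard to do what you describe this way. The current reader code is not friendly enough, and we are simplifying its logic.
  2. Use the code we posted above: load the model with fluid.io.load_inference_model and write the run logic yourself, which is simpler. To use that code, you first need to export the model:
python tools/export_model.py -c configs/yolov3_darknet.yml \
        --output_dir=./inference_model \
        -o weights=https://paddlemodels.bj.bcebos.com/object_detection/vehicle_yolov3_darknet.tar \
           YoloTestFeed.image_shape=[3,320,320]

After it finishes, you will see the __model__ and __params__ files under the inference_model directory. In the code above, dirname is inference_model, model_filename is __model__, and params_filename is __params__.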


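lyw615 commented on May 22, 2024

The current image shape (3, 608, 608) actually gives fairly good predictions on my data. My images are just too large, so I want to try cutting the loaded image array into several pieces and see whether prediction improves. If I export the model with YoloTestFeed.image_shape=[3,320,320], will it be hard to reproduce the results of the current project settings? Can I export it as (3, 608, 608) instead?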


lyw615 commented on May 22, 2024

python tools/export_model.py -c configs/yolov3_darknet.yml --output_dir=./inference_model -o weights=../model/car_p/vehicle_yolov3_darknet YoloTestFeed.image_shape=[3,608,608]
2019-11-27 15:09:45,628-INFO: Loading parameters from ../model/car_p/vehicle_yolov3_darknet...
Traceback (most recent call last):
  File "tools/export_model.py", line 120, in <module>
    main()
  File "tools/export_model.py", line 107, in main
    checkpoint.load_params(exe, infer_prog, cfg.weights)
  File "./ppdet/utils/checkpoint.py", line 118, in load_params
    fluid.io.load_vars(exe, path, prog, predicate=_if_exist)
  File "/software/conda/envs/super_mask/lib/python3.6/site-packages/paddle/fluid/io.py", line 682, in load_vars
    filename=filename)
  File "/software/conda/envs/super_mask/lib/python3.6/site-packages/paddle/fluid/io.py", line 741, in load_vars
    format(orig_shape, each_var.name, new_shape))
RuntimeError: Shape not matching: the Program requires a parameter with a shape of ((255, 1024, 1, 1)), while the loaded parameter (namely [ yolo_output.0.conv.weights ]) has a shape of ((33, 1024, 1, 1)).

Running the command as instructed produces this error, even though the downloaded vehicle detection weights do predict successfully.


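qingqing01 commented on May 22, 2024

Quote: python tools/export_model.py -c configs/yolov3_darknet.yml --output_dir=./inference_model -o weights=../model/car_p/vehicle_yolov3_darknet YoloTestFeed.image_shape=[3,608,608]

  1. The config file is wrong: configs/yolov3_darknet.yml is for the COCO dataset. You need to specify the config that matches the vehicle detection model.
  2. Please read the model export documentation carefully: the exported shape only matters for TensorRT inference, which requires a fixed shape. Python and plain C++ inference support variable-size input.
  3. Also, YOLOv3 should be able to predict at the original image size.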
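
The 255 versus 33 mismatch in the error is consistent with point 1. In the usual YOLOv3 layout, each output scale predicts 3 anchors, and each anchor carries 4 box coordinates, 1 objectness score, and one score per class. The short sketch below works through that arithmetic; the function names are illustrative, and the anchor count of 3 is the standard YOLOv3 setting assumed here.

```python
def yolo_head_channels(num_classes, anchors_per_scale=3):
    """Output channels of one YOLOv3 head: anchors * (4 box + 1 obj + classes)."""
    return anchors_per_scale * (num_classes + 5)


def classes_from_channels(channels, anchors_per_scale=3):
    """Invert the formula above to recover the class count."""
    return channels // anchors_per_scale - 5


print(yolo_head_channels(80))     # 255: COCO config with 80 classes
print(classes_from_channels(33))  # 6: class count implied by the loaded weights
```

So the COCO config builds an 80-class head, while the loaded vehicle weights were trained with 6 classes, which is why the shapes do not match.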


qingqing01 commented on May 22, 2024

Related to #46.


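lyw615 commented on May 22, 2024

Using the code you provided in #44 (comment), I managed to process single images. Until now I had been running everything on Linux. Recently I started running it on Windows 7 with a GPU, and processing the first image raises the error below. If I skip that image and continue, everything works normally. /software/conda/envs/super_mask is my Python path on Linux, and I don't know why it shows up here: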

C++ Call Stacks (More useful to developers):
--------------------------------------------
Windows not support stack backtrace yet.

------------------------------------------
Python Call Stacks (More useful to users):
------------------------------------------
  File "/software/conda/envs/super_mask/lib/python3.6/site-packages/paddle/fluid/framework.py", line 2459, in append_op
    attrs=kwargs.get("attrs", None))
  File "/software/conda/envs/super_mask/lib/python3.6/site-packages/paddle/fluid/layer_helper.py", line 43, in append_op
    return self.main_program.current_block().append_op(*args, **kwargs)
  File "/software/conda/envs/super_mask/lib/python3.6/site-packages/paddle/fluid/layers/nn.py", line 2803, in conv2d
    "data_format": data_format,
  File "./ppdet/modeling/backbones/darknet.py", line 69, in _conv_norm
    bias_attr=False)
  File "./ppdet/modeling/backbones/darknet.py", line 151, in __call__
    name=self.prefix_name + "yolo_input")
  File "./ppdet/modeling/architectures/yolov3.py", line 56, in build
    body_feats = self.backbone(im)
  File "./ppdet/modeling/architectures/yolov3.py", line 86, in test
    return self.build(feed_vars, mode='test')
  File "tools/export_model.py", line 103, in main
    test_fetches = model.test(feed_vars)
  File "tools/export_model.py", line 120, in <module>
    main()

----------------------
Error Message Summary:
----------------------
Error: An error occurred here. There is no accurate error hint for this error yet. We are continuously in the process of increasing hint for this kind of error check. It would be helpful if you could inform us of how this conversion went by opening a github issue. And we will resolve it with high priority.
  - New issue link: https://github.com/PaddlePaddle/Paddle/issues/new
  - Recommended issue content: all error stack information
  [Hint: CUDNN_STATUS_EXECUTION_FAILED] at (D:/1.7.2/paddle/paddle/fluid/operators/conv_cudnn_op.cu:286)
  [operator < conv2d > error]

Configuration: Windows 7
GPU: RTX 2060 SUPER
Python 3.6
paddle-gpu 1.7.2
It runs normally with the non-GPU build of Paddle.
