
scaled-yolov4-tensorflow2's Introduction

Scaled-YOLOv4-tensorflow2

Python 3.7 TensorFlow 2.4

A TensorFlow 2.x implementation of Scaled-YOLOv4, as described in the paper "Scaled-YOLOv4: Scaling Cross Stage Partial Network".

Update Log

[2021-07-02]:

  • Add support for: exponential moving average (EMA) decay for variables. This improves mAP from 0.985 to 0.990 on the Chess Pieces dataset.

[2021-06-29]:

Major Features and Improvements:

  • Add support for: Sharpness-Aware Minimization(SAM_sgd,SAM_adam).

Bug Fixes and Changes:

  • Fix the NaN loss error when using the adam optimizer
  • Set default optimizer as SAM_adam
  • Change default running mode from 'fit' to 'eager mode'
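
Sharpness-Aware Minimization, listed above, takes two gradient evaluations per update. It first moves the weights a small distance rho along the normalized gradient, then applies the gradient measured at that perturbed point to the original weights. The following is a minimal plain-Python sketch on a toy quadratic loss. The names `sam_step`, `toy_loss`, `toy_grad` and the values of `lr` and `rho` are illustrative assumptions, not the repository's SAM_sgd or SAM_adam code.

```python
import math


def toy_loss(w):
    # Toy objective used only for illustration: f(w) = sum(w_i ** 2).
    return sum(x * x for x in w)


def toy_grad(w):
    # Analytic gradient of the toy objective.
    return [2.0 * x for x in w]


def sam_step(w, lr=0.1, rho=0.05):
    """One SAM update with a plain SGD base step (hypothetical helper)."""
    g = toy_grad(w)
    norm = math.sqrt(sum(x * x for x in g)) + 1e-12
    # Step 1: move to the approximate worst-case point within radius rho.
    w_adv = [x + rho * gx / norm for x, gx in zip(w, g)]
    # Step 2: evaluate the gradient there and apply it to the ORIGINAL weights.
    g_adv = toy_grad(w_adv)
    return [x - lr * gx for x, gx in zip(w, g_adv)]


start = [1.0, -2.0]
weights = list(start)
for _ in range(20):
    weights = sam_step(weights)
```

Each step costs two gradient computations, so a SAM optimizer would be expected to train roughly half as fast per step as its base optimizer.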

[2021-06-27] Add support for: resuming training from checkpoints.

[2021-02-21] Add support for: model.fit (dramatic improvement in GPU utilization); online coco evaluation callback. Change default optimizer from sgd to adam.

[2021-02-11] Add support for: one-click deployment using tensorflow Serving(very fast)

[2021-01-29] Add support for: mosaic,ssd_random_crop

[2021-01-25] Add support for: ciou loss,hard-nms,DIoU-nms,label_smooth,transfer learning,tensorboard

[2021-01-23] Add support for: scales_x_y/eliminate grid sensitivity,accumulate gradients for using big batch size,focal loss,diou loss
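
For readers unfamiliar with the DIoU loss named in this entry: it extends IoU with a penalty equal to the squared distance between box centers divided by the squared diagonal of the smallest box enclosing both. The sketch below works on single boxes given as (x1, y1, x2, y2). The function names are illustrative and this is not the repository's loss code, which presumably operates on batched tensors.

```python
def iou_and_diou(a, b):
    """Return (IoU, DIoU) for two boxes in (x1, y1, x2, y2) form."""
    # Intersection rectangle.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0

    # Squared distance between the two box centers.
    cax, cay = (a[0] + a[2]) / 2.0, (a[1] + a[3]) / 2.0
    cbx, cby = (b[0] + b[2]) / 2.0, (b[1] + b[3]) / 2.0
    center_dist2 = (cax - cbx) ** 2 + (cay - cby) ** 2

    # Squared diagonal of the smallest box enclosing both.
    ex1, ey1 = min(a[0], b[0]), min(a[1], b[1])
    ex2, ey2 = max(a[2], b[2]), max(a[3], b[3])
    diag2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2

    diou = iou - center_dist2 / diag2 if diag2 > 0 else iou
    return iou, diou


def diou_loss(a, b):
    # The loss is 0 for identical boxes and grows as boxes move apart.
    return 1.0 - iou_and_diou(a, b)[1]
```

Unlike plain IoU loss, the center-distance term still provides a signal when the boxes do not overlap at all.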

[2021-01-16] Add support for: warmup,Cosine annealing scheduler,Eager mode training with tf.GradientTape,support voc/coco dataset format
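
As a rough illustration of how warmup and cosine annealing combine, the learning rate can rise linearly for the first few steps and then follow half a cosine wave down to a minimum. The function below is a sketch under that assumption. Its name, parameters and the linear warmup shape are illustrative and may differ from the schedule train.py actually uses.

```python
import math


def learning_rate(step, total_steps, warmup_steps, base_lr, min_lr=0.0):
    """Linear warmup followed by cosine annealing (hypothetical helper)."""
    if step < warmup_steps:
        # Ramp up linearly so the last warmup step reaches base_lr.
        return base_lr * (step + 1) / warmup_steps
    # Fraction of the post-warmup schedule that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Example: sample the schedule for a 100-step run with 10 warmup steps.
schedule = [learning_rate(s, 100, 10, 0.01) for s in range(100)]
```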

[2021-01-10] Add support for: yolov4-tiny,yolov4-large p5/p6/p7,online coco evaluation,multi scale training

Demo

ScaledYOLOv4_p5_detection_result:

pothole_p5_detection_3.png chess_p5_detection.png

ScaledYOLOv4_tiny_detection_result:

safehat_tiny_detection_1.png safehat_tiny_detection_2.png

Installation

1. Clone project

git clone https://github.com/wangermeng2021/Scaled-YOLOv4-tensorflow2.git
cd Scaled-YOLOv4-tensorflow2

2. Install environment

  • install tensorflow (skip this step if it's already installed; test environment: tensorflow 2.4.0)
  • pip install -r requirements.txt
    

Note:

I strongly recommend using the voc dataset type (the default). My GPU is old, so the coco dataset type is not fully tested.

Training:

  • Download the pre-trained p5 COCO model and place it under the directory 'pretrained/ScaledYOLOV4_p5_coco_pretrain':
    https://drive.google.com/file/d/1glOCE3Y5Q5enW3rpVq3SmKDXzaKIw4YL/view?usp=sharing

  • Download the pre-trained p6 COCO model and place it under the directory 'pretrained/ScaledYOLOV4_p6_coco_pretrain':
    https://drive.google.com/file/d/1EymbpgiO6VkCCFdB0zSTv0B9yB6T9Fw1/view?usp=sharing

  • Download the pre-trained tiny COCO model and place it under the directory 'pretrained/ScaledYOLOV4_tiny_coco_pretrain':
    https://drive.google.com/file/d/1x15FN7jCAFwsntaMwmSkkgIzvHXUa7xT/view?usp=sharing

  • For training on the Pothole dataset (no download needed, it is already included in the project):
    p5(single scale):

    python train.py --use-pretrain True --model-type p5 --dataset-type voc --dataset dataset/pothole_voc --num-classes 1 --class-names pothole.names  --voc-train-set dataset_1,train --voc-val-set dataset_1,val  --epochs 200 --batch-size 4 --multi-scale 416 --augment ssd_random_crop 
    

    p5(multi scale):

    python train.py --use-pretrain True --model-type p5 --dataset-type voc --dataset dataset/pothole_voc --num-classes 1 --class-names pothole.names --voc-train-set dataset_1,train --voc-val-set dataset_1,val  --epochs 200 --batch-size 4 --multi-scale 320,352,384,416,448,480,512 --augment ssd_random_crop 
    
  • For training on the Chess Pieces dataset (no download needed, it is already included in the project):
    tiny(single scale):

    python train.py --use-pretrain True --model-type tiny --dataset-type voc --dataset dataset/chess_voc --num-classes 12 --class-names chess.names --voc-train-set dataset_1,train --voc-val-set dataset_1,val  --epochs 400 --batch-size 32 --multi-scale 416 --augment ssd_random_crop 
    

    tiny(multi scale):

    python train.py --use-pretrain True --model-type tiny --dataset-type voc --dataset dataset/chess_voc --num-classes 12 --class-names chess.names --voc-train-set dataset_1,train --voc-val-set dataset_1,val  --epochs 400 --batch-size 32 --multi-scale 320,352,384,416,448,480,512 --augment ssd_random_crop
    
    
  • For training with SAM_sgd on Chess Pieces dataset:

    python train.py --optimizer SAM_sgd --use-pretrain True --model-type tiny --dataset-type voc --dataset dataset/chess_voc --num-classes 12 --class-names chess.names --voc-train-set dataset_1,train --voc-val-set dataset_1,val  --epochs 400 --batch-size 32 --multi-scale 416 --augment ssd_random_crop 
    
  • For training with ema(Exponential Moving Average) on Chess Pieces dataset:

    python train.py --ema True --use-pretrain True --model-type tiny --dataset-type voc --dataset dataset/chess_voc --num-classes 12 --class-names chess.names --voc-train-set dataset_1,train --voc-val-set dataset_1,val  --epochs 400 --batch-size 32 --multi-scale 416 --augment ssd_random_crop 
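
An exponential moving average keeps a smoothed "shadow" copy of the weights that is updated after every training step and used for evaluation or export. The class below is a minimal sketch of that idea on plain Python lists. The class name, the default decay of 0.999 and the list-based weights are assumptions for illustration. The actual behavior behind `--ema True` is not shown in this README.

```python
class WeightEMA:
    """Keeps an exponential moving average of a list of weights (sketch)."""

    def __init__(self, weights, decay=0.999):
        self.decay = decay
        # The shadow copy starts as a snapshot of the initial weights.
        self.shadow = list(weights)

    def update(self, weights):
        # shadow <- decay * shadow + (1 - decay) * current weights
        d = self.decay
        self.shadow = [d * s + (1.0 - d) * w for s, w in zip(self.shadow, weights)]
        return self.shadow


# Usage: call update() after each optimizer step, evaluate with ema.shadow.
ema = WeightEMA([0.0], decay=0.5)
first = ema.update([1.0])
second = ema.update([1.0])
```

A decay close to 1 averages over many steps, which smooths out step-to-step noise in the weights.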
    

Tensorboard visualization:

Evaluation results(GTX2080,mAP@0.5):

model                         Chess Pieces   pothole   VOC   COCO
Scaled-YoloV4-tiny(416)       0.985          -         -     -
Scaled-YoloV4-tiny(416)+ema   0.990          -         -     -
AlexeyAB's YoloV4(416)        -              0.814     -     -
Scaled-YoloV4-p5(416)         -              0.826     -     -
  • Evaluation on Pothole dataset: tensorboard_pothole_p5.png
  • Evaluation on chess dataset: tensorboard_chess_tiny.png

Detection

  • For detection on Chess Pieces dataset:

    python3 detect.py --pic-dir images/chess_pictures --model-path output_model/best_model_tiny_0.985/1 --class-names dataset/chess.names --nms-score-threshold 0.1
    

    detection result:

    chess_p5_detection.png

  • For detection on Pothole dataset:

    python3 detect.py --pic-dir images/pothole_pictures --model-path output_model/best_model_p5_0.827/1 --class-names dataset/pothole.names --nms-score-threshold 0.1
    

    detection result:

    pothole_p5_detection_2.png
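
The `--nms-score-threshold 0.1` flag in the commands above presumably discards low-confidence boxes before non-maximum suppression. The sketch below shows that two-stage filtering with hard NMS in plain Python. The function names and the IoU threshold of 0.5 are illustrative assumptions. This is not the code in detect.py.

```python
def box_iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) form."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def hard_nms(boxes, scores, score_threshold=0.1, iou_threshold=0.5):
    """Return indices of kept boxes, highest score first."""
    # Stage 1: drop boxes whose score is below the threshold.
    candidates = [i for i, s in enumerate(scores) if s >= score_threshold]
    # Stage 2: visit boxes by descending score and drop any box that
    # overlaps an already kept box by more than iou_threshold.
    candidates.sort(key=lambda i: scores[i], reverse=True)
    keep = []
    for i in candidates:
        if all(box_iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60), (0, 0, 5, 5)]
scores = [0.9, 0.8, 0.7, 0.05]
kept = hard_nms(boxes, scores)
```

A low score threshold such as 0.1 keeps more candidate boxes, which raises recall at the cost of more false positives.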

Customized training

  • Convert your dataset to Pascal VOC format(you can use labelImg to generate VOC format dataset)
  • Generate class names file(such as xxx.names)
  • python train.py --use-pretrain True --model-type p5 --dataset-type voc --dataset your_dataset_root_dir --num-classes num_of_classes --class-names path_of_xxx.names --voc-train-set dataset_1,train --voc-val-set dataset_1,val  --epochs 200 --batch-size 8 --multi-scale 416  --augment ssd_random_crop 
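
The class names file from the second step can be derived from the VOC annotations themselves. The following sketch collects every `<object><name>` value from a folder of Pascal VOC XML files and writes one name per line. The one-name-per-line layout and the helper names are assumptions, so compare the result with the bundled pothole.names or chess.names before training.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def collect_class_names(annotation_dir):
    """Return the sorted set of object names found in VOC XML files."""
    names = set()
    for xml_path in Path(annotation_dir).glob("*.xml"):
        root = ET.parse(xml_path).getroot()
        # Each labeled box is an <object> element with a <name> child.
        for obj in root.iter("object"):
            name = obj.findtext("name")
            if name:
                names.add(name.strip())
    return sorted(names)


def write_names_file(names, out_path):
    # Assumed format: one class name per line.
    Path(out_path).write_text("\n".join(names) + "\n", encoding="utf-8")
```

The number of lines written is the value to pass as `--num-classes`.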
    

Deployment

TensorFlow Serving is a flexible, high-performance serving system for machine learning models, designed for production environments. It consists of two parts, a client and a server, and both can run on the same machine.

  • Navigate to deployment directory:
  cd  deployment/tfserving
  • Generate a docker image that contains your trained model (this takes a few minutes and only needs to be run once):
  ./gen_image --model-dir ScaledYOLOv4-tensorflow2/output_model/pothole/best_model_p5_0.811
  • Deploy model:
    • Server side( docker and nvidia-docker installed ):

      ./run_image

    • Client side(no need to install tensorflow):

      1. install client package

        pip install tfservingclient-1.0.0-cp37-cp37m-manylinux1_x86_64.whl

      2. predict images

        python demo.py --pic-dir xxxx --class-names xxx.names


scaled-yolov4-tensorflow2's Issues

Duplicate name in graph `ones`

Hello, with TF 2.1, training fails during model export. It throws an error in the box_decode function in box_coder.py. See the traceback:

Traceback (most recent call last):
  File "/Users/ondra/opt/anaconda3/envs/tf_21_py37/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1619, in _create_c_op
    c_op = c_api.TF_FinishOperation(op_desc)
tensorflow.python.framework.errors_impl.InvalidArgumentError: Duplicate node name in graph: 'ones'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/ondra/Downloads/ScaledYOLOv4-tensorflow2-main/train.py", line 272, in <module>
    main(args)
  File "/Users/ondra/Downloads/ScaledYOLOv4-tensorflow2-main/train.py", line 266, in main
    model = Yolov4(args, training=False)
  File "/Users/ondra/Downloads/ScaledYOLOv4-tensorflow2-main/model/yolov4.py", line 21, in Yolov4
    pre_nms_decoded_boxes, pre_nms__scores = postprocess(outputs,args)
  File "/Users/ondra/Downloads/ScaledYOLOv4-tensorflow2-main/model/postprocess.py", line 22, in postprocess
    decoded_boxes = box_decode(output[..., 0:4], args, index)
  File "/Users/ondra/Downloads/ScaledYOLOv4-tensorflow2-main/model/box_coder.py", line 22, in box_decode
    grid_xy = tf.stack(tf.meshgrid(tf.range(grid_width), tf.range(grid_height)), axis=-1)
  File "/Users/ondra/opt/anaconda3/envs/tf_21_py37/lib/python3.7/site-packages/tensorflow_core/python/ops/array_ops.py", line 3065, in meshgrid
    mult_fact = ones(shapes, output_dtype)
  File "/Users/ondra/opt/anaconda3/envs/tf_21_py37/lib/python3.7/site-packages/tensorflow_core/python/ops/array_ops.py", line 2671, in ones
    output = fill(shape, constant(one, dtype=dtype), name=name)
  File "/Users/ondra/opt/anaconda3/envs/tf_21_py37/lib/python3.7/site-packages/tensorflow_core/python/ops/array_ops.py", line 233, in fill
    result = gen_array_ops.fill(dims, value, name=name)
  File "/Users/ondra/opt/anaconda3/envs/tf_21_py37/lib/python3.7/site-packages/tensorflow_core/python/ops/gen_array_ops.py", line 3247, in fill
    "Fill", dims=dims, value=value, name=name)
  File "/Users/ondra/opt/anaconda3/envs/tf_21_py37/lib/python3.7/site-packages/tensorflow_core/python/framework/op_def_library.py", line 742, in _apply_op_helper
    attrs=attr_protos, op_def=op_def)
  File "/Users/ondra/opt/anaconda3/envs/tf_21_py37/lib/python3.7/site-packages/tensorflow_core/python/framework/func_graph.py", line 595, in _create_op_internal
    compute_device)
  File "/Users/ondra/opt/anaconda3/envs/tf_21_py37/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 3322, in _create_op_internal
    op_def=op_def)
  File "/Users/ondra/opt/anaconda3/envs/tf_21_py37/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1786, in __init__
    control_input_ops)
  File "/Users/ondra/opt/anaconda3/envs/tf_21_py37/lib/python3.7/site-packages/tensorflow_core/python/framework/ops.py", line 1622, in _create_c_op
    raise ValueError(str(e))
ValueError: Duplicate node name in graph: 'ones'

The problem is the repeated call to tf.meshgrid, which calls the ones function with the same name every time. This works for the first YOLO head, but fails the second time because the name already exists.

Maybe in higher TF versions the problem is handled at the framework level. My solution is to run the postprocess block in the for loop in postprocess.py (the for loop starts on line 18) under a unique name scope for each head (iteration).

training time is too high

I am trying to train a model (p5) on my own dataset.
It takes about 30 hours with Darknet, but far longer with the Scaled-YOLOv4-tensorflow2 code:
time elapsed: 1.813 hour, time left: 360.723 hour

Error showing up during detection

Hi,
When I try to run detect.py using this:
"python detect.py --pic-dir images/chess_pictures --model-path output_model/chess/best_model_tiny_0.985/1 --class-names dataset/chess.names --nms-score-threshold 0.1",
the error which shows up is:

Traceback (most recent call last):
File "detect.py", line 119, in <module>
main(args)
File "detect.py", line 95, in main
model = tf.keras.models.load_model(args.model_path)
File "C:\Users\tsath\anaconda3\envs\yolo\lib\site-packages\tensorflow\python\keras\saving\save.py", line 212, in load_model
return saved_model_load.load(filepath, compile, options)
File "C:\Users\tsath\anaconda3\envs\yolo\lib\site-packages\tensorflow\python\keras\saving\saved_model\load.py", line 147, in load
keras_loader.finalize_objects()
File "C:\Users\tsath\anaconda3\envs\yolo\lib\site-packages\tensorflow\python\keras\saving\saved_model\load.py", line 612, in finalize_objects
self._reconstruct_all_models()
File "C:\Users\tsath\anaconda3\envs\yolo\lib\site-packages\tensorflow\python\keras\saving\saved_model\load.py", line 631, in _reconstruct_all_models
self._reconstruct_model(model_id, model, layers)
File "C:\Users\tsath\anaconda3\envs\yolo\lib\site-packages\tensorflow\python\keras\saving\saved_model\load.py", line 677, in _reconstruct_model
created_layers) = functional_lib.reconstruct_from_config(
File "C:\Users\tsath\anaconda3\envs\yolo\lib\site-packages\tensorflow\python\keras\engine\functional.py", line 1285, in reconstruct_from_config
process_node(layer, node_data)
File "C:\Users\tsath\anaconda3\envs\yolo\lib\site-packages\tensorflow\python\keras\engine\functional.py", line 1222, in process_node
nest.flatten(inbound_node.outputs)[inbound_tensor_index])
IndexError: list index out of range

I used the default dataset (chess) for training and tried to detect using the saved model.
The tensorflow version running is: TF 2.4.1
Can someone help me with resolving this issue? Thanks!!

Training with datasets in COCO format

Hi, first of all, thank you for your work!

I'm trying to train on the COCO 2017 dataset and also on my own dataset in COCO format, but without success:

!python train.py --use-pretrain True\
                 --model-type p5\
                 --dataset-type coco\
                 --dataset dataset/coco/\
                 --num-classes 32\
                 --class-names coco.names\
                 --coco-train-set train2017\
                 --coco-valid-set val2017\
                 --epochs 200\
                 --batch-size 8\
                 --multi-scale 416\
                 --augment ssd_random_crop 

Output:

...
Load p5 weight successfully!
Tensorboard engine is running at http://localhost:6006/
loading dataset...
  0%|                                                   | 0/625 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 298, in <module>
    main(args)
  File "train.py", line 194, in main
    coco_map_callback = CocoMapCallback(pred_generator,model,args,mAP_writer)
  File "/tf/parkinto-object-detection/Scaled-YOLOv4-tensorflow2/utils/fit_coco_map.py", line 30, in __init__
    for batch_img, batch_boxes, batch_valids in pred_generator_tqdm:
  File "/usr/local/lib/python3.6/dist-packages/tqdm/std.py", line 1170, in __iter__
    for obj in iterable:
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/utils/data_utils.py", line 483, in __iter__
    for item in (self[i] for i in range(len(self))):
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/utils/data_utils.py", line 483, in <genexpr>
    for item in (self[i] for i in range(len(self))):
  File "/tf/parkinto-object-detection/Scaled-YOLOv4-tensorflow2/generator/coco_generator.py", line 183, in __getitem__
    y_true = get_y_true(self.max_side, batch_boxes, groundtruth_valids, self._args)
  File "/tf/parkinto-object-detection/Scaled-YOLOv4-tensorflow2/generator/get_y_true.py", line 92, in get_y_true
    grids[grid_index][batch_index][grid_xy[1]][grid_xy[0]][grid_anchor_index][5+batch_boxes[batch_index][box_index][4].astype(np.int32)] = 1
IndexError: index 44 is out of bounds for axis 0 with size 37

I'm using TF 2.4.1.

Is it possible to train on datasets in COCO format, or do I have to convert them to VOC format? Thank you.
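
One reading of the traceback above: the failing index is `5 + class_id`, the axis size 37 equals 5 + 32 (the `--num-classes 32` passed on the command line), and index 44 therefore corresponds to a class id of 39, which does not fit into 32 classes. COCO 2017 defines 80 categories, and its category ids are not contiguous, so raw ids can exceed the class count even when the count is right. The sketch below remaps arbitrary category ids to contiguous indices and validates them. The helper names are hypothetical, and whether coco_generator.py already performs such a remapping is not visible from the traceback.

```python
def build_class_index(category_ids):
    """Map arbitrary category ids to contiguous indices 0..N-1."""
    unique_ids = sorted(set(category_ids))
    return {cat_id: idx for idx, cat_id in enumerate(unique_ids)}


def check_labels(class_indices, num_classes):
    """Raise early if any label would overflow the one-hot class slots."""
    bad = sorted({c for c in class_indices if not 0 <= c < num_classes})
    if bad:
        raise ValueError(
            "class indices %s do not fit into num_classes=%d" % (bad, num_classes)
        )


# Example: three categories with sparse ids become indices 0, 1, 2.
mapping = build_class_index([1, 3, 90, 3])
check_labels(mapping.values(), num_classes=3)
```

Running a check like this once over the annotation file before training turns a late IndexError inside get_y_true into an immediate, readable error.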

Customized training error

When I trained on my own dataset, I got the following:
+-------------------------------------------+
loading dataset...
loading dataset...
100%|█████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:03<00:00, 1.16s/it]
creating index...
index created!
1e-06
0%| | 0/20 [00:00<?, ?it/s]2021-04-28 11:28:55.279041: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic librarylibcudnn.so.7
2021-04-28 11:28:57.187358: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic librarylibcublas.so.10
0%| | 0/20 [00:09<?, ?it/s]
Traceback (most recent call last):
File "train.py", line 298, in <module>
main(args)
File "train.py", line 235, in main
model_outputs = model(batch_imgs, training=True)
File "/opt/anaconda3/envs/pray2/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 968, in call
outputs = self.call(cast_inputs, *args, **kwargs)
File "/opt/anaconda3/envs/pray2/lib/python3.6/site-packages/tensorflow/python/keras/engine/network.py", line 719, in call
convert_kwargs_to_constants=base_layer_utils.call_context().saving)
File "/opt/anaconda3/envs/pray2/lib/python3.6/site-packages/tensorflow/python/keras/engine/network.py", line 888, in _run_internal_graph
output_tensors = layer(computed_tensors, **kwargs)
File "/opt/anaconda3/envs/pray2/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 968, in call
outputs = self.call(cast_inputs, *args, **kwargs)
File "/opt/anaconda3/envs/pray2/lib/python3.6/site-packages/tensorflow/python/keras/layers/merge.py", line 183, in call
return self._merge_function(inputs)
File "/opt/anaconda3/envs/pray2/lib/python3.6/site-packages/tensorflow/python/keras/layers/merge.py", line 522, in _merge_function
return K.concatenate(inputs, axis=self.axis)
File "/opt/anaconda3/envs/pray2/lib/python3.6/site-packages/tensorflow/python/keras/backend.py", line 2709, in concatenate
return array_ops.concat([to_dense(x) for x in tensors], axis)
File "/opt/anaconda3/envs/pray2/lib/python3.6/site-packages/tensorflow/python/util/dispatch.py", line 180, in wrapper
return target(*args, **kwargs)
File "/opt/anaconda3/envs/pray2/lib/python3.6/site-packages/tensorflow/python/ops/array_ops.py", line 1606, in concat
return gen_array_ops.concat_v2(values=values, axis=axis, name=name)
File "/opt/anaconda3/envs/pray2/lib/python3.6/site-packages/tensorflow/python/ops/gen_array_ops.py", line 1181, in concat_v2
_ops.raise_from_not_ok_status(e, name)
File "/opt/anaconda3/envs/pray2/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 6653, in raise_from_not_ok_status
six.raise_from(core._status_to_exception(e.code, message), None)
File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InvalidArgumentError: ConcatOp : Dimensions of inputs should match: shape[0] = [9,13,13,512] vs. shape[1] = [9,12,12,512] [Op:ConcatV2] name: concat
+-------------------------------+
Could you help me?
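
A spatial mismatch such as 13 vs. 12 in a concat commonly appears when the input side length is not a multiple of the network's largest stride, so the downsampling and upsampling branches round differently. Whether that is the cause here cannot be confirmed from the log, since the command line is not shown. As a sketch under that assumption, the helper below snaps a requested size to the nearest multiple of the stride. The name `snap_to_stride` and the default stride of 32 are illustrative, and larger model variants may need a larger stride.

```python
def snap_to_stride(size, stride=32):
    """Round size to the nearest multiple of stride, at least one stride."""
    snapped = (size + stride // 2) // stride * stride
    return max(stride, snapped)


def is_valid_size(size, stride=32):
    # A size is safe when every downsampling step divides evenly.
    return size > 0 and size % stride == 0


# Example: 400 is not a multiple of 32, 416 is.
requested = 400
usable = snap_to_stride(requested)
```

The README examples all use sizes such as 320, 352, 384 and 416 for `--multi-scale`, which are multiples of 32.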

library version

I'd like to know your environment: the versions of tensorflow, numpy, python, and pycocotools.

Please tell me the exact version numbers. My versions differ from yours, and that seems to cause an error partway through.

YOLOv4-P7 Nan error

Hello.

I've tried to train your YOLOv4-P7 model and previously was facing NaN error.

I have observed that you have updated loss function slightly to fix the issue and wonder whether the issue has been resolved, as mAP scores of P6 and P7 haven't been updated yet.

Has the issue been solved??

Unable to load checkpoint

I assumed that the way to load a saved checkpoint is the same as loading pretrained weight. However, when I try to load my own saved checkpoint and train again with the exact same data and exact same command, I got this error:

Traceback (most recent call last):
  File "train.py", line 310, in <module>
    main(args)
  File "train.py", line 151, in main
    pretrain_model.load_weights(args.p5_coco_pretrained_weights).expect_partial()
  File "C:\Program Files\Python38\lib\site-packages\tensorflow\python\keras\engine\training.py", line 2205, in load_weights
    status = self._trackable_saver.restore(filepath, options)
  File "C:\Program Files\Python38\lib\site-packages\tensorflow\python\training\tracking\util.py", line 1336, in restore
    base.CheckpointPosition(
  File "C:\Program Files\Python38\lib\site-packages\tensorflow\python\training\tracking\base.py", line 253, in restore
    restore_ops = trackable._restore_from_checkpoint_position(self)  # pylint: disable=protected-access
  File "C:\Program Files\Python38\lib\site-packages\tensorflow\python\training\tracking\base.py", line 972, in _restore_from_checkpoint_position
    current_position.checkpoint.restore_saveables(
  File "C:\Program Files\Python38\lib\site-packages\tensorflow\python\training\tracking\util.py", line 307, in restore_saveables
    new_restore_ops = functional_saver.MultiDeviceSaver(
  File "C:\Program Files\Python38\lib\site-packages\tensorflow\python\training\saving\functional_saver.py", line 345, in restore
    restore_ops = restore_fn()
  File "C:\Program Files\Python38\lib\site-packages\tensorflow\python\training\saving\functional_saver.py", line 321, in restore_fn
    restore_ops.update(saver.restore(file_prefix, options))
  File "C:\Program Files\Python38\lib\site-packages\tensorflow\python\training\saving\functional_saver.py", line 115, in restore
    restore_ops[saveable.name] = saveable.restore(
  File "C:\Program Files\Python38\lib\site-packages\tensorflow\python\training\saving\saveable_object_util.py", line 131, in restore
    return resource_variable_ops.shape_safe_assign_variable_handle(
  File "C:\Program Files\Python38\lib\site-packages\tensorflow\python\ops\resource_variable_ops.py", line 307, in shape_safe_assign_variable_handle
    shape.assert_is_compatible_with(value_tensor.shape)
  File "C:\Program Files\Python38\lib\site-packages\tensorflow\python\framework\tensor_shape.py", line 1134, in assert_is_compatible_with
    raise ValueError("Shapes %s and %s are incompatible" % (self, other))
ValueError: Shapes (340,) and (40,) are incompatible

The command I'm using: python train.py --epochs 200 --batch-size 4 --start-eval-epoch 0 --model-type p5 --use-pretrain True --dataset-type coco --dataset dataset/CV1/ --num-classes 5 --class-names CV1.names --coco-train-set train --coco-valid-set val --augment ssd_random_crop --p5-coco-pretrained-weights checkpoints/best_weight_p5_27_0.872

Let me know if you need the weight files to test. I can share
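
The two shapes in the error are consistent with a detection head that stores 4 box offsets, 1 objectness score and one score per class for each anchor. With 4 anchors per head, 80 COCO classes give 340 channels and 5 classes give 40. That would mean a checkpoint trained with `--num-classes 5` is being restored into a model built for the 80-class COCO pretrain weights. The arithmetic is sketched below. The anchor count of 4 is an assumption inferred from the numbers, not something stated in the traceback.

```python
def head_channels(num_classes, anchors_per_head):
    """Output channels of one detection head (hypothetical helper)."""
    # Per anchor: 4 box offsets + 1 objectness + num_classes scores.
    return anchors_per_head * (5 + num_classes)


coco_head = head_channels(num_classes=80, anchors_per_head=4)
custom_head = head_channels(num_classes=5, anchors_per_head=4)
```

If this reading is right, the `--p5-coco-pretrained-weights` option expects COCO-shaped weights, and a custom checkpoint needs the resume path mentioned in the 2021-06-27 update instead.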

Index out of bound

I am trying to train the tiny model on custom data in COCO format with image_size 608, and I get the following error.
Any idea?

loading dataset...
 2% 2/88 [00:00<00:22,  3.77it/s]Traceback (most recent call last):
 File "train.py", line 298, in <module>
   main(args)
 File "train.py", line 194, in main
   coco_map_callback = CocoMapCallback(pred_generator,model,args,mAP_writer)
 File "/content/ScaledYOLOv4-tensorflow2/utils/fit_coco_map.py", line 30, in __init__
   for batch_img, batch_boxes, batch_valids in pred_generator_tqdm:
 File "/usr/local/lib/python3.7/dist-packages/tqdm/std.py", line 1104, in __iter__
   for obj in iterable:
 File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/utils/data_utils.py", line 483, in __iter__
   for item in (self[i] for i in range(len(self))):
 File "/usr/local/lib/python3.7/dist-packages/tensorflow/python/keras/utils/data_utils.py", line 483, in <genexpr>
   for item in (self[i] for i in range(len(self))):
 File "/content/ScaledYOLOv4-tensorflow2/generator/coco_generator.py", line 183, in __getitem__
   y_true = get_y_true(self.max_side, batch_boxes, groundtruth_valids, self._args)
 File "/content/ScaledYOLOv4-tensorflow2/generator/get_y_true.py", line 91, in get_y_true
   grids[grid_index][batch_index][grid_xy[1]][grid_xy[0]][grid_anchor_index][0:4] = np.concatenate([dxdy,dwdh])
IndexError: index 19 is out of bounds for axis 0 with size 19
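
With an image size of 608 and a stride of 32, the coarsest grid has 608 / 32 = 19 cells, so valid indices run from 0 to 18. An index of exactly 19 is what a box center lying on the right or bottom image border would produce. The sketch below clamps the cell index into range. The function name and the stride value are assumptions, and the traceback does not show whether the coordinate comes from an annotation or from augmentation.

```python
def grid_cell(cx, cy, image_size, stride):
    """Return the (column, row) grid cell containing a box center."""
    grid_size = image_size // stride
    # Clamp so a center at x == image_size maps to the last cell
    # instead of one past the end.
    gx = min(max(int(cx // stride), 0), grid_size - 1)
    gy = min(max(int(cy // stride), 0), grid_size - 1)
    return gx, gy


# A center on the right border of a 608-pixel image.
cell = grid_cell(608.0, 300.0, image_size=608, stride=32)
```

Clipping box coordinates to the image before computing centers would be an alternative way to avoid the overflow.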

Unsuccessful TensorSliceReader constructor: Failed to find any matching files for ./pretrain/ScaledYOLOV4_p6_coco_pretrain/coco_pretrain

Somehow, I can't load the pretrained weights.
My pretrained folder for p6 contains the following files:

  • \pretrain\ScaledYOLOV4_p6_coco_pretrain\checkpoint
  • \pretrain\ScaledYOLOV4_p6_coco_pretrain\coco_pretrain.data-00000-of-00001
  • \pretrain\ScaledYOLOV4_p6_coco_pretrain\coco_pretrain.index

Can you tell me where my mistake is?

(To be more precise, I'm only interested in saving the keras models as .h5 for inference only in my environment.)
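
A TensorFlow checkpoint is addressed by its prefix (here `coco_pretrain`), and the loader looks for the matching `.index` and `.data-*` files next to it. The path in the error message starts with `./`, so it is resolved against the current working directory, which makes the directory the script is started from matter. The sketch below lists which checkpoint files are visible for a given prefix. The helper name is hypothetical and it only inspects the file system, without calling TensorFlow.

```python
from pathlib import Path


def checkpoint_files(prefix):
    """List checkpoint files found for a prefix such as 'dir/coco_pretrain'."""
    prefix = Path(prefix)
    parent = prefix.parent
    if not parent.is_dir():
        return []
    return sorted(
        p.name
        for p in parent.glob(prefix.name + ".*")
        # Keep only the index file and the data shards.
        if p.name.endswith(".index") or ".data-" in p.name
    )


# Prints the absolute location being searched and what was found there.
target = Path("./pretrain/ScaledYOLOV4_p6_coco_pretrain/coco_pretrain")
print(target.resolve(), checkpoint_files(target))
```

An empty list from this check means the script is being run from a directory where the relative path does not lead to the downloaded files.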

Cannot reproduce mAP with potholes data

I ran the command as specified in the docs to train on an RTX3090 using the potholes dataset as follows.

python train.py --use-pretrain True --model-type p5 --dataset-type voc --dataset dataset/pothole_voc --num-classes 1 --class-names pothole.names --voc-train-set dataset_1,train --voc-val-set dataset_1,val  --epochs 200 --batch-size 4 --multi-scale 320,352,384,416,448,480,512 --augment ssd_random_crop

However I cannot reproduce the mAP as presented.
image

Similarly the training loss looks quite different.
image

Any thoughts on what I might be doing wrong? Or perhaps there has been a material change to the repo since those results were posted.

Feedback appreciated and thanks for providing this good work to the community.

Model fails to save after training

Hi,

I am seeing the following error:

Training is finished!
Exporting model...
Traceback (most recent call last):
File "C:\TensorFlowObjectDetection\TFODCourse\Scaled_YOLOv4_tf2\train.py", line 382, in <module>
main(args)
File "C:\TensorFlowObjectDetection\TFODCourse\Scaled_YOLOv4_tf2\train.py", line 374, in main
model = Yolov4(args, training=False)
File "C:\TensorFlowObjectDetection\TFODCourse\Scaled_YOLOv4_tf2\model\yolov4.py", line 11, in Yolov4
scaled_yolov4_csp_darknet53_outputs = scaled_yolov4_csp_darknet53(input,mode=args.model_type)
File "C:\TensorFlowObjectDetection\TFODCourse\Scaled_YOLOv4_tf2\model\CSPDarknet53.py", line 35, in scaled_yolov4_csp_darknet53
x = conv2d_bn_mish(x, 32, (3, 3), name="first_block")
File "C:\TensorFlowObjectDetection\TFODCourse\Scaled_YOLOv4_tf2\model\common.py", line 7, in conv2d_bn_mish
return x * tf.math.tanh(tf.math.softplus(x))
File "C:\TensorFlowObjectDetection\TFODCourse\tfod\lib\site-packages\tensorflow\python\util\traceback_utils.py", line 153, in error_handler
raise e.with_traceback(filtered_tb) from None
File "C:\TensorFlowObjectDetection\TFODCourse\tfod\lib\site-packages\keras\layers\core\tf_op_layer.py", line 119, in handle
return TFOpLambda(op)(*args, **kwargs)
File "C:\TensorFlowObjectDetection\TFODCourse\tfod\lib\site-packages\keras\utils\traceback_utils.py", line 70, in error_handler
raise e.with_traceback(filtered_tb) from None
OverflowError: Exception encountered when calling layer "tf.math.softplus" (type TFOpLambda).

Python int too large to convert to C long

Call arguments received by layer "tf.math.softplus" (type TFOpLambda):
• features=tf.Tensor(shape=(None, None, None, 32), dtype=float32)
• name=None

I still have the checkpoints saved and a temp_model_variables.h5 has been created but I am unable to convert these into a saved model.

Any advice?

Multi GPU support

There are 4 GPUs available to TensorFlow on my machine, but only one is used during training.
