Comments (23)

sph1n3x commented on August 26, 2024

@drachu Sorry for the late reply! I have been running extended benchmarks, but the results are not exciting at all. Although the YOLOv7 Tiny network structure was designed with edge devices in mind, it is not optimized for export with the Edge TPU compiler. Many operations (~30%) still run on the CPU. I have also tried many delegation options without any luck :(

YOLOv5 models do not have these issues. Their recent update (v6.2) incorporates optimizations for edge devices, which is why almost all operations run on the Edge TPU (<5% on the CPU) at quite a feasible speed.

If you still want to use YOLOv7 models, I recommend looking at other edge devices such as the Jetson. You will probably get very good results and speed with TensorRT. You can also consider running the tiny model on a CPU (without the Edge TPU). I can easily achieve ~10 FPS on a Ryzen 3 4300U without any optimizations using the tiny model @ 640, or up to 30 FPS @ 416.
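
For anyone who wants to reproduce FPS figures like the ones above on their own hardware, here is a minimal timing sketch. It assumes a hypothetical `infer` callable that runs one forward pass on one frame; that name and the warm-up count are illustrative and not part of the yolov7 repository.

```python
import time


def measure_fps(infer, frames, warmup=5):
    """Return the average frames per second of `infer` over `frames`.

    `infer` is a hypothetical callable that processes a single frame.
    The first `warmup` frames are run once beforehand and not timed,
    so one-off setup costs do not distort the result.
    """
    frames = list(frames)
    for frame in frames[:warmup]:
        infer(frame)

    start = time.perf_counter()
    for frame in frames:
        infer(frame)
    elapsed = time.perf_counter() - start

    return len(frames) / elapsed if elapsed > 0 else float("inf")
```

Pass in the same frames at 640 and at 416 to compare the two input sizes under identical conditions.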

from yolov7.

hardikdava commented on August 26, 2024

Hello, I was able to export yolov7-tiny.pt to Edge TPU, and all the ops compiled successfully. However, there are limitations on the number of classes and the image size. I submitted my changes in PR #1672, which is made under branch u5. The logs for the complete model export are below.

TensorFlow SavedModel: starting export with tensorflow 2.12.0...

                 from  n    params  module                                  arguments                     
2023-04-24 18:23:29.086379: E tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:266] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
  0                -1  1       928  models.common.Conv                      [3, 32, 3, 2]                 
  1                -1  1     18560  models.common.Conv                      [32, 64, 3, 2]                
  2                -1  1      2112  models.common.Conv                      [64, 32, 1, 1]                
  3                -2  1      2112  models.common.Conv                      [64, 32, 1, 1]                
  4                -1  1      9280  models.common.Conv                      [32, 32, 3, 1]                
  5                -1  1      9280  models.common.Conv                      [32, 32, 3, 1]                
  6  [-1, -2, -3, -4]  1         0  models.common.Concat                    [1]                           
  7                -1  1      8320  models.common.Conv                      [128, 64, 1, 1]               
  8                -1  1         0  models.common.MP                        []                            
  9                -1  1      4224  models.common.Conv                      [64, 64, 1, 1]                
 10                -2  1      4224  models.common.Conv                      [64, 64, 1, 1]                
 11                -1  1     36992  models.common.Conv                      [64, 64, 3, 1]                
 12                -1  1     36992  models.common.Conv                      [64, 64, 3, 1]                
 13  [-1, -2, -3, -4]  1         0  models.common.Concat                    [1]                           
 14                -1  1     33024  models.common.Conv                      [256, 128, 1, 1]              
 15                -1  1         0  models.common.MP                        []                            
 16                -1  1     16640  models.common.Conv                      [128, 128, 1, 1]              
 17                -2  1     16640  models.common.Conv                      [128, 128, 1, 1]              
 18                -1  1    147712  models.common.Conv                      [128, 128, 3, 1]              
 19                -1  1    147712  models.common.Conv                      [128, 128, 3, 1]              
 20  [-1, -2, -3, -4]  1         0  models.common.Concat                    [1]                           
 21                -1  1    131584  models.common.Conv                      [512, 256, 1, 1]              
 22                -1  1         0  models.common.MP                        []                            
 23                -1  1     66048  models.common.Conv                      [256, 256, 1, 1]              
 24                -2  1     66048  models.common.Conv                      [256, 256, 1, 1]              
 25                -1  1    590336  models.common.Conv                      [256, 256, 3, 1]              
 26                -1  1    590336  models.common.Conv                      [256, 256, 3, 1]              
 27  [-1, -2, -3, -4]  1         0  models.common.Concat                    [1]                           
 28                -1  1    525312  models.common.Conv                      [1024, 512, 1, 1]             
 29                -1  1    131584  models.common.Conv                      [512, 256, 1, 1]              
 30                -2  1    131584  models.common.Conv                      [512, 256, 1, 1]              
 31                -1  1         0  models.common.SP                        [5]                           
 32                -2  1         0  models.common.SP                        [9]                           
 33                -3  1         0  models.common.SP                        [13]                          
 34  [-1, -2, -3, -4]  1         0  models.common.Concat                    [1]                           
 35                -1  1    262656  models.common.Conv                      [1024, 256, 1, 1]             
 36          [-1, -7]  1         0  models.common.Concat                    [1]                           
 37                -1  1    131584  models.common.Conv                      [512, 256, 1, 1]              
 38                -1  1     33024  models.common.Conv                      [256, 128, 1, 1]              
 39                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']          
 40                21  1     65792  models.common.Conv                      [512, 128, 1, 1]              
 41          [-1, -2]  1         0  models.common.Concat                    [1]                           
 42                -1  1     16512  models.common.Conv                      [256, 64, 1, 1]               
 43                -2  1     16512  models.common.Conv                      [256, 64, 1, 1]               
 44                -1  1     36992  models.common.Conv                      [64, 64, 3, 1]                
 45                -1  1     36992  models.common.Conv                      [64, 64, 3, 1]                
 46  [-1, -2, -3, -4]  1         0  models.common.Concat                    [1]                           
 47                -1  1     33024  models.common.Conv                      [256, 128, 1, 1]              
 48                -1  1      8320  models.common.Conv                      [128, 64, 1, 1]               
 49                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']          
 50                14  1     16512  models.common.Conv                      [256, 64, 1, 1]               
 51          [-1, -2]  1         0  models.common.Concat                    [1]                           
 52                -1  1      4160  models.common.Conv                      [128, 32, 1, 1]               
 53                -2  1      4160  models.common.Conv                      [128, 32, 1, 1]               
 54                -1  1      9280  models.common.Conv                      [32, 32, 3, 1]                
 55                -1  1      9280  models.common.Conv                      [32, 32, 3, 1]                
 56  [-1, -2, -3, -4]  1         0  models.common.Concat                    [1]                           
 57                -1  1      8320  models.common.Conv                      [128, 64, 1, 1]               
 58                -1  1     73984  models.common.Conv                      [64, 128, 3, 2]               
 59          [-1, 47]  1         0  models.common.Concat                    [1]                           
 60                -1  1     16512  models.common.Conv                      [256, 64, 1, 1]               
 61                -2  1     16512  models.common.Conv                      [256, 64, 1, 1]               
 62                -1  1     36992  models.common.Conv                      [64, 64, 3, 1]                
 63                -1  1     36992  models.common.Conv                      [64, 64, 3, 1]                
 64  [-1, -2, -3, -4]  1         0  models.common.Concat                    [1]                           
 65                -1  1     33024  models.common.Conv                      [256, 128, 1, 1]              
 66                -1  1    295424  models.common.Conv                      [128, 256, 3, 2]              
 67          [-1, 37]  1         0  models.common.Concat                    [1]                           
 68                -1  1     65792  models.common.Conv                      [512, 128, 1, 1]              
 69                -2  1     65792  models.common.Conv                      [512, 128, 1, 1]              
 70                -1  1    147712  models.common.Conv                      [128, 128, 3, 1]              
 71                -1  1    147712  models.common.Conv                      [128, 128, 3, 1]              
 72  [-1, -2, -3, -4]  1         0  models.common.Concat                    [1]                           
 73                -1  1    131584  models.common.Conv                      [512, 256, 1, 1]              
 74                57  1    147712  models.common.Conv                      [128, 128, 3, 1]              
 75                65  1    590336  models.common.Conv                      [256, 256, 3, 1]              
 76                73  1   2360320  models.common.Conv                      [512, 512, 3, 1]              
 77      [74, 75, 76]  1     21576  models.yolo.Detect                      [3, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [128, 256, 512], [416, 416]]
Model: "model"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
==================================================================================================
 input_1 (InputLayer)           [(1, 416, 416, 3)]   0           []                               
                                                                                                  
 tf_conv (TFConv)               (1, 208, 208, 32)    896         ['input_1[0][0]']                
                                                                                                  
 tf_conv_1 (TFConv)             (1, 104, 104, 64)    18496       ['tf_conv[0][0]']                
                                                                                                  
 tf_conv_3 (TFConv)             (1, 104, 104, 32)    2080        ['tf_conv_1[0][0]']              
                                                                                                  
 tf_conv_4 (TFConv)             (1, 104, 104, 32)    9248        ['tf_conv_3[0][0]']              
                                                                                                  
 tf_conv_5 (TFConv)             (1, 104, 104, 32)    9248        ['tf_conv_4[0][0]']              
                                                                                                  
 tf_conv_2 (TFConv)             (1, 104, 104, 32)    2080        ['tf_conv_1[0][0]']              
                                                                                                  
 tf_concat (TFConcat)           (1, 104, 104, 128)   0           ['tf_conv_5[0][0]',              
                                                                  'tf_conv_4[0][0]',              
                                                                  'tf_conv_3[0][0]',              
                                                                  'tf_conv_2[0][0]']              
                                                                                                  
 tf_conv_6 (TFConv)             (1, 104, 104, 64)    8256        ['tf_concat[0][0]']              
                                                                                                  
 tfmp (TFMP)                    (1, 52, 52, 64)      0           ['tf_conv_6[0][0]']              
                                                                                                  
 tf_conv_8 (TFConv)             (1, 52, 52, 64)      4160        ['tfmp[0][0]']                   
                                                                                                  
 tf_conv_9 (TFConv)             (1, 52, 52, 64)      36928       ['tf_conv_8[0][0]']              
                                                                                                  
 tf_conv_10 (TFConv)            (1, 52, 52, 64)      36928       ['tf_conv_9[0][0]']              
                                                                                                  
 tf_conv_7 (TFConv)             (1, 52, 52, 64)      4160        ['tfmp[0][0]']                   
                                                                                                  
 tf_concat_1 (TFConcat)         (1, 52, 52, 256)     0           ['tf_conv_10[0][0]',             
                                                                  'tf_conv_9[0][0]',              
                                                                  'tf_conv_8[0][0]',              
                                                                  'tf_conv_7[0][0]']              
                                                                                                  
 tf_conv_11 (TFConv)            (1, 52, 52, 128)     32896       ['tf_concat_1[0][0]']            
                                                                                                  
 tfmp_1 (TFMP)                  (1, 26, 26, 128)     0           ['tf_conv_11[0][0]']             
                                                                                                  
 tf_conv_13 (TFConv)            (1, 26, 26, 128)     16512       ['tfmp_1[0][0]']                 
                                                                                                  
 tf_conv_14 (TFConv)            (1, 26, 26, 128)     147584      ['tf_conv_13[0][0]']             
                                                                                                  
 tf_conv_15 (TFConv)            (1, 26, 26, 128)     147584      ['tf_conv_14[0][0]']             
                                                                                                  
 tf_conv_12 (TFConv)            (1, 26, 26, 128)     16512       ['tfmp_1[0][0]']                 
                                                                                                  
 tf_concat_2 (TFConcat)         (1, 26, 26, 512)     0           ['tf_conv_15[0][0]',             
                                                                  'tf_conv_14[0][0]',             
                                                                  'tf_conv_13[0][0]',             
                                                                  'tf_conv_12[0][0]']             
                                                                                                  
 tf_conv_16 (TFConv)            (1, 26, 26, 256)     131328      ['tf_concat_2[0][0]']            
                                                                                                  
 tfmp_2 (TFMP)                  (1, 13, 13, 256)     0           ['tf_conv_16[0][0]']             
                                                                                                  
 tf_conv_18 (TFConv)            (1, 13, 13, 256)     65792       ['tfmp_2[0][0]']                 
                                                                                                  
 tf_conv_19 (TFConv)            (1, 13, 13, 256)     590080      ['tf_conv_18[0][0]']             
                                                                                                  
 tf_conv_20 (TFConv)            (1, 13, 13, 256)     590080      ['tf_conv_19[0][0]']             
                                                                                                  
 tf_conv_17 (TFConv)            (1, 13, 13, 256)     65792       ['tfmp_2[0][0]']                 
                                                                                                  
 tf_concat_3 (TFConcat)         (1, 13, 13, 1024)    0           ['tf_conv_20[0][0]',             
                                                                  'tf_conv_19[0][0]',             
                                                                  'tf_conv_18[0][0]',             
                                                                  'tf_conv_17[0][0]']             
                                                                                                  
 tf_conv_21 (TFConv)            (1, 13, 13, 512)     524800      ['tf_concat_3[0][0]']            
                                                                                                  
 tf_conv_23 (TFConv)            (1, 13, 13, 256)     131328      ['tf_conv_21[0][0]']             
                                                                                                  
 tfsp_2 (TFSP)                  (1, 13, 13, 256)     0           ['tf_conv_23[0][0]']             
                                                                                                  
 tfsp_1 (TFSP)                  (1, 13, 13, 256)     0           ['tf_conv_23[0][0]']             
                                                                                                  
 tfsp (TFSP)                    (1, 13, 13, 256)     0           ['tf_conv_23[0][0]']             
                                                                                                  
 tf_concat_4 (TFConcat)         (1, 13, 13, 1024)    0           ['tfsp_2[0][0]',                 
                                                                  'tfsp_1[0][0]',                 
                                                                  'tfsp[0][0]',                   
                                                                  'tf_conv_23[0][0]']             
                                                                                                  
 tf_conv_24 (TFConv)            (1, 13, 13, 256)     262400      ['tf_concat_4[0][0]']            
                                                                                                  
 tf_conv_22 (TFConv)            (1, 13, 13, 256)     131328      ['tf_conv_21[0][0]']             
                                                                                                  
 tf_concat_5 (TFConcat)         (1, 13, 13, 512)     0           ['tf_conv_24[0][0]',             
                                                                  'tf_conv_22[0][0]']             
                                                                                                  
 tf_conv_25 (TFConv)            (1, 13, 13, 256)     131328      ['tf_concat_5[0][0]']            
                                                                                                  
 tf_conv_26 (TFConv)            (1, 13, 13, 128)     32896       ['tf_conv_25[0][0]']             
                                                                                                  
 tf_conv_27 (TFConv)            (1, 26, 26, 128)     32896       ['tf_conv_16[0][0]']             
                                                                                                  
 tf_upsample (TFUpsample)       (1, 26, 26, 128)     0           ['tf_conv_26[0][0]']             
                                                                                                  
 tf_concat_6 (TFConcat)         (1, 26, 26, 256)     0           ['tf_conv_27[0][0]',             
                                                                  'tf_upsample[0][0]']            
                                                                                                  
 tf_conv_29 (TFConv)            (1, 26, 26, 64)      16448       ['tf_concat_6[0][0]']            
                                                                                                  
 tf_conv_30 (TFConv)            (1, 26, 26, 64)      36928       ['tf_conv_29[0][0]']             
                                                                                                  
 tf_conv_31 (TFConv)            (1, 26, 26, 64)      36928       ['tf_conv_30[0][0]']             
                                                                                                  
 tf_conv_28 (TFConv)            (1, 26, 26, 64)      16448       ['tf_concat_6[0][0]']            
                                                                                                  
 tf_concat_7 (TFConcat)         (1, 26, 26, 256)     0           ['tf_conv_31[0][0]',             
                                                                  'tf_conv_30[0][0]',             
                                                                  'tf_conv_29[0][0]',             
                                                                  'tf_conv_28[0][0]']             
                                                                                                  
 tf_conv_32 (TFConv)            (1, 26, 26, 128)     32896       ['tf_concat_7[0][0]']            
                                                                                                  
 tf_conv_33 (TFConv)            (1, 26, 26, 64)      8256        ['tf_conv_32[0][0]']             
                                                                                                  
 tf_conv_34 (TFConv)            (1, 52, 52, 64)      8256        ['tf_conv_11[0][0]']             
                                                                                                  
 tf_upsample_1 (TFUpsample)     (1, 52, 52, 64)      0           ['tf_conv_33[0][0]']             
                                                                                                  
 tf_concat_8 (TFConcat)         (1, 52, 52, 128)     0           ['tf_conv_34[0][0]',             
                                                                  'tf_upsample_1[0][0]']          
                                                                                                  
 tf_conv_36 (TFConv)            (1, 52, 52, 32)      4128        ['tf_concat_8[0][0]']            
                                                                                                  
 tf_conv_37 (TFConv)            (1, 52, 52, 32)      9248        ['tf_conv_36[0][0]']             
                                                                                                  
 tf_conv_38 (TFConv)            (1, 52, 52, 32)      9248        ['tf_conv_37[0][0]']             
                                                                                                  
 tf_conv_35 (TFConv)            (1, 52, 52, 32)      4128        ['tf_concat_8[0][0]']            
                                                                                                  
 tf_concat_9 (TFConcat)         (1, 52, 52, 128)     0           ['tf_conv_38[0][0]',             
                                                                  'tf_conv_37[0][0]',             
                                                                  'tf_conv_36[0][0]',             
                                                                  'tf_conv_35[0][0]']             
                                                                                                  
 tf_conv_39 (TFConv)            (1, 52, 52, 64)      8256        ['tf_concat_9[0][0]']            
                                                                                                  
 tf_conv_40 (TFConv)            (1, 26, 26, 128)     73856       ['tf_conv_39[0][0]']             
                                                                                                  
 tf_concat_10 (TFConcat)        (1, 26, 26, 256)     0           ['tf_conv_40[0][0]',             
                                                                  'tf_conv_32[0][0]']             
                                                                                                  
 tf_conv_42 (TFConv)            (1, 26, 26, 64)      16448       ['tf_concat_10[0][0]']           
                                                                                                  
 tf_conv_43 (TFConv)            (1, 26, 26, 64)      36928       ['tf_conv_42[0][0]']             
                                                                                                  
 tf_conv_44 (TFConv)            (1, 26, 26, 64)      36928       ['tf_conv_43[0][0]']             
                                                                                                  
 tf_conv_41 (TFConv)            (1, 26, 26, 64)      16448       ['tf_concat_10[0][0]']           
                                                                                                  
 tf_concat_11 (TFConcat)        (1, 26, 26, 256)     0           ['tf_conv_44[0][0]',             
                                                                  'tf_conv_43[0][0]',             
                                                                  'tf_conv_42[0][0]',             
                                                                  'tf_conv_41[0][0]']             
                                                                                                  
 tf_conv_45 (TFConv)            (1, 26, 26, 128)     32896       ['tf_concat_11[0][0]']           
                                                                                                  
 tf_conv_46 (TFConv)            (1, 13, 13, 256)     295168      ['tf_conv_45[0][0]']             
                                                                                                  
 tf_concat_12 (TFConcat)        (1, 13, 13, 512)     0           ['tf_conv_46[0][0]',             
                                                                  'tf_conv_25[0][0]']             
                                                                                                  
 tf_conv_48 (TFConv)            (1, 13, 13, 128)     65664       ['tf_concat_12[0][0]']           
                                                                                                  
 tf_conv_49 (TFConv)            (1, 13, 13, 128)     147584      ['tf_conv_48[0][0]']             
                                                                                                  
 tf_conv_50 (TFConv)            (1, 13, 13, 128)     147584      ['tf_conv_49[0][0]']             
                                                                                                  
 tf_conv_47 (TFConv)            (1, 13, 13, 128)     65664       ['tf_concat_12[0][0]']           
                                                                                                  
 tf_concat_13 (TFConcat)        (1, 13, 13, 512)     0           ['tf_conv_50[0][0]',             
                                                                  'tf_conv_49[0][0]',             
                                                                  'tf_conv_48[0][0]',             
                                                                  'tf_conv_47[0][0]']             
                                                                                                  
 tf_conv_51 (TFConv)            (1, 13, 13, 256)     131328      ['tf_concat_13[0][0]']           
                                                                                                  
 tf_conv_52 (TFConv)            (1, 52, 52, 128)     73856       ['tf_conv_39[0][0]']             
                                                                                                  
 tf_conv_53 (TFConv)            (1, 26, 26, 256)     295168      ['tf_conv_45[0][0]']             
                                                                                                  
 tf_conv_54 (TFConv)            (1, 13, 13, 512)     1180160     ['tf_conv_51[0][0]']             
                                                                                                  
 tf_detect (TFDetect)           ((1, 10647, 8),      21576       ['tf_conv_52[0][0]',             
                                )                                 'tf_conv_53[0][0]',             
                                                                  'tf_conv_54[0][0]']             
                                                                                                  
==================================================================================================
Total params: 6,012,040
Trainable params: 0
Non-trainable params: 6,012,040
__________________________________________________________________________________________________
TensorFlow SavedModel: export success ✅ 8.6s, saved as runs/train/exp8/weights/best_saved_model (23.1 MB)

TensorFlow Lite: starting export with tensorflow 2.12.0...
WARNING:absl:Found untraced functions such as conv2d_3_layer_call_fn, conv2d_3_layer_call_and_return_conditional_losses, _jit_compiled_convolution_op, conv2d_4_layer_call_fn, conv2d_4_layer_call_and_return_conditional_losses while saving (showing 5 of 172). These functions will not be directly callable after loading.
fully_quantize: 0, inference_type: 6, input_inference_type: UINT8, output_inference_type: UINT8
TensorFlow Lite: export success ✅ 105.0s, saved as runs/train/exp8/weights/best-int8.tflite (6.0 MB)

Edge TPU: starting export with Edge TPU compiler 16.0.384591198...
Edge TPU Compiler version 16.0.384591198
Searching for valid delegate with step 10
Try to compile segment with 246 ops
Started a compilation timeout timer of 180 seconds.

Model compiled successfully in 3133 ms.

Input model: runs/train/exp8/weights/best-int8.tflite
Input size: 6.00MiB
Output model: runs/train/exp8/weights/best-int8_edgetpu.tflite
Output size: 6.47MiB
On-chip memory used for caching model parameters: 5.90MiB
On-chip memory remaining for caching model parameters: 1.16MiB
Off-chip memory used for streaming uncached model parameters: 41.75KiB
Number of Edge TPU subgraphs: 1
Total number of operations: 246
Operation log: runs/train/exp8/weights/best-int8_edgetpu.log

Operator                       Count      Status

ADD                            3          Mapped to Edge TPU
RESHAPE                        6          Mapped to Edge TPU
MAX_POOL_2D                    4          Mapped to Edge TPU
CONCATENATION                  18         Mapped to Edge TPU
PAD                            4          Mapped to Edge TPU
LOGISTIC                       64         Mapped to Edge TPU
RESIZE_NEAREST_NEIGHBOR        2          Mapped to Edge TPU
QUANTIZE                       5          Mapped to Edge TPU
MUL                            73         Mapped to Edge TPU
STRIDED_SLICE                  9          Mapped to Edge TPU
CONV_2D                        58         Mapped to Edge TPU
Compilation child process completed within timeout period.
Compilation succeeded! 
Edge TPU: export success ✅ 3.4s, saved as runs/train/exp8/weights/best-int8_edgetpu.tflite (6.5 MB)

from yolov7.

hardikdava avatar hardikdava commented on August 26, 2024

Is there any update on this issue? It would be interesting to see the performance on edgetpu.

from yolov7.

xrbeattx avatar xrbeattx commented on August 26, 2024

Bump

from yolov7.

xrbeattx avatar xrbeattx commented on August 26, 2024

From what I understand, there are two issues here: 1) the EdgeTPU only works with .tflite files/models, and 2) the libraries needed to run this require python3.9, and I cannot for the life of me get my system to update to it. I found a Stack Overflow question about this, but no one has answered it, which leads me to believe that it can't be done for some reason. I honestly don't know why some versions of Ubuntu/Debian/Mendel don't support certain versions of Python, or vice versa. I really want this to work, but TBH I think I am wasting my time.

from yolov7.

Baael avatar Baael commented on August 26, 2024

Just convert it to tflite.
Here is a link with an example of how to do that:
https://medium.com/geekculture/converting-yolo-v7-to-tensorflow-lite-for-mobile-deployment-ebc1103e8d1e

from yolov7.

keesschollaart81 avatar keesschollaart81 commented on August 26, 2024

@Baael Are you able to convert it to edgetpu?

from yolov7.

hardikdava avatar hardikdava commented on August 26, 2024

@keesschollaart81 @Baael I have tried the workflow mentioned above. I was able to convert the model to tflite, but not to the int8-quantized model that the Coral edgetpu needs. I would still prefer to follow the YOLOv5 workflow, i.e. rebuild the full network from TensorFlow layers, which seems to be the correct approach since many of the operations are not supported by the edgetpu.

from yolov7.

keesschollaart81 avatar keesschollaart81 commented on August 26, 2024

Clear. What was the report/output of the edgetpu_compiler? Like how many of the operations are running on the CPU vs TPU?

from yolov7.

hardikdava avatar hardikdava commented on August 26, 2024

We need to perform full integer quantization by running the tflite converter on the TensorFlow saved_model and saving an int8 tflite model. Then we have to compile that model with the Edge TPU compiler, which generates the compiled network for the edgetpu. But I got this error while performing quantization:

RuntimeError: tensorflow/lite/kernels/conv.cc:357 input_channel % filter_input_channel != 0 (1 != 0)Node number 2 (CONV_2D) failed to prepare.

I think this error is due to a channel mismatch. If you know how to solve it, let me know. @keesschollaart81
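For reference, the full integer quantization step described above can be sketched with the TFLite converter as follows. This is a minimal sketch under stated assumptions, not the export code used in this thread: the function names, the default image size, and the random calibration data are all hypothetical, and the sample shape assumes an NHWC model input that must match the SavedModel's actual input signature.

```python
from pathlib import Path


def calibration_shape(img_size: int) -> tuple:
    """NHWC shape of one calibration sample for a square RGB input."""
    return (1, img_size, img_size, 3)


def quantize_saved_model(saved_model_dir: str, out_path: str,
                         img_size: int = 640, n_samples: int = 100) -> int:
    """Convert a SavedModel to a full-integer TFLite file; returns bytes written."""
    # Heavy imports are deferred so calibration_shape() works without TensorFlow.
    import numpy as np
    import tensorflow as tf

    def representative_dataset():
        # ASSUMPTION: random data keeps the sketch self-contained. Real
        # calibration should yield preprocessed training images instead.
        for _ in range(n_samples):
            yield [np.random.rand(*calibration_shape(img_size)).astype(np.float32)]

    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset
    # Allow only int8 builtin ops, so ops without an int8 kernel raise an error.
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    # uint8 input/output, as in the export log earlier in this thread.
    converter.inference_input_type = tf.uint8
    converter.inference_output_type = tf.uint8
    tflite_model = converter.convert()
    return Path(out_path).write_bytes(tflite_model)
```

The resulting file is what would then be passed to the Edge TPU compiler. Whether a mismatch between the calibration sample shape and the model input is behind the error above is not established in this thread.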

from yolov7.

drachu avatar drachu commented on August 26, 2024

Any updates? I'm searching the internet for a solution. I was also able to convert the model to tflite, but int8 quantization fails every time.

from yolov7.

sph1n3x avatar sph1n3x commented on August 26, 2024

I have been running some tests for the past few weeks and played around with different input sizes (640, 512, 448, 416) due to limitations of the Coral EdgeTPU. The YOLOv7 standard model is rather large and runs into compilation timeouts at least for object detection @ 640. I have been testing it for instance segmentation mostly though. You might be lucky if you choose smaller input sizes for training (YOLOv7), I have no time to look further into it atm :(

I can confirm that export to edgetpu works with YOLOv7Tiny (640, 512, 448, 416) and YOLOv5s / YOLOv5m (640, 512, 448, 416) without running into any compilation timeouts and subgraph issues.

My only problem here is that reparameterization for segmentation is currently not available, so I have no choice but to use the YOLOv5 head (instead of YOLOR), which results in a loss of 1-2% (in terms of precision, recall, mAP).

Maybe @WongKinYiu @AlexeyAB have an idea for the reparameterization of the segmentation model in the u7 branch?

from yolov7.

drachu avatar drachu commented on August 26, 2024

@sph1n3x What was your export process like for YOLOv7Tiny? Did you use the https://medium.com/geekculture/converting-yolo-v7-to-tensorflow-lite-for-mobile-deployment-ebc1103e8d1e workflow? If so, which parameters did you use in tf.lite.TFLiteConverter for quantization? Good work with your tests! :)

from yolov7.

sph1n3x avatar sph1n3x commented on August 26, 2024

@drachu I actually used a modified export.py script from the u7 branch with changes from the main and u5 branch. It wasn't straightforward as some functions and classes were missing in TensorFlow which had to be implemented. I can, however, provide the changes 👍

from yolov7.

drachu avatar drachu commented on August 26, 2024

@sph1n3x It would be great!

from yolov7.

drachu avatar drachu commented on August 26, 2024

Thanks for the answer, tips and your research work @sph1n3x!

from yolov7.

triptec avatar triptec commented on August 26, 2024

@sph1n3x I've been trying to just get a tflite with uint8. Currently I'm not so worried about performance, I just want it to run with the same infra as my yolov5 models. Could you share the code you used for the conversion and detection with the result?
I have been able to export to onnx and then convert it into a tflite with uint8 using https://github.com/MPolaris/onnx2tflite, but I can't make sense of the output from the resulting model. Same thing when I used https://github.com/PINTO0309/onnx2tf: I can't make sense of the output.
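One thing worth checking when uint8 output looks meaningless is whether it has been dequantized. A quantized TFLite tensor stores integers that map back to real values as real = scale * (q - zero_point). The sketch below shows that arithmetic in plain Python; the scale and zero point are hypothetical values chosen for illustration, and this may or may not be the cause of the problem described above.

```python
def dequantize(values, scale, zero_point):
    """Map raw quantized integers to real values: real = scale * (q - zero_point)."""
    return [scale * (q - zero_point) for q in values]


def quantize(values, scale, zero_point, qmin=0, qmax=255):
    """Inverse mapping for uint8, with rounding and clamping to [qmin, qmax]."""
    return [min(qmax, max(qmin, round(v / scale) + zero_point)) for v in values]


# Hypothetical parameters for illustration only. With the TFLite Interpreter,
# the real ones come from interpreter.get_output_details()[i]["quantization"].
scale, zero_point = 0.5, 4
raw = [4, 6, 132, 255]
real = dequantize(raw, scale, zero_point)
print(real)  # [0.0, 1.0, 64.0, 125.5]
```

After dequantization, the usual YOLO post-processing (box decoding, confidence thresholding, NMS) still has to be applied to the real-valued tensor.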

from yolov7.

35grain avatar 35grain commented on August 26, 2024

For anyone still looking for a solution: I was able to convert the YOLOv7 tiny model to an Edge TPU compatible tflite format at a resolution of 640 via the openvino2tensorflow converter. Almost all operations were mapped to the Edge TPU (see output below), while with onnx2tf it was the other way round. Unfortunately, that's where the fun ends, as I am now faced with an error when running the model on my Raspberry Pi 4:

F driver/usb/usb_driver.cc:857] transfer on tag 1 failed. Abort. Deadline exceeded: USB transfer error 2 [LibUsbDataOutCallback]

I tend to believe that this isn't caused by insufficient power delivery from the Pi either, as it should be able to output 1200mA total across all ports and I have nothing else plugged in.

Edge TPU Compiler version 16.0.384591198
Searching for valid delegate with step 1
Try to compile segment with 330 ops
Started a compilation timeout timer of 3600 seconds.

Model compiled successfully in 40666 ms.

Input model: saved_model/model_full_integer_quant.tflite
Input size: 6.01MiB
Output model: saved_model/model_full_integer_quant_edgetpu.tflite
Output size: 6.42MiB
On-chip memory used for caching model parameters: 5.90MiB
On-chip memory remaining for caching model parameters: 659.50KiB
Off-chip memory used for streaming uncached model parameters: 0.00B
Number of Edge TPU subgraphs: 2
Total number of operations: 330
Operation log: saved_model/model_full_integer_quant_edgetpu.log

Model successfully compiled but not all operations are supported by the Edge TPU. A percentage of the model will instead run on the CPU, which is slower. If possible, consider updating your model to use only operations supported by the Edge TPU. For details, visit g.co/coral/model-reqs.
Number of operations that will run on Edge TPU: 321
Number of operations that will run on CPU: 9

Operator                       Count      Status

ADD                            55         Mapped to Edge TPU
MAX_POOL_2D                    6          Mapped to Edge TPU
CONV_2D                        58         Mapped to Edge TPU
RESIZE_NEAREST_NEIGHBOR        2          Mapped to Edge TPU
CONCATENATION                  14         Mapped to Edge TPU
RELU                           55         Mapped to Edge TPU
MUL                            55         Mapped to Edge TPU
MIRROR_PAD                     3          Operation not supported
RESHAPE                        3          Tensor has unsupported rank (up to 3 innermost dimensions mapped)
TRANSPOSE                      3          Tensor has unsupported rank (up to 3 innermost dimensions mapped)
PAD                            21         Mapped to Edge TPU
MINIMUM                        55         Mapped to Edge TPU
Compilation child process completed within timeout period.
Compilation succeeded! 

Update: Using a resolution of 512 resulted in a different error: KeyError: 'output_0'. But at least it looks like it is trying to run it now?
Update update: I was able to reproduce this on a Windows machine too, perhaps the model output is simply too big.
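The CPU/TPU split that was asked about earlier in the thread can be read off the operator table in a compiler log like the one above. Below is a small sketch that sums the counts per status; the function name and the regular expression are illustrative assumptions, and any row whose status is not "Mapped to Edge TPU" is counted as running on the CPU.

```python
import re

# Matches rows such as "CONV_2D   58   Mapped to Edge TPU".
# The "Operator  Count  Status" header does not match (lowercase letters).
ROW = re.compile(r"^([A-Z][A-Z0-9_]*)\s+(\d+)\s+(.+?)\s*$")


def summarize_ops(log_text):
    """Return (tpu_ops, cpu_ops) summed from the operator table of a log."""
    tpu = cpu = 0
    for line in log_text.splitlines():
        match = ROW.match(line.strip())
        if not match:
            continue
        count, status = int(match.group(2)), match.group(3)
        if status == "Mapped to Edge TPU":
            tpu += count
        else:
            cpu += count
    return tpu, cpu


# Operator table copied from the compiler output above.
LOG = """\
Operator                       Count      Status

ADD                            55         Mapped to Edge TPU
MAX_POOL_2D                    6          Mapped to Edge TPU
CONV_2D                        58         Mapped to Edge TPU
RESIZE_NEAREST_NEIGHBOR        2          Mapped to Edge TPU
CONCATENATION                  14         Mapped to Edge TPU
RELU                           55         Mapped to Edge TPU
MUL                            55         Mapped to Edge TPU
MIRROR_PAD                     3          Operation not supported
RESHAPE                        3          Tensor has unsupported rank (up to 3 innermost dimensions mapped)
TRANSPOSE                      3          Tensor has unsupported rank (up to 3 innermost dimensions mapped)
PAD                            21         Mapped to Edge TPU
MINIMUM                        55         Mapped to Edge TPU
"""

tpu, cpu = summarize_ops(LOG)
print(f"Edge TPU: {tpu}, CPU: {cpu}, CPU share: {cpu / (tpu + cpu):.1%}")
```

For this log the sums agree with the compiler's own summary lines (321 operations on the Edge TPU, 9 on the CPU).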

from yolov7.

hardikdava avatar hardikdava commented on August 26, 2024

Hello @35grain, what was your onnx2tf command? And which model were you able to convert?

from yolov7.

35grain avatar 35grain commented on August 26, 2024

> Hello @35grain , what was your onnx2tf command? And which model were able to convert?

@hardikdava I used the following workflow with openvino2tensorflow: YOLOv7-tiny custom trained model > ONNX > OpenVINO > tflite int8. Here's a code snippet you could modify for your use (I ran it in Colab):

pip install -r requirements.txt # for using the export command from YOLOv7 repo
pip install openvino-dev
pip install openvino2tensorflow
pip install onnx onnxsim # onnxsim for simplifying the model using the export command (optional)

python export.py --weights '/path/to/model.pt' --simplify
mo --input_model '/path/to/model.onnx'
openvino2tensorflow --model_path '/path/to/model.xml' --output_edgetpu

Let me know if you have any luck with getting it running. Currently only YOLOv5n and v5n6 are working for me (though with lower accuracy) while YOLOv7 is in the described state and YOLOv8 has its own issues with exporting. I really just need something usable for my project.
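If the three conversion steps above need to be run repeatedly, they can be wrapped in a small script that stops at the first failing step. This is only a sketch: the helper names are hypothetical, the flags are exactly those from the snippet above, and the intermediate paths are passed in explicitly because where each tool writes its output is not stated in this thread.

```python
import shlex


def build_pipeline(weights, onnx_path, xml_path):
    """Return the three conversion commands from the workflow as argument lists."""
    return [
        ["python", "export.py", "--weights", weights, "--simplify"],
        ["mo", "--input_model", onnx_path],
        ["openvino2tensorflow", "--model_path", xml_path, "--output_edgetpu"],
    ]


def run_pipeline(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    import subprocess
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # raises on a non-zero exit code


commands = build_pipeline("/path/to/model.pt",
                          "/path/to/model.onnx",
                          "/path/to/model.xml")
run_pipeline(commands)  # dry run: only prints the commands
```

Calling run_pipeline(commands, dry_run=False) would execute the steps in order, assuming the tools from the pip install lines above are on the PATH.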

from yolov7.

35grain avatar 35grain commented on August 26, 2024

> Hello, I was able to export yolov7-tiny.pt to edgetpu. But there is a limitations for numbers of class and image size. But yolov7-tiny was able to export as edgetpu and all the ops are compiled successfully. I made my commit to this #1672 pr. It is made under branch u5. Please find below logs for complete export of model.
>
> TensorFlow SavedModel: starting export with tensorflow 2.12.0...
>
>                  from  n    params  module                                  arguments
> 2023-04-24 18:23:29.086379: E tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:266] failed call to cuInit:
> ...
>
> Operator                       Count      Status
>
> ADD                            3          Mapped to Edge TPU
> RESHAPE                        6          Mapped to Edge TPU
> MAX_POOL_2D                    4          Mapped to Edge TPU
> CONCATENATION                  18         Mapped to Edge TPU
> PAD                            4          Mapped to Edge TPU
> LOGISTIC                       64         Mapped to Edge TPU
> RESIZE_NEAREST_NEIGHBOR        2          Mapped to Edge TPU
> QUANTIZE                       5          Mapped to Edge TPU
> MUL                            73         Mapped to Edge TPU
> STRIDED_SLICE                  9          Mapped to Edge TPU
> CONV_2D                        58         Mapped to Edge TPU
> Compilation child process completed within timeout period.
> Compilation succeeded!
> Edge TPU: export success ✅ 3.4s, saved as runs/train/exp8/weights/best-int8_edgetpu.tflite (6.5 MB)

Yeah, it exported for me too but did you try running it as well?

from yolov7.

hardikdava avatar hardikdava commented on August 26, 2024

Hello @35grain, the int8 tflite model is detecting fine. It also has the same issue of an accuracy drop after quantization.

from yolov7.

35grain avatar 35grain commented on August 26, 2024

The pull request appears to be for a really old branch..

from yolov7.
