dog-qiuqiu / yolo-fastest Goto Github PK

:zap: Based on yolo's ultra-lightweight universal target detection algorithm, the calculation amount is only 250mflops, the ncnn model size is only 666kb, the Raspberry Pi 3b can run up to 15fps+, and the mobile terminal can run up to 178fps+

License: Other

CMake 1.36% Makefile 0.27% PowerShell 0.39% Shell 0.56% C# 0.14% Batchfile 0.58% Python 4.61% C 64.36% C++ 12.78% Cuda 14.96%

yolo-fastest's Introduction

2021.8.12: yolo-fastestV2版本已发布:https://github.com/dog-qiuqiu/Yolo-FastestV2
2021.3.21: 对模型结构进行细微调整优化，更新Yolo-Fastest-1.1模型
2021.3.19: NCNN Camera Demo https://github.com/dog-qiuqiu/Yolo-Fastest/tree/master/sample/ncnn
2021.3.16: 修复分组卷积在某些旧架构GPU推理耗时异常的问题

⚡Yolo-Fastest⚡

Simple, fast, compact, easy to transplant
A real-time target detection algorithm for all platforms
The fastest and smallest known universal target detection algorithm based on yolo
Optimized design for ARM mobile terminal, optimized to support NCNN reasoning framework
Based on NCNN deployed on RK3399 ,Raspberry Pi 4b... and other embedded devices to achieve full real-time 30fps+

中文介绍https://zhuanlan.zhihu.com/p/234506503
相比AlexeyAB/darknet，此版本的darknet修复分组卷积在某些旧架构GPU推理耗时异常的问题(例如1050ti:40ms->4ms速度提升10倍)，强烈建议用此仓库框架训练模型
Compared with AlexeyAB/darknet, this version of darknet fixes the problem of abnormal time-consuming inference of grouped convolution in some old architecture GPUs (for example, 1050ti:40ms->4ms speed up 10 times), it is strongly recommended to use this warehouse framework for training model
Darknet CPU推理效率优化不好，不建议使用Darknet作为CPU端的推理框架，建议使用NCNN
Darknet CPU reasoning efficiency optimization is not good, it is not recommended to use Darknet as the CPU side reasoning framework, it is recommended to use ncnn
Based on pytorch training framework: https://github.com/dog-qiuqiu/yolov3

Evaluating indicator/Benchmark

Network	COCO mAP(0.5)	Resolution	Run Time(Ncnn 4xCore)	Run Time(Ncnn 1xCore)	FLOPS	Params	Weight size
Yolo-Fastest-1.1	24.40 %	320X320	5.59 ms	7.52 ms	0.252BFlops	0.35M	1.4M
Yolo-Fastest-1.1-xl	34.33 %	320X320	9.27ms	15.72ms	0.725BFlops	0.925M	3.7M
Yolov3-Tiny-Prn	33.1%	416X416	%ms	%ms	3.5BFlops	4.7M	18.8M
Yolov4-Tiny	40.2%	416X416	23.67ms	40.14ms	6.9 BFlops	5.77M	23.1M

Test platform Mi 11 Snapdragon 888 CPU，Based on NCNN
COCO 2017 Val mAP（no group label）
Suitable for hardware with extremely tight computing resources
This model is recommended to do some simple single object detection suitable for simple application scenarios

Yolo-Fastest-1.1 Multi-platform benchmark

Equipment	Computing backend	System	Framework	Run time
Mi 11	Snapdragon 888	Android(arm64)	ncnn	5.59ms
Mate 30	Kirin 990	Android(arm64)	ncnn	6.12ms
Meizu 16	Snapdragon 845	Android(arm64)	ncnn	7.72ms
Development board	Snapdragon 835(Monkey version)	Android(arm64)	ncnn	20.52ms
Development board	RK3399	Linux(arm64)	ncnn	35.04ms
Raspberrypi 3B	4xCortex-A53	Linux(arm64)	ncnn	62.31ms
Orangepi Zero Lts	H2+ 4xCortex-A7	Linux(armv7)	ncnn	550ms
Nvidia	Gtx 1050ti	Ubuntu(x64)	darknet	4.73ms
Intel	i7-8700	Ubuntu(x64)	ncnn	5.78ms

The above is a multi-core test benchmark
The above speed benchmark is tested by big core in big.little CPU
Raspberrypi 3B enable bf16s optimization，Raspberrypi 64 Bit OS
Rk3399 needs to lock the cpu to the highest frequency, ncnn and enable bf16s optimization

Pascal VOC performance index comparison

Network	Model Size	mAP(VOC 2007)	FLOPS
Tiny YOLOv2	60.5MB	57.1%	6.97BFlops
Tiny YOLOv3	33.4MB	58.4%	5.52BFlops
YOLO Nano	4.0MB	69.1%	4.51Bflops
MobileNetv2-SSD-Lite	13.8MB	68.6%	&Bflops
MobileNetV2-YOLOv3	11.52MB	70.20%	2.02Bflos
Pelee-SSD	21.68MB	70.09%	2.40Bflos
Yolo Fastest	1.3MB	61.02%	0.23Bflops
Yolo Fastest-XL	3.5MB	69.43%	0.70Bflops
MobileNetv2-Yolo-Lite	8.0MB	73.26%	1.80Bflops

Performance indicators reference from the papers and public indicators in the github project
MobileNetv2-Yolo-Lite: https://github.com/dog-qiuqiu/MobileNet-Yolo#mobilenetv2-yolov3-litenano-darknet

Yolo-Fastest-1.1 Pedestrian detection

Equipment	System	Framework	Run time
Raspberrypi 3B	Linux(arm64)	ncnn	62ms

Simple real-time pedestrian detection model based on yolo-fastest-1.1
Enable bf16s optimization，Raspberrypi 64 Bit OS

Demo

Compile

How to compile on Linux

This repo is based on Darknet project so the instructions for compiling the project are same (https://github.com/MuhammadAsadJaved/darknet#how-to-compile-on-windows-legacy-way)

Just do make in the Yolo-Fastest-master directory. Before make, you can set such options in the Makefile: link

GPU=1 to build with CUDA to accelerate by using GPU (CUDA should be in /usr/local/cuda)
CUDNN=1 to build with cuDNN v5-v7 to accelerate training by using GPU (cuDNN should be in /usr/local/cudnn)
CUDNN_HALF=1 to build for Tensor Cores (on Titan V / Tesla V100 / DGX-2 and later) speedup Detection 3x, Training 2x
OPENCV=1 to build with OpenCV 4.x/3.x/2.4.x - allows to detect on video files and video streams from network cameras or web-cams
Set the other options in the Makefile according to your need.

Test/Demo

*Run Yolo-Fastest , Yolo-Fastest-xl , Yolov3 or Yolov4 on image or video inputs

Demo on image input

*Note: change .data , .cfg , .weights and input image file in image_yolov3.sh for Yolo-Fastest-x1, Yolov3 and Yolov4

  sh image_yolov3.sh

Demo on video input

*Note: Use any input video and place in the data folder or use 0 in the video_yolov3.sh for webcam

*Note: change .data , .cfg , .weights and input video file in video_yolov3.sh for Yolo-Fastest-x1, Yolov3 and Yolov4

  sh video_yolov3.sh

Yolo-Fastest Test

Yolo-Fastest-xl Test

How to Train

Generate a pre-trained model for the initialization of the model backbone

  ./darknet partial yolo-fastest.cfg yolo-fastest.weights yolo-fastest.conv.109 109

Train

交流qq群:1062122604
https://github.com/AlexeyAB/darknet

  ./darknet detector train voc.data yolo-fastest.cfg yolo-fastest.conv.109

Deploy

NCNN

NCNN Conversion Tutorial

Benchmark:https://github.com/Tencent/ncnn/tree/master/benchmark
NCNN supports direct conversion of darknet models
darknet2ncnn: https://github.com/Tencent/ncnn/tree/master/tools/darknet

NCNN Sample

CamSample:https://github.com/dog-qiuqiu/Yolo-Fastest/tree/master/sample/ncnn
AndroidSample: https://github.com/WZTENG/YOLOv5_NCNN

MNN&TNN&MNN

https://github.com/dog-qiuqiu/MobileNet-Yolo#darknet2caffe-tutorial
Based on MNN: https://github.com/geekzhu001/Yolo-Fastest-MNN Run on : raspberry pi 4B 2G Input size : 320320 Average inference time : 0.035s*

ONNX&TensorRT

https://github.com/CaoWGG/TensorRT-YOLOv4
It is not efficient to run on Psacal and earlier GPU architectures. It is not recommended to deploy on such devices such as jeston nano(17ms/img), Tx1, Tx2, but there is no such problem in Turing GPU, such as jetson-Xavier-NX Can run efficiently

OpenCV DNN

https://blog.csdn.net/nihate/article/details/108670542

Thanks

Cite as

dog-qiuqiu. (2021, July 24). dog-qiuqiu/Yolo-Fastest: yolo-fastest-v1.1.0 (Version v.1.1.0). Zenodo. http://doi.org/10.5281/zenodo.5131532

yolo-fastest's People

Contributors

Stargazers

Watchers

Forkers

maxuehao qaz734913414 wavelet2008 zlszhonglongshen jinqi2376 wuxiaolianggit simonaries j201111100523 yippeesoft cqray1990 wanglong100 zy9957 cqu-night tonywork dsp6414 supershisong blankworld lianwaijinxifeng robin1987z xm88 witium superjcer mc261670164 vcvycy yingunjun zyg11 hell-to-heaven royzon perfyperfect youlei0106 watsonming qqyouhappy liuwenhaha yanggui19891007 chankeh alex-1997-wzx zilipeng learncrazy 141141 tonystark-001 zhangxg001 arnoldfychen 2019-paper-fun verigle maoweinuaa wangxl12 zgle-fork songhongxiang robotseye nestorlong muhammadasadjaved chenyuyi94 crystalgrape b-xiang zeta1999 yangyin2016 liaoheping toanpv-0639 zhouhaocomeon1 chaoli1991 flysj gudumingjian kangtingying limitmhw deepbehavier icewm nnnnnkk timverion 875353581 super-ljg introyz genff jackgwb loulansuiye ykingdsjj intjun rushgun studiouspan aliushn antonizdp tjufan cxfchxy gongkecun pinery-sls benjamesbabala wharu msnh2012 gzzgz zongyang-li xrosliang airopti jeshy tcsong jawaechan dashpulsar cv-ip howtolovechina ptklx lidaweinuc adrianhust

yolo-fastest's Issues

how to test and train on ubuntu16.04?

Hello,how to use Yolo-Fastest( weight and cfg of COCO and VOC)?Thanks for your reply.

code for benchmark test

Is these speed testing on Kirin 990 CPU or NPU，and could you please give the benchmark testing code demo？

TensorRT supported?

Amazing work!
I test the model of yolo-fastest-xl on a jetson nano, the FPS is more than 10 which is quite good.
I wonder if the TensorRT is introduced according to jkjung's work(https://github.com/jkjung-avt/tensorrt_demos), which could we achieve a real-time detection on edge devices.
Many thanks!

如何实现Alexnet网络的ncnn加速

您好，我是初学者。我根据Alexnet做了一个分类网络，然后如何能想您的yolofast一样进行ncnn加速？大概需要做的流程是什么？期待您的回答，谢谢

Architecture Definition and Training Details

Hi,

Really interesting work. I was wondering where can I find a simple architecture definition for Yolo-Fastest or Yolo-Fastest-XL? I am asking because I wanted to port it to TensorFlow to compare it with my own work.

Also, if you can provide more details on the training method, like which data augmentations, loss functions etc. you use, it would also be an interesting study.

Even if you are not releasing a paper, a simple doc with such details would be really helpful for someone following up on your work. For example, what exactly is your backbone? Where do you cut it off and what feature aggregation method did you use? Also how many object detection heads does your model have? These details would be really helpful.

Thank You

INT8 quantization

Can it do INT8 quantization with MNN or NCNN?
I tried to quantify it, the conversion did not report any errors, but I didn't get any predicted results.

训练模型对GPU的占用率不高

How to load your pretrain model in pytorch?

I am trying to convert your darknet model to pytorch, but I find that the size of your weight is mismatch with my model, here is my convert code:

def load_darknet_weights(self, weights_path):
     """Parses and loads the weights stored in 'weights_path'"""

     # Open the weights file
     with open(weights_path, "rb") as f:
         header = np.fromfile(f, dtype=np.int32, count=5)  # First five are header values
         print(header)
         self.header_info = header  # Needed to write header when saving weights
         self.seen = header[3]  # number of images seen during training
         weights = np.fromfile(f, dtype=np.float32)  # The rest are weights

     # Establish cutoff for loading backbone weights
     cutoff = None
     if "darknet53.conv.74" in weights_path:
         cutoff = 75

     ptr = 0
     for i, (module_def, module) in enumerate(zip(self.module_defs, self.module_list)):
         if i == cutoff:
             break
         if module_def["type"] in ["convolutional","normal_convolutional"]:
             conv_layer = module[0]
             if int(module_def["batch_normalize"])==1:
                 # Load BN bias, weights, running mean and running variance
                 print(module[0].state_dict()["weight"].shape)
                 bn_layer = module[1]
                 # print(bn_layer.state_dict())
                 num_b = bn_layer.bias.numel()  # Number of biases
                 print(num_b)
                 # Bias
                 bn_b = torch.from_numpy(weights[ptr : ptr + num_b]).view_as(bn_layer.bias)
                 bn_layer.bias.data.copy_(bn_b)
                 ptr += num_b
                 # Weight
                 bn_w = torch.from_numpy(weights[ptr : ptr + num_b]).view_as(bn_layer.weight)
                 bn_layer.weight.data.copy_(bn_w)
                 ptr += num_b
                 # Running Mean
                 bn_rm = torch.from_numpy(weights[ptr : ptr + num_b]).view_as(bn_layer.running_mean)
                 bn_layer.running_mean.data.copy_(bn_rm)
                 ptr += num_b
                 # Running Var
                 bn_rv = torch.from_numpy(weights[ptr : ptr + num_b]).view_as(bn_layer.running_var)
                 bn_layer.running_var.data.copy_(bn_rv)
                 ptr += num_b
                 # Load conv. weights
                 num_w = conv_layer.weight.numel()
                 print(num_w)
                 conv_w = torch.from_numpy(weights[ptr: ptr + num_w]).view_as(conv_layer.weight)
                 conv_layer.weight.data.copy_(conv_w)
                 ptr += num_w
             else:
                 # 对于yolov3.weights,不带bn的卷积层就是YOLO前的卷积层
                 if ".weights" in weights_path:
                     num_b = 255
                     ptr += num_b
                     num_w = int(self.module_defs[i-1]["filters"]) * 255
                     ptr += num_w
                 else:
                     # Load conv. bias
                     num_b = conv_layer.bias.numel()
                     conv_b = torch.from_numpy(weights[ptr : ptr + num_b]).view_as(conv_layer.bias)
                     conv_layer.bias.data.copy_(conv_b)
                     ptr += num_b
                     # Load conv. weights
                     num_w = conv_layer.weight.numel()
                     conv_w = torch.from_numpy(weights[ptr : ptr + num_w]).view_as(conv_layer.weight)
                     conv_layer.weight.data.copy_(conv_w)
                     ptr += num_w

I guess My convolution layer is different with yours, Here is my convolution layer:

elif module_def["type"] == "normal_convolutional":
            # print("==============input filtrs:",output_filters[-1],"=================")
            # print('convolution')
            bn = int(module_def["batch_normalize"])
            filters = int(module_def["filters"])
            kernel_size = int(module_def["size"])
            stride = int(module_def['stride'])
            pad = int(module_def['pad'])
            groups=int(module_def['groups'])
            modules.add_module(
                f"conv_3{module_i}",
                nn.Conv2d(
                    in_channels=output_filters[-1],
                    out_channels=filters,
                    kernel_size=kernel_size,
                    stride=stride,
                    padding=pad,
                    bias=not bn,
                    groups=groups
                    ),
                )
            # modules.add_module(f"batch_norm_{module_i}", nn.BatchNorm2d(filters, momentum=0.9, eps=1e-5))
            if bn:
                modules.add_module(f"batch_norm_1{module_i}", nn.BatchNorm2d(filters))
            if module_def["activation"] == "relu6":
                modules.add_module(f"leaky_{module_i}", nn.ReLU6(inplace=True))

I don't know where is different, if your know, could you finger out? thanks a lot.

Why using the leakyrelu activation Function？

Good job！it‘s really fast. and i wanna know why you use leaky relu but not relu?

train error

Hi,how to solve?

About ncnn_sample Compile Guide

Sincerely suggest that you could add “\” before the “`” in ncnn_sample's README.md, so that it may be easier for new comers to get started.

Why did you choose 109?

I already know that, Thank you very much.

class_loss is very high, even to 340+?

hello, when i prepare training with a generated pretrained model, and follow darknet yolov3 configure. the total loss is large big because of class_loss is very high ,even to 340+. how can i sovle this problem? ths

yolo-fastest.conv.109

hello I want to know where I can find the yolo-fastest.conv.109

To tranform caffe failed！ How can we solute the question？

Difference between Yolo-Fastest and MobileNetV2-YOLO-Fastest

Hi, i'm converting your models to TensorRT model to use in NVIDIA DeepStream SDK. I already ported MobileNetV2-YOLOv3-Nano and MobileNetV2-YOLOv3-Lite.

The Yolo-Fastest and MobileNetV2-YOLO-Fastest are the same model?

I need to know, to create parse functions.

Thanks.

custom widthxheight

How can one use this repos with a custom widthxheight? what should I change in config, width, height anchors, train from scratch??

CUDNN < 7 doesn't support groups, please upgrade!

您好，请教一下，我是在alexeyAB的darknet下训练的，会报上面的错误，是要更新cudnn吗？
因为修改环境不太方便，所以我试着将groups都改为=1和将所有的groups行删掉，然后用一个数据集进行训练，跑一小会就nan了。。。请问有什么解决办法吗？

先验框问题

老哥你两个先验框size是多少哇

预训练模型分类错误的问题

我是用预训练模型在训练了10000批次后，发现模型对区域的检测基本都能检测到，但检测归类的class还是预训练模型里的“car,bus.plan”之类的东西，而不是我自定义的分类。我在各种**.names都定义了我自己的分类，voc.data里也指定了name文件，但还是无效，这和我的训练次数或样本数量有关吗？还是使用预训练模型的问题呢？

operating speed of Raspberry Pi

Hello, I am on 1660TiGPU in ubuntu16.04, yolofastest speed is 30 frames, and about 1 frame on Raspberry Pi. Is this normal?Looking forward to your reply, thank you.

How to run sameple to detect video in Ros?

In the Ros system, the release topic of the camera is received, then the NCNNSAMPLE is used for testing and the release topic is the test result.

Update readme

Hi,
The readme file is not very clear for the demo.
1-what is the difference between darknet_images.py vs image_yolov3.sh and the same for video? Just two ways to run the demo or these are different for comparison?

2- How to compare the speed of yolov3 or v4 with yolo-fastest?

8 bit problem

hey,man,thanks for the work,it's cool. When i convert the model to 8bit by ncnn tool,the accuracy has dropped a lot,can you give me some suggetions?

球球大佬你好，请问下Efficient-lite把se层给去掉了和mobilenetv2的block一样，那和mobilenetv2还有什么区别的？

model convert(weights to h5、pb...) error

When I convert the weights file to other model files, it always report error: buffer is too small for requested array. I check the .cfg file,then I found there some 'groups' vars in [convolutional] layer, maybe it reads more weights than GroupConv2D needs and make the weights not enough to use, but I don't know how to split groups in yolo-fastest(filters=groups?).
Is the error caused by this problem, or others, and how to solve this error?

Darknet to TensorFlow

What about
Darknet weights (.weights) to TensorFlow weights (.ckpt)
Darknet model (.cfg) to TensorFlow graph (.pb, .meta)

How to get the pre-trained weights for yolo-fastest-xl ?

Hello qiuqiu,

Is my understanding below right ?
The commands for getting the pre-trained weights for yolo-fastest-xl and training yolo-fastest-xl are :

./darknet partial yolo-fastest-xl.cfg yolo-fastest-xl.weights yolo-fastest-xl.conv.109 109
./darknet detector train voc.data yolo-fastest-xl.cfg yolo-fastest-xl.conv.109

Thanks.

About intel i7 cpu predict 320*320 very slowly problem?

I trained model using NVIDIA Tesla P100, My cpu is intel i7 1.8G, Why infer 320*320 needing about 30ms, where is wrong?
Kirin 990 CPU’s time only is 6.74ms!! where is wrong?
Thank you very much!

我用NVIDIA Tesla P100训练的模型，在intel i7 1.8G cpu上预测320*320的时间是30ms左右，但是根据官网，Kirin 990 CPU预测同样大小是6.74ms。
这个差距太大了，是哪里出了问题？非常感谢！

total loss is very large

I follow the README file to train my dataset. but the total loss is still very large(>=350)at 1000 epochs and without ant decrease tread,HOW CAN I SOLVE THIS PROBLEM?

Papers about yolo-fastest

Hi,good job you have done.I want to read about the artical about the Yolo-Fastest,but I can not find where it is.

About anchors

How could we calculate anchors for this model?

onnx convert error

the code of https://github.com/CaoWGG/TensorRT-YOLOv4 can`t converts the yolo-fastest to onnx model.
typeerror: buffer is too small for requested array

nvcc fatal : Unsupported gpu architecture 'compute_61'

编译时报错，nvcc fatal : Unsupported gpu architecture 'compute_61'，显卡是笔记本rtx 2060 ，请问怎么解决？
when i compile it ，an error occurred，“nvcc fatal : Unsupported gpu architecture 'compute_61'”，my GPU is laptop edition RTX 2060 ，who can help me？

可以训练yolov4，v4-tiny

这个分支和yolov4那个分支有啥区别，也可以用来训练v4,v4-tiny吧

inference costs 29ms 1080TI

I followed your README.md, modified Makefile to open CUDA CUDNN and GPU, and then run "make"
I used ./Yolo-Fastest/VOC/yolo-fastest.weights to run image_yolov3.sh, it costs 29ms on 1080TI GPU.
I do not know where is wrong
looking forward to your reply

Demo

How to run a demo, please? It seems not so clear in readme.md.

EfficientNet-lite

how can I change *.cfg with EfficientNet-lite1~4

quite slow inference on GPU 2070super

I tested yolo-fastest on my PC, which run quite slowly with almost 25ms per frame on average. The XL version also takes even much more consuming time with 45ms. BTW, the version of my GPU is 2070super. Did I test it not correctly? Need help.

COCO mAP for Yolo-Fastest

@dog-qiuqiu Hi Sir, Can you please update the COCO mAP for the Yolo-Fastest?
https://github.com/dog-qiuqiu/Yolo-Fastest#evaluating-indicatorbenchmark

Seems like it's missing, or maybe you are still working on it.
Thanks

主干添加CSP模块

Can’t use GPU for training or testing

I used CUDA 10.2，cudnn 7.6.5，GPU NVIDIA GeForce RTX 2060

I have set up in Makefile
GPU=1
CUDNN=1
ifeq ($(GPU), 1)
COMMON+= -DGPU -I/usr/local/cuda-10.2/include/
CFLAGS+= -DGPU
LDFLAGS+= -L/usr/local/cuda-10.2/lib64 -lcuda -lcudart -lcublas -lcurand

C:\Yolo-Fastest\build\darknet\x64> ./darknet.exe detector test ./cfg/voc.data ./cfg/yolo-fastest.cfg ./cfg/yolo-fastest.weights ./data/person.jpg -i 1 -thresh 0.25 -out_filename ./data/person_output.jpg

CUDA status Error: file: c:\yolo-fastest\src\dark_cuda.c : cuda_set_device() : line: 39 : build time: Nov 24 2020 - 18:36:35
CUDA Error: invalid device ordinal

Can anyone help to fix this?

Yolo-fast 转ncnn 加载崩溃

自己训练的Yolo-Fast转换成.param，一旦执行load_param的时候就会崩溃，怎么办?

Android deployment

Can I use this project（Android_NCNN_yolov4-tiny） to deploy？

convert model to pb

I have tried : https://github.com/Linzmin1927/DW2TF

the error:

  File "/home/lebhoryi/RT-Thread/Detection_API/Yolo-Fastest/DW2TF/util/reader.py", line 37, in walk
    assert end_point <= self.size, 'Over-read {}'.format(self.path)
AssertionError: Over-read ../Yolo-Fastest/COCO/yolo-fastest.weights

how could I do it

请问到哪里找yolo-fastest.conv.109？

感谢您的技术分享！！
请问预训练weights yolo-fastest.conv.109要到哪里下载呢？
多谢！

How about pytorch?

Hi, qiuqiu,

Awesome!
Very you for the great project sharing!

I am facing a little trouble.

I want to replace yolov3-spp model in Alphapose.
however that convert by pytorch like this

elif  module_type == "shortcut":
    from_ = int(modules[i]["from"])
    x = outputs[i-1] + outputs[i+from_]
    outputs[i] = x             

elif module_type == 'yolo':        
                
anchors = self.module_list[i][0].anchors
#Get the input dimensions
inp_dim = int (self.net_info["height"])

#Get the number of classes
num_classes = int (modules[i]["classes"])

#Output the result
x = x.data.to(args.device)
x = predict_transform(x, inp_dim, anchors, num_classes, args)

that support layers like shortcut , yolo ...etc

but this project added dropout layer.
anyone have any idea to convert that?

please help, thanks!

请问如果我想训练自己的数据，需要怎么更改.cfg文件？

您好！

如果我想训练自己的数据集，需要如何更改.cfg文件？我看这个项目中的.cfg文件格式和darknet yolov4的还是不太一样，比如我的classes数量和预训练的模型不一样，我需要按什么规则更改呢？

您好，请问怎么训练RGB-D的数据呢？

您好，我想在RGB-D数据上训练yolo,请问要怎么去配置呢？

not fast enough on Raspberry pi 4

Thank you for releasing Yolo-Fastest! It is indeed the fastest model I have used on Raspberry pi.
However, my running time is about 100ms at 2 threads and more than 100ms at 4 threads. I checked the CPU usage(60% percent at most) and memory usage(below 30%), both of which are low.
I wonder what else I can do with the Raspberry pi 4 to reach 33ms with yolo-fastest.
Thanks a lot!

dog-qiuqiu / yolo-fastest Goto Github PK

yolo-fastest's Introduction

⚡Yolo-Fastest⚡

Evaluating indicator/Benchmark

Yolo-Fastest-1.1 Multi-platform benchmark

Pascal VOC performance index comparison

Yolo-Fastest-1.1 Pedestrian detection

Demo

Compile

How to compile on Linux

Test/Demo

Demo on image input

Demo on video input

Yolo-Fastest Test

Yolo-Fastest-xl Test

How to Train

Generate a pre-trained model for the initialization of the model backbone

Train

Deploy

NCNN

NCNN Conversion Tutorial

NCNN Sample

MNN&TNN&MNN

ONNX&TensorRT

OpenCV DNN

Thanks

Cite as

yolo-fastest's People

Contributors

Stargazers

Watchers

Forkers

yolo-fastest's Issues

Recommend Projects

Recommend Topics

Recommend Org