
Official YOLOv7

Implementation of the paper "YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors"


Performance

MS COCO

Model        Test Size  AP (test)  AP50 (test)  AP75 (test)  Batch 1 FPS  Batch 32 avg. time
YOLOv7       640        51.4%      69.7%        55.9%        161 fps      2.8 ms
YOLOv7-X     640        53.1%      71.2%        57.8%        114 fps      4.3 ms
YOLOv7-W6    1280       54.9%      72.6%        60.1%        84 fps       7.6 ms
YOLOv7-E6    1280       56.0%      73.5%        61.2%        56 fps       12.3 ms
YOLOv7-D6    1280       56.6%      74.0%        61.8%        44 fps       15.0 ms
YOLOv7-E6E   1280       56.8%      74.4%        62.1%        36 fps       18.7 ms

Installation

Docker environment (recommended)

# create the docker container; adjust the shared-memory size (--shm-size) if you have more RAM
nvidia-docker run --name yolov7 -it -v your_coco_path/:/coco/ -v your_code_path/:/yolov7 --shm-size=64g nvcr.io/nvidia/pytorch:21.08-py3

# apt install required packages
apt update
apt install -y zip htop screen libgl1-mesa-glx

# pip install required packages
pip install seaborn thop

# go to code folder
cd /yolov7

Testing

Pretrained weights: yolov7.pt, yolov7x.pt, yolov7-w6.pt, yolov7-e6.pt, yolov7-d6.pt, yolov7-e6e.pt

python test.py --data data/coco.yaml --img 640 --batch 32 --conf 0.001 --iou 0.65 --device 0 --weights yolov7.pt --name yolov7_640_val

You will get the results:

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.51206
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.69730
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.55521
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.35247
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.55937
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.66693
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.38453
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.63765
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.68772
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.53766
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.73549
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.83868

To measure accuracy, download the COCO annotations for pycocotools to ./coco/annotations/instances_val2017.json.
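
If you'd rather drive the evaluation from Python, here is a minimal pycocotools sketch. It assumes a COCO-format detections JSON; the name predictions.json is hypothetical (test.py can write such a file).

# Minimal pycocotools evaluation sketch.  'predictions.json' is a hypothetical
# name for a COCO-format detections file such as the one test.py can save.
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO('./coco/annotations/instances_val2017.json')  # ground truth
coco_dt = coco_gt.loadRes('predictions.json')                # detections
coco_eval = COCOeval(coco_gt, coco_dt, 'bbox')
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()  # prints an AP/AR table like the one shown above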

Training

Data preparation

bash scripts/get_coco.sh
  • Download MS COCO dataset images (train, val, test) and labels. If you have previously used a different version of YOLO, we strongly recommend deleting the train2017.cache and val2017.cache files and re-downloading the labels.

Single GPU training

# train p5 models
python train.py --workers 8 --device 0 --batch-size 32 --data data/coco.yaml --img 640 640 --cfg cfg/training/yolov7.yaml --weights '' --name yolov7 --hyp data/hyp.scratch.p5.yaml

# train p6 models
python train_aux.py --workers 8 --device 0 --batch-size 16 --data data/coco.yaml --img 1280 1280 --cfg cfg/training/yolov7-w6.yaml --weights '' --name yolov7-w6 --hyp data/hyp.scratch.p6.yaml

Multiple GPU training

# train p5 models
python -m torch.distributed.launch --nproc_per_node 4 --master_port 9527 train.py --workers 8 --device 0,1,2,3 --sync-bn --batch-size 128 --data data/coco.yaml --img 640 640 --cfg cfg/training/yolov7.yaml --weights '' --name yolov7 --hyp data/hyp.scratch.p5.yaml

# train p6 models
python -m torch.distributed.launch --nproc_per_node 8 --master_port 9527 train_aux.py --workers 8 --device 0,1,2,3,4,5,6,7 --sync-bn --batch-size 128 --data data/coco.yaml --img 1280 1280 --cfg cfg/training/yolov7-w6.yaml --weights '' --name yolov7-w6 --hyp data/hyp.scratch.p6.yaml

Transfer learning

Pretrained weights for transfer learning: yolov7_training.pt, yolov7x_training.pt, yolov7-w6_training.pt, yolov7-e6_training.pt, yolov7-d6_training.pt, yolov7-e6e_training.pt

Single GPU finetuning for custom dataset

# finetune p5 models
python train.py --workers 8 --device 0 --batch-size 32 --data data/custom.yaml --img 640 640 --cfg cfg/training/yolov7-custom.yaml --weights 'yolov7_training.pt' --name yolov7-custom --hyp data/hyp.scratch.custom.yaml

# finetune p6 models
python train_aux.py --workers 8 --device 0 --batch-size 16 --data data/custom.yaml --img 1280 1280 --cfg cfg/training/yolov7-w6-custom.yaml --weights 'yolov7-w6_training.pt' --name yolov7-w6-custom --hyp data/hyp.scratch.custom.yaml

Re-parameterization

See reparameterization.ipynb
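
The notebook walks through the full procedure; conceptually, re-parameterization folds training-time branches into single convolutions. Below is a generic Conv+BatchNorm fusion sketch, the basic building block of that folding. It is illustrative only, not the notebook's exact code.

# Generic Conv+BN fusion, the basic building block of re-parameterization.
# Illustrative only; see reparameterization.ipynb for the real procedure.
import torch

def fuse_conv_bn(conv, bn):
    fused = torch.nn.Conv2d(conv.in_channels, conv.out_channels,
                            conv.kernel_size, conv.stride,
                            conv.padding, groups=conv.groups, bias=True)
    # scale = gamma / sqrt(var + eps), applied per output channel
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
    fused.weight.data = conv.weight * scale.reshape(-1, 1, 1, 1)
    conv_bias = conv.bias if conv.bias is not None else torch.zeros_like(bn.running_mean)
    fused.bias.data = bn.bias + (conv_bias - bn.running_mean) * scale
    return fused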

Inference

On video:

python detect.py --weights yolov7.pt --conf 0.25 --img-size 640 --source yourvideo.mp4

On image:

python detect.py --weights yolov7.pt --conf 0.25 --img-size 640 --source inference/images/horses.jpg
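
Inference can also be scripted. The sketch below uses torch.hub and assumes the repository's hubconf.py exposes a 'custom' entry point behaving as in the YOLOv5 lineage of this codebase; check hubconf.py for the exact API.

# Minimal programmatic inference sketch; assumes hubconf.py exposes a
# 'custom' entry point as in the YOLOv5 lineage of this codebase.
import torch

model = torch.hub.load('WongKinYiu/yolov7', 'custom', 'yolov7.pt')
model.conf = 0.25                               # confidence threshold
results = model('inference/images/horses.jpg')  # path, URL, or ndarray
results.print()                                 # summary to stdout
print(results.pandas().xyxy[0])                 # detections as a DataFrame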

Export

PyTorch to CoreML (and inference on macOS/iOS)

PyTorch to ONNX with NMS (and inference)

python export.py --weights yolov7-tiny.pt --grid --end2end --simplify \
        --topk-all 100 --iou-thres 0.65 --conf-thres 0.35 --img-size 640 640 --max-wh 640
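
The exported model can be smoke-tested with onnxruntime. The sketch below only checks that the graph runs and prints the output shapes, since the exact output layout depends on the export flags; inspect the output metadata rather than assuming it.

# Quick onnxruntime smoke test for the exported model.  The output layout
# depends on the export flags, so inspect the output metadata first.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession('yolov7-tiny.onnx', providers=['CPUExecutionProvider'])
name = session.get_inputs()[0].name                 # typically 'images'
img = np.zeros((1, 3, 640, 640), dtype=np.float32)  # letterboxed, normalized input
outputs = session.run(None, {name: img})
for out, meta in zip(outputs, session.get_outputs()):
    print(meta.name, out.shape)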

PyTorch to TensorRT with NMS (and inference)

wget https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7-tiny.pt
python export.py --weights ./yolov7-tiny.pt --grid --end2end --simplify --topk-all 100 --iou-thres 0.65 --conf-thres 0.35 --img-size 640 640
git clone https://github.com/Linaom1214/tensorrt-python.git
python ./tensorrt-python/export.py -o yolov7-tiny.onnx -e yolov7-tiny-nms.trt -p fp16

PyTorch to TensorRT, another way

wget https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7-tiny.pt
python export.py --weights yolov7-tiny.pt --grid --include-nms
git clone https://github.com/Linaom1214/tensorrt-python.git
python ./tensorrt-python/export.py -o yolov7-tiny.onnx -e yolov7-tiny-nms.trt -p fp16

# Or use trtexec to convert ONNX to TensorRT engine
/usr/src/tensorrt/bin/trtexec --onnx=yolov7-tiny.onnx --saveEngine=yolov7-tiny-nms.trt --fp16

Tested with: Python 3.7.13, PyTorch 1.12.0+cu113

Pose estimation

Code and weights: yolov7-w6-pose.pt

See keypoint.ipynb.

Instance segmentation (with NTU)

Code and weights: yolov7-mask.pt

See instance.ipynb.

Instance segmentation

Code and weights: yolov7-seg.pt

YOLOv7 for instance segmentation (YOLOR + YOLOv5 + YOLACT)

Model       Test Size  AP (box)  AP50 (box)  AP75 (box)  AP (mask)  AP50 (mask)  AP75 (mask)
YOLOv7-seg  640        51.4%     69.4%       55.8%       41.5%      65.5%        43.7%

Anchor free detection head

Code and weights: yolov7-u6.pt

YOLOv7 with decoupled TAL head (YOLOR + YOLOv5 + YOLOv6)

Model      Test Size  AP (val)  AP50 (val)  AP75 (val)
YOLOv7-u6  640        52.6%     69.7%       57.3%

Citation

@inproceedings{wang2023yolov7,
  title={{YOLOv7}: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors},
  author={Wang, Chien-Yao and Bochkovskiy, Alexey and Liao, Hong-Yuan Mark},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2023}
}
@article{wang2023designing,
  title={Designing Network Design Strategies Through Gradient Path Analysis},
  author={Wang, Chien-Yao and Liao, Hong-Yuan Mark and Yeh, I-Hau},
  journal={Journal of Information Science and Engineering},
  year={2023}
}

Teaser

YOLOv7-semantic & YOLOv7-panoptic & YOLOv7-caption

YOLOv7-semantic & YOLOv7-detection & YOLOv7-depth (with NTUT)

YOLOv7-3d-detection & YOLOv7-lidar & YOLOv7-road (with NTUT)



yolov7's Issues

About the architecture

Hi, I tried to plot the yolov7 backbone blocks.
Could you help me check whether the figures below are correct?

Thanks a lot.

[Figures: plotted backbone diagrams for YOLOv7-Tiny, YOLOv7, YOLOv7-d6, YOLOv7-e6, and YOLOv7-e6e]

Dataset cache incompatible with YOLOv5 cache

As usual, when starting a new training run, YOLOv7 tries to create train.cache/valid.cache.
These are the same names as the YOLOv5 cache files which, if they already exist, will crash the training.

For present and future compatibility, could the cache names include some sort of identifier? For example: "train.yolov7", "train.cachev7", etc.
Just a reminder that, for YOLOR, it is "train.cache3".

The speed of the tiny model.

Hi,
I tested the tiny model on a CPU and it ran at approximately 350 ms per image; on the same CPU, the nano version of YOLOv5 was twice as fast as YOLOv7-tiny. However, the paper states that YOLOv7-tiny is approximately twice as fast as YOLOv5-N when executed on a GPU. Is that expected?
Many thanks!

redundant code

Thanks for the great work. A lot of the code looks like YOLOv5; is the author still cleaning it up?

Please provide a full set of 640-scale training models, and don't do curve comparisons

Please provide a full set of 640-scale training models, and don't do curve comparisons. There are many unfair comparisons in your table; they will not make the work look superior, they will only mislead readers and leave them more confused.

Issue using the webcam

Hi @WongKinYiu, excited to try the implementation!
I can't use the webcam: the LoadWebcam function is never called in detect.py; only LoadStreams is used.
One more thing: could you please provide the tiny pretrained model?
Thank you so much!

TRT errors on concatenation route_15

Total weights read: 37669889
Building YOLO network

      layer                        input               output         weightPtr
(0)   conv_silu                  3 x 640 x 640      32 x 640 x 640    992    
(1)   conv_silu                 32 x 640 x 640      64 x 320 x 320    19680  
(2)   conv_silu                 64 x 320 x 320      64 x 320 x 320    56800  
(3)   conv_silu                 64 x 320 x 320     128 x 160 x 160    131040 
(4)   conv_silu                128 x 160 x 160      64 x 160 x 160    139488 
(5)   conv_silu                 64 x 160 x 160      64 x 160 x 160    143840 
(6)   conv_silu                 64 x 160 x 160      64 x 160 x 160    180960 
(7)   conv_silu                 64 x 160 x 160      64 x 160 x 160    218080 
(8)   conv_silu                 64 x 160 x 160      64 x 160 x 160    255200 
(9)   conv_silu                 64 x 160 x 160      64 x 160 x 160    292320 
(10)  route                           -            128 x 160 x 160    292320 
(11)  conv_silu                128 x 160 x 160     256 x 160 x 160    326112 
(12)  conv_silu                256 x 160 x 160     128 x 160 x 160    359392 
(13)  conv_silu                128 x 160 x 160     128 x 160 x 160    376288 
(14)  conv_silu                128 x 160 x 160     128 x  80 x  80    524256 
ERROR: [TRT]: 4: [layers.cpp::estimateOutputDims::1944] Error Code 4: Internal Error (route_15: all concat input tensors must have the same dimensions except on the concatenation axis (0), but dimensions mismatched at index 1. Input 0 shape: [128,80,80], Input 1 shape: [128,160,160])

I tried my hand at generating Darknet weights for yolov7.pt, but I am blocked on converting them to TensorRT. Any guidance on this issue?

Classification label

t[range(n), tcls[i]] = self.cp
#t[t==self.cp] = iou.detach().clamp(0).type(t.dtype)

Hello, I'm glad to see the release of yolov7. You use self.cp rather than the IoU value as the classification label; is that because the IoU label led to performance degradation in your experiments?
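
For readers following along, here is a toy contrast of the two labelling schemes the snippet refers to (shapes are illustrative):

# Illustrative contrast of the two schemes in the quoted snippet.
import torch

n, nc = 4, 80
t = torch.zeros(n, nc)                 # classification targets
tcls = torch.randint(0, nc, (n,))      # assigned class per target
iou = torch.rand(n)                    # pred/target IoU per matched pair
cp = 1.0                               # positive label value (self.cp)

t[range(n), tcls] = cp                 # released code: hard one-hot label
# t[range(n), tcls] = iou.clamp(0)     # commented-out variant: IoU-aware soft label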

Distributed training(DDP)

Why is this written as parser.add_argument('--local_rank', type=int, default=-1, help='DDP parameter, do not modify')? If I want to use DDP, should I change the default to 0?
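
For context, here is a hedged sketch of the usual torch.distributed.launch contract, which is how this flag is normally consumed: the launcher supplies --local_rank itself, so the default is not meant to be edited by hand.

# torch.distributed.launch passes --local_rank=<rank> to each spawned process,
# so -1 simply means "not launched in DDP mode"; you do not set it by hand.
import argparse
import torch

parser = argparse.ArgumentParser()
parser.add_argument('--local_rank', type=int, default=-1, help='DDP parameter, do not modify')
opt = parser.parse_args()

if opt.local_rank != -1:               # we were started by the DDP launcher
    torch.cuda.set_device(opt.local_rank)
    torch.distributed.init_process_group(backend='nccl', init_method='env://')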

matching_bs[i] = torch.cat(matching_bs[i], dim=0)

File "/root/yolov7-main/utils/loss.py", line 778, in build_targets
matching_bs[i] = torch.cat(matching_bs[i], dim=0)
NotImplementedError: There were no tensor arguments to this function (e.g., you passed an empty list of Tensors), but no fallback function is registered for schema aten::_cat. This usually means that this function requires a non-empty list of Tensors, or that you (the operator writer) forgot to register a fallback function. Available functions are [CPU, CUDA, QuantizedCPU, BackendSelect, Named, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, UNKNOWN_TENSOR_TYPE_ID, AutogradMLC, AutogradHPU, AutogradNestedTensor, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, Tracer, Autocast, Batched, VmapMode].
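
This error means torch.cat received an empty list: one pyramid level ended up with no matched targets. A hedged workaround sketch follows (it simply guards the call; the upstream fix may differ):

# torch.cat([]) is undefined; guard against levels with no matched targets.
# A workaround sketch only; the upstream fix may differ.
import torch

matching = []                      # e.g. no targets matched this pyramid level
if len(matching):
    out = torch.cat(matching, dim=0)
else:
    out = torch.tensor([], dtype=torch.int64)   # empty placeholder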

wow

When will the training code be released?

REORG operation may not be needed

The REORG operation may not be needed; as demonstrated in the YOLOv5 work, better efficiency is achieved with a single Conv[64, 6, 2, 2] layer.
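
For reference, here are the two stems being compared, sketched in PyTorch (illustrative shapes; the 6x6 stride-2 conv is the form YOLOv5 v6.0 switched to):

# The two stems being compared: REORG/Focus (space-to-depth + 3x3 conv)
# versus the single 6x6 stride-2 conv that YOLOv5 v6.0 adopted.
import torch
import torch.nn as nn

x = torch.randn(1, 3, 640, 640)

# REORG/Focus: slice into 4 pixel-offset views, concatenate, then convolve
reorg = torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2],
                   x[..., ::2, 1::2], x[..., 1::2, 1::2]], dim=1)
focus_out = nn.Conv2d(12, 64, 3, 1, 1)(reorg)

# Same stride/receptive field with one layer: Conv[64, 6, 2, 2]
conv_out = nn.Conv2d(3, 64, 6, 2, 2)(x)

print(focus_out.shape, conv_out.shape)  # both (1, 64, 320, 320)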

attempt_download ERROR

File "\utils\google_utils.py", line 26, in attempt_download
    assets = [x['name'] for x in response['assets']]  # release assets
KeyError: 'assets'
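
This KeyError usually means the GitHub API returned an error payload (for example, when rate-limited) that carries no 'assets' field. A hedged defensive rewrite of the failing line:

# 'response' is the parsed GitHub API JSON; error payloads (e.g. rate
# limiting) carry no 'assets' key, so fall back to an empty list.
assets = [x['name'] for x in response.get('assets', [])]  # release assets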

onnx->tensorrt

When converting ONNX to TensorRT:
In node 163 (parseGraph): INVALID_GRAPH: Assertion failed: ctx->tensors().count(inputName)

The ONNX file was exported with models/export.py; the model under test is the pretrained yolov7.pt.

More loss function information

Can you provide additional information about the complete composite loss function formula? The paper does not provide enough information.

CUDA out of memory on Colab

I am running inference on a video in Google Colab, and after processing some frames I get CUDA out of memory. @WongKinYiu
Anyway, thanks for the great work!

Train custom dataset

Hi,

Is the repo ready for training on a custom dataset?

I am trying to find the implementation of the coarse-to-fine lead-head-guided label assigner in the repo; can anyone point out where it is located?

YOLOv7-tiny version in YAML

Hi,

Is a .yaml version of yolov7-tiny.cfg available?

I'd like to train it on my custom data using PyTorch. I have already tested it with Darknet and it works fine.

Setup or installation instructions

Hi, I am wondering if there will be some notes regarding the setup or installation of the official YOLOv7, preferably with Docker? It would be great if some material on this could be added to the repo.

good job !!!

I want to ask: why is P always much lower than R during training?
Are these two values weighted during training?

Thank you for this!! I dub this the official YOLOv7

I really appreciate your contribution! It is backed by an official peer-reviewed paper and has reputable authors. I wish we had an official board to regulate YOLO models. Looking forward to making a course on this model!

SOTA claims vs. leaderboard misalignment

@WongKinYiu @AlexeyAB
Hi, friendly ping:

YOLOv7 surpasses all known object detectors in both speed and accuracy in the range from 5 FPS to 160 FPS and has the highest accuracy 56.8% AP among all known real-time object detectors with 30 FPS or higher on GPU V100.

That is a strange claim when you actually rank #20 on COCO.
If we exclude all models with extra training data, you still rank #11.
The #1 without extra data is Dual-Swin-L (HTC, multi-scale) with 60.1 box AP;
with extra data it is DINO (Swin-L, multi-scale) with 63.3 box AP.

What are the INPUT_BLOB_NAME and OUTPUT_BLOB_NAME for the TensorRT forward pass?

yolov5 is:
const char* INPUT_BLOB_NAME = "data";
const char* OUTPUT_BLOB_NAME = "prob";

yolov6 is:
const char* INPUT_BLOB_NAME = "image_arrays";
const char* OUTPUT_BLOB_NAME = "outputs";

yolov7's input is "images" and output is "output":
const char* INPUT_BLOB_NAME = "images";
const char* OUTPUT_BLOB_NAME = "output";

With yolov7, using the above, getBindingIndex returns -1 for both, as below:
IRuntime* runtime = createInferRuntime(gLogger);
assert(runtime != nullptr);
ICudaEngine* engine = runtime->deserializeCudaEngine(trtModelStream, size);
assert(engine != nullptr);
IExecutionContext* context = engine->createExecutionContext();
assert(context != nullptr);
delete[] trtModelStream;
const int inputIndex = engine->getBindingIndex(INPUT_BLOB_NAME);   // returns -1
const int outputIndex = engine->getBindingIndex(OUTPUT_BLOB_NAME); // returns -1
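
Binding names are baked into the engine at export time, so rather than guessing, enumerate them. Here is a Python sketch using the pre-8.5 TensorRT binding API; the engine filename is the one produced by the export steps earlier in this README.

# Enumerate the engine's I/O bindings instead of hard-coding their names.
# Uses the pre-8.5 TensorRT binding API.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
with open('yolov7-tiny-nms.trt', 'rb') as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

for i in range(engine.num_bindings):
    kind = 'input' if engine.binding_is_input(i) else 'output'
    print(i, kind, engine.get_binding_name(i), engine.get_binding_shape(i))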

error on train

I tried to train on my dataset (which worked fine on YOLOv5) and got an error after several steps:

     Epoch   gpu_mem       box       obj       cls     total    labels  img_size
       0/4     1.41G   0.04262     2.882         0     2.925         1       640:   7%| | 3/41 [00:06<01:25,  2.24s/i
Traceback (most recent call last):
  File "./train.py", line 609, in <module>
    train(hyp, opt, device, tb_writer)
  File "./train.py", line 362, in train
    loss, loss_items = compute_loss_ota(pred, targets.to(device), imgs)  # loss scaled by batch_size
  File "/mnt/HD0/projects/yolov7/yolov7/utils/loss.py", line 585, in __call__
    bs, as_, gjs, gis, targets, anchors = self.build_targets(p, targets, imgs)
  File "/mnt/HD0/projects/yolov7/yolov7/utils/loss.py", line 778, in build_targets
    matching_bs[i] = torch.cat(matching_bs[i], dim=0)
NotImplementedError: There were no tensor arguments to this function (e.g., you passed an empty list of Tensors), but no fallback function is registered for schema aten::_cat.  This usually means that this function requires a non-empty list of Tensors, or that you (the operator writer) forgot to register a fallback function.  Available functions are [CPU, CUDA, QuantizedCPU, BackendSelect, Python, Named, Conjugate, Negative, ZeroTensor, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradLazy, AutogradXPU, AutogradMLC, AutogradHPU, AutogradNestedTensor, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, Tracer, AutocastCPU, Autocast, Batched, VmapMode, Functionalize].

How can I fix this? I tried on both GPU and CPU.
For CPU, I had to fix the code at line 71 of train.py:

run_id = torch.load(weights, map_location=device).get('wandb_id') if weights.endswith('.pt') and os.path.isfile(weights) else None

(the map_location=device parameter was added).

About the yolov7 model configurations

First of all, this is great work derived from the YOLO series. I notice that the cfg folder in your repo has a deploy directory with various YOLOv7 model configuration files; are these the model configs after re-parameterization, i.e. different from the training configs?

Does it run on Coral TPU?

Which models can be run on the Google Coral TPU Accelerator?
Is there a code snippet to convert a model into model_quantized.tflite?

1

RuntimeError: There were no tensor arguments to this function (e.g., you passed an empty list of Tensors), but no fallback function is registered for schema aten::_cat. This usually means that this function requires a non-empty list of Tensors. Available functions are [CPU, CUDA, QuantizedCPU, BackendSelect, Named, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradNestedTensor, UNKNOWN_TENSOR_TYPE_ID, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, Tracer, Autocast, Batched, VmapMode].
