darknet's Introduction

YOLO modifications

  1. (Intro) What is Darknet project?
  2. (Intro) What is YOLO
  3. My modifications
    - Testing multiple images
    - Testing multiple thresholds
  4. FAQ

Warning: As the .weights files are large, I am sharing them on my Google Drive. The weights needed to run this tutorial can be downloaded here. Don't forget to copy the weights file to the folder /newdata/.


Darknet

Darknet is an open source neural network framework written in C and CUDA. It is fast, easy to install, and supports CPU and GPU computation.

For more information see the Darknet project website.

For questions or issues please use the Google Group

YOLO

YOLO (You Only Look Once) is a real-time object detection and classification system that obtained excellent results on the Pascal VOC dataset.

So far, YOLO has several versions: YOLO V1, YOLO V2 (also referred to as YOLO 9000) and YOLO V3. Click on the image below to watch YOLO 9000's promo video.

The authors have created a website explaining how it works, how to use it and how to train yolo with your images. Check the references below:

YOLO: You Only Look Once: Unified, Real-Time Object Detection (2016)
(Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi)
[official site]
[site] [pdf] [slides] [talk] [ted talk]

YOLO9000: Better, Faster, Stronger (2017)
(Joseph Redmon, Ali Farhadi)
[official site]
[site] [pdf] [talk] [slides]

YOLOv3: An Incremental Improvement (2018)
(Joseph Redmon, Ali Farhadi)
[official site]
[site] [pdf]

YOLO: People talking about it
[Andrew NG] [Siraj Raval]

YOLO: People writing about it (Explanations and codes)
[Towards data science]: A brief summary about yolo and how it works.
[Machine Think blog]: A brief summary about yolo and how it works.
[Timebutt's github]: A tutorial explaining how to train yolo 9000 to detect a single class object.
[Timebutt's github]: Read this if you want to understand yolo's training output -> Not everything is correct here. Be careful!
[Cvjena's github]: Comments on some of the tags used in the cfg files.
[Guanghan Ning's blog]: A tutorial explaining how to train yolo v1 with your own data. The author used two classes (yield and stop signs).
[AlexeyAB's github]: Very good project forked from yolo 9000 supporting Windows and Linux.
[Google's Group]: Excellent source of information. People ask and answer questions about darknet and yolo.
[Guanghan Ning's blog]: Studies and analysis on reducing the running time of Yolo on CPU.
[Guanghan Ning's blog]: Recurrent YOLO. This is an interesting work mixing recurrent network and yolo for object tracking.
[Jonathan Hui]: One of the most detailed and correct explanations about YOLO V2.
[Ayoosh Kathuria]: What’s new in YOLO v3?

My modifications:

Recently I forked the official darknet project and modified it to meet my needs. Below you can find some additional functions I added to the original project.

All the examples can be easily run. You just need to clone or download this repository, compile it and run the commands :)

Testing multiple images

Let's say you want to detect objects in a single image or in multiple images, given a network structure and your weights file. Using this function you can also choose to visualize the results (images with bounding boxes) or save them. You can save the detections (bounding boxes and classes) in .txt files and also save the resulting images.

Another good thing is that you don't need to pass the arguments in a specific order anymore. This function makes the work easier by accepting the arguments in any order you want.

See the example below to detect multiple images:

./darknet testimages newdata/voc.data newdata/yolo-voc.2.0.cfg newdata/yolo-voc_final.weights -savetxt -saveimg

Arguments:

  • newdata/voc.data: the path for your voc.data file. Your voc.data file must contain the following tags:
    • names: the path to the file containing a list of classes' names.
    • test: the path to the text file containing a list of images to be tested.
    • results: the path to the folder where your results will be saved.
      See here an example of the voc.data file.
  • newdata/yolo-voc.2.0.cfg: The configuration file that represents the backbone of the YOLO v2.
  • newdata/yolo-voc_final.weights: The pretrained weights used in this example. Due to its size (~268MB), this weights file was not committed with this project, so you need to download it here and put it in the folder /newdata/.
  • -savetxt: this is an optional argument. If you add this argument, a text file will be created for each image containing the bounding boxes and classes detected. It will be saved in the results folder specified in the voc.data file.
  • -saveimg: this is also an optional argument. With this argument, the resulting images with the detected objects will be saved in the results folder specified in the voc.data file.
  • -thresh: also optional. The default value is 0.24. Only bounding boxes with confidence higher than or equal to this threshold will be considered.
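For reference, a minimal voc.data file using the three tags described above could look like the sketch below. The paths are placeholders chosen for this example, not the actual files shipped with the repository:

```
names = newdata/voc.names
test = newdata/test.txt
results = newdata/results
```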

The output detections will be seen as:

If you add the -saveimg and -savetxt arguments, the results (_dets.txt and .png files) will be created in the results folder specified in your newdata/voc.data file as seen below:

See below the content of an image and its corresponding txt file:

Each line of the _dets.txt file represents a bounding box. The values representing a bounding box are: id confidence relative_center_x relative_center_y relative_width relative_height. The id represents the order in which the detected object's class appears in the names tag of your newdata/voc.data file. The confidence represents, as a percentage, how sure YOLO is of that detection. Remember the threshold in the ./darknet testimages command? The confidence of the detected objects will always be equal to or higher than the threshold you set.
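As an illustration of this format, the Python sketch below (not part of darknet) parses one _dets.txt line and converts the relative coordinates into a pixel bounding box. The function name and the image size are assumptions for the example:

```python
def parse_detection(line, img_w, img_h):
    """Parse a '_dets.txt' line: id confidence rel_cx rel_cy rel_w rel_h.

    Returns (class_id, confidence, (left, top, right, bottom)) in pixels.
    """
    fields = line.split()
    class_id = int(fields[0])
    conf = float(fields[1])
    cx, cy, w, h = (float(v) for v in fields[2:6])
    # The stored center/size are relative to the image dimensions,
    # so scale them up and convert to corner coordinates.
    left = int((cx - w / 2) * img_w)
    top = int((cy - h / 2) * img_h)
    right = int((cx + w / 2) * img_w)
    bottom = int((cy + h / 2) * img_h)
    return class_id, conf, (left, top, right, bottom)
```

A box like this can then be fed directly to any cropping routine that expects pixel corners.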

But if you want to apply the detector to a single image, you need to add the argument -img followed by the image's path as shown in the example below:

./darknet testimages newdata/voc.data newdata/yolo-voc.2.0.cfg newdata/yolo-voc_final.weights -img newdata/images/000058.jpg -savetxt -saveimg

Add the argument -showimg if you want to visualize the resulting images as soon as the detector evaluates them. (Note: this feature requires compilation with OpenCV. To do so, change the 3rd line of the Makefile to OPENCV=1 and recompile.) Example:

./darknet testimages newdata/voc.data newdata/yolo-voc.2.0.cfg newdata/yolo-voc_final.weights -img newdata/images/000058.jpg -showimg

Remember that the arguments (file.data, network.cfg, file.weights, etc.) do not have to follow an exact order. You can specify them in any position you want. :)

Therefore the command:

./darknet testimages newdata/voc.data newdata/yolo-voc.2.0.cfg newdata/yolo-voc_final.weights -savetxt -saveimg

is equivalent to:

./darknet testimages newdata/yolo-voc.2.0.cfg newdata/yolo-voc_final.weights newdata/voc.data -saveimg -savetxt

Testing multiple thresholds

Sometimes we need to test multiple images or just a single one with different threshold values.

Suppose you want to test your images with a range of threshold values starting at 30% going up to 100% with steps of 10%. In other words, you will be testing all the following 8 threshold values: 30%, 40%, 50%, 60%, 70%, 80%, 90% and 100%.

You don't need to run ./darknet testimages 8 times and separate your results into folders. You just need to use the argument -thresh, informing the initial threshold, the incremental step and the final threshold.

The example below tests many threshold values (30%, 40%, 50%, 60%, 70%, 80%, 90% and 100%) on the image 000058.jpg.

./darknet testimages newdata/voc.data newdata/yolo-voc.2.0.cfg newdata/yolo-voc_final.weights -img newdata/images/000058.jpg -savetxt -saveimg -thresh .30,.10,1

Attention: The 3 values after the -thresh argument must be separated by commas. That's how we read the argument -thresh .30,.10,1: test threshold values starting at 30% (0.30), adding an increment of 10% (0.10) until it reaches 100% (1).

If somehow your steps reach a value higher than the final threshold, it won't be considered. Thus:

  • -thresh .45,.15,1 will test the thresholds: 45%, 60%, 75% and 90%
  • -thresh .50,.1,.85 will test the thresholds: 50%, 60%, 70% and 80%
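These rules can be mirrored in a short Python sketch (illustrative only; darknet itself implements this in C, and the function name here is an assumption):

```python
def expand_thresholds(arg):
    """Expand a '-thresh start,step,end' value into the thresholds tested.

    A single value (e.g. '.75') means one threshold; otherwise start at the
    first value, add the step repeatedly, and drop anything past the end.
    """
    parts = [float(p) for p in arg.split(",")]
    if len(parts) == 1:
        return parts
    start, step, end = parts
    values, t = [], start
    while t <= end + 1e-9:  # small epsilon guards against float drift
        values.append(round(t, 2))
        t += step
    return values
```

For example, `.45,.15,1` expands to 45%, 60%, 75% and 90%, matching the bullets above.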

You could also use the command ./darknet testimages with the tag -thresh to test multiple thresholds for multiple images as seen below:

./darknet testimages newdata/voc.data newdata/yolo-voc.2.0.cfg newdata/yolo-voc_final.weights -savetxt -saveimg -thresh .45,.45,1

As already presented in this tutorial, the paths of images to be evaluated must be listed in a .txt file identified in the newdata/voc.data with the test tag. The output files (_dets.txt and .png files) will be generated in the results folder specified in your newdata/voc.data.

Because we are testing many thresholds, folders identifying each threshold will be created. All your output files will be placed in their respective folders. The image below shows an example of the threshold folder structure created by adding the argument -thresh .20,.10,1:

Of course, the command ./darknet testimages also supports the -thresh argument with only one threshold. The example below shows how to test your images using a single threshold of 75%:

./darknet testimages newdata/voc.data newdata/yolo-voc.2.0.cfg newdata/yolo-voc_final.weights -savetxt -saveimg -thresh .75

FAQ YOLO

Question 1: What do those values mean during training?

Answer: During training the samples are divided into batches and the batches are grouped into subdivisions, which are set in the .cfg file. While darknet is training YOLO, statistics of the training are presented as shown in the image below:

The highlighted line represents the training for one batch. In this example, each batch contains 64 images divided into 8 subdivisions; thus, for this particular case, each subdivision contains 8 images. Those values represent:

  • Loaded: 0.000036 seconds: Time to load the (64) images of this batch.
  • Region Avg IOU: 0.738649: For this particular subdivision (containing 8 images), the average IoU (Intersection over Union) is 73.86%.
  • Class: 0.931707: It is the average of the probabilities of the True Positives. In this case, 93.17% of the detected objects in this subdivision belong to their correct classes.
  • Obj: 0.562116: In YOLOv2, the image is divided into a 13x13 grid. Each cell of the grid "owns" 5 bounding boxes, and there is a confidence score for each bounding box. The confidence score reflects how likely it is that the box contains an object. In this example, our training is 56.21% confident that the bounding boxes that must contain an object (ground truth) actually detected an object. Higher values mean your network is very confident that there is an object where it should be, implying your training is going well. A detection is considered correct (a True Positive) if the IoU between the detected object and the ground truth is >= 50%.
    Observation: For YOLOV2, each bounding box contains 4 coordinates (x,y,w,h), plus its objectness, plus a probability for each class, summing 5+classes parameters per bounding box. Each cell has 5 bounding boxes. This way, it can detect up to 5 different objects of different classes in the same cell.
  • No Obj: 0.006090: The same idea as in Obj, but now for locations where there must be no object: our training also thinks there is none. You should expect low values here.
  • Avg Recall: 0.875000: In short, recall means "out of the objects that should be detected, how many did I actually detect?". Recall = TP/(TP + FN), i.e., the number of correctly detected objects divided by the total number of objects that should be detected. Avg Recall is the average recall of the (8) images in this subdivision. In this example, 87.5% of all the objects that should be detected were correctly detected.
  • count: 16: The total number of objects that were detected in this subdivision.
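To make these metrics concrete, here is an illustrative Python sketch (not darknet's code; the box format, function names and example values are assumptions):

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (left, top, right, bottom)."""
    left = max(box_a[0], box_b[0])
    top = max(box_a[1], box_b[1])
    right = min(box_a[2], box_b[2])
    bottom = min(box_a[3], box_b[3])
    inter = max(0, right - left) * max(0, bottom - top)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def recall(tp, fn):
    """Recall = TP / (TP + FN): detected objects over objects that should be detected."""
    return tp / (tp + fn)

def yolov2_output_size(grid=13, boxes=5, classes=20):
    """Each box stores x, y, w, h and objectness, plus one probability per class."""
    return grid * grid * boxes * (5 + classes)
```

For instance, with 20 Pascal VOC classes the output holds 13 * 13 * 5 * (5 + 20) values, and an Avg Recall of 0.875 corresponds to, say, 14 detected out of 16 ground-truth objects.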

The last line gives statistics of the whole batch:

  • 80194: Number of batches trained so far.
  • 6.186997: Accumulated loss. The lower the better. You could use this as a reference to stop your training.
  • 6.712183 avg: The average loss (error) for this batch. We expect it to be as low as possible. You could also use this as a reference to stop your training.
  • 0.000010 rate: Learning rate used to update the weights while training this batch.
  • 2.998310 seconds: How long it took to train this batch.
  • 5132416 images: Number of images trained so far. With this we can calculate the number of epochs. The number of images trained so far is the batch number (80194) times the batch size (64): 80194*64 = 5132416.
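This arithmetic can be written out as a small Python sketch (the dataset size used below is hypothetical):

```python
def images_seen(batches_trained, batch_size):
    """The 'images' counter is simply batches trained times batch size."""
    return batches_trained * batch_size

def epochs_completed(batches_trained, batch_size, dataset_size):
    """Epochs = total images seen divided by the number of training images."""
    return images_seen(batches_trained, batch_size) / dataset_size
```

With a hypothetical training set of 10,000 images, 80194 batches of 64 images would correspond to roughly 513 epochs.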


darknet's Issues

save only class number

Hi thanks for useful program.
Your program saves multiple images and detection results.
The txt file includes id confidence relative_center_x relative_center_y relative_width relative_height.
Can I save only id and confidence?

detect far placed objects and small objects ?

Hi.
I have trained for detection of helmets.
Most of the training images are around 1.5 meters from the camera.
Now during testing, the model detects helmets only up to 1.5 meters (it cannot detect far-placed objects).
Is there a way to change any parameters to allow yolo to detect helmets at around 10 or 15 meters? (Or should I add training data for each varying distance as well?)
Is this a limitation of my training dataset or of the yolo implementation?
What parameters must be changed to detect far-placed objects accurately?

Thank you.

Use yolov3.weights

Hello rafaelpadilla! I want to use yolov3.weights with your code!
I use the cfg file yolov3.cfg and yolov3.weights from the darknet directory on GitHub.
And I edited coco.data like this:
classes= 80
train = /home/pjreddie/data/coco/trainvalno5k.txt
valid = coco_testdev
#valid = data/coco_val_5k.list
names = data/obj.names
backup = /home/pjreddie/backup/
eval=coco
test = test/test.txt
results = test/results

I entered this in the terminal in my directory:
'./darknet testimages data/coco.data cfg/yolov3.cfg yolov3.weights -savetxt -saveimg'
but the result is like this.

image
Can't I use yolov3.weights?

can i use yolo v3 with your modification?

hello, rafaelpadilla
first of all, thank you for your work because it's exactly what I want.
I am very new to this field, so I have little knowledge about computer science.

I want to get object bounding boxes and re-label them to re-train the model.
As my first step, I need the detected bounding boxes for re-labeling and re-training
(which is exactly what your work provides, thx again).

So I did 'git clone' on your repo and modified the Makefile from GPU=0 to GPU=1 to use the GPU.
Then I ran 'make' and tried your example.

It looked like it was working properly. However, it couldn't detect any objects, as below.
default
Can you tell me what I did wrong?

And my next question is the same as the title.
I tried to run your example with the yolov3 cfg & weights.
then I got an error like below.
image
is there any way to use yolo v3 with your modification?

I hope that you could help me. Thank you a lot again anyway.

Cannot load image Error: testing multiple images

Hello,
I'm trying to test a detector with my own dataset, which consists of 1 class.
It works very well for a single image (my own) and for the example's multiple images (yours). But, as shown below, a "cannot load image" error occurs with my own multiple images.

./darknet testimages newdata/voc.data newdata/own.cfg newdata/own.weights -savetxt -saveimg
image

What do you think I did wrong?
Please help me. Thanks in advance.

really buggy when make

First I ran make and got these errors:

gcc -Iinclude/ -Isrc/ -DGPU -I/usr/local/cuda/include/ -DCUDNN  -Wall -Wno-unknown-pragmas -Wfatal-errors -fPIC -Ofast -DGPU -DCUDNN -c ./src/utils.c -o obj/utils.o
./src/utils.c: In function 'split_threshold_ranges':
./src/utils.c:876:9: warning: implicit declaration of function 'isValidDouble' [-Wimplicit-function-declaration]
         if ((isValidDouble(listChar[0], &res) == 0) || (res > 1 || res < 0))
         ^
./src/utils.c:929:9: error: 'for' loop initial declarations are only allowed in C99 mode
         for (int i = 0; i< count; i++)
         ^
compilation terminated due to -Wfatal-errors.
make: *** [obj/utils.o] Error 1

Then I added -std=c99 to CFLAGS, and got:

gcc -Iinclude/ -Isrc/ -DGPU -I/usr/local/cuda/include/ -Wall -Wno-unknown-pragmas -Wfatal-errors -fPIC -std=c99 -Ofast -DGPU -c ./src/gemm.c -o obj/gemm.o
gcc -Iinclude/ -Isrc/ -DGPU -I/usr/local/cuda/include/ -Wall -Wno-unknown-pragmas -Wfatal-errors -fPIC -std=c99 -Ofast -DGPU -c ./src/utils.c -o obj/utils.o
./src/utils.c: In function 'what_time_is_it_now':
./src/utils.c:29:21: error: storage size of 'now' is unknown
     struct timespec now;
                     ^
compilation terminated due to -Wfatal-errors.
make: *** [obj/utils.o] Error 1

Segmentation fault while processing multiple images

Hi,
I trained a model over custom dataset (using AlexeyAB code) and I tried to process a batch of images with this repo using this command -

./darknet testimages newdata/obj.data newdata/yolov3-tiny-obj.cfg newdata/yolov3-tiny-obj_4000.weights -savetxt -saveimg

But I get this error -

Arguments:
dataFile: newdata/obj.data
cfgFile: newdata/yolov3-tiny-obj.cfg
weightsFile: newdata/yolov3-tiny-obj_4000.weights
thresh: 0.250000
hier_thresh: 0.500000
filename: (null)
saveImArg: -saveimg
saveTxtArg: -savetxt
showimg: (null)
Segmentation fault (core dumped)

Please help, let me know if you need any more info.

Thanks!

Segmentation Fault

I tried running
./darknet testimages cfg/dota.data cfg/yolo-dota.cfg dota-backup/yolo-dota.cfg_450000.weights -img /home/shiva/work/DOTA_YOLOv2/data_transform/images/try.jpg -savetxt -saveimg

I got this error

Arguments:
dataFile: cfg/dota.data
cfgFile: cfg/yolo-dota.cfg
weightsFile: dota-backup/yolo-dota.cfg_450000.weights
default confidence thresh: 0.250000
hier_thresh: 0.500000
Segmentation fault (core dumped)

I am using Yolov2
Please help

bounding boxes cordinates help urgent

The bounding box file which is created contains 5 coordinates apart from the id, instead of 4. What do they represent? As you mentioned:
The values representing a bounding box are: id relative_center_x relative_center_y relative_width relative_height.

But there are 5 instead of 4:
0 0.806619 0.345010 0.666073 0.062282 0.051545

What do they represent? I want to crop the image from the detected region. How can I do this? Please help.

Object detection using infrared camera.

Hi. I have an infrared camera and can create a lot of custom training images and annotation files.

Will infrared camera files be useful for detecting objects, or should I use jpg images?

Thank you.

Can we use infrared images to detect objects in the yolo framework?

Export txt

I would like to know which src files have changed and how to export the coordinates (yolo format) to a txt file just as you have done.

recipe for target 'obj/gemm.o' failed

gcc -Iinclude/ -Isrc/ -DOPENCV pkg-config --cflags opencv -DGPU -I/usr/local/cuda/include/ -DCUDNN -Wall -Wno-unknown-pragmas -Wfatal-errors -fPIC -Ofast -DOPENCV -DGPU -DCUDNN -c ./src/gemm.c -o obj/gemm.o
In file included from /usr/local/include/opencv2/core/core_c.h:48:0,
from /usr/local/include/opencv2/highgui/highgui_c.h:45,
from include/darknet.h:25,
from ./src/utils.h:5,
from ./src/gemm.c:2:
/usr/local/include/opencv2/core/types_c.h: In function ‘cvIplImage’:
/usr/local/include/opencv2/core/types_c.h:370:12: error: incompatible types when returning type ‘int’ but ‘IplImage {aka struct _IplImage}’ was expected
return _IplImage();
^~~~~~~~~~~
compilation terminated due to -Wfatal-errors.
Makefile:85: recipe for target 'obj/gemm.o' failed
make: *** [obj/gemm.o] Error 1

Building the darknet with 'make' in Linux

Hi Rafael,

I'm trying to build your custom darknet repo after having downloaded it with git clone, using the command 'make' (running on Linux).
I get the following: "
mkdir -p obj
mkdir -p results
gcc -Iinclude/ -Isrc/ -DOPENCV pkg-config --cflags opencv -DGPU -I/usr/local/cuda/include/ -DCUDNN -Wall -Wno-unknown-pragmas -Wfatal-errors -fPIC -Ofast -DOPENCV -DGPU -DCUDNN -c ./src/gemm.c -o obj/gemm.o
Package opencv was not found in the pkg-config search path.
Perhaps you should add the directory containing `opencv.pc'
to the PKG_CONFIG_PATH environment variable
No package 'opencv' found
In file included from ./src/utils.h:5,
from ./src/gemm.c:2:
include/darknet.h:25:14: fatal error: opencv2/highgui/highgui_c.h: No such file or directory
25 | #include "opencv2/highgui/highgui_c.h"
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
make: *** [Makefile:85: obj/gemm.o] Error 1
"

How can I resolve this?

Thx
