
Comments (13)

PatricLee commented on July 23, 2024

Hi @iraadit , sorry for the late reply.
I tried training a 960x320 network on my dataset the other day and it worked fine. It took fewer iterations to train (or at least it felt that way), and it has slightly higher accuracy than the 416x416 network I trained earlier, probably because 960x320 is a larger resolution than 416x416.

But if you are in the same scenario as I am, where all the data have the same aspect ratio, then maybe Alexey is right and there is little point in training a non-square network with the same aspect ratio as the data instead of a square one.


AlexeyAB commented on July 23, 2024

BTW, I've also noticed that the learning rate becomes 10 times larger after 100 iterations. For example, when I set the learning rate to 0.0001 as in the example, it automatically changes to 0.001 after 100 iterations and the network diverges. So I had to set the learning rate to 0.00001 so that it would actually be 0.0001, and the network worked just fine. Is it programmed this way?

"the network worked just fine" - It depends on the number of classes and the number of images. For PascalVOC seems optimal values in the yolo-voc.cfg

How it is programmed: see paragraph 5 in #30 (comment)

If learning_rate = 0.0001, policy=steps, steps=100,25000,35000 and scales=10,.1,.1, then the actual learning_rate will be:

  • [0 - 100] iterations learning_rate will be 0.0001
  • [100 - 25000] iterations learning_rate will be 0.001
  • [25000 - 35000] iterations learning_rate will be 0.0001
  • [35000 - ...] iterations learning_rate will be 0.00001
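For reference, the steps schedule can be reproduced in a few lines of C. The sketch below is a simplified illustration of the scheduling logic (not the exact darknet source); base_lr, steps and scales correspond to the learning_rate, steps and scales lines of the .cfg:

    #include <stdio.h>

    /* Simplified sketch of policy=steps: start from the base learning rate
       and multiply it by scales[i] once training passes steps[i] iterations. */
    float current_rate(float base_lr, const int *steps, const float *scales,
                       int num_steps, int iteration)
    {
        float rate = base_lr;
        for (int i = 0; i < num_steps; ++i) {
            if (steps[i] > iteration) return rate;
            rate *= scales[i];
        }
        return rate;
    }

    int main(void)
    {
        int   steps[]  = {100, 25000, 35000};
        float scales[] = {10, .1f, .1f};
        /* prints 0.000100 0.001000 0.000100 0.000010 */
        printf("%f %f %f %f\n",
               current_rate(0.0001f, steps, scales, 3, 50),
               current_rate(0.0001f, steps, scales, 3, 200),
               current_rate(0.0001f, steps, scales, 3, 30000),
               current_rate(0.0001f, steps, scales, 3, 40000));
        return 0;
    }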



AlexeyAB commented on July 23, 2024

Yes, strictly speaking, Recall should always be greater than (or equal to) IoU. But Yolo calculates the average of the best IoUs instead of the average IoU, and it counts True Positives instead of computing Recall.
That's why I advise you to pay attention to IoU (the average best IoU is closer to the true IoU than the True Positive count is to Recall): https://github.com/AlexeyAB/darknet#when-should-i-stop-training

https://en.wikipedia.org/wiki/Precision_and_recall
[image: https://hsto.org/files/ca8/866/d76/ca8866d76fb840228940dbf442a7f06a.jpg]


Yolo calculates the average of the best IoUs instead of the average IoU, and counts True Positives instead of Recall:

fprintf(stderr, "%5d %5d %5d\tRPs/Img: %.2f\tIOU: %.2f%%\tRecall:%.2f%%\n", i, correct, total, (float)proposals/(i+1), avg_iou*100/total, 100.*correct/total);
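To make those two columns concrete, the following is a small self-contained C toy (an illustration, not darknet source code): for each ground-truth box it takes the best IoU over all detections, averages those best IoUs for the IOU column, and counts a True Positive whenever the best IoU exceeds a threshold for the Recall column. The box_iou here is a simplified reimplementation.

    #include <stdio.h>

    /* Toy illustration (not darknet source) of how the IOU and Recall
       columns are computed. Boxes are centre x, centre y, width, height. */
    typedef struct { float x, y, w, h; } box;

    static float overlap(float x1, float w1, float x2, float w2)
    {
        float l = (x1 - w1 / 2 > x2 - w2 / 2) ? x1 - w1 / 2 : x2 - w2 / 2;
        float r = (x1 + w1 / 2 < x2 + w2 / 2) ? x1 + w1 / 2 : x2 + w2 / 2;
        return r - l;
    }

    static float box_iou(box a, box b)  /* simplified reimplementation */
    {
        float w = overlap(a.x, a.w, b.x, b.w);
        float h = overlap(a.y, a.h, b.y, b.h);
        if (w <= 0 || h <= 0) return 0;
        float inter = w * h;
        return inter / (a.w * a.h + b.w * b.h - inter);
    }

    int main(void)
    {
        box truths[] = { {0.30f, 0.30f, 0.20f, 0.20f}, {0.70f, 0.70f, 0.20f, 0.20f} };
        box dets[]   = { {0.31f, 0.29f, 0.20f, 0.20f}, {0.90f, 0.10f, 0.10f, 0.10f} };
        int total = 0, correct = 0;
        float avg_iou = 0, iou_thresh = 0.5f;

        for (int j = 0; j < 2; ++j) {             /* loop over ground-truth boxes  */
            ++total;
            float best_iou = 0;
            for (int k = 0; k < 2; ++k) {         /* best IoU over all detections  */
                float iou = box_iou(dets[k], truths[j]);
                if (iou > best_iou) best_iou = iou;
            }
            avg_iou += best_iou;                  /* average of the BEST IoUs       */
            if (best_iou > iou_thresh) ++correct; /* True Positives, not true Recall */
        }
        printf("IOU: %.2f%%  Recall: %.2f%%\n", avg_iou * 100 / total, 100. * correct / total);
        return 0;
    }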


PatricLee commented on July 23, 2024

Well, that's why my Recall curve looked so much like the True Positive curve.

Thank you for your reply though, and for your amazing work.

I've finished training on the VOC dataset, validated the network on the VOC test set, and compared my results to the yolo-voc.weights I downloaded. I noticed that although I'm getting about as many true positives and about the same average IoU as the downloaded network, my network has noticeably more RPs/Img (about 160 vs 75), so I have some questions:

  • Does this mean that my RPN part has not yet converged and requires further training?
  • Will this (more region proposals per image) cause a performance issue, such as more time spent detecting objects?


AlexeyAB commented on July 23, 2024

my network has noticeably more RPs/Img (about 160 vs 75), so I have some questions:

Does this mean that my RPN part has not yet converged and requires further training?

Hard to say. But it may also be an effect of the bug on Windows that I just fixed: 4422399

Will this (more region proposals per image) cause a performance issue, such as more time spent detecting objects?

No, this should not significantly affect performance.


PatricLee commented on July 23, 2024

Thanks for the correction, Alexey, it seems to work... though I can't tell for sure yet.

One last question. Since I'm currently working on autonomous driving, my camera has a really wide angle and an unusual aspect ratio of about 3:1, so:
- Is it possible to modify the input of the network so that the network also has a 3:1 aspect ratio (say, inputs of 600x200)? And if it is, what do I have to modify besides 'height' and 'width' in the .cfg file?
- Will this lead to a performance improvement (or greater IoU, to be more specific) in my scenario, compared to a network with a 1:1 aspect ratio, like 416x416?

For now I'm getting an average IoU of about 65% on my dataset, and that's not good enough when detecting objects for autonomous driving. I wonder if I could improve this somehow.

Again, thank you for your amazing work and amazing answers.


AlexeyAB commented on July 23, 2024

You can try to set width=608 and height=224

height=416

  1. It must always be a multiple of 32, such as 608x224 (19x32 by 7x32), not 600x200 (a trivial check is sketched right after this list)
  2. I didn't test non-square resolutions, so I can't say whether there will be any bugs or undefined behavior.
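As a trivial illustration of the multiple-of-32 constraint (a standalone snippet, not darknet code), a resolution is valid for the .cfg only if both sides are divisible by 32:

    #include <stdio.h>

    /* Illustration only: both network dimensions must be multiples of 32. */
    static int valid_dim(int v) { return v > 0 && v % 32 == 0; }

    int main(void)
    {
        printf("600x200 valid: %d\n", valid_dim(600) && valid_dim(200)); /* 0 */
        printf("608x224 valid: %d\n", valid_dim(608) && valid_dim(224)); /* 1 (19*32 x 7*32) */
        return 0;
    }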

I used Yolo for detection on a wide image (stitched from 8 cameras) with a wide angle of ~200 degrees, but I divided it into many 416x416 square images and ran Yolo for each square image separately on 4 GPUs.

I think if your training dataset has the same 3:1 aspect ratio as your detection dataset, then you should use the square 416x416 resolution.


To increase IoU:

  1. You can train Yolo with the flag random=1 (instead of the current setting):

    random=0

  2. You can train Yolo with the steps multiplied by number_of_classes/20; for example, if you use 6 classes, then steps=100,7500,10000 (instead of):

    steps=100,25000,35000

  3. For detection (not for training) you can use a larger resolution, for example 832x832, with the weights file trained at 416x416 resolution.
    (Or, if you trained at 608x224 resolution, then you can change the resolution to 1216x448 after training.)

  4. Also, maybe for detection (not for training) you should rescale the anchors from 16:9 to 3:1, i.e. divide every second value by 1.7; they should then be anchors = 1.08,0.71, 3.42,2.59, 6.63,6.69, 9.42,3.00, 16.62,6.19 instead of the current values (see the sketch after this list):

    anchors = 1.08,1.19, 3.42,4.41, 6.63,11.38, 9.42,5.11, 16.62,10.52
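A tiny sketch of that rescaling (an illustration, not darknet code): keep the widths and divide every second value, i.e. each anchor height, by the chosen factor. It prints the rescaled list, matching the values above up to rounding:

    #include <stdio.h>

    /* Illustration only: rescale the default anchors by dividing every
       second value (the height of each width,height pair) by a factor. */
    int main(void)
    {
        float anchors[] = {1.08f, 1.19f, 3.42f, 4.41f, 6.63f, 11.38f,
                           9.42f, 5.11f, 16.62f, 10.52f};
        float factor = 1.7f;  /* ~16:9 -> ~3:1, as suggested above */
        int n = sizeof(anchors) / sizeof(anchors[0]);

        printf("anchors = ");
        for (int i = 0; i < n; ++i) {
            float v = (i % 2 == 1) ? anchors[i] / factor : anchors[i];
            printf("%.2f%s", v, (i + 1 < n) ? ", " : "\n");
        }
        return 0;
    }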


PatricLee commented on July 23, 2024

Thank you so much for your answers, I will try them out.


iraadit commented on July 23, 2024

Hi @PatricLee
Have you tried training with a non-square size? Did it work?


MyVanitar commented on July 23, 2024

Also, maybe for detection (not for training) you should rescale the anchors from 16:9 to 3:1, i.e. divide every second value by 1.7; they should then be anchors = 1.08,0.71, 3.42,2.59, 6.63,6.69, 9.42,3.00, 16.62,6.19:

Why should we not train the model with the newly calculated anchors?

I think if your training dataset has the same 3:1 aspect ratio as your detection dataset, then you should use the square 416x416 resolution.

How can we calculate this when each image has its own width and height?
If you think it is a good idea, we could pad the images (add black area around them) so they all have the same size (for example 960x960) and then start to annotate them.


Brandy24 commented on July 23, 2024

I am training 5 classes on a CPU (Intel Core i7-5500, 2.4 GHz) with 8 GB RAM. How many pictures per class should I train on to get a good result? And how long will it take to finish?


stephanecharette commented on July 23, 2024

I am training 5 classes on a CPU (Intel Core i7-5500, 2.4 GHz) with 8 GB RAM. How many pictures per class should I train on to get a good result? And how long will it take to finish?

Not sure why you chose this closed issue to post your question. But I would argue that you cannot possibly train a 5-class network on a CPU. It would take weeks if not months to train. Get yourself a decent GPU, or rent one from Amazon AWS, Linode, Google, Azure, etc...

See this recent post I made about a 2-class network. It took 4 hours to train a network with a GPU, but it would have taken 16 days on my 16-core 3.2 GHz CPU: https://www.ccoderun.ca/programming/2020-01-04_neural_network_training/

