
Comments (2)

ayooshkathuria commented on August 25, 2024

Well, I think I can help you out a bit.

First of all, detect.py has a variable called imlist, which stores the paths of the images. Take note of the index of each image in that list.

Then, at around line 250 of the code, right after the line

output[:,1:5] *= im_dim_list

the variable output is a 2-D tensor that holds one detection per row. Here is what output looks like for the default imgs directory:

Columns 0 to 6 
    0.0000   118.2249   119.4276   572.4760   429.8945     0.9985     0.9982
    0.0000   477.6349    84.2408   687.1730   170.4992     0.9546     0.8620
    0.0000   120.2748   207.8708   313.6242   541.9991     0.9987     0.9932
    1.0000    44.5722    16.6605   798.5769   725.9565     0.9994     1.0000
    1.0000   632.7744   117.8598  1144.7529   718.2219     0.9976     1.0000
    1.0000  1193.0134   475.0443  1287.9580   684.4681     0.9952     0.9997
    1.0000   373.5918   380.6013   445.9761   505.4237     0.5078     0.8786
    3.0000   130.2477    78.4422   606.2959   447.8753     0.9999     0.9969
    4.0000   189.4601    88.3928   274.2381   367.9490     0.9989     0.9999
    4.0000    68.5351   266.8620   204.5787   348.1553     0.9984     0.9985
    4.0000   392.5365   139.0717   600.5876   344.8741     0.9991     0.9959
    5.0000   161.6265   261.8117   252.6919   369.4432     0.9814     0.9979
    5.0000     3.5819   243.2121    68.0432   373.8014     0.9427     0.9748
    5.0000   456.5395   148.8004   474.4457   168.4535     0.6563     0.9994
    6.0000     0.0000   186.0073   376.2489   404.3224     0.9967     0.9982
    6.0000   435.6129   213.8517   598.8758   345.5339     0.9901     0.8957
    6.0000   230.9883   180.6014   460.7431   358.8419     0.7936     0.9951
    7.0000   231.7281   329.7414   333.4314   375.6737     0.9992     0.9991
    7.0000    15.5507   310.3270    81.2751   364.0855     0.9973     0.5676
    7.0000   362.2694   331.0900   497.7398   388.7157     0.9948     0.9307
    7.0000   172.1066   327.6013   254.0974   365.0104     0.9873     0.9853
    7.0000   139.6422   323.4359   190.3763   357.8081     0.9847     0.9893
    7.0000    83.0906   328.0125   112.1545   348.0191     0.8918     0.9972
    7.0000   106.8326   326.4905   134.6070   351.3198     0.7799     0.9954
    7.0000     2.4614   322.7372    13.5060   341.0310     0.6313     0.9664
    7.0000    18.7216   311.1300    84.3383   362.4027     0.9951     0.6338
    7.0000   381.9297   107.4763   408.2099   167.2793     0.9309     0.9999
    8.0000   138.5508   196.7630   209.1340   295.7622     0.9964     0.9993
    9.0000    17.9466     0.0000   353.0000   500.0000     0.9902     0.9948
    9.0000    56.6002   225.1510   190.4027   360.4565     0.9786     0.9925
   10.0000   247.1123   192.6713   426.7067   451.9630     0.9985     0.9464
   10.0000   145.4339    37.6159   459.5411   421.7098     0.9981     0.9988

Columns 7 to 7 
    1.0000
    7.0000
   16.0000
    0.0000
    0.0000
    0.0000
   29.0000
   14.0000
    0.0000
   16.0000
   17.0000
   56.0000
   56.0000
   74.0000
   17.0000
   17.0000
   17.0000
    2.0000
    2.0000
    2.0000
    2.0000
    2.0000
    2.0000
    2.0000
    2.0000
    7.0000
    9.0000
    6.0000
    0.0000
   16.0000
   22.0000
   23.0000
[torch.FloatTensor of size (32,8)]

As you can see, column 0 is the index of the image in imlist, columns 1-4 are the corner coordinates of the bounding box in the form (x1, y1, x2, y2), and the last column is the object class of that detection.
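To make the column layout concrete, here is a minimal sketch that turns rows of output into named records. It assumes the rows were obtained with something like output.tolist(); the names Detection and parse_detections are made up for this illustration and are not part of detect.py. The three sample rows are copied from the printout above.

```python
from typing import List, NamedTuple, Sequence


class Detection(NamedTuple):
    """One row of `output` (hypothetical helper type, not from detect.py)."""
    image_index: int  # column 0: index of the image in imlist
    x1: float         # columns 1-4: box corners (x1, y1, x2, y2)
    y1: float
    x2: float
    y2: float
    class_index: int  # last column: object class of the detection


def parse_detections(rows: Sequence[Sequence[float]]) -> List[Detection]:
    """Convert raw 8-column rows into Detection records."""
    return [
        Detection(int(r[0]), float(r[1]), float(r[2]),
                  float(r[3]), float(r[4]), int(r[-1]))
        for r in rows
    ]


# Three rows copied from the printed tensor (assumed to come from output.tolist()).
rows = [
    [0.0, 118.2249, 119.4276, 572.4760, 429.8945, 0.9985, 0.9982, 1.0],
    [0.0, 477.6349, 84.2408, 687.1730, 170.4992, 0.9546, 0.8620, 7.0],
    [1.0, 44.5722, 16.6605, 798.5769, 725.9565, 0.9994, 1.0000, 0.0],
]
detections = parse_detections(rows)
```

With records like these, imlist[d.image_index] gives the path of the image a detection d belongs to.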

Since you have the coordinates of the bounding boxes, just slice the array containing the image to get the crop, and do your thing with it.
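The slicing step could look roughly like the sketch below. The box corners are floats and can fall slightly outside the image, so it first converts them to integer bounds clamped to the image size. The helpers crop_bounds and crop_nested are hypothetical names for this example, and the image is a plain nested list so the snippet runs without extra libraries. For an image loaded as a NumPy array (for example with cv2.imread), the equivalent crop would be img[y1:y2, x1:x2].

```python
def crop_bounds(row, width, height):
    """Return integer (x1, y1, x2, y2) clamped to the image, or None if empty.

    `row` is one 8-column detection row; columns 1-4 hold the box corners.
    """
    x1, y1, x2, y2 = (int(v) for v in row[1:5])  # truncate floats to pixels
    x1, x2 = max(0, min(x1, width)), max(0, min(x2, width))
    y1, y2 = max(0, min(y1, height)), max(0, min(y2, height))
    if x2 <= x1 or y2 <= y1:
        return None  # box has no area inside the image
    return x1, y1, x2, y2


def crop_nested(image, bounds):
    """Crop an image stored as a list of rows (rows first, then columns)."""
    x1, y1, x2, y2 = bounds
    return [r[x1:x2] for r in image[y1:y2]]


# Toy 8x6 "image" whose pixels record their own (x, y) position.
image = [[(x, y) for x in range(8)] for y in range(6)]

# Made-up detection row in the same 8-column layout as `output`.
row = [0.0, 2.7, 1.2, 9.5, 4.9, 0.99, 0.98, 16.0]

bounds = crop_bounds(row, width=8, height=6)
patch = crop_nested(image, bounds)
```

Note that the row index comes first when slicing, so y goes before x, which is the opposite of the (x1, y1, x2, y2) order stored in the tensor.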

from yolo_v3_tutorial_from_scratch.

P-Lubinski commented on August 25, 2024

Great! Thanks a lot for the quick response!
