Comments (10)

yeliudev commented on August 20, 2024

Please try this.

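from consnet.api import load_anno

# The image ID is the 8-digit number just before the '.jpg' extension
img_name = 'HICO_train2015_00000001.jpg'
img_id = int(img_name[-12:-4])

# Column 0 of anno holds the image ID, column 1 the HOI index
anno = load_anno('<path-to-anno_bbox.mat>', 'train')
hoi_idx = anno[anno[:, 0] == img_id, 1].int().tolist()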

yeliudev commented on August 20, 2024

You are fetching the annotations of a test-split image, so the second argument of load_anno should be 'test', not 'train'.
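
One way to avoid this kind of mix-up is to derive the split from the file name rather than typing it by hand. This is a minimal sketch, assuming the naming pattern HICO_<split>2015_<8-digit id>.jpg seen in this thread; parse_hico_name is a hypothetical helper and not part of consnet.

```python
import os
import re

# Pattern assumed from the file names in this thread,
# e.g. HICO_test2015_00007935.jpg
_NAME_RE = re.compile(r'^HICO_(train|test)2015_(\d{8})\.jpg$')


def parse_hico_name(path):
    """Return (split, img_id) for a HICO-DET image path (hypothetical helper)."""
    match = _NAME_RE.match(os.path.basename(path))
    if match is None:
        raise ValueError(f'not a HICO-DET image name: {path!r}')
    return match.group(1), int(match.group(2))


split, img_id = parse_hico_name(
    'data/hico_20160224_det/images/test2015/HICO_test2015_00007935.jpg')
print(split, img_id)  # test 7935
```

The returned split could then be passed as the second argument of load_anno, so that the annotations always match the image.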

The lookup does not take that long in our own experiments. Does most of the time come from load_anno? This method only needs to be called once for all images.
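
To reuse a single load across many images, the annotation rows can be grouped by image ID once, after which each lookup is a dictionary access. This is a minimal sketch in plain Python, assuming rows of (img_id, hoi_idx) pairs like the first two columns of the tensor returned by load_anno. The sample rows are made up for illustration, and build_index is a hypothetical helper.

```python
from collections import defaultdict


def build_index(rows):
    """Group HOI indices by image ID in a single pass (hypothetical helper)."""
    index = defaultdict(list)
    for img_id, hoi_idx in rows:
        index[int(img_id)].append(int(hoi_idx))
    return dict(index)


# Made-up stand-in for the (img_id, hoi_idx) columns of the annotations.
# With consnet, something like anno[:, :2].int().tolist() could supply these
# rows (an assumption based on the indexing used in this thread).
rows = [(1, 152), (1, 153), (2, 7), (3, 56), (3, 60), (3, 60)]

index = build_index(rows)
print(index.get(3, []))   # [56, 60, 60]
print(index.get(99, []))  # [] for an image without annotations
```

Building the index costs one pass over the annotations, so the per-image cost no longer depends on the size of the annotation table.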

monacv commented on August 20, 2024

I just wanted to update that you were correct about the timing: running it for all the train images took only 114 seconds :)

(consnet) mona@goku:~/research/code/ConsNet$ python api_demo.py
--- 114.5227403640747 seconds ---

yeliudev commented on August 20, 2024

Both our API and the MATLAB code are correct :) In HICO-DET, a single image may contain multiple HOIs (i.e. multiple human-object pairs performing different actions). HICO_train2015_00000765.jpg has 23 HOI instances in total, as can be seen in anno_bbox.mat, but the MATLAB code you provided displays only the first one (j = 1).

Please also note that the API outputs hoi_idx (0 ~ 599) instead of hoi_id (1 ~ 600), so the three HOIs are 56 - drive bus, 60 - ride bus and 61 - sit_on bus.
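
The offset can be checked with a small sketch. It assumes the "id object verb" layout of the hico_list_hoi.txt excerpt quoted in this thread and uses the hoi_idx list printed by api_demo.py for HICO_train2015_00000765.jpg; hoi_name is a hypothetical helper.

```python
from collections import Counter

# Excerpt of hico_list_hoi.txt as quoted in this thread (IDs are 1-based).
HOI_LIST = """\
055 bus board
056 bus direct
057 bus drive
058 bus exit
059 bus inspect
060 bus load
061 bus ride
062 bus sit_on
063 bus wash
064 bus wave
"""

# Map the 1-based hoi_id to a "verb object" string.
id_to_name = {}
for line in HOI_LIST.splitlines():
    hoi_id, obj, verb = line.split()
    id_to_name[int(hoi_id)] = f'{verb} {obj}'


def hoi_name(hoi_idx):
    """Convert a 0-based hoi_idx to its name via the 1-based hoi_id."""
    return id_to_name[hoi_idx + 1]


# hoi_idx values printed by api_demo.py for HICO_train2015_00000765.jpg.
hoi_idx = [56] + [60] * 9 + [61] * 13

for idx, count in sorted(Counter(hoi_idx).items()):
    print(idx, hoi_name(idx), count)
# 56 drive bus 1
# 60 ride bus 9
# 61 sit_on bus 13
```

The counts add up to the 23 HOI instances of the image, and each distinct index maps to one of three bus interactions once the offset of one is applied.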

monacv commented on August 20, 2024

Thanks a lot. I ran the following and it took quite a while, which means it would take very long if I ran it for all images.
I am also not sure why it has "eat cake" as an annotation; there is no eating in this photo. Is there a way I could accelerate this for all images?
data/hico_20160224_det/images/train2015/HICO_train2015_00001549.jpg
Screenshot from 2021-04-08 18-28-08

monacv commented on August 20, 2024

I also tried another image, which gives very strange results:
Screenshot from 2021-04-08 18-43-18

feh data/hico_20160224_det/images/test2015/HICO_test2015_00007935.jpg

(consnet) mona@goku:~/research/code/ConsNet$ python api_demo.py 
[133, 133, 139, 139]
--- 43.073837995529175 seconds ---
133  horse           hug           
134  horse           jump          
135  horse           kiss          
136  horse           load          
137  horse           hop_on        
138  horse           pet           
139  horse           race      



The entries above are from hico_list_hoi.txt. As you can see, there is no horse involved.

Here's the api_demo.py code:

import time
start_time = time.time()    
from consnet.api import load_anno

img_name = 'data/hico_20160224_det/images/test2015/HICO_test2015_00007935.jpg'
img_id = int(img_name[-12:-4])

anno = load_anno('data/hico_20160224_det/anno_bbox.mat', 'train')
hoi_idx = anno[anno[:, 0] == img_id, 1].int().tolist()
print(hoi_idx)
print("--- %s seconds ---" % (time.time() - start_time))

I think 40s is a very long time for a lookup.

yeliudev commented on August 20, 2024

I think this may be a problem with HICO-DET's original annotations. Please double-check it in anno_bbox.mat.

monacv commented on August 20, 2024

When I run the MATLAB code that comes with the original HICO-DET dataset, the results are sometimes different (thanks for catching my mistake about load_anno).

Here I am calling load_anno with the 'train' split:

$ python api_demo.py
[56, 60, 60, 60, 60, 60, 60, 60, 60, 60, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61]
--- 7.9999425411224365 seconds ---

While I get "drive bus" from your results, I get other results too. Do you know why, and why your results differ from the original MATLAB code? Personally, I accept that if you are "driving a bus", you are also "riding a bus" and "sitting on a bus".
055 bus board
056 bus direct
057 bus drive
058 bus exit
059 bus inspect
060 bus load
061 bus ride
062 bus sit_on
063 bus wash
064 bus wave

while the MATLAB code yields this result:
Screenshot from 2021-04-08 22-58-14

Here's the original matlab code:

im_root   = '../images/';
bbox_file = '../anno_bbox.mat';

ld = load(bbox_file);
bbox_train = ld.bbox_train;
bbox_test = ld.bbox_test;
list_action = ld.list_action;

% change this
i = 765;  % image index
j = 1;    % hoi index

% read image
im_file = [im_root 'train2015/' bbox_train(i).filename];
im = imread(im_file);

% display image
figure(1);
imshow(im); hold on;

% display hoi
hoi_id = bbox_train(i).hoi(j).id;
aname = [list_action(hoi_id).vname_ing ' ' list_action(hoi_id).nname];
aname = strrep(aname,'_',' ');
title(aname);

% display bbox
if bbox_train(i).hoi(j).invis
    fprintf('hoi not visible\n');
else
    bboxhuman  = bbox_train(i).hoi(j).bboxhuman;
    bboxobject = bbox_train(i).hoi(j).bboxobject;
    connection = bbox_train(i).hoi(j).connection;
    visualize_box_conn_one(bboxhuman, bboxobject, connection, 'b','g');
end

Here is the image I used: data/hico_20160224_det/images/train2015/HICO_train2015_00000765.jpg

monacv commented on August 20, 2024

Regarding the cake, I made a mistake: I didn't account for the off-by-one offset in your indices.
So yours returns "carry cake" and "hold cake", which makes sense, but the original MATLAB code returns only "carry cake". If you could shed any light on why the original code doesn't return "hold cake", that would be really great. Thanks for the nice API.

Screenshot from 2021-04-08 23-03-06

yeliudev commented on August 20, 2024

Thanks for reporting this :)
