Given an image name from HICO_DET, how can I retrieve its HOI_idx using the ConsNet API? [consnet, CLOSED]

yeliudev commented on August 20, 2024

Given an image name from HICO_DET, how can I retrieve its HOI_idx using the ConsNet API?

from consnet.

Comments (10)

yeliudev commented on August 20, 2024

Please try this.

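from consnet.api import load_anno

# The 8-digit image ID sits just before the '.jpg' extension.
img_name = 'HICO_train2015_00000001.jpg'
img_id = int(img_name[-12:-4])

# The split ('train' or 'test') must match the split of the image.
anno = load_anno('<path-to-anno_bbox.mat>', 'train')
# Column 0 is matched against the image ID; column 1 holds the HOI index.
hoi_idx = anno[anno[:, 0] == img_id, 1].int().tolist()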
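
The slice img_name[-12:-4] relies on the file name ending in eight digits plus '.jpg'. As a minimal sketch (the helper name parse_hico_name is hypothetical and not part of ConsNet), the split and the image ID can also be read with a regular expression. This works on full paths and makes it harder to pass the wrong split to load_anno:

```python
import re

# Hypothetical helper, not part of the ConsNet API.
_HICO_NAME = re.compile(r'HICO_(train|test)2015_(\d{8})\.jpg$')


def parse_hico_name(path):
    """Return (split, img_id) for a HICO-DET image file name or path."""
    match = _HICO_NAME.search(path)
    if match is None:
        raise ValueError(f'not a HICO-DET image name: {path!r}')
    return match.group(1), int(match.group(2))


split, img_id = parse_hico_name(
    'data/hico_20160224_det/images/test2015/HICO_test2015_00007935.jpg')
print(split, img_id)  # test 7935
```

The returned split could then be passed as the second argument of load_anno instead of a hard-coded string.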

yeliudev commented on August 20, 2024

monacv wrote:

I also tried another image, which gives very odd results:

feh data/hico_20160224_det/images/test2015/HICO_test2015_00007935.jpg

(consnet) mona@goku:~/research/code/ConsNet$ python api_demo.py 
[133, 133, 139, 139]
--- 43.073837995529175 seconds ---

133  horse           hug           
134  horse           jump          
135  horse           kiss          
136  horse           load          
137  horse           hop_on        
138  horse           pet           
139  horse           race

(The list above is from hico_list_hoi.txt.)
As you can see, there is no horse in the image.

Here's the api_demo.py code:

import time
start_time = time.time()    
from consnet.api import load_anno

img_name = 'data/hico_20160224_det/images/test2015/HICO_test2015_00007935.jpg'
img_id = int(img_name[-12:-4])

anno = load_anno('data/hico_20160224_det/anno_bbox.mat', 'train')
hoi_idx = anno[anno[:, 0] == img_id, 1].int().tolist()
print(hoi_idx)
print("--- %s seconds ---" % (time.time() - start_time))

I think 40s is a very long time for a lookup.

Reply: You are fetching annotations for the test split, so the second argument of load_anno should be 'test'.

The lookup does not take this long in our own runs. Is most of the time spent in load_anno? That method only needs to be called once for all images.
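
To illustrate the point about calling load_anno only once, here is a minimal sketch that groups annotation rows by image ID in a single pass, so that each later lookup is a dictionary access. It uses plain Python tuples as stand-in data. The assumption, taken from the snippet earlier in this thread, is that each row starts with the image ID followed by the HOI index. The function name build_hoi_index and the sample rows are hypothetical.

```python
from collections import defaultdict


def build_hoi_index(rows):
    """Group HOI indices by image ID in one pass over the annotation rows."""
    index = defaultdict(list)
    for row in rows:
        img_id, hoi_idx = int(row[0]), int(row[1])
        index[img_id].append(hoi_idx)
    return dict(index)


# Stand-in rows of (img_id, hoi_idx); the values are made up for illustration.
rows = [(1, 152), (2, 221), (2, 221), (3, 308)]
index = build_hoi_index(rows)
print(index.get(2, []))  # [221, 221]
```

With the real annotations, something like build_hoi_index(anno.tolist()) may work if anno is a 2-D tensor, as the indexing in the earlier snippet suggests, but that is an assumption rather than documented behaviour.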

monacv commented on August 20, 2024

I just wanted to update that you were correct about the timing: running it for all the train images took only 114 seconds :)

(consnet) mona@goku:~/research/code/ConsNet$ python api_demo.py
--- 114.5227403640747 seconds ---

yeliudev commented on August 20, 2024

monacv wrote:

When I run the MATLAB code that comes with the original HICO_DET dataset, the results are sometimes different (thanks for catching my mistake about load_anno).

Here I am loading the train annotations with load_anno:

$ python api_demo.py
[56, 60, 60, 60, 60, 60, 60, 60, 60, 60, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61]
--- 7.9999425411224365 seconds ---

I get "drive bus" from your results, but I get other results too. Do you know why, and why your results differ from the original MATLAB code? I personally accept that if you are "driving a bus", you are also "riding a bus" and "sitting on a bus".
055 bus board
056 bus direct
057 bus drive
058 bus exit
059 bus inspect
060 bus load
061 bus ride
062 bus sit_on
063 bus wash
064 bus wave

The MATLAB code, however, yields a different result.

Here's the original MATLAB code:

im_root   = '../images/';
bbox_file = '../anno_bbox.mat';

ld = load(bbox_file);
bbox_train = ld.bbox_train;
bbox_test = ld.bbox_test;
list_action = ld.list_action;

% change this
i = 765;  % image index
j = 1;    % hoi index

% read image
im_file = [im_root 'train2015/' bbox_train(i).filename];
im = imread(im_file);

% display image
figure(1);
imshow(im); hold on;

% display hoi
hoi_id = bbox_train(i).hoi(j).id;
aname = [list_action(hoi_id).vname_ing ' ' list_action(hoi_id).nname];
aname = strrep(aname,'_',' ');
title(aname);

% display bbox
if bbox_train(i).hoi(j).invis
    fprintf('hoi not visible\n');
else
    bboxhuman  = bbox_train(i).hoi(j).bboxhuman;
    bboxobject = bbox_train(i).hoi(j).bboxobject;
    connection = bbox_train(i).hoi(j).connection;
    visualize_box_conn_one(bboxhuman, bboxobject, connection, 'b','g');
end

And here's the image name I used: data/hico_20160224_det/images/train2015/HICO_train2015_00000765.jpg

Reply: Both our API and the MATLAB code are correct :) In HICO-DET, a single image may contain multiple HOIs (i.e. multiple human-object pairs performing different actions). There are 23 HOI instances in total in HICO_train2015_00000765.jpg, as can be seen in anno_bbox.mat, but the MATLAB code you provided displays only the first one (j = 1).

Please also note that the API outputs hoi_idx (0 ~ 599) instead of hoi_id (1 ~ 600), so the three HOIs are 56 - drive bus, 60 - ride bus and 61 - sit_on bus.
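
The offset between the two numbering schemes can be made explicit with a short sketch. It uses the API output and the bus entries listed in this thread; the dictionary and variable names are illustrative only. Adding 1 to each hoi_idx gives the 1-based hoi_id (so index 56 corresponds to ID 57), and counting duplicates shows how the 23 instances fall into three distinct HOIs:

```python
from collections import Counter

# hoi_id (1-based) -> (object, verb), copied from the list in this thread.
BUS_HOIS = {
    55: ('bus', 'board'), 56: ('bus', 'direct'), 57: ('bus', 'drive'),
    58: ('bus', 'exit'), 59: ('bus', 'inspect'), 60: ('bus', 'load'),
    61: ('bus', 'ride'), 62: ('bus', 'sit_on'), 63: ('bus', 'wash'),
    64: ('bus', 'wave'),
}

# API output for HICO_train2015_00000765.jpg: 23 instances in total.
hoi_idx = [56] + [60] * 9 + [61] * 13

summary = {}
for idx, count in sorted(Counter(hoi_idx).items()):
    obj, verb = BUS_HOIS[idx + 1]  # hoi_id = hoi_idx + 1
    summary[idx] = (f'{verb} {obj}', count)
    print(idx, verb, obj, count)
```

This prints one line per distinct HOI: drive bus once, ride bus nine times and sit_on bus thirteen times.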

monacv commented on August 20, 2024

Thanks a lot. I ran the following and it took quite a while, which means it would take very long if I ran it for all images.
I am also not sure why it has "eat cake" as an annotation; there is no eating in this photo. Is there a way I could accelerate this for all images?
data/hico_20160224_det/images/train2015/HICO_train2015_00001549.jpg

yeliudev commented on August 20, 2024

I think this may be a problem in HICO-DET's original annotations. Please double-check it in anno_bbox.mat.

monacv commented on August 20, 2024

Regarding the cake, I made a mistake: I didn't account for the off-by-one in your indices.
So yours returns "carry cake" and "hold cake", which makes sense, but the original MATLAB code returns only "carry cake". If you could shed any light on why the original code doesn't return "hold cake", that would be really great. Thanks for the nice API.

yeliudev commented on August 20, 2024

monacv wrote:

I just wanted to update that you were correct about the timing: running it for all the train images took only 114 seconds :)

(consnet) mona@goku:~/research/code/ConsNet$ python api_demo.py
--- 114.5227403640747 seconds ---

Reply: Thanks for reporting this :)
