microsoft / scene_graph_benchmark

image scene graph generation benchmark

License: MIT License

Dockerfile 0.36% Python 87.37% C++ 1.55% Cuda 9.88% C 0.83%

scene_graph_benchmark's Issues

demo_image.py not working

Hi,

I am running the following code:

python tools/demo/demo_image.py --config_file sgg_configs/vgattr/vinvl_x152c4.yaml --img_file women_fish.jpg --save_file output/woman_fish_x152c4.obj.jpg MODEL.WEIGHT models/vinvl/vinvl_vg_x152c4.pth MODEL.ROI_HEADS.NMS_FILTER 1 MODEL.ROI_HEADS.SCORE_THRESH 0.2 DATA_DIR "." TEST.IGNORE_BOX_REGRESSION False

Here is the error:

    rel_subj_centers = [r['subj_center'] for r in rel_dets]
UnboundLocalError: local variable 'rel_dets' referenced before assignment

I believe the bug is at this line:

scene_graph_benchmark/tools/demo/demo_image.py

Line 118 in f91725d

if isinstance(model, SceneParser):
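
The failure mode can be reproduced in isolation. The sketch below is not the repository's code: `summarize_buggy`, `summarize_fixed`, and the `is_scene_parser` flag are hypothetical stand-ins for the branch quoted above. It assumes `rel_dets` is assigned only inside that branch, so a detector-only model (presumably the case with the VinVL config) skips the assignment.

```python
def summarize_buggy(detections, is_scene_parser):
    # rel_dets is assigned only when the branch is taken ...
    if is_scene_parser:
        rel_dets = [d for d in detections if "subj_center" in d]
    # ... so this line raises UnboundLocalError when it is skipped.
    return [r["subj_center"] for r in rel_dets]


def summarize_fixed(detections, is_scene_parser):
    rel_dets = []  # default: a detector-only model produces no relations
    if is_scene_parser:
        rel_dets = [d for d in detections if "subj_center" in d]
    return [r["subj_center"] for r in rel_dets]
```

Initializing the name before the branch is one possible workaround; guarding every later use of `rel_dets` with the same condition would work as well.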

Index error when accessing box features for RelDN model

Hi there,
I'm trying to use the model sgg_configs/vg_rvd/rel_danfeiX_FPN50_reldn.yaml to generate scene graphs for a custom dataset. The bounding box proposals work fine, but there seems to be a bug in the way the proposal_pairs are computed. In particular, I get the following exception:

Traceback (most recent call last):
  File "/Users/asuglia/opt/anaconda3/envs/devel/lib/python3.8/contextlib.py", line 131, in __exit__
    self.gen.throw(type, value, traceback)
  File "/Users/asuglia/opt/anaconda3/envs/devel/lib/python3.8/site-packages/pytorch_lightning/profiler/profilers.py", line 103, in profile
    yield action_name
  File "/Users/asuglia/opt/anaconda3/envs/devel/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1088, in run_predict
    self.predict_loop.predict_step(batch, batch_idx, dataloader_idx)
  File "/Users/asuglia/opt/anaconda3/envs/devel/lib/python3.8/site-packages/pytorch_lightning/trainer/predict_loop.py", line 111, in predict_step
    predictions = self.trainer.accelerator.predict_step(args)
  File "/Users/asuglia/opt/anaconda3/envs/devel/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 265, in predict_step
    return self.training_type_plugin.predict_step(*args)
  File "/Users/asuglia/opt/anaconda3/envs/devel/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 167, in predict_step
    return self.lightning_module.predict_step(*args, **kwargs)
  File "/Users/asuglia/workspace/scene_graph_benchmark/tools/video/extract_features.py", line 311, in predict_step
    predictions = self.model(images)
  File "/Users/asuglia/opt/anaconda3/envs/devel/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Users/asuglia/workspace/scene_graph_benchmark/scene_graph_benchmark/scene_parser.py", line 319, in forward
    x_pairs, prediction_pairs, relation_losses = self.relation_head(features, predictions, targets)
  File "/Users/asuglia/opt/anaconda3/envs/devel/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Users/asuglia/workspace/scene_graph_benchmark/scene_graph_benchmark/relation_head/relation_head.py", line 211, in forward
    = self.rel_predictor(features, proposals, proposal_pairs)
  File "/Users/asuglia/opt/anaconda3/envs/devel/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/Users/asuglia/workspace/scene_graph_benchmark/scene_graph_benchmark/relation_head/reldn/reldn.py", line 103, in forward
    sub_vert_per_image = proposal_per_image.get_field("subj_box_features")[rel_ind_i[:, 0]]
IndexError: index 49 is out of bounds for dimension 0 with size 38

This seems to happen because the bounding-box indexes in rel_ind_i exceed the number of bounding boxes in proposal_per_image.get_field("subj_box_features"). In this specific case, proposal_per_image.get_field("subj_box_features") has shape torch.Size([38, 1024]), while rel_ind_i[:, 0] contains the following indexes:

tensor([49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 50, 50, 50, 50, 50, 50,
        50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 51, 51, 51,
        51, 51, 51, 51, 51, 51, 51, 51, 51, 51, 51, 51, 51, 51, 52, 52, 52, 52,
        52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52,
        52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 53, 53, 53,
        53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53,
        53, 53, 53, 53, 53, 53, 53, 53, 54, 54, 54, 54, 54, 54, 54, 54, 54, 54,
        54, 54, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55,
        55, 55, 56, 56, 56, 56, 56, 56, 56, 56, 56, 56, 56, 56, 56, 56, 56, 56,
        57, 57, 57, 57, 57, 57, 57, 57, 57, 57, 57, 58, 58, 58, 58, 58, 58, 58,
        58, 58, 58, 58, 58, 58, 58, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59,
        59, 59, 59, 60, 60, 60, 60, 60, 60, 60, 60, 60, 60, 60, 60, 60, 61, 61,
        61, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61,
        61, 61, 61, 62, 62, 62, 62, 62, 62, 62, 62, 62, 62, 62, 62, 62, 62, 63,
        63, 63, 63, 63, 63, 63, 63, 63, 63, 63, 63, 63, 63, 64, 64, 64, 64, 64,
        64, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65,
        65, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 67, 67, 67, 67,
        67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67,
        67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 68, 68, 68, 68,
        68, 68, 68, 68, 68, 68, 68, 68, 68, 68, 68, 68, 68, 68, 68, 68, 68, 68,
        68, 68, 68, 68, 69, 69, 69, 69, 69, 69, 69, 69, 69, 69, 69, 69, 69, 69,
        69, 69, 70, 70, 70, 70, 70, 70, 70, 70, 70, 70, 70, 70, 70, 70, 71, 71,
        71, 71, 71, 71, 71, 71, 71, 71, 71, 71, 71, 71, 71, 71, 72, 72, 72, 72,
        72, 72, 72, 72, 72, 72, 72, 72, 72, 73, 73, 73, 73, 73, 73, 73, 73, 73,
        73, 73, 73, 73, 73, 73, 73, 73, 73, 73, 73, 73, 73, 73, 74, 74, 74, 74,
        74, 74, 74, 74, 74, 74, 74, 74, 74, 74, 74, 75, 75, 75, 75, 75, 75, 75,
        75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75,
        75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 76, 76, 76, 76, 76, 76, 76,
        76, 76, 76, 76, 76, 76, 76, 76, 76, 76, 76, 76, 76, 76, 76, 76, 76, 76,
        76, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 78, 78, 78, 78, 78, 78, 78,
        78, 79, 79, 79, 79, 79, 79, 79, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80,
        80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80,
        80, 80, 80, 80, 80, 80, 80, 80, 80, 81, 81, 81, 81, 81, 81, 81, 81, 81,
        81, 81, 81, 81, 81, 81, 81, 81, 81, 81, 81, 82, 82, 82, 82, 82, 82, 82,
        82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82,
        82, 83, 83, 83, 83, 83, 83, 83, 83, 83, 83, 83, 83, 83, 83, 83, 83, 83,
        83, 83, 83, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 85, 85, 85, 85,
        85, 85, 85, 85, 85, 85, 85, 85, 86, 86, 86, 86, 86, 86, 86, 86, 86, 86,
        86, 86, 86, 86, 86, 86, 86, 86])

I'm running this code with batch size 2. I thought that the error might be in the way the proposal_pairs are generated. However, the exception happens when either of these lines is executed to generate the proposal_pairs:

Do you think that an offset is required here: https://github.com/microsoft/scene_graph_benchmark/blob/main/scene_graph_benchmark/relation_head/reldn/reldn.py#L98

@hanxiaotian Could you please advise?
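
The numbers in the report are consistent with batch-global indices being applied to a per-image tensor: the indexes run from 49 to 86, which is exactly 38 distinct values, the size of the feature tensor. The sketch below illustrates the offset idea in plain Python. `to_local_pairs` and the box counts `[49, 38]` are assumptions made for illustration; whether reldn.py actually builds its pairs this way is not confirmed here.

```python
def to_local_pairs(global_pairs, boxes_per_image):
    """Split (subject, object) pairs that index into the concatenated box
    list of a batch into per-image pairs with image-local indices."""
    per_image = []
    offset = 0
    for n_boxes in boxes_per_image:
        local = [
            (s - offset, o - offset)
            for s, o in global_pairs
            if offset <= s < offset + n_boxes and offset <= o < offset + n_boxes
        ]
        per_image.append(local)
        offset += n_boxes  # the next image starts after this image's boxes
    return per_image


# Hypothetical batch of two images with 49 and 38 boxes.
pairs = [(0, 1), (49, 50), (86, 49)]
local_pairs = to_local_pairs(pairs, [49, 38])
```

With the offset subtracted, every index for the second image falls in the range 0 to 37 and can safely index a tensor of size 38.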

Broken links to VinVL model and associated labelmaps

Hi!
These links are broken (resource not found error):
https://penzhanwu2.blob.core.windows.net/sgg/sgg_benchmark/vinvl_model_zoo/vinvl_vg_x152c4.pth
https://penzhanwu2.blob.core.windows.net/sgg/sgg_benchmark/vinvl_model_zoo/VG-SGG-dicts-vgoi6-clipped.json

Slow feature extraction compared to bottom-up-attention

Hi, thanks for the great work and open-sourcing this project.

I'm excited to try VinVL since, according to the paper, it promises faster feature extraction than bottom-up-attention.

I have created my own TSV file using tsv_demo.py and ran tools/test_sg_net.py to do feature extraction.
Unfortunately, the feature extraction runs quite slowly.
Right now I'm using PyTorch 1.7 on Debian 10 with one NVIDIA T4.
Feature extraction took 9 seconds per 4 images.

I used bottom-up-attention from https://github.com/airsplay/py-bottom-up-attention and https://github.com/peteanderson80/bottom-up-attention with OSCAR on the same dataset. These repos give much faster feature extraction on a similar machine: the first needs 2.7 seconds per 8 images, while the original Caffe bottom-up-attention took less than 1 second per image. This contradicts what is written in your paper.

Here are the key config options I'm using when running tools/test_sg_net.py:

TEST:
    IMS_PER_BATCH: 4
    IGNORE_BOX_REGRESSION: True
    SKIP_PERFORMANCE_EVAL: True
    SAVE_PREDICTIONS: True
    SAVE_RESULTS_TO_TSV: True
    TSV_SAVE_SUBSET: ['rect', 'class', 'conf', 'feature']
    GATHER_ON_CPU: True
    OUTPUT_FEATURE : True

I checked nvidia-smi and it shows that my GPU is working.

Does anyone else have this issue?
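
To compare the timings quoted in this report on a common basis, they can be converted to seconds per image. The snippet below only restates the reported figures; the helper name `seconds_per_image` is made up for illustration, and the comparison ignores differences in backbone and batch size.

```python
def seconds_per_image(total_seconds, num_images):
    """Average wall-clock time spent on one image."""
    return total_seconds / num_images


vinvl = seconds_per_image(9.0, 4)    # reported: 9 seconds per 4 images
py_bua = seconds_per_image(2.7, 8)   # reported: 2.7 seconds per 8 images
slowdown = vinvl / py_bua            # ratio of the two per-image times
```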

Is this a bug in extracting visual features?

Hi,
I ran demo_image.py and found something inconsistent.
In the code, the image is converted to RGB format before being fed into the detection model.

However, in the config file, PIXEL_MEAN is [103.530, 116.280, 123.675], which is in BGR order. In tsv_demo.py, the image is also in BGR format, as read by cv2.
So I am confused: which is right?
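
Why the channel order matters can be seen from a single pixel. The sketch below is a plain-Python illustration, not the repository's transform: `subtract_mean`, `rgb_to_bgr`, and the sample pixel are hypothetical. Upstream maskrcnn_benchmark has an `INPUT.TO_BGR255` option that flips channels inside its normalization transform, which may explain the apparent inconsistency; whether it is enabled in the config being used is an assumption worth checking.

```python
# Mean from the config, read as B, G, R order as described in the report.
PIXEL_MEAN = [103.530, 116.280, 123.675]


def subtract_mean(pixel, mean):
    """Channel-wise mean subtraction for one pixel."""
    return [c - m for c, m in zip(pixel, mean)]


def rgb_to_bgr(pixel):
    """Reverse the channel order of a 3-channel pixel."""
    return pixel[::-1]


rgb_pixel = [200.0, 120.0, 40.0]  # hypothetical pixel in R, G, B order

# Wrong: an RGB pixel normalized with a BGR mean.
mismatched = subtract_mean(rgb_pixel, PIXEL_MEAN)
# Consistent: flip to BGR first, then subtract the BGR mean.
matched = subtract_mean(rgb_to_bgr(rgb_pixel), PIXEL_MEAN)
```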

missing model weight when training on visualgenome

I attempted to train the RelDN model on Visual Genome by executing this command:

python tools/train_sg_net.py --config-file sgg_configs/vg_vrd/rel_danfeiX_FPN50_reldn.yaml

and got the following error:

FileNotFoundError: [Errno 2] No such file or directory: 'models/vgvrd/vgnm_usefpTrue_objctx0_edgectx2/model_final.pth'

I searched the repository but found no information about the file vgnm_usefpTrue_objctx0_edgectx2/model_final.pth. Could you please provide more details about where we can download this model file? Thanks in advance.

git clone https://github.com/hanxiaotian/scene_graph_benchmark.git

This might be a wrong path.

Inference on COCO/NoCaps dataset

Is it possible to run inference with the model on the COCO dataset?

Could you please show me an easy way to generate features for single images?

The instructions seem complicated. I followed the guide, but only bounding boxes are created, without features.
Could you give us a reproducible example of using the demo to extract features?

Pre-extracted Image Features: what OD model is used?

Hi,
Here, we can easily use pre-extracted image features.

I thought these features were from the VinVL OD model trained on the four merged datasets: COCO with stuff, Visual Genome, Objects365 and Open Images.

However, I found that the features and corresponding labels (object tags) come only from the Visual Genome dataset, which gives inferior performance compared with the four merged datasets (according to the VinVL paper).

So I want to clarify whether the given image features are from the pretrained X152-C4 object-attribute detection model (based on only the Visual Genome dataset) or from the model pretrained on the four merged datasets.

Thanks

Running setup.py fails with link.exe error 1181 on the Windows platform

Dear author:
Thanks for your great work. When I run your "setup.py" on Windows 10, I get the link.exe error "fatal error LNK1181: cannot open ROIAlign.obj file". I tried VS2015 and VS2019 with VC++ 14.0 and 16.0, and both give the same problem. I checked, and the ROIAlign.obj file is missing from the file path. Can you tell me how to fix the problem?
It works on the Ubuntu platform.
Thanks very much, I look forward to your reply.

predcls ValueError: object labelmap is required, but was not provided

Hi, I would like some suggestions for the case where there are more object labels. I found that the project provides a labelmap file, but it only covers 50 objects, which is too few. I ran on COCO 2014, where the object label 1370 is far beyond that range. Should we just add more ids and names to the labelmap file, or do we have to train the project's object detector from scratch? I have the labels for the COCO 2014 36-box features, but I don't know how to get a labelmap file with more object labels.

If you know a way, please help me. Thanks.

KeyError: 'broccoli'
Killing subprocess 4356

About Vinvl R50-C4 Model

Could you please share your VinVL R50-C4 detection model, which is referred to in Oscar plus? I want to run some further experiments with this model.
Thank you!

Where can I find the dataset yaml in the config file?

In every config file, DATASETS.TRAIN points to a yaml path, but I cannot find that file.

TypeError: run_test() got an unexpected keyword argument 'model_name'

What is the reason for the error "TypeError: run_test() got an unexpected keyword argument 'model_name'" during single-GPU training?

When I extract image features with VinVL, an AssertionError occurs

Hello! When I extract image features with VinVL using the command:

python tools/test_sg_net.py --config-file sgg_configs/vgattr/vinvl_x152c4.yaml TEST.IMS_PER_BATCH 2 MODEL.WEIGHT models/vinvl/vinvl_vg_x152c4.pth MODEL.ROI_HEADS.NMS_FILTER 1 MODEL.ROI_HEADS.SCORE_THRESH 0.2 DATA_DIR "../maskrcnn-benchmark-1/datasets1" TEST.IGNORE_BOX_REGRESSION True MODEL.ATTRIBUTE_ON True TEST.OUTPUT_FEATURE True

I get the traceback:

Traceback (most recent call last):
  File "tools/test_sg_net.py", line 129, in <module>
    main()
  File "tools/test_sg_net.py", line 106, in main
    data_loaders_val = make_data_loader(cfg, is_train=False, is_distributed=distributed)
  File "/content/drive/MyDrive/VinVL/scene_graph_benchmark/scene_graph_benchmark/maskrcnn_benchmark/data/build.py", line 170, in make_data_loader
    datasets = build_dataset(cfg, transforms, DatasetCatalog, is_train or is_for_period)
  File "/content/drive/MyDrive/VinVL/scene_graph_benchmark/scene_graph_benchmark/maskrcnn_benchmark/data/build.py", line 45, in build_dataset
    cfg, dataset_name, factory_name, is_train
  File "/content/drive/MyDrive/VinVL/scene_graph_benchmark/scene_graph_benchmark/maskrcnn_benchmark/data/datasets/utils/config_args.py", line 7, in config_tsv_dataset_args
    assert op.isfile(full_yaml_file)
AssertionError

I guess the reason is that the file flickr30k/tsv/flickr30k.yaml does not exist, so where can I find this yaml file?

I also tried deleting the line:
TEST: ("flickr30k/tsv/flickr30k.yaml",)
in the file vinvl_x152c4.yaml,
but after running the code, there is no output.
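
The bare `assert op.isfile(full_yaml_file)` in the traceback gives no hint about which path was checked. The sketch below shows a more explicit check that could be run before launching the script. It assumes the yaml path is formed by joining DATA_DIR with the DATASETS.TEST entry; `resolve_dataset_yaml` is a hypothetical helper, not a function from the repository.

```python
import os.path as op


def resolve_dataset_yaml(data_dir, dataset_entry):
    """Return the full path of a dataset yaml, or raise with the path shown.

    Assumption: the loader looks for <DATA_DIR>/<DATASETS.TEST entry>.
    """
    full_yaml_file = op.join(data_dir, dataset_entry)
    if not op.isfile(full_yaml_file):
        raise FileNotFoundError(
            f"dataset yaml not found: {full_yaml_file} "
            "(check DATA_DIR and the DATASETS.TEST entry)"
        )
    return full_yaml_file
```

For the command above this would be called as `resolve_dataset_yaml("../maskrcnn-benchmark-1/datasets1", "flickr30k/tsv/flickr30k.yaml")`, which makes a missing file visible as a full path instead of an empty AssertionError.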

Extracting features for a directory of images

I'm slightly confused by the instructions on how to extract features from the test_sg_net.py script - more specifically:

what format the data directory has to be in (i.e. something more than just a directory containing images) and
what variables need to be set in the sgg_configs/vgattr/vinvl_x152c4.yaml config in order to point to my data directory (I'm not sure what the variables DATASETS.TEST and DATA_DIR need to be - I assumed the latter was the directory with all the images I need features for, but that does not seem to be the case)

RuntimeError: CUDA error: invalid device function

When I try to run

python tools/test_sg_net.py --config-file sgg_configs/vgattr/vinvl_x152c4.yaml TEST.IMS_PER_BATCH 1 MODEL.WEIGHT models/vinvl/vinvl_vg_x152c4.pth MODEL.ROI_HEADS.NMS_FILTER 1 \
MODEL.ROI_HEADS.SCORE_THRESH 0.2 DATA_DIR /my_path_to_prepard_tsv/dataset/tsv TEST.IGNORE_BOX_REGRESSION True MODEL.ATTRIBUTE_ON True TEST.OUTPUT_FEATURE True

with environment

PyTorch version: 1.4.0
Is debug build: No
CUDA used to build PyTorch: 10.1

OS: Ubuntu 16.04.7 LTS
GCC version: (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
CMake version: version 3.5.1

Python version: 3.7
Is CUDA available: Yes
CUDA runtime version: 10.1.243
GPU models and configuration: 
GPU 0: Tesla V100-SXM2-32GB
GPU 1: Tesla V100-SXM2-32GB
GPU 2: Tesla V100-SXM2-32GB
GPU 3: Tesla V100-SXM2-32GB
GPU 4: Tesla V100-SXM2-32GB
GPU 5: Tesla V100-SXM2-32GB
GPU 6: Tesla V100-SXM2-32GB
GPU 7: Tesla V100-SXM2-32GB

Nvidia driver version: 418.67
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5

Versions of relevant libraries:
[pip3] numpy==1.19.2
[pip3] torch==1.4.0
[pip3] torchvision==0.5.0
[conda] blas                      1.0                         mkl  
[conda] mkl                       2020.2                      256  
[conda] mkl-service               2.3.0            py37he8ac12f_0  
[conda] mkl_fft                   1.3.0            py37h54f3939_0  
[conda] mkl_random                1.1.1            py37h0573a6f_0  
[conda] pytorch                   1.4.0           py3.7_cuda10.1.243_cudnn7.6.3_0    pytorch
[conda] torchvision               0.5.0                py37_cu101    pytorch
        Pillow (8.2.0)

I encountered the following error:

RuntimeError: CUDA error: invalid device function (launch_kernel at /opt/conda/conda-bld/pytorch_1579022060824/work/aten/src/ATen/native/cuda/Loops.cuh:103)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x47 (0x7f3f8ee64627 in /miniconda/envs/py37/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: void at::native::gpu_index_kernel<__nv_dl_wrapper_t<__nv_dl_tag<void (*)(at::TensorIterator&, c10::ArrayRef<long>, c10::ArrayRef<long>), &(void at::native::index_kernel_impl<at::native::OpaqueType<8> >(at::TensorIterator&, c10::ArrayRef<long>, c10::ArrayRef<long>)), 1u>> >(at::TensorIterator&, c10::ArrayRef<long>, c10::ArrayRef<long>, __nv_dl_wrapper_t<__nv_dl_tag<void (*)(at::TensorIterator&, c10::ArrayRef<long>, c10::ArrayRef<long>), &(void at::native::index_kernel_impl<at::native::OpaqueType<8> >(at::TensorIterator&, c10::ArrayRef<long>, c10::ArrayRef<long>)), 1u>> const&) + 0x78d (0x7f3f9670368d in /miniconda/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #2: <unknown function> + 0x571bf32 (0x7f3f966fcf32 in /miniconda/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #3: <unknown function> + 0x571c298 (0x7f3f966fd298 in /miniconda/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #4: <unknown function> + 0x16957eb (0x7f3f926767eb in /miniconda/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #5: at::native::index(at::Tensor const&, c10::ArrayRef<at::Tensor>) + 0x47e (0x7f3f926725ae in /miniconda/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #6: <unknown function> + 0x1c0155a (0x7f3f92be255a in /miniconda/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #7: <unknown function> + 0x1c06023 (0x7f3f92be7023 in /miniconda/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #8: <unknown function> + 0x3820d1a (0x7f3f94801d1a in /miniconda/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #9: <unknown function> + 0x1c06023 (0x7f3f92be7023 in /miniconda/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #10: at::Tensor::index(c10::ArrayRef<at::Tensor>) const + 0x191 (0x7f3fc1465931 in /miniconda/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #11: nms_cuda(at::Tensor, float) + 0x7e8 (0x7f3f6982407b in ./maskrcnn_benchmark/_C.cpython-37m-x86_64-linux-gnu.so)
frame #12: nms(at::Tensor const&, at::Tensor const&, float) + 0x790 (0x7f3f697eabb0 in ./maskrcnn_benchmark/_C.cpython-37m-x86_64-linux-gnu.so)
frame #13: <unknown function> + 0x53b97 (0x7f3f697fbb97 in ./maskrcnn_benchmark/_C.cpython-37m-x86_64-linux-gnu.so)
frame #14: <unknown function> + 0x5004d (0x7f3f697f804d in ./maskrcnn_benchmark/_C.cpython-37m-x86_64-linux-gnu.so)
<omitting python frames>

I tried setting up my environment both with option 1 and with the docker image; both give me the same error.
If anyone has had the same issue, please guide me.
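
A common cause of "invalid device function" is that compiled CUDA kernels (here, possibly the `_C` extension whose `nms_cuda` appears in frame #11) contain no code for the GPU's compute capability. Whether that applies to this report is not established. The sketch below checks a `TORCH_CUDA_ARCH_LIST`-style string, the variable PyTorch's extension builder reads, against a capability such as (7, 0) for a V100. `arch_list_covers` is a hypothetical helper, and the exact-match rule is a simplification that ignores PTX forward compatibility.

```python
def arch_list_covers(arch_list, capability):
    """Return True if an arch list such as "6.0;6.1" or "6.1 7.0+PTX"
    names the given (major, minor) compute capability exactly."""
    wanted = f"{capability[0]}.{capability[1]}"
    # Entries may be separated by semicolons or whitespace.
    entries = arch_list.replace(";", " ").split()
    # Strip a trailing "+PTX" marker before comparing.
    return any(entry.split("+")[0] == wanted for entry in entries)
```

If the check fails for the arch list used at build time, rebuilding the extension with the GPU's capability included is one thing to try.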

Question about image ids

I thought the key in the tsv file ( visualgenome/label_danfeiX_overlap.new.tsv ) represented the image ID, but I found out that it does not.

from maskrcnn_benchmark.structures.tsv_file_ops import tsv_reader
import glob

# get image ids from Visual Genome image files
img_files = glob.glob('/VisualGenome/VG_100K/*.jpg') + glob.glob('/VisualGenome/VG_100K_2/*.jpg')
image_ids_from_files = [img_file.split('/')[-1].split('.')[0] for img_file in img_files]

# get image ids from the scene graph annotation files
tsv = tsv_reader('datasets/visualgenome/label_danfeiX_overlap.new.tsv')
image_ids_from_annos = [row[0] for row in tsv]

# extract the overlap between the two data
overlap = set(image_ids_from_annos) & set(image_ids_from_files)

print(f'size of iid from files: {len(image_ids_from_files)}')
print(f'size of iid from annotations: {len(image_ids_from_annos)}')
print(f'size of overlapped iid: {len(overlap)}')

The output of this code looks like this:

size of iid from files: 108249
size of iid from annotations: 108073
size of overlapped iid: 5196

I think this shows that about 95% of the image ids are mismatched.
How can I get the correct image id mapping?
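
The "about 95%" figure follows directly from the printed sizes. The helper below, a hypothetical `mismatch_ratio`, restates that arithmetic; it says nothing about what the TSV keys actually represent.

```python
def mismatch_ratio(anno_ids, file_ids):
    """Fraction of annotation ids that have no matching image file id."""
    anno = set(anno_ids)
    overlap = anno & set(file_ids)
    return 1 - len(overlap) / len(anno)


# Using the sizes reported above: 5196 overlapping ids out of 108073.
reported = 1 - 5196 / 108073
```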

How to extract the scene graph?

I would like to extract the scene graph like what is shown on R152FPN_demo.png. Which script should I try?

Several issues in VinVL feature extraction

After installing the environment step by step following option 1, I ran the command below:

# extract vision features with VinVL object-attribute detection model
# pretrained models at https://penzhanwu2.blob.core.windows.net/sgg/sgg_benchmark/vinvl_model_zoo/vinvl_vg_x152c4.pth
# the associated labelmap at https://penzhanwu2.blob.core.windows.net/sgg/sgg_benchmark/vinvl_model_zoo/VG-SGG-dicts-vgoi6-clipped.json
python tools/test_sg_net.py --config-file sgg_configs/vgattr/vinvl_x152c4.yaml TEST.IMS_PER_BATCH 2 MODEL.WEIGHT models/vinvl/vinvl_vg_x152c4.pth MODEL.ROI_HEADS.NMS_FILTER 1 MODEL.ROI_HEADS.SCORE_THRESH 0.2 DATA_DIR "../maskrcnn-benchmark-1/datasets1" TEST.IGNORE_BOX_REGRESSION True MODEL.ATTRIBUTE_ON True

There are several issues:

I cannot find the code that loads the pre-trained model parameters for "AttrRCNN". Although the command includes a pre-trained model path, I did not find a concrete torch.load() call while debugging. So I wonder whether I need to add torch.load() myself when running the above command.
The "self.training" attribute in "AttrRCNN" is inherited from "torch.nn.modules.module.py", where it is True by default. But when running the command above to extract VinVL features, it seems it should be False, so I had to override the init functions of AttrRCNN, its "self.rpn", and its "self.roi_heads", and pass the flag as below:

    proposals, proposal_losses = self.rpn(images, features, targets, is_training = self.training)
    x, predictions, detector_losses = self.roi_heads(features,  proposals, targets, is_training = self.training)

Instead of PyTorch 1.4 I use PyTorch 1.7, but it always raises runtime errors for several in-place operations, such as the code below in "bounding_box.py":

    def clip_to_image(self, remove_empty=True):
        TO_REMOVE = 1
        self.bbox[:, 0].clamp_(min=0, max=self.size[0] - TO_REMOVE)
        self.bbox[:, 1].clamp_(min=0, max=self.size[1] - TO_REMOVE)
        self.bbox[:, 2].clamp_(min=0, max=self.size[0] - TO_REMOVE)
        self.bbox[:, 3].clamp_(min=0, max=self.size[1] - TO_REMOVE)

The error is as below:

  File "/home/jfhe/Documents/MountHe/jfhe/mm_dialogue/MM_Dialogue/scene_graph_benchmark/maskrcnn_benchmark/modeling/rpn/rpn.py", line 188, in _forward_test
    boxes = self.box_selector_test(anchors, objectness, rpn_box_regression)
  File "/home/jfhe/anaconda3/envs/JD2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/jfhe/Documents/MountHe/jfhe/mm_dialogue/MM_Dialogue/scene_graph_benchmark/maskrcnn_benchmark/modeling/rpn/inference.py", line 140, in forward
    sampled_boxes.append(self.forward_for_single_feature_map(a, o, b))
  File "/home/jfhe/Documents/MountHe/jfhe/mm_dialogue/MM_Dialogue/scene_graph_benchmark/maskrcnn_benchmark/modeling/rpn/inference.py", line 114, in forward_for_single_feature_map
    boxlist = boxlist.clip_to_image(remove_empty=False)
  File "/home/jfhe/Documents/MountHe/jfhe/mm_dialogue/MM_Dialogue/scene_graph_benchmark/maskrcnn_benchmark/structures/bounding_box.py", line 217, in clip_to_image
    self.bbox[:, 1].clamp_(min=0, max=self.size[1] - TO_REMOVE)
RuntimeError: Output 0 of UnbindBackward is a view and its base or another view of its base has been modified inplace. This view is the output of a function that returns multiple views. Such functions do not allow the output views to be modified inplace. You should replace the inplace operation by an out-of-place one.

I worked around them by wrapping the code in "with torch.no_grad()", but that feels strange. If I need to fine-tune the model, these errors will come back once "with torch.no_grad()" is removed.

Also, I agree with the top question, #25.
Could you please provide a simpler way to extract the VinVL features directly? It would be a great help to the community, and we will definitely cite your work.

ModelZoo contains broken links

Hi there,

It looks like the ModelZoo contains broken links for all the OpenImages models such as:

Would it be possible to fix them!?

Thanks,
Alessandro

Res152c4 on 4 datasets seems not right

VinVL's DOWNLOAD.md says: "We also provide the X152-C4 object detection config file and pretrained model on the merged four datasets (COCO with stuff, Visual Genome, Objects365 and Open Images). The labelmap to decode the 1848 can be found here. The first 1594 classes are exactly VG classes, with the same order. The map from COCO vocabulary to this merged vocabulary can be found here. The map from Objects365 vocabulary to this merged vocabulary can be found here. The map from OpenImages V5 vocabulary to this merged vocabulary can be found here."

But I am wondering how to run this pretrained model.
Apparently Scene Graph Benchmark can't run this pre-trained model as-is, since the configuration file is not compatible with the package. I forcibly changed the config file (deleting options one by one until yacs accepted it) and managed to run the pre-trained model, but the results are not right: the number of boxes is too small compared to other detectors (which should not be the case) ...

Any help please?

Failed to set up

The following error occurs when I build with setup.py: #error "You're running a too old version of GCC. We need GCC 5 or later."

How to generate train.labelmap.tsv ?

Could you please tell me how to generate train.labelmap.tsv ? It seems that there is no code about it in the tsv_demo.py. @hanxiaotian Thanks.

demo yaml

Can you provide flickr30k/tsv/flickr30k.yaml as specified in vgattr/vinvl_x152c4.yaml?

python tools/test_sg_net.py --config-file sgg_configs/vgattr/vinvl_x152c4.yaml TEST.IMS_PER_BATCH 2 MODEL.WEIGHT models/vinvl/vinvl_vg_x152c4.pth MODEL.ROI_HEADS.NMS_FILTER 1 MODEL.ROI_HEADS.SCORE_THRESH 0.2 DATA_DIR "../maskrcnn-benchmark-1/datasets1" TEST.IGNORE_BOX_REGRESSION True MODEL.ATTRIBUTE_ON True

Hi, I found a way to avoid tsv files by modifying `scene_graph_benchmark/tools/demo/demo_image.py`, so now I only need a `jpg` image dataset, the VinVL yaml configuration file and the model weight file. The predictions are saved in a dictionary and stored in `pth` format. I ran it on Google Colab and it generates predictions at a rate of about 2s/image. I hope this helps.

# pretrained models at https://penzhanwu2.blob.core.windows.net/sgg/sgg_benchmark/vinvl_model_zoo/vinvl_vg_x152c4.pth
# the associated labelmap at https://penzhanwu2.blob.core.windows.net/sgg/sgg_benchmark/vinvl_model_zoo/VG-SGG-dicts-vgoi6-clipped.json

import cv2
import torch  # needed for torch.no_grad() and torch.save() below
import os
import os.path as op
import argparse
import json
from PIL import Image


from scene_graph_benchmark.scene_parser import SceneParser
from scene_graph_benchmark.AttrRCNN import AttrRCNN
from maskrcnn_benchmark.data.transforms import build_transforms
from maskrcnn_benchmark.utils.checkpoint import DetectronCheckpointer
from maskrcnn_benchmark.config import cfg
from scene_graph_benchmark.config import sg_cfg
from maskrcnn_benchmark.data.datasets.utils.load_files import \
    config_dataset_file
from maskrcnn_benchmark.data.datasets.utils.load_files import load_labelmap_file
from maskrcnn_benchmark.utils.miscellaneous import mkdir

def cv2Img_to_Image(input_img):
    cv2_img = input_img.copy()
    img = cv2.cvtColor(cv2_img, cv2.COLOR_BGR2RGB)
    img = Image.fromarray(img)
    return img


def detect_objects_on_single_image(model, transforms, cv2_img):
    # cv2_img is the original input, so we can get the height and 
    # width information to scale the output boxes.
    img_input = cv2Img_to_Image(cv2_img)
    img_input, _ = transforms(img_input, target=None)
    img_input = img_input.to(model.device)

    with torch.no_grad():
        prediction = model(img_input)[0].to('cpu')
    #     prediction = prediction[0].to(torch.device("cpu"))

    img_height = cv2_img.shape[0]
    img_width = cv2_img.shape[1]

    prediction = prediction.resize((img_width, img_height))
    
    return prediction

#Setting configuration
cfg.set_new_allowed(True)
cfg.merge_from_other_cfg(sg_cfg)
cfg.set_new_allowed(False)
#Configuring VinVl
cfg.merge_from_file('/scene_graph_benchmark/sgg_configs/vgattr/vinvl_x152c4.yaml')

#This is a flat list of alternating config keys and values for the additional arguments
#MODEL.WEIGHT specifies the full path of the VinVL weight pth file
#DATA_DIR (not set in this list) specifies the directory that contains the VinVL input tsv configuration yaml file
argument_list = [
                 'MODEL.WEIGHT', 'vinvl_vg_x152c4.pth',
                 'MODEL.ROI_HEADS.NMS_FILTER', 1,
                 'MODEL.ROI_HEADS.SCORE_THRESH', 0.2, 
                 'TEST.IGNORE_BOX_REGRESSION', False,
                 'MODEL.ATTRIBUTE_ON', True
                 ]
cfg.merge_from_list(argument_list)
cfg.freeze()

#     assert op.isfile(args.img_file), \
#         "Image: {} does not exist".format(args.img_file)

output_dir = cfg.OUTPUT_DIR
#     mkdir(output_dir)

model = AttrRCNN(cfg)
model.to(cfg.MODEL.DEVICE)
model.eval()

checkpointer = DetectronCheckpointer(cfg, model, save_dir=output_dir)
checkpointer.load(cfg.MODEL.WEIGHT)

transforms = build_transforms(cfg, is_train=False)

input_img_directory = 'insert your images directory path here'
#must be a .pth file
output_prediction_file = 'insert your output pth file path here'
dets = {}
for img_name in os.listdir(input_img_directory):
  #Convert png format to jpg format
  if os.path.splitext(img_name)[1].lower() == '.png':
    im = Image.open(os.path.join(input_img_directory, img_name))
    rgb_im = im.convert('RGB')
    new_name = img_name.split('.')[0]+'.jpg'
    rgb_im.save(os.path.join(input_img_directory, new_name))
    print(new_name)

  img_file_path = os.path.join(input_img_directory,img_name.split('.')[0]+'.jpg')
  print(img_file_path)

  cv2_img = cv2.imread(img_file_path)

  det = detect_objects_on_single_image(model, transforms, cv2_img)
  
#   prediction contains ['labels',
#  'scores',
#  'box_features',
#  'scores_all',
#  'boxes_all',
#  'attr_labels',
#  'attr_scores']
# box_features are used for oscar

  # det is the prediction returned for this image; det1 was never defined
  det_dict = {key: det.get_field(key) for key in det.fields()}

  dets[img_name.split('.')[0]] = det_dict


torch.save(dets, output_prediction_file)

Originally posted by @SPQRXVIII001 in #7 (comment)

how to export to pb?

Guide to run train_net.py

Hi, I want to train the detector with train_net.py. Could you please give me some guidance on how to organize the data and how to pass the parameters? Thanks. @pzzhang

Image file in Vinvl example script

I could not find the image used in the VinVL example script. Where can I get the image file referenced by "--img_file ../maskrcnn-benchmark-1/datasets1/imgs/woman_fish.jpg"?

Cannot download openimages_v5c

The download link is not available, please check it.

azcopy copy 'https://penzhanwu2.blob.core.windows.net/sgg/sgg_benchmark/datasets/openimages_v5c' --recursive

KeyError: 'gt_labels' during training

When attempting to perform training using tools/train_sg_net.py and a config file like sgg_configs/vg_vrd/rel_danfeiX_FPN50_nm.yaml, I receive the following error:

Traceback (most recent call last):
  File "tools/train_sg_net.py", line 225, in <module>
    main()
  File "tools/train_sg_net.py", line 218, in main
    model = train(cfg, args.local_rank, args.distributed)
  File "tools/train_sg_net.py", line 110, in train
    meters
  File "/home/scene_graph_benchmark/maskrcnn_benchmark/engine/trainer.py", line 94, in do_train
    loss_dict = model(images, targets)
  File "/home/miniconda3/envs/sg_benchmark/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/miniconda3/envs/sg_benchmark/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 619, in forward
    output = self.module(*inputs[0], **kwargs[0])
  File "/home/miniconda3/envs/sg_benchmark/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/scene_graph_benchmark/scene_graph_benchmark/scene_parser.py", line 319, in forward
    x_pairs, prediction_pairs, relation_losses = self.relation_head(features, predictions, targets)
  File "/home/miniconda3/envs/sg_benchmark/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/scene_graph_benchmark/scene_graph_benchmark/relation_head/relation_head.py", line 211, in forward
    = self.rel_predictor(features, proposals, proposal_pairs)
  File "/home/miniconda3/envs/sg_benchmark/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/scene_graph_benchmark/scene_graph_benchmark/relation_head/neural_motif/neuralmotif.py", line 135, in forward
    = _get_tensor_from_boxlist(proposals, 'gt_labels')
  File "/home/scene_graph_benchmark/scene_graph_benchmark/relation_head/sparse_targets.py", line 61, in _get_tensor_from_boxlist
    assert proposals[0].extra_fields[field] is not None
KeyError: 'gt_labels'

Has anyone else encountered the same problem?
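
One detail of the last frame may be worth noting: proposals[0].extra_fields[field] indexes a dictionary, so a missing field raises KeyError before the "is not None" assertion is ever evaluated. In other words, the proposals reaching the predictor appear to carry no gt_labels field at all. A minimal sketch of the difference, using a plain dict as a stand-in for extra_fields (the helper name get_field_or_fail is hypothetical and not part of the repository):

```python
def get_field_or_fail(extra_fields, field):
    """Return extra_fields[field], failing with a descriptive message if it is absent."""
    value = extra_fields.get(field)  # .get returns None instead of raising KeyError
    if value is None:
        raise ValueError(
            "field %r is missing; available fields: %s"
            % (field, sorted(extra_fields))
        )
    return value


# Stand-in for a proposal's extra_fields with no ground-truth labels attached.
extra_fields = {"scores": [0.9, 0.8], "labels": [3, 7]}

try:
    get_field_or_fail(extra_fields, "gt_labels")
except ValueError as err:
    print(err)
```

Listing the available fields in the message makes it easier to see at which stage the ground-truth labels were dropped.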

Issue when attempting to generate image features

Hi,

Thank you for providing the code for this project. I was trying to generate image features, and I attempted to follow both examples from #25 and #7 as follows:

With a directory of 18 images (stored in datasets/test_imgs), I used tools/mini_tsv/demo_tsv.py to generate tsv files (label, hw, linelist) for the corresponding dataset, and stored them in datasets/test/. Since I didn't have any particular labelmap in mind, and I had downloaded the checkpoint for the RelDN model, and its corresponding config file, I used the label map VG-SGG-dicts-vgoi6-clipped.json (I copied this file into the same directory), so that my yaml file is as follows:

datasets/test/test_imgs.yaml
img: test_imgs.tsv
label: test_imgs_label.tsv
hw: test_imgs_hw.tsv
label_map: VG-SGG-dicts-vgoi6-clipped.json
linelist: test_imgs_linelist.tsv

Then, I made a new yaml file datasets/test/testing.yaml which was the same yaml file as rel_danfeiX_FPN50_reldn.yaml but with DATASETS.TRAIN = ("test/test_imgs.yaml",) and DATASETS.TEST = ("test/test_imgs.yaml",) and ran the command

python -m torch.distributed.launch --nproc_per_node=2 tools/test_sg_net.py --config-file datasets/test/testing.yaml

This ran into the error:

2021-07-16 04:08:02,996 maskrcnn_benchmark.inference INFO: Start evaluation on test/test_imgs.yaml dataset(18 images).
INFO:maskrcnn_benchmark.inference:Start evaluation on test/test_imgs.yaml dataset(18 images).
0%| | 0/3 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "tools/test_sg_net.py", line 198, in <module>
    main()
  File "tools/test_sg_net.py", line 194, in main
    run_test(cfg, model, args.distributed, model_name)
  File "tools/test_sg_net.py", line 73, in run_test
    save_predictions=cfg.TEST.SAVE_PREDICTIONS,
  File "/home/f-run/PyCharmProjects/scene_graph_benchmark/maskrcnn_benchmark/engine/inference.py", line 265, in inference
    predictions = compute_on_dataset(model, data_loader, device, bbox_aug, inference_timer)
  File "/home/f-run/PyCharmProjects/scene_graph_benchmark/maskrcnn_benchmark/engine/inference.py", line 32, in compute_on_dataset
    for _, batch in enumerate(tqdm(data_loader)):
  File "/home/f-run/.conda/envs/sg_benchmark/lib/python3.7/site-packages/tqdm/std.py", line 1185, in __iter__
    for obj in iterable:
  File "/home/f-run/.conda/envs/sg_benchmark/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
    data = self._next_data()
  File "/home/f-run/.conda/envs/sg_benchmark/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1085, in _next_data
    return self._process_data(data)
  File "/home/f-run/.conda/envs/sg_benchmark/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1111, in _process_data
    data.reraise()
  File "/home/f-run/.conda/envs/sg_benchmark/lib/python3.7/site-packages/torch/_utils.py", line 428, in reraise
    raise self.exc_type(msg)
TypeError: Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/f-run/.conda/envs/sg_benchmark/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 198, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/f-run/.conda/envs/sg_benchmark/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/f-run/.conda/envs/sg_benchmark/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/f-run/PyCharmProjects/scene_graph_benchmark/maskrcnn_benchmark/data/datasets/relation_tsv.py", line 146, in __getitem__
    target = self.get_target_from_annotations(annotations, img_size)
  File "/home/f-run/PyCharmProjects/scene_graph_benchmark/maskrcnn_benchmark/data/datasets/relation_tsv.py", line 78, in get_target_from_annotations
    target = self.label_loader(annotations['objects'], img_size, remove_empty=False)
TypeError: list indices must be integers or slices, not str

This is very strange to me since when I run the command without using my own datasets I don't run into this issue at all. Is there anything that I did incorrectly that could cause this error-- and if so, how can I fix it?

This is my first issue raised so forgive me if this is too much/little info or if this is more suited to Stack Overflow instead.
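
The final frame of the traceback indexes annotations['objects'], and the TypeError says annotations is a list. That suggests the label rows written for this dataset hold a bare list of boxes, while the relation dataset expects a dictionary with an objects key. A minimal sketch of wrapping such a row, assuming each label is a JSON list of entries with class and rect keys, and that an empty relations list is acceptable (both are assumptions not confirmed by the repository; wrap_label_row is a hypothetical helper):

```python
import json


def wrap_label_row(label_json):
    """Wrap a detection-style label list into a dict keyed by 'objects'."""
    annotations = json.loads(label_json)
    if isinstance(annotations, dict):
        # Already in the dictionary form; leave it untouched.
        return label_json
    # Assumption: relation annotations live under 'relations'; none are known here.
    return json.dumps({"objects": annotations, "relations": []})


# Illustrative label row for one image.
row = json.dumps([{"class": "dog", "rect": [10, 20, 110, 220]}])
wrapped = json.loads(wrap_label_row(row))
print(wrapped["objects"][0]["class"])
```

Whether the evaluation code then accepts images without any relation annotations is a separate question that this sketch does not answer.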

[Errno 2] No such file or directory: './datasets/openimages_v5c/vrd/ji_vrd_labelmap.json'

Not

Correct way to extract image features with VinVL

Hi!

How can I extract image features from my dataset with VinVL if it's not in tsv format, but in the form of a folder with image files? What's the correct way to do this?

Why does the program freeze when running with multiple GPUs?

python -m torch.distributed.launch --nproc_per_node=$NGPUS tools/train_sg_net.py --config-file "sgg_configs/vg_vrd/rel_danfeiX_FPN50_reldn.yaml"

Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.

How to extract relation features

I use the model provided for VinVL feature extraction. But with the same settings as advised, I can only obtain bounding box features. How can I extract relation features? Thanks.

cannot download pretrained model

I cannot seem to open the model links from https://github.com/microsoft/scene_graph_benchmark/blob/main/SCENE_GRAPH_MODEL_ZOO.md
Any help will be very much appreciated.

How to generate the predicted object attributes and relations labels together?

@hanxiaotian Hi, thanks a lot for releasing the great SGG benchmark! I want to extract the predicted scene graph (with predicted boxes, attributes and relations) from scratch for new images using the pre-trained models in the model zoo. However, when I try to use the demo script, I notice the model cannot predict attributes and relations together (the VinVL pre-trained model only predicts the attributes and the RelDN pre-trained model only generates relations, where I need to do further ad-hoc alignment with the two outputs to predict a full scene graph). Is there any way to achieve this using a single provided pre-trained model? (Sorry I'm not very familiar with this task.) Thank you very much!

KeyError: 'box_features' --- extracting features from own image tsv files

Hi - I am getting a KeyError: 'box_features' message when trying to extract features from my own image. I've played around with the code, but can't seem to figure it out. If I set TEST.OUTPUT_FEATURE to False, the code runs fine but only outputs the detected objects. Can someone please help out?

For reference, the demo extraction on single image works fine - both object detection and box features.

Traceback (most recent call last):
  File "tools/test_sg_net.py", line 197, in <module>
    main()
  File "tools/test_sg_net.py", line 193, in main
    run_test(cfg, model, args.distributed, model_name)
  File "tools/test_sg_net.py", line 72, in run_test
    save_predictions=cfg.TEST.SAVE_PREDICTIONS,
  File "/content/drive/MyDrive/scene_graph_benchmark/maskrcnn_benchmark/engine/inference.py", line 297, in inference
    relation_on=cfg.MODEL.RELATION_ON,
  File "/content/drive/MyDrive/scene_graph_benchmark/maskrcnn_benchmark/engine/inference.py", line 211, in convert_predictions_to_tsv
    tsv_writer(gen_rows(), os.path.join(output_folder, output_tsv_name))
  File "/content/drive/MyDrive/scene_graph_benchmark/maskrcnn_benchmark/structures/tsv_file_ops.py", line 42, in tsv_writer
    for value in values:
  File "/content/drive/MyDrive/scene_graph_benchmark/maskrcnn_benchmark/engine/inference.py", line 139, in gen_rows
    features = prediction.get_field('box_features').numpy()
  File "/content/drive/MyDrive/scene_graph_benchmark/maskrcnn_benchmark/structures/bounding_box.py", line 43, in get_field
    return self.extra_fields[field]
KeyError: 'box_features'

Can you add some GCN methods from recent CVPR papers?

I find that the five methods in the comparison all predate 2020, but the mainstream methods in 2020 and 2021 are graph neural networks. I hope your team can add some of the latest methods to this project. Thanks.

Missing links to pre-trained model checkpoints

The download links to the pre-trained model checkpoints for Openimages V5 show File Not Found errors. Could you please take a look and provide new links?

The labelmap only provides 50 objects. Can it be larger, or do we have to train the object detector from scratch?

I have object labels for all of my boxes, but there are more of them than the 50 object labels in the labelmap file that the project provides.
So how should I get the relation predictions?

When I run the precls code, I always get KeyError: 'broccoli', 'carrot', and so on.
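
A KeyError such as 'broccoli' means the class name was looked up in a dictionary that does not contain it. Before running, it can help to list which of your classes fall outside the labelmap. A minimal sketch, assuming the labelmap is a dictionary from class name to index (the names find_unknown_labels and label_to_idx, and the sample entries, are illustrative only and not taken from the repository):

```python
def find_unknown_labels(box_labels, label_to_idx):
    """Return the sorted class names that are absent from the labelmap."""
    return sorted(set(box_labels) - set(label_to_idx))


# Illustrative labelmap and annotations; real ones would be loaded from the labelmap file.
label_to_idx = {"person": 1, "table": 2, "plate": 3}
box_labels = ["person", "broccoli", "plate", "carrot", "broccoli"]

print(find_unknown_labels(box_labels, label_to_idx))  # ['broccoli', 'carrot']
```

Boxes whose class is reported here would need to be dropped, remapped to a known class, or covered by a model trained on a larger labelmap.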

Using tsv_demo.py to generate tsv files from my own images

Hi, when I use tsv_demo.py to generate tsv files from my own images, it is easy to generate img.tsv and hw.tsv. But for label.tsv, how can I generate the class and rect entries if I do not have any annotations?
Thanks a lot!

Can VinVL be used for relation prediction?

I found that VinVL's object and attribute label sets are much larger. How can VinVL be used for predicate classification? At present the project only provides 150 object classes, but Visual Genome + Faster R-CNN can detect 1370 object classes. That is a big difference.

config_file for train_sg_net.py

Hi,

I am trying to train a model with python tools/train_sg_net.py --config-file "/path/to/config/file.yaml". But I am confused about which config file I need to use. Can you share more details about the training config files?

Thank you very much!

Cannot run VinVL feature extraction command

I ran the following command (from the README):

python tools/test_sg_net.py --config-file sgg_configs/vgattr/vinvl_x152c4.yaml TEST.IMS_PER_BATCH 2 MODEL.WEIGHT models/vinvl/vinvl_vg_x152c4.pth MODEL.ROI_HEADS.NMS_FILTER 1 MODEL.ROI_HEADS.SCORE_THRESH 0.2 DATA_DIR "../maskrcnn-benchmark-1/datasets1" TEST.IGNORE_BOX_REGRESSION True MODEL.ATTRIBUTE_ON True

I think the DATA_DIR is misconfigured because I get the following error (below). Where is ../maskrcnn-benchmark-1/datasets1 from? Or the file visualgenome/test_vgoi6_clipped.yaml, which I think it's looking for?

This is the AssertionError:

Traceback (most recent call last):
  File "tools/test_sg_net.py", line 197, in <module>
    main()
  File "tools/test_sg_net.py", line 193, in main
    run_test(cfg, model, args.distributed, model_name)
  File "tools/test_sg_net.py", line 55, in run_test
    data_loaders_val = make_data_loader(cfg, is_train=False, is_distributed=distributed)
  File "/scene_graph_benchmark/maskrcnn_benchmark/data/build.py", line 170, in make_data_loader
    datasets = build_dataset(cfg, transforms, DatasetCatalog, is_train or is_for_period)
  File "/scene_graph_benchmark/maskrcnn_benchmark/data/build.py", line 45, in build_dataset
    cfg, dataset_name, factory_name, is_train
  File "/scene_graph_benchmark/maskrcnn_benchmark/data/datasets/utils/config_args.py", line 7, in config_tsv_dataset_args
    assert op.isfile(full_yaml_file)
AssertionError

Thanks in advance!
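
The assertion that fails is op.isfile(full_yaml_file), so the dataset yaml cannot be found under the configured DATA_DIR. A quick check before launching can make the missing path explicit. This sketch assumes the full path is DATA_DIR joined with the dataset yaml name, which is a guess based on the traceback and the question above; check_dataset_yaml is a hypothetical helper, not part of the repository:

```python
import os.path as op


def check_dataset_yaml(data_dir, dataset_yaml):
    """Return the absolute yaml path, raising a descriptive error if it does not exist."""
    full_yaml_file = op.abspath(op.join(data_dir, dataset_yaml))
    if not op.isfile(full_yaml_file):
        raise FileNotFoundError("dataset yaml not found: " + full_yaml_file)
    return full_yaml_file


try:
    # Paths taken from the command and question above.
    check_dataset_yaml("../maskrcnn-benchmark-1/datasets1",
                       "visualgenome/test_vgoi6_clipped.yaml")
except FileNotFoundError as err:
    print(err)
```

Printing the absolute path shows exactly where the file is expected, which is more informative than a bare AssertionError.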

Download VinVL pre-trained model

I failed to use the azcopy command to download the VinVL pre-trained model. Could you show me how to download the pre-trained models? Thank you!

./azcopy cp https://penzhanwu2.blob.core.windows.net/results/vinvl/od_models/vinvl_vg_x152c4.pth .

INFO: Scanning...

failed to perform copy command due to error: Login Credentials missing. No SAS token or OAuth token is present and the resource is not public.

About attribute and object labels for a designated bounding box

Dear scholar,
I want to ask whether your code can produce the attribute and object labels for a designated bounding box.
In your tools/demo_image.py, the model with the pretrained weights file produces 36 bounding boxes with attribute and object labels.

Could your code pass a boxlist to the model, so that the model produces the attribute and object labels for the designated bounding boxes?

How to set DATASETS FACTORY_TEST (when extracting features from my own directory of images)

Hi, I want to extract features from my own directory of images, so I followed issue #7 to set up some tsv files using tsv_demo.py, and then wrote a test.yaml. In vinvl_x152c4.yaml I pointed DATASETS:TEST to my test.yaml, but I cannot run test_sg_net.py correctly. Is there anything I should have done but missed?
Thank you very much!