microsoft / scene_graph_benchmark
Image scene graph generation benchmark
License: MIT License
Hi,
I am running the following code:
python tools/demo/demo_image.py --config_file sgg_configs/vgattr/vinvl_x152c4.yaml --img_file women_fish.jpg --save_file output/woman_fish_x152c4.obj.jpg MODEL.WEIGHT models/vinvl/vinvl_vg_x152c4.pth MODEL.ROI_HEADS.NMS_FILTER 1 MODEL.ROI_HEADS.SCORE_THRESH 0.2 DATA_DIR "." TEST.IGNORE_BOX_REGRESSION False
Here is the error:
rel_subj_centers = [r['subj_center'] for r in rel_dets]
UnboundLocalError: local variable 'rel_dets' referenced before assignment
I believe the bug is in line
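For readers hitting the same message: in Python this error generally appears when a name is assigned only inside a conditional branch and then read on every path. The following is a minimal sketch of that pattern and a defensive fix. The function names and the `relations` argument are hypothetical and are not taken from demo_image.py.

```python
def collect_centers_buggy(relations=None):
    # rel_dets is only bound when relations are provided.
    if relations is not None:
        rel_dets = relations
    # Raises UnboundLocalError when the branch above was skipped.
    return [r['subj_center'] for r in rel_dets]


def collect_centers_safe(relations=None):
    # Bind rel_dets on every code path before it is read.
    rel_dets = relations if relations is not None else []
    return [r['subj_center'] for r in rel_dets]
```

Whether an empty list is the right default depends on what the caller expects when no relations are detected; that choice is an assumption of this sketch.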
Hi there,
I'm trying to use the model sgg_configs/vg_rvd/rel_danfeiX_FPN50_reldn.yaml for generating scene graphs for a custom dataset. The bounding box proposals work fine; however, there seems to be a bug in the way the proposal_pairs are computed. In particular, I get the following exception:
Traceback (most recent call last):
File "/Users/asuglia/opt/anaconda3/envs/devel/lib/python3.8/contextlib.py", line 131, in __exit__
self.gen.throw(type, value, traceback)
File "/Users/asuglia/opt/anaconda3/envs/devel/lib/python3.8/site-packages/pytorch_lightning/profiler/profilers.py", line 103, in profile
yield action_name
File "/Users/asuglia/opt/anaconda3/envs/devel/lib/python3.8/site-packages/pytorch_lightning/trainer/trainer.py", line 1088, in run_predict
self.predict_loop.predict_step(batch, batch_idx, dataloader_idx)
File "/Users/asuglia/opt/anaconda3/envs/devel/lib/python3.8/site-packages/pytorch_lightning/trainer/predict_loop.py", line 111, in predict_step
predictions = self.trainer.accelerator.predict_step(args)
File "/Users/asuglia/opt/anaconda3/envs/devel/lib/python3.8/site-packages/pytorch_lightning/accelerators/accelerator.py", line 265, in predict_step
return self.training_type_plugin.predict_step(*args)
File "/Users/asuglia/opt/anaconda3/envs/devel/lib/python3.8/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 167, in predict_step
return self.lightning_module.predict_step(*args, **kwargs)
File "/Users/asuglia/workspace/scene_graph_benchmark/tools/video/extract_features.py", line 311, in predict_step
predictions = self.model(images)
File "/Users/asuglia/opt/anaconda3/envs/devel/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/Users/asuglia/workspace/scene_graph_benchmark/scene_graph_benchmark/scene_parser.py", line 319, in forward
x_pairs, prediction_pairs, relation_losses = self.relation_head(features, predictions, targets)
File "/Users/asuglia/opt/anaconda3/envs/devel/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/Users/asuglia/workspace/scene_graph_benchmark/scene_graph_benchmark/relation_head/relation_head.py", line 211, in forward
= self.rel_predictor(features, proposals, proposal_pairs)
File "/Users/asuglia/opt/anaconda3/envs/devel/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/Users/asuglia/workspace/scene_graph_benchmark/scene_graph_benchmark/relation_head/reldn/reldn.py", line 103, in forward
sub_vert_per_image = proposal_per_image.get_field("subj_box_features")[rel_ind_i[:, 0]]
IndexError: index 49 is out of bounds for dimension 0 with size 38
This seems to happen because the bounding box indexes in rel_ind_i go out of bounds: there are fewer bounding boxes in proposal_per_image.get_field("subj_box_features") than the indexes assume. In this specific case, proposal_per_image.get_field("subj_box_features") has shape torch.Size([38, 1024]), while rel_ind_i[:, 0] contains the following indexes:
tensor([49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 49, 50, 50, 50, 50, 50, 50,
50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 50, 51, 51, 51,
51, 51, 51, 51, 51, 51, 51, 51, 51, 51, 51, 51, 51, 51, 52, 52, 52, 52,
52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52,
52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 52, 53, 53, 53,
53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53, 53,
53, 53, 53, 53, 53, 53, 53, 53, 54, 54, 54, 54, 54, 54, 54, 54, 54, 54,
54, 54, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55, 55,
55, 55, 56, 56, 56, 56, 56, 56, 56, 56, 56, 56, 56, 56, 56, 56, 56, 56,
57, 57, 57, 57, 57, 57, 57, 57, 57, 57, 57, 58, 58, 58, 58, 58, 58, 58,
58, 58, 58, 58, 58, 58, 58, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59, 59,
59, 59, 59, 60, 60, 60, 60, 60, 60, 60, 60, 60, 60, 60, 60, 60, 61, 61,
61, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61, 61,
61, 61, 61, 62, 62, 62, 62, 62, 62, 62, 62, 62, 62, 62, 62, 62, 62, 63,
63, 63, 63, 63, 63, 63, 63, 63, 63, 63, 63, 63, 63, 64, 64, 64, 64, 64,
64, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65, 65,
65, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 66, 67, 67, 67, 67,
67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67,
67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 67, 68, 68, 68, 68,
68, 68, 68, 68, 68, 68, 68, 68, 68, 68, 68, 68, 68, 68, 68, 68, 68, 68,
68, 68, 68, 68, 69, 69, 69, 69, 69, 69, 69, 69, 69, 69, 69, 69, 69, 69,
69, 69, 70, 70, 70, 70, 70, 70, 70, 70, 70, 70, 70, 70, 70, 70, 71, 71,
71, 71, 71, 71, 71, 71, 71, 71, 71, 71, 71, 71, 71, 71, 72, 72, 72, 72,
72, 72, 72, 72, 72, 72, 72, 72, 72, 73, 73, 73, 73, 73, 73, 73, 73, 73,
73, 73, 73, 73, 73, 73, 73, 73, 73, 73, 73, 73, 73, 73, 74, 74, 74, 74,
74, 74, 74, 74, 74, 74, 74, 74, 74, 74, 74, 75, 75, 75, 75, 75, 75, 75,
75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75,
75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 76, 76, 76, 76, 76, 76, 76,
76, 76, 76, 76, 76, 76, 76, 76, 76, 76, 76, 76, 76, 76, 76, 76, 76, 76,
76, 77, 77, 77, 77, 77, 77, 77, 77, 77, 77, 78, 78, 78, 78, 78, 78, 78,
78, 79, 79, 79, 79, 79, 79, 79, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80,
80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80,
80, 80, 80, 80, 80, 80, 80, 80, 80, 81, 81, 81, 81, 81, 81, 81, 81, 81,
81, 81, 81, 81, 81, 81, 81, 81, 81, 81, 81, 82, 82, 82, 82, 82, 82, 82,
82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82, 82,
82, 83, 83, 83, 83, 83, 83, 83, 83, 83, 83, 83, 83, 83, 83, 83, 83, 83,
83, 83, 83, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 84, 85, 85, 85, 85,
85, 85, 85, 85, 85, 85, 85, 85, 86, 86, 86, 86, 86, 86, 86, 86, 86, 86,
86, 86, 86, 86, 86, 86, 86, 86])
I'm running this code with batch size 2. I thought the error might be in the way the proposal_pairs are generated. However, the exception is raised when either of these lines is executed to generate the proposal_pairs:
Do you think that an offset is required here: https://github.com/microsoft/scene_graph_benchmark/blob/main/scene_graph_benchmark/relation_head/reldn/reldn.py#L98
@hanxiaotian Could you please advise?
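One observation on the numbers above: the reported indexes run from 49 to 86, which is exactly 38 distinct values, the same as the 38 rows of the feature tensor. That is consistent with pair indexes being counted over the whole batch while the features are indexed per image, although this is not confirmed against the repository code. The sketch below shows the offset idea in plain Python; the function name and the 49/38 box counts are assumptions made for illustration.

```python
def to_local_pair_indices(global_pairs, boxes_per_image):
    """Split batch-wide (subject, object) index pairs into per-image pairs.

    global_pairs: list of (subj, obj) indexes counted over the whole batch.
    boxes_per_image: number of boxes in each image, in batch order.
    """
    local_pairs = [[] for _ in boxes_per_image]
    offset = 0
    for image_idx, num_boxes in enumerate(boxes_per_image):
        upper = offset + num_boxes
        for subj, obj in global_pairs:
            # Keep only pairs whose two boxes belong to this image.
            if offset <= subj < upper and offset <= obj < upper:
                local_pairs[image_idx].append((subj - offset, obj - offset))
        offset = upper
    return local_pairs


# Hypothetical batch: 49 boxes in the first image, 38 in the second.
pairs = to_local_pair_indices([(0, 1), (49, 50), (86, 49)], [49, 38])
print(pairs)  # [[(0, 1)], [(0, 1), (37, 0)]]
```

After subtracting the offset, every index for the second image falls in the range 0 to 37 and can safely index a 38-row tensor.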
Hi!
These links are broken (resource not found error):
https://penzhanwu2.blob.core.windows.net/sgg/sgg_benchmark/vinvl_model_zoo/vinvl_vg_x152c4.pth
https://penzhanwu2.blob.core.windows.net/sgg/sgg_benchmark/vinvl_model_zoo/VG-SGG-dicts-vgoi6-clipped.json
Hi, thanks for the great work and open-sourcing this project.
I'm excited to try VinVL, since the paper promises faster feature extraction than bottom-up-attention.
I created my own TSV file using tsv_demo.py and ran tools/test_sg_net.py to do feature extraction.
Unfortunately, the feature extraction runs quite slowly.
Right now I'm using PyTorch 1.7 on Debian 10 with one Nvidia T4.
The feature extraction took 9 seconds per 4 images.
While using OSCAR on the same dataset, I used bottom-up-attention from https://github.com/airsplay/py-bottom-up-attention and https://github.com/peteanderson80/bottom-up-attention. These repos give much faster feature extraction on a similar machine (the first repo needs 2.7 seconds per 8 images, while the original Caffe bottom-up-attention took less than 1 second per image). This contradicts what is written in your paper.
Here is the key config I'm using when running tools/test_sg_net.py:
TEST:
  IMS_PER_BATCH: 4
  IGNORE_BOX_REGRESSION: True
  SKIP_PERFORMANCE_EVAL: True
  SAVE_PREDICTIONS: True
  SAVE_RESULTS_TO_TSV: True
  TSV_SAVE_SUBSET: ['rect', 'class', 'conf', 'feature']
  GATHER_ON_CPU: True
  OUTPUT_FEATURE: True
I checked nvidia-smi and it shows that my GPU is working.
Is anyone else having this issue too?
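To put the timings quoted in this report on a common scale, here is a small worked calculation of seconds per image. It uses only the two figures given above (9 seconds per 4 images and 2.7 seconds per 8 images); the helper name is hypothetical, and the comparison ignores differences in hardware, batch size and model size.

```python
def seconds_per_image(total_seconds, num_images):
    # Average wall-clock time spent on one image.
    return total_seconds / num_images


vinvl = seconds_per_image(9.0, 4)    # VinVL run reported above
py_bua = seconds_per_image(2.7, 8)   # py-bottom-up-attention run reported above

print(f"VinVL: {vinvl:.2f} s/image")
print(f"py-bottom-up-attention: {py_bua:.2f} s/image")
print(f"ratio: {vinvl / py_bua:.1f}x")
```

On these figures the VinVL run is between six and seven times slower per image than the py-bottom-up-attention run.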
Hi,
I ran demo_image.py, but I found something inconsistent.
In the code, the image is converted to RGB format before it is fed into the detection model.
However, in the config file the PIXEL_MEAN is [103.530, 116.280, 123.675], which is in BGR order. I also found that in tsv_demo.py the image is read by cv2, which also gives BGR format.
So I am confused: which one is right?
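As background for this question, the sketch below shows what it means for a mean to be "in BGR order": the channel order of the pixel has to match the order of the mean before subtraction. Upstream maskrcnn_benchmark has an INPUT.TO_BGR255 option intended for this kind of conversion, but whether it is active for this config is not verified here. The helper names are hypothetical and the example operates on a single pixel for clarity.

```python
# Per-channel means from the config, in the order given there (B, G, R).
PIXEL_MEAN_BGR = [103.530, 116.280, 123.675]


def rgb_to_bgr(pixel):
    # Reverse the channel order of one (R, G, B) pixel.
    r, g, b = pixel
    return (b, g, r)


def subtract_mean(pixel_bgr, mean_bgr=PIXEL_MEAN_BGR):
    # Channel-wise mean subtraction; both inputs must use the same order.
    return tuple(value - mean for value, mean in zip(pixel_bgr, mean_bgr))


rgb_pixel = (255, 128, 0)
bgr_pixel = rgb_to_bgr(rgb_pixel)
normalized = subtract_mean(bgr_pixel)
print(bgr_pixel, normalized)
```

If the reordering step were skipped, the red mean would be subtracted from the blue channel and vice versa, which is the mismatch the question is worried about.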
I attempted to train the RelDN model on Visual Genome by executing this command:
python tools/train_sg_net.py --config-file sgg_configs/vg_vrd/rel_danfeiX_FPN50_reldn.yaml
and got the following error:
FileNotFoundError: [Errno 2] No such file or directory: 'models/vgvrd/vgnm_usefpTrue_objctx0_edgectx2/model_final.pth'
I searched the repository but found no information about the file vgnm_usefpTrue_objctx0_edgectx2/model_final.pth. Could you please provide more details about where we can download this model file? Thanks in advance.
This might be a wrong path.
Is it possible to run inference with the model on the COCO dataset?
The instructions seem complicated. I followed the guide, but only bounding boxes are created, without features.
Could you give us a reproducible example that uses the demo to extract features?
Hi,
Here, we can easily use pre-extracted image features.
I thought these features came from the VinVL OD model trained on the four merged datasets: COCO with stuff, Visual Genome, Objects365 and Open Images.
However, I found that the features and the corresponding labels (object tags) come only from the Visual Genome dataset, which performs worse than the four merged datasets (according to the VinVL paper).
So I would like to clarify whether the provided image features come from the pretrained X152-C4 object-attribute detection model (based only on the Visual Genome dataset) or from the model pretrained on the four merged datasets.
Thanks
Dear author:
Thanks for your great work. When I run your "setup.py" on the Windows 10 platform, I get the link.exe error "fatal error LNK1181: cannot open ROIAlign.obj file". I tried both vs2015 and vs2019, with vs++14.0 and vs++16.0, and got the same problem with both. I checked, and the ROIAlign.obj file is missing from the file path. Can you tell me how to fix this problem?
It does work on the Ubuntu platform, though.
Thanks very much, looking forward to your reply.
Hi, I would like some suggestions for the case where there are more object labels. The project provides a labelmap file, but it only contains 50 objects, which is too few. When I run on COCO 2014, the object label 1370 is far larger than that. Should we just add more ids and names to the labelmap file, or do we have to train the project's object detector from scratch? I have the labels for the 36 boxes of COCO 2014, but I don't know how to obtain a labelmap file with more object labels.
If you know a way, please help me. Thanks.
KeyError: 'broccoli'
Killing subprocess 4356
Could you please share your VinVL R50-C4 detection model, which is referred to in Oscar plus? I want to run some further experiments with this model.
Thank you!
In every config file, DATASETS.TRAIN points to a yaml path, but I cannot find that file.
What is the reason for the error "TypeError: run_test() got an unexpected keyword argument 'model_name'" during single-GPU training?
Hello! When I extract image features with VinVL using the command:
python tools/test_sg_net.py --config-file sgg_configs/vgattr/vinvl_x152c4.yaml TEST.IMS_PER_BATCH 2 MODEL.WEIGHT models/vinvl/vinvl_vg_x152c4.pth MODEL.ROI_HEADS.NMS_FILTER 1 MODEL.ROI_HEADS.SCORE_THRESH 0.2 DATA_DIR "../maskrcnn-benchmark-1/datasets1" TEST.IGNORE_BOX_REGRESSION True MODEL.ATTRIBUTE_ON True TEST.OUTPUT_FEATURE True
I get the traceback:
Traceback (most recent call last):
File "tools/test_sg_net.py", line 129, in <module>
main()
File "tools/test_sg_net.py", line 106, in main
data_loaders_val = make_data_loader(cfg, is_train=False, is_distributed=distributed)
File "/content/drive/MyDrive/VinVL/scene_graph_benchmark/scene_graph_benchmark/maskrcnn_benchmark/data/build.py", line 170, in make_data_loader
datasets = build_dataset(cfg, transforms, DatasetCatalog, is_train or is_for_period)
File "/content/drive/MyDrive/VinVL/scene_graph_benchmark/scene_graph_benchmark/maskrcnn_benchmark/data/build.py", line 45, in build_dataset
cfg, dataset_name, factory_name, is_train
File "/content/drive/MyDrive/VinVL/scene_graph_benchmark/scene_graph_benchmark/maskrcnn_benchmark/data/datasets/utils/config_args.py", line 7, in config_tsv_dataset_args
assert op.isfile(full_yaml_file)
AssertionError
I guess the reason is that the file flickr30k/tsv/flickr30k.yaml does not exist, so where can I find this yaml file?
I also tried deleting the line
TEST: ("flickr30k/tsv/flickr30k.yaml",)
from the file vinvl_x152c4.yaml, but after running the code there is no output.
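A quick way to check this before launching the full script is to test the path yourself. The sketch below assumes the loader builds the yaml path by joining DATA_DIR with the DATASETS.TEST entry; the traceback only shows the `op.isfile(full_yaml_file)` assertion, so the join is an assumption, and the helper name is hypothetical.

```python
import os.path as op


def resolve_dataset_yaml(data_dir, dataset_entry):
    """Return the joined yaml path and whether it exists on disk."""
    full_yaml_file = op.join(data_dir, dataset_entry)
    return full_yaml_file, op.isfile(full_yaml_file)


# Values taken from the command and config quoted above.
path, exists = resolve_dataset_yaml(
    "../maskrcnn-benchmark-1/datasets1", "flickr30k/tsv/flickr30k.yaml"
)
print(path, "found" if exists else "missing")
```

If the path is reported as missing, the AssertionError above is expected: either the yaml file has to be created under that directory, or DATA_DIR and DATASETS.TEST have to point at a dataset yaml that exists.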
I'm slightly confused by the instructions on how to extract features with the test_sg_net.py script. More specifically:
sgg_configs/vgattr/vinvl_x152c4.yaml config in order to point to my data directory (I'm not sure what the variables DATASETS.TEST and DATA_DIR need to be; I assumed the latter was the directory with all the images I need features for, but that does not seem to be the case)
When I try to run
python tools/test_sg_net.py --config-file sgg_configs/vgattr/vinvl_x152c4.yaml TEST.IMS_PER_BATCH 1 MODEL.WEIGHT models/vinvl/vinvl_vg_x152c4.pth MODEL.ROI_HEADS.NMS_FILTER 1 \
MODEL.ROI_HEADS.SCORE_THRESH 0.2 DATA_DIR /my_path_to_prepard_tsv/dataset/tsv TEST.IGNORE_BOX_REGRESSION True MODEL.ATTRIBUTE_ON True TEST.OUTPUT_FEATURE True
with environment
PyTorch version: 1.4.0
Is debug build: No
CUDA used to build PyTorch: 10.1
OS: Ubuntu 16.04.7 LTS
GCC version: (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
CMake version: version 3.5.1
Python version: 3.7
Is CUDA available: Yes
CUDA runtime version: 10.1.243
GPU models and configuration:
GPU 0: Tesla V100-SXM2-32GB
GPU 1: Tesla V100-SXM2-32GB
GPU 2: Tesla V100-SXM2-32GB
GPU 3: Tesla V100-SXM2-32GB
GPU 4: Tesla V100-SXM2-32GB
GPU 5: Tesla V100-SXM2-32GB
GPU 6: Tesla V100-SXM2-32GB
GPU 7: Tesla V100-SXM2-32GB
Nvidia driver version: 418.67
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.6.5
Versions of relevant libraries:
[pip3] numpy==1.19.2
[pip3] torch==1.4.0
[pip3] torchvision==0.5.0
[conda] blas 1.0 mkl
[conda] mkl 2020.2 256
[conda] mkl-service 2.3.0 py37he8ac12f_0
[conda] mkl_fft 1.3.0 py37h54f3939_0
[conda] mkl_random 1.1.1 py37h0573a6f_0
[conda] pytorch 1.4.0 py3.7_cuda10.1.243_cudnn7.6.3_0 pytorch
[conda] torchvision 0.5.0 py37_cu101 pytorch
Pillow (8.2.0)
I encountered the following error:
RuntimeError: CUDA error: invalid device function (launch_kernel at /opt/conda/conda-bld/pytorch_1579022060824/work/aten/src/ATen/native/cuda/Loops.cuh:103)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x47 (0x7f3f8ee64627 in /miniconda/envs/py37/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: void at::native::gpu_index_kernel<__nv_dl_wrapper_t<__nv_dl_tag<void (*)(at::TensorIterator&, c10::ArrayRef<long>, c10::ArrayRef<long>), &(void at::native::index_kernel_impl<at::native::OpaqueType<8> >(at::TensorIterator&, c10::ArrayRef<long>, c10::ArrayRef<long>)), 1u>> >(at::TensorIterator&, c10::ArrayRef<long>, c10::ArrayRef<long>, __nv_dl_wrapper_t<__nv_dl_tag<void (*)(at::TensorIterator&, c10::ArrayRef<long>, c10::ArrayRef<long>), &(void at::native::index_kernel_impl<at::native::OpaqueType<8> >(at::TensorIterator&, c10::ArrayRef<long>, c10::ArrayRef<long>)), 1u>> const&) + 0x78d (0x7f3f9670368d in /miniconda/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #2: <unknown function> + 0x571bf32 (0x7f3f966fcf32 in /miniconda/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #3: <unknown function> + 0x571c298 (0x7f3f966fd298 in /miniconda/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #4: <unknown function> + 0x16957eb (0x7f3f926767eb in /miniconda/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #5: at::native::index(at::Tensor const&, c10::ArrayRef<at::Tensor>) + 0x47e (0x7f3f926725ae in /miniconda/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #6: <unknown function> + 0x1c0155a (0x7f3f92be255a in /miniconda/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #7: <unknown function> + 0x1c06023 (0x7f3f92be7023 in /miniconda/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #8: <unknown function> + 0x3820d1a (0x7f3f94801d1a in /miniconda/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #9: <unknown function> + 0x1c06023 (0x7f3f92be7023 in /miniconda/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch.so)
frame #10: at::Tensor::index(c10::ArrayRef<at::Tensor>) const + 0x191 (0x7f3fc1465931 in /miniconda/envs/py37/lib/python3.7/site-packages/torch/lib/libtorch_python.so)
frame #11: nms_cuda(at::Tensor, float) + 0x7e8 (0x7f3f6982407b in ./maskrcnn_benchmark/_C.cpython-37m-x86_64-linux-gnu.so)
frame #12: nms(at::Tensor const&, at::Tensor const&, float) + 0x790 (0x7f3f697eabb0 in ./maskrcnn_benchmark/_C.cpython-37m-x86_64-linux-gnu.so)
frame #13: <unknown function> + 0x53b97 (0x7f3f697fbb97 in ./maskrcnn_benchmark/_C.cpython-37m-x86_64-linux-gnu.so)
frame #14: <unknown function> + 0x5004d (0x7f3f697f804d in ./maskrcnn_benchmark/_C.cpython-37m-x86_64-linux-gnu.so)
<omitting python frames>
I tried setting up my environment both with option 1 and with the Docker image; both give me the same error.
If anyone has had the same issue, please guide me.
I thought the key in the tsv file (visualgenome/label_danfeiX_overlap.new.tsv) represented the image ID, but I found out that it does not.
from maskrcnn_benchmark.structures.tsv_file_ops import tsv_reader
import glob
# get image ids from Visual Genome image files
img_files = glob.glob('/VisualGenome/VG_100K/*.jpg') + glob.glob('/VisualGenome/VG_100K_2/*.jpg')
image_ids_from_files = [img_file.split('/')[-1].split('.')[0] for img_file in img_files]
# get image ids from the scene graph annotation files
tsv = tsv_reader('datasets/visualgenome/label_danfeiX_overlap.new.tsv')
image_ids_from_annos = [row[0] for row in tsv]
# extract the overlap between the two data
overlap = set(image_ids_from_annos) & set(image_ids_from_files)
print(f'size of iid from files: {len(image_ids_from_files)}')
print(f'size of iid from annotations: {len(image_ids_from_annos)}')
print(f'size of overlapped iid: {len(overlap)}')
The output of this code looks like this:
size of iid from files: 108249
size of iid from annotations: 108073
size of overlapped iid: 5196
I think this shows that about 95% of the image ids are mismatched.
How can I get the correct image id mapping?
I would like to extract the scene graph like what is shown on R152FPN_demo.png. Which script should I try?
After installing your environment step by step following option 1, I ran the command below:
# extract vision features with VinVL object-attribute detection model
# pretrained models at https://penzhanwu2.blob.core.windows.net/sgg/sgg_benchmark/vinvl_model_zoo/vinvl_vg_x152c4.pth
# the associated labelmap at https://penzhanwu2.blob.core.windows.net/sgg/sgg_benchmark/vinvl_model_zoo/VG-SGG-dicts-vgoi6-clipped.json
python tools/test_sg_net.py --config-file sgg_configs/vgattr/vinvl_x152c4.yaml TEST.IMS_PER_BATCH 2 MODEL.WEIGHT models/vinvl/vinvl_vg_x152c4.pth MODEL.ROI_HEADS.NMS_FILTER 1 MODEL.ROI_HEADS.SCORE_THRESH 0.2 DATA_DIR "../maskrcnn-benchmark-1/datasets1" TEST.IGNORE_BOX_REGRESSION True MODEL.ATTRIBUTE_ON True
There are several issues:
I cannot find the code that loads the pre-trained model parameters for "AttrRCNN". Although the command includes a pre-trained model path, I could not find a concrete torch.load() call while debugging. So I wonder whether I need to add torch.load() myself when I run the above command.
The "self.training" attribute of "AttrRCNN" is inherited from "torch.nn.modules.module.py", where it is set to True by default. But when running the above command to extract VinVL features, it seems that it should be False, and I have to override the init functions of AttrRCNN, its "self.rpn", and its "self.roi_heads" as below:
proposals, proposal_losses = self.rpn(images, features, targets, is_training = self.training)
x, predictions, detector_losses = self.roi_heads(features, proposals, targets, is_training = self.training)
def clip_to_image(self, remove_empty=True):
    TO_REMOVE = 1
    self.bbox[:, 0].clamp_(min=0, max=self.size[0] - TO_REMOVE)
    self.bbox[:, 1].clamp_(min=0, max=self.size[1] - TO_REMOVE)
    self.bbox[:, 2].clamp_(min=0, max=self.size[0] - TO_REMOVE)
    self.bbox[:, 3].clamp_(min=0, max=self.size[1] - TO_REMOVE)
The error is as below:
File "/home/jfhe/Documents/MountHe/jfhe/mm_dialogue/MM_Dialogue/scene_graph_benchmark/maskrcnn_benchmark/modeling/rpn/rpn.py", line 188, in _forward_test
boxes = self.box_selector_test(anchors, objectness, rpn_box_regression)
File "/home/jfhe/anaconda3/envs/JD2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/jfhe/Documents/MountHe/jfhe/mm_dialogue/MM_Dialogue/scene_graph_benchmark/maskrcnn_benchmark/modeling/rpn/inference.py", line 140, in forward
sampled_boxes.append(self.forward_for_single_feature_map(a, o, b))
File "/home/jfhe/Documents/MountHe/jfhe/mm_dialogue/MM_Dialogue/scene_graph_benchmark/maskrcnn_benchmark/modeling/rpn/inference.py", line 114, in forward_for_single_feature_map
boxlist = boxlist.clip_to_image(remove_empty=False)
File "/home/jfhe/Documents/MountHe/jfhe/mm_dialogue/MM_Dialogue/scene_graph_benchmark/maskrcnn_benchmark/structures/bounding_box.py", line 217, in clip_to_image
self.bbox[:, 1].clamp_(min=0, max=self.size[1] - TO_REMOVE)
RuntimeError: Output 0 of UnbindBackward is a view and its base or another view of its base has been modified inplace. This view is the output of a function that returns multiple views. Such functions do not allow the output views to be modified inplace. You should replace the inplace operation by an out-of-place one.
I worked around them by wrapping the code in "with torch.no_grad()", but it feels strange. If I need to fine-tune the model, these bugs will come back once I remove "with torch.no_grad()".
Hi there,
It looks like the ModelZoo contains broken links for all the OpenImages models such as:
Would it be possible to fix them!?
Thanks,
Alessandro
VinVL's DOWNLOAD.md says We also provide the X152-C4 object detection config file and pretrained model on the merged four datasets (COCO with stuff, Visual Genome, Objects365 and Open Images). The labelmap to decode the 1848 can be found here. The first 1594 classes are exactly VG classes, with the same order. The map from COCO vocabulary to this merged vocabulary can be found here. The map from Objects365 vocabulary to this merged vocabulary can be found here. The map from OpenImages V5 vocabulary to this merged vocabulary can be found here.
But I am wondering how to run this pretrained model?
Obviously, Scene Graph Benchmark can't run this pre-trained model as-is, since the configuration file is not compatible with that package. I forcibly changed the config file (deleting options one by one until yacs accepted it), so I managed to run the pre-trained model, but the results are not right: the number of boxes is too small compared to other detectors (which should not be the case) ...
Any help please?
An error occurs when I build the setup: #error "You're running a too old version of GCC. We need GCC 5 or later."
Could you please tell me how to generate train.labelmap.tsv? It seems that there is no code for it in tsv_demo.py. @hanxiaotian Thanks.
Can you provide flickr30k/tsv/flickr30k.yaml as specified in the vgattr/vinvl_x152c4.yaml?
python tools/test_sg_net.py --config-file sgg_configs/vgattr/vinvl_x152c4.yaml TEST.IMS_PER_BATCH 2 MODEL.WEIGHT models/vinvl/vinvl_vg_x152c4.pth MODEL.ROI_HEADS.NMS_FILTER 1 MODEL.ROI_HEADS.SCORE_THRESH 0.2 DATA_DIR "../maskrcnn-benchmark-1/datasets1" TEST.IGNORE_BOX_REGRESSION True MODEL.ATTRIBUTE_ON True
Hi, I found a way to circumvent using tsv files by modifying scene_graph_benchmark/tools/demo/demo_image.py, and now I only need a jpg image dataset, the VinVL yaml configuration file and the model weight file. The predictions are saved in a dictionary and stored in pth format. I ran it on Google Colab and it generates predictions at a rate of about 2 s/image. I hope this helps.
# pretrained models at https://penzhanwu2.blob.core.windows.net/sgg/sgg_benchmark/vinvl_model_zoo/vinvl_vg_x152c4.pth
# the associated labelmap at https://penzhanwu2.blob.core.windows.net/sgg/sgg_benchmark/vinvl_model_zoo/VG-SGG-dicts-vgoi6-clipped.json
import cv2
import os
import os.path as op
import argparse
import json
import torch  # needed for torch.no_grad() and torch.save() below
from PIL import Image
from scene_graph_benchmark.scene_parser import SceneParser
from scene_graph_benchmark.AttrRCNN import AttrRCNN
from maskrcnn_benchmark.data.transforms import build_transforms
from maskrcnn_benchmark.utils.checkpoint import DetectronCheckpointer
from maskrcnn_benchmark.config import cfg
from scene_graph_benchmark.config import sg_cfg
from maskrcnn_benchmark.data.datasets.utils.load_files import config_dataset_file
from maskrcnn_benchmark.data.datasets.utils.load_files import load_labelmap_file
from maskrcnn_benchmark.utils.miscellaneous import mkdir
def cv2Img_to_Image(input_img):
    cv2_img = input_img.copy()
    img = cv2.cvtColor(cv2_img, cv2.COLOR_BGR2RGB)
    img = Image.fromarray(img)
    return img
def detect_objects_on_single_image(model, transforms, cv2_img):
    # cv2_img is the original input, so we can get the height and
    # width information to scale the output boxes.
    img_input = cv2Img_to_Image(cv2_img)
    img_input, _ = transforms(img_input, target=None)
    img_input = img_input.to(model.device)
    with torch.no_grad():
        prediction = model(img_input)[0].to('cpu')
    # prediction = prediction[0].to(torch.device("cpu"))
    img_height = cv2_img.shape[0]
    img_width = cv2_img.shape[1]
    prediction = prediction.resize((img_width, img_height))
    return prediction
#Setting configuration
cfg.set_new_allowed(True)
cfg.merge_from_other_cfg(sg_cfg)
cfg.set_new_allowed(False)
#Configuring VinVl
cfg.merge_from_file('/scene_graph_benchmark/sgg_configs/vgattr/vinvl_x152c4.yaml')
#This list specifies the values for additional arguments as an ordered sequence of key/value pairs
#MODEL.WEIGHT specifies the full path of the VinVl weight pth file
#DATA_DIR specifies the directory that contains VinVl input tsv configuration yaml file
argument_list = [
'MODEL.WEIGHT', 'vinvl_vg_x152c4.pth',
'MODEL.ROI_HEADS.NMS_FILTER', 1,
'MODEL.ROI_HEADS.SCORE_THRESH', 0.2,
'TEST.IGNORE_BOX_REGRESSION', False,
'MODEL.ATTRIBUTE_ON', True
]
cfg.merge_from_list(argument_list)
cfg.freeze()
# assert op.isfile(args.img_file), \
# "Image: {} does not exist".format(args.img_file)
output_dir = cfg.OUTPUT_DIR
# mkdir(output_dir)
model = AttrRCNN(cfg)
model.to(cfg.MODEL.DEVICE)
model.eval()
checkpointer = DetectronCheckpointer(cfg, model, save_dir=output_dir)
checkpointer.load(cfg.MODEL.WEIGHT)
transforms = build_transforms(cfg, is_train=False)
input_img_directory = 'insert your images directory path here'
#need to be pth
output_prediction_file = 'insert your output pth file path here'
dets = {}
for img_name in os.listdir(input_img_directory):
    #Convert png format to jpg format
    if img_name.split('.')[1]=='png' or img_name.split('.')[1]=='PNG':
        im = Image.open(os.path.join(input_img_directory, img_name))
        rgb_im = im.convert('RGB')
        new_name = img_name.split('.')[0]+'.jpg'
        rgb_im.save(os.path.join(input_img_directory, new_name))
        print(new_name)
    img_file_path = os.path.join(input_img_directory,img_name.split('.')[0]+'.jpg')
    print(img_file_path)
    cv2_img = cv2.imread(img_file_path)
    det = detect_objects_on_single_image(model, transforms, cv2_img)
    # prediction contains ['labels',
    # 'scores',
    # 'box_features',
    # 'scores_all',
    # 'boxes_all',
    # 'attr_labels',
    # 'attr_scores']
    # box_features are used for oscar
    # det is the prediction returned above (the original referenced an undefined det1)
    det_dict = {key: det.get_field(key) for key in det.fields()}
    dets[img_name.split('.')[0]] = det_dict
torch.save(dets, output_prediction_file)
Originally posted by @SPQRXVIII001 in #7 (comment)
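A note on the file-name handling in the script above: `img_name.split('.')[1]` raises an IndexError for names without a dot and picks the wrong piece for names with several dots. A possible alternative, sketched here with the standard library only, splits on the last dot instead. The helper names and the set of accepted extensions are assumptions, not part of the original script.

```python
import os.path as op

# Extensions treated as images in this sketch (an assumption).
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def split_image_name(file_name):
    """Return (stem, lowercase extension), splitting on the last dot only."""
    stem, ext = op.splitext(file_name)
    return stem, ext.lower()


def is_image(file_name):
    # True when the extension is one of the accepted image types.
    return split_image_name(file_name)[1] in IMAGE_EXTENSIONS


print(split_image_name("holiday.2021.PNG"))  # ('holiday.2021', '.png')
print(is_image("README"))                    # False
```

Filtering the directory listing with `is_image` would also skip non-image files, which the original loop passes to cv2.imread unchecked.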
Hi, I want to train the detector with train_net.py. Could you please give me some guidance on how to organize the data and how to pass the parameters? Thanks. @pzzhang
I could not find the image used in the VinVL script. Where can I get the image file referenced by "--img_file ../maskrcnn-benchmark-1/datasets1/imgs/woman_fish.jpg"?
The download link is not available, please check it.
azcopy copy 'https://penzhanwu2.blob.core.windows.net/sgg/sgg_benchmark/datasets/openimages_v5c' --recursive
When attempting to perform training using tools/train_sg_net.py and a config file like sgg_configs/vg_vrd/rel_danfeiX_FPN50_nm.yaml, I receive the following error:
Traceback (most recent call last):
File "tools/train_sg_net.py", line 225, in <module>
main()
File "tools/train_sg_net.py", line 218, in main
model = train(cfg, args.local_rank, args.distributed)
File "tools/train_sg_net.py", line 110, in train
meters
File "/home/scene_graph_benchmark/maskrcnn_benchmark/engine/trainer.py", line 94, in do_train
loss_dict = model(images, targets)
File "/home/miniconda3/envs/sg_benchmark/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/miniconda3/envs/sg_benchmark/lib/python3.7/site-packages/torch/nn/parallel/distributed.py", line 619, in forward
output = self.module(*inputs[0], **kwargs[0])
File "/home/miniconda3/envs/sg_benchmark/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/scene_graph_benchmark/scene_graph_benchmark/scene_parser.py", line 319, in forward
x_pairs, prediction_pairs, relation_losses = self.relation_head(features, predictions, targets)
File "/home/miniconda3/envs/sg_benchmark/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/scene_graph_benchmark/scene_graph_benchmark/relation_head/relation_head.py", line 211, in forward
= self.rel_predictor(features, proposals, proposal_pairs)
File "/home/miniconda3/envs/sg_benchmark/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/scene_graph_benchmark/scene_graph_benchmark/relation_head/neural_motif/neuralmotif.py", line 135, in forward
= _get_tensor_from_boxlist(proposals, 'gt_labels')
File "/home/scene_graph_benchmark/scene_graph_benchmark/relation_head/sparse_targets.py", line 61, in _get_tensor_from_boxlist
assert proposals[0].extra_fields[field] is not None
KeyError: 'gt_labels'
Has anyone else encountered the same problem?
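A side note on why this surfaces as a KeyError rather than an AssertionError: in `assert proposals[0].extra_fields[field] is not None`, the dictionary lookup runs first and fails before the `is not None` check is evaluated. The sketch below reproduces that behaviour with a plain dict and shows a variant that reports which fields are present. Both helper names are hypothetical, and the sketch does not explain why 'gt_labels' is absent from the proposals.

```python
def get_field_strict(extra_fields, field):
    # Mirrors the failing pattern: the lookup itself raises KeyError
    # before the assertion can be evaluated.
    assert extra_fields[field] is not None
    return extra_fields[field]


def get_field_checked(extra_fields, field):
    # Report the missing field together with the fields that do exist.
    if field not in extra_fields:
        available = sorted(extra_fields)
        raise ValueError(f"field '{field}' missing; available: {available}")
    return extra_fields[field]
```

Printing the available fields at the failing call site may help narrow down whether the ground-truth labels were never attached to the proposals or were stored under a different name.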
Hi,
Thank you for providing the code for this project. I was trying to generate image features, and I attempted to follow the examples from both #25 and #7, as follows:
With a directory of 18 images (stored in datasets/test_imgs), I used tools/mini_tsv/demo_tsv.py to generate the tsv files (label, hw, linelist) for the corresponding dataset, and stored them in datasets/test/. I didn't have any particular labelmap in mind, and I had downloaded the checkpoint for the RelDN model and its corresponding config file, so I used the label map VG-SGG-dicts-vgoi6-clipped.json (I copied this file into the same directory). My yaml file is as follows:
datasets/test/test_imgs.yaml
img: test_imgs.tsv label: test_imgs_label.tsv hw: test_imgs_hw.tsv label_map: VG-SGG-dicts-vgoi6-clipped.json linelist: test_imgs_linelist.tsv
Then, I made a new yaml file datasets/test/testing.yaml
which was the same yaml file as rel_danfeiX_FPN50_reldn.yaml
but with DATASETS.TRAIN = ("test/test_imgs.yaml",) and DATASETS.TEST = ("test/test_imgs.yaml",) and ran the command
python -m torch.distributed.launch --nproc_per_node=2 tools/test_sg_net.py --config-file datasets/test/testing.yaml
This ran into the error:
2021-07-16 04:08:02,996 maskrcnn_benchmark.inference INFO: Start evaluation on test/test_imgs.yaml dataset(18 images).
INFO:maskrcnn_benchmark.inference:Start evaluation on test/test_imgs.yaml dataset(18 images).
0%| | 0/3 [00:00<?, ?it/s]
Traceback (most recent call last):
File "tools/test_sg_net.py", line 198, in <module>
main()
File "tools/test_sg_net.py", line 194, in main
run_test(cfg, model, args.distributed, model_name)
File "tools/test_sg_net.py", line 73, in run_test
save_predictions=cfg.TEST.SAVE_PREDICTIONS,
File "/home/f-run/PyCharmProjects/scene_graph_benchmark/maskrcnn_benchmark/engine/inference.py", line 265, in inference
predictions = compute_on_dataset(model, data_loader, device, bbox_aug, inference_timer)
File "/home/f-run/PyCharmProjects/scene_graph_benchmark/maskrcnn_benchmark/engine/inference.py", line 32, in compute_on_dataset
for _, batch in enumerate(tqdm(data_loader)):
File "/home/f-run/.conda/envs/sg_benchmark/lib/python3.7/site-packages/tqdm/std.py", line 1185, in __iter__
for obj in iterable:
File "/home/f-run/.conda/envs/sg_benchmark/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
data = self._next_data()
File "/home/f-run/.conda/envs/sg_benchmark/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1085, in _next_data
return self._process_data(data)
File "/home/f-run/.conda/envs/sg_benchmark/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1111, in _process_data
data.reraise()
File "/home/f-run/.conda/envs/sg_benchmark/lib/python3.7/site-packages/torch/_utils.py", line 428, in reraise
raise self.exc_type(msg)
TypeError: Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/home/f-run/.conda/envs/sg_benchmark/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 198, in _worker_loop
data = fetcher.fetch(index)
File "/home/f-run/.conda/envs/sg_benchmark/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/f-run/.conda/envs/sg_benchmark/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/home/f-run/PyCharmProjects/scene_graph_benchmark/maskrcnn_benchmark/data/datasets/relation_tsv.py", line 146, in __getitem__
target = self.get_target_from_annotations(annotations, img_size)
File "/home/f-run/PyCharmProjects/scene_graph_benchmark/maskrcnn_benchmark/data/datasets/relation_tsv.py", line 78, in get_target_from_annotations
target = self.label_loader(annotations['objects'], img_size, remove_empty=False)
TypeError: list indices must be integers or slices, not str
This is very strange to me, since I don't run into this issue at all when I run the command without my own dataset. Did I do anything incorrectly that could cause this error, and if so, how can I fix it?
This is my first issue raised so forgive me if this is too much/little info or if this is more suited to Stack Overflow instead.
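For readers hitting the same TypeError: the last frame calls annotations['objects'], so the relation dataset expects each label entry to be a dict, while a label file generated for plain detection may hold a bare list of boxes. The sketch below is one possible workaround under stated assumptions, not a confirmed fix: it assumes each label row is "key<TAB>JSON", that wrapping the list under an "objects" key is enough, and that an empty "relations" list is acceptable. wrap_label_row is a hypothetical helper, and the "relations" key is a guess that does not appear in the traceback.

```python
import json


def wrap_label_row(line):
    """Turn 'key<TAB>[...]' into 'key<TAB>{"objects": [...], "relations": []}'."""
    key, payload = line.rstrip("\n").split("\t", 1)
    annotations = json.loads(payload)
    if isinstance(annotations, list):
        # Assumption: a bare list is a list of object boxes; the
        # "relations" key is a guess at what the relation loader wants.
        annotations = {"objects": annotations, "relations": []}
    # Rows that already hold a dict pass through unchanged.
    return key + "\t" + json.dumps(annotations)


if __name__ == "__main__":
    row = 'img1\t[{"class": "person", "rect": [0, 0, 10, 10]}]'
    print(wrap_label_row(row))
```

Applying this to every row of the label tsv and then regenerating any line-index files that depend on it would be the natural next step, but whether the loader then accepts images with no relations has not been verified here.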
Hi!
How can I extract image features from my dataset with VinVL if it's not in tsv format, but in the form of a folder with image files? What's the correct way to do this?
python -m torch.distributed.launch --nproc_per_node=$NGPUS tools/train_sg_net.py --config-file "sgg_configs/vg_vrd/rel_danfeiX_FPN50_reldn.yaml"
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
I use the model provided for VinVL feature extraction. But after using the same settings as advised, I can only obtain bounding box features. How can I extract relation features? Thanks.
I cannot seem to open the model links from https://github.com/microsoft/scene_graph_benchmark/blob/main/SCENE_GRAPH_MODEL_ZOO.md
Any help will be very much appreciated.
@hanxiaotian Hi, thanks a lot for releasing the great SGG benchmark! I want to extract the predicted scene graph (with predicted boxes, attributes and relations) from scratch for new images using the pre-trained models in the model zoo. However, when I try to use the demo script, I notice the model cannot predict attributes and relations together (the VinVL pre-trained model only predicts the attributes and the RelDN pre-trained model only generates relations, where I need to do further ad-hoc alignment with the two outputs to predict a full scene graph). Is there any way to achieve this using a single provided pre-trained model? (Sorry I'm not very familiar with this task.) Thank you very much!
Hi - I am getting a KeyError: 'box_features' message when trying to extract features from my own image. I've played around with the code, but can't seem to figure it out. If I set TEST.OUTPUT_FEATURE to False, the code runs fine, but it only outputs the detected objects. Can someone please help out?
For reference, the demo extraction on single image works fine - both object detection and box features.
Traceback (most recent call last):
File "tools/test_sg_net.py", line 197, in <module>
main()
File "tools/test_sg_net.py", line 193, in main
run_test(cfg, model, args.distributed, model_name)
File "tools/test_sg_net.py", line 72, in run_test
save_predictions=cfg.TEST.SAVE_PREDICTIONS,
File "/content/drive/MyDrive/scene_graph_benchmark/maskrcnn_benchmark/engine/inference.py", line 297, in inference
relation_on=cfg.MODEL.RELATION_ON,
File "/content/drive/MyDrive/scene_graph_benchmark/maskrcnn_benchmark/engine/inference.py", line 211, in convert_predictions_to_tsv
tsv_writer(gen_rows(), os.path.join(output_folder, output_tsv_name))
File "/content/drive/MyDrive/scene_graph_benchmark/maskrcnn_benchmark/structures/tsv_file_ops.py", line 42, in tsv_writer
for value in values:
File "/content/drive/MyDrive/scene_graph_benchmark/maskrcnn_benchmark/engine/inference.py", line 139, in gen_rows
features = prediction.get_field('box_features').numpy()
File "/content/drive/MyDrive/scene_graph_benchmark/maskrcnn_benchmark/structures/bounding_box.py", line 43, in get_field
return self.extra_fields[field]
KeyError: 'box_features'
I noticed that the five methods in the comparison were all published before 2020, but the mainstream approach in 2020 and 2021 is graph neural networks. So I hope your team can add some of the latest methods to this project. Thanks.
The download links to the pre-trained model checkpoints for OpenImages V5 show File Not Found errors. Could you please take a look and provide new links?
Does the labelmap only provide 50 objects? Could it be made larger, or do we have to train the object detection from scratch?
I have object labels for all the boxes, but there are more of them than the 50 object labels in the labelmap file that the project provides.
So how should I get the relation predictions?
When I run the precls code, I always get KeyError: 'broccoli', 'carrot', and so on.
Hi, when I use tsv_demo.py to generate tsv files from my own images, it is easy to generate img.tsv and hw.tsv. But for label.tsv, how can I generate the class and rect fields if I do not have any annotations?
Thanks a lot!
I found that VinVL's object and attribute label sets are much larger. So how can VinVL be used for predicate classification? At the moment the project only provides 150 object classes, but Visual Genome + Faster R-CNN can detect 1370 object classes. That is a big difference.
Hi,
I am trying to train a model following python tools/train_sg_net.py --config-file "/path/to/config/file.yaml". But I am confused about which config file I need to use. Can you share more details about the training config files?
Thank you very much!
I ran the following command (from the README):
python tools/test_sg_net.py --config-file sgg_configs/vgattr/vinvl_x152c4.yaml TEST.IMS_PER_BATCH 2 MODEL.WEIGHT models/vinvl/vinvl_vg_x152c4.pth MODEL.ROI_HEADS.NMS_FILTER 1 MODEL.ROI_HEADS.SCORE_THRESH 0.2 DATA_DIR "../maskrcnn-benchmark-1/datasets1" TEST.IGNORE_BOX_REGRESSION True MODEL.ATTRIBUTE_ON True
I think the DATA_DIR is misconfigured because I get the following error (below). Where is ../maskrcnn-benchmark-1/datasets1 from? Or the file visualgenome/test_vgoi6_clipped.yaml, which I think it's looking for?
This is the AssertionError:
Traceback (most recent call last):
File "tools/test_sg_net.py", line 197, in <module>
main()
File "tools/test_sg_net.py", line 193, in main
run_test(cfg, model, args.distributed, model_name)
File "tools/test_sg_net.py", line 55, in run_test
data_loaders_val = make_data_loader(cfg, is_train=False, is_distributed=distributed)
File "/scene_graph_benchmark/maskrcnn_benchmark/data/build.py", line 170, in make_data_loader
datasets = build_dataset(cfg, transforms, DatasetCatalog, is_train or is_for_period)
File "/scene_graph_benchmark/maskrcnn_benchmark/data/build.py", line 45, in build_dataset
cfg, dataset_name, factory_name, is_train
File "/scene_graph_benchmark/maskrcnn_benchmark/data/datasets/utils/config_args.py", line 7, in config_tsv_dataset_args
assert op.isfile(full_yaml_file)
AssertionError
Thanks in advance!
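The bare AssertionError above comes from assert op.isfile(full_yaml_file), which gives no hint about which path was tested. A minimal sketch of a pre-flight check follows, assuming the path is formed by joining DATA_DIR with the dataset yaml name from the config (the join itself is an assumption, since that line is not visible in the traceback). resolve_dataset_yaml is a hypothetical helper, not part of the repository.

```python
import os.path as op


def resolve_dataset_yaml(data_dir, dataset_name):
    """Return the dataset yaml path, or raise with the path that was tried."""
    # Assumption: the loader looks for <DATA_DIR>/<dataset yaml name>.
    full_yaml_file = op.join(data_dir, dataset_name)
    if not op.isfile(full_yaml_file):
        raise FileNotFoundError(
            f"dataset yaml not found: {op.abspath(full_yaml_file)} "
            f"(DATA_DIR={data_dir!r}, dataset={dataset_name!r})"
        )
    return full_yaml_file


if __name__ == "__main__":
    try:
        resolve_dataset_yaml(
            "../maskrcnn-benchmark-1/datasets1",
            "visualgenome/test_vgoi6_clipped.yaml",
        )
    except FileNotFoundError as err:
        print(err)
```

Running a check like this with the DATA_DIR value from the command line shows the absolute path being looked up, which makes it easier to see whether DATA_DIR should point at a local datasets folder instead.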
I failed to download the VinVL pre-trained model with the azcopy command. Could you show me how to download the pre-trained models? Thank you!
./azcopy cp https://penzhanwu2.blob.core.windows.net/results/vinvl/od_models/vinvl_vg_x152c4.pth .
INFO: Scanning...
failed to perform copy command due to error: Login Credentials missing. No SAS token or OAuth token is present and the resource is not public.
Dear scholar,
I want to ask whether your elegant code includes a function to produce a description of the attributes and object label for a designated bounding box.
Your tools/demo_image.py can produce 36 bounding boxes with attribute and object labels, using your model with the pretrained weights file.
Could your code pass a boxlist to the model, so that the model produces the attribute and object labels for the designated bounding boxes?
Hi, I want to extract features from my directory of images, so I followed issue #7 to set up some tsv files using tsv_demo.py, and then I wrote a test.yaml. Then, in vinvl_x152c4.yaml, I pointed DATASETS.TEST to my test.yaml, but I cannot run test_sg_net.py correctly. Is there anything I should have done but missed?
Thank you very much!