
reltr's Introduction

🎉 Good News: our work has been accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI)! 🎉


RelTR: Relation Transformer for Scene Graph Generation

We now provide [Colab] Demo!

PyTorch Implementation of the Paper RelTR: Relation Transformer for Scene Graph Generation

Unlike most existing advanced approaches, which infer dense relationships between all entity proposals, our one-stage method directly generates a sparse scene graph by decoding the visual appearance. If our work is helpful for your research, please cite our publication:

@article{cong2023reltr,
  title={Reltr: Relation transformer for scene graph generation},
  author={Cong, Yuren and Yang, Michael Ying and Rosenhahn, Bodo},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2023},
  publisher={IEEE}
}

0. Checklist

  • Inference Code 🎉
  • Training Code for Visual Genome 🎉
  • Evaluation Code for Visual Genome 🎉
  • Colab Demo 🎉
  • Training Code for OpenImages V6 🎉
  • Evaluation Code for OpenImages V6 🎉
  • Cleaner Evaluation Code 🕘
  • Post Processing 🕘

1. Installation

Clone the RelTR repository:

git clone https://github.com/yrcong/RelTR.git
cd RelTR

For Inference

😄 It is super easy to configure the RelTR environment.

If you only want to run inference on an image, just python=3.6, PyTorch=1.6 and matplotlib are required! You can configure the environment as follows:

# create a conda environment 
conda create -n reltr python=3.6
conda activate reltr

# install packages
conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.1 -c pytorch
conda install matplotlib

Training/Evaluation on Visual Genome or Open Images V6

If you want to train/evaluate RelTR on Visual Genome, you need a little more preparation:

a) SciPy (we used 1.5.2) and pycocotools are required.

conda install scipy
pip install -U 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'

b) Follow README in the data directory to prepare the datasets.

c) Some widely-used evaluation code (for IoU computation) needs to be compiled. We will replace it with PyTorch code later.

# compile the code computing box intersection
cd lib/fpn
sh make.sh
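
Until that PyTorch replacement lands, the quantity the compiled extension computes (pairwise box IoU) can be sketched in pure PyTorch as below. This is an illustrative example only, not the repository's implementation, and it assumes boxes in (x1, y1, x2, y2) format:

import torch

def pairwise_box_iou(boxes1, boxes2):
    """Pairwise IoU between two sets of boxes in (x1, y1, x2, y2) format.

    boxes1: (N, 4) tensor, boxes2: (M, 4) tensor -> (N, M) IoU matrix.
    """
    area1 = (boxes1[:, 2] - boxes1[:, 0]) * (boxes1[:, 3] - boxes1[:, 1])
    area2 = (boxes2[:, 2] - boxes2[:, 0]) * (boxes2[:, 3] - boxes2[:, 1])

    # Intersection rectangle, broadcast to an (N, M, 2) grid of corners.
    lt = torch.max(boxes1[:, None, :2], boxes2[None, :, :2])
    rb = torch.min(boxes1[:, None, 2:], boxes2[None, :, 2:])
    wh = (rb - lt).clamp(min=0)
    inter = wh[..., 0] * wh[..., 1]

    union = area1[:, None] + area2[None, :] - inter
    return inter / union.clamp(min=1e-9)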

The directory structure looks like:

RelTR
│
└───data
│   └───vg
│   │   │   rel.json
│   │   │   test.json
│   │   │   train.json
│   │   │   val.json
│   │   │   images
│   └───oi
│       │   rel.json
│       │   test.json
│       │   train.json
│       │   val.json
│       │   images
└───datasets
...

2. Usage

Inference

a) Download our RelTR model pretrained on the Visual Genome dataset and put it under

ckpt/checkpoint0149.pth

b) Infer the relationships in an image with the command:

python inference.py --img_path $IMAGE_PATH --resume $MODEL_PATH

We attach 5 images from the VG dataset and 1 image from the internet in the demo/ folder. You can also test with your own images.

Training

a) Train RelTR on Visual Genome on a single node with 8 GPUs (2 images per GPU):

python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --dataset vg --img_folder data/vg/images/ --ann_path data/vg/ --batch_size 2 --output_dir ckpt

b) Train RelTR on Open Images V6 on a single node with 8 GPUs (2 images per GPU):

python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py --dataset oi --img_folder data/oi/images/ --ann_path data/oi/ --batch_size 2 --output_dir ckpt

Evaluation

a) Evaluate the pretrained RelTR on Visual Genome with a single GPU (1 image per GPU):

python main.py --dataset vg --img_folder data/vg/images/ --ann_path data/vg/ --eval --batch_size 1 --resume ckpt/checkpoint0149.pth

b) Evaluate the pretrained RelTR on Open Images V6 with a single GPU (1 image per GPU):

python main.py --dataset oi --img_folder data/oi/images/ --ann_path data/oi/ --eval --batch_size 1 --resume ckpt/checkpoint0149_oi.pth

3. Questions

Since the code was cleaned up from a draft, there may still be some errors. If you run into any problem when running our code, please let me know! (It's better to open an issue so that everyone can see it.)

reltr's People

Contributors

yrcong

reltr's Issues

Preprocessing Code

Hi, dear authors of RelTR:

Your work is very solid and we are impressed. Is it possible to release the preprocessing code, so we could reproduce the results on other datasets?

When I was training data, I encountered an error

Hello, I am very interested in your research. I have created my own dataset (with new categories and relationships added), and the format of the dataset is the same as in your code. But when I was training, I encountered an error, and I couldn't find any problem with my dataset format. Is there any aspect of the code that needs to be adjusted? Can you help me?

The error is as follows:

C:/w/b/windows/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:84: block: [0,0,0], thread: [61,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
C:/w/b/windows/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:84: block: [0,0,0], thread: [62,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
C:/w/b/windows/pytorch/aten/src/ATen/native/cuda/IndexKernel.cu:84: block: [0,0,0], thread: [63,0,0] Assertion index >= -sizes[i] && index < sizes[i] && "index out of bounds" failed.
Traceback (most recent call last):
  File "main.py", line 240, in <module>
    main(args)
  File "main.py", line 192, in main
    train_stats = train_one_epoch(model, criterion, data_loader_train, optimizer, device, epoch, args.clip_max_norm)
  File "E:\LXD\RelTR-main\engine.py", line 40, in train_one_epoch
    loss_dict = criterion(outputs, targets)
  File "E:\Anaconda3\envs\reltr\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "E:\LXD\RelTR-main\models\reltr.py", line 299, in forward
    indices = self.matcher(outputs_without_aux, targets)
  File "E:\Anaconda3\envs\reltr\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "E:\Anaconda3\envs\reltr\lib\site-packages\torch\autograd\grad_mode.py", line 15, in decorate_context
    return func(*args, **kwargs)
  File "E:\LXD\RelTR-main\models\matcher.py", line 112, in forward
    cost_sub_giou = -generalized_box_iou(box_cxcywh_to_xyxy(sub_bbox), box_cxcywh_to_xyxy(sub_tgt_bbox))
  File "E:\LXD\RelTR-main\util\box_ops.py", line 50, in generalized_box_iou
    assert (boxes1[:, 2:] >= boxes1[:, :2]).all(), boxes1
RuntimeError: CUDA error: device-side assert triggered
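
For context, this kind of device-side assert in the matcher usually means a label index in the annotations is outside the range the model was built for. A minimal, hedged sanity check along these lines (assuming train.json really uses standard COCO-style keys, as described in the data README; adjust the class count for a custom dataset) can help locate bad labels before training:

import json

NUM_ENTITY_CLASSES = 151  # value used for VG in models/reltr.py; change for a custom dataset

with open('data/vg/train.json') as f:
    coco = json.load(f)

# Every COCO-style object annotation must carry a category_id inside the valid range.
bad = [ann for ann in coco['annotations']
       if not 0 <= ann['category_id'] < NUM_ENTITY_CLASSES]
print(f'{len(bad)} object annotations with an out-of-range category_id')

# Predicate indices stored in rel.json should be range-checked the same way
# against the number of relation classes the model is configured with.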

Model not getting trained on single GPU

When I try to train on a single GPU, the error keeps increasing and I cannot see any good results even by the 38th epoch.

train_class_error starts at 97.88 and from the 19th to the 37th epoch it is consistently 100. Can you help debug this?

Please let me know if you need more information.

Train and Test for custom images

How did you get the annotations of Visual Genome (in COCO format)? Do you have tools to process the original datasets? I want to use RelTR on my custom dataset (only pictures). I have seen the KaihuaTang/Scene-Graph-Benchmark.pytorch code, but have no idea how to process my custom dataset for training and testing. Thanks very much!

annotation files

Hey, can you share how to create annotation files for a custom dataset? Thanks.

Regarding Pretrained Model Weights

Dear authors, thanks for your fabulous code! I have a question regarding model training: did you initialize the model with a pretrained DETR model, or did you train the whole network from scratch?

Some misunderstanding about the heat map used to predict relationships

Hello, thank you very much for providing the code. However, in the paper you say that "The predicate probability p̂_prd is predicted by a multi-layer perceptron concatenating the corresponding subject representation, object representation, and spatial feature vector, which can be formulated as:
p̂_prd = softmax(MLP([Q_s, Q_o, V_spa]))."
But in the source code I don't see this concatenation; I only see that you use the subject heat map for prediction. Could you explain it to me? Thank you very much.

  • In the RelTR model: [code screenshot]
  • In the Transformer: [code screenshot]
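
For reference, the formula quoted in this issue corresponds to a head along these lines; this is a minimal illustrative sketch of the paper's equation, not the repository's exact module, and the dimensions (d_model, d_spatial, hidden) are assumptions:

import torch
import torch.nn as nn

class PredicateHead(nn.Module):
    """Sketch of p_prd = softmax(MLP([Q_s, Q_o, V_spa])) from the paper."""

    def __init__(self, d_model=256, d_spatial=64, num_rel_classes=51, hidden=512):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * d_model + d_spatial, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_rel_classes),
        )

    def forward(self, q_sub, q_obj, v_spatial):
        # Concatenate subject representation, object representation and spatial feature,
        # then classify the predicate.
        logits = self.mlp(torch.cat([q_sub, q_obj, v_spatial], dim=-1))
        return logits.softmax(dim=-1)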

Evaluation

I don't understand why class_error, sub_error, obj_error, and rel_error are so high. Did I misunderstand something, or is the evaluation process wrong?

Training on Visual Genome dataset

Hi. Thanks for sharing your nice work!

Do you have any pre-trained weights on full Visual genome dataset? (not split one)

Thanks.

How to generate a graph from an image?

If a scene graph needs to be generated by RelTR, what should be done?
This command only detects the objects in the image:
python inference.py --img_path ./demo/vgx.jpg --resume ./ckpt/checkpoint0149.pth

So, please kindly tell me how to do it, thanks very much!

Number of classes of relations in VG

May I kindly ask why each sample in outputs_class_rel = self.rel_class_embed(torch.cat((hs_sub, hs_obj, so_masks), dim=-1)) at models/reltr.py:111 has dim 52, instead of 51? The processed VG has 50 relation classes, so I assume the dim should be 51 with an additional no relation ('background') class.

Besides, it can be seen from models/reltr.py:213-214

# Count the number of predictions that are NOT "no-object" (which is the last class)
card_pred = (pred_logits.argmax(-1) != pred_logits.shape[-1] - 1).sum(1)

that the last class represents the "no-object" class.

However, in the colab notebook, you said
REL_CLASSES = ['background', 'above', 'across', 'against', 'along', 'and', 'at', 'attached to', 'behind',
'belonging to', 'between', 'carrying', 'covered in', 'covering', 'eating', 'flying in', 'for',
'from', 'growing on', 'hanging from', 'has', 'holding', 'in', 'in front of', 'laying on',
'looking at', 'lying on', 'made of', 'mounted on', 'near', 'of', 'on', 'on back of', 'over',
'painted on', 'parked on', 'part of', 'playing', 'riding', 'says', 'sitting on', 'standing on',
'to', 'under', 'using', 'walking in', 'walking on', 'watching', 'wearing', 'wears', 'with']

I notice that REL_CLASSES has a length of 51, not 52, and 'background' is at index 0, not at the last index.
Is this REL_CLASSES in the Colab the label ordering you use in your training code (in data/vg/rel.json)? Because I am re-organizing the dataset labels for my own project, I need to know the exact ordering of these label indices. Thanks for your assistance!
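
To make the indexing convention concrete, here is a toy illustration of the quoted cardinality computation; the shapes and values are made up:

import torch

# Toy logits: 2 images, 4 relation queries each, 52 classes,
# where the last index is treated as the "no-object"/background class.
pred_logits = torch.randn(2, 4, 52)

# Count the queries per image whose argmax is NOT the last ("no-object") class,
# exactly as in the line quoted from models/reltr.py above.
card_pred = (pred_logits.argmax(-1) != pred_logits.shape[-1] - 1).sum(1)
print(card_pred)  # e.g. tensor([4, 3])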

ImportError: cannot import name '_new_empty_tensor'

Getting this error when trying to run inference.py -- is this related to version issues with Python/conda? I did my best to triage the issue independently, but all I could find is that this is a known issue with PyTorch that was fixed with Python 3.6; however, that is the version I'm running.

Appreciate you for all your work on this btw!

About evaluate_rel_batch() function

Hello, thank you very much for providing the code. I modified the evaluate_rel_batch() function, but it seems to have encountered an error. Could you please help analyze the possible reasons for this error? Thank you very much!

Unable to download checkpoint from drive

Hey!

We tried to automatically download the RelTR checkpoint from drive but had some issues. We did not have this issue with any of the other assets hosted on drive.

Maybe you could help us out / update the permission.

Thanks!

The problem of training time

Thanks for your inspiring work!
According to your report, the model was trained on 8 RTX2080 (2 images per GPU) for 150 epochs to reach the expected performance, which is a bit too long to modify/re-train based on the code.

Therefore, I would like to know if the model took so long to train because the DETR part was completely retrained?
Is it possible to use the pre-trained DETR encoder part to reduce the training overhead?

about Predcls

I found the code about PredCls in this issue: #20 (comment)

You mentioned the function

def evaluate_batch_predcls(outputs, targets, evaluator, matching_indices, evaluator_list):
.......

Could you give some details about how matching_indices is computed? Thanks!

Some questions about evaluation

Hello,
Firstly, I want to express my gratitude for sharing your code. It has been incredibly helpful to me.

I see such evaluation results in your paper

[screenshot: evaluation results from the paper]

How did you get your SGDet, SGCls, PredCls evaluation results?

After running the evaluation with your code, I can't find a correspondence.

[screenshot: evaluation output from the code]

Could you please share the relevant code or explain your evaluation results?

I'm looking forward to your response.

name 'train_stats' is not defined

When I train with a single NVIDIA GeForce RTX 4090, the error name 'train_stats' is not defined is reported. What is the reason for this? The command I am using is: python main.py --dataset vg --img_folder data/vg/images/ --ann_path data/vg/ --batch_size 2 --output_dir ckpt

Question about the demo

Hello! How can I switch to other images in the demo you provide? The image link I substitute raises an error. What are the exact steps?

evaluating RelTR on PredCLS/SGCLS

Dear authors, thanks for your code. I have a question about evaluating RelTR on PredCls/SGCls: how do you assign the ground-truth information to the matched triplet proposals? Thanks a lot.

What happens when there are no relations in a sample?

Hi and thanks for sharing this interesting work!

I am working on supporting another dataset in VG format. In my dataset I have images with no relations between the objects, and I wonder how to make the network support this.

There is an issue since the images still contain object boxes but targets['rel_annotations'] is empty. Therefore, in the matcher's forward pass, indices has a different length from indices1.

I'd appreciate your input. Thanks in advance!
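
One possible (unofficial) workaround is to drop images without any relation triplets when converting a custom dataset to the VG format. The sketch below is purely illustrative; rel_annotations is a hypothetical mapping mirroring the per-image entries of rel.json:

def filter_images_with_relations(image_ids, rel_annotations):
    """Keep only images that have at least one (subject, object, predicate) triplet.

    rel_annotations: hypothetical dict mapping image id -> list of triplets.
    """
    return [img_id for img_id in image_ids if rel_annotations.get(img_id)]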

rel_logits vs pred_logits

Hello,

Thank you for the wonderful work.

In models/reltr.py RelTR module, you specify outputs_class_rel and outputs_class as rel_logits and pred_logits.
What is the difference between these two?

Thank you,
William Han

About OpenV6

Why does the Open Images V6 dataset I processed have 600 object classes, while the article mentions 289 object classes? Were there any additional processing steps involved?

Inquiry about code behavior in relation to constraints of evaluation metrics

Hello,

Firstly, I want to express my gratitude for sharing your code. It has been incredibly helpful to me. However, I have a question regarding the evaluation metric when considering constraints versus without constraints.

From my understanding, the multiple_preds variable is used to control different modes. However, upon reviewing the code, I couldn't identify any discernible differences in actions taken for each mode.

Could you kindly provide some clarification on how the code behaves differently when multiple_preds is set to different values? I would greatly appreciate any insights you can provide.

Thank you once again for your assistance.

Some details about Open Image v6.

Many thanks for the great work.
Since the code for Open Images V6 is not released yet, may I ask about the data split you use for training?
Do you use the train+val splits or only the train split?

about output

I see that in the paper a scene graph is finally generated, but the output of the open-source code is only the attention heat map and the detection results on the image, right? Is there any way to generate the final integrated scene graph?

Memory Utilization Issue

While training the code on a 4-GPU system, the memory utilization suddenly exploded after 5 epochs, killing the process. I was training on a university HPC system with the following specification:
24 cores
128 GB RAM
4 Nvidia Quadro RTX 8000

The problem of RelTR Demo.ipynb

[screenshot of the error]
I would like to ask you a question. When I run the RelTR Demo.ipynb you published, I encounter the error shown in the figure. Also, many tensors in conv_features are zeros. Could you guide me on how to resolve this?

Error during training in bbox.pyx : ValueError: Buffer dtype mismatch, expected 'DTYPE_t' but got 'double'

Hi,
Thank you for your work !
I am trying to train the model on a customized dataset, but after the first epoch I get the following error at these lines of sg_eval.py:
sub_iou = bbox_overlaps(gt_box[None, :4], boxes[:, :4])[0]
obj_iou = bbox_overlaps(gt_box[None, 4:], boxes[:, 4:])[0]

File "bbox.pyx", line 17, in bbox.bbox_overlaps
ValueError: Buffer dtype mismatch, expected 'DTYPE_t' but got 'double'

Have you ever had this issue?

About training strategy

Hi, I'm now training the RelTR model on the VG dataset and I find the training time is quite long. It takes ~2.5 days to train for 150 epochs on 4x3090 with batch size 4. I'm not sure whether I'm doing something wrong or it really does need that much time to train from scratch.

I also want to ask whether you have tried other training strategies, like multi-stage training. For example, in the first stage train the model only for object detection, then in the second stage train only the triplet decoder and freeze the encoder and entity decoder (or update them with a low learning rate). That sounds more practical and should reduce the training time in theory.

Hi, the training results are lower than reported results.

I cloned the code and started training on 2 A40 GPUs, 8 images per GPU. The hyperparameters remain the same as the original ones. Did I miss something? :(

System:
pytorch 2.0.1 py3.10_cuda11.8_cudnn8.7.0_0 pytorch
pytorch-cuda 11.8 h7e8668a_5 pytorch
torchvision 0.15.2 py310_cu118 pytorch

======================sgdet============================
R@20: 0.203254
R@50: 0.250673
R@100: 0.271782

relationship: above
======================sgdet============================
R@20: 0.028179
R@50: 0.047680
R@100: 0.063172

relationship: across
======================sgdet============================
R@20: 0.015873
R@50: 0.015873
R@100: 0.047619

relationship: against
======================sgdet============================
R@20: 0.008065
R@50: 0.008065
R@100: 0.016129

relationship: along
======================sgdet============================
R@20: 0.000000
R@50: 0.000000
R@100: 0.000000

relationship: and
======================sgdet============================
R@20: 0.005917
R@50: 0.005917
R@100: 0.017751

relationship: at
======================sgdet============================
R@20: 0.111198
R@50: 0.163596
R@100: 0.176330

relationship: attached to
======================sgdet============================
R@20: 0.001179
R@50: 0.001474
R@100: 0.006191

relationship: behind
======================sgdet============================
R@20: 0.111431
R@50: 0.186552
R@100: 0.226916

relationship: belonging to
======================sgdet============================
R@20: 0.000000
R@50: 0.000000
R@100: 0.000000

relationship: between
======================sgdet============================
R@20: 0.000000
R@50: 0.000000
R@100: 0.003472

relationship: carrying
======================sgdet============================
R@20: 0.126705
R@50: 0.159264
R@100: 0.183682

relationship: covered in
======================sgdet============================
R@20: 0.016071
R@50: 0.018452
R@100: 0.025595

relationship: covering
======================sgdet============================
R@20: 0.000000
R@50: 0.000000
R@100: 0.012903

relationship: eating
======================sgdet============================
R@20: 0.059603
R@50: 0.094923
R@100: 0.118102

relationship: flying in
======================sgdet============================
R@20: 0.000000
R@50: 0.060606
R@100: 0.060606

relationship: for
======================sgdet============================
R@20: 0.022814
R@50: 0.035109
R@100: 0.039208

relationship: from
======================sgdet============================
R@20: 0.000000
R@50: 0.000000
R@100: 0.000000

relationship: growing on
======================sgdet============================
R@20: 0.000000
R@50: 0.000000
R@100: 0.000000

relationship: hanging from
======================sgdet============================
R@20: 0.000000
R@50: 0.003984
R@100: 0.003984

relationship: has
======================sgdet============================
R@20: 0.270150
R@50: 0.329975
R@100: 0.352970

relationship: holding
======================sgdet============================
R@20: 0.213111
R@50: 0.251752
R@100: 0.267716

relationship: in
======================sgdet============================
R@20: 0.098441
R@50: 0.143542
R@100: 0.169576

relationship: in front of
======================sgdet============================
R@20: 0.030335
R@50: 0.049338
R@100: 0.064679

relationship: laying on
======================sgdet============================
R@20: 0.081081
R@50: 0.121622
R@100: 0.130631

relationship: looking at
======================sgdet============================
R@20: 0.029451
R@50: 0.041499
R@100: 0.059572

relationship: lying on
======================sgdet============================
R@20: 0.040816
R@50: 0.051020
R@100: 0.051020

relationship: made of
======================sgdet============================
R@20: 0.000000
R@50: 0.000000
R@100: 0.000000

relationship: mounted on
======================sgdet============================
R@20: 0.000000
R@50: 0.000000
R@100: 0.000000

relationship: near
======================sgdet============================
R@20: 0.088555
R@50: 0.148588
R@100: 0.188174

relationship: of
======================sgdet============================
R@20: 0.188986
R@50: 0.256768
R@100: 0.279926

relationship: on
======================sgdet============================
R@20: 0.222539
R@50: 0.277110
R@100: 0.302019

relationship: on back of
======================sgdet============================
R@20: 0.000000
R@50: 0.000000
R@100: 0.000000

relationship: over
======================sgdet============================
R@20: 0.023573
R@50: 0.034739
R@100: 0.042184

relationship: painted on
======================sgdet============================
R@20: 0.000000
R@50: 0.000000
R@100: 0.000000

relationship: parked on
======================sgdet============================
R@20: 0.048555
R@50: 0.084791
R@100: 0.108387

relationship: part of
======================sgdet============================
R@20: 0.000000
R@50: 0.000000
R@100: 0.000000

relationship: playing
======================sgdet============================
R@20: 0.000000
R@50: 0.000000
R@100: 0.090909

relationship: riding
======================sgdet============================
R@20: 0.215077
R@50: 0.267793
R@100: 0.282938

relationship: says
======================sgdet============================
R@20: 0.000000
R@50: 0.083333
R@100: 0.083333

relationship: sitting on
======================sgdet============================
R@20: 0.098548
R@50: 0.150509
R@100: 0.164636

relationship: standing on
======================sgdet============================
R@20: 0.048238
R@50: 0.071938
R@100: 0.090096

relationship: to
======================sgdet============================
R@20: 0.000000
R@50: 0.000000
R@100: 0.000000

relationship: under
======================sgdet============================
R@20: 0.074767
R@50: 0.101710
R@100: 0.121233

relationship: using
======================sgdet============================
R@20: 0.114583
R@50: 0.140625
R@100: 0.170000

relationship: walking in
======================sgdet============================
R@20: 0.000000
R@50: 0.004695
R@100: 0.004695

relationship: walking on
======================sgdet============================
R@20: 0.038874
R@50: 0.089455
R@100: 0.111126

relationship: watching
======================sgdet============================
R@20: 0.008418
R@50: 0.061785
R@100: 0.083550

relationship: wearing
======================sgdet============================
R@20: 0.386910
R@50: 0.412767
R@100: 0.421221

relationship: wears
======================sgdet============================
R@20: 0.004430
R@50: 0.024502
R@100: 0.044608

relationship: with
======================sgdet============================
R@20: 0.029754
R@50: 0.070640
R@100: 0.093761

======================sgdet mean recall with constraint============================
mR@20: 0.05724454045111795
mR@50: 0.08143982200281324
mR@100: 0.09561244052384507
Averaged stats: class_error: 60.00 sub_error: 50.00 obj_error: 0.00 rel_error: 75.00 loss: 16.0778 (18.9691) loss_ce: 0.2776 (0.4303) loss_bbox: 0.8823 (0.9988) loss_giou: 1.0329 (1.0426) loss_rel: 0.4454 (0.6207) loss_ce_0: 0.3065 (0.4637) loss_bbox_0: 1.0056 (1.1189) loss_giou_0: 1.1239 (1.1814) loss_rel_0: 0.4299 (0.5930) loss_ce_1: 0.3023 (0.4488) loss_bbox_1: 0.9873 (1.0519) loss_giou_1: 1.0969 (1.1049) loss_rel_1: 0.3992 (0.5942) loss_ce_2: 0.2839 (0.4382) loss_bbox_2: 0.8772 (1.0255) loss_giou_2: 1.0365 (1.0784) loss_rel_2: 0.4104 (0.5954) loss_ce_3: 0.2801 (0.4330) loss_bbox_3: 0.7734 (1.0091) loss_giou_3: 1.0344 (1.0574) loss_rel_3: 0.4219 (0.5989) loss_ce_4: 0.2719 (0.4310) loss_bbox_4: 0.7968 (0.9987) loss_giou_4: 1.0301 (1.0459) loss_rel_4: 0.4093 (0.6088) loss_ce_unscaled: 0.2776 (0.4303) class_error_unscaled: 28.5714 (34.1161) sub_error_unscaled: 33.3333 (54.1432) obj_error_unscaled: 33.3333 (48.1875) loss_bbox_unscaled: 0.1765 (0.1998) loss_giou_unscaled: 0.5165 (0.5213) cardinality_error_unscaled: 7.0000 (7.0240) loss_rel_unscaled: 0.4454 (0.6207) rel_error_unscaled: 56.2500 (66.5290) loss_ce_0_unscaled: 0.3065 (0.4637) loss_bbox_0_unscaled: 0.2011 (0.2238) loss_giou_0_unscaled: 0.5619 (0.5907) cardinality_error_0_unscaled: 7.0000 (8.3696) loss_rel_0_unscaled: 0.4299 (0.5930) loss_ce_1_unscaled: 0.3023 (0.4488) loss_bbox_1_unscaled: 0.1975 (0.2104) loss_giou_1_unscaled: 0.5484 (0.5524) cardinality_error_1_unscaled: 8.0000 (8.0064) loss_rel_1_unscaled: 0.3992 (0.5942) loss_ce_2_unscaled: 0.2839 (0.4382) loss_bbox_2_unscaled: 0.1754 (0.2051) loss_giou_2_unscaled: 0.5182 (0.5392) cardinality_error_2_unscaled: 5.0000 (7.8084) loss_rel_2_unscaled: 0.4104 (0.5954) loss_ce_3_unscaled: 0.2801 (0.4330) loss_bbox_3_unscaled: 0.1547 (0.2018) loss_giou_3_unscaled: 0.5172 (0.5287) cardinality_error_3_unscaled: 6.0000 (7.5166) loss_rel_3_unscaled: 0.4219 (0.5989) loss_ce_4_unscaled: 0.2719 (0.4310) loss_bbox_4_unscaled: 0.1594 (0.1997) loss_giou_4_unscaled: 0.5150 (0.5229) cardinality_error_4_unscaled: 5.0000 (6.8948) loss_rel_4_unscaled: 0.4093 (0.6088)
Accumulating evaluation results...
DONE (t=127.17s).
IoU metric: bbox
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.132
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.262
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.115
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.032
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.086
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.183
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.210
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.330
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.337
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.136
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.272
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.403

dataset bbox

Hi, Yuren:
I want to know the format of annotation['bbox'] in the train/val/test.json files of Visual Genome (in COCO format): xyxy or xywh?
If it is xyxy, which two points of the bounding box do the two (x, y) pairs represent?
If it is xywh, which point does (x, y) represent, and where is the (0, 0) point of the picture?
Thanks!
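
For reference, a standard COCO-format bbox is [x, y, width, height] with (x, y) the top-left corner and the image origin (0, 0) at the top-left. Assuming this repository's JSON files follow that convention (worth confirming against the data README), the conversion to corner format is:

def xywh_to_xyxy(bbox):
    """Convert a COCO-style [x, y, w, h] box (top-left corner plus size,
    origin at the image's top-left) into [x1, y1, x2, y2] corners."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]

print(xywh_to_xyxy([10, 20, 30, 40]))  # -> [10, 20, 40, 60]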

Evaluation on Colab failing

Hello Authors,

As instructed, we have created a conda environment (python 3.6) to run inference and evaluation of RelTR. However, we are getting some unexpected errors and are not sure how to proceed.

Any feedback would be appreciated!

Error stack trace:

Not using distributed mode
git:
  sha: 4c9557165e8a8d9c90ca263aa9d2be82f70c1ace, status: has uncommited changes, branch: main

Namespace(ann_path='./data/vg/', aux_loss=True, backbone='resnet50', batch_size=1, bbox_loss_coef=5, clip_max_norm=0.1, dataset='vg', dec_layers=6, device='cuda', dilation=False, dim_feedforward=2048, dist_url='env://', distributed=False, dropout=0.1, enc_layers=6, eos_coef=0.1, epochs=150, eval=True, frozen_weights=None, giou_loss_coef=2, hidden_dim=256, img_folder='data/vg/images/', lr=0.0001, lr_backbone=1e-05, lr_drop=100, nheads=8, num_entities=100, num_triplets=200, num_workers=2, output_dir='', position_embedding='sine', pre_norm=False, rel_loss_coef=1, resume='ckpt/checkpoint0149.pth', return_interm_layers=False, seed=42, set_cost_bbox=5, set_cost_class=1, set_cost_giou=2, set_iou_threshold=0.7, start_epoch=0, weight_decay=0.0001, world_size=1)
number of params: 63679528
loading annotations into memory...
Done (t=2.56s)
creating index...
index created!
loading annotations into memory...
Done (t=1.17s)
creating index...
index created!
Traceback (most recent call last):
  File "main.py", line 239, in <module>
    main(args)
  File "main.py", line 171, in main
    checkpoint = torch.load(args.resume, map_location='cpu')
  File "/usr/local/lib/python3.6/site-packages/torch/serialization.py", line 585, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/usr/local/lib/python3.6/site-packages/torch/serialization.py", line 755, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, '<'.


AttributeError: 'Namespace' object has no attribute 'dataset'

Hi, I just followed the README usage and typed this command:
python inference.py --img_path demo/vg2.jpg --resume ckpt/checkpoint0149.pth
and I get this:

(scene_graph_benchmark) bash-4.2$ python inference.py --img_path demo/vg2.jpg --resume ckpt/checkpoint0149.pth
Namespace(aux_loss=True, backbone='resnet50', dec_layers=6, device='cuda', dilation=False, dim_feedforward=2048, dropout=0.1, enc_layers=6, hidden_dim=256, img_path='demo/vg2.jpg', lr_backbone=1e-05, nheads=8, num_entities=100, num_triplets=200, position_embedding='sine', pre_norm=False, resume='ckpt/checkpoint0149.pth', return_interm_layers=False)
yes
Traceback (most recent call last):
  File "inference.py", line 191, in <module>
    main(args)
  File "inference.py", line 104, in main
    model = build_model(args)
  File "/home/user/JL/myhome/juyterNotebook_folder/test/test_for_code/sgg_for_sgbEnv/reltr/RelTR-main/models/__init__.py", line 5, in build_model
    return build(args)
  File "/home/user/JL/myhome/juyterNotebook_folder/test/test_for_code/sgg_for_sgbEnv/reltr/RelTR-main/models/reltr.py", line 377, in build
    num_classes = 151 if args.dataset != 'oi' else None #TODO: openimage v6
AttributeError: 'Namespace' object has no attribute 'dataset'

RuntimeError: CUDA error: device-side assert triggered CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1. Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.

This happens when I want to train RelTR on Open Images V6 with a single GPU:
python main.py --dataset oi --img_folder /home/ybz/RelTR/data/oi/images/ --ann_path /home/ybz/RelTR/data/ --batch_size 1 --output_dir ckpt1
