
kern's Introduction

KERN

NB: Original README can be found below

News: I have merged the PR from @AU-Nebula that (hopefully) makes it easier to set up and run. Thanks, @AU-Nebula!

Install KERN for the first time

The current KERN implementation relies on CUDA 9.0 which, unfortunately, is an older version that does not run on more recent operating systems. Regardless of your operating system's support for CUDA 9.0, begin with the following steps:

  1. Clone the repository: git clone https://github.com/yuweihao/KERN.git.
  2. Change directory: cd KERN.
  3. Run: sh kern_setup.sh. There is quite a lot of data to download, so this step will take a while. The script assumes that Docker and the NVIDIA Container Toolkit are installed on the system.
  4. Activate the Conda environment: conda activate kern.
  5. Compile the CUDA part of the project: sh compile.sh.
  6. Generate knowledge matrices (statistical prior): python prior_matrices/generate_knowledge.py.
  7. Test the KERN pipeline by running the validation task on the VG dataset: python models/eval_rels.py -ckpt checkpoints/kern_sgdet.tar.
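
Before running the evaluation in step 7, it can help to confirm that the kern environment really sees a CUDA-enabled PyTorch build. A minimal sanity-check sketch (not part of the repository):

    # Hypothetical sanity check: confirm the Conda environment has a GPU-enabled
    # PyTorch build before running models/eval_rels.py.
    import torch

    print("PyTorch version:", torch.__version__)        # this codebase expects 0.3.x
    print("CUDA available:", torch.cuda.is_available())
    if torch.cuda.is_available():
        print("Device:", torch.cuda.get_device_name(0))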

Run KERN with custom images

  1. Run: docker run -it -v $PWD:/kern --gpus all cuda9.
  2. Place your custom dataset in /data/custom_images.
  3. Run: sh scripts/eval_kern_sgdet.sh.
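
If the evaluation script cannot find your images, a quick check that they are visible at the expected path can save time. A minimal sketch, using the /data/custom_images path from step 2 (the snippet itself is not part of the repository):

    # Hypothetical check: list the custom images the evaluation script should see.
    import os

    image_dir = "/data/custom_images"  # path taken from step 2 above
    images = [f for f in os.listdir(image_dir)
              if f.lower().endswith((".jpg", ".jpeg", ".png"))]
    print("Found", len(images), "custom images in", image_dir)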

Work in progress:

  • Create a visualization script for the custom dataset

For more details, see the Setup section of the original README below.

Knowledge-Embedded Routing Network for Scene Graph Generation

Tianshui Chen*, Weihao Yu*, Riquan Chen, and Liang Lin, “Knowledge-Embedded Routing Network for Scene Graph Generation”, CVPR, 2019. (* co-first authors) [PDF]

Note: a typo in the final CVPR version of our paper: h_{iC}^o in Eq. (6) should be corrected to f_{iC}^o.

This repository contains trained models and the PyTorch code for the above paper. If the paper significantly inspires you, we request that you cite our work:

Bibtex

@inproceedings{chen2019knowledge,
  title={Knowledge-Embedded Routing Network for Scene Graph Generation},
  author={Chen, Tianshui and Yu, Weihao and Chen, Riquan and Lin, Liang},
  booktitle={Conference on Computer Vision and Pattern Recognition},
  year={2019}
}

Setup

In our paper, the strong baseline for our model is SMN (Stacked Motif Networks) introduced by @rowanz et al. To compare the two models fairly, the PyTorch code of our model is based on @rowanz's neural-motifs code. Thanks to @rowanz for sharing his code with the research community.

  1. Install Python 3.6 and PyTorch 0.3. I recommend the Anaconda distribution. To install PyTorch if you haven't already, use conda install pytorch=0.3.0 torchvision=0.2.0 cuda90 -c pytorch. We use TensorBoard to monitor results on the validation dataset. If you want to use it with PyTorch, install TensorFlow and tensorboardX first (see the tensorboardX sketch after this list). If you don't want to use TensorBoard, simply omit the -tb_log_dir option.

  2. Update the config file with the dataset paths. Specifically:

    • Visual Genome (the VG_100K folder, image_data.json, VG-SGG.h5, and VG-SGG-dicts.json). See data/stanford_filtered/README.md for the steps to download these.
    • You'll also need to set your PYTHONPATH to the repository root, e.g. export PYTHONPATH=/home/yuweihao/exp/KERN
  3. Compile everything. Update the CUDA path in the Makefile and run make in the main directory; this compiles the bilinear interpolation operation for the RoIs.

  4. Pretrain VG detection. To compare our model with neural-motifs fairly, we use their pretrained VG detector. You can download the pretrained detector checkpoint provided by @rowanz, or run ./scripts/pretrain_detector.sh to train the detector yourself. Note: you might have to modify the learning rate and batch size according to the number and memory of the GPUs you have.

  5. Generate knowledge matrices: python prior_matrices/generate_knowledge.py, or download them from here: prior_matrices (Google Drive, OneDrive).

  6. Train our KERN model. There are three training phases. You need a GPU with 12 GB of memory.

    • Train VG relationship predicate classification: run CUDA_VISIBLE_DEVICES=YOUR_GPU_NUM ./scripts/train_kern_predcls.sh. This phase lasts about 20-30 epochs.
    • Train scene graph classification: run CUDA_VISIBLE_DEVICES=YOUR_GPU_NUM ./scripts/train_kern_sgcls.sh. Before running this script, modify the checkpoint path to point to the best checkpoint you trained in the predcls phase: -ckpt checkpoints/kern_predcls/vgrel-YOUR_BEST_EPOCH_RNUM.tar. This phase lasts about 8-13 epochs; you can then decrease the learning rate to 1e-6 to further improve performance. Like neural-motifs, we use a single trained checkpoint for both the predcls and sgcls tasks. You can also download our checkpoint here: kern_sgcls_predcls.tar (Google Drive, OneDrive).
    • Refine for detection: run CUDA_VISIBLE_DEVICES=YOUR_GPU_NUM ./scripts/train_kern_sgdet.sh, or download the checkpoint here: kern_sgdet.tar (Google Drive, OneDrive). If the validation performance plateaus, you can also decrease the learning rate to 1e-6 to improve performance.
  7. Evaluate: refer to the scripts CUDA_VISIBLE_DEVICES=YOUR_GPU_NUM ./scripts/eval_kern_[predcls/sgcls/sgdet].sh. You can find all our checkpoints, evaluation caches, and results in the folder KERN_Download (Google Drive, OneDrive).
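
As mentioned in step 1, validation results can be monitored with TensorBoard via tensorboardX. A minimal sketch of what such logging looks like (the log directory and values are illustrative, not the repository's exact code):

    # Hypothetical logging sketch with tensorboardX (pip install tensorflow tensorboardX).
    from tensorboardX import SummaryWriter

    writer = SummaryWriter(log_dir="summaries/kern_predcls")  # illustrative log dir
    for epoch, recall in enumerate([0.52, 0.58, 0.61]):       # placeholder values
        writer.add_scalar("val/R@50", recall, epoch)
    writer.close()

Point TensorBoard at the same directory (tensorboard --logdir summaries) to view the curves.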

Evaluation metrics

In the validation/test dataset, assume there are $Y$ images. For each image, the model generates the top $X$ predicted relationship triplets. For image $I_y$, there are $G_y$ ground-truth relationship triplets, of which $T_y$ are predicted successfully by the model. We can calculate:

$$\mathrm{R@}X = \frac{1}{Y}\sum_{y=1}^{Y}\frac{T_y}{G_y}$$

For image $I_y$, among its $G_y$ ground-truth relationship triplets, there are $G_{yk}$ ground-truth triplets with relationship $k$ (except $k=1$, which means no relationship; the number of relationship classes is $K$, including no relationship), of which $T_{yk}$ are predicted successfully by the model. Among the $Y$ images of the validation/test dataset, there are $Y_k$ images that contain at least one ground-truth triplet with relationship $k$. The R@X of relationship $k$ can be calculated as:

$$\mathrm{R@}X_k = \frac{1}{Y_k}\sum_{y=1}^{Y}\mathbb{1}\!\left[G_{yk}>0\right]\frac{T_{yk}}{G_{yk}}$$

Then we can calculate:

$$\mathrm{mR@}X = \frac{1}{K-1}\sum_{k=2}^{K}\mathrm{R@}X_k$$
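
The following sketch shows how these quantities combine into mR@X, assuming the per-image, per-relationship ground-truth and hit counts have already been computed (variable names are illustrative, not the repository's exact code; index 0 here stands for the "no relationship" class):

    # Minimal sketch of the R@X_k / mR@X computation described above.
    import numpy as np

    def mean_recall_at_x(gt_counts, hit_counts):
        # gt_counts[y][k] / hit_counts[y][k]: number of ground-truth / correctly
        # predicted triplets of relationship class k in image y (k = 0: no relationship).
        gt = np.asarray(gt_counts, dtype=float)
        hit = np.asarray(hit_counts, dtype=float)
        num_images, num_classes = gt.shape
        per_rel = []
        for k in range(1, num_classes):              # skip the "no relationship" class
            recalls = [hit[y, k] / gt[y, k]
                       for y in range(num_images) if gt[y, k] > 0]
            if recalls:                              # only the Y_k images containing class k
                per_rel.append(np.mean(recalls))     # this is R@X_k
        return float(np.mean(per_rel))               # mR@X: mean over relationship classes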

Some results

Figure 1. The distribution of different relationships on the VG dataset. The training and test splits share a similar distribution.


Figure 2. The R@50 without constraint of our method and the SMN on the predicate classification task on the VG dataset.


Figure 3. The absolute R@50 improvement of our method over the SMN for different relationships. The R@50 is computed without constraint.


Figure 4. The relation between the R@50 improvement and the sample proportion on the predicate classification task on the VG dataset. The R@50 is computed without constraint.


                        SGGen            SGCls            PredCls
              Method    mR@50   mR@100   mR@50   mR@100   mR@50   mR@100   Mean   Relative improvement
Constraint    SMN       5.3     6.1      7.1     7.6      13.3    14.4     9.0
              Ours      6.4     7.3      9.4     10.0     17.7    19.2     11.7   ↑ 30.0%
Unconstraint  SMN       9.3     12.9     15.4    20.6     27.5    37.9     20.6
              Ours      11.7    16.0     19.8    26.2     36.3    49.0     26.5   ↑ 28.6%

Table 1. Comparison of the mR@50 and mR@100 in % with and without constraint on the three tasks of the VG dataset.

Acknowledgement

Thanks to @rowanz for generously releasing his neural-motifs code.

Help

Feel free to open an issue if you encounter trouble getting it to work.

kern's Issues

ValueError("heck")

In object_detector.py:
if len(dets) == 0:
    print("nothing was detected", flush=True)
    return None
and in kern_model.py:
ValueError("heck")
Has anyone else run into this problem? Any suggestions?

An error while running the code

While training the pretrained VG detector with the ./scripts/pretrain_detector.sh command, there was an error: No module named 'dataloaders.mscoco'. What should I do?

About the training phases

In step 5,
"Train scene graph classification: run CUDA_VISIBLE_DEVICES=YOUR_GPU_NUM ./scripts/train_kern_predcls.sh."

Is this a mistake? Should it be ./scripts/train_kern_sgcls.sh?

Understanding the equations in the paper.

@yuweihao I was reading your paper (KERN) and wanted to make sure there is no mistake in equation 6. You explain it as: all correlated output feature vectors are aggregated to predict the class label, but you have also used the hidden state of the last class, i.e. h_iC, so I am confused. Could you please clarify?

Thanks.

cannot import name 'calculate_mR_from_evaluator_list'

Hello, I encountered the following problem while reproducing the code. How can I solve it? Thank you.
Traceback (most recent call last):
File "models/train_rels.py", line 17, in
from lib.evaluation.sg_eval import BasicSceneGraphEvaluator, calculate_mR_from_evaluator_list, eval_entry
ImportError: cannot import name 'calculate_mR_from_evaluator_list'

Problem on running code

I installed the exact versions of PyTorch as required and ran into the same problem as in rowanz/neural-motifs#2. However, after changing the make options in the CUDA files (roi_align and nms) to /usr/local/cuda/bin/nvcc -c -o file.cu.o file.cu --compiler-options -fPIC -gencode arch=compute_35,code=sm_35 (my GPU is a Tesla K40c; I think its compute capability is 3.5), I still got the same error. Do you have any idea how I can fix it?

Means of rm and od

Thank you for your excellent code.
I want to know the exact meaning of the rm and od prefixes, for example in
return Result(
    od_obj_dists=od_obj_dists,
    rm_obj_dists=obj_dists,
    obj_scores=nms_scores,
    obj_preds=nms_preds,
    obj_fmap=obj_fmap,
    od_box_deltas=od_box_deltas,
    rm_box_deltas=box_deltas,
    od_box_targets=bbox_targets,
    rm_box_targets=bbox_targets,
    od_box_priors=od_box_priors,
    rm_box_priors=box_priors,
    boxes_assigned=nms_boxes_assign,
    boxes_all=nms_boxes,
    od_obj_labels=obj_labels,
    rm_obj_labels=rm_obj_labels,
    rpn_scores=rpn_scores,
    rpn_box_deltas=rpn_box_deltas,
    rel_labels=rel_labels,
    im_inds=im_inds,
    fmap=fmap if return_fmap else None,
)
in lib/object_detector.py.
Also, why do I always get "RuntimeError: cuda runtime error (2) : out of memory at /pytorch/aten/src/THC/generic/THCStorage.cu:58", even though I have changed batch_size and num_workers to 1 and my GPU has 16 GB of memory?
Waiting for your reply; thank you once again.

Result

@yuweihao I want to know whether the final result of this code is the scene graph of the picture. Can this scene graph be visualized? Can I input a picture and get the corresponding scene graph as output? Thank you.

pretrain_detector

Hello, I ran into the following error when pretraining the detector on VG:
from dataloaders.mscoco import CocoDetection, CocoDataLoader

ModuleNotFoundError: No module named 'dataloaders'

Some questions about the input

Hi @yuweihao, sorry that I haven't read the code yet, but when reading the paper I had some questions about the input of the graph nodes.

  1. After the detector produces the object bounding boxes, do you use only the RoI-pooling features as the input of the graph nodes? Or did you also train a feature extraction network to extract box features and concatenate them with the RoI-pooling features?

I am confused about this and not sure how you construct the input. Could you give me some advice? Thanks very much!

Missing evaluation functions and typos in README

Two functions, calculate_mR_from_evaluator_list and eval_entry, are missing: when I executed eval_rels.py I got an ImportError on this line. Could you please check this for me?

Btw, for step 5 in the Setup section of README.md, ./scripts/refine_for_detection.sh is not in your directory and seems to exist only in the original neural-motifs repo. I guess you have changed it to train_kern_sgdet.sh?

Torch Version 0.4.1

I am getting so many errors at different points for different torch versions (0.3.0, 0.4.0 or >1.0). Just wanted to share:

pip install torch==0.4.1 torchfile==0.1.0 torchvision==0.2.0

This solved the pytorch version issues.

cudaCheckError() failed : no kernel image is available for execution on the device

I am getting the following error trace when I execute CUDA_VISIBLE_DEVICES=0 ./scripts/eval_kern_predcls.sh

save_rel_recall : results/kern_rel_recall_predcls.pkl
Unexpected key ggnn_obj_reason.obj_proj.weight in state_dict with size torch.Size([512, 4096])
Unexpected key ggnn_obj_reason.obj_proj.bias in state_dict with size torch.Size([512])
Unexpected key ggnn_obj_reason.ggnn_obj.fc_eq3_w.weight in state_dict with size torch.Size([512, 1024])
Unexpected key ggnn_obj_reason.ggnn_obj.fc_eq3_w.bias in state_dict with size torch.Size([512])
Unexpected key ggnn_obj_reason.ggnn_obj.fc_eq3_u.weight in state_dict with size torch.Size([512, 512])
Unexpected key ggnn_obj_reason.ggnn_obj.fc_eq3_u.bias in state_dict with size torch.Size([512])
Unexpected key ggnn_obj_reason.ggnn_obj.fc_eq4_w.weight in state_dict with size torch.Size([512, 1024])
Unexpected key ggnn_obj_reason.ggnn_obj.fc_eq4_w.bias in state_dict with size torch.Size([512])
Unexpected key ggnn_obj_reason.ggnn_obj.fc_eq4_u.weight in state_dict with size torch.Size([512, 512])
Unexpected key ggnn_obj_reason.ggnn_obj.fc_eq4_u.bias in state_dict with size torch.Size([512])
Unexpected key ggnn_obj_reason.ggnn_obj.fc_eq5_w.weight in state_dict with size torch.Size([512, 1024])
Unexpected key ggnn_obj_reason.ggnn_obj.fc_eq5_w.bias in state_dict with size torch.Size([512])
Unexpected key ggnn_obj_reason.ggnn_obj.fc_eq5_u.weight in state_dict with size torch.Size([512, 512])
Unexpected key ggnn_obj_reason.ggnn_obj.fc_eq5_u.bias in state_dict with size torch.Size([512])
Unexpected key ggnn_obj_reason.ggnn_obj.fc_output.weight in state_dict with size torch.Size([512, 1024])
Unexpected key ggnn_obj_reason.ggnn_obj.fc_output.bias in state_dict with size torch.Size([512])
Unexpected key ggnn_obj_reason.ggnn_obj.fc_obj_cls.weight in state_dict with size torch.Size([151, 77312])
Unexpected key ggnn_obj_reason.ggnn_obj.fc_obj_cls.bias in state_dict with size torch.Size([151])
0%| | 0/26446 [00:00<?, ?it/s]
cudaCheckError() failed : no kernel image is available for execution on the device

Couldn't find union_boxes.conv.2.num_batches_tracked,union_boxes.conv.6.num_batches_tracked

Hello, Thanks for this awesome repo. While running ./scripts/eval_kern_sgdet.sh, I am getting the following error:

  • We couldn't find union_boxes.conv.2.num_batches_tracked,union_boxes.conv.6.num_batches_tracked
    0%| | 0/26446 [00:00<?, ?it/s]/home/riro/bibek_repo/KERN/dataloaders/blob.py:129: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
    self.imgs = Variable(torch.stack(self.imgs, 0), volatile=self.volatile)
    /home/riro/bibek_repo/KERN/dataloaders/blob.py:120: UserWarning: volatile was removed and now has no effect. Use with torch.no_grad(): instead.
    return Variable(tensor(np.concatenate(datom, 0)), volatile=self.volatile), chunk_sizes
    THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1535491974311/work/aten/src/THC/THCGeneral.cpp line=663 error=8 : invalid device function
    0%| | 0/26446 [00:00<?, ?it/s]
    Traceback (most recent call last):
    File "models/eval_rels.py", line 114, in
    val_batch(conf.num_gpus*val_b, batch, evaluator, evaluator_multiple_preds, evaluator_list, evaluator_multiple_preds_list)
    File "models/eval_rels.py", line 55, in val_batch
    det_res = detector[b]
    File "/home/riro/bibek_repo/KERN/lib/kern_model.py", line 423, in getitem
    return self(*batch[0])
    File "/home/riro/anaconda3/envs/kern/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in call
    result = self.forward(*input, **kwargs)
    File "/home/riro/bibek_repo/KERN/lib/kern_model.py", line 355, in forward
    train_anchor_inds, return_fmap=True)
    File "/home/riro/anaconda3/envs/kern/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in call
    result = self.forward(*input, **kwargs)
    File "/home/riro/bibek_repo/KERN/lib/object_detector.py", line 293, in forward
    fmap = self.feature_map(x)
    File "/home/riro/bibek_repo/KERN/lib/object_detector.py", line 119, in feature_map
    return self.features(x) # Uncomment this for "stanford" setting in which it's frozen: .detach()
    File "/home/riro/anaconda3/envs/kern/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in call
    result = self.forward(*input, **kwargs)
    File "/home/riro/anaconda3/envs/kern/lib/python3.6/site-packages/torch/nn/modules/container.py", line 91, in forward
    input = module(input)
    File "/home/riro/anaconda3/envs/kern/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in call
    result = self.forward(*input, **kwargs)
    File "/home/riro/anaconda3/envs/kern/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 301, in forward
    self.padding, self.dilation, self.groups)
    RuntimeError: CuDNN error: CUDNN_STATUS_EXECUTION_FAILED

Any help on how to solve this issue?
Thank you!

facing an issue in training

ImportError: /home/KERN/lib/fpn/nms/_ext/nms/_nms.so: undefined symbol: __cudaPopCallConfiguration

Please give a suggestion on how to resolve this issue.

Visualization

Hi there,

Is there any code to visualize the detections and your model's results?

Thanks!

Test on image not in Visual Genome

Hello,

Thank you for your very useful code! If I want to generate the scene graph for an image that is not in the Visual Genome dataset, I believe I have to wrap it in a Blob object (dataloaders.blob.Blob) and use it in sgdet mode; is that correct? Also, if I have to use visualize_sgcls, it looks like I have to add ground-truth information to the class info. I am assuming I can set the ground truth to null (is this assumption correct?). Do you have any pointers for getting a novel image into a Blob object (if that is the best way of doing it)?

Thank you very much for any help!

Question regarding training time

Thank you so much for releasing this repository; it looks awesome!
Quick question: how much time does it take to train the graph classification/detection, say per epoch?
Thanks!

Question about generate_knowledge.py

Hi,
line 32 of generate_knowledge.py reads mat[gt_classes[i], gt_classes[j]] += 1.
Should it be mat[gt_classes_list[i], gt_classes_list[j]] += 1, since there are repeated labels in gt_classes?
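
For context, the statistical prior produced by prior_matrices/generate_knowledge.py is essentially a set of co-occurrence counts gathered from the training annotations. A simplified sketch of that kind of prior (not the repository's exact code; names are illustrative):

    # Simplified sketch: count how often each predicate links each ordered pair of
    # object classes, then normalize so each (subject, object) pair gives a distribution.
    import numpy as np

    def build_prior(triplets, num_obj_classes, num_rel_classes):
        # triplets: iterable of (subj_class, obj_class, predicate) integer tuples.
        counts = np.zeros((num_obj_classes, num_obj_classes, num_rel_classes))
        for subj, obj, pred in triplets:
            counts[subj, obj, pred] += 1
        totals = counts.sum(axis=2, keepdims=True)
        return counts / np.maximum(totals, 1)        # unseen pairs stay all-zero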
