
openpvsg's Introduction

Panoptic Video Scene Graph Generation


Jingkang Yang, Wenxuan Peng, Xiangtai Li,
Zujin Guo, Liangyu Chen, Bo Li, Zheng Ma,
Kaiyang Zhou, Wayne Zhang, Chen Change Loy, Ziwei Liu,
S-Lab, Nanyang Technological University & SenseTime Research


What is PVSG Task?

The Panoptic Video Scene Graph Generation (PVSG) Task aims to interpret a complex scene video with a dynamic scene graph representation, with each node in the scene graph grounded by its pixel-accurate segmentation mask tube in the video.

Given a video, PVSG models need to generate a dynamic (temporal) scene graph that is grounded by panoptic mask tubes.

The PVSG Dataset

We carefully collect 400 videos, each featuring dynamic scenes and rich in logical reasoning content. On average, these videos are 76.5 seconds long (5 FPS). The collection comprises 289 videos from VidOR, 55 videos from EpicKitchen, and 56 videos from Ego4D.

Please download the dataset via this link, and place the downloaded zip files as shown below.

├── assets
├── checkpoints
├── configs
├── data
├── data_zip
│   ├── Ego4D
│   │   ├── ego4d_masks.zip
│   │   └── ego4d_videos.zip
│   ├── EpicKitchen
│   │   ├── epic_kitchen_masks.zip
│   │   └── epic_kitchen_videos.zip
│   ├── VidOR
│   │   ├── vidor_masks.zip
│   │   └── vidor_videos.zip
│   └── pvsg.json
├── datasets
├── models
├── scripts
├── tools
├── utils
├── .gitignore
├── environment.yml
└── README.md

Please run unzip_and_extract.py to unzip the files and extract frames from the videos. If you unzip manually, make sure to use unzip -j xxx.zip to strip the junk paths. Your data directory should then look like this:

data
├── ego4d
│   ├── frames
│   ├── masks
│   └── videos
├── epic_kitchen
│   ├── frames
│   ├── masks
│   └── videos
├── vidor
│   ├── frames
│   ├── masks
│   └── videos
└── pvsg.json
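If unzip -j is unavailable on your system, the same junk-path stripping can be sketched with Python's standard zipfile module. This is a minimal sketch, not part of the repository's tooling; the example paths in the comment are assumptions matching the layout above.

```python
import os
import zipfile

def unzip_flat(zip_path, dest_dir):
    """Extract a zip archive while discarding its internal directory
    structure, mimicking `unzip -j`."""
    os.makedirs(dest_dir, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            # Keep only the basename so junk paths are dropped
            target = os.path.join(dest_dir, os.path.basename(info.filename))
            with zf.open(info) as src, open(target, "wb") as dst:
                dst.write(src.read())

# e.g. unzip_flat("data_zip/VidOR/vidor_masks.zip", "data/vidor/masks")
```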

We suggest playing with ./tools/Visualize_Dataset.ipynb to quickly get familiar with the PVSG dataset.

Get Started

To set up the environment, we use conda to manage our dependencies.

Our developers used CUDA 10.1 for their experiments.

You can specify the appropriate cudatoolkit version to install on your machine in the environment.yml file, and then run the following to create the conda environment:

conda env create -f environment.yml
conda activate openpvsg

You will also need to manually install the following dependencies:

# Install mmcv
pip install mmcv-full==1.4.0 -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.7.0/index.html
conda install -c conda-forge pycocotools
pip install mmdet==2.25.0

# already within environment.yml
pip install timm
python -m pip install scipy
pip install git+https://github.com/cocodataset/panopticapi.git

# for unitrack
pip install imageio==2.6.1
pip install lap==0.4.0
pip install cython_bbox==0.1.3

# for vps
pip install seaborn
pip install ftfy
pip install regex

# If you're using wandb for logging
pip install wandb
wandb login

Download the pretrained models for tracking if you are interested in the IPS+Tracking solution.

Training and Testing

IPS+Tracking & Relation Modeling

# Train IPS
sh scripts/train/train_ips.sh
# Tracking and save query features
sh scripts/utils/prepare_qf_ips.sh
# Prepare for relation modeling
sh scripts/utils/prepare_rel_set.sh
# Train relation models
sh scripts/train/train_relation.sh
# Test
sh scripts/test/test_relation_full.sh

VPS & Relation Modeling

# Train VPS
sh scripts/train/train_vps.sh
# Save query features
sh scripts/utils/prepare_qf_vps.sh
# Prepare for relation modeling
sh scripts/utils/prepare_rel_set.sh
# Train relation models
sh scripts/train/train_relation.sh
# Test
sh scripts/test/test_relation_full.sh

Model Zoo

Method          | M2F ckpt | vanilla | filter | conv | transformer
mask2former_ips | link     | link    | link   | link | link
mask2former_vps | link     | link    | link   | link | link

Citation

If you find our repository useful for your research, please consider citing our paper:

@inproceedings{yang2023pvsg,
    author = {Yang, Jingkang and Peng, Wenxuan and Li, Xiangtai and Guo, Zujin and Chen, Liangyu and Li, Bo and Ma, Zheng and Zhou, Kaiyang and Zhang, Wayne and Loy, Chen Change and Liu, Ziwei},
    title = {Panoptic Video Scene Graph Generation},
    booktitle = {CVPR},
    year = {2023},
}

openpvsg's People

Contributors

jingkang50, lilydaytoy


openpvsg's Issues

Why do I get out of memory error when I run test_ips.sh?

Hello Dr. Yang,
[>>>>>>>>>>>>>>>>>>>>>>> ] 22583/22609, 3.8 task/s, elapsed: 6005s, ETA: 7s
...
[>>>>>>>>>>>>>>>>>>>>>>>>] 22609/22609, 3.8 task/s, elapsed: 6019s, ETA: 0s
/data02/archive01/scv9240/pvsg/vidor/frames/1006_4580824633/0002.png is annotated as all background!
/data02/run01/scv9240/code/OpenPVSG-main/datasets/datasets/utils.py:81: RuntimeWarning: divide by zero encountered in scalar divide
iou = int_area / union
/data02/archive01/scv9240/pvsg/vidor/frames/1006_4580824633/0000.png is annotated as all background!
/data02/archive01/scv9240/pvsg/vidor/frames/1006_4580824633/0003.png is annotated as all background!
/data02/archive01/scv9240/pvsg/vidor/frames/1006_4580824633/0001.png is annotated as all background!
/var/spool/slurmd/job1129607/slurm_script: line 27: 38811 Killed python -u tools/test.py ${CONFIG} ${CHECKPOINT} --work-dir=${WORK_DIR} --eval PQ
slurmstepd: error: Detected 1 oom-kill event(s) in StepId=1129607.batch cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.

When I execute sbatch --gpus=4 ./scripts/test/test_ips.sh, I get this error. (I use sbatch because I run on a server of the Beijing Super-Computing Center.) I tried --gpus=1/2/3/4/8, but I got the same error. However, I found that the utilization of my GPUs was very low.


  1. Can you tell me why this happened? (My guess is that the code does not free some features and the program crashed partway through.)
  2. Can you tell me how to solve this problem?

Best regards!

Yours sincerely,

XY.Han

Missing key "subject_encoder" in the loaded_state_dicts from pretrained model "epoch_8.pth"

Hi,

I am running rel_test_full.py to do some testing, but I get a KeyError: "subject_encoder" when accessing loaded_state_dicts, which is loaded from the pretrained model epoch_8.pth. The variable save_work_dir is set to work_dirs/relation/rel_ips_transformer, which contains the model epoch_8.pth.

As far as I can tell, we are supposed to load the model from the epoch_8.pth under the folder "mask2former_r50_ips".

Can you please help me with this? Please point out any mistakes that I have made. Thanks


Division by zero

I get a division-by-zero error from the line mean_recall = sum(rel['hit'] / rel['total'] in rel_metrices.py. Upon further investigation, I found that relation_dict['relations'], which is read from relation.pickle, is empty. Could you please tell me how to solve this error? Your help will be highly appreciated.
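One defensive way to compute such a mean recall is to skip entries whose denominator is zero and return 0.0 when nothing is countable. This is a hypothetical sketch whose names mirror the snippet quoted above, not the actual rel_metrices.py source:

```python
def safe_mean_recall(relations):
    """Average hit/total over a list of relation dicts, skipping
    entries with total == 0 and returning 0.0 for an empty list."""
    recalls = [r['hit'] / r['total'] for r in relations if r['total'] > 0]
    return sum(recalls) / len(recalls) if recalls else 0.0
```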

No module named 'models.unitrack.data'

Hi! Thanks for your great work!
I have a question and need some help.

I get No module named 'models.unitrack.data' when I run the "Save query features" step (sh scripts/utils/prepare_qf_vps.sh). I want to know where these modules are:

from models.unitrack.data import query_feat_tracklet
from models.unitrack.data.query_feat_tracklet import QueryFeatTube
from models.unitrack.data.single_video import LoadOutputsFromMask2Former

TypeError: __init__() got an unexpected keyword argument 'video_name' (when running tools/prepare_query_tube_vps.py)

Hi,

I've been working with prepare_query_tube_vps.py for the past three days, attempting multiple installations to resolve an issue I keep encountering. Despite my efforts, I'm consistently faced with a TypeError that I can't seem to overcome.

The error message I receive is: TypeError: __init__() got an unexpected keyword argument 'video_name'.

Has anyone else experienced this problem or have any insights on how to resolve it? I'm looking for any guidance or suggestions on how to proceed.

Thank you in advance for your help!


IndexError: objects_id and instance_id - 1 do not match

Hello,

I'm encountering some issues while trying to reproduce your code. In a specific example, I found that the pixel values in an image exceed the instance_id - 1 value. Although there are pixel values reaching 38, the objects_id only goes up to 37. This discrepancy leads to an IndexError during training. Upon further investigation, I discovered that a total of 217 images exhibit similar errors. Could you please confirm if the dataset labels are accurate?

Your assistance in resolving this matter would be greatly appreciated.

Thank you!
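One quick way to scan a dataset for such frames is the sketch below, written under the assumption described in the issue: a mask pixel value k maps to objects index k - 1, with 0 as background. The function name is hypothetical, not from the repository.

```python
import numpy as np

def unmatched_instance_ids(mask, num_objects):
    """Return instance ids present in a mask that have no
    corresponding entry in the objects list."""
    ids = np.unique(mask)
    ids = ids[ids > 0]  # drop the background id 0
    return [int(i) for i in ids if int(i) - 1 >= num_objects]
```

Frames for which this returns a non-empty list would trigger the IndexError described above.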

LICENSE?

Cool work!

I am wondering what LICENSE is attached in your OpenPVSG work?
Thanks!

There is something wrong with your code when I execute test_relation_full.sh.

Hello Dr. Yang
Your original code test_relation_full.sh

# sh scripts/test/test_relation_full.sh
PARTITION=priority
JOB_NAME=psg
PORT=${PORT:-$((29500 + $RANDOM % 29))}
GPUS_PER_NODE=${GPUS_PER_NODE:-1}
CPUS_PER_TASK=${CPUS_PER_TASK:-5}

PYTHONPATH="/mnt/lustre/jkyang/CVPR23/openpvsg":$PYTHONPATH \
srun -p ${PARTITION} \
    --job-name=${JOB_NAME} \
    --gres=gpu:${GPUS_PER_NODE} \
    --ntasks-per-node=${GPUS_PER_NODE} \
    --cpus-per-task=${CPUS_PER_TASK} \
    --kill-on-bad-exit=1 \
    python tools/rel_test_full.py --launcher="slurm" ${PY_ARGS}

The code is incorrect, because tools/rel_test_full.py does not even have an args.launcher argument. Here is the error message:

+ srun -p priority --job-name=psg --gres=gpu:1 --ntasks-per-node=1 --cpus-per-task=5 --kill-on-bad-exit=1 python tools/rel_test_full.py --launcher=slurm
usage: rel_test_full.py [-h] [--work-dir WORK_DIR] [--model-pth MODEL_PTH]
rel_test_full.py: error: unrecognized arguments: --launcher=slurm
srun: error: g0008: task 0: Exited with exit code 2
srun: launch/slurm: _step_signal: Terminating StepId=1141480.0

your original code tools/rel_test_full.py

...
parser = argparse.ArgumentParser(description='prepare relation set')
parser.add_argument('--work-dir', help='vanilla, filter, conv, transformer')
parser.add_argument('--epoch-id', type=int, default='100')
args = parser.parse_args()
...

So, when we execute test_relation_full.sh, we should delete the --launcher="slurm" argument.

I wrote this issue to make anyone who runs this code aware of this bug.

Sincerely,
XY.Han

Annotation issues

Hello,

Thanks for the great work on putting this dataset together!
However, I noticed that some relations in the VidOR annotations, for example "squatting on", are not listed under the "relations" key at the top of the JSON. Was this done by accident, or did you decide to remove those relations since they have very few samples?

Also, I noticed some cases where the annotations refer to an object that is not annotated in the current frame. Is that also an accident, or expected for a reason?

Thanks!

Missing `ips_train_save_qf` Folder During `train_relation.sh` Script Execution

Hello!

I encountered an issue while running sh scripts/train/train_relation.sh. The error indicates that the ips_train_save_qf folder is missing. I have checked the file structure and found the ips_val_save_qf folder, which was created after running sh scripts/utils/prepare_rel_set.sh. However, I couldn't find the ips_train_save_qf folder. Could you please advise on how to resolve this issue? Is there a specific script or step to generate this folder?

Thank you for your assistance!

Distributed Training with PyTorch on Multi-GPU Setup

Hi there,

I've been following your work with great interest and appreciate all the effort you've put into it. I encountered an issue when running your code on a remote server with 8 V-100 GPUs under PyTorch. After switching the launcher to PyTorch, I ran into an "address already in use" error that seems to prevent multi-GPU utilization, restricting the process to a single GPU.

Is there any chance that a distributed training update compatible with PyTorch might be on the horizon? It would greatly benefit those of us working with similar hardware configurations.

Thanks for your continued contributions to the field!

How to do the relation inference with custom video?

Hi, I have followed your steps in stage 1 (IPS) and successfully got the segmentation and UniTrack results.

However, when I follow the instructions for preparing the relation model (# Prepare for relation modeling: sh scripts/utils/prepare_rel_set.sh), I found that it requires the ground-truth relations to generate relation.pickle. Without relation.pickle, the following steps for relation inference (# Test: sh scripts/test/test_relation_full.sh) cannot work.

I want to know how I can get relation predictions on my custom dataset without the GT relations (only: video.mp4 + segmentation results (frame masks) + UniTrack results (query_feats.pickle)).

About sh scripts/utils/prepare_qf_ips.sh question!

When I run sh scripts/utils/prepare_qf_ips.sh, only 62 files appear in the folder ips_val_save_qf.
Then, when I run sh scripts/train/train_relation.sh, I get [Errno 2] No such file or directory: '~/OpenPVSG-main/work_dirs/ips_train_save_qf/P18_02/relations.pickle'.
It seems some files needed for training are not in ips_val_save_qf.
I want to ask whether all the frames need tracking and saved query features, given that prepare_qf_ips.sh only generates 62 files.
Looking forward to your reply!
