alvin-zeng / pgcn

319.0 8.0 66.0 152.32 MB

Graph Convolutional Networks for Temporal Action Localization (ICCV2019)

Python 0.71% Shell 0.01% Jupyter Notebook 99.29%

pgcn's Issues

Question about proposal generation

Hi, thanks for sharing the code. I'd like to know where the pre-extracted proposals come from. Did you reimplement the Boundary Sensitive Network paper or just use the proposals its authors provide?

ActivityNet feature.

Hello, Alvin!

Thank you very much for sharing the code and feature!

However, when using the ActivityNet features, I ran into a problem: the numbers of directories for RGB and Flow frames are different.

Why are the numbers of directories for ActivityNet RGB (13811) and Flow (14938) frames different?
Does it mean that we can only use the intersection?
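
If the intersection is what ends up being used, a minimal sketch for finding it could look like the following. The function name and the video ids are hypothetical and only illustrate the idea; nothing here is taken from the repository.

```python
def split_by_modality(rgb_ids, flow_ids):
    """Group video ids by which modality folders exist for them."""
    rgb, flow = set(rgb_ids), set(flow_ids)
    return {
        "both": sorted(rgb & flow),       # usable for two-stream experiments
        "rgb_only": sorted(rgb - flow),   # missing a Flow folder
        "flow_only": sorted(flow - rgb),  # missing an RGB folder
    }

# Hypothetical ids; in practice these could come from os.listdir() on each feature root.
report = split_by_modality(["v_a", "v_b", "v_c"], ["v_b", "v_c", "v_d"])
print(report["both"])
```

The two "only" lists make it easy to report exactly which videos are missing from which archive.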

Performance on Activitynet

Thanks for releasing your code and features!

I ran the training with the provided default settings on the ActivityNet dataset and the I3D flow features, but obtained an mAP far below the reported numbers (10.63 mAP).

Are the features for ActivityNet 1.3 available for download?

Pre-trained model on THUMOS'14 for testing purpose

Hi, thanks for your work!

Can you please share your pre-trained model on THUMOS'14 dataset? Thank you.

Has anyone reproduced the results of the paper?

question about the I3D feature

In your paper, it says "We first uniformly divide each input video into 64-frame segments. We then use a two-stream Inflated 3D ConvNet (I3D) model pre-trained on Kinetics [5] to extract the segment features."
However, in your code

interval = 8
clip_length = 64
start_unit = int(min(ft_num - 1, np.floor(float(start_ind + off) / interval)))
end_unit = int(min(ft_num - 2, np.ceil(float(end_ind - clip_length) / interval)))

I guess subtracting 64 means you do not use the last few frames that are not divisible by 64, but why is interval = 8?
Does it mean that you divide each input video into 8-frame segments?

By the way, could you provide the I3D features for ActivityNet? They are very time-consuming to extract.
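
One reading that would make both constants consistent is a sliding window: each feature covers 64 frames and consecutive windows start 8 frames apart. That is an assumption, not something the quoted snippet confirms. Under it, the arithmetic can be sketched as below; `num_windows` and `window_span` are hypothetical helpers, and `frames_to_units` simply mirrors the two quoted lines.

```python
import math

INTERVAL = 8      # frames between consecutive window starts (value from the quoted snippet)
CLIP_LENGTH = 64  # frames covered by one window (value from the quoted snippet)

def num_windows(num_frames):
    """Number of full windows, ASSUMING sliding-window extraction."""
    if num_frames < CLIP_LENGTH:
        return 0
    return (num_frames - CLIP_LENGTH) // INTERVAL + 1

def window_span(k):
    """Frame range [start, end) covered by window k under the same assumption."""
    start = k * INTERVAL
    return start, start + CLIP_LENGTH

def frames_to_units(start_ind, end_ind, ft_num, off=0):
    """Mirror of the two quoted lines: map a frame range to feature indices."""
    start_unit = int(min(ft_num - 1, math.floor(float(start_ind + off) / INTERVAL)))
    end_unit = int(min(ft_num - 2, math.ceil(float(end_ind - CLIP_LENGTH) / INTERVAL)))
    return start_unit, end_unit

ft_num = num_windows(200)
print(ft_num, window_span(ft_num - 1), frames_to_units(80, 200, ft_num))
```

With this reading, subtracting `clip_length` from `end_ind` picks the last window that still ends inside the proposal, rather than dropping trailing frames.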

adj_num

Hello, can you tell me why adj_num is 21 in dataset_cfg.yaml?

training error, please help

fg, incomp, bg = self.prop_dict[video.id][0], self.prop_dict[video.id][1], self.prop_dict[video.id][2]

KeyError: 'video_validation_0000187'

Train Data Corrupted for I3D ActivityNet Feature

Hi, thanks for uploading the I3D features for ANet.

However, I found that the zipped RGB features (on Google Cloud) are corrupted. Can you double-check the current version or upload corrected features? There are too many files in the I3D ANet RGB features link.

Thanks in advance

Model for ActivityNet

Thanks for the brilliant work!

Are you going to release the trained model on ActivityNet (like on THUMOS)?

If yes: Great! Fine-tuning on ActivityNet is hard. I believe more people will star the repo while waiting for the update.
If no: Can you keep this issue open? Someone will probably do it.

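How is the pre_box in bsn_train_proposal_list.txt obtained?

Hello, I would like to ask how the pre_box entries in bsn_train_proposal_list.txt are obtained, especially the label that goes with each box. If I want to train on my own data, how do I get the labels and the pre_box for that data?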

How to fuse rgb and flow features.

Hi, Alvin. Thanks for your excellent work, but I am confused about how you fused the flow and RGB features of the I3D model. Is it a simple average? Most I3D models generate 1024-dimensional features for each stream, but the fused features in your work have 2048 dimensions.
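
For reference, the dimension count in the question can be checked with a minimal sketch. It only shows that concatenating two 1024-dimensional vectors gives 2048 dimensions while averaging keeps 1024; whether the repository actually concatenates is not confirmed here, and both function names are hypothetical.

```python
def fuse_concat(rgb_feat, flow_feat):
    """Stack the two streams end to end: output length is the sum of both lengths."""
    return list(rgb_feat) + list(flow_feat)

def fuse_average(rgb_feat, flow_feat):
    """Element-wise mean: output length equals the input length."""
    return [(r + f) / 2 for r, f in zip(rgb_feat, flow_feat)]

# Dummy 1024-dimensional vectors standing in for one segment's I3D features.
rgb = [0.0] * 1024
flow = [1.0] * 1024

print(len(fuse_concat(rgb, flow)), len(fuse_average(rgb, flow)))
```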

Training error

Dear author, whenever I run pgcn_train.py, the error below pops up. Would you please help me check what the problem is? Thanks a lot.

File "/media/ActionRecognition/PGCN/pgcn_dataset.py", line 355, in _video_centric_sampling
print("self.prop_dict[video.id][0]",self.prop_dict[video.id][0])
KeyError: 'video_validation_0000187'

The number of proposals does not match the number of proposal lines in the BSN test proposal files

Hi, according to https://github.com/yjxiong/action-detection/wiki/A-Description-of-the-Proposal-Files, the number of proposal lines should be equal to the last line of the description. For example, the first video, 'video_test_0000896', should have 3936 proposals, but there are only 800 proposals in the prop file. How do you choose the 800 proposals?

How to get the thumos14 proposal pkl file in the data directory

Hi all,
how do I get the thumos14 proposal pkl file in data?

An error when loading the .pkl in data/ with pickle

Hello, an error occurred when loading ./data/thumos14_train_prop_dict.pkl

UnpicklingError                           Traceback (most recent call last)
----> 1 data = pickle.load(open('/home/ld/tac/PGCN/data/thumos14_train_prop_dict.pkl', 'rb'))
UnpicklingError: pickle data was truncated
Could you check the .pkl file?
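
"pickle data was truncated" usually means the file on disk is shorter than what was originally written, for example after an interrupted download or an interrupted dump. A minimal sketch for checking a file before training is shown below; `try_load_pickle` is a hypothetical helper, not part of the repository.

```python
import pickle

def try_load_pickle(path):
    """Return (obj, None) on success, or (None, message) if the file is incomplete."""
    try:
        with open(path, "rb") as f:  # the context manager closes the handle either way
            return pickle.load(f), None
    except (pickle.UnpicklingError, EOFError) as exc:
        # Truncated files raise UnpicklingError; empty files raise EOFError.
        return None, f"{type(exc).__name__}: {exc}"
```

If the helper reports an error, downloading the file again, or deleting it so that it can be rebuilt, is the usual next step. Whether the repository rebuilds this particular file automatically is an assumption worth verifying.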

Does this alg work for long action proposals?

Hi, thanks for open-sourcing your code.
I have a question about the range of proposal lengths for which this algorithm works well. For example, some videos in the ActivityNet dataset have long action proposals. Have you ever measured the results on the long action proposals?

Why not use the official toolkit for evaluation on THUMOS

Thanks for your code. There is an official toolkit on the THUMOS website. Why not use it to evaluate the performance on THUMOS?

Anet RGB feature files' name

Hi, I found that the file names of the ANet RGB features you released on Google Drive don't match the vid in anet1.3_bsn600_validation_proposal_list. Could you tell me how to use these features properly?

What's more, I noticed that the flow model performs much better than the RGB model. Could you explain the reason? I would expect the RGB features to contain more information for classification. Is that not the case?

How do you extract the flow feature?

code understanding problem

Hi, I have recently been reading your excellent code.
When I read the sample_indices(start, end, num_seg) function in I3D_Pooling.py, I found that the valid_length > num_seg condition can never be true. Is it a bug?

# imports inferred from the calls below
import numpy as np
from numpy.random import randint

def sample_indices(start, end, num_seg):
    """
    :param start: start index of the range
    :param end: end index of the range
    :param num_seg: number of segments to sample
    :return: (offsets, average_duration)
    """
    valid_length = end - start
    average_duration = (valid_length + 1) // num_seg
    if average_duration > 0:
        # normal cases
        offsets = np.multiply(list(range(num_seg)), average_duration)

    # TODO: here is a bug?
    elif valid_length > num_seg:
        offsets = np.sort(randint(valid_length, size=num_seg))
    else:
        offsets = np.zeros((num_seg,))

    return offsets, average_duration
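
The observation can be checked by brute force. The sketch below copies only the branching logic of the function above into a hypothetical helper, `branch_taken`, and records which branch runs for a range of inputs. The `elif` is reached only when `(valid_length + 1) // num_seg == 0`, which means `valid_length + 1 < num_seg`, so `valid_length > num_seg` cannot hold at that point.

```python
def branch_taken(valid_length, num_seg):
    """Same conditions as sample_indices, returning a label instead of offsets."""
    average_duration = (valid_length + 1) // num_seg
    if average_duration > 0:
        return "normal"
    elif valid_length > num_seg:
        return "random"
    return "zeros"

# Try every combination of non-negative length and positive segment count in a small grid.
reached = {branch_taken(v, n) for n in range(1, 50) for v in range(0, 200)}
print(sorted(reached))  # ['normal', 'zeros']
```

The "random" label never appears, so that branch is dead code for non-negative lengths. Which condition was originally intended is for the authors to say.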

Some error during training, please help

pickle.dump([self.act_iou_dict, self.act_dis_dict, self.prop_dict], open(self.prop_dict_path, "wb"))
MemoryError

Is RGB model saved with float datatype?

Thanks for the brilliant work!

I happen to see an error when the RGB model is directly loaded into the PGCN architecture.
The reason seems to be a mismatch of the datatype.

To solve that, I replaced one line of code in pgcn_test.py
reg_scores[prop_idx, :] = net((act_batch_var, comp_batch_var), None, None, None)
by
reg_scores[prop_idx, :] = net((act_batch_var.float(), comp_batch_var.float()), None, None, None)

Do it first if you find a similar error :)

custom videos

Thanks for your code. Is it possible to train and test on custom videos?

How to run this model on a new dataset

I would like to test your model on a new TAL dataset collected by our laboratory. Hence we want to know how should we prepare the dataset directory and ground truth files. Any suggestions will be very helpful!

How to predict a single unlabeled video?

Dear author, I have trained the PGCN model on my own dataset, and now I need to make a prediction on a video that is in neither the training set nor the test set. I see that the code needs to generate the corresponding proposals, which requires the corresponding GT information, but the videos I'm testing now don't have label files. Could you tell me how to do it? Thanks a lot.

No such file or directory: 'data/train\\video_validation_0000187'

When I am training, there is an error: No such file or directory: 'data/train\video_validation_0000187'. Do we need to unzip the features?

A detail question about the model

Hey! Great Job!
After reading through your paper, I have one question. I was wondering how you process the output feature of the GCN model (N×d) before the fc layer, because the two-dimensional output has to be reduced to a single vector.
I looked at the code and found that you just pick the first row of the N features. Did I understand it correctly? If so, could you please explain why you do that? Why not average-pool over the N features? Thanks a lot!
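
To make the two options in the question concrete, here is a minimal sketch on a toy N×d matrix. Both function names are hypothetical. One possible reading, which is an assumption and not confirmed by the authors, is that the first row belongs to the proposal being classified and the remaining rows are its sampled neighbours, in which case taking row 0 keeps the proposal's own updated feature.

```python
def first_row(features):
    """Keep only row 0 of an N x d feature matrix."""
    return list(features[0])

def mean_pool(features):
    """Average over the N rows, giving one d-dimensional vector."""
    n = len(features)
    return [sum(column) / n for column in zip(*features)]

# Toy 3 x 2 matrix: row 0 stands for the target proposal (assumption), rows 1-2 for neighbours.
feats = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(first_row(feats), mean_pool(feats))
```

Both reductions produce a d-dimensional vector that an fc layer can consume. They differ in how much weight the neighbours get in the final feature.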

the proposal_list for ActivityNet

Hello, have you got the I3D features or the proposal_list for ActivityNet? I'm also working on the ActivityNet dataset. Thank you!
My email is [email protected]

how to generate bsn_proposal_list.txt

Thank you for your great work.
I have a question: if I want to apply your work to my own dataset, how can I generate the bsn_proposal_list files? Do I need to train the BSN proposal generator on my own dataset?

thank you so much

Question about Pre-trained Model.

Sorry to disturb you. May I ask where to get the pre-trained model for THUMOS'14?

video_validation_0000947 is missing in Flow_Train.tar.gz

Hi,
I downloaded Flow_Train.tar.gz via Google Cloud and found that video_validation_0000947 is missing from Flow_Train.tar.gz, while it exists in Rgb_Train_feature.zip.

Could you reupload the file?

Thank you so much.

About incomplete_overlap_thresh

Hi, the incomplete_overlap_thresh in current_configs.yaml is set to be 0.01, however, I think it should be around 0.7?

The performance of the best model is lower than the results in the paper?

Thanks for your excellent work.
I trained the model that you provided and found that the performance of the best model (at epoch 15) is:
+------------+--------+--------+--------+--------+--------+--------+--------+--------+--------+---------+
| IoU thresh | 0.10   | 0.20   | 0.30   | 0.40   | 0.50   | 0.60   | 0.70   | 0.80   | 0.90   | Average |
+------------+--------+--------+--------+--------+--------+--------+--------+--------+--------+---------+
| mean AP    | 0.6574 | 0.6382 | 0.6009 | 0.5374 | 0.4578 | 0.3369 | 0.2172 | 0.0903 | 0.0134 | 0.3944  |
+------------+--------+--------+--------+--------+--------+--------+--------+--------+--------+---------+
This is lower than the results in the paper. Could you provide the pretrained model or explain why this happened?
Thank you
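
As a sanity check on the table, the "Average" column is consistent with a plain mean of the nine per-threshold values. The sketch below only reuses the numbers reported above; the variable names are illustrative.

```python
thresholds = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90]
mean_ap = [0.6574, 0.6382, 0.6009, 0.5374, 0.4578, 0.3369, 0.2172, 0.0903, 0.0134]

# Pair each IoU threshold with its mAP so single entries can be looked up.
by_threshold = dict(zip(thresholds, mean_ap))

average = sum(mean_ap) / len(mean_ap)
print(round(average, 4))   # 0.3944
print(by_threshold[0.50])  # 0.4578
```

When comparing against a paper, it helps to check whether the paper reports the value at a single IoU threshold or the average over thresholds, since the two differ considerably here.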

Could you please explain the meaning of some parameters

Thanks for your great work. There are some parameters that I don't understand.

PGCN/pgcn_models.py

Line 157 in 46428da

for stage_cnt in range(self.child_num + 1):

Could you please explain the meaning of the parameters self.adj_num and self.child_num?
Also, I can't find how the contextual edges and surrounding edges described in your paper are selected.
I'm confused about how to find the edges or the mask in this part. Could you please help me? Thanks.

RuntimeError: cuda runtime error (10)

Hi,there!
When running pgcn_test.py for inference, I encountered a CUDA error. Here is my stack trace:

model epoch 15 loss: 1.4765163376217796
File parsed. Time:4.10
Dict constructed. Time:4.39
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1549630534704/work/torch/csrc/cuda/Module.cpp line=34 error=10 : invalid device ordinal
Process SpawnProcess-2:
Traceback (most recent call last):
  File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/ubuntu/users/z/PGCN/pgcn_test.py", line 116, in runner_func
    torch.cuda.set_device(gpu_id)
  File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/site-packages/torch/cuda/__init__.py", line 264, in set_device
    torch._C._cuda_setDevice(device)
RuntimeError: cuda runtime error (10) : invalid device ordinal at /opt/conda/conda-bld/pytorch_1549630534704/work/torch/csrc/cuda/Module.cpp:34
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1549630534704/work/torch/csrc/cuda/Module.cpp line=34 error=10 : invalid device ordinal
Process SpawnProcess-3:
Traceback (most recent call last):
  File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/ubuntu/users/z/PGCN/pgcn_test.py", line 116, in runner_func
    torch.cuda.set_device(gpu_id)
  File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/site-packages/torch/cuda/__init__.py", line 264, in set_device
    torch._C._cuda_setDevice(device)
RuntimeError: cuda runtime error (10) : invalid device ordinal at /opt/conda/conda-bld/pytorch_1549630534704/work/torch/csrc/cuda/Module.cpp:34
  0%|                                                   | 0/210 [00:00<?, ?it/s]THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1549630534704/work/torch/csrc/cuda/Module.cpp line=34 error=10 : invalid device ordinal
Process SpawnProcess-4:
Traceback (most recent call last):
  File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/ubuntu/users/z/PGCN/pgcn_test.py", line 116, in runner_func
    torch.cuda.set_device(gpu_id)
  File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/site-packages/torch/cuda/__init__.py", line 264, in set_device
    torch._C._cuda_setDevice(device)
RuntimeError: cuda runtime error (10) : invalid device ordinal at /opt/conda/conda-bld/pytorch_1549630534704/work/torch/csrc/cuda/Module.cpp:34
  6%|██▍                                     | 13/210 [06:37<1:47:22, 32.70s/it]^CTraceback (most recent call last):
  File "/home/ubuntu/users/z/PGCN/pgcn_test.py", line 216, in <module>
    rst = result_queue.get()
  File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/queues.py", line 94, in get
    res = self._recv_bytes()
  File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/connection.py", line 216, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/connection.py", line 379, in _recv
    chunk = read(handle, remaining)
KeyboardInterrupt
Process SpawnProcess-1:
  6%|██▍                                     | 13/210 [06:44<1:42:07, 31.10s/it]

Process finished with exit code 1

I also checked whether CUDA is available, and it returns True:

>>> import torch
>>> torch.cuda.is_available()
True

I do not know how to fix this error. Could anyone help?
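
For context: "invalid device ordinal" is what CUDA reports when a device index that does not exist on the machine is selected, and `torch.cuda.is_available()` returning True only says that at least one device is visible. In the trace above, three worker processes fail while one keeps running, which would be consistent with more GPU ids being requested than are present, although the trace alone does not prove it. A minimal sketch for filtering the ids first is below; `usable_gpu_ids` is a hypothetical helper.

```python
def usable_gpu_ids(requested_ids, device_count):
    """Keep only the ids that fall inside [0, device_count)."""
    return [gpu_id for gpu_id in requested_ids if 0 <= gpu_id < device_count]

# Example: four ids requested on a machine that exposes a single GPU.
print(usable_gpu_ids([0, 1, 2, 3], device_count=1))
```

With PyTorch installed, `device_count` can be taken from `torch.cuda.device_count()`. Note that the `CUDA_VISIBLE_DEVICES` environment variable also limits which devices a process sees.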

Which flow algorithm did you use for this project?

Hi, I was wondering which flow algorithm you used to generate the flow maps for extracting the I3D flow features.

Features of THUMOS14

There are 413 videos in total in this dataset, including training and testing. The number of provided RGB features is 413, but the number of provided flow features is 412. Why is one missing?

How to extract features from a video? And how to generate the proposal txt?

How do I extract I3D features from the original video? Could you share the data preprocessing code or the feature extraction code? And how do I generate the proposal txt for a custom video? Thanks!

Are the input proposals taken before or after NMS?

The input to P-GCN is the set of proposals generated by BSN. Are they the results before or after the NMS used in BSN?

The released checkpoint and result file cannot reach the released mAP value

Hi, I have evaluated the released flow checkpoint and the released flow result file with the default settings. Both reached a lower mAP than the released mAP@0.5 on the test set: 0.4683 and 0.4662. The roughly 1 percent difference seems nontrivial.

Besides, I have also trained the model a few times and evaluated different checkpoints. All of them show a gap of about 1% from 47.42%, with a jitter of around 0.2% between training runs.

Please double-check the released checkpoints and results, and hopefully point out any mistakes that I may have made. I appreciate it!

What does best overlap mean?

What is best overlap? What is the difference between best iou and best overlap?

parameters

About Activitynet feature

Hi, thanks a lot for open-sourcing this. I'm working on the ActivityNet dataset. Do you have the I3D features you used in this project? I would appreciate it if you could share a copy with me!
My email address is [email protected]

Normalization and image size for I3D feature extraction

Hi, first of all, congratulations on your work.

I want to use your work in a real pipeline where I need to run all networks in a series fashion. For this, I need to first extract the features with the I3D model. From here and the paper I saw that you extract the features in a sliding windows manner with blocks of 64 frames and a stride of 8, is that correct?

Furthermore, I couldn't find any information about the frame size and normalization prior to feeding into the I3D network. The original repo does not say anything about that. Here, I'm using the following pre-processing:

img = cv2.resize(img, (224, 224))
img = img[:, :, ::-1] # BGR2RGB
img = img / 127.5
img = img - 1

However, the results are different if I use my feature and yours. Could you help me with this issue?

Kind regards,

Can you let me know how I can re-score G-TAD generated output using PGCN?

Your answers to the above questions will clarify a lot of doubts.

Thank you for your time!

The proposal number in "bsn_test_proposal_list.txt" is wrong?

An error when cloning the repository

When I download the project, I run into the following problem:
fatal: No url found for submodule path 'anet_toolkit/anet_toolkit' in .gitmodules

How should I deal with this?
