alvin-zeng / pgcn
Graph Convolutional Networks for Temporal Action Localization (ICCV2019)
Hi, thanks for sharing the code. I'd like to know where the pre-extracted proposals come from. Did you reimplement the Boundary Sensitive Network paper, or just use the proposals its authors provide?
Hello, Alvin!
Thank you very much for sharing the code and feature!
However, when using the ActivityNet features, I ran into a problem: the numbers of directories for RGB and Flow frames differ.
Why are the numbers of directories for ActivityNet RGB (13811) and Flow (14938) frames different?
Does it mean that we can only use the intersection?
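If only videos with both modalities can be used, the shared set can be computed directly from the two feature directories. The following is a minimal sketch, assuming each video is stored as an entry named by its video id under a per-modality root; the function name `shared_video_ids` and the directory layout are hypothetical and not taken from the repository.

```python
import os
import tempfile


def shared_video_ids(rgb_root, flow_root):
    """Return (ids in both roots, ids only in RGB, ids only in Flow), each sorted."""
    rgb_ids = set(os.listdir(rgb_root))
    flow_ids = set(os.listdir(flow_root))
    both = sorted(rgb_ids & flow_ids)
    return both, sorted(rgb_ids - flow_ids), sorted(flow_ids - rgb_ids)


# Tiny self-contained demo with made-up video ids (assumed layout).
with tempfile.TemporaryDirectory() as root:
    for modality, ids in (("rgb", ["v_a", "v_b"]), ("flow", ["v_b", "v_c"])):
        for vid in ids:
            os.makedirs(os.path.join(root, modality, vid))
    both, rgb_only, flow_only = shared_video_ids(
        os.path.join(root, "rgb"), os.path.join(root, "flow")
    )

print(both, rgb_only, flow_only)
```

Listing the ids that exist in only one modality also makes it easier to report exactly which videos are missing.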
Thanks for releasing your code and features!
I ran the training with the provided default settings on the ActivityNet dataset and the I3D flow features, but obtained an mAP far below the reported numbers (10.63 mAP).
Hi, thanks for your work!
Can you please share your pre-trained model on THUMOS'14 dataset? Thank you.
In your paper, it says "We first uniformly divide each input video into 64-frame segments. We then use a two-stream Inflated 3D ConvNet (I3D) model pre-trained on Kinetics [5] to extract the segment features."
However, in your code
interval = 8
clip_length = 64
start_unit = int(min(ft_num - 1, np.floor(float(start_ind + off) / interval)))
end_unit = int(min(ft_num - 2, np.ceil(float(end_ind - clip_length) / interval)))
I guess subtracting 64 means you do not use the last few frames that are not divisible by 64, but why is interval = 8?
Does it mean that you divide each input video into 8-frame segments?
By the way, could you offer the I3D features on ActivityNet? They are so time-consuming to extract.
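One possible reading of the snippet, offered as an assumption rather than the authors' confirmed pipeline, is that features are extracted with a 64-frame window moved in steps of 8 frames, so that feature unit i covers frames [8i, 8i + 64). The sketch below mirrors the quoted index arithmetic under that reading; `off` is not defined in the quoted code and is treated here as a parameter defaulting to 0, and `unit_span` and `proposal_units` are hypothetical helper names.

```python
import math

INTERVAL = 8      # assumed stride between consecutive feature vectors, in frames
CLIP_LENGTH = 64  # frames covered by one feature vector


def unit_span(unit):
    """Frames [first, last_exclusive) covered by a feature unit (sliding-window reading)."""
    first = unit * INTERVAL
    return first, first + CLIP_LENGTH


def proposal_units(start_ind, end_ind, ft_num, off=0):
    """Same arithmetic as the quoted lines, written with math instead of NumPy."""
    start_unit = int(min(ft_num - 1, math.floor(float(start_ind + off) / INTERVAL)))
    end_unit = int(min(ft_num - 2, math.ceil(float(end_ind - CLIP_LENGTH) / INTERVAL)))
    return start_unit, end_unit


start_unit, end_unit = proposal_units(start_ind=100, end_ind=400, ft_num=60)
print(start_unit, end_unit)                        # 12 42
print(unit_span(start_unit), unit_span(end_unit))  # (96, 160) (336, 400)
```

Under this reading, subtracting `clip_length` picks the last window that still ends at or before `end_ind`, rather than discarding trailing frames.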
Hello, can you tell me why adj_num is 21 in dataset_cfg.yaml?
fg, incomp, bg = self.prop_dict[video.id][0], self.prop_dict[video.id][1], self.prop_dict[video.id][2]
KeyError: 'video_validation_0000187'
Hi, Thanks for uploading the I3D features for ANet.
However, I found that the zipped RGB features (on Gcloud) are corrupted. Can you double-check the current version or upload the corrected features? There are too many files in the I3D ANet RGB features link.
Thanks in advance
Thanks for the brilliant work!
Are you going to release the trained model on ActivityNet (like on THUMOS)?
If yes: Great! Fine-tuning on ActivityNet is hard. I believe more people will star the repo and wait for the update.
If no: Can you keep this issue open? Someone will probably do it.
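Hello, I'd like to ask how the pre_box in bsn_train_proposal_list.txt is obtained, especially the label corresponding to each box. If I want to train on my own data, how do I get the corresponding labels and pre_box?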
Hi, Alvin. Thanks for your excellent work, but I am confused about how you fused the flow and RGB features of the I3D model. Is it a simple average? I ask because most I3D models generate 1024-dimensional features for each stream, but the fused features in your work have 2048 dimensions.
Dear author, when I run pgcn_train.py, the error below always pops up. Would you please help me check what the problem is? Thanks a lot.
File "/media/ActionRecognition/PGCN/pgcn_dataset.py", line 355, in _video_centric_sampling
print("self.prop_dict[video.id][0]",self.prop_dict[video.id][0])
KeyError: 'video_validation_0000187'
Hi, according to https://github.com/yjxiong/action-detection/wiki/A-Description-of-the-Proposal-Files, the number of proposal lines should equal the value on the last line of the description. For example, the first video, 'video_test_0000896', should have 3936 proposals; however, there are only 800 proposals in the prop file. How did you choose the 800 proposals?
Hi all,
How do I get the THUMOS14 proposal pkl file in data?
Hello, an error occurred when loading ./data/thumos14_train_prop_dict.pkl:
UnpicklingError Traceback (most recent call last)
----> 1 data = pickle.load(open('/home/ld/tac/PGCN/data/thumos14_train_prop_dict.pkl', 'rb'))
UnpicklingError: pickle data was truncated
Could you check the .pkl file?
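A "pickle data was truncated" error usually indicates that the file on disk is shorter than the data that was originally written, for example after an interrupted download. A quick way to tell a bad local copy from a bad upload is to report the file size and try to load it. This is a minimal sketch; `check_pickle` is a hypothetical helper, and the demo files are created on the fly rather than taken from the repository.

```python
import os
import pickle
import tempfile


def check_pickle(path):
    """Return (ok, message) describing whether `path` unpickles cleanly."""
    size = os.path.getsize(path)
    try:
        with open(path, "rb") as f:
            pickle.load(f)
    except Exception as exc:  # unpickling may raise several exception types
        return False, f"{size} bytes, failed to load: {exc!r}"
    return True, f"{size} bytes, loaded fine"


# Demo: one complete pickle and one cut in half to simulate a partial download.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.pkl")
    bad = os.path.join(tmp, "truncated.pkl")
    payload = pickle.dumps({"video_a": list(range(1000))})
    with open(good, "wb") as f:
        f.write(payload)
    with open(bad, "wb") as f:
        f.write(payload[: len(payload) // 2])
    good_ok, good_msg = check_pickle(good)
    bad_ok, bad_msg = check_pickle(bad)

print(good_ok, good_msg)
print(bad_ok, bad_msg)
```

Comparing the reported size with the size shown on the download page is a cheap first check before re-downloading. Only unpickle files from sources you trust.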
Hi, Thanks for opening your code.
I have a question about the range of proposal lengths for which this algorithm works well. For example, some videos in the ActivityNet dataset have long action proposals. Have you ever measured the results on long action proposals?
Thanks for your code. There is an official toolkit on the THUMOS website. Why not use it to evaluate the performance on THUMOS?
Hi, I found that the file names of the ANet RGB features you released on Google Drive don't match the vid in anet1.3_bsn600_validation_proposal_list. Could you tell me how to use these features properly?
What's more, I noticed that the flow model performs much better than the RGB model. Could you explain the reason? I would think the RGB features contain more information for classification; isn't that true?
How do you extract the flow feature?
Hi, I've recently been reading your excellent code.
When I read the sample_indices(start, end, num_seg) function in I3D_Pooling.py, I found that the valid_length > num_seg condition can never be true. Is it a bug?
def sample_indices(start, end, num_seg):
    """
    :param record: VideoRecord
    :return: list
    """
    valid_length = end - start
    average_duration = (valid_length + 1) // num_seg
    if average_duration > 0:
        # normal cases
        offsets = np.multiply(list(range(num_seg)), average_duration)
    # TODO: here is a bug?
    elif valid_length > num_seg:
        offsets = np.sort(randint(valid_length, size=num_seg))
    else:
        offsets = np.zeros((num_seg,))
    return offsets, average_duration
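The observation can be checked by brute force. When `average_duration` is 0, `valid_length + 1 < num_seg`, so `valid_length > num_seg` cannot hold at the same time. The sketch below reproduces only the branch conditions of the quoted function (not the offset computation) and records which branches are ever taken; `branch_taken` is a hypothetical helper name, and the ranges scanned are arbitrary.

```python
def branch_taken(valid_length, num_seg):
    """Return which branch of the quoted function would run for these inputs."""
    average_duration = (valid_length + 1) // num_seg
    if average_duration > 0:
        return "normal"
    elif valid_length > num_seg:
        return "random"
    return "zeros"


# Scan a grid of non-negative lengths and positive segment counts.
branches = {
    branch_taken(valid_length, num_seg)
    for num_seg in range(1, 65)
    for valid_length in range(0, 513)
}
print(sorted(branches))  # ['normal', 'zeros']
```

The "random" branch never appears, which supports the reading that the `elif` is dead code as written. What condition was intended is not stated in the source.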
pickle.dump([self.act_iou_dict, self.act_dis_dict, self.prop_dict], open(self.prop_dict_path, "wb"))
MemoryError
Thanks for the brilliant work!
I happen to see an error when the RGB model is directly loaded into the PGCN architecture.
The reason seems to be a mismatch of the datatype.
To solve that, I replaced one line of code in pgcn_test.py
reg_scores[prop_idx, :] = net((act_batch_var, comp_batch_var), None, None, None)
by
reg_scores[prop_idx, :] = net((act_batch_var.float(), comp_batch_var.float()), None, None, None)
Try this first if you run into a similar error :)
Thanks for your code, but can it be trained and tested on custom videos?
I would like to test your model on a new TAL dataset collected by our laboratory. Hence we want to know how should we prepare the dataset directory and ground truth files. Any suggestions will be very helpful!
Dear author, I have trained the PGCN model on my own dataset, but I need to make a prediction on a video that is in neither the training set nor the test set. I see that the code needs to generate the corresponding proposals, which requires the corresponding GT information, but the videos I'm testing now don't have tag files. Could you tell me how to do it? Thanks a lot.
When I am training, there is an error: No such file or directory: 'data/train\video_validation_0000187'. Also, do we need to unzip the features?
Hey! Great Job!
After reading through your paper, I've got one question. I was wondering how you process the output features of the GCN model (Nxd) before the fc layer, since you have to collapse the two dimensions.
Looking at the code, I found that you just pick the first row of the N features. Did I understand it correctly? If so, could you please explain why you do that? Why not average-pool over the N features? Thanks a lot!
Hello, have you got the I3D features or the proposal_list for ActivityNet? I'm also working on the ActivityNet dataset. Thank you!
My email is [email protected]
Thank you for your great work.
I have a question: if I want to apply your work to my own dataset, how can I generate the bsn_proposal_list files? And do I need to train the BSN proposal generator on my own dataset?
Thank you so much.
Sorry to disturb you. May I ask where to get the pre-trained model for THUMOS'14?
Hi,
I downloaded Flow_Train.tar.gz via Google Cloud and found that video_validation_0000947 is missing from Flow_Train.tar.gz, while it exists in Rgb_Train_feature.zip.
Could you reupload the file?
Thank you so much.
Hi, the incomplete_overlap_thresh in current_configs.yaml is set to 0.01; however, I think it should be around 0.7?
Thanks for your excellent work.
I trained the model that you provided and found that the performance of the best model (at epoch 15) is:
+------------+--------+--------+--------+--------+--------+--------+--------+--------+--------+---------+
| IoU thresh | 0.10   | 0.20   | 0.30   | 0.40   | 0.50   | 0.60   | 0.70   | 0.80   | 0.90   | Average |
+------------+--------+--------+--------+--------+--------+--------+--------+--------+--------+---------+
| mean AP    | 0.6574 | 0.6382 | 0.6009 | 0.5374 | 0.4578 | 0.3369 | 0.2172 | 0.0903 | 0.0134 | 0.3944  |
+------------+--------+--------+--------+--------+--------+--------+--------+--------+--------+---------+
This is lower than the results in the paper. Could you provide the pretrained model or explain why this happened?
Thank you
Thanks for your great work, but there are some parameters that I don't understand.
Line 157 in 46428da
Could you please explain the meaning of the parameters self.adj_num and self.child_num?
Also, I can't find how the contextual edges and surrounding edges described in your paper are selected.
I'm confused about how to find the edges or mask in this part. Could you please help me? Thanks.
Hi there!
When running pgcn_test.py for inference, I encounter a CUDA error. Here is my stack trace:
model epoch 15 loss: 1.4765163376217796
File parsed. Time:4.10
Dict constructed. Time:4.39
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1549630534704/work/torch/csrc/cuda/Module.cpp line=34 error=10 : invalid device ordinal
Process SpawnProcess-2:
Traceback (most recent call last):
File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/home/ubuntu/users/z/PGCN/pgcn_test.py", line 116, in runner_func
torch.cuda.set_device(gpu_id)
File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/site-packages/torch/cuda/__init__.py", line 264, in set_device
torch._C._cuda_setDevice(device)
RuntimeError: cuda runtime error (10) : invalid device ordinal at /opt/conda/conda-bld/pytorch_1549630534704/work/torch/csrc/cuda/Module.cpp:34
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1549630534704/work/torch/csrc/cuda/Module.cpp line=34 error=10 : invalid device ordinal
Process SpawnProcess-3:
Traceback (most recent call last):
File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/home/ubuntu/users/z/PGCN/pgcn_test.py", line 116, in runner_func
torch.cuda.set_device(gpu_id)
File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/site-packages/torch/cuda/__init__.py", line 264, in set_device
torch._C._cuda_setDevice(device)
RuntimeError: cuda runtime error (10) : invalid device ordinal at /opt/conda/conda-bld/pytorch_1549630534704/work/torch/csrc/cuda/Module.cpp:34
0%| | 0/210 [00:00<?, ?it/s]THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1549630534704/work/torch/csrc/cuda/Module.cpp line=34 error=10 : invalid device ordinal
Process SpawnProcess-4:
Traceback (most recent call last):
File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/home/ubuntu/users/z/PGCN/pgcn_test.py", line 116, in runner_func
torch.cuda.set_device(gpu_id)
File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/site-packages/torch/cuda/__init__.py", line 264, in set_device
torch._C._cuda_setDevice(device)
RuntimeError: cuda runtime error (10) : invalid device ordinal at /opt/conda/conda-bld/pytorch_1549630534704/work/torch/csrc/cuda/Module.cpp:34
6%|██▍ | 13/210 [06:37<1:47:22, 32.70s/it]^CTraceback (most recent call last):
File "/home/ubuntu/users/z/PGCN/pgcn_test.py", line 216, in <module>
rst = result_queue.get()
File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/queues.py", line 94, in get
res = self._recv_bytes()
File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/connection.py", line 216, in recv_bytes
buf = self._recv_bytes(maxlength)
File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/connection.py", line 407, in _recv_bytes
buf = self._recv(4)
File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/connection.py", line 379, in _recv
chunk = read(handle, remaining)
KeyboardInterrupt
Process SpawnProcess-1:
6%|██▍ | 13/210 [06:44<1:42:07, 31.10s/it]
Process finished with exit code 1
I also tested my CUDA setup and it returns True:
>>> import torch
>>> torch.cuda.is_available()
True
I do not know how to fix this error. Could anyone help?
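In general, "invalid device ordinal" means that `torch.cuda.set_device` received an index that is not smaller than the number of GPUs visible to the process. `torch.cuda.is_available()` returning True only shows that at least one GPU is visible. In the trace above, three worker processes fail while the progress bar keeps advancing, which would be consistent with fewer GPUs being available than the script was asked to use, although the source does not confirm this. The sketch below filters a requested id list against a device count; `usable_gpu_ids` is a hypothetical helper, and the count is passed in as a plain integer so the example runs without a GPU (in practice it would come from `torch.cuda.device_count()`).

```python
def usable_gpu_ids(requested_ids, device_count):
    """Keep only ids that are valid ordinals when `device_count` GPUs are visible."""
    return [gpu_id for gpu_id in requested_ids if 0 <= gpu_id < device_count]


# Assumed scenario: four workers requested, but only one GPU is visible.
requested = [0, 1, 2, 3]
device_count = 1  # in a real run: torch.cuda.device_count()
valid = usable_gpu_ids(requested, device_count)
print(valid)  # [0]
```

Checking the value of the CUDA_VISIBLE_DEVICES environment variable and the GPU list passed to the test script against the real device count is a reasonable first step.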
Hi, I was wondering what flow algorithm you used to generate the flow maps for extracting the I3D flow features?
There are 413 videos in total in this dataset, including training and testing. The number of provided RGB features is 413, but the number of provided flow features is 412. Why is one missing?
How do I extract I3D features from the original videos? Could you share the data preprocessing code or the feature extraction code? And how do I generate the proposal txt for custom videos? Thx~
The input to P-GCN is the proposals generated by BSN. Are they the BSN results before NMS or after NMS?
Hi, I have evaluated the released flow checkpoint and flow result following the default setting; both reached a lower mAP than the released [email protected] on the test set: 0.4683 and 0.4662. The roughly 1 percent difference seems nontrivial.
Besides, I have also trained the model several times and evaluated different checkpoints, which all yield a gap of about 1% from 47.42%, with a jitter of around 0.2% between training runs.
Please double-check the released checkpoints and results, and hopefully point out any mistakes that I may have made. I'd appreciate that!
What is best overlap? What is the difference between best iou and best overlap?
Hi, thanks a lot for open-sourcing this. I'm working on the ActivityNet dataset. Do you have the I3D features you used in this project? I'd appreciate it if you could share a copy with me!
My email address is [email protected]
Hi, first of all, congratulations on your work.
I want to use your work in a real pipeline where I need to run all networks in series. For this, I first need to extract the features with the I3D model. From here and the paper, I saw that you extract the features in a sliding-window manner with blocks of 64 frames and a stride of 8. Is that correct?
Furthermore, I couldn't find any information about the frame size and normalization applied before feeding frames into the I3D network. The original repo does not say anything about that. Here, I'm using the following pre-processing:
img = cv2.resize(img, (224, 224))
img = img[:, :, ::-1] # BGR2RGB
img = img / 127.5
img = img - 1
However, the results differ depending on whether I use my features or yours. Could you help me with this issue?
Kind regards,
I am confused on how to read the proposal list for the dataset to be used in PGCN. Can someone explain how to read each number in one line of the proposal list? Thank you!
Do you have a visualization of Proposal Graph?
Hi,
I am trying to use PGCN on my own dataset. I have annotated the data according to the THUMOS'14 annotation format and extracted features using I3D. I have also trained a G-TAD model and run inference with it.
Your answers to the above questions will clarify a lot of doubts.
Thank you for your time!
When I download the project, I run into the following problem:
fatal: No url found for submodule path 'anet_toolkit/anet_toolkit' in .gitmodules
How should I deal with this?