pgcn's Issues

Performance on ActivityNet

Thanks for releasing your code and features!

I ran training with the provided default settings on the ActivityNet dataset and the I3D flow features, but obtained an mAP far below the reported numbers (10.63 mAP).

An error when pickle-loading the .pkl in data/

Hello, an error occurred when loading ./data/thumos14_train_prop_dict.pkl:

UnpicklingError                           Traceback (most recent call last)
----> 1 data = pickle.load(open('/home/ld/tac/PGCN/data/thumos14_train_prop_dict.pkl', 'rb'))
UnpicklingError: pickle data was truncated

Could you check the .pkl file?
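A "pickle data was truncated" error usually means the download or extraction was interrupted rather than the released file being broken. A minimal check along these lines (the path is just an example) can distinguish the two:

import os
import pickle

# Example path; point it at your local copy of the file.
pkl_path = "data/thumos14_train_prop_dict.pkl"

# Compare the size on disk against the size shown on the download page;
# a truncated pickle is almost always an incomplete download or unzip.
print("size on disk (bytes):", os.path.getsize(pkl_path))

try:
    with open(pkl_path, "rb") as f:
        prop_dict = pickle.load(f)
    print("loaded proposal dict for", len(prop_dict), "videos")
except pickle.UnpicklingError as exc:
    print("file looks truncated or corrupt; re-download it:", exc)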

Training error

Hi, when running pgcn_train.py, the error below always pops up. Could you please help me figure out what the problem is? Thanks a lot.

File "/media/ActionRecognition/PGCN/pgcn_dataset.py", line 355, in _video_centric_sampling
print("self.prop_dict[video.id][0]",self.prop_dict[video.id][0])
KeyError: 'video_validation_0000187'

training error, please help

fg, incomp, bg = self.prop_dict[video.id][0], self.prop_dict[video.id][1], self.prop_dict[video.id][2]

KeyError: 'video_validation_0000187'
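Both KeyError reports above come down to a video id that appears in the proposal list but is absent from the pickled proposal dictionary. A hedged diagnostic sketch that lists the mismatched ids before training; the file names are placeholders for whichever split is configured, and the parsing assumes the SSN-style list layout in which the line after each "#" marker is the video path:

import pickle

# Placeholder paths; substitute the dict/list pair your config actually uses.
dict_path = "data/thumos14_train_prop_dict.pkl"
list_path = "data/thumos14_train_proposal_list.txt"

prop_dict = pickle.load(open(dict_path, "rb"))

# Collect the video ids referenced by the proposal list (assumed layout:
# each block starts with a "# <index>" line followed by the video path).
lines = open(list_path).read().splitlines()
list_ids = {
    lines[i + 1].split("/")[-1]
    for i, line in enumerate(lines)
    if line.startswith("#")
}

missing = sorted(list_ids - set(prop_dict))
print(len(missing), "video(s) in the list but not in the dict, e.g.:", missing[:5])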

RuntimeError: cuda runtime error (10)

Hi there!
When running pgcn_test.py for inference, I encounter a CUDA error; here is my stack trace:

model epoch 15 loss: 1.4765163376217796
File parsed. Time:4.10
Dict constructed. Time:4.39
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1549630534704/work/torch/csrc/cuda/Module.cpp line=34 error=10 : invalid device ordinal
Process SpawnProcess-2:
Traceback (most recent call last):
  File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/ubuntu/users/z/PGCN/pgcn_test.py", line 116, in runner_func
    torch.cuda.set_device(gpu_id)
  File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/site-packages/torch/cuda/__init__.py", line 264, in set_device
    torch._C._cuda_setDevice(device)
RuntimeError: cuda runtime error (10) : invalid device ordinal at /opt/conda/conda-bld/pytorch_1549630534704/work/torch/csrc/cuda/Module.cpp:34
THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1549630534704/work/torch/csrc/cuda/Module.cpp line=34 error=10 : invalid device ordinal
Process SpawnProcess-3:
Traceback (most recent call last):
  File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/ubuntu/users/z/PGCN/pgcn_test.py", line 116, in runner_func
    torch.cuda.set_device(gpu_id)
  File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/site-packages/torch/cuda/__init__.py", line 264, in set_device
    torch._C._cuda_setDevice(device)
RuntimeError: cuda runtime error (10) : invalid device ordinal at /opt/conda/conda-bld/pytorch_1549630534704/work/torch/csrc/cuda/Module.cpp:34
  0%|                                                   | 0/210 [00:00<?, ?it/s]THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1549630534704/work/torch/csrc/cuda/Module.cpp line=34 error=10 : invalid device ordinal
Process SpawnProcess-4:
Traceback (most recent call last):
  File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/ubuntu/users/z/PGCN/pgcn_test.py", line 116, in runner_func
    torch.cuda.set_device(gpu_id)
  File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/site-packages/torch/cuda/__init__.py", line 264, in set_device
    torch._C._cuda_setDevice(device)
RuntimeError: cuda runtime error (10) : invalid device ordinal at /opt/conda/conda-bld/pytorch_1549630534704/work/torch/csrc/cuda/Module.cpp:34
  6%|██▍                                     | 13/210 [06:37<1:47:22, 32.70s/it]^CTraceback (most recent call last):
  File "/home/ubuntu/users/z/PGCN/pgcn_test.py", line 216, in <module>
    rst = result_queue.get()
  File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/queues.py", line 94, in get
    res = self._recv_bytes()
  File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/connection.py", line 216, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/home/ubuntu/users/z/anaconda3/envs/pgcn/lib/python3.6/multiprocessing/connection.py", line 379, in _recv
    chunk = read(handle, remaining)
KeyboardInterrupt
Process SpawnProcess-1:
  6%|██▍                                     | 13/210 [06:44<1:42:07, 31.10s/it]

Process finished with exit code 1

I also tested my CUDA setup and it reports True:

>>> import torch
>>> torch.cuda.is_available()
True

I do not know how to fix this error. Could anyone help?
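An "invalid device ordinal" from torch.cuda.set_device means a worker was asked to use a GPU index that the machine does not expose; pgcn_test.py spawns one worker per configured GPU. A hedged sketch of the guard to apply before spawning workers (the variable names are illustrative, not the script's exact ones):

import torch

# Suppose the test script was configured for four GPUs while the machine
# only exposes one; the extra ordinals are what trigger the error.
configured_gpus = [0, 1, 2, 3]  # illustrative value

available = torch.cuda.device_count()
usable_gpus = [g for g in configured_gpus if g < available]
if len(usable_gpus) < len(configured_gpus):
    dropped = sorted(set(configured_gpus) - set(usable_gpus))
    print("only", available, "GPU(s) visible; dropping ordinals", dropped)

# Spawn one worker per usable device only; inside each worker,
# torch.cuda.set_device(gpu_id) then succeeds.
for gpu_id in usable_gpus:
    torch.cuda.set_device(gpu_id)

Equivalently, limiting the GPU list passed to the test command (or setting CUDA_VISIBLE_DEVICES) to the devices the machine actually has avoids the error.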

Using G-TAD results in PGCN

Hi,

I am trying to use PGCN on my own dataset. I have annotated the data according to the THUMOS'14 annotation format and extracted features using I3D. I have also trained a G-TAD model and run inference with it.

  1. Can you let me know how I can re-score the G-TAD-generated output using PGCN?

Your answer to the above question will clarify a lot of doubts.

Thank you for your time!

The Proposal List in PGCN

I am confused about how to read the proposal list for the dataset used in PGCN. Can someone explain what each number in one line of the proposal list means? Thank you!
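Not an authoritative answer, but the lists appear to follow the SSN-style layout: each video block has a "# <index>" line, the video path, frame-count information, the ground-truth instances, and then the proposals. Under that assumption, a proposal line reads roughly as sketched below; please verify the exact field order against the loading code in pgcn_dataset.py:

# Hedged reading of one proposal line in an SSN-style list,
# e.g. "4 0.8571 0.7500 210 430" (values made up for illustration).
fields = "4 0.8571 0.7500 210 430".split()

label        = int(fields[0])    # action class index
best_iou     = float(fields[1])  # best IoU of this proposal with any ground-truth instance
overlap_self = float(fields[2])  # overlap measured against the proposal's own span (my reading)
start_frame  = int(fields[3])    # proposal start, in frames
end_frame    = int(fields[4])    # proposal end, in frames

print(label, best_iou, overlap_self, start_frame, end_frame)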

Model for ActivityNet

Thanks for the brilliant work!

Are you going to release the trained model on ActivityNet (like the one on THUMOS)?

  • If yes: great! Fine-tuning on ActivityNet is hard; I believe more people will star the repo while waiting for the update.

  • If no: can you keep this issue open? Someone will probably do it.

The performance of the best model is lower than the results in the paper?

Thanks for your excellent work.
I trained the model using the code you provided and found that the best model's performance (at epoch 15) is:

| IoU thresh | 0.10 | 0.20 | 0.30 | 0.40 | 0.50 | 0.60 | 0.70 | 0.80 | 0.90 | Average |
| mean AP | 0.6574 | 0.6382 | 0.6009 | 0.5374 | 0.4578 | 0.3369 | 0.2172 | 0.0903 | 0.0134 | 0.3944 |

This is lower than the results in the paper. Could you provide the pretrained model or explain why this happens?
Thank you.

Features of THUMOS14

There are 413 videos in total in this dataset, including training and testing. The number of provided RGB features is 413, but the number of provided flow features is 412. Why is one missing?

An error when cloning the repository

When I download the project, I run into the following problem:
fatal: No url found for submodule path 'anet_toolkit/anet_toolkit' in .gitmodules

How should I deal with this?

How to predict a single unlabeled video?

Dear author, I have trained the PGCN model on my own dataset, but now I need to run prediction on a video that is in neither the training set nor the test set. I see that the code needs to generate the corresponding proposals, which require the corresponding ground-truth information, but the videos I am testing now have no annotation files. Could you tell me how to do this? Thanks a lot.

Could you please explain the meaning of some parameters?

Thanks for your great work, but there are some parameters that I don't understand:

for stage_cnt in range(self.child_num + 1):

Could you please explain the meaning of the parameters self.adj_num and self.child_num?
Also, I can't find where the contextual edges and surrounding edges described in your paper are selected.
I'm confused about how to find the edges or the mask in this part; could you please help? Thanks.

ActivityNet features

Hello, Alvin!

Thank you very much for sharing the code and features!

However, when using the ActivityNet features, I ran into a problem: the numbers of directories for RGB and Flow frames are different.

Why are the numbers of directories for ActivityNet RGB (13811) and Flow (14938) frames different?
Does it mean that we can only use the intersection?
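If the intersection turns out to be the way to go, computing it is straightforward; a small sketch with placeholder directory names for wherever the RGB and Flow feature folders were extracted:

import os

# Placeholder paths to the extracted ActivityNet feature directories.
rgb_dir = "anet_i3d_rgb"
flow_dir = "anet_i3d_flow"

rgb_vids = set(os.listdir(rgb_dir))
flow_vids = set(os.listdir(flow_dir))

common = rgb_vids & flow_vids
print(len(rgb_vids), "RGB,", len(flow_vids), "Flow,", len(common), "in both")
print("RGB-only examples:", sorted(rgb_vids - flow_vids)[:5])
print("Flow-only examples:", sorted(flow_vids - rgb_vids)[:5])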

ANet RGB feature file names

Hi, I found that the file names of the ANet RGB features you released on Google Drive don't match the vid field in the anet1.3_bsn600_validation_proposal_list. Could you tell me how to use these features properly?

What's more, I noticed that the flow model performs much better than the RGB model. Could you explain why? I would think the RGB features contain more information for classification; isn't that true?

How do you extract the flow feature?

Question about proposal generation

Hi, thanks for sharing the code. I'd like to know where the pre-extracted proposals come from. Did you reimplement the Boundary Sensitive Network paper or just use their provided proposals?

The released checkpoint and result file cannot reach the released mAP value

Hi, I have evaluated the released flow checkpoint and the released flow result file following the default settings; both reach a lower mAP than the released mAP@0.5 on the test set: 0.4683 and 0.4662. The roughly 1 percent difference seems nontrivial.

Besides, I have also trained the model a few times and evaluated different checkpoints, all of which yield a ~1% gap from 47.42%, with a jitter of around 0.2% across trainings.

Please double-check the released checkpoints and results, and hopefully point out any mistakes I may have made. Thanks!

how to generate bsn_proposal_list.txt

Thank you for your great work.
I have a question: if I want to apply your work to my own dataset, how can I generate the bsn_proposal_list files? And do I need to train the BSN proposal generator on my own dataset?

Thank you so much.

Does this algorithm work for long action proposals?

Hi, thanks for open-sourcing your code.
I have a question about the range of proposal lengths over which this algorithm works well. For example, some videos in the ActivityNet dataset have long action proposals; have you ever measured the results on long action proposals separately?

custom videos

Thanks for your code, but is it possible to train and test on custom videos?

Question about the I3D features

In your paper, it says "We first uniformly divide each input video into 64-frame segments. We then use a two-stream Inflated 3D ConvNet (I3D) model pre-trained on Kinetics [5] to extract the segment features."
However, in your code

interval = 8
clip_length = 64
start_unit = int(min(ft_num - 1, np.floor(float(start_ind + off) / interval)))
end_unit = int(min(ft_num - 2, np.ceil(float(end_ind - clip_length) / interval)))

I guess subtracting 64 means you do not use the last few frames that don't fill a whole 64-frame clip, but why is interval = 8?
Does it mean that you divide each input video into 8-frame units?
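For reference, the snippet above is consistent with features extracted on sliding 64-frame windows with a stride of 8 frames, so that unit i covers frames [8*i, 8*i + 64); that is a reading of the code, not an official statement. A small sketch of how a proposal's frame range then maps onto feature units:

import numpy as np

interval = 8      # stride between consecutive feature units, in frames
clip_length = 64  # window length of one I3D unit, in frames

def prop_to_units(start_ind, end_ind, ft_num, off=0):
    # Mirror the indexing in the snippet above (an interpretation, not official code).
    start_unit = int(min(ft_num - 1, np.floor(float(start_ind + off) / interval)))
    end_unit = int(min(ft_num - 2, np.ceil(float(end_ind - clip_length) / interval)))
    return start_unit, end_unit

# A proposal covering frames 80..400 in a video with 100 feature units:
print(prop_to_units(80, 400, ft_num=100))  # -> (10, 42)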

By the way, could you offer the I3D features on ActivityNet? It's very time-consuming to extract them.

code understanding problem

Hi, I've recently been reading your excellent code.
When I read the sample_indices(start, end, num_seg) function in I3D_Pooling.py, I found that the valid_length > num_seg condition can never hold. Is it a bug?

def sample_indices(start, end, num_seg):
    """
    :param record: VideoRecord
    :return: list
    """
    valid_length = end - start
    average_duration = (valid_length + 1) // num_seg
    if average_duration > 0:
        # normal cases
        offsets = np.multiply(list(range(num_seg)), average_duration)

    # TODO: here is a bug?
    elif valid_length > num_seg:
        offsets = np.sort(randint(valid_length, size=num_seg))
    else:
        offsets = np.zeros((num_seg,))

    return offsets, average_duration

adj_num

Hello, can you tell me why adj_num is 21 in dataset_cfg.yaml?

How to run this model on a new dataset

I would like to test your model on a new TAL dataset collected by our laboratory, so we would like to know how we should prepare the dataset directory and the ground-truth files. Any suggestions would be very helpful!

How to fuse RGB and flow features

Hi, Alvin. Thanks for your excellent work, but I am confused about how you fused the flow and RGB features of the I3D model. Is it a simple average? Most I3D models generate 1024-dimensional features for both streams, but the fused features in your work are 2048-dimensional.
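The 1024 + 1024 = 2048 dimensionality suggests concatenation rather than averaging, although that is an inference from the sizes rather than a confirmation from the authors. A minimal sketch under that assumption:

import numpy as np

# Hypothetical per-unit features from the two I3D streams (1024-d each).
rgb_feat = np.random.randn(100, 1024).astype(np.float32)   # 100 feature units
flow_feat = np.random.randn(100, 1024).astype(np.float32)

# Concatenating along the channel axis gives the 2048-d fused features seen
# in the released files; averaging would instead keep 1024 dimensions.
fused = np.concatenate([rgb_feat, flow_feat], axis=-1)
print(fused.shape)  # (100, 2048)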

About ActivityNet features

Hi, thanks a lot for the open-source release. I'm working on the ActivityNet dataset. Do you have the I3D features you used in this project? I would appreciate it if you could share a copy with me!
My email address is [email protected]

Is the RGB model saved with float datatype?

Thanks for the brilliant work!

I happened to see an error when the RGB model is loaded directly into the PGCN architecture.
The reason seems to be a datatype mismatch.

To solve it, I replaced this line of code in pgcn_test.py:
reg_scores[prop_idx, :] = net((act_batch_var, comp_batch_var), None, None, None)
by
reg_scores[prop_idx, :] = net((act_batch_var.float(), comp_batch_var.float()), None, None, None)

Do this first if you see a similar error :)

A detail question about the model

Hey! Great job!
After going through your paper, I have one question. I was wondering how you process the output feature of the GCN model (N x d) before the fc layer, since the node dimension has to be collapsed.
Looking at the code, I found that you just pick the first row of the N features. Did I understand that correctly? If so, could you please explain why you do that rather than average-pooling over the N features? Thanks a lot!
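For readers of the thread, the two readouts being contrasted look like this in PyTorch; it is a sketch of the question, not the repository's exact code, and x stands for a GCN output of shape (batch, N, d):

import torch

batch, num_nodes, dim = 4, 21, 512          # illustrative sizes
x = torch.randn(batch, num_nodes, dim)      # GCN output: one feature per proposal node

# Readout 1: keep only the first node, i.e. the central proposal's feature.
first_node = x[:, 0, :]                     # (batch, dim)

# Readout 2: average-pool over all N nodes, as the question suggests.
mean_pooled = x.mean(dim=1)                 # (batch, dim)

print(first_node.shape, mean_pooled.shape)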

Train Data Corrupted for I3D ActivityNet Feature

Hi, Thanks for uploading the I3D features for ANet.

However, I found out that the zipped RGB features (in the Gcloud link) are corrupted. Can you double-check the current version or upload corrected features? There are too many files in the I3D ANet RGB features link.

Thanks in advance

Normalization and image size for I3D feature extraction

Hi, first of all, congratulations on your work.

I want to use your work in a real pipeline where I need to run all the networks in series. For this, I first need to extract the features with the I3D model. From here and from the paper I gathered that you extract the features in a sliding-window manner with blocks of 64 frames and a stride of 8; is that correct?

Furthermore, I couldn't find any information about the frame size and normalization used prior to feeding frames into the I3D network. The original repo does not say anything about that. Here is the pre-processing I'm using:

img = cv2.resize(img, (224, 224))  # resize to the I3D input resolution
img = img[:, :, ::-1]              # BGR -> RGB (OpenCV loads frames as BGR)
img = img / 127.5                  # scale pixel values to [0, 2]
img = img - 1                      # shift to [-1, 1]

However, the results differ when I use my features instead of yours. Could you help me with this issue?

Kind regards,
