bruceyo / mmnet
Sorry for the multiple questions.
I am now retraining the RGB video model. I ran the following command:
python main_rgb_fused.py recognition -c config/ntu60_xsub/train_rgb_fused.yaml [--work_dir <work folder>]
However, the accuracy at epoch 5 is low, as shown in the image below. Do you know the cause of this problem?
Here is my train_rgb_fused.yaml file.
# python main_rgb_fused.py recognition -c config/ntu60_xsub/train_rgb_fused.yaml
# work_dir: ../../data/st-gcn/xsub/with_rgb/rgb_tp15_tf5_
work_dir: work_dir/ntu60/xsub/rgb_fused

skeleton_joints_pkl: results/ntu60/xsub/joint_result_msg3d.pkl
skeleton_bones_pkl: results/ntu60/xsub/bone_result_msg3d.pkl
# skeleton_joints_pkl: results/ntu60/xsub/joint_result_stgcn.pkl
# skeleton_bones_pkl: results/ntu60/xsub/bone_result_stgcn.pkl
# skeleton_joints_pkl: work_dir/ntu60/xsub/skeleton_joint/test_result_epoch25.pkl
# skeleton_bones_pkl: work_dir/ntu60/xsub/skeleton_bone/test_result_epoch55.pkl

# feeder
feeder: feeder.feeder_rgb_fused_ntu.Feeder
train_feeder_args:
  debug: False
  random_choose: False
  centralization: False
  random_move: False
  window_size: -1
  random_flip: False
  random_interval: True
  temporal_rgb_frames: 5
  # data_path: /media/bruce/2Tssd/data/ntu/xsub/train_data_joint.npy
  # label_path: /media/bruce/2Tssd/data/ntu/xsub/train_label.pkl
  data_path: data/ntu/xsub/train_data_joint.npy
  label_path: data/ntu/xsub/train_label.pkl
test_feeder_args:
  debug: False
  centralization: False
  evaluation: True
  temporal_rgb_frames: 5
  # data_path: /media/bruce/2Tssd/data/ntu/xsub/val_data_joint.npy
  # label_path: /media/bruce/2Tssd/data/ntu/xsub/val_label.pkl
  data_path: data/ntu/xsub/val_data_joint.npy
  label_path: data/ntu/xsub/val_label.pkl

# model
model: net.mmn.Model
model_args:
  in_channels: 3
  num_class: 60
  dropout: 0.5
  edge_importance_weighting: True
  graph_args:
    layout: 'ntu-rgb+d'
    strategy: 'spatial'

# training
temporal_positions: 15
fix_weights: True
joint_weights: models/ntu60/xsub/joint_model_stgcn.pt
# joint_weights: work_dir/ntu60/xsub/skeleton_joint/epoch25_model.pt
device: [0,1,2,3]
weight_decay: 0.0001
base_lr: 0.1
step: [10, 50]
batch_size: 32
test_batch_size: 32
num_epoch: 80

# debug
debug: False
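For reference, in ST-GCN-style training code the `base_lr`/`step` pair usually implements a step decay, so at epoch 5 (before the first milestone at epoch 10) training is still running at the full learning rate, and low early accuracy is not necessarily abnormal. A minimal sketch of that schedule, assuming the standard ST-GCN decay logic (MMNet's exact code may differ):

```python
import numpy as np

def step_lr(base_lr, step, epoch, gamma=0.1):
    """Step decay as in ST-GCN-style processors: the learning rate
    is multiplied by gamma at each milestone listed in `step`."""
    return base_lr * gamma ** np.sum(epoch >= np.array(step))

# With base_lr=0.1 and step=[10, 50] from the config above:
for epoch in [0, 5, 10, 49, 50, 79]:
    print(epoch, step_lr(0.1, [10, 50], epoch))
# -> 0.1 until epoch 9, 0.01 from epoch 10, 0.001 from epoch 50
```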
Thanks for the reply. As you suggested, I uncommented lines 365-370 in `MMNet/feeder/segment_rgbbody_ntu.py` and set lines 68-69 in `MMNet/feeder/feeder_rgb_fused_ntu.py` to the path where I saved the ST-ROI. However, the resulting image is black and the correct ST-ROI is not generated, as shown below. Is there a solution?
Originally posted by @katahiyu in #9 (comment)
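When the generated ST-ROI comes out black, a quick first check is the pixel statistics of a few output files before digging into `segment_rgbbody_ntu.py`. A minimal diagnostic sketch; the directory path and the `.png` extension are assumptions, so substitute whatever is configured in `feeder_rgb_fused_ntu.py`:

```python
import glob
import numpy as np
from PIL import Image

# Hypothetical path: use the ST-ROI directory set in
# MMNet/feeder/feeder_rgb_fused_ntu.py (lines 68-69).
st_roi_dir = "data/ntu/xsub/fivefs"

for path in sorted(glob.glob(st_roi_dir + "/*.png"))[:5]:
    img = np.asarray(Image.open(path))
    # max == 0 means the file really is all black, i.e. the ROI
    # cropping produced no pixels.
    print(path, img.shape, "min:", img.min(), "max:", img.max())
```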
Hello, I'm trying to train the RGB video model using the 'fivefs' data you provided through Google Drive.
I changed the rgb_path_ntu60 path in feeder_rgb_fused_ntu.py.
Also, I changed the data and label path in the train_rgb_fused.yaml.
Then, I ran
python main_rgb_fused.py recognition -c config/ntu60_xsub/train_rgb_fused.yaml
However, there was an error related to the input shape.
Could you let me know how to train the RGB video model?
Where do I need to change the code?
A screenshot of the error is attached below.
I'm looking forward to your reply!
Thank you :)
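Since the error concerns the input shape, one quick check is whether the skeleton .npy matches the layout ST-GCN-style feeders expect. A minimal sketch; the path is taken from the config above, and the expected shape is an assumption based on the standard NTU preprocessing:

```python
import numpy as np

data = np.load("data/ntu/xsub/train_data_joint.npy", mmap_mode="r")
# Expected ST-GCN-style layout: (N, C, T, V, M) =
# (samples, 3 channels, 300 frames, 25 joints, 2 bodies).
print(data.shape)
```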
Excuse me, in feeder/segment_rgbbody_ntu.py, what files should be placed in the depth_path?
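For what it's worth, the NTU RGB+D release ships its masked depth maps as one PNG per frame, grouped in one folder per sample; whether segment_rgbbody_ntu.py expects exactly that layout for depth_path is an assumption here. A sketch of what a listing would look like under that assumption:

```python
import os

# Assumed NTU RGB+D masked-depth layout: one folder per sample,
# one PNG per frame, e.g. MDepth-00000001.png.
depth_path = "/data/ntu/nturgb+d_depth_masked"
sample = "S001C001P001R001A001"

frames = sorted(os.listdir(os.path.join(depth_path, sample)))
print(frames[:3])  # e.g. ['MDepth-00000001.png', 'MDepth-00000002.png', ...]
```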
Question about the ResNet in the RGB-based stream
Do you use a pretrained ResNet, or one you trained yourself?
Thank you for reading and answering.
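For context, an ImageNet-pretrained ResNet backbone is usually obtained via torchvision; whether MMNet loads pretrained weights or trains the CNN from scratch is exactly what is being asked here, so treat this as a sketch of the pretrained option only (ResNet-18 is illustrative, not necessarily the variant the paper uses):

```python
import torch.nn as nn
import torchvision.models as models

# ImageNet-pretrained ResNet-18 (torchvision >= 0.13 weights API).
backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

# Replace the classifier head for the 60 NTU classes when fine-tuning.
backbone.fc = nn.Linear(backbone.fc.in_features, 60)
```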
Here are my config parameters: batch size: 64, optimizer: SGD, lr: 0.1, weight_decay: 0.0001.
Using the RGB data from your Google Drive (fivefs), when I run the source code I get an accuracy of 78% (at epoch 18, RGB only), which is much higher than the 72.7% reported in your paper. Have you encountered a similar situation?
work_dir: ../../data/st-gcn/xsub/with_rgb/rgb_tp15_tf5_
skeleton_joints_pkl: /home/MMnet/results/ntu60/xsub/joint_result_msg3d.pkl
skeleton_bones_pkl: /home/MMnet/results/ntu60/xsub/bone_result_msg3d.pkl

feeder: feeder.feeder_rgb_fused_ntu.Feeder
train_feeder_args:
  debug: False
  random_choose: False
  centralization: False
  random_move: False
  window_size: -1
  random_flip: False
  random_interval: False
  temporal_rgb_frames: 5
  data_path: /data/mmnet_skeleton/ntu60/xsub/train_data_joint.npy
  label_path: /data/mmnet_skeleton/ntu60/xsub/train_label.pkl
test_feeder_args:
  debug: False
  centralization: False
  evaluation: True
  temporal_rgb_frames: 5
  data_path: /data/mmnet_skeleton/ntu60/xsub/val_data_joint.npy
  label_path: /data/mmnet_skeleton/ntu60/xsub/val_label.pkl

model: net.mmn.Model
model_args:
  in_channels: 3
  num_class: 60
  dropout: 0.5
  edge_importance_weighting: True
  graph_args:
    layout: 'ntu-rgb+d'
    strategy: 'spatial'

temporal_positions: 15
fix_weights: True
joint_weights: models/ntu60/xsub/joint_model_stgcn.pt
device: [0,1,6,7]
weight_decay: 0.0001
base_lr: 0.1
step: [10, 50]
batch_size: 64
test_batch_size: 64
num_epoch: 80
debug: False
@bruceyo Dear author, when running the "Generate Region of Interest" step, I can't understand this instruction: "python tools/data_gen/gen_fivefs_". What does "gen_fivefs" mean? Please answer these questions.
Your work is great. I am in the process of reproducing your work and can't find the file "main_bone.py". Can you tell me where to find it?
Dear Professor, could you please update the code of the project? When I followed the README to reproduce the model in the paper, I found that a lot of files were missing from the project. I hope you will take my suggestion. Thanks.
Hello!
I want to follow your work, but the readme.md you provided does not match the code, so I cannot reproduce your work smoothly. Can you provide or update a more detailed readme.md document?
When I use python main_skeleton.py recognition -c config/ntu60_xview/train_joint.yaml to train the NTU60 joint model, I get this error. How can I fix it? Thanks.
@bruceyo Hello, author. For the training module, at the "skeleton joint run" stage, may I ask where the corresponding file is? If it is convenient, please provide the Python file for this processing step, and likewise for the files used in the "skeleton bone run". Finally, for RGB processing, should the relevant processing files refer to "main_rgb.py"? I would appreciate your timely reply.
Dear author, I have some questions about your paper MMNet: A Model-based Multimodal Network for Human Action Recognition in RGB-D Videos.
I use the code you posted on GitHub, but how are some of the files generated?
E.g., data/ntu_st_gcn/xsub/val_label.pkl: this file exists but differs from the pkl of the test data.
I followed the code you posted on GitHub and retrained to get a new pkl file, but when using the ensemble file to fuse the results, I obtained extremely low results. I put the result in the image below:
How do I get the results reported in your paper?
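For reference, joint/bone ensembles in ST-GCN/MS-G3D-style repos typically take a weighted sum of the per-class scores stored in the two result pkl files. A minimal sketch under that assumption; the pkl layout (sample name mapped to a score vector, labels stored as (names, labels)) and the weight alpha are assumptions, so check MMNet's own ensemble script:

```python
import pickle
import numpy as np

# Assumed layout: each result pkl maps sample name -> per-class scores;
# the label pkl holds a (sample_names, labels) pair.
with open("results/ntu60/xsub/joint_result_msg3d.pkl", "rb") as f:
    joint = pickle.load(f)
with open("results/ntu60/xsub/bone_result_msg3d.pkl", "rb") as f:
    bone = pickle.load(f)
with open("data/ntu/xsub/val_label.pkl", "rb") as f:
    names, labels = pickle.load(f)

alpha = 1.0  # bone-stream weight; tune on validation data
correct = 0
for name, label in zip(names, labels):
    # Weighted score-level fusion of the two streams.
    score = np.array(joint[name]) + alpha * np.array(bone[name])
    correct += int(score.argmax() == int(label))
print("top-1 accuracy:", correct / len(labels))
```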
Dear Author,
I'm trying to train the NTU60 RGB video model, but I noticed that the ST-ROI you provided is for testing purposes only.
Therefore, I would like to generate the ST-ROI for training. However, I couldn't find the tools/data_gen/gen_fivefs_<dataset> script mentioned in the readme.md anywhere in the given code. Could you share that code?
Also, I found similar code for generating fivefs in MMNet/feeder/segment_rgbbody_ntu.py. I thought that using this code we could generate the ST-ROI for training. Is that possible?
Could you let me know how to generate the ST-ROI for training purposes?
Thank you!
I am not familiar with the processing steps for preparing the RGB modality and would like to know the details.
Also, is this process unnecessary if I download the preprocessed ST-ROI and place it in the specified location?
Currently, I have downloaded it from Google Drive and placed it under the path marked in this image.
Hello!
I have a question about feeder_rgb_fused_ntu.py. It calls rgb_roi.construct_st_roi() to build the ST-ROI. However, I found that the provided train_rgb_fused.yaml and test_rgb_fused.yaml both set self.random_interval to False. Therefore, in segment_rgbbody_ntu.py, when construct_st_roi() computes the frame_range, it starts from 0 with a step of len(frames) // sequence_length and ends at len(frames). This means the RGB frames are equally spaced during both training and testing, rather than randomly selected.
However, in an earlier question you mentioned that RGB frames are randomly sampled, which confuses me a lot.
Should I modify train_rgb_fused.yaml to set random_interval to True during training?
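To make the difference concrete, here is a minimal sketch of the two frame-sampling strategies described above; this is illustrative code, not the exact construct_st_roi() implementation:

```python
import random

def uniform_indices(num_frames, sequence_length):
    # random_interval: False -- deterministic, equally spaced frames:
    # start at 0, step num_frames // sequence_length.
    step = max(num_frames // sequence_length, 1)
    return list(range(0, num_frames, step))[:sequence_length]

def random_interval_indices(num_frames, sequence_length):
    # random_interval: True -- one randomly chosen frame per equal segment.
    step = max(num_frames // sequence_length, 1)
    return [random.randrange(i, min(i + step, num_frames))
            for i in range(0, step * sequence_length, step)]

print(uniform_indices(60, 5))          # always [0, 12, 24, 36, 48]
print(random_interval_indices(60, 5))  # e.g. [7, 15, 30, 41, 55]
```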