3dpose / gntcn
Graph and Temporal Convolutional Networks for 3D Multi-person Pose Estimation in Monocular Videos
License: MIT License
Thanks for your work. I am trying to understand where the velocity-TCN is used in the code, e.g. on MuPoTS. I can only see the joint-TCN and root-TCN in the code. Can you point me in the right direction?
Thanks for your awesome work. I would really like to know how the provided pre-trained model was obtained.
I would be very glad if you could help me.
Thank you for sharing such amazing work. I have gone through the code and I have a question about how to get the confidence scores of the PAF for bone-GCN.
In the current eval_gt_h36m.py, I understand that we assume the confidence score of each joint is 1, so we get affpts = torch.ones(bsize,17,17).cuda() / 17. If we want to test our own custom images, we should use the real confidence scores of each joint from OpenPose.
However, when I looked at the output data from OpenPose, I couldn't find where to get the confidence scores of the PAF. I assume this part: affb = torch.ones(bsize,16,16).cuda() / 16 is where we should put the actual scores. It would be great if you could share details about how to generate the confidence scores of the PAF.
Thank you so much for your time and help.
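For readers with the same question: the repo does not ship this step, but one plausible way to replace the uniform torch.ones(...)/17 and torch.ones(...)/16 fallbacks with real confidences is to build row-normalized outer products of per-joint and per-bone scores. This is a hedged sketch, not the authors' code; the function name and the idea of using an outer product are assumptions.

```python
import torch

def build_affinities(kpt_conf, paf_conf):
    """Sketch: build joint (affpts) and bone (affb) affinity matrices
    from per-joint and per-bone confidence scores.

    kpt_conf: (bsize, 17) keypoint confidences (e.g. from OpenPose)
    paf_conf: (bsize, 16) per-bone confidences (e.g. mean PAF score
              along each limb from OpenPose's matching step)
    """
    # Outer product spreads each joint's confidence over the 17x17
    # adjacency; each row is then normalized to sum to 1, so uniform
    # confidences reduce to the repo's torch.ones(bsize,17,17)/17.
    affpts = kpt_conf.unsqueeze(2) * kpt_conf.unsqueeze(1)
    affpts = affpts / affpts.sum(dim=2, keepdim=True).clamp(min=1e-6)

    # Same construction for the 16 bones feeding the bone-GCN.
    affb = paf_conf.unsqueeze(2) * paf_conf.unsqueeze(1)
    affb = affb / affb.sum(dim=2, keepdim=True).clamp(min=1e-6)
    return affpts, affb
```

With all confidences equal to 1, this reproduces the uniform matrices used in eval_gt_h36m.py, which is a useful sanity check before plugging in real scores.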
Thank you for sharing your work. I have one question:
Thank you again for sharing your amazing work. I'd like to know whether it's possible to share the points_eval.pkl file. More specifically, I'd like to know the format of p2d and p3d (i.e. p2d, p3d = dataset[i]). Is there any preprocessing of the 2D joint data (e.g. normalized to 0-1, relative to the root joint, etc.)?
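Since the question is about 2D/3D preprocessing, here is a hedged sketch of one common scheme in this literature (scale-normalized 2D pixels plus root-relative 3D joints). The image size and the root-joint index are assumptions for illustration, not confirmed values from the repo.

```python
import numpy as np

ROOT = 14  # assumption: index of the pelvis/root joint in the 17-joint order

def preprocess(p2d, p3d, img_size=2048.0):
    """Sketch of a typical preprocessing step, NOT the repo's confirmed one.

    p2d: (N, 17, 2) pixel coordinates
    p3d: (N, 17, 3) camera-space 3D joints
    """
    p2d_norm = p2d / img_size                 # rescale pixels toward [0, 1]
    p3d_rel = p3d - p3d[:, ROOT:ROOT + 1]     # subtract root: root-relative 3D
    return p2d_norm, p3d_rel
```

If the authors confirm the actual normalization (e.g. the divisor used in their eval scripts), only `img_size` and `ROOT` would need to change.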
I have been trying to use GnTCN on a custom image these days, but there are some parts that still need your guidance or clarification. Let me know if it's OK to email you to discuss more instead of posting issues here.
Thank you so much for your help and time.
Thanks for making your amazing work open in public!
I am a newcomer to this topic (human pose estimation), so I'm not familiar with reading the code.
I have a question about the TCN part.
Even after reading your paper thoroughly, I am not quite sure about the TCN part of this model architecture.
On page 5 of your paper, there is equation (8) which handles the importance of the output from Root-TCN and velocity-TCN.
I wonder where I can find this part in the code.
Thanks!
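For readers with the same question about combining the Root-TCN and velocity-TCN outputs: one illustrative form such a fusion could take is a convex weighting of the two depth estimates. This is only a sketch of the general idea; the paper's actual Eq. (8) may use learned or confidence-based weights, and the function below is not the authors' implementation.

```python
def fuse_depth(z_root, z_velocity, w=0.5):
    """Illustrative fusion of two root-depth estimates.

    z_root: absolute depth predicted by the Root-TCN
    z_velocity: depth propagated from the velocity-TCN's motion estimate
    w: weight in [0, 1]; the paper's weighting scheme may differ
    """
    return w * z_root + (1.0 - w) * z_velocity
```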
I wonder if you can share the details of the network architecture of the GCNs and TCNs; it's hard to understand the implementation from the code alone.
Thank you very much for sharing your great work.
I want to infer 3D human poses from a new image or video as input using GnTCN. Is that possible with this repo?
I have limited knowledge of this technology, so apologies in advance for the basic question.
Thanks for the code and your work!
Why do you use 2D ground-truth joints as input for evaluation?
Is this a lifting network rather than a full 2D+3D pose estimation work like https://github.com/mks0601/3DMPPE_POSENET_RELEASE? Is that right?
Thanks
G.
There is something that confuses me. There is a value of 915 on line 32 of calculate_mupots_depth.py; what does it represent, and how is it derived?
p2d = torch.from_numpy(p2d).cuda() / 915
Dear authors, thank you for sharing such great work with the public. While trying to write training code for other datasets, I don't know how to generate the affb and affpts matrices. In your paper, I found that affpts can be calculated using confidence heatmaps and affb can be calculated using PAFs. However, it is not clear where you generate affb and affpts. Can you provide related code or some tools for generating them?
Can you provide inference code for testing other video inputs?
Hi,
Thanks for the great work.
I wonder if the code for estimating 3D poses from images instead of videos can be released.
Hello,
First of all, thank you for publishing your code.
I am having some trouble understanding how absolute coordinates are evaluated on MuPoTS.
What threshold value is used when computing PCK_abs?
I noticed that the threshold for PCK_rel is 150 mm and 250 mm for AP_root, but I am not sure about PCK_abs.
Thanks in advance for your response and your time.
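For context on the metric itself: PCK at a threshold is simply the fraction of joints whose error falls below that threshold, with PCK_rel computed on root-relative joints and PCK_abs on absolute (camera-space) joints. A minimal sketch, assuming per-joint Euclidean errors in millimetres (this is the standard definition, not code from the repo):

```python
import numpy as np

def pck(errors_mm, threshold_mm):
    """Percentage of Correct Keypoints: fraction of joint errors
    below the threshold. For PCK_rel the errors are root-relative;
    for PCK_abs they are absolute camera-space errors."""
    errors_mm = np.asarray(errors_mm)
    return float((errors_mm < threshold_mm).mean())
```

The open question in this issue is only which `threshold_mm` the authors use for PCK_abs; the formula is the same either way.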
Hello.
I have a question about the license of this work.
Could you tell me about it?
Thank you for the amazing work. I would like to gain a bit more clarity about the variable names used in the code and what they represent physically. In particular, what does p2d refer to? On checking the dimensions of this variable, it was about (500, 17, 2); what does the 500 (the number in that position, the bsize variable) represent?
Furthermore, if I would like to test this pipeline on a custom dataset (a video stream), how should I prepare the corresponding points_eval.pkl file? In other words, what is the format of points_eval.pkl for a given video stream?
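While waiting for an authoritative answer, the file's structure can be inspected directly. The helper below is a sketch; the only assumption taken from the repo is the usage pattern p2d, p3d = dataset[i] quoted in another issue.

```python
import pickle
import numpy as np

def describe_dataset(path):
    """Load a points_eval.pkl-style pickle and report its layout:
    number of entries and the shapes of the first (p2d, p3d) pair."""
    with open(path, 'rb') as f:
        dataset = pickle.load(f)
    p2d, p3d = dataset[0]  # pairing assumed from: p2d, p3d = dataset[i]
    return len(dataset), np.shape(p2d), np.shape(p3d)
```

Running this on the provided points_eval.pkl would answer the format question empirically, e.g. confirming whether the leading dimension (the 500) is the number of frames.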
Your help would be highly appreciated.
Thank You
Thanks for sharing your amazing work.
I read in your paper that you use the Root-TCN model to get the absolute poses, and I have some questions about that:
1- Actually, I tried to load the pre-trained model from the /ckpts/model_root folder using the Discriminator2D class as follows:
roottcn = networktcn.Discriminator2D()
M.Saver(roottcn).restore('./ckpts/model_root/')
However, I'm getting this error:
Is there a problem with the network, or have I loaded it incorrectly? Also, I would like to know the input size: visualising the .pth model, I see a size of (1024, 34, 3), and according to the paper the input is the 2D poses of each person, but the Discriminator2D class doesn't take any parameters. Could you please help me with this issue?
2- My second question is about the data format used to train this model to estimate Z/f. Did you use the intrinsic parameters and poses given in Human3.6M for that?
Thank you in advance for your time.
Hi, thanks for your great work. I have just gotten into 3D human pose estimation, and I admire your work.
Your paper estimates Z/f using a weak-perspective model, while SMAP [1] estimates Zw/f, where Z is the original depth and f and w are the focal length and the image width, both in pixels. Which is better? I think your estimate lives in real space while theirs is in a normalized space.
[1] SMAP: Single-Shot Multi-Person Absolute 3D Pose Estimation
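To make the comparison in the question concrete: both parameterizations recover the metric depth Z once the camera intrinsics are known, they just carry a different scale. A small sketch of the two inversions (illustrative only; the numbers and function names are not from either paper's code):

```python
def depth_from_z_over_f(z_over_f, f):
    """GnTCN-style: the network predicts Z/f, so Z = (Z/f) * f."""
    return z_over_f * f

def depth_from_zw_over_f(zw_over_f, f, w):
    """SMAP-style: the network predicts Z*w/f, so Z = (Zw/f) * f / w."""
    return zw_over_f * f / w
```

The practical difference is only which camera quantities must be known at test time: f alone for Z/f, versus f and the image width w for Zw/f.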
Hi,
Can you share the pre-trained model again? The Dropbox link says that the files have been deleted. Thank you!