3dpose / gntcn
Graph and Temporal Convolutional Networks for 3D Multi-person Pose Estimation in Monocular Videos
License: MIT License
Thanks for your work. I am trying to understand where the velocity-TCN is used in the code, e.g. on MuPoTS. I can only see the joint-TCN and root-TCN in the code. Can you point me in the right direction?
Thanks for your awesome work. I would really like to know how the provided pre-trained model was obtained.
I would be very glad if you could help me.
Thank you for sharing such amazing work. I have gone through the code and I have a question about how to get the confidence scores of the PAF for bone-GCN.
In the current eval_gt_h36m.py, I understand that we assume the confidence score of each joint is 1, so we get affpts = torch.ones(bsize,17,17).cuda() / 17. If we want to test our own custom images, we should use the real confidence scores of each joint from OpenPose.
However, when I looked at the output data from OpenPose, I couldn't find where to get the confidence scores of the PAF. I assume this part: affb = torch.ones(bsize,16,16).cuda() / 16 is where we should put the actual scores. It would be great if you could share details about how to generate the confidence scores of the PAF.
Thank you so much for your time and help.
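For readers with the same question: the repo does not ship this step, but one plausible way to replace the uniform torch.ones(...)/17 and torch.ones(...)/16 fallbacks with real confidences is to build row-normalized outer products of per-joint and per-bone scores. This is a hedged sketch, not the authors' code; the function name and the idea of using an outer product are assumptions.

```python
import torch

def build_affinities(kpt_conf, paf_conf):
    """Sketch: build joint (affpts) and bone (affb) affinity matrices
    from per-joint and per-bone confidence scores.

    kpt_conf: (bsize, 17) keypoint confidences (e.g. from OpenPose)
    paf_conf: (bsize, 16) per-bone confidences (e.g. mean PAF score
              along each limb from OpenPose's matching step)
    """
    # Outer product spreads each joint's confidence over the 17x17
    # adjacency; each row is then normalized to sum to 1, so uniform
    # confidences reduce to the repo's torch.ones(bsize,17,17)/17.
    affpts = kpt_conf.unsqueeze(2) * kpt_conf.unsqueeze(1)
    affpts = affpts / affpts.sum(dim=2, keepdim=True).clamp(min=1e-6)

    # Same construction for the 16 bones feeding the bone-GCN.
    affb = paf_conf.unsqueeze(2) * paf_conf.unsqueeze(1)
    affb = affb / affb.sum(dim=2, keepdim=True).clamp(min=1e-6)
    return affpts, affb
```

With all confidences equal to 1, this reproduces the uniform matrices used in eval_gt_h36m.py, which is a useful sanity check before plugging in real scores.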
Thank you for sharing your work. I have one question:
Thank you again for sharing your amazing work. I'd like to know whether it's possible to share the points_eval.pkl file. More specifically, I'd like to know the format of p2d and p3d (i.e. p2d, p3d = dataset[i]). Is there any preprocessing of the 2D joint data (e.g. normalized to 0-1, relative to the root joint, etc.)?
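Since the question is about 2D/3D preprocessing, here is a hedged sketch of one common scheme in this literature (scale-normalized 2D pixels plus root-relative 3D joints). The image size and the root-joint index are assumptions for illustration, not confirmed values from the repo.

```python
import numpy as np

ROOT = 14  # assumption: index of the pelvis/root joint in the 17-joint order

def preprocess(p2d, p3d, img_size=2048.0):
    """Sketch of a typical preprocessing step, NOT the repo's confirmed one.

    p2d: (N, 17, 2) pixel coordinates
    p3d: (N, 17, 3) camera-space 3D joints
    """
    p2d_norm = p2d / img_size                 # rescale pixels toward [0, 1]
    p3d_rel = p3d - p3d[:, ROOT:ROOT + 1]     # subtract root: root-relative 3D
    return p2d_norm, p3d_rel
```

If the authors confirm the actual normalization (e.g. the divisor used in their eval scripts), only `img_size` and `ROOT` would need to change.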
I have been trying to use GnTCN on a custom image these days, but there are some parts that still need your guidance or clarification. Let me know if it's OK to email you to discuss more instead of posting issues here.
Thank you so much for your help and time.
Thanks for making your amazing work open in public!
I am a newcomer to this topic (human pose estimation), so I'm not familiar with reading the code.
I have a question about the TCN part.
Even after reading your paper thoroughly, I am not quite sure about the TCN part of this model architecture.
On page 5 of your paper, there is equation (8) which handles the importance of the output from Root-TCN and velocity-TCN.
I wonder where I can find this part in the code.
Thanks!
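For readers with the same question about combining the Root-TCN and velocity-TCN outputs: one illustrative form such a fusion could take is a convex weighting of the two depth estimates. This is only a sketch of the general idea; the paper's actual Eq. (8) may use learned or confidence-based weights, and the function below is not the authors' implementation.

```python
def fuse_depth(z_root, z_velocity, w=0.5):
    """Illustrative fusion of two root-depth estimates.

    z_root: absolute depth predicted by the Root-TCN
    z_velocity: depth propagated from the velocity-TCN's motion estimate
    w: weight in [0, 1]; the paper's weighting scheme may differ
    """
    return w * z_root + (1.0 - w) * z_velocity
```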
I wonder if you can share the details of the network architecture of the GCNs and TCNs; it's hard to understand the implementation from the code alone.
Thank you very much for sharing your great work.
I want to infer 3D human poses from a new image or video as input using GnTCN. Is that possible with this repo?
I have limited knowledge of this technology, so apologies in advance for the basic question.
Thanks for the code and your work!
Why do you use 2D ground-truth joints as input for evaluation?
Is this a lifting network rather than a full 2D+3D pose estimation work like https://github.com/mks0601/3DMPPE_POSENET_RELEASE? Is that right?
Thanks
G.
There is something that confuses me. There is a value of 915 on line 32 of calculate_mupots_depth.py; what does it represent, and how is it derived?
p2d = torch.from_numpy(p2d).cuda() / 915
Dear authors, thank you for sharing such great work with the public. While trying to write training code for other datasets, I don't know how to generate the affb and affpts matrices. In your paper, I found that affpts can be calculated using confidence heatmaps and affb can be calculated using PAFs. However, it is not clear where you generate affb and affpts. Can you provide related code or some tools for generating them?
Can you provide inference code for testing other video inputs?
Hi,
Thanks for the great work.
I wonder if the code for estimating 3D poses from images instead of videos can be released.
Hello,
First of all, thank you for publishing your code.
I am having some trouble understanding how absolute coordinates are evaluated on MuPoTS.
What threshold value is used when computing PCK_abs?
I noticed that the threshold for PCK_rel is 150 mm and 250 mm for AP_root, but I am not sure about PCK_abs.
Thanks in advance for your response and your time.
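For context on the metric itself: PCK at a threshold is simply the fraction of joints whose error falls below that threshold, with PCK_rel computed on root-relative joints and PCK_abs on absolute (camera-space) joints. A minimal sketch, assuming per-joint Euclidean errors in millimetres (this is the standard definition, not code from the repo):

```python
import numpy as np

def pck(errors_mm, threshold_mm):
    """Percentage of Correct Keypoints: fraction of joint errors
    below the threshold. For PCK_rel the errors are root-relative;
    for PCK_abs they are absolute camera-space errors."""
    errors_mm = np.asarray(errors_mm)
    return float((errors_mm < threshold_mm).mean())
```

The open question in this issue is only which `threshold_mm` the authors use for PCK_abs; the formula is the same either way.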
Hello.
I have a question about the license of this work.
Could you tell me about it?
Thank you for the amazing work. I would like to gain a bit more clarity about the variable names used in the code and what they represent physically. In particular, what does p2d refer to? On checking the dimensions of this variable, it was about (500, 17, 2); what does the 500 (the number in that position, the bsize variable) represent?
Furthermore, if I would like to test this pipeline on a custom dataset (a video stream), how should I prepare the corresponding points_eval.pkl file? In other words, what is the format of points_eval.pkl for a given video stream?
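While waiting for an authoritative answer, the file's structure can be inspected directly. The helper below is a sketch; the only assumption taken from the repo is the usage pattern p2d, p3d = dataset[i] quoted in another issue.

```python
import pickle
import numpy as np

def describe_dataset(path):
    """Load a points_eval.pkl-style pickle and report its layout:
    number of entries and the shapes of the first (p2d, p3d) pair."""
    with open(path, 'rb') as f:
        dataset = pickle.load(f)
    p2d, p3d = dataset[0]  # pairing assumed from: p2d, p3d = dataset[i]
    return len(dataset), np.shape(p2d), np.shape(p3d)
```

Running this on the provided points_eval.pkl would answer the format question empirically, e.g. confirming whether the leading dimension (the 500) is the number of frames.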
Your help would be highly appreciated.
Thank You
Thanks for sharing your amazing work.
I read in your paper that you use the Root-TCN model to get the absolute poses, and I have some questions about that:
1- Actually, I tried to load the pre-trained model from the /ckpts/model_root folder using the Discriminator2D class as follows:
roottcn = networktcn.Discriminator2D()
M.Saver(roottcn).restore('./ckpts/model_root/')
However, I'm getting this error:
Is there a problem with the network, or have I loaded it incorrectly? Also, I would like to know the input size: visualising the .pth model, I see a size of (1024, 34, 3), and according to the paper the input is the 2D poses of each person, but the Discriminator2D class doesn't take any parameters. Could you please help me with this issue?
2- My second question is about the data format used to train this model to estimate Z/f. Did you use the intrinsic parameters and poses given in Human3.6M for that?
Thank you in advance for your time.
Hi, thanks for your great work. I have just gotten into 3D human pose estimation, and I admire your work.
Your paper estimates Z/f using a weak-perspective model, while SMAP [1] estimates Zw/f, where Z is the original depth and f and w are the focal length and the image width, both in pixels. Which is better? I think your estimate lives in real space while theirs is in a normalized space.
[1] SMAP: Single-Shot Multi-Person Absolute 3D Pose Estimation
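To make the comparison in the question concrete: both parameterizations recover the metric depth Z once the camera intrinsics are known, they just carry a different scale. A small sketch of the two inversions (illustrative only; the numbers and function names are not from either paper's code):

```python
def depth_from_z_over_f(z_over_f, f):
    """GnTCN-style: the network predicts Z/f, so Z = (Z/f) * f."""
    return z_over_f * f

def depth_from_zw_over_f(zw_over_f, f, w):
    """SMAP-style: the network predicts Z*w/f, so Z = (Zw/f) * f / w."""
    return zw_over_f * f / w
```

The practical difference is only which camera quantities must be known at test time: f alone for Z/f, versus f and the image width w for Zw/f.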
Hi,
Can you share the pre-trained model again? The Dropbox link says that the files have been deleted. Thank you!