
gntcn's Issues

velocity-tcn

Thanks for your work. I am trying to understand where the velocity-TCN is used in the code, e.g. on MuPoTS. I can only see the joint-TCN and root-TCN in the code. Can you point me in the right direction?

How to train the GCN and TCN networks?

Thanks for your awesome work. I really want to know how the provided pre-trained models were obtained.
I would be very glad if you could help me.

The confidence scores of the part affinity field for bone-GCN

Thank you for sharing such amazing work. I have gone through the code and I have a question about how to get the confidence scores of the PAF for bone-GCN.

In the current eval_gt_h36m.py, I understand that we assume the confidence score of each joint is 1, which is why we get affpts = torch.ones(bsize,17,17).cuda() / 17. If we want to test our own custom images, we should use the real confidence scores of each joint from OpenPose.

However, when I looked at the output data from OpenPose, I couldn't find where to get the confidence scores of the PAF. I assume this part, affb = torch.ones(bsize,16,16).cuda() / 16, is where we should put the actual scores. It would be great if you could share details about how to generate the confidence scores of the PAF.

Thank you so much for your time and help.
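For reference, a minimal sketch of one way to replace the uniform matrices with OpenPose-derived scores. It assumes an H36M-style 17-joint / 16-bone skeleton, reads each joint's confidence from the OpenPose JSON, and approximates each bone's score from its endpoint confidences (the JSON does not export the PAF itself); the joint remapping, edge list, and row normalization are guesses, not the authors' procedure:

```python
import json
import numpy as np
import torch

# Hypothetical H36M-style edge list: 16 bones over 17 joints. The repo's actual
# joint order and bone definition may differ.
BONES = [(0, 1), (1, 2), (2, 3), (0, 4), (4, 5), (5, 6), (0, 7), (7, 8),
         (8, 9), (9, 10), (8, 11), (11, 12), (12, 13), (8, 14), (14, 15), (15, 16)]

def confidences_from_openpose(json_path):
    """Return per-joint and per-bone confidences from one OpenPose JSON file."""
    with open(json_path) as f:
        people = json.load(f)["people"]
    # (x, y, confidence) triplets for the first detected person. OpenPose uses a
    # BODY_25/COCO joint order, so a proper remapping to the repo's 17 joints is
    # needed; taking the first 17 here is only a placeholder.
    kpts = np.array(people[0]["pose_keypoints_2d"]).reshape(-1, 3)
    joint_conf = kpts[:17, 2]
    # The JSON has no explicit PAF score, so approximate each bone's confidence
    # by the geometric mean of its endpoint confidences.
    bone_conf = np.array([np.sqrt(joint_conf[a] * joint_conf[b]) for a, b in BONES])
    return joint_conf, bone_conf

joint_conf, bone_conf = confidences_from_openpose("frame_000000_keypoints.json")
# One possible way to fill the matrices that eval_gt_h36m.py fills uniformly with
# 1/17 and 1/16: broadcast the confidences and renormalize each row
# (add .cuda() as in the script if needed).
affpts = torch.tensor(joint_conf, dtype=torch.float32).expand(17, 17).clone()
affpts = (affpts / affpts.sum(dim=1, keepdim=True)).unsqueeze(0)  # (1, 17, 17)
affb = torch.tensor(bone_conf, dtype=torch.float32).expand(16, 16).clone()
affb = (affb / affb.sum(dim=1, keepdim=True)).unsqueeze(0)        # (1, 16, 16)
```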

2 different checkpoints for GCN

Thank you for sharing your work. I have one question:

  1. There are 2 checkpoints for GCN, model_gcn and model_gcnwild. What is the difference between the two? I notice that model_gcn is used for H36M evaluation and model_gcnwild is used for MuPoTs. Is it because H36M and MuPoTs have different keypoint definitions?

Questions about "points_eval.pkl"

Thank you again for sharing your amazing work. I'd like to know whether it's possible to share the points_eval.pkl file. More specifically, I'd like to know the format of p2d and p3d (i.e. p2d, p3d = dataset[i]). Is there any preprocessing of the 2D joint data (e.g. normalizing to 0-1, making it relative to the root joint, etc.)?

I have been trying to use GnTCN on a custom image these days, but there are some parts that still need your guidance or clarification. Let me know if it's okay to email you to discuss further instead of posting issues here.

Thank you so much for your help and time.
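In case it helps while waiting for an answer, a quick way to probe the file's layout yourself, assuming it unpickles without needing the repo's custom classes on the path:

```python
import pickle
import numpy as np

# Load and probe points_eval.pkl. If the pickle stores a custom dataset class,
# the repo's modules must be importable; otherwise this shows the raw layout.
with open("points_eval.pkl", "rb") as f:
    data = pickle.load(f)

print(type(data), len(data))
p2d, p3d = data[0]                     # assumes the p2d, p3d = dataset[i] layout
p2d, p3d = np.asarray(p2d), np.asarray(p3d)
print("p2d:", p2d.shape, p2d.min(), p2d.max())
print("p3d:", p3d.shape, p3d.min(), p3d.max())
# If p2d values lie in [0, 1] they are probably normalized image coordinates;
# if one row of p3d is (near) zero, the 3D joints are probably root-relative.
```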

About the TCN part

Thanks for making your amazing work publicly available!
I am a newcomer to this topic (human pose estimation), so I'm not yet familiar with the code.
I have a question about the TCN part.
Even after reading your paper thoroughly, I am not quite sure about the TCN part of the model architecture.

On page 5 of your paper, equation (8) balances the importance of the outputs from the Root-TCN and the velocity-TCN.
I wonder where I can find this part in the code.

Thanks!
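Purely to make the idea behind that weighting concrete, here is a toy sketch of balancing a directly estimated root trajectory against one integrated from per-frame velocities. The function name, the constant weight, and the integration scheme are assumptions; this illustrates the fusion idea only, not the paper's exact Eq. (8) or the repo's implementation:

```python
import torch

def fuse_root_trajectory(root_direct, root_velocity, weight=0.5):
    """Toy illustration of weighting a per-frame root estimate (a Root-TCN-style
    output) against one obtained by integrating per-frame velocities (a
    velocity-TCN-style output).
    root_direct:   (T, 3) absolute root position per frame
    root_velocity: (T, 3) displacement from frame t-1 to frame t (for t >= 1)
    """
    # Integrate the velocities forward from the first directly-estimated root.
    integrated = torch.cat(
        [root_direct[0:1], root_direct[0:1] + torch.cumsum(root_velocity[1:], dim=0)],
        dim=0,
    )
    return weight * root_direct + (1.0 - weight) * integrated

fused = fuse_root_trajectory(torch.randn(10, 3), torch.randn(10, 3) * 0.01)
print(fused.shape)  # torch.Size([10, 3])
```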

About Network Architecture

I wonder if you could share the details of the network architecture of the GCNs and TCNs; it's hard to understand the implementation from the code alone.
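While waiting for the authors' details, a generic dilated temporal-convolution residual block of the kind such pose TCNs are typically built from may help as a mental model; the channel sizes, dilations, and normalization below are illustrative assumptions, not the actual GnTCN layers:

```python
import torch
import torch.nn as nn

class TCNBlock(nn.Module):
    """Generic dilated temporal-convolution residual block (illustrative only;
    the layer sizes and dilations in the GnTCN checkpoints may differ)."""
    def __init__(self, channels, kernel_size=3, dilation=1):
        super().__init__()
        pad = (kernel_size - 1) // 2 * dilation
        self.net = nn.Sequential(
            nn.Conv1d(channels, channels, kernel_size, padding=pad, dilation=dilation),
            nn.BatchNorm1d(channels),
            nn.ReLU(inplace=True),
            nn.Conv1d(channels, channels, kernel_size, padding=pad, dilation=dilation),
            nn.BatchNorm1d(channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):  # x: (batch, channels, frames)
        return x + self.net(x)

# Example: 2D joints flattened to channels (17 joints * 2 coords = 34) over 81 frames.
x = torch.randn(4, 34, 81)
block = TCNBlock(channels=34, kernel_size=3, dilation=3)
print(block(x).shape)  # torch.Size([4, 34, 81])
```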

3D coordinate for human pose

Thank you very much for sharing your great work.
I want to infer 3D human poses from a new image or video using GnTCN. Is that possible with this repo?
I have limited knowledge of this technology, so sorry in advance for the basic question.

affb,affpts

Dear authors, thank you for sharing such great work with the public. While trying to write training code for other datasets, I don't know how to generate the affb and affpts matrices. In your paper I found that affpts can be calculated from the confidence heatmaps and affb from the PAFs. However, it is not clear where you generate affb and affpts. Could you provide the related code or some tools for generating them?
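I can't speak for the authors' pipeline, but a common way to obtain such scores (following the OpenPose-style formulation the paper builds on) is to read each joint's confidence off its heatmap and to score each bone by sampling the PAF along the candidate limb. A rough sketch, with hypothetical function names and sampling density:

```python
import numpy as np

def joint_confidence(heatmaps, joints_2d):
    """Per-joint confidence = heatmap value at the detected joint location.
    heatmaps: (J, H, W); joints_2d: (J, 2) in heatmap pixel coordinates."""
    xs = np.clip(joints_2d[:, 0].round().astype(int), 0, heatmaps.shape[2] - 1)
    ys = np.clip(joints_2d[:, 1].round().astype(int), 0, heatmaps.shape[1] - 1)
    return heatmaps[np.arange(len(joints_2d)), ys, xs]

def bone_confidence(paf_x, paf_y, p_a, p_b, n_samples=10):
    """OpenPose-style PAF score for one bone: average projection of the PAF
    vectors sampled along the segment p_a -> p_b onto the bone direction.
    paf_x / paf_y: (H, W) PAF channels for this limb; p_a, p_b: (2,) endpoints."""
    v = p_b - p_a
    u = v / (np.linalg.norm(v) + 1e-8)
    scores = []
    for t in np.linspace(0.0, 1.0, n_samples):
        x, y = (p_a + t * v).round().astype(int)
        x = np.clip(x, 0, paf_x.shape[1] - 1)
        y = np.clip(y, 0, paf_x.shape[0] - 1)
        scores.append(paf_x[y, x] * u[0] + paf_y[y, x] * u[1])
    return float(np.mean(scores))

# Toy usage with random maps, just to show the expected shapes.
H, W, J = 64, 64, 17
conf = joint_confidence(np.random.rand(J, H, W), np.random.rand(J, 2) * [W, H])
score = bone_confidence(np.random.rand(H, W), np.random.rand(H, W),
                        np.array([10.0, 20.0]), np.array([40.0, 50.0]))
```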

PCK_abs threshold

Hello,
First of all, thank you for publishing your code.

I am having some trouble understanding how absolute coordinates are evaluated on MuPoTS.
What threshold value is used when computing PCK_abs?
I noticed that the threshold for PCK_rel is 150 mm and that the AP_root threshold is 250 mm, but I am not sure about PCK_abs.

Thanks in advance for your response and your time.
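For reference, a generic PCK computation with the threshold left as a parameter, since the PCK_abs threshold is exactly what is being asked; only the 150 mm relative threshold comes from the question above. This is a sketch, not the repo's evaluation code:

```python
import numpy as np

def pck(pred, gt, threshold_mm):
    """Percentage of Correct Keypoints: fraction of joints whose 3D error is
    below threshold_mm. pred / gt: (N, J, 3) in millimetres."""
    errors = np.linalg.norm(pred - gt, axis=-1)          # (N, J)
    return float((errors < threshold_mm).mean() * 100.0)

# PCK_rel: root-align predictions and ground truth first, then use 150 mm.
# PCK_abs: compare camera-space joints directly; its threshold is exactly what
# this issue asks about, so it stays a free parameter here.
pred = np.random.randn(10, 17, 3) * 50.0
gt = pred + np.random.randn(10, 17, 3) * 30.0
print(pck(pred, gt, threshold_mm=150.0))
```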

Regarding variable names and Preparation of the pickle file for custom data

Thank you for the amazing work. I would like to gain a bit more clarity about the variable names used in the code and what they represent physically. In particular, I would like to know what p2d refers to; checking its dimensions, it is about (500, 17, 2). What does the 500 (the number in that position, the bsize variable) represent?

Furthermore, if I would like to test this pipeline on a custom dataset (a video stream), how should I prepare the corresponding points_eval.pkl file? In other words, what is the format of the points_eval.pkl file for a given video stream?

Your help would be highly appreciated.

Thank You
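As a rough guess at what such a file might contain, based only on the (500, 17, 2) shape mentioned above, one could serialize per-frame 2D detections (and dummy 3D joints if no ground truth exists); the dictionary keys and overall layout are assumptions, not the actual points_eval.pkl format:

```python
import pickle
import numpy as np

# N frames (the "500" / bsize axis mentioned above), 17 joints.
num_frames = 500
p2d = np.zeros((num_frames, 17, 2), dtype=np.float32)  # fill with your 2D detections
p3d = np.zeros((num_frames, 17, 3), dtype=np.float32)  # dummy 3D GT if none exists

with open("points_custom.pkl", "wb") as f:
    pickle.dump({"p2d": p2d, "p3d": p3d}, f)            # keys are an assumption
```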

Root-TCN Details

Thanks for sharing your amazing work.
I read in your paper that you use the Root-TCN model to get the absolute poses, and I have some questions about that:

1- Actually, I tried to load the pre-trained model from the /ckpts/model_root folder using the Discriminator2D class as follows:

roottcn = networktcn.Discriminator2D()
M.Saver(roottcn).restore('./ckpts/model_root/')

However, I'm getting this error:
(error screenshot omitted)

Is there a problem with the network, or have I loaded it incorrectly? I would also like to know the size of the input: by visualising the .pth model I see a size of (1024, 34, 3), and according to the paper the input is the 2D poses of each person, yet the Discriminator2D constructor doesn't take any parameters. Could you please help me with this issue?

2- My second question is about the data format used to train this model to estimate Z/f. Did you use the intrinsic parameters and poses provided with Human3.6M for that?

Thank you in advance for your time.

How to apply to videos in the wild?

Hi, thanks for your great work. I have just started working on 3D human pose estimation, and I admire your work.
Your paper estimates Z/f using a weak-perspective model, whereas SMAP estimates Z*w/f, where Z is the original depth and f and w are the focal length and the image width, both in pixels. Which is better? I think your estimate is in real space and theirs is in a normalized space.

[1] SMAP: Single-Shot Multi-Person Absolute 3D Pose Estimation
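For what it's worth, the two parameterizations recover the same absolute depth once f and w are known; the difference is only in what the network regresses. A tiny worked example with made-up numbers:

```python
# Recovering the absolute root depth Z (in the units of the 3D pose) from either
# parameterization, given the focal length f and image width w in pixels.
# All numbers below are invented for illustration.
f = 1500.0                     # focal length in pixels
w = 1920.0                     # image width in pixels

z_over_f = 2.0                 # a GnTCN-style prediction of Z / f
Z_gntcn = z_over_f * f         # -> 3000.0

zw_over_f = z_over_f * w       # the same depth expressed as SMAP's Z * w / f
Z_smap = zw_over_f * f / w     # -> 3000.0 again
print(Z_gntcn, Z_smap)
```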

Pre-trained Model

Hi,

Can you share the pre-trained models again? The Dropbox link says that the files have been deleted. Thank you!
