Comments (12)

HowieMa avatar HowieMa commented on May 25, 2024

During training, the current best epoch is selected based on the [email protected] * BBOX on the validation dataset.
The inference code and the PCK calculation code are the same for the validation and test datasets. Your log seems a little strange, because in my experiments the validation PCK and the test PCK are quite close.
I am not sure how you crop the Panoptic dataset or how you split your training/dev/test sets, so I have released the preprocessed data from my experiments; you can download it from here to check whether it solves your problem. But please DO NOT duplicate it for any commercial purpose; the copyright still belongs to Panoptic.
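
For context, PCK here counts the fraction of keypoints whose predicted location lies within a threshold proportional to the bounding-box size. A minimal sketch of such a check, assuming (N, 21, 2) tensors for predictions and ground truth and a per-sample box size; the function name and threshold ratio are illustrative, not the repo's code:

    import torch

    def pck(pred, gt, bbox_size, thr_ratio):
        # pred, gt: (N, 21, 2) keypoint coordinates; bbox_size: (N,) per-sample box size
        dist = torch.norm(pred - gt, dim=-1)          # (N, 21) Euclidean distances
        thr = thr_ratio * bbox_size.unsqueeze(-1)     # (N, 1) per-sample threshold
        return (dist <= thr).float().mean().item()    # fraction of keypoints within threshold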

www516717402 avatar www516717402 commented on May 25, 2024

Thank you for the reply. I tested your project on OneHand10K and got the same accuracy as the paper. However, on real video at 1280*720 resolution the keypoints vibrate severely and the confidence is low. I have already modified your model with an Hourglass backbone and added center detection and other tricks, and obtained a more stable model. Regarding the limb mask trick, the idea is similar to this paper: Multi-Scale Structure-Aware Network for Human Pose Estimation.

HowieMa avatar HowieMa commented on May 25, 2024

Since our model is trained only on images of size 368 * 368, it may not work very well on high-resolution images. Thank you for adapting our model with Hourglass, and I hope you can share your results.
Besides, thank you for sharing this paper; I will read it soon. Actually, our limb mask idea originally came from the Part Affinity Fields (PAF) of OpenPose. The idea of limb representation is very common in pose estimation, and there are many papers discussing it.

www516717402 avatar www516717402 commented on May 25, 2024

I am glad to share my code. Currently, we are annotating data to train our model and modifying part of the finger detection. After the paper is published, I will discuss open-sourcing the code with my mentor. Thank you again for sharing the project.

HowieMa avatar HowieMa commented on May 25, 2024

I am glad to share my code. Currently, we are annotating data to train our model and modifying part of the finger detection. After the paper is published, I will discuss open-sourcing the code with my mentor. Thank you again for sharing the project.

I look forward to your paper and code, and I would really appreciate it if you could cite my paper. Thank you!

www516717402 avatar www516717402 commented on May 25, 2024

Of course, this project helped me a lot.

aqsc avatar aqsc commented on May 25, 2024

Training does pad the hand to 2.2B size on the Panoptic data, but the model only gets good results on the Panoptic dataset; it gets worse results when tested on other image sets, such as OneHand10K or self-captured pictures. I think the padding size has a large impact.
Another question: if we train the hand keypoint model on a merged dataset of Panoptic and OneHand10K (or other hand datasets) with different padding sizes, can we get better results when testing hands with different padding sizes?

HowieMa avatar HowieMa commented on May 25, 2024

Training does pad the hand to 2.2B size on the Panoptic data, but the model only gets good results on the Panoptic dataset; it gets worse results when tested on other image sets, such as OneHand10K or self-captured pictures. I think the padding size has a large impact.
Another question: if we train the hand keypoint model on a merged dataset of Panoptic and OneHand10K (or other hand datasets) with different padding sizes, can we get better results when testing hands with different padding sizes?

Yes, you are correct; the padding size can have a large impact. The model I released is trained only on the preprocessed Panoptic dataset, so it may only work well with the fixed 2.2B bounding box. Besides, the Panoptic (P) dataset has a totally different distribution from OneHand10K (O): the background of P is just the lab, while the background of O is in the wild. Thus I think it is unfair to test a model trained on P on O.

For the second question, it may work well, since the hands in O can occupy any fraction of the image area. For this, you may need to adjust the hyperparameters (the sigma of the LPM and the width of the LDM) to make them consistent with the hand size.

By the way, the goal of this paper is just to improve performance algorithmically, not to build a general hand pose estimation system that works in all scenes :)
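
To make the 2.2B crop convention above concrete, here is a rough sketch that crops a square region 2.2 times the tight hand box and resizes it to the 368 * 368 input; the helper name and tensor shapes are assumptions for illustration, not the actual preprocessing code (the keypoint labels would of course need the same shift and scale applied):

    import torch
    import torch.nn.functional as F

    def crop_hand(image, bbox, pad_ratio=2.2, out_size=368):
        # image: (C, H, W) float tensor; bbox: (x_min, y_min, x_max, y_max) in pixels
        _, h, w = image.shape
        x_min, y_min, x_max, y_max = bbox
        cx, cy = (x_min + x_max) / 2.0, (y_min + y_max) / 2.0
        side = pad_ratio * max(x_max - x_min, y_max - y_min)    # square side = 2.2 * B
        x0, x1 = int(max(cx - side / 2, 0)), int(min(cx + side / 2, w))
        y0, y1 = int(max(cy - side / 2, 0)), int(min(cy + side / 2, h))
        crop = image[:, y0:y1, x0:x1].unsqueeze(0)              # (1, C, h', w')
        crop = F.interpolate(crop, size=(out_size, out_size),
                             mode='bilinear', align_corners=False)
        return crop.squeeze(0)                                  # (C, 368, 368)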

aqsc avatar aqsc commented on May 25, 2024

We can see that invisible keypoints are unannotated in dataset O, as their values are -1 in the label. If we want to train on O, should we modify the HandDataset_LPM class in hand_lpm.py? And are the keypoint ground-truth values still written as -1 in labels.json?

HowieMa avatar HowieMa commented on May 25, 2024

We can see that invisible keypoints are unannotated in dataset O, as their values are -1 in the label. If we want to train on O, should we modify the HandDataset_LPM class in hand_lpm.py? And are the keypoint ground-truth values still written as -1 in labels.json?

For this issue, you can make the corresponding heatmaps all zeros during training. When evaluating PCK, if a label is -1, you can simply ignore that keypoint.
So you may need to modify the data loader function.
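
On the evaluation side, a PCK variant that skips the unannotated keypoints could look like the sketch below; the tensor shapes and names are assumptions, not the repo's code:

    import torch

    def pck_ignore_invisible(pred, gt, bbox_size, thr_ratio):
        # pred, gt: (N, 21, 2); ground-truth entries of -1 mark unannotated keypoints
        dist = torch.norm(pred - gt, dim=-1)           # (N, 21)
        valid = gt[..., 0] != -1                       # mask of annotated keypoints
        thr = thr_ratio * bbox_size.unsqueeze(-1)      # (N, 1)
        correct = (dist <= thr) & valid                # correct AND annotated
        return (correct.sum().float() / valid.sum().clamp(min=1).float()).item()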

HowieMa avatar HowieMa commented on May 25, 2024

Can you share with us the data loader function or other functions you modified when training on the O dataset?

There are no tricks in this function, and as I already said clearly, you just need to set the heatmaps of invisible keypoints to all zeros. That is really simple to code yourself, just two lines of code ... I believe you can do it within a few seconds :)

For the code in the data loader, it is just:

    def gen_label_heatmap(self, label):
        label = torch.Tensor(label)     # (21,2)
        grid = torch.zeros((self.label_size, self.label_size, 2))       # size:(46,46,2)
        grid[..., 0] = torch.Tensor(range(self.label_size)).unsqueeze(0)
        grid[..., 1] = torch.Tensor(range(self.label_size)).unsqueeze(1)
        grid = grid.unsqueeze(0)
        labels = label.unsqueeze(-2).unsqueeze(-2)
        exponent = torch.sum((grid - labels)**2, dim=-1)    # size:(21,46,46)
        heatmaps = torch.exp(-exponent / 2.0 / self.sigma / self.sigma)  # size:(21,46,46)

        # Here is the only difference  *******************************
        invisible = (label[:, 0] == -1)     # set invisible heat maps to zero
        heatmaps[invisible, ...] = 0
        # **********************************************************
        return heatmaps

For the sigma in the LPM: since hand sizes vary in OneHand10K, I set it to 0.03 of the bounding box size at the input image scale (368 * 368). You can adjust it yourself to get better results; I just set it casually ... :)
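
One way to turn that rule of thumb into code, assuming the bounding-box side is measured at the 368 * 368 input scale and sigma is then used on the 46 * 46 heatmap grid (so it is rescaled accordingly); this helper is hypothetical, and the repo may handle the scaling differently:

    def sigma_from_bbox(bbox_side, ratio=0.03, image_size=368, label_size=46):
        # sigma at the input scale, converted to heatmap-grid units
        sigma_input = ratio * bbox_side
        return sigma_input * label_size / image_size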

HowieMa avatar HowieMa commented on May 25, 2024

Can you share with us the data loader function or other functions you modified when training on the O dataset?

By the way, it is better to open a new issue if you still have questions, rather than having a long discussion in someone else's issue that is not relevant to your question.
I noticed that the owner of this issue just closed it. I hope our discussion does not bother them :)
