
improved-body-parts's Introduction

Hi there 👋

  • 🔭 I’m currently working on Multimodal Emotion Recognition and Computer Vision.
  • 🌟 I'm interested in Neural Networks and Self-Supervised Learning.
  • 🌱 I like to learn interesting theories that can be diagrammed.


improved-body-parts's People

Contributors

hellojialee


improved-body-parts's Issues

About focal l2 loss

Hi,

Thank you for sharing this great work.
I have implemented the focal L2 loss but unfortunately didn't get better results compared with the normal L2 loss.
Here are some questions about the focal L2 loss.

  1. In the paper, you mentioned that before applying the focal L2 loss, the network is first trained with the normal L2 loss. Does that mean two-stage training: the first stage trains with the L2 loss, and then the best checkpoint from that stage is used as the initial weights to train the network with the focal L2 loss? Or can you train with the focal L2 loss directly, without any pretraining stage?
  2. Is the focal L2 loss sensitive to hyper-parameters? I adopted nearly the same hyper-parameters as your implementation; I guess maybe this is the reason why I didn't get better results?
    I'm looking forward to your suggestions.
    Thank you in advance.
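
For concreteness, here is a minimal sketch of the focal L2 loss as I read it from the paper and from the settings quoted in the other issues on this page (alpha = 0.1, beta = 0.02, gamma = 2; the released code reportedly uses alpha = beta = 0 with a factor of abs(1 - st) instead). Treat the function name and the threshold as assumptions, not the repo's verified code:

import torch

def focal_l2_loss(pred, gt, mask, alpha=0.1, beta=0.02, gamma=2.0, th=0.01):
    # st rescales the prediction so that confident foreground (gt near 1) and
    # confident background (gt near 0) pixels both yield st close to 1, which
    # shrinks the focal factor for these easy samples.
    st = torch.where(gt > th, pred - alpha, 1.0 - pred - beta)
    factor = (1.0 - st) ** gamma  # paper variant; the repo uses torch.abs(1. - st)
    return (factor * (pred - gt) ** 2 * mask).sum()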

Doesn't run without GPU

$ ls link2checkpoints_distributed/
PoseNet_52_epoch_without_optimizer_statue.pth
$ python3 ./demo_image.py --resume --opt-level O0 --image ~/in.jpg --output ~/out.jpg
[...]
Traceback (most recent call last):
  File "./demo_image.py", line 631, in <module>
    loss_scale=args.loss_scale)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/apex-0.1-py3.6.egg/apex/amp/frontend.py", line 358, in initialize
    return _initialize(models, optimizers, _amp_state.opt_properties, num_losses, cast_model_outputs)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/apex-0.1-py3.6.egg/apex/amp/_initialize.py", line 171, in _initialize
    check_params_fp32(models)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/apex-0.1-py3.6.egg/apex/amp/_initialize.py", line 93, in check_params_fp32
    name, param.type()))
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/apex-0.1-py3.6.egg/apex/amp/_amp_state.py", line 32, in warn_or_err
    raise RuntimeError(msg)
RuntimeError: Found param posenet.pre.conv1.weight with type torch.FloatTensor, expected torch.cuda.FloatTensor.
When using amp.initialize, you need to provide a model with parameters
located on a CUDA device before passing it no matter what optimization level
you chose. Use model.to('cuda') to use the default device.

Please advise.
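
A possible CPU-only workaround, as a minimal sketch under assumptions (the import path, constructor arguments, and checkpoint layout below are guesses, not the repo's verified API): load the checkpoint with map_location and skip the apex call entirely.

import torch
from models.posenet import PoseNet  # assumed import path; adjust to the repo layout

model = PoseNet()  # construct with the same options used for training (assumed)
ckpt = torch.load('link2checkpoints_distributed/PoseNet_52_epoch_without_optimizer_statue.pth',
                  map_location='cpu')  # remap tensors saved on CUDA onto the CPU
model.load_state_dict(ckpt)           # or ckpt['weights'], depending on how it was saved
model.eval()
# ...then skip the amp.initialize(...) call in demo_image.py entirely; apex
# requires CUDA parameters no matter which opt_level is chosen.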

About the paper's "before generating precise ground truth Gaussian peaks"

Hi, jialee,
I was fortunate to read the SimplePose paper. Regarding the explanation of how the ground-truth Gaussian heatmaps are generated, I don't understand why the output stride R and the coordinate conversion are involved. The paper describes it as follows:

By the way, we map the pixel p at the location p(x, y) in the j-th ground truth heatmap to its original floating-point location p̃(x̃, ỹ) = (x · R + R/2 − 0.5, y · R + R/2 − 0.5) in the input image, in which R is the output stride, before generating the precise ground truth Gaussian peaks.

In heatmapper.py there are related comments, e.g.:
"x, y coordinates of centers of bigger grid; stride / 2 - 0.5 is so that the grid-cell center is used when computing the response maps"
"basically we should use center of grid, but in this place classic implementation uses left-top point."
Looking forward to your reply, thanks!
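
As I read the quoted formula, the mapping simply moves from the top-left indexing of a heatmap cell to the center of the stride-R patch it covers in the input image. A minimal sketch (the default stride value of 4 is an assumption):

def heatmap_to_image_coord(x, y, stride=4):
    # Heatmap pixel (x, y) covers a stride x stride patch of the input image;
    # the patch center lies at x * stride + stride / 2 - 0.5, taking pixel
    # centers to sit at integer coordinates, exactly as in the paper's formula.
    return x * stride + stride / 2 - 0.5, y * stride + stride / 2 - 0.5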

About pretrained model and focal loss

Hi,

  1. Could you explain the three pretrained models? How do they relate to the methods in the paper?
  2. Could you give more info on the focal loss function? The latest code in loss_model.py seems to differ from the formula in the paper (a = b = 0, and abs(1. - st) instead of (1. - st) ** gamma). The loss in loss_model_parallel.py also differs from the one in loss_model.py. Which is better?
  3. The model trained with the focal loss seems to produce more false positives and wrong connections, even though the AP is higher. Does that make sense?
  4. Could you comment on the focal loss applied to heatmaps in CornerNet? Would that also work on this model?

Thanks

Heatmap Visualization

Hello, which part of the code produces the heatmap visualization? How can I generate it?
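
Not a pointer to the repo's own code, but a minimal standalone sketch of how a predicted heatmap channel is typically overlaid on the input image (the heatmaps array below is a random placeholder standing in for the real network output):

import cv2
import numpy as np
import matplotlib.pyplot as plt

img = cv2.cvtColor(cv2.imread('input.jpg'), cv2.COLOR_BGR2RGB)
heatmaps = np.random.rand(50, 64, 64).astype(np.float32)  # placeholder; use the network output
channel = 0                                               # e.g. the first keypoint channel
hm = cv2.resize(heatmaps[channel], (img.shape[1], img.shape[0]))

plt.imshow(img)
plt.imshow(hm, alpha=0.5, cmap='jet')  # semi-transparent overlay on the image
plt.axis('off')
plt.show()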

further improvements in speed while maintaining high accuracy

Hi, nice work. But I have some recommendations to further improve the speed while maintaining high accuracy:
1. Optimize the post-processing speed (rewrite it in C++).
2. Try knowledge distillation (I think the model size could currently be reduced by 70-80% with comparable accuracy); a minimal sketch follows this list.
3. Try TensorRT.
But first I think you need to clean up the source-code architecture so that everyone can easily help you.
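
To make item 2 concrete, a minimal knowledge-distillation sketch (my own illustration, not something in this repo): the compact student is supervised by a blend of the ground-truth heatmaps and the frozen teacher's predictions.

import torch.nn.functional as F

def distillation_loss(student_out, teacher_out, gt_heatmaps, w=0.5):
    # Supervise the student with the ground truth and with the (detached)
    # predictions of the large teacher network; w balances the two terms.
    loss_gt = F.mse_loss(student_out, gt_heatmaps)
    loss_kd = F.mse_loss(student_out, teacher_out.detach())
    return (1 - w) * loss_gt + w * loss_kd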

Test without multi-scale and flip

Hi,
Thanks for your work and the project. I have run the demo with the pretrained model on some pictures. With the default settings (flip and multi-scale), the pose results are OK, similar to the OpenPose BODY25 model. But without flip and multi-scale, the results are not so good even on very simple images.
Could you help check whether there is an issue?

[Attached images: the original image, the result with flip (result_withflip), and the result without flip (result_woflip)]

Evaluation results are always zero

Average precision and average recall from evaluate.py always give zero.


I am using images of size 512 (other parameters are left at their defaults). I trained the model without a checkpoint for 20 epochs on the COCO dataset, and the qualitative results were decent. But the evaluation is zero (evaluation on the validation set with 50 ids). What could be the reason?

Thanks,
Aravind

Request for licensing information

I hope this message finds you well. I came across your project on GitHub and I'm impressed with its capabilities. However, I noticed that there is no information about the licensing of the project.

I would like to inquire about the licensing of this project and whether it can be used for commercial purposes. I am interested in using your project to develop a commercial software, and I need to know the licensing terms before proceeding.

I would greatly appreciate it if you could provide me with more information about the licensing of SimplePose. If there are any specific terms or conditions, please let me know as well.

Thank you very much for your time and assistance. I look forward to hearing from you soon.

Best regards,
Bing BAI

Out of memory

Hi, I ran python evaluation.py to evaluate the model on a 2080 Ti,
but it reports that it is out of memory.

I haven't changed anything in this repository. Why does this happen? Thank you very much.

About Fig. 4: is it the improved hourglass or not?

Hello, your work is excellent, but there is one thing I don't understand very well. Figure 4 in the paper shows an improved hourglass network, but in Figure 3 it is still labeled as a plain hourglass, and I cannot find the module from Figure 4 in your code. Excuse me, I don't know whether I have misunderstood Figure 4.

Understanding body part heatmap implementation

@hellojialee, kudos to the excellent work!

I am referring to your ground-truth heatmap generation implementation because I need to generate body-part heatmaps for my study. In short, I am looking to achieve this:
[attached reference image]

I intend to generate an 18-channel heatmap where each channel stores one body part (the line segment between wrist and elbow, the line segment between elbow and shoulder, etc.).
It is not clear to me how the elliptical Gaussian has been implemented. Can you please explain the steps in put_limb_gaussian_maps in /py_cocodata_server/py_data_heatmapper.py, since I believe this is the function doing my desired task.

Thanks in advance,
jysa01
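
Not the repo's verified implementation, but a minimal sketch of one common way to realize an "elliptical" limb Gaussian: let the response decay with the squared distance from each pixel to the joint-to-joint segment (the function and parameter names here are my own):

import numpy as np

def put_limb_gaussian(h, w, p1, p2, sigma=2.0):
    # Pixel grid in (row, col) order; xs/ys are image coordinates.
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float32)
    p1 = np.asarray(p1, np.float32)
    p2 = np.asarray(p2, np.float32)
    d = p2 - p1
    seg_len2 = max(float(d @ d), 1e-6)
    # Project each pixel onto the segment, clamped to the endpoints.
    t = np.clip(((xs - p1[0]) * d[0] + (ys - p1[1]) * d[1]) / seg_len2, 0.0, 1.0)
    cx = p1[0] + t * d[0]
    cy = p1[1] + t * d[1]
    dist2 = (xs - cx) ** 2 + (ys - cy) ** 2
    return np.exp(-dist2 / (2.0 * sigma ** 2))  # elongated Gaussian along the limb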

Joints heatmaps

Hi, I want to ask what the multi-person joint heatmaps generated with the heatmapper contain.
Is it just a Gaussian around each joint location, so that the same semantic joint (e.g. left shoulder) sits on the same heatmap channel for all the human targets in the scene, but at different x, y locations?

Also, could you vectorize the joint heatmap emitter, e.g. with a rendered Gaussian? I see many loops with numpy code there, so I am wondering whether it could be vectorized with some PyTorch ops (a sketch follows below).
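
On the vectorization question, a minimal PyTorch sketch (my own, assuming a max over people is the desired reduction) that renders all Gaussians of one channel with broadcasting instead of per-person loops:

import torch

def render_joint_channel(h, w, centers, sigma=2.0):
    # centers: (N, 2) tensor of (x, y) locations of one joint type for N people.
    ys = torch.arange(h, dtype=torch.float32).view(h, 1, 1)
    xs = torch.arange(w, dtype=torch.float32).view(1, w, 1)
    dx = xs - centers[:, 0].view(1, 1, -1)                   # broadcasts to (1, w, N)
    dy = ys - centers[:, 1].view(1, 1, -1)                   # broadcasts to (h, 1, N)
    g = torch.exp(-(dx ** 2 + dy ** 2) / (2 * sigma ** 2))   # (h, w, N)
    return g.amax(dim=-1)  # one channel shared by all people: keep the max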

data/coco_masks_hdf5.py

prev_center.append(np.append(person_center, max(img_anns[p]["bbox"][2], img_anns[p]["bbox"][3])))

Here p is a fixed value; could that be a problem? And is the max of the width and height supposed to be that of the current main_persons entry?

How to separate different people's keypoints in the same image?

Hello, when I was studying CornerNet and CenterNet, the same kind of keypoint from different targets was grouped via embedding vectors, e.g. 83 channels = 80 classes + embedding vector + x/y offsets. So when there are multiple people in an image, how is the output of IMHN parsed? What does the IMHN output look like, and what do its channels mean? My understanding is that the same keypoint of all the human bodies is predicted on the same heatmap; how do you then distinguish different people? Thanks.

How much does multi-scale search contribute?

Hi, from experiments 16 vs 19 in Table 1, multi-scale search seems to contribute more than 3% AP.
Did all the other models, including your baseline in Table 1, use multi-scale search?

Data pre-processing

Thank you for a job well done! I have some doubts I'd like to ask about. In the data pre-processing, why is there a problem with center-point alignment? Is this because the size of the input image changes as it passes through the CNN?

An error during training: "may appear if you passed in a non-contiguous input"

So glad to see your project; I successfully ran the demo and created the h5 file. But when I try to train the model, an error appears:
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_SUPPORTED. This error may appear if you passed in a non-contiguous input.
I really hope to get your help, thank you very much.
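
A common workaround for this cuDNN message (an assumption about the cause, not a repo-specific fix) is to make the offending tensor contiguous before it reaches the convolution:

import torch

x = torch.randn(1, 128, 128, 3)          # stand-in for the offending tensor
x = x.permute(0, 3, 1, 2).contiguous()   # permute/transpose/expand return
                                         # non-contiguous views; .contiguous()
                                         # copies into a dense layout for cuDNN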

Inference is very slow, 6 seconds per frame.

Hi, and thank you for making this code available.

I am running it on Windows, on a GTX 1080, using the demo_image.py file with the model from Google Drive, and the time it takes to detect keypoints is more than 6 seconds.

What am I doing wrong? How can I get close to the 38 fps that you mention in the README?

Thank you again!


>python demo_image.py --image input.jpg
0 neck->nose
1 neck->Reye
2 neck->Leye
3 neck->Rear
4 neck->Lear
5 nose->Reye
6 nose->Leye
7 Reye->Rear
8 Leye->Lear
9 neck->Rsho
10 Rsho->Relb
11 Relb->Rwri
12 neck->Lsho
13 Lsho->Lelb
14 Lelb->Lwri
15 neck->Rhip
16 Rhip->Rkne
17 Rkne->Rank
18 neck->Lhip
19 Lhip->Lkne
20 Lkne->Lank
21 nose->Rsho
22 nose->Lsho
23 Rsho->Rhip
24 Rhip->Lkne
25 Lsho->Lhip
26 Lhip->Rkne
27 Rear->Rsho
28 Lear->Lsho
29 Rhip->Lhip
{0: 'neck->nose',
 1: 'neck->Reye',
 2: 'neck->Leye',
 3: 'neck->Rear',
 4: 'neck->Lear',
 5: 'nose->Reye',
 6: 'nose->Leye',
 7: 'Reye->Rear',
 8: 'Leye->Lear',
 9: 'neck->Rsho',
 10: 'Rsho->Relb',
 11: 'Relb->Rwri',
 12: 'neck->Lsho',
 13: 'Lsho->Lelb',
 14: 'Lelb->Lwri',
 15: 'neck->Rhip',
 16: 'Rhip->Rkne',
 17: 'Rkne->Rank',
 18: 'neck->Lhip',
 19: 'Lhip->Lkne',
 20: 'Lkne->Lank',
 21: 'nose->Rsho',
 22: 'nose->Lsho',
 23: 'Rsho->Rhip',
 24: 'Rhip->Lkne',
 25: 'Lsho->Lhip',
 26: 'Lhip->Rkne',
 27: 'Rear->Rsho',
 28: 'Lear->Lsho',
 29: 'Rhip->Lhip',
 30: 'nose',
 31: 'neck',
 32: 'Rsho',
 33: 'Relb',
 34: 'Rwri',
 35: 'Lsho',
 36: 'Lelb',
 37: 'Lwri',
 38: 'Rhip',
 39: 'Rkne',
 40: 'Rank',
 41: 'Lhip',
 42: 'Lkne',
 43: 'Lank',
 44: 'Reye',
 45: 'Leye',
 46: 'Rear',
 47: 'Lear',
 48: 'background',
 49: 'reverseKeypoint'}
Resuming from checkpoint ......
Network weights have been resumed from checkpoint...
cuda
Selected optimization level O1:  Insert automatic casts around Pytorch functions and Tensor methods.

Defaults for this optimization level are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled                : True
opt_level              : O1
cast_model_type        : None
patch_torch_functions  : True
keep_batchnorm_fp32    : None
master_weights         : None
loss_scale             : dynamic
start processing...
the 0th keypoint detection result is :  ([(384.98810766687865, 156.99848021452428), (392.0089789786089, 140.00016588448665), (372.00392927155144, 141.9994244210869), (396.997404715929, 137.00354114471122), (339.00678492184926, 140.0066329927729), (424.0065017794617, 191.99842561943024), (304.9960763460449, 220.00916854059585), (443.0001489242592, 272.0109579295975), (292.00050351624543, 310.9984260760411), (465.0083100132065, 350.99493035095674), (293.00562399904305, 404.00513994760007), (420.99916662586236, 393.0031377139439), (349.9987046664099, 401.00452761418853), (413.99545615615057, 536.0021693790678), (351.0002542695355, 541.9933765298466), (376.0021593526506, 644.988972815169), (352.00185668667876, 677.9945526718805)], 0.9674948892626798)
processing time is 6.45740

Group

Hello, does this library have a solution for the unreasonable keypoint connections caused by the AE (associative embedding) grouping strategy used by the valid script in HigherHRNet?

it seems the inference speed is slow!

Hi, @jialee93
Thanks for your great work. When I use demo_images.py to test on my own images, each image takes about 40 seconds (Titan V)! But as you said in the README, the speed achieves real time. I don't know why and hope you can give some advice~

l2 focal loss is different from the paper

So glad to see your project; I successfully ran the demo. But I found that the L2 focal loss in this project (models/loss_model_parallel.py) sets alpha=0 and beta=0 with factor = torch.abs(1. - st), which is different from what your paper shows: alpha=0.1, beta=0.02 and gamma=2 with factor = (1. - st) ** gamma. I'm really confused about that.
I really hope to get your help, thank you very much.

With python demo_image.py

Defaults for this optimization level are:
enabled : True
opt_level : O1
cast_model_type : None
patch_torch_functions : True
keep_batchnorm_fp32 : None
master_weights : None
loss_scale : dynamic
Processing user overrides (additional kwargs that are not None)...
After processing overrides, optimization options are:
enabled : True
opt_level : O1
cast_model_type : None
patch_torch_functions : True
keep_batchnorm_fp32 : None
master_weights : None
loss_scale : dynamic
Warning: multi_tensor_applier fused unscale kernel is unavailable, possibly because apex was installed without --cuda_ext --cpp_ext. Using Python fallback. Original ImportError was: ModuleNotFoundError("No module named 'amp_C'",)
start processing...
Traceback (most recent call last):
  File "demo_image.py", line 637, in <module>
    params, model_params = config_reader()
  File "/scratch/gp/Improved-Body-Parts-master/utils/config_reader.py", line 9, in config_reader
    param = config['param']  # a dict-like type that inherits from dict
  File "/scratch/mool/ana3/envs/gpp/lib/python3.6/site-packages/configobj.py", line 554, in __getitem__
    val = dict.__getitem__(self, key)
KeyError: 'param'
(gpp) zhhu@k8s-master01:/scratch/gp/Improved-Body-Parts-master$ python demo_image.py
I am running it on Ubuntu 18.04 with CUDA 10.0, but it failed. Can you help me look at this error? Thanks!!!
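
One likely cause (my assumption, not verified against this repo): ConfigObj does not raise by default when the file it is given does not exist; it just yields an empty mapping, and the missing 'param' section then surfaces as this KeyError. A quick check, run from the repo root:

from configobj import ConfigObj

config = ConfigObj('config')  # the path is my guess at what config_reader() opens
print(list(config.keys()))    # an empty list means the file was not found, and
                              # config['param'] will then raise KeyError: 'param'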

A small question about the formulation (2)

Hi, thanks for sharing your brilliant work. I noticed that when computing Sd, both lines of the formulation give values close to 1 (the first line: when S ≈ 1, Sd ≈ 0.9; the second line: when S ≈ 0.01, Sd ≈ 0.97). How does Sd balance easy and hard samples?
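
For reference, here is formulation (2) as I reconstruct it from the numbers above; this is my reading, not a quote from the paper, and alpha = 0.1, beta = 0.02 are taken from the settings quoted in the other issues on this page:

% Reconstructed formulation (2); alpha and beta values are assumptions.
S_d =
\begin{cases}
S - \alpha,    & \text{where the ground truth is foreground (near a peak)} \\
1 - S - \beta, & \text{otherwise (background)}
\end{cases}
% Sanity check: S \approx 1 in the foreground gives S_d \approx 0.9, and
% S \approx 0.01 in the background gives S_d \approx 0.97, matching the above.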

ValueError: not enough values to unpack (expected 5, got 3)

Hi, I encountered an error when running train.py. How should I modify it?
Thank you for your answer!!!

Test phase, Epoch: 0
Traceback (most recent call last):
  File "train.py", line 206, in <module>
    test(epoch, show_image=False)
  File "train.py", line 178, in test
    images, mask_misses, heatmaps, offsets, mask_offsets = target_tuple
ValueError: not enough values to unpack (expected 5, got 3)
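
A guess at the cause (an assumption, not a confirmed fix): in this configuration the test dataloader yields only three tensors while train.py expects five, so either match the unpacking to what the loader produces or enable the extra targets in the dataset:

# If the loader yields only (images, mask_misses, heatmaps), unpack three values:
images, mask_misses, heatmaps = target_tuple
# ...or enable the offset targets in the dataset so that five tensors are produced.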
