snap-research / HyperHuman
[ICLR 2024] GitHub Repo for "HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion"
Home Page: https://snap-research.github.io/HyperHuman/
Thanks for your work. I am wondering if the MSCOCO 2014 Human Validation set used in the paper will be released soon?
Thanks for sharing this nice work.
I wanted to ask what "replicate layers" means. Does replicating include the model weights, or does it simply mean replicating the same blocks (architecture), with each branch having its own set of weights?
Figure 2 seems to show that each expert branch has its own weights for the first few DownBlocks and last few UpBlocks, while the intermediate blocks are shared between the experts. However, this was not clear from the paper itself, so I would appreciate an explanation.
If my understanding is correct, how are the features from different branches fused together? Is it a simple addition of features before they pass through the shared intermediate layers?
Thank you.
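If that reading of Figure 2 is right, the data flow could be sketched as below. This is a minimal numpy illustration of the question's hypothesis, not the authors' implementation: every name, shape, and the addition-based fusion are assumptions, with branch-specific layers holding separate weights around a shared trunk.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # hypothetical feature dimension

# "Replicated" layers: identical architecture, but separate weights per branch.
BRANCHES = ("rgb", "depth", "normal")
branch_in = {name: rng.standard_normal((d, d)) for name in BRANCHES}
branch_out = {name: rng.standard_normal((d, d)) for name in BRANCHES}
shared_trunk = rng.standard_normal((d, d))  # the shared intermediate blocks

def forward(inputs):
    # Encode each modality with that branch's own weights.
    feats = {k: branch_in[k] @ v for k, v in inputs.items()}
    # Fuse by simple addition before the shared layers (one plausible choice).
    fused = sum(feats.values())
    h = shared_trunk @ fused
    # Decode one output per expert branch, again with separate weights.
    return {k: branch_out[k] @ h for k in inputs}

x = {k: rng.standard_normal(d) for k in BRANCHES}
outs = forward(x)
```

Under this sketch, "replicating" means each branch gets its own copy of the first/last layers (independent weights), while the trunk parameters are shared across all three experts.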
Hi, thanks for your great work. Please allow me to ask a question that might not relate closely to the paper.
In the paper, you fine-tune the whole UNet with three expert modules, which results in a high computational cost. Is it possible to fine-tune only the extra expert modules (normal and depth), assuming the RGB branch is already well trained? Could you please provide some insights on this? Thank you in advance.
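One common way to implement that kind of partial fine-tuning is to select trainable parameters by name prefix and freeze the rest. The sketch below uses a plain dict with made-up parameter names to illustrate the selection logic only; it is not taken from this repo.

```python
# Hypothetical parameter names, mimicking a UNet with expert branches.
params = {
    "rgb_branch.down.0.weight": None,
    "depth_branch.down.0.weight": None,
    "normal_branch.up.0.weight": None,
    "shared.mid.0.weight": None,
}

# Train only the extra expert modules; keep the RGB branch and trunk frozen.
TRAINABLE_PREFIXES = ("depth_branch.", "normal_branch.")

def trainable_names(param_names):
    """Return only the parameters belonging to the extra expert modules."""
    return [n for n in param_names if n.startswith(TRAINABLE_PREFIXES)]

selected = trainable_names(params)
```

In an actual PyTorch training loop, the equivalent step would be setting `requires_grad = False` on the excluded parameters and passing only the selected ones to the optimizer.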
Hi, thanks for your great work.
I am confused about how the depth map is processed. If I understand correctly, the depth map has only one channel, so how can it be encoded by the VAE encoder? Could you please provide more details? Thank you.
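A common workaround for this mismatch (and only a guess at what the paper does) is to replicate the single depth channel three times so it matches the 3-channel input a Stable-Diffusion-style VAE encoder expects. A minimal numpy sketch, with a made-up depth map:

```python
import numpy as np

# Hypothetical single-channel depth map, shape (1, H, W), values in [0, 1].
depth = np.random.default_rng(0).random((1, 64, 64)).astype(np.float32)

# Repeat the channel to obtain a 3-channel "image" for the VAE encoder,
# then rescale to [-1, 1], the input range SD-style VAEs typically use.
depth_rgb = np.repeat(depth, 3, axis=0)
depth_rgb = depth_rgb * 2.0 - 1.0
```

Since all three channels are identical copies, no information is added or lost; the depth map is simply made shape-compatible with an encoder trained on RGB images.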
Can we just use text as input to enforce the joint learning of image appearance, spatial relationship, and geometry in a unified network?
Thank you for your excellent work. I have a question: the OpenPose-style skeleton includes hand joints and a neck joint that connects the nose joint and the left and right shoulder joints. This topology is easier to understand and contains more detail about the hands and other body parts. I wonder why you do not use the OpenPose style.
An example of the OpenPose style is shown in the following image:
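The neck-centered connectivity the question refers to can be written down as limb pairs over named keypoints. The snippet below lists a partial, illustrative subset of the OpenPose COCO-style upper-body topology (keypoint names and the selection are my own, not from this repo):

```python
# A subset of OpenPose COCO-style limb connections; the neck links the
# nose and both shoulders, which is the topology mentioned above.
KEYPOINTS = ["nose", "neck", "r_shoulder", "r_elbow", "r_wrist",
             "l_shoulder", "l_elbow", "l_wrist"]

LIMBS = [
    ("neck", "nose"),
    ("neck", "r_shoulder"),
    ("neck", "l_shoulder"),
    ("r_shoulder", "r_elbow"),
    ("r_elbow", "r_wrist"),
    ("l_shoulder", "l_elbow"),
    ("l_elbow", "l_wrist"),
]

# Sanity check: every limb endpoint is a known keypoint.
valid = all(a in KEYPOINTS and b in KEYPOINTS for a, b in LIMBS)
```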
Will you open-source the code?