snap-research / HyperHuman
[ICLR 2024] GitHub Repo for "HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion"
Home Page: https://snap-research.github.io/HyperHuman/
Thanks for your work. I am wondering if the MSCOCO 2014 Human Validation set used in the paper will be released soon?
Thanks for sharing this nice work.
I wanted to ask what "replicate layers" means. Does replicating include the model weights, or does it simply mean replicating the same blocks (architecture), with each branch having its own set of weights?
Figure 2 seems to show that each expert branch has its own weights for the first few DownBlocks and last few UpBlocks, while the intermediate blocks are shared between the experts. However, this was not clear from the paper itself, so I would appreciate an explanation.
If my understanding is correct, how are the features from different branches fused together? Is it a simple addition of features before they pass through the shared intermediate layers?
Thank you.
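If that reading of Figure 2 is right, the data flow could be sketched as below. This is a minimal numpy illustration of the question's hypothesis, not the authors' implementation: every name, shape, and the addition-based fusion are assumptions, with branch-specific layers holding separate weights around a shared trunk.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # hypothetical feature dimension

# "Replicated" layers: identical architecture, but separate weights per branch.
BRANCHES = ("rgb", "depth", "normal")
branch_in = {name: rng.standard_normal((d, d)) for name in BRANCHES}
branch_out = {name: rng.standard_normal((d, d)) for name in BRANCHES}
shared_trunk = rng.standard_normal((d, d))  # the shared intermediate blocks

def forward(inputs):
    # Encode each modality with that branch's own weights.
    feats = {k: branch_in[k] @ v for k, v in inputs.items()}
    # Fuse by simple addition before the shared layers (one plausible choice).
    fused = sum(feats.values())
    h = shared_trunk @ fused
    # Decode one output per expert branch, again with separate weights.
    return {k: branch_out[k] @ h for k in inputs}

x = {k: rng.standard_normal(d) for k in BRANCHES}
outs = forward(x)
```

Under this sketch, "replicating" means each branch gets its own copy of the first/last layers (independent weights), while the trunk parameters are shared across all three experts.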
Hi, thanks for your great work. Please allow me to ask a question that might not relate closely to the paper.
In the paper, you fine-tune the whole UNet with three expert modules, which results in a high computational cost. Is it possible to fine-tune only the extra expert modules (normal and depth), assuming the RGB branch is already well trained? Could you please provide some insights on this? Thank you in advance.
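One common way to implement that kind of partial fine-tuning is to select trainable parameters by name prefix and freeze the rest. The sketch below uses a plain dict with made-up parameter names to illustrate the selection logic only; it is not taken from this repo.

```python
# Hypothetical parameter names, mimicking a UNet with expert branches.
params = {
    "rgb_branch.down.0.weight": None,
    "depth_branch.down.0.weight": None,
    "normal_branch.up.0.weight": None,
    "shared.mid.0.weight": None,
}

# Train only the extra expert modules; keep the RGB branch and trunk frozen.
TRAINABLE_PREFIXES = ("depth_branch.", "normal_branch.")

def trainable_names(param_names):
    """Return only the parameters belonging to the extra expert modules."""
    return [n for n in param_names if n.startswith(TRAINABLE_PREFIXES)]

selected = trainable_names(params)
```

In an actual PyTorch training loop, the equivalent step would be setting `requires_grad = False` on the excluded parameters and passing only the selected ones to the optimizer.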
Hi, thanks for your great work.
I am confused about how the depth map is processed. If I understand correctly, the depth map has only one channel, so how can it be encoded by the VAE encoder? Could you please provide more details? Thank you.
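A common workaround for this mismatch (and only a guess at what the paper does) is to replicate the single depth channel three times so it matches the 3-channel input a Stable-Diffusion-style VAE encoder expects. A minimal numpy sketch, with a made-up depth map:

```python
import numpy as np

# Hypothetical single-channel depth map, shape (1, H, W), values in [0, 1].
depth = np.random.default_rng(0).random((1, 64, 64)).astype(np.float32)

# Repeat the channel to obtain a 3-channel "image" for the VAE encoder,
# then rescale to [-1, 1], the input range SD-style VAEs typically use.
depth_rgb = np.repeat(depth, 3, axis=0)
depth_rgb = depth_rgb * 2.0 - 1.0
```

Since all three channels are identical copies, no information is added or lost; the depth map is simply made shape-compatible with an encoder trained on RGB images.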
Can we just use text as input to enforce the joint learning of image appearance, spatial relationship, and geometry in a unified network?
Thank you for your excellent work. I have a question: the OpenPose-style skeleton includes hand joints and a neck joint that connects the nose joint and the left and right shoulder joints. This topology is easier to understand and contains more detail about the hands and other body parts. I wonder why you do not use the OpenPose style.
An example of the OpenPose style is shown in the following image:
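The neck-centered connectivity the question refers to can be written down as limb pairs over named keypoints. The snippet below lists a partial, illustrative subset of the OpenPose COCO-style upper-body topology (keypoint names and the selection are my own, not from this repo):

```python
# A subset of OpenPose COCO-style limb connections; the neck links the
# nose and both shoulders, which is the topology mentioned above.
KEYPOINTS = ["nose", "neck", "r_shoulder", "r_elbow", "r_wrist",
             "l_shoulder", "l_elbow", "l_wrist"]

LIMBS = [
    ("neck", "nose"),
    ("neck", "r_shoulder"),
    ("neck", "l_shoulder"),
    ("r_shoulder", "r_elbow"),
    ("r_elbow", "r_wrist"),
    ("l_shoulder", "l_elbow"),
    ("l_elbow", "l_wrist"),
]

# Sanity check: every limb endpoint is a known keypoint.
valid = all(a in KEYPOINTS and b in KEYPOINTS for a, b in LIMBS)
```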
Will you open-source the code?