akashsengupta1997 / humaniflow

[CVPR 2023] Code repository for HuManiFlow: Ancestor-Conditioned Normalising Flows on SO(3) Manifolds for Human Pose and Shape Distribution Estimation

License: MIT License

Python 100.00%

3d-human-pose 3d-human-shape-and-pose-estimation computer-vision cvpr cvpr2023 normalizing-flows smpl

humaniflow's Issues

Performance gap

Hi @akashsengupta1997,

I ran the evaluation code with your released default checkpoint, but the results I got differ from those reported in the paper.

This is the SSP-3D performance I got:

This is the 3DPW performance I got:

The results differ significantly from those shown in Tables 3-5 of the paper. Could you help clarify this?

Using model with multiple images or video

What is the best way to update the code to work on multiple images or a video? I tried using cv2.VideoCapture on a GIF file to read each frame. However, I am having difficulty combining each frame's image and heatmaps so that they can be fed into the model.

This is in the predict_humaniflow.py script:

for image_fname in tqdm(sorted([f for f in os.listdir(image_dir)])):
        with torch.no_grad():
            # Capture video from file
            cap = cv2.VideoCapture(os.path.join(image_dir, image_fname))
            # Capture frame-by-frame
            ret, frame = cap.read()
            frames = []
            while ret:
                # ------------------------- INPUT LOADING AND PROXY REPRESENTATION GENERATION -------------------------
                image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

                ...............

                frames.append(torch.cat([proxy_rep_img, proxy_rep_heatmaps], dim=1))
                ret, frame = cap.read()
                if not ret:
                    break
            
            cap.release()
            cv2.destroyAllWindows()
            proxy_rep_input = torch.cat([x.float() for x in frames], dim=1).float()  # (1, 18, img_wh, img_wh)
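One thing visible in the snippet above: concatenating the per-frame tensors along dim=1 stacks them on the channel axis, so N frames give a (1, 18*N, img_wh, img_wh) tensor instead of the (1, 18, img_wh, img_wh) shape in the comment. A minimal sketch of grouping frames along the batch axis instead is shown below. It uses NumPy stand-in arrays so it runs without the model; torch.cat(chunk, dim=0) would be the PyTorch equivalent. The helper name batch_proxy_reps and the batch size are hypothetical, and whether the rest of predict_humaniflow.py handles a batch size above 1 is an assumption that would need checking.

```python
import numpy as np


def batch_proxy_reps(per_frame_reps, batch_size):
    """Yield (B, C, H, W) arrays built from per-frame (1, C, H, W) arrays."""
    chunk = []
    for rep in per_frame_reps:
        chunk.append(rep)
        if len(chunk) == batch_size:
            # axis=0 is the batch axis; axis=1 would merge channels instead.
            yield np.concatenate(chunk, axis=0)
            chunk = []
    if chunk:
        # Final partial batch when the frame count is not a multiple of batch_size.
        yield np.concatenate(chunk, axis=0)


# Stand-in data: 5 frames with 18 channels each (image + heatmaps), 8x8 for brevity.
frames = [np.zeros((1, 18, 8, 8), dtype=np.float32) for _ in range(5)]
shapes = [batch.shape for batch in batch_proxy_reps(frames, batch_size=2)]
print(shapes)  # [(2, 18, 8, 8), (2, 18, 8, 8), (1, 18, 8, 8)]
```

Processing fixed-size batches rather than one tensor for the whole video also keeps memory use bounded for long clips.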

Using more than 10 betas

In the humaniflow_config.py file, the number of betas is set to 10. If I change this value, the program returns an error. Is it possible to update other parameters to use more than 10 betas, or does the model need to be re-trained to support this?

Background images for training

Hi! Thanks for releasing the code!
I'm trying to reproduce the training and thus need to gather all the training and validation backgrounds. Following your description, I used the LSUN repo to download and extract the backgrounds. However, I'm now struggling with 1) selecting the "correct" background images and 2) converting them to the right format and moving them to the right location.
The provided script data/copy_lsun_images_to_train_files_dir.py doesn't work for me; I guess the directory structure after extracting the images isn't what it expects. E.g. "bedroom_train_lmdb" extracts images to paths like ./f/8/8/1/2/2/*.webp, which isn't compatible with your script. Your script also only looks for .jpg files. Furthermore, I don't know how to select the mentioned 397582 training backgrounds, since "bedroom_train_lmdb" alone has over 3 million images. I would be grateful for any help!
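As a possible workaround for the nested .webp layout described above, the sketch below walks the extracted tree recursively, samples a fixed number of images, and pairs each with a flat .jpg destination. This is not the author's procedure: the function names, the seeded random sampling, and the flat output layout are all assumptions, and random sampling will not reproduce the exact 397582 backgrounds used in the paper. The conversion step assumes Pillow is installed with WebP support.

```python
import random
from pathlib import Path


def plan_background_copies(extracted_root, out_dir, num_images, seed=0):
    """Pair each sampled .webp under extracted_root with a flat .jpg path in out_dir."""
    # rglob descends into nested folders such as f/8/8/1/2/2/.
    sources = sorted(Path(extracted_root).rglob("*.webp"))
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    chosen = rng.sample(sources, min(num_images, len(sources)))
    # Assumes file stems are unique across folders; otherwise names would collide.
    return [(src, Path(out_dir) / (src.stem + ".jpg")) for src in sorted(chosen)]


def convert_to_jpg(src, dst):
    """Re-encode one image as JPEG (requires Pillow with WebP support)."""
    from PIL import Image  # imported lazily so planning works without Pillow

    dst.parent.mkdir(parents=True, exist_ok=True)
    with Image.open(src) as img:
        img.convert("RGB").save(dst, "JPEG")
```

A typical use would loop over the pairs returned by plan_background_copies(extracted_dir, target_dir, 397582) and call convert_to_jpg(src, dst) on each, where extracted_dir and target_dir are placeholders for the local LSUN extraction folder and the directory the training code reads backgrounds from.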

Model performance

Hi, thank you for your interesting work.

I would like to ask about the runtime performance of your model, such as the processing time for a single image or the FPS, since I don't see it mentioned in your paper.
Is your model capable of running in real time?
Thank you for your time!
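In the absence of reported numbers, the per-image time can be measured locally. The sketch below is a generic timing helper, not part of the repository: measure_fps and its parameters are hypothetical, and the lambda is a stand-in workload that would be replaced by one forward pass of the model. When timing on a GPU, torch.cuda.synchronize() should be called inside the timed callable, since CUDA kernels run asynchronously.

```python
import time


def measure_fps(run_once, num_warmup=3, num_runs=20):
    """Return (mean seconds per call, calls per second) for a zero-argument callable."""
    for _ in range(num_warmup):
        run_once()  # warm-up calls are excluded from the timing
    start = time.perf_counter()
    for _ in range(num_runs):
        run_once()
    elapsed = time.perf_counter() - start
    mean_s = elapsed / num_runs
    return mean_s, 1.0 / mean_s


# Stand-in workload; swap in a single model forward pass to time inference.
mean_s, fps = measure_fps(lambda: sum(range(10000)))
print(f"{mean_s * 1e3:.3f} ms per call, {fps:.1f} calls/s")
```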
