humaniflow's Issues

Performance gap

Hi @akashsengupta1997 ,

I ran the evaluation code with your released default checkpoint, but the results I got differ from the performance reported in the paper.

This is the SSP-3D performance I got:
[Screenshot, 2024-04-09: SSP-3D evaluation results]

This is the 3DPW performance I got:
[Screenshot, 2024-04-09: 3DPW evaluation results]

These results differ significantly from those shown in Tables 3-5 of the paper. Could you clarify what might cause the gap?

Using model with multiple images or video

What is the best way to update the code to work on multiple images or a video? I tried using VideoCapture on a GIF file to read each frame. However, I am having difficulty combining the per-frame images and heatmaps into a single input for the model.

This is in the predict_humaniflow.py script:

for image_fname in tqdm(sorted([f for f in os.listdir(image_dir)])):
    with torch.no_grad():
        # Capture video from file
        cap = cv2.VideoCapture(os.path.join(image_dir, image_fname))
        # Capture frame-by-frame
        ret, frame = cap.read()
        frames = []
        while ret:
            # ------------------------- INPUT LOADING AND PROXY REPRESENTATION GENERATION -------------------------
            image = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

            ...............

            frames.append(torch.cat([proxy_rep_img, proxy_rep_heatmaps], dim=1))
            ret, frame = cap.read()
            if not ret:
                break

        cap.release()
        cv2.destroyAllWindows()
        proxy_rep_input = torch.cat([x.float() for x in frames], dim=1).float()  # (1, 18, img_wh, img_wh)
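
A note on the snippet above: torch.cat along dim=1 joins the frames along the channel axis, so N frames of shape (1, 18, img_wh, img_wh) become one tensor of shape (1, 18 * N, img_wh, img_wh). Concatenating along dim=0 would instead give (N, 18, img_wh, img_wh), which is probably what a model with an 18-channel input expects, assuming it accepts batches larger than 1 (this thread does not confirm that). For a long video it may also help to process frames in fixed-size chunks rather than all at once. A minimal sketch of that chunking is below; iter_batches and its parameters are hypothetical names, not part of humaniflow.

```python
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")
B = TypeVar("B")


def iter_batches(
    items: Iterable[T],
    batch_size: int,
    join: Callable[[List[T]], B] = list,
) -> Iterator[B]:
    """Yield consecutive chunks of at most `batch_size` items.

    `join` turns each chunk (a list) into a batch object. The default
    just returns a list; this is a hypothetical helper, not humaniflow code.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    chunk: List[T] = []
    for item in items:
        chunk.append(item)
        if len(chunk) == batch_size:
            yield join(chunk)
            chunk = []
    if chunk:
        # Emit the final, possibly shorter, chunk.
        yield join(chunk)


if __name__ == "__main__":
    # Stand-in for per-frame proxy representations.
    for batch in iter_batches(range(5), batch_size=2):
        print(batch)
```

With PyTorch available, passing join=lambda chunk: torch.cat(chunk, dim=0) would produce one tensor of up to batch_size frames per chunk, which could then be fed to the model one chunk at a time.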

Using more than 10 betas

In the humaniflow_config.py file, the number of betas is set to 10. If I change this value, the program raises an error. Is it possible to update other parameters to use more than 10 betas, or does the model need to be re-trained for this?

Background images for training

Hi! Thanks for releasing the code!
I'm trying to reproduce the training, so I need to gather all the training and validation backgrounds. Following your description, I used the lsun repo to download and extract the backgrounds. However, I'm now struggling with 1) selecting the "correct" background images and 2) converting them to the right format and copying them to the right location.
The provided script data/copy_lsun_images_to_train_files_dir.py doesn't work for me, I guess because the directory structure after extraction isn't what it expects. For example, "bedroom_train_lmdb" extracts images to paths like ./f/8/8/1/2/2/*.webp, which isn't compatible with your script. The script also only looks for .jpg files. Furthermore, I don't know how to select the mentioned 397582 training backgrounds, since "bedroom_train_lmdb" alone has over 3 million images. I would be grateful for any help!
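
As a possible workaround for the nested layout described above, the following sketch walks the extracted tree recursively, picks a fixed-size random subset, and copies it into one flat directory. This thread does not say how the 397582 backgrounds were originally chosen, so the uniform random sample (and the seed) is purely an assumption. The function name, its arguments, and the output naming scheme are hypothetical and not part of the repository. Conversion from .webp to .jpg is not shown; it would need an image library that can read WebP, such as Pillow.

```python
import random
import shutil
from pathlib import Path
from typing import List


def sample_backgrounds(
    src_root,
    dst_dir,
    num_images: int,
    seed: int = 0,
    suffix: str = ".webp",
) -> List[Path]:
    """Copy a reproducible random subset of images into a flat directory.

    Hypothetical helper: the selection rule (uniform random sampling) is an
    assumption, not the method used to build the original training set.
    """
    src_root, dst_dir = Path(src_root), Path(dst_dir)
    # rglob descends into nested folders such as ./f/8/8/1/2/2/.
    # Sorting first makes the sample depend only on the seed.
    candidates = sorted(p for p in src_root.rglob(f"*{suffix}") if p.is_file())
    if num_images > len(candidates):
        raise ValueError(
            f"requested {num_images} images but found only {len(candidates)}"
        )
    chosen = random.Random(seed).sample(candidates, num_images)

    dst_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for index, src in enumerate(chosen):
        # Flat, zero-padded names avoid collisions between nested folders.
        dst = dst_dir / f"{index:07d}{suffix}"
        shutil.copy2(src, dst)
        copied.append(dst)
    return copied
```

A call such as sample_backgrounds("bedroom_train", "train_files/lsun_backgrounds", 397582) would then leave a flat folder of files; both paths here are placeholders.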

Model performance

Hi, thank you for your interesting work.

I would like to ask about the runtime performance of your model, for example the processing time for a single image or the frames per second (fps), since I could not find it mentioned in the paper.
Is your model capable of running in real time?
Thank you for your time!
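
Until measured figures are available, per-image latency can be estimated locally. Below is a minimal sketch of a timing harness; time_per_call is a hypothetical helper, and the callable passed to it stands in for one full prediction on one image. It reports no numbers for HumaniFlow itself. When the model runs on a GPU, the callable should wait for the device to finish (for example by calling torch.cuda.synchronize()) before returning, because CUDA work is launched asynchronously.

```python
import time
from statistics import mean
from typing import Callable, Dict


def time_per_call(
    fn: Callable[[], object],
    warmup: int = 3,
    repeats: int = 20,
) -> Dict[str, float]:
    """Return the mean wall-clock time of `fn` and the implied calls per second."""
    if repeats < 1:
        raise ValueError("repeats must be >= 1")
    # Warm-up runs are excluded so one-off costs (caches, lazy
    # initialisation) do not skew the average.
    for _ in range(warmup):
        fn()
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    seconds = mean(durations)
    fps = 1.0 / seconds if seconds > 0 else float("inf")
    return {"seconds_per_image": seconds, "fps": fps}


if __name__ == "__main__":
    # Dummy workload standing in for a single-image prediction.
    print(time_per_call(lambda: sum(range(100_000))))
```

Whether the result counts as real time depends on the target frame rate and on whether the 2D keypoint detection step is included in the timed callable.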
