PoseDiffusion: Solving Pose Estimation via Diffusion-aided Bundle Adjustment

[Paper] [Project Page]

Installation

We provide a simple installation script that, by default, sets up a conda environment with Python 3.9, PyTorch 1.13, and CUDA 11.6.

source install.sh

Quick Start

1. Download Checkpoint

Download the model checkpoint trained on Co3D from Dropbox. The predicted camera poses and focal lengths are defined in NDC coordinates.
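
If you need focal lengths in pixels rather than NDC units, a conversion along the following lines may help. This is a minimal sketch that assumes the PyTorch3D-style NDC convention, in which the shorter image side spans [-1, 1]. The README does not state which NDC convention the model uses, so verify this against the code before relying on it. Both function names are hypothetical.

```python
def focal_ndc_to_pixels(focal_ndc: float, width: int, height: int) -> float:
    """Convert an NDC focal length to pixels.

    Assumption: PyTorch3D-style NDC, where the shorter image side maps to
    the range [-1, 1], so one NDC unit equals min(width, height) / 2 pixels.
    """
    return focal_ndc * min(width, height) / 2.0


def focal_pixels_to_ndc(focal_px: float, width: int, height: int) -> float:
    """Inverse of focal_ndc_to_pixels under the same assumption."""
    return focal_px * 2.0 / min(width, height)


if __name__ == "__main__":
    # Hypothetical 640x480 image with an NDC focal length of 2.5.
    print(focal_ndc_to_pixels(2.5, 640, 480))  # 600.0
```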

2. Run the Demo

python demo.py image_folder="samples/apple" ckpt="/PATH/TO/DOWNLOADED/CKPT"

You can experiment with your own data by specifying a different image_folder.

On a Quadro GP100 GPU, the inference time for a 20-frame sequence is approximately 0.8 seconds without GGS and around 80 seconds with GGS (including 20 seconds for matching extraction).

You can choose to enable or disable GGS (or other settings) in ./cfgs/default.yaml.
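
To illustrate what toggling a nested setting such as GGS amounts to, the sketch below applies a dotted-key override to a plain dictionary. The dictionary is a hypothetical stand-in for the parsed contents of ./cfgs/default.yaml. Only the `GGS.enable` key name is taken from this README (it is mentioned for the test config), and `apply_override` is an illustrative helper, not part of the repository.

```python
import copy


def apply_override(cfg: dict, dotted_key: str, value) -> dict:
    """Return a copy of cfg with the nested key (e.g. "GGS.enable") set to value."""
    new_cfg = copy.deepcopy(cfg)
    node = new_cfg
    *parents, leaf = dotted_key.split(".")
    for name in parents:
        # Walk down the nesting, creating intermediate dicts if needed.
        node = node.setdefault(name, {})
    node[leaf] = value
    return new_cfg


# Hypothetical stand-in for the parsed contents of ./cfgs/default.yaml.
default_cfg = {"image_folder": "samples/apple", "GGS": {"enable": True}}

fast_cfg = apply_override(default_cfg, "GGS.enable", False)
print(fast_cfg["GGS"]["enable"])     # False
print(default_cfg["GGS"]["enable"])  # True (the original is unchanged)
```

Returning a copy rather than mutating in place makes it easy to compare runs with and without GGS from the same base settings.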

We use Visdom for visualization by default. Make sure your Visdom settings are configured correctly so that the results display properly. Visdom is not required to run the model.

Training

1. Preprocess Annotations

Start by following the instructions here to preprocess the annotations of the Co3D V2 dataset. This will significantly reduce data processing time during training.

2. Specify Paths

Next, specify the paths for CO3D_DIR and CO3D_ANNOTATION_DIR in ./cfgs/default_train.yaml. CO3D_DIR should be set to the path where your downloaded Co3D dataset is located, while CO3D_ANNOTATION_DIR should point to the location of the annotation files generated after completing the preprocessing in step 1.
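
Before launching a long training run, it can be worth checking that both directories exist. The following is a minimal sketch. `missing_dirs` is a hypothetical helper, and the path values are placeholders to replace with the ones from your own config.

```python
from pathlib import Path


def missing_dirs(paths: dict) -> list:
    """Return the names of entries whose path is not an existing directory."""
    return [name for name, p in paths.items() if not Path(p).is_dir()]


if __name__ == "__main__":
    # Placeholder values; replace them with the paths from your own config.
    paths = {
        "CO3D_DIR": "/PATH/TO/CO3D",
        "CO3D_ANNOTATION_DIR": "/PATH/TO/CO3D_ANNOTATIONS",
    }
    for name in missing_dirs(paths):
        print(f"{name} does not point to an existing directory: {paths[name]}")
```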

3. Start Training

  • For single-GPU training:

    python train.py
  • For multi-GPU training, launch the training script with accelerate, e.g., to train on 8 GPUs (processes) on 1 node (machine):

    accelerate launch --num_processes=8 --multi_gpu --num_machines=1 train.py 

All configurations are specified in ./cfgs/default_train.yaml. Please note that we use Visdom to record logs.
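
The two launch modes above can be wrapped in a small helper when scripting experiments. This is a sketch only: `build_train_command` is a hypothetical function, and it uses just the commands and flags shown above.

```python
import shlex


def build_train_command(num_gpus: int, num_machines: int = 1) -> list:
    """Build the argument list for launching train.py on num_gpus GPUs."""
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if num_gpus == 1:
        # Single-GPU training runs the script directly.
        return ["python", "train.py"]
    # Multi-GPU training goes through accelerate, one process per GPU.
    return [
        "accelerate", "launch",
        f"--num_processes={num_gpus}",
        "--multi_gpu",
        f"--num_machines={num_machines}",
        "train.py",
    ]


print(shlex.join(build_train_command(8)))
# accelerate launch --num_processes=8 --multi_gpu --num_machines=1 train.py
```

The returned list can be passed to `subprocess.run` as-is, which avoids shell quoting issues.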

Testing

1. Specify Paths

Please specify the paths CO3D_DIR, CO3D_ANNOTATION_DIR, and resume_ckpt in ./cfgs/default_test.yaml. The flag resume_ckpt refers to your downloaded model checkpoint.

2. Run Testing

python test.py

You can try different testing settings by adjusting num_frames, GGS.enable, and other options in ./cfgs/default_test.yaml.

Acknowledgement

Thanks to the authors of denoising-diffusion-pytorch, guided-diffusion, hloc, and relpose for their great implementations.

License

See the LICENSE file for details about the license under which this code is made available.
