openfmnav's Introduction

OpenFMNav: Towards Open-Set Zero-Shot Object Navigation via Vision-Language Foundation Models

NAACL 2024 Findings

This is the official repository of OpenFMNav: Towards Open-Set Zero-Shot Object Navigation via Vision-Language Foundation Models.

Setup

Dataset Preparation

Please follow the HM3DSem instructions to download and prepare the dataset. The data directory should be laid out as follows:

data/
├── objectgoal_hm3d/
│   ├── train/
│   ├── val/
│   └── val_mini/
├── scene_datasets/
│   └── hm3d/
│       ├── minival/
│       └── val/
├── versioned_data/
├── matterport_category_mappings.tsv
└── object_norm_inv_perplexity.npy
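A quick way to confirm the layout before launching a run is a small check script. The following is only a sketch, not part of the repository: the function name `find_missing_entries` is hypothetical, and it assumes the `data/` directory sits in the current working directory. It tests only that each entry from the tree above exists, not that its contents are valid.

```python
from pathlib import Path

# Expected entries relative to the data/ root, copied from the tree above.
EXPECTED_ENTRIES = [
    "objectgoal_hm3d/train",
    "objectgoal_hm3d/val",
    "objectgoal_hm3d/val_mini",
    "scene_datasets/hm3d/minival",
    "scene_datasets/hm3d/val",
    "versioned_data",
    "matterport_category_mappings.tsv",
    "object_norm_inv_perplexity.npy",
]


def find_missing_entries(data_root):
    """Return the expected entries that do not exist under data_root (hypothetical helper)."""
    root = Path(data_root)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]


if __name__ == "__main__":
    # Assumption: the script is run from the directory that contains data/.
    missing = find_missing_entries("data")
    if missing:
        print("Missing entries under data/:")
        for entry in missing:
            print("  " + entry)
    else:
        print("data/ layout looks complete.")
```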

Checkpoints

Please check out Grounded-SAM to download groundingdino_swint_ogc.pth and sam_vit_h_4b8939.pth, and place both files in Grounded_SAM/.

Dependencies

  1. Python & PyTorch

    This code is tested on Python 3.9.16 on Ubuntu 20.04, with PyTorch 1.11.0+cu113.

  2. Habitat-Sim & Habitat-Lab

    # Habitat-Sim
    git clone https://github.com/facebookresearch/habitat-sim.git
    cd habitat-sim; git checkout tags/challenge-2022
    pip install -r requirements.txt
    python setup.py install --headless
    cd ..  # return to the parent directory before cloning Habitat-Lab

    # Habitat-Lab
    git clone https://github.com/facebookresearch/habitat-lab.git
    cd habitat-lab; git checkout tags/challenge-2022
    pip install -e .
    cd ..  # return to the parent directory
    
  3. Grounded-SAM

    Please check out Grounded-SAM to install its dependencies.

  4. Others

    pip install -r requirements.txt
    

OpenAI API keys

You will need an OpenAI API key to use this repo. Create a file named apikey.txt (for example, with touch apikey.txt) and paste your API key into it.
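To catch a missing or empty key file before a long evaluation starts, a small sanity check can help. This is a minimal sketch under assumptions: the helper `load_api_key` is hypothetical, and how the repository itself reads apikey.txt is not described here, so this only verifies that the file exists and is non-empty.

```python
from pathlib import Path


def load_api_key(path="apikey.txt"):
    """Read a key from a text file and strip surrounding whitespace (hypothetical helper)."""
    key_file = Path(path)
    if not key_file.is_file():
        raise FileNotFoundError(f"{path} not found; create it and paste your key inside.")
    key = key_file.read_text(encoding="utf-8").strip()
    if not key:
        raise ValueError(f"{path} is empty.")
    return key


if __name__ == "__main__":
    try:
        key = load_api_key()
    except (FileNotFoundError, ValueError) as err:
        print(f"Key check failed: {err}")
    else:
        # Report only the length so the key itself never reaches the terminal or logs.
        print(f"Loaded a key of {len(key)} characters.")
```

Stripping whitespace guards against a trailing newline, which editors commonly append when saving.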

Running

Example

An example command to run the pipeline:

CUDA_VISIBLE_DEVICES=0 python main.py --split val --eval 1 --auto_gpu_config 0 --prompt_type scoring \
-n 1 --num_eval_episodes 100 --text_threshold 0.55 --boundary_coeff 12 --start_episode 0 --tag_freq 100 \
--use_gtsem 0 --num_local_steps 20 --print_images 1 --exp_name test

Visualization

To make a demo video from your saved images, you can either use ffmpeg to make separate videos or use

python make_demo.py --exp_name test # add `--delete_img` to delete images after making video

to make batched videos.
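For the ffmpeg route, the sketch below builds one possible command line for a single video. It rests on several assumptions that are not stated in this README: the frames are PNG files, their names sort lexicographically in playback order (for example, zero-padded step numbers), and `path/to/saved_images` is a placeholder for wherever your run wrote its images. The helper `build_ffmpeg_command` is hypothetical, and glob input patterns may be unavailable in some ffmpeg builds.

```python
import shlex


def build_ffmpeg_command(image_dir, output_path, fps=10):
    """Build an ffmpeg argument list that encodes all PNGs in image_dir into one video."""
    return [
        "ffmpeg",
        "-framerate", str(fps),        # input frame rate
        "-pattern_type", "glob",       # let ffmpeg expand the wildcard itself
        "-i", f"{image_dir}/*.png",    # assumption: frames are PNG files
        "-c:v", "libx264",
        "-pix_fmt", "yuv420p",         # widely playable pixel format
        output_path,
    ]


if __name__ == "__main__":
    # Placeholder paths; replace them with your own image directory and output file.
    cmd = build_ffmpeg_command("path/to/saved_images", "demo.mp4")
    print(shlex.join(cmd))
```

The script only prints the command so it can be inspected first; pass the list to `subprocess.run(cmd, check=True)` to execute it. Note that libx264 with yuv420p requires even frame width and height.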

Acknowledgements

This repo is heavily based on L3MVN. We thank the authors for their great work.

Citation

If you find this work helpful, please consider citing:

@inproceedings{kuang2024openfmnav,
    title={Open{FMN}av: Towards Open-Set Zero-Shot Object Navigation via Vision-Language Foundation Models},
    author={Yuxuan Kuang and Hai Lin and Meng Jiang},
    booktitle={2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics},
    year={2024}
}

openfmnav's Issues

Getting IndexError: list index out of range

Hi, I'm getting the following exception:

 File "/home/ftaioli/projects/OpenFMNav/envs/habitat/objectgoal_env21.py", line 219, in _preprocess_semantic
    if self.scene.objects[se[i]].category.name() in self.hm3d_semantic_mapping:
IndexError: list index out of range

as soon as I start the evaluation with the following command:

CUDA_VISIBLE_DEVICES=0 python main.py --split val --eval 1 --auto_gpu_config 0 --prompt_type scoring -n 1 --num_eval_episodes 100 --text_threshold 0.55 --boundary_coeff 12 --start_episode 0 --tag_freq 100 --use_gtsem 0 --num_local_steps 20 --print_images 1 --exp_name test

Info

I downloaded the scene dataset with the following command:
python -m habitat_sim.utils.datasets_download --username <api-token-id> --password <api-token-secret> --uids hm3d_val_v0.1

and downloaded the task dataset from here.
Specifically, I've tried both objectnav_hm3d_v1.zip and objectnav_hm3d_v2.zip.

Did you encounter this particular issue?

Issue with segment_anything

When I run the code, I get the error: cannot import name 'build_sam_hq'

>>> import segment_anything
>>> from segment_anything import SamPredictor
>>> from segment_anything import sam_model_registry
>>> from segment_anything import build_sam_hq
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name 'build_sam_hq' from 'segment_anything'
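A useful first diagnostic for an import error like this is to check which installation of the module Python actually loads and whether it exposes the name. The sketch below does only that. The helper `describe_module` is hypothetical. One possible cause, which is an assumption and not confirmed in this thread, is that a different segment_anything installation is being picked up than the one Grounded-SAM provides.

```python
import importlib


def describe_module(module_name, attribute):
    """Return (module file path or None, whether the module exposes the attribute)."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module itself is not importable in this environment.
        return None, False
    return getattr(module, "__file__", None), hasattr(module, attribute)


if __name__ == "__main__":
    location, found = describe_module("segment_anything", "build_sam_hq")
    print("segment_anything loaded from:", location)
    print("build_sam_hq available:", found)
```

If the printed path points into site-packages and not into your Grounded-SAM checkout, comparing the two installations is a reasonable next step.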
