psl-instancenav's Introduction

Prioritized Semantic Learning for Zero-shot Instance Navigation

Xander Sun1, Louis Law, Hoyard Zhi, Ronghe Qiu1, and Junwei Liang1

1 AI Thrust, The Hong Kong University of Science and Technology (Guangzhou)

Overview

We present PSL, a zero-shot approach for learning instance-level navigation skills. The agent is given a language goal describing a specific object instance to find in the current scene, for example, a chair that is made of black leather and located near two windows.

InstanceNav vs. ObjectNav.

Specifically, we propose a semantic-enhanced PSL agent and introduce a prioritized semantic training strategy that selects goal images exhibiting clear semantic supervision and relaxes the reward function from strict exact-view matching. At inference time, a semantic expansion inference scheme keeps the goal semantics at the same level of granularity as in training.

Model Architecture for PSL.

Main Results

ObjectNav Results

Methods      with Mapping   with LLM   SR     SPL
L3MVN                                  35.2   16.5
PixelNav                               37.9   20.5
ESC                                    39.2   22.3
CoW                                    6.1    3.9
ProcTHOR                               13.2   7.7
ZSON                                   25.5   12.6
PSL (Ours)                             42.4   19.2

InstanceNav Results (Text-Goal)

Methods      with Mapping   with LLM   SR     SPL
CoW                                    1.8    1.1
CoW                                    7.2    4.2
ESC                                    6.5    3.7
OVRL                                   3.7    1.8
ZSON                                   10.6   4.9
PSL (Ours)                             16.5   7.5
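
SR and SPL in the tables above are the usual embodied-navigation metrics: success rate, and success weighted by path length. The following is a minimal sketch of how they are commonly computed, assuming the standard definition SPL = (1/N) Σ S_i · l_i / max(p_i, l_i), where l_i is the shortest-path length and p_i is the length of the path the agent took. The `Episode` record and its field names are hypothetical and are not taken from the PSL code base, and the sample episodes are made up for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Episode:
    """Hypothetical per-episode record; field names are illustrative only."""
    success: bool          # did the agent stop at the goal?
    shortest_path: float   # shortest-path distance from start to goal (l_i)
    agent_path: float      # length of the path the agent actually took (p_i)


def success_rate(episodes: Sequence[Episode]) -> float:
    """Fraction of successful episodes, in [0, 1]."""
    return sum(e.success for e in episodes) / len(episodes)


def spl(episodes: Sequence[Episode]) -> float:
    """Success weighted by path length, in [0, 1]; failures contribute 0."""
    total = 0.0
    for e in episodes:
        if e.success:
            total += e.shortest_path / max(e.agent_path, e.shortest_path)
    return total / len(episodes)


# Made-up episodes, not results from the paper.
episodes = [
    Episode(success=True, shortest_path=5.0, agent_path=10.0),   # contributes 0.5
    Episode(success=True, shortest_path=4.0, agent_path=4.0),    # contributes 1.0
    Episode(success=False, shortest_path=3.0, agent_path=9.0),   # contributes 0.0
    Episode(success=False, shortest_path=6.0, agent_path=2.0),   # contributes 0.0
]
print(f"SR  = {100 * success_rate(episodes):.1f}")
print(f"SPL = {100 * spl(episodes):.1f}")
```

The factor of 100 assumes the tables report percentages. Because SPL can never exceed SR, a method can lead on SR while trailing on SPL if its successful paths are longer than necessary.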

Installation

All the required data can be downloaded from here.

  1. Create a conda environment:

    conda create -n psl python=3.7 cmake=3.14.0
    
    conda activate psl
    
  2. Install PyTorch version 1.10.2:

    pip install torch==1.10.2+cu113 torchvision==0.11.3+cu113 -i https://download.pytorch.org/whl/cu113
    
  3. Install habitat-sim:

    conda install habitat-sim-challenge-2022 headless -c conda-forge -c aihabitat
    
  4. Install habitat-lab:

    git clone --branch challenge-2022 https://github.com/facebookresearch/habitat-lab.git habitat-lab-challenge-2022
    
    cd habitat-lab-challenge-2022
    
    pip install -r requirements.txt
    
    # install habitat and habitat_baselines
    python setup.py develop --all 
    
    cd ..
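
After steps 1 to 4, a quick check that the packages are visible to the `psl` environment can save a failed evaluation run later. The sketch below only asks Python whether each module can be located and does not import it; `find_modules` is a hypothetical helper name, and the module list assumes the usual import names for these packages (`habitat_sim` for habitat-sim, `habitat` and `habitat_baselines` for habitat-lab).

```python
import importlib.util
import sys
from typing import Dict, Iterable

# Top-level import names of the packages installed in steps 2-4 (assumed names).
REQUIRED_MODULES = ("torch", "torchvision", "habitat_sim", "habitat", "habitat_baselines")


def find_modules(names: Iterable[str]) -> Dict[str, bool]:
    """Map each module name to whether Python can locate it, without importing it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    print("python", sys.version.split()[0])
    for name, found in find_modules(REQUIRED_MODULES).items():
        print(f"{name:18s} {'ok' if found else 'MISSING'}")
```

Run it with the `psl` environment activated. A module reported as MISSING usually means the corresponding install step ran in a different environment.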

Install PSL:

  1. Setup steps

    pip install -r requirements.txt
    
    python setup.py develop
    
  2. Follow the instructions here to set up the data/scene_datasets/ directory. Gibson scenes can be found here.

  3. Download the HM3D objectnav dataset from ZSON.

    wget https://dl.fbaipublicfiles.com/habitat/data/datasets/objectnav/hm3d/v1/objectnav_hm3d_v1.zip
    
    unzip objectnav_hm3d_v1.zip -d data/datasets/objectnav/
    
    # clean-up
    rm objectnav_hm3d_v1.zip

  4. Download the HM3D instance navigation dataset.

    # download the original Instance Image Navigation dataset
    wget https://dl.fbaipublicfiles.com/habitat/data/datasets/imagenav/hm3d/v3/instance_imagenav_hm3d_v3.zip
    
    unzip instance_imagenav_hm3d_v3.zip -d data/datasets/
    
    # clean-up
    rm instance_imagenav_hm3d_v3.zip
    
    mkdir -p data/datasets/instancenav/val
    
    # download the attribute descriptions
    wget --no-check-certificate "https://drive.google.com/uc?export=download&id=1KNdv6isX1FDZi4KCVPiECYDxijg9cZ3L" -O data/datasets/instancenav/val/val_text.json.gz
    
    # symlink the content directory of the original dataset next to the text descriptions
    export PROJECT_ROOT="$(pwd)"
    cd data/datasets/instancenav/val/
    ln -s "$PROJECT_ROOT/data/datasets/instance_imagenav_hm3d_v3/val/content" .
    cd "$PROJECT_ROOT"

  5. Download the trained checkpoint PSL_Instancenav.pth and move it to data/models.

  6. Download the retrieved goal embeddings hm3d_objectnav.imagenav_v2.pth and hm3d_instancenav.imagenav_v2.pth:

    mkdir -p data/goal_datasets/objectnav data/goal_datasets/instancenav
    
    # download retrieved goal embeddings
    wget --no-check-certificate "https://drive.google.com/uc?export=download&id=1spyyqfsSfhHL8pp5DG6aBZjozNny4IQd" -O data/goal_datasets/objectnav/hm3d_objectnav.imagenav_v2.pth
    
    wget --no-check-certificate "https://drive.google.com/uc?export=download&id=1UhA132XoQB0-4sflfBeBl-0uTt4RtidG" -O data/goal_datasets/instancenav/hm3d_instancenav.imagenav_v2.pth

  7. Set up data/goal_datasets using the script tools/extract-goal-features.py. This caches CLIP goal embeddings for faster training.

    Your directory structure should now look like this:

    .
    +-- habitat-lab-challenge-2022/
    |   ...
    +-- zson/
    |   +-- data/
    |   |   +-- datasets/
    |   |   |   +-- objectnav/
    |   |   |   +-- imagenav/
    |   |   +-- scene_datasets/
    |   |   |   +-- hm3d/
    |   |   |   +-- mp3d/
    |   |   +-- goal_datasets/
    |   |   |   +-- imagenav/
    |   |   |   |   +-- hm3d/
    |   |   +-- models/
    |   |   |   +-- PSL_Instancenav.pth
    |   +-- zson/
    |   ...
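
Before running evaluation, it can help to confirm that the files from steps 3 to 6 landed where the download commands put them. Google Drive links fetched with wget may return an HTML page instead of the file, so the sketch below also checks that val_text.json.gz starts with the gzip magic bytes. The paths are the ones used in the commands above; `missing_paths` and `looks_like_gzip` are hypothetical helper names, not part of the PSL code base.

```python
from pathlib import Path
from typing import Iterable, List

# Paths taken from the download commands in steps 3-6, relative to the project root.
EXPECTED_PATHS = (
    "data/datasets/objectnav",
    "data/datasets/instancenav/val/val_text.json.gz",
    "data/datasets/instancenav/val/content",
    "data/models/PSL_Instancenav.pth",
    "data/goal_datasets/objectnav/hm3d_objectnav.imagenav_v2.pth",
    "data/goal_datasets/instancenav/hm3d_instancenav.imagenav_v2.pth",
)


def missing_paths(root: Path, expected: Iterable[str] = EXPECTED_PATHS) -> List[str]:
    """Return the expected paths that do not exist under root.

    A dangling symlink counts as missing because exists() follows links.
    """
    return [p for p in expected if not (root / p).exists()]


def looks_like_gzip(path: Path) -> bool:
    """True if the file starts with the two-byte gzip magic number 0x1f 0x8b."""
    with open(path, "rb") as f:
        return f.read(2) == b"\x1f\x8b"


if __name__ == "__main__":
    root = Path(".")
    for p in missing_paths(root):
        print("missing:", p)
    text_goals = root / "data/datasets/instancenav/val/val_text.json.gz"
    if text_goals.exists() and not looks_like_gzip(text_goals):
        print("not a gzip file (possibly an HTML page):", text_goals)
```

Run it from the project root. No output means every expected path exists and the text-goal file at least looks like a gzip archive; it does not validate the contents of the checkpoints or embeddings.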
    

Evaluation

Evaluate the PSL agent on the ObjectNav task:

    bash scripts/eval/objectnav_hm3d.sh

Evaluate the PSL agent on the InstanceNav Text-Goal task:

    bash scripts/eval/instancenav_text_hm3d.sh
