
reverie's Introduction

REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments

🌟 [18/8/2023] Continuous REVERIE in the Habitat simulator is coming soon!

🌟 The 3rd REVERIE Challenge on CSIG is here! New data, new rules, and total prizes of 200,000 RMB (~30,000 USD)

🌟 Results of the 2nd REVERIE Challenge at the ICCV 2021 Workshop! See here for more details

🌟 Results of the 1st REVERIE Challenge at the ACL 2020 Workshop! See here for more details

🌟 Leaderboard here

Here are the pre-released code and data for the CVPR 2020 paper REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments

Table of Contents

0. Updates
1. Definition of the REVERIE Task
2. Install without Docker
3. Install with Docker
4. Train and Test the Model
5. Data Organization of the REVERIE Task
6. Integrate into Your Existing Project
7. Result File Format
8. Evaluation
9. Acknowledgements
10. Reference
REVERIE task example

0. Updates

So far, there are two versions of the REVERIE dataset (the original version here and the 2nd version here), and there are two technical routes to address the REVERIE task: using our pretrained referring expression grounding model or your own grounding model.

If you work on the original REVERIE + our pretrained grounding model (MAttNet), we suggest running the navigation and grounding code in two different environments. For the navigation code, you can try our out-of-the-box docker image: docker pull qykshr/ubuntu:orist, or build from this file for easy simulator setup. For the modified grounding code (MAttNet), refer to the related instructions in Section 2.

If you work on the original REVERIE + your own grounding model, you can use the docker image: docker pull qykshr/ubuntu:orist or build from this file.

If you work on the 2nd REVERIE + our pretrained grounding model (UNITER), you can use the docker image: docker pull qykshr/ubuntu:orist or build from this file for navigation, and see here for grounding.

If you work on the 2nd REVERIE + your own grounding model, you can use the docker image: docker pull qykshr/ubuntu:orist or build from this file.

If you are participating in the 2022 REVERIE Challenge, see the "How to start" section for more starter code.

1. Definition of the REVERIE Task

As shown in the above figure, a robot agent is given a natural language instruction referring to a remote object (here in the red bounding box) in a photo-realistic 3D environment. The agent must navigate to an appropriate location and identify the object from multiple distracting candidates. The blue discs indicate nearby navigable viewpoints provided by the simulator.

2. Install without Docker

Note* This section prepares everything to run or train our Navigator-Pointer model. If you are familiar with R2R and just want to do the REVERIE task, you can go directly to Section 6.

Note** If you have a fresh Ubuntu system, the following instructions should work well. If not, they may interfere with your existing project environments, and we recommend trying Section 3. Install with Docker instead.

Prerequisites

A C++ compiler with C++11 support is required. Matterport3D Simulator has several dependencies:

E.g. installing dependencies on Ubuntu:

sudo apt-get install libopencv-dev python-opencv freeglut3 freeglut3-dev libglm-dev libjsoncpp-dev doxygen libosmesa6-dev libosmesa6 libglew-dev

If you still lack some packages when running cmake/make or our code, refer to the contents of the Dockerfile.

2.1. Clone Repo

Clone the REVERIE repository:

git clone https://github.com/YuankaiQi/REVERIE.git
cd REVERIE

Note that our repository is based on the v0.1 version Matterport3DSimulator, which was originally proposed with the Room-to-Room dataset.

2.2. MAttNet3 Download

Download our pre-trained mini MAttNet3 from Google Drive or Baidu Yun (code: qts6), which is modified from MAttNet to support our model training. Unzip it into the MAttNet3 folder. This is used as our Pointer model.

2.3. Dataset Download

You need to download RGB images and house segmentation files of the Matterport3D dataset. The following data types are required:

  • matterport_skybox_images
  • house_segmentations

The metadata is also needed. Organise the data as below:

Matterport
|--v1
   |--metadata
   |--scans

Then set the 'matterportDir' parameter in trainFast.py to your Matterport directory.

2.4. Pre-computed Image Features Download

Download the tsv files from Matterport3DSimulator and extract them into the img_features directory. You only need the ImageNet features to replicate our results.
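To sanity-check the downloaded features, the minimal sketch below (not part of this repository) reads the TSV file. The field names and the 36-views-per-panorama layout follow the Matterport3DSimulator/R2R convention, so adjust them if your file differs.

# Hedged sketch: inspect the pre-computed ImageNet feature TSV.
import base64
import csv
import sys

import numpy as np

csv.field_size_limit(sys.maxsize)  # the base64-encoded feature strings are very long

TSV_FIELDS = ['scanId', 'viewpointId', 'image_w', 'image_h', 'vfov', 'features']

features = {}
with open('img_features/ResNet-152-imagenet.tsv', 'rt') as f:
    reader = csv.DictReader(f, delimiter='\t', fieldnames=TSV_FIELDS)
    for row in reader:
        key = '%s_%s' % (row['scanId'], row['viewpointId'])
        feats = np.frombuffer(base64.b64decode(row['features']), dtype=np.float32)
        features[key] = feats.reshape(36, -1)  # assumed: 36 discretised views per panorama

print('%d viewpoints loaded' % len(features))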

2.5. Installation with PyTorch

Let us get things ready to run experiments.

2.5.1. Create Anaconda Environment

# change "rog" (remote object grounding) to any name you prefer
conda create -n rog python=3.6

Activate the environment you just created:

conda activate rog

2.5.2. Install Special Requirements

pip install -r tasks/REVERIE/requirements.txt

2.5.3. Install PyTorch

# with CUDA 9.0
conda install pytorch=0.4.0 cuda90 -c pytorch
conda install torchvision=0.2.0 -c pytorch

If you use a newer version, you will need to modify the code to load the pretrained models.
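For example, checkpoints saved with PyTorch 0.4.0 can fail to load under later versions because BatchNorm layers later gained a num_batches_tracked buffer. A common workaround, shown here as a hedged sketch rather than the authors' fix (the model and checkpoint path are placeholders), is to load the state dict non-strictly:

import torch
import torch.nn as nn

# Stand-in model; substitute your own network (e.g. the grounding model).
model = nn.Sequential(nn.Linear(8, 8), nn.BatchNorm1d(8))

# Placeholder path; point this at one of the released .pth files.
checkpoint = torch.load('path/to/checkpoint.pth', map_location='cpu')
if isinstance(checkpoint, dict) and 'model' in checkpoint:
    state_dict = checkpoint['model'].state_dict()  # checkpoint stores a whole module
else:
    state_dict = checkpoint

# strict=False skips buffers such as BatchNorm's num_batches_tracked, which
# were introduced after PyTorch 0.4.0 and are absent from old checkpoints.
model.load_state_dict(state_dict, strict=False)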

2.6. Compile the Matterport3D Simulator

Let us compile the simulator so that we can call its functions in python.

Build EGL version using CMake:

cd build
cmake -DOSMESA_RENDERING=ON ..

# Double-check that CMake finds the proper path to your python
# if not, remove the generated files and rerun cmake with the option below instead
rm -rf *
cmake -DOSMESA_RENDERING=ON -DPYTHON_EXECUTABLE:FILEPATH=/path/to/your/bin/python ..

make
cd ../

Note There are three rendering options, which are selected using cmake options during the build process:

  • Off-screen GPU rendering using EGL: cmake -DEGL_RENDERING=ON .. (Note: this is not supported by the v0.1 version of the Matterport3D Simulator, but the latest version does support it.)
  • Off-screen CPU rendering using OSMesa: cmake -DOSMESA_RENDERING=ON .. (Recommended)
  • GPU rendering using OpenGL (requires an X server): cmake ..

In general, the fastest approach for training agents is off-screen GPU rendering (EGL); since this repository builds on the v0.1 simulator, which lacks EGL support, off-screen CPU rendering (OSMesa) is the recommended option here.
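Once the build finishes, a quick way to confirm the compiled module works is a short Python check like the one below, run from the repository root with the data organised as above. This is a hedged sketch: the calls follow the v0.1 MatterSim Python bindings as we understand them, and the scan/viewpoint IDs are taken from the example task in Section 5, so adjust them to a scan you have downloaded.

import math
import sys

sys.path.append('build')  # location of the compiled MatterSim python module
import MatterSim

sim = MatterSim.Simulator()
sim.setCameraResolution(640, 480)
sim.setCameraVFOV(math.radians(60))
sim.setRenderingEnabled(False)  # only exercise the navigation graph, no images
sim.init()
sim.newEpisode('qoiz87JEwZ2', 'bdb1023cb7cc4ebd8245b9291fcbc1a2', 0, 0)
state = sim.getState()
print(state.location.viewpointId, len(state.navigableLocations), 'navigable locations')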

2.7. Compile MAttNet3

2.7.1. Compile pytorch-faster-rcnn

cd MAttNet3/pyutils/mask-faster-rcnn/lib

You may need to change the -arch version in Makefile to compile the cuda code:

GPU model                    Architecture
TitanX (Maxwell/Pascal)      sm_52
GTX 960M                     sm_50
GTX 1080 (Ti)                sm_61
Grid K520 (AWS g2.2xlarge)   sm_30
Tesla K80 (AWS p2.xlarge)    sm_37

Compile the CUDA-based nms and roi_pooling using the following simple command:

make

2.7.2. Compile refer

cd ../../refer
make

It will generate _mask.c and _mask.so in the external/ folder.

3. Install with Docker

We find that the success rate obtained with Docker is slightly lower than that obtained using environments built without Docker.

Prerequisites

  • Nvidia GPU with driver >= 384
  • Install docker
  • Install nvidia-docker2.0
  • Note: CUDA / CuDNN toolkits do not need to be installed (these are provided by the docker image)

3.1 Clone Repo

Clone the REVERIE repository:

git clone https://github.com/YuankaiQi/REVERIE.git
cd REVERIE

3.2. Dataset Download

First download the files as in Section 2.3. Then set an environment variable to the location of the dataset, where <PATH> is the full absolute path (not a relative path or symlink) to the directory 'v1':

export MATTERPORT_DATA_DIR=<PATH>

And set the 'matterportDir' parameter to 'data' in the trainFast.py file.

Note that if <PATH> is a remote sshfs mount, you will need to mount it with the -o allow_root option or the docker container won't be able to access this directory.

3.3. Dataset Preprocess

To make data loading faster and to reduce memory usage, we preprocess the matterport_skybox_images by downscaling and combining all cube faces into a single image using the following script:

./scripts/downsize_skybox.py

This will take a while depending on the number of processes used. By default images are downscaled by 50% and 20 processes are used.

3.4. Build Simulator

Build the docker image:

docker build -t reverie .

Run the docker container, mounting both the git repo and the dataset:

nvidia-docker run -it --mount type=bind,source=$MATTERPORT_DATA_DIR,target=/root/mount/Matterport3DSimulator/data/v1,readonly --volume `pwd`:/root/mount/Matterport3DSimulator reverie

Now (from inside the docker container), build the simulator and run the unit tests:

cd /root/mount/Matterport3DSimulator
mkdir build && cd build
cmake -DEGL_RENDERING=ON ..
make
cd ../

Note There are three rendering options, which are selected using cmake options during the build process (by varying line 3 in the build commands immediately above):

  • Off-screen GPU rendering using EGL: cmake -DEGL_RENDERING=ON .. (Note: this is not supported by v0.1 of the Matterport3D Simulator, but the latest version does support it.)
  • Off-screen CPU rendering using OSMesa: cmake -DOSMESA_RENDERING=ON .. (Recommended)
  • GPU rendering using OpenGL (requires an X server): cmake ..

The recommended (fast) approach for training agents is using off-screen GPU rendering (EGL).

3.5. Compile MAttNet3

3.5.1. Compile pytorch-faster-rcnn

cd MAttNet3/pyutils/mask-faster-rcnn/lib

You may need to change the -arch version in Makefile to compile the cuda code:

GPU model                    Architecture
TitanX (Maxwell/Pascal)      sm_52
GTX 960M                     sm_50
GTX 1080 (Ti)                sm_61
Grid K520 (AWS g2.2xlarge)   sm_30
Tesla K80 (AWS p2.xlarge)    sm_37

Compile the CUDA-based nms and roi_pooling using the following simple command:

make

3.5.2. Compile refer

cd ../../refer
make

It will generate _mask.c and _mask.so in the external/ folder.

3.6. Enter Simulator with X server

Run the docker container while sharing the host's X server and DISPLAY environment variable with the container:

xhost +
nvidia-docker run -it -e DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix --mount type=bind,source=$MATTERPORT_DATA_DIR,target=/root/mount/Matterport3DSimulator/data/v1,readonly --volume `pwd`:/root/mount/Matterport3DSimulator reverie
cd /root/mount/Matterport3DSimulator

If you get an error like Error: BadShmSeg (invalid shared segment parameter) 128 you may also need to include -e="QT_X11_NO_MITSHM=1" in the docker run command above.

4. Train and Test the Model

  • For training: you can download our pre-trained models from Google Drive or Baidu Yun. If you want to train by yourself, just run the following command:
python tasks/REVERIE/trainFast.py --feedback_method sample2step --experiment_name releaseCheck
  • For testing: you first need to obtain navigation results by running
python tasks/REVERIE/run_search.py

Then run the following command to obtain the grounded objects:

python tasks/REVERIE/groundingAfterNav.py

Now, you should get results in the 'experiment/releaseCheck/results/' folder.

Note that the results might be slightly different due to different dependent package versions or GPUs.

5. Data Organization of the REVERIE Task

In the tasks/REVERIE/data folder, you will find four files: REVERIE_train.json, REVERIE_val_seen.json, REVERIE_val_unseen.json, and REVERIE_test, which provide the instructions, paths, and target object of each task (except REVERIE_test, which withholds this ground truth). In the tasks/REVERIE/data/BBox folder, you will find json files that record the objects observed at each viewpoint within 3 meters.

  • Example of a train/val_seen/val_unseen json file
[
  {
    "distance": 11.65,     # distance to the goal viewpoint
    "ix": 208,             # reserved data, not used
    "scan": "qoiz87JEwZ2", # building ID
    "heading": 4.59,       # initial heading of the agent
    "path_id": 1357,       # inherited from the R2R dataset
    "objId": 66,           # the unique object ID in the current building
    "id": "1357_66",       # task id
    "instructions": [      # instructions collected for REVERIE
        "Go to the entryway and clean the coffee table",
        "Go to the foyer and wipe down the coffee table",
        "Go to the foyer on level 1 and pull out the coffee table further from the chair"
    ],
    "path": [              # inherited from the R2R dataset
        "bdb1023cb7cc4ebd8245b9291fcbc1a2",
        "a6ba3f53b7964464b23341896d3c75fa",
        "c407e34577aa4724b7e5d447a5d859d1",
        "9f68b19f50d14f5d8371447f73c3a2e3",
        "150763c717894adc8ccbbbe640fa67ef",
        "59b190857cfe47f691bf0d866f1e5aeb",
        "267a7e2459054db7952fc1e3e45e98fa"
    ],
    "instructions_l": [    # inherited from the R2R dataset, provided just for convenience
        "Walk into the dining room and continue past the table. Turn left when you xxx ",
       ...
    ]
  },
  ...
]
  • Example of a json file in the BBox folder (a parsing sketch follows the example)

    File name format: ScanID_ViewpointID.json, e.g., VzqfbhrpDEA_57fba128d2f042f7a59793c665a3f587.json

{ # note that this is a dict, not a list
  "57fba128d2f042f7a59793c665a3f587": { # this key is the viewpoint ID
    "827": { # this key is the object ID
      "name": "toilet",
      "visible_pos": [
        6, 7, 8, 9, 19, 20   # view indices (0~35) that contain the object; indices are consistent with R2R
      ],
      "bbox2d": [
        [585, 382, 55, 98],  # [x, y, w, h], one box per view listed in "visible_pos"
        ...
      ]
    },
    "833": {
       ...
    },
    ...
  }
}
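As a hedged illustration of this format, the following sketch (not part of the repository) reads the example file named above and prints each box together with the view it belongs to; each file appears to contain a single viewpoint key.

import json

# Example file name taken from the format description above; adjust the path to your checkout.
bbox_file = 'tasks/REVERIE/data/BBox/VzqfbhrpDEA_57fba128d2f042f7a59793c665a3f587.json'
with open(bbox_file) as f:
    data = json.load(f)

viewpoint_id = next(iter(data))  # one viewpoint per file
for obj_id, obj in data[viewpoint_id].items():
    # bbox2d holds one [x, y, w, h] box per view index listed in visible_pos.
    for view_idx, (x, y, w, h) in zip(obj['visible_pos'], obj['bbox2d']):
        print('object %s (%s) in view %d: x=%d y=%d w=%d h=%d'
              % (obj_id, obj['name'], view_idx, x, y, w, h))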

6. Integrate into Your Existing Project

The easiest way to integrate REVERIE into your project is to preload all the object bounding_box/label/visible_pos annotations with the loadObjProposals() function as in the eval_release.py file. Then you are able to access the visible objects using ScanID_ViewpointID as the key. You can use any referring expression grounding method to match objects against an instruction.
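The sketch below approximates that kind of preloading; it is an illustration of the idea, not the repository's actual loadObjProposals() implementation, and the directory path follows the layout in Section 5.

import glob
import json
import os

def load_obj_proposals(bbox_dir='tasks/REVERIE/data/BBox'):
    """Map 'ScanID_ViewpointID' -> {objId: {'name', 'visible_pos', 'bbox2d'}}."""
    proposals = {}
    for path in glob.glob(os.path.join(bbox_dir, '*.json')):
        scan_id, viewpoint_id = os.path.splitext(os.path.basename(path))[0].split('_', 1)
        with open(path) as f:
            data = json.load(f)
        proposals['%s_%s' % (scan_id, viewpoint_id)] = data[viewpoint_id]
    return proposals

# Example usage: list the objects visible around one viewpoint.
objs = load_obj_proposals()['VzqfbhrpDEA_57fba128d2f042f7a59793c665a3f587']
print(sorted(obj['name'] for obj in objs.values()))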

Note The number of instructions may vary across the dataset, so we recommend the following way to index an instruction:

instrType = "instructions"
self.instr_ids += ['%s_%d' % (str(item['id']),i) for i in range(len(item[instrType]))]

7. Result File Format

Just add the "'predObjId': int value" pair into your navigation results. That's it!

Below is a toy sample:

[
  {
    "trajectory": [
      [
        "a68b5ae6571e4a66a4727573b88227e4", 
        3.141592653589793, 
        0.0
      ], 
      ...
     ],
     "instr_id": "4774_267_1", 
     "predObjId": 402
  },
  ...
]
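A hedged sketch of producing such a file is shown below; the navigation result and the predicted object ID are placeholders copied from the toy sample, and in practice the navigation results come from run_search.py and the predictions from your grounding model.

import json

# Placeholder navigation results (normally produced by run_search.py).
nav_results = [
    {
        'instr_id': '4774_267_1',
        'trajectory': [['a68b5ae6571e4a66a4727573b88227e4', 3.141592653589793, 0.0]],
    },
]

# instr_id -> predicted object ID from your grounding model (placeholder value).
predictions = {'4774_267_1': 402}

for item in nav_results:
    item['predObjId'] = int(predictions[item['instr_id']])

with open('submission.json', 'w') as f:  # 'submission.json' is a placeholder name
    json.dump(nav_results, f, indent=2)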

8. Evaluation

For the val_seen and val_unseen splits, you can use the eval_release.py file to evaluate your results. For the test split, you need to submit your result file to the evaluation server as described here.
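Before running the official script, a quick local check can compare the predicted object IDs against the annotations. The sketch below only measures how often predObjId matches the annotated objId and ignores navigation success, so it is not the metric computed by eval_release.py or the server; treat it as a rough upper bound on remote grounding success.

import json

# Ground truth: task id -> annotated object ID.
with open('tasks/REVERIE/data/REVERIE_val_unseen.json') as f:
    gt = {item['id']: item['objId'] for item in json.load(f)}

with open('submission.json') as f:  # placeholder name for your result file
    results = json.load(f)

correct = 0
for item in results:
    # instr_id is '<task id>_<instruction index>' (see Section 6), so dropping
    # the last field recovers the task id used in the annotations.
    task_id = item['instr_id'].rsplit('_', 1)[0]
    correct += int(item['predObjId'] == gt.get(task_id))

print('object match rate: %.4f' % (correct / max(len(results), 1)))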

9. Acknowledgements

We would like to thank Matterport for allowing the Matterport3D dataset to be used by the academic community. We also thank Philip Roberts, Zheng Liu, Zizheng Pan, and Sam Bahrami for their great help in building the dataset. This project is supported by the Australian Centre for Robotic Vision.

10. Reference

The REVERIE task and dataset are described in the following paper:

@inproceedings{reverie,
  title={REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments},
  author={Yuankai Qi and Qi Wu and Peter Anderson and Xin Wang and William Yang Wang and Chunhua Shen and Anton van den Hengel},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2020}
}


reverie's Issues

Does the data file "REVERIE_val_train_seen.json" exist?

Hi, I have recently been running Recurrent VLN on REVERIE. It reports that the data file "REVERIE_val_train_seen.json" is missing, but I couldn't find it in your project. Does this file exist? If so, could you provide it? Thanks!

Can't meet the requirements of the code

Hi! I am trying to run your code following the readme.md, and I have installed CUDA 9.0 and pytorch 0.4.0, just as mentioned. However, I still can't run the code successfully. When I run python tasks/REVERIE/trainFast.py --feedback_method sample2step --experiment_name releaseCheck I get this error:
RuntimeError: cublas runtime error : the GPU program failed to execute at /opt/conda/conda-bld/pytorch_1524584710464/work/aten/src/THC/THCBlas.cu:249
It seems that my GPU (RTX 2060) just can't support CUDA 9.0.
I don't know if there is any way to address this problem, and I look forward to an update of the code.
Thx :)

Load grounding model error

Loading grounding model
/home/test/anaconda3/envs/rog_pytorch_0.4.1_cuda_9.2/lib/python3.7/site-packages/torch/nn/modules/rnn.py:38: UserWarning: dropout option adds dropout after all but last recurrent layer, so non-zero dropout expects num_layers greater than 1, but got dropout=0.2 and num_layers=1
  "num_layers={}".format(dropout, num_layers))
attribute predict layer is commented
/home/test/anaconda3/envs/rog_pytorch_0.4.1_cuda_9.2/lib/python3.7/site-packages/torch/serialization.py:425: SourceChangeWarning: source code of class 'torch.nn.modules.sparse.Embedding' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
/home/test/anaconda3/envs/rog_pytorch_0.4.1_cuda_9.2/lib/python3.7/site-packages/torch/serialization.py:425: SourceChangeWarning: source code of class 'torch.nn.modules.dropout.Dropout' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
/home/test/anaconda3/envs/rog_pytorch_0.4.1_cuda_9.2/lib/python3.7/site-packages/torch/serialization.py:425: SourceChangeWarning: source code of class 'torch.nn.modules.container.Sequential' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
/home/test/anaconda3/envs/rog_pytorch_0.4.1_cuda_9.2/lib/python3.7/site-packages/torch/serialization.py:425: SourceChangeWarning: source code of class 'torch.nn.modules.linear.Linear' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
/home/test/anaconda3/envs/rog_pytorch_0.4.1_cuda_9.2/lib/python3.7/site-packages/torch/serialization.py:425: SourceChangeWarning: source code of class 'torch.nn.modules.activation.ReLU' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
/home/test/anaconda3/envs/rog_pytorch_0.4.1_cuda_9.2/lib/python3.7/site-packages/torch/serialization.py:425: SourceChangeWarning: source code of class 'torch.nn.modules.rnn.LSTM' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
/home/test/anaconda3/envs/rog_pytorch_0.4.1_cuda_9.2/lib/python3.7/site-packages/torch/serialization.py:425: SourceChangeWarning: source code of class 'layers.visual_encoder.SubjectEncoder' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
/home/test/anaconda3/envs/rog_pytorch_0.4.1_cuda_9.2/lib/python3.7/site-packages/torch/serialization.py:425: SourceChangeWarning: source code of class 'torch.nn.modules.batchnorm.BatchNorm1d' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
/home/test/anaconda3/envs/rog_pytorch_0.4.1_cuda_9.2/lib/python3.7/site-packages/torch/serialization.py:425: SourceChangeWarning: source code of class 'torch.nn.modules.activation.Tanh' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
Traceback (most recent call last):
  File "tasks/REVERIE/trainFast.py", line 416, in <module>
    utilsFast.run(make_arg_parser(), train_val)
  File "/home/test/REVERIE/tasks/REVERIE/utilsFast.py", line 309, in run
    entry_function(args)
  File "tasks/REVERIE/trainFast.py", line 333, in train_val
    agent, train_env, val_envs = train_setup(args)
  File "tasks/REVERIE/trainFast.py", line 306, in train_setup
    args, vocab, train_splits, val_splits)
  File "tasks/REVERIE/trainFast.py", line 272, in make_env_and_models
    agent = make_follower(args, vocab)
  File "tasks/REVERIE/trainFast.py", line 228, in make_follower
    agent.pointer = Pointer(args)
  File "/home/test/REVERIE/tasks/REVERIE/modelFast.py", line 1433, in __init__
    self.model = self.loadmodel()
  File "/home/test/REVERIE/tasks/REVERIE/modelFast.py", line 1797, in loadmodel
    model.load_state_dict(checkpoint['model'].state_dict())
  File "/home/test/anaconda3/envs/rog_pytorch_0.4.1_cuda_9.2/lib/python3.7/site-packages/torch/nn/modules/module.py", line 719, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for JointMatching:
        Missing key(s) in state_dict: "sub_encoder.att_fuse.1.num_batches_tracked", "sub_matching.vis_emb_fc.1.num_batches_tracked", "sub_matching.vis_emb_fc.5.num_batches_tracked", "sub_matching.lang_emb_fc.1.num_batches_tracked", "sub_matching.lang_emb_fc.5.num_batches_tracked", "loc_matching.vis_emb_fc.1.num_batches_tracked", "loc_matching.vis_emb_fc.5.num_batches_tracked", "loc_matching.lang_emb_fc.1.num_batches_tracked", "loc_matching.lang_emb_fc.5.num_batches_tracked", "rel_matching.vis_emb_fc.1.num_batches_tracked", "rel_matching.vis_emb_fc.5.num_batches_tracked", "rel_matching.lang_emb_fc.1.num_batches_tracked", "rel_matching.lang_emb_fc.5.num_batches_tracked".

I tried to run training, but it seems that when loading the grounding model, the JointMatching model does not match the mrcn_cmr_with_st.pth file. Can you help check whether they match?

Bounding box extraction

Hi!

I'm interested in using the bounding boxes annotated in REVERIE. I noticed that the categories are similar to the ones in the metadata of Matterport3D. Also, some objects are not annotated (such as doors).

Could you explain how the bounding boxes were annotated in REVERIE?

Thanks in advance,
Benjamin

QUESTION ABOUT TESTING

Hi Yuankai,

I used the Recurrent VLN-BERT to train a sequence-to-sequence agent and got the best val unseen model. However, when I test it on the website on the test split, it underperforms the val unseen results by a large margin (-8% on SR and SPL). Is this normal?

Bounding Box Categories

In the paper, it is said that there are 4,140 target objects in the dataset, falling into 489 categories. Can you release the category names? I tried to extract the categories from the json files but failed to get the number 489.

Models implementation

Hello, are you planning to release the code to train baselines and your Navigator-Pointer model anytime soon?

Great contribution by the way, I loved your paper.

Something wrong when building docker

Hi, Yuankai.
I was running docker build -t reverie . and got the following error.

Reading package lists...
Reading package lists...
Building dependency tree...
Reading state information...
E: Unable to locate package python3.6
E: Couldn't find any package by glob 'python3.6'
E: Couldn't find any package by regex 'python3.6'
The command '/bin/sh -c apt-get update && apt-get install -y python3.6' returned a non-zero code: 100

It seems that RUN add-apt-repository ppa:deadsnakes/ppa in the dockerfile didn't work. Is there any solution?

Your reply would be very much appreciated.

OpenCV Error: OpenGL API call (Can't Create A GL Device Context) in createGlContext

When I run :python tasks/REVERIE/trainFast.py
The following error occurred:

root@e99d*******:~/mount/Matterport3DSimulator# python tasks/REVERIE/trainFast.py --feedback_method sample2step --experiment_name releaseCheck

Loading image features from img_features/ResNet-152-imagenet.tsv
Writing vocab of size 1144 to tasks/REVERIE/data/train_vocab.txt
Loading REFER ...
REVERIE Batch loaded with 10466 instructions, using splits: train
Using gt-based pointer
Gtk-Message: Failed to load module "canberra-gtk-module"
OpenCV Error: OpenGL API call (Can't Create A GL Device Context) in createGlContext, file /build/opencv-ys8xiq/opencv-2.4.9.1+dfsg/modules/highgui/src/window_gtk.cpp, line 703
Traceback (most recent call last):
File "tasks/REVERIE/trainFast.py", line 458, in
utilsFast.run(make_arg_parser(), train_val)
File "/root/mount/Matterport3DSimulator/tasks/REVERIE/utilsFast.py", line 309, in run
entry_function(args)
File "tasks/REVERIE/trainFast.py", line 368, in train_val
agent, train_env, val_envs = train_setup(args)
File "tasks/REVERIE/trainFast.py", line 341, in train_setup
args, vocab, train_splits, val_splits)
File "tasks/REVERIE/trainFast.py", line 307, in make_env_and_models
agent = make_follower(args, vocab)
File "tasks/REVERIE/trainFast.py", line 259, in make_follower
agent.pointer = Pointer(args)
File "/root/mount/Matterport3DSimulator/tasks/REVERIE/modelFast.py", line 1465, in init
self.sim = self.make_sim(args.WIDTH,args.HEIGHT,args.VFOV)
File "/root/mount/Matterport3DSimulator/tasks/REVERIE/modelFast.py", line 1498, in make_sim
sim.init()
RuntimeError: /build/opencv-ys8xiq/opencv-2.4.9.1+dfsg/modules/highgui/src/window_gtk.cpp:703: error: (-219) Can't Create A GL Device Context in function createGlContext

How to solve this problem?

Code release

When will the code be released for training and testing ?

Thanks

"Gtk-WARNING **: cannot open display"

Hello, I am trying "Install with Docker" exactly according to the README.
At the beginning, I just run nvidia-docker run -it --mount type=bind,source=$MATTERPORT_DATA_DIR,target=/root/mount/Matterport3DSimulator/data/v1,readonly --volume `pwd`:/root/mount/Matterport3DSimulator reverie to start the docker container (without "Enter Simulator with X server").
When I try to test the trained model with python tasks/REVERIE/run_search.py, I get the following output:

Loading image features from img_features/ResNet-152-imagenet.tsv
Loading REFER ...
REVERIE Batch loaded with 1423 instructions, using splits: val_seen
Loading REFER ...
REVERIE Batch loaded with 3521 instructions, using splits: val_unseen
Using gt-based pointer

(renderwin:13): Gtk-WARNING **: cannot open display:

Then I tried "Enter Simulator with X server" and ran the same test as above. I got the following output:

Loading image features from img_features/ResNet-152-imagenet.tsv
Loading REFER ...
REVERIE Batch loaded with 1423 instructions, using splits: val_seen
Loading REFER ...
REVERIE Batch loaded with 3521 instructions, using splits: val_unseen
Using gt-based pointer

(renderwin:142): Gtk-WARNING **: cannot open display: localhost:11.0

So I cannot run the test successfully (and of course not the training either).
My questions are:

  1. How can I correctly run the test (or training)?
  2. Is an X server a must to run the model?

I am using MobaXterm on Win10 to connect to the remote server running Ubuntu 18.04.5 LTS.
Any reply would be really appreciated!

How to correctly interpret the object names?

Hi, I am very interested in the REVERIE data, especially the BBOX annotations of different objects in Matterport3D.
However, I've got confused by the object names. Most of the names are single-word nouns which are easy to understand, like toilet. But I noticed there are also some compound objects with names composed of multiple words connected by special symbols like # and /. For example, there are floor#/#room#above, dining#chair, etc. So I just want to ask how I should understand these compound names, e.g., what is the rule for using different symbols like # and /?

bbox category

Can you please tell me where the object_label-to-object_name (e.g. 73-"light") dictionary file is in the REVERIE dataset?

Question About Data Organization of the REVERIE Task

Hi!

I'm interested in using the bounding boxes annotated in REVERIE. I noticed that the categories are similar to the ones in the metadata of Matterport3D. Also, some objects are not annotated (such as doors).

While viewing the json files in the bbox folder, I noticed that the images used for extracting the bbox2d are 640 * 480 pixels. But my images are 512 * 512, as they come from the matterport_skybox_images of the Matterport3D dataset. I'm wondering where the 640 * 480 images come from?

What's more, could you explain how the 36 view indices (0~35) that contain the object are formed? Does each of them correspond to a 640 * 480 image?

Thanks in advance,
Zheyuan Liu

Upgrading Pytorch version

Hi,

Thanks for sharing this code!

It would be nice to bump up the PyTorch version. 0.4.0 is very old and a lot has changed since. It also means we can't use allennlp and a lot of other great libraries.

This requires some effort, since for now MAttNet does not compile with PyTorch 1.7.0. Moreover, a few breaking (but silent) changes were made in PyTorch 1.1: the code could run but not reproduce existing baselines.

Thanks

Object Id confusion

{ # note that this is in the variable type of dict not list
  "57fba128d2f042f7a59793c665a3f587":{ # this key is the id of viewpoint
    "827":{ # the key if object ID
      "name": "toilet",
      "visible_pos":[
        6,7,8,9,19,20  # view index (0~35) which contain the object. Index is consitent with that in R2R 
        ],
      "bbox2d":[
        [585,382,55,98], # [x,y,w,h] and corresponds to the views listed in the "visible_pos"
        ...
       ]
    },
    "833": {
       ...
    },
    ...
  }
}

In this example, is "827" the object ID? If yes, I found that many object names correspond to this one ID. If this is not the object ID, can you point out where I can get the object IDs you used when creating the REVERIE data?
