Matterport3D Simulator

AI Research Platform for Reinforcement Learning from Real Panoramic Images.

The Matterport3D Simulator enables development of AI agents that interact with real 3D environments using visual information (RGB-D images). It is primarily intended for research in deep reinforcement learning, at the intersection of computer vision, natural language processing and robotics.

Concept

Visit the main website to view a demo.

NEW February 2019: We have released several updates. The simulator is now dockerized, supports batches of agents instead of just a single agent, and is far more efficient (faster) than before. It also now outputs depth maps as well as RGB images. As a consequence, there are some changes to the original API (mainly, all inputs and outputs are now batched). We have therefore tagged the original release as v0.1 for users who don't want to switch to the new version.

Features

  • Dataset consisting of 90 different predominantly indoor environments
  • Outputs RGB and depth images
  • All images and depth maps are real, not synthetic (providing much more visual complexity)
  • API for C++ and Python
  • Customizable image resolution, camera parameters, etc.
  • Supports off-screen rendering (both GPU and CPU based)
  • Fast (around 1000 fps RGB-D off-screen rendering at 640x480 resolution using a Titan X GPU)
  • Unit tests for the rendering pipeline, agent's motions, etc.
  • Future releases may support class and instance object segmentations.

Reference

The Matterport3D Simulator and the Room-to-Room (R2R) navigation dataset are described in the paper below. If you use the simulator or our dataset, please cite it (CVPR 2018 spotlight oral):

Bibtex:

@inproceedings{mattersim,
  title={{Vision-and-Language Navigation}: Interpreting visually-grounded navigation instructions in real environments},
  author={Peter Anderson and Qi Wu and Damien Teney and Jake Bruce and Mark Johnson and Niko S{\"u}nderhauf and Ian Reid and Stephen Gould and Anton van den Hengel},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2018}
}

Simulator Data

Matterport3D Simulator is based on densely sampled 360-degree indoor RGB-D images from the Matterport3D dataset. The dataset consists of 90 different indoor environments, including homes, offices, churches and hotels. Each environment contains full 360-degree RGB-D scans from between 8 and 349 viewpoints, spread on average 2.25m apart throughout the entire walkable floorplan of the scene.

Actions

At each viewpoint location, the agent can pan and elevate the camera. The agent can also choose to move between viewpoints. The precise details of the agent's observations and actions are described below and in the paper.

Room-to-Room (R2R) Navigation Task

The simulator includes the training data and evaluation metrics for the Room-to-Room (R2R) Navigation task, which requires an autonomous agent to follow a natural language navigation instruction to navigate to a goal location in a previously unseen building. Please refer to the task-specific instructions to set up and run this task. There is a test server and leaderboard available at EvalAI.

Installation / Build Instructions

We recommend using our Dockerfile to install the simulator. The simulator can also be built without Docker, but satisfying the project dependencies may be more difficult.

Prerequisites

  • Nvidia GPU with driver >= 396.37
  • Install docker
  • Install nvidia-docker2.0
  • Note: CUDA / CuDNN toolkits do not need to be installed (these are provided by the docker image)

Clone Repo

Clone the Matterport3DSimulator repository:

# Make sure to clone with --recursive
git clone --recursive https://github.com/peteanderson80/Matterport3DSimulator.git
cd Matterport3DSimulator

If you didn't clone with the --recursive flag, then you'll need to manually clone the pybind submodule from the top-level directory:

git submodule update --init --recursive

Dataset Download

To use the simulator you must first download the Matterport3D Dataset which is available after requesting access here. The download script that will be provided allows for downloading of selected data types. At minimum you must download the matterport_skybox_images. If you wish to use depth outputs then also download undistorted_depth_images and undistorted_camera_parameters.

Set an environment variable to the location of the unzipped dataset, where <PATH> is the full absolute path (not a relative path or symlink) to the directory containing the individual matterport scan directories (17DRP5sb8fy, 2t7WUuJeko7, etc):

export MATTERPORT_DATA_DIR=<PATH>

Note that if <PATH> is a remote sshfs mount, you will need to mount it with the -o allow_root option or the docker container won't be able to access this directory.
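
For example, a mount command with the required option might look like the following (the host name and remote path here are placeholders, not part of this repository):

sshfs -o allow_root user@remote-host:/path/to/matterport/v1/scans $MATTERPORT_DATA_DIR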

Building using Docker

Build the docker image:

docker build -t mattersim:9.2-devel-ubuntu18.04 .

Run the docker container, mounting both the git repo and the dataset:

nvidia-docker run -it --mount type=bind,source=$MATTERPORT_DATA_DIR,target=/root/mount/Matterport3DSimulator/data/v1/scans --volume `pwd`:/root/mount/Matterport3DSimulator mattersim:9.2-devel-ubuntu18.04

Now (from inside the docker container), build the simulator code:

cd /root/mount/Matterport3DSimulator
mkdir build && cd build
cmake -DEGL_RENDERING=ON ..
make
cd ../

Rendering Options (GPU, CPU, off-screen)

Note that there are three rendering options, which are selected using cmake options during the build process (by varying the cmake line in the build commands immediately above):

  • GPU rendering using OpenGL (requires an X server): cmake .. (default)
  • Off-screen GPU rendering using EGL: cmake -DEGL_RENDERING=ON ..
  • Off-screen CPU rendering using OSMesa: cmake -DOSMESA_RENDERING=ON ..

The recommended (fast) approach for training agents is using off-screen GPU rendering (EGL).

Dataset Preprocessing

To make data loading faster and to reduce memory usage we preprocess the matterport_skybox_images by downscaling and combining all cube faces into a single image. While still inside the docker container, run the following script:

./scripts/downsize_skybox.py

This will take a while depending on the number of processes used (which is a setting in the script).

After completion, the matterport_skybox_images subdirectories in the dataset will contain image files with filename format <PANO_ID>_skybox_small.jpg. By default images are downscaled by 50% and 20 processes are used.
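
For illustration, the core of the preprocessing step might look something like the sketch below. This is a simplified approximation only, assuming OpenCV and a hypothetical <PANO_ID>_skybox<i>.jpg per-face filename pattern (which may differ from the actual dataset layout); the authoritative implementation is scripts/downsize_skybox.py.

#!/usr/bin/env python3
# Illustrative sketch only -- the real logic lives in ./scripts/downsize_skybox.py.
import cv2

def downsize_pano(pano_prefix, scale=0.5):
    """Downscale the six cube faces of one panorama and tile them into a single image."""
    faces = []
    for i in range(6):
        face = cv2.imread('%s_skybox%d.jpg' % (pano_prefix, i))  # assumed raw face naming
        h, w = face.shape[:2]
        faces.append(cv2.resize(face, (int(w * scale), int(h * scale))))
    combined = cv2.hconcat(faces)  # all faces side by side in one image
    cv2.imwrite('%s_skybox_small.jpg' % pano_prefix, combined)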

Depth Outputs

If you need depth outputs as well as RGB (via sim.setDepthEnabled(True)), precompute matching depth skybox images by running this script:

./scripts/depth_to_skybox.py

Depth skyboxes are generated from the undistorted_depth_images using a simple blending approach. As the depth images contain many missing values (corresponding to shiny, bright, transparent, and distant surfaces, which are common in the dataset) we apply a simple crossbilateral filter based on the NYUv2 code to fill all but the largest holes. A couple of things to keep in mind:

  • We assume that the undistorted depth images are aligned to the matterport_skybox_images, but in fact this alignment is not perfect. For certain applications where better alignment is required (e.g., generating RGB pointclouds) it might be necessary to replace the matterport_skybox_images by stitching together undistorted_color_images (which are perfectly aligned to the undistorted_depth_images).
  • In the generated depth skyboxes, the depth value is the euclidean distance from the camera center (not the distance in the z direction). This is corrected by the simulator (see Simulator API, below).

Running Tests

Now (still from inside the docker container), run the unit tests:

./build/tests ~Timing

Assuming all tests pass, sim_imgs will now contain some test images rendered by the simulator. You may also wish to test the rendering frame rate. The following command will try to load all the Matterport environments into memory (requiring around 50 GB memory), and then some information about the rendering frame rate (at 640x480 resolution, RGB outputs only) will be printed to stdout:

./build/tests Timing

The timing test must be run separately from the other tests to get accurate results. Note that the Timing test will fail if there is insufficient memory. As long as all the other tests pass (i.e., ./build/tests ~Timing) then the install is good. Refer to the Catch documentation for unit test configuration options.

Now exit the docker container:

exit

Interactive Demo

To run an interactive demo, after completing the Installation / Build Instructions above, run the docker container while sharing the host's X server and DISPLAY environment variable with the container:

xhost +
nvidia-docker run -it -e DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix --mount type=bind,source=$MATTERPORT_DATA_DIR,target=/root/mount/Matterport3DSimulator/data/v1/scans,readonly --volume `pwd`:/root/mount/Matterport3DSimulator mattersim:9.2-devel-ubuntu18.04
cd /root/mount/Matterport3DSimulator

If you get an error like Error: BadShmSeg (invalid shared segment parameter) 128 you may also need to include -e="QT_X11_NO_MITSHM=1" in the docker run command above.

Commands for running both python and C++ demos are provided below. These are very simple demos designed to illustrate the use of the simulator in python and C++. By default, these demos have depth rendering off. Check the code and turn it on if you have preprocessed the depth outputs and want to see depth as well (see Depth Outputs above). These demos should work regardless of which rendering option was used when building the simulator.

Python demo:

python3 src/driver/driver.py

C++ demo:

build/mattersim_main

The javascript code in the web directory can also be used as an interactive demo, or to generate videos from the simulator in first-person view, or as an interface on Amazon Mechanical Turk to collect natural language instruction data.

Building without Docker

The simulator can be built outside of a docker container using the cmake build commands described above. However, this is not the recommended approach, as all dependencies will need to be installed locally and may conflict with existing libraries. The main requirements are:

  • Ubuntu >= 14.04
  • Nvidia-driver with CUDA installed
  • C++ compiler with C++11 support
  • CMake >= 3.10
  • OpenCV >= 2.4 including 3.x
  • OpenGL
  • GLM
  • Numpy

Optional dependencies (depending on the cmake rendering options):

  • OSMesa for OSMesa backend support
  • epoxy for EGL backend support

The provided Dockerfile contains install commands for most of these libraries. For example, to install OpenGL and related libraries:

sudo apt-get install libjsoncpp-dev libepoxy-dev libglm-dev libosmesa6 libosmesa6-dev libglew-dev

Simulator API

The simulator API in Python exactly matches the extensively commented MatterSim.hpp C++ header file, but using python lists in place of C++ std::vectors etc. In general, there are various functions beginning with set that set the agent and simulator configuration (such as batch size, rendering parameters, enabling depth output etc). For training agents, we recommend setting setPreloadingEnabled(True), setBatchSize(X) and setCacheSize(2X), where X is the desired batch size, e.g.:

import MatterSim
sim = MatterSim.Simulator()
sim.setCameraResolution(640, 480)
sim.setPreloadingEnabled(True)
sim.setDepthEnabled(True)
sim.setBatchSize(100)
sim.setCacheSize(200) # cacheSize 200 uses about 1.2GB of GPU memory for caching pano textures

When preloading is enabled, all the pano images will be loaded into memory before starting. Preloading takes several minutes and requires around 50 GB of memory for RGB output (about 80 GB if depth output is enabled), but rendering is much faster.

To start the simulator, call initialize followed by the newEpisode function, which takes as arguments a list of scanIds, a list of viewpoint ids, a list of headings (in radians), and a list of camera elevations (in radians), e.g.:

sim.initialize()
# Assuming batchSize = 1
sim.newEpisode(['2t7WUuJeko7'], ['1e6b606b44df4a6086c0f97e826d4d15'], [0], [0])

Heading is defined from the y-axis with the z-axis up (turning right is positive). Camera elevation is measured from the horizon defined by the x-y plane (up is positive). There is also a newRandomEpisode function which only requires a list of scanIds, and randomly determines a viewpoint and heading (with zero camera elevation).
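
For example, a random episode (again assuming a batch size of 1) could be started with:

sim.newRandomEpisode(['2t7WUuJeko7'])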

Interaction with the simulator is through the makeAction function, which takes as arguments a list of navigable location indices, a list of heading changes (in radians) and a list of elevation changes (in radians). The navigable location indices select which nearby camera viewpoint the agent should move to. By default, only camera viewpoints that are within the agent's current field of view are considered navigable (so, for example, the agent can't move backwards), unless restricted navigation is turned off. For agent n, navigable locations are given by getState()[n].navigableLocations. Index 0 always contains the current viewpoint (i.e., the agent always has the option to stay in the same place). As the navigation graph is irregular, the remaining viewpoints are sorted by their angular distance from the centre of the image, so index 1 (if available) will approximate moving directly forward. For example, to turn 30 degrees left without moving (keeping camera elevation unchanged):

sim.makeAction([0], [-0.523599], [0])

At any time the simulator state can be returned by calling getState. The returned state contains a list of objects (one for each agent in the batch), with attributes as in the following example:

[
  {
    "scanId" : "2t7WUuJeko7"  // Which building the agent is in
    "step" : 5,               // Number of frames since the last newEpisode() call
    "rgb" : <image>,          // 8 bit image (in BGR channel order), access with np.array(rgb, copy=False)
    "depth" : <image>,        // 16 bit single-channel image containing the pixel's distance in the z-direction from the camera center 
                              // (not the euclidean distance from the camera center), 0.25 mm per value (divide by 4000 to get meters). 
                              // A zero value denotes 'no reading'. Access with np.array(depth, copy=False)
    "location" : {            // The agent's current 3D location
        "viewpointId" : "1e6b606b44df4a6086c0f97e826d4d15",  // Viewpoint identifier
        "ix" : 5,                                            // Viewpoint index, used by simulator
        "x" : 3.59775996208,                                 // 3D position in world coordinates
        "y" : -0.837355971336,
        "z" : 1.68884003162,
        "rel_heading" : 0,                                   // Robot relative coords to this location
        "rel_elevation" : 0,
        "rel_distance" : 0
    },
    "heading" : 3.141592,     // Agent's current camera heading in radians
    "elevation" : 0,          // Agent's current camera elevation in radians
    "viewIndex" : 0,          // Index of the agent's current viewing angle [0-35] (only valid with discretized viewing angles)
                              // [0-11] is looking down, [12-23] is looking at the horizon, [24-35] is looking up
    "navigableLocations": [   // List of viewpoints you can move to. Index 0 is always the current viewpoint, i.e. don't move.
        {                     // The remaining valid viewpoints are sorted by their angular distance from the image centre.
            "viewpointId" : "1e6b606b44df4a6086c0f97e826d4d15",  // Viewpoint identifier
            "ix" : 5,                                            // Viewpoint index, used by simulator
            "x" : 3.59775996208,                                 // 3D position in world coordinates
            "y" : -0.837355971336,
            "z" : 1.68884003162,
            "rel_heading" : 0,                                   // Robot relative coords to this location
            "rel_elevation" : 0,
            "rel_distance" : 0
        },
        {
            "viewpointId" : "1e3a672fa1d24d668866455162e5b58a",  // Viewpoint identifier
            "ix" : 14,                                           // Viewpoint index, used by simulator
            "x" : 4.03619003296,                                 // 3D position in world coordinates
            "y" : 1.11550998688,
            "z" : 1.65892004967,
            "rel_heading" : 0.220844170027,                      // Robot relative coords to this location
            "rel_elevation" : -0.0149478448723,
            "rel_distance" : 2.00169944763
        },
        {...}
    ]
  }
]
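
As a short sketch of how these attributes might be consumed in Python (following the comments above on channel order and depth scaling; assumes NumPy and a batch size of 1):

import numpy as np

state = sim.getState()[0]                     # first (and only) agent in the batch
rgb = np.array(state.rgb, copy=False)         # uint8 image in BGR channel order
depth = np.array(state.depth, copy=False)     # uint16, 0.25 mm per unit, 0 = 'no reading'
depth_m = depth.astype(np.float32) / 4000.0   # convert to meters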

Refer to src/driver/driver.py for example usage. To build html docs for C++ classes in the doxygen directory, run this command and navigate in your browser to doxygen/html/index.html:

doxygen

Precomputing ResNet Image Features

In our initial work using this simulator, we discretized heading and elevation into 30 degree increments, and precomputed image features for each view. Now that the simulator is much faster, this is no longer necessary, but for completeness we include the details of this setting below.

We generate image features using Caffe. To replicate our approach, first download and save some Caffe ResNet-152 weights into the models directory. We experiment with weights pretrained on ImageNet, and also weights finetuned on the Places365 dataset. The script scripts/precompute_features.py can then be used to precompute ResNet-152 features. Features are saved in tsv format in the img_features directory.
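
As a rough sketch of the discretized extraction loop (the authoritative version is scripts/precompute_features.py; the extract_features helper below is a hypothetical stand-in for the Caffe ResNet-152 forward pass):

import math
import numpy as np
import MatterSim

ANGLE_INC = math.radians(30)   # 30 degree increments: 12 headings x 3 elevations = 36 views

def extract_features(rgb):
    # Hypothetical placeholder for a ResNet-152 forward pass (e.g. via Caffe).
    raise NotImplementedError

def viewpoint_features(sim, scan_id, viewpoint_id):
    feats = []
    for ix in range(36):
        if ix == 0:
            sim.newEpisode([scan_id], [viewpoint_id], [0], [-ANGLE_INC])  # start looking down
        elif ix % 12 == 0:
            sim.makeAction([0], [ANGLE_INC], [ANGLE_INC])  # move up to the next elevation row
        else:
            sim.makeAction([0], [ANGLE_INC], [0])          # rotate to the next heading
        rgb = np.array(sim.getState()[0].rgb, copy=False)
        feats.append(extract_features(rgb))
    return np.stack(feats)  # (36, feature_dim); the real script writes features to tsv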

Alternatively, skip the generation and just download and extract our tsv files into the img_features directory.

Directory Structure

  • connectivity: Json navigation graphs.
  • webgl_imgs: Contains dataset views rendered with javascript (for test comparisons).
  • sim_imgs: Will contain simulator rendered images after running tests.
  • models: Caffe models for precomputing ResNet image features.
  • img_features: Storage for precomputed image features.
  • data: Matterport3D dataset.
  • tasks: Currently just the Room-to-Room (R2R) navigation task.
  • web: Javascript code for visualizing trajectories and collecting annotations using Amazon Mechanical Turk (AMT).

Other directories are mostly self-explanatory.

License

The Matterport3D dataset, and data derived from it, is released under the Matterport3D Terms of Use. Our code is released under the MIT license.

Acknowledgements

We would like to thank Matterport for allowing the Matterport3D dataset to be used by the academic community. This project is supported by a Facebook ParlAI Research Award and by the Australian Centre for Robotic Vision.

Contributing

We welcome contributions from the community. All submissions require review and in most cases would require tests.


Issues

EGL error 0x300c at eglGetDisplay

Everything is fine until I run the unittest:

/root/mount/Matterport3DSimulator/src/test/main.cpp:350: FAILED:
REQUIRE_NOTHROW( sim.initialize() )
due to unexpected exception with message:
EGL error 0x300c at eglGetDisplay

===============================================================================
test cases: 5 | 4 passed | 1 failed
assertions: 119187 | 119186 passed | 1 failed

I don't know how to fix it. Please help.

Pre-loading cube map textures

We need to run multiple simulators to support multiple agents (up to 50-100) learning in parallel, each with their own simulator (or at least, their own state). Each agent could be in a different building, so we will be touching all the images. In total we have 18GB of matterport_skybox_images (compressed).

Unable to download the R2R Dataset

Hi, access is denied when downloading the R2R dataset. Could you please fix this?

--2019-02-24 19:23:10-- https://storage.googleapis.com/bringmeaspoon/R2Rdata/R2R_test.json
Resolving storage.googleapis.com (storage.googleapis.com)... 216.58.199.80, 2404:6800:4006:804::2010
Connecting to storage.googleapis.com (storage.googleapis.com)|216.58.199.80|:443... connected.
HTTP request sent, awaiting response... 403 Forbidden
2019-02-24 19:23:11 ERROR 403: Forbidden.

Thanks!

Generating connectivity graph

Hi,

The paper states that the connectivity graph is constructed by ray tracing between viewpoints in the Matterport3D scene meshes. Could you elaborate on this?

Thank you.

Error occurs when running the command "python src/driver/driver.py"

Can anyone tell me how to solve the problem below?

GLib-GIO-Message: 00:41:33.081: Using the 'memory' GSettings backend. Your settings will not be saved or shared with other applications.
OpenCV Error: Assertion failed (0 <= roi.x && 0 <= roi.width && roi.x + roi.width <= m.cols && 0 <= roi.y && 0 <= roi.height && roi.y + roi.height <= m.rows) in Mat, file /build/opencv-ys8xiq/opencv-2.4.9.1+dfsg/modules/core/src/matrix.cpp, line 323
Traceback (most recent call last):
File "src/driver/driver.py", line 25, in
sim.newRandomEpisode(['17DRP5sb8fy'])
RuntimeError: /build/opencv-ys8xiq/opencv-2.4.9.1+dfsg/modules/core/src/matrix.cpp:323: error: (-215) 0 <= roi.x && 0 <= roi.width && roi.x + roi.width <= m.cols && 0 <= roi.y && 0 <= roi.height && roi.y + roi.height <= m.rows in function Mat

Better Documentation

Hi,

Great work.

It would be nice to have better documentation, especially for the Python API.
A simple document stepping through the capabilities of this framework (with the R2R PyTorch model, for example) would save a lot of time in figuring out how to use the framework.

ValueError: MatterSim: Invalid action index: 1

Hi @peteanderson80,

This error "ValueError: MatterSim: Invalid action index: 1" occurs sometimes during training.
I suspect this is a bug in the C++ code of your simulator. Say the action is forward; then the corresponding env action is (1, 0, 0), whose index is 1. Combining this with your following C++ code:

void Simulator::makeAction(int index, double heading, double elevation) {
    totalTimer.Start();
    // move
    if (!initialized || index < 0 || index >= state->navigableLocations.size() ){
        std::stringstream msg;
        msg << "MatterSim: Invalid action index: " << index;
        throw std::domain_error( msg.str() );
    }

This basically means that the agent always chooses the navigable location whose index is 1. So if the size is less than or equal to 1, the error is raised. This seems to be an edge case (the agent chooses to go forward but there is no next navigable location to go to).

Is my understanding correct? If so, how do you think the bug can be fixed? Thanks!
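
One possible caller-side guard (a sketch, not an official fix) is to check the number of navigable locations before issuing a forward action, and fall back to staying in place:

# Assuming batch size 1: index 1, if it exists, is the closest-to-center forward viewpoint
locs = sim.getState()[0].navigableLocations
if len(locs) > 1:
    sim.makeAction([1], [0], [0])   # move forward
else:
    sim.makeAction([0], [0], [0])   # no forward option; stay at the current viewpoint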

Correspondence between matterport_skybox_images and viewIndices

At any given viewpoint, there seem to be 36 viewIndices (12 headings per elevation; 3 elevations). On the other hand, matterport_skybox_images provides 6 images per viewpoint (one top, one bottom, and 4 images which can be stitched to form a panorama). Can you please explain how the 12 headings at 0 elevation (viewIndices 12-23) correspond to the 4 skybox images for a given viewpoint, e.g. which of the 4 images corresponds to the agent looking straight ahead? Kindly help me understand the mapping.

Minor interface fixes / changes

  • Add a setup function to set the vertical camera field of view parameter (in degrees), e.g. setVFOV()
  • Add a setup function to set the camera elevation limits (in degrees), e.g. setElevationLimits(min, max). It is useful to restrict this to less than 90 degrees as there is nothing to see at the poles.
  • Remove setScanId() function, scanId should become a parameter of newEpisode(), along with viewpointId. When starting a new episode we must be able to move to a different building. Refactoring this may need to wait till the threading / OSMESA questions are sorted.

undefined reference to `Json::Value::asString[abi:cxx11]() const'

I'm trying to install the M3D Simulator on Ubuntu 16.04.5 LTS. I encountered this undefined reference error when running make:

Scanning dependencies of target MatterSim
[  9%] Building CXX object CMakeFiles/MatterSim.dir/src/lib/MatterSim.cpp.o
[ 18%] Building CXX object CMakeFiles/MatterSim.dir/src/lib/Benchmark.cpp.o
[ 27%] Linking CXX shared library libMatterSim.so
[ 27%] Built target MatterSim
Scanning dependencies of target random_agent
[ 36%] Building CXX object CMakeFiles/random_agent.dir/src/driver/random_agent.cpp.o
[ 45%] Linking CXX executable random_agent
libMatterSim.so: undefined reference to `Json::Value::asString[abi:cxx11]() const'
collect2: error: ld returned 1 exit status
make[2]: *** [CMakeFiles/random_agent.dir/build.make:131: random_agent] Error 1
make[1]: *** [CMakeFiles/Makefile2:68: CMakeFiles/random_agent.dir/all] Error 2
make: *** [Makefile:84: all] Error 2

I've installed a bunch of dependencies locally since I don't have access to root. I'm using a conda environment to manage some packages. My understanding of the issue is that some packages might have conflicting versions of cxx, and reverting back to older versions of abi by setting _GLIBCXX_USE_CXX11_ABI=0 should resolve the issue (see here). But setting set(_GLIBCXX_USE_CXX11_ABI 0) in CMakeLists.txt doesn't help. I'm not sure how to proceed next.

Testing

We should consider if there are parts of the code that we can unit test. Also, we can probably come up with a few acceptance tests, e.g. comparing generated images against a test bank output by the web version.

Yes this is research and not industry-strength software. On the other hand, if we are successful quite a few other groups will use this - and it's embarrassing if they find bugs. Particularly if they may invalidate our results / publications.

OpenCV error

OpenCV Error: Assertion failed (_src1.sameSize(_src2) && _src1.type() == _src2.type()) in norm, file /build/opencv-L2vuMj/opencv-3.2.0+dfsg/modules/core/src/stat.cpp, line 3545

tests is a Catch v2.0.1 host application.
Run with -? for options

-------------------------------------------------------------------------------
RGB Image
-------------------------------------------------------------------------------
/root/mount/Matterport3DSimulator/src/test/main.cpp:342
...............................................................................

/root/mount/Matterport3DSimulator/src/test/main.cpp:342: FAILED:
  {Unknown expression after the reported line}
due to unexpected exception with messages:
  [
  	{
  		"elevation" : 0.0085573808395640535,
  		"heading" : 2.551961945320492,
  		"reference_image" : "17DRP5sb8fy_85c23efeaecd4d43a7dcd5b90137179e_2.
  551961945320492_0.008557380839564054.png",
  		"scanId" : "17DRP5sb8fy",
  		"viewpointId" : "85c23efeaecd4d43a7dcd5b90137179e"
  	},
  	{
  		"elevation" : 0.00049218360228025842,
  		"heading" : 1.8699330579409539,
  		"reference_image" : "1LXtFkjw3qL_187589bb7d4644f2943079fb949c0be9_1.
  8699330579409539_0.0004921836022802584.png",
  		"scanId" : "1LXtFkjw3qL",
  		"viewpointId" : "187589bb7d4644f2943079fb949c0be9"
  	},
  	{
  		"elevation" : -0.024443526143047459,
  		"heading" : 4.6263310475510773,
  		"reference_image" : "1pXnuDYAj8r_163d61ac7edb43fb958c5d9e69ae11ad_4.
  626331047551077_-0.02444352614304746.png",
  		"scanId" : "1pXnuDYAj8r",
  		"viewpointId" : "163d61ac7edb43fb958c5d9e69ae11ad"
  	},
  	{
  		"elevation" : -0.00068389140394051673,
  		"heading" : 5.8441199099264436,
  		"reference_image" : "29hnd4uzFmX_1576d62e7bbb45e8a5ef9e7bb37b1839_5.
  844119909926444_-0.0006838914039405167.png",
  		"scanId" : "29hnd4uzFmX",
  		"viewpointId" : "1576d62e7bbb45e8a5ef9e7bb37b1839"
  	}
  ]
  /build/opencv-L2vuMj/opencv-3.2.0+dfsg/modules/core/src/stat.cpp:3545: error:
  (-215) _src1.sameSize(_src2) && _src1.type() == _src2.type() in function norm

===============================================================================
test cases:      5 |      4 passed | 1 failed
assertions: 119189 | 119188 passed | 1 failed

Rendered image all red

Hi,

I am running your python demo as instructed.

python src/driver/driver.py

But the rendered image is pretty weird. It's all red, as shown below (no error was reported). I have tested on Ubuntu 14.04 and macOS 10.13. Both have the same result. Can you please provide some hints for solving this issue? Did I miss something?

[screenshot: the rendered image is entirely red]

Docker Container

Suggestion: It would aid reproducibility and adoption if there was a docker file/container for this simulator.

Feature extraction

I see your feature extraction code and I have a question. The input image size is set to 480x640, which results in the pool5 layer of ResNet-152 having output shape Bx2048x9x14. Why did you only slice the feature at a single spatial position [:, :, 0, 0] instead of averaging features across all spatial positions?

features[f*BATCH_SIZE:(f+1)*BATCH_SIZE, :] = net.blobs['pool5'].data[:,:,0,0]
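
If one did want to average over all spatial positions instead (a hedged suggestion, not part of the original script), the slicing line could be replaced with something like:

features[f*BATCH_SIZE:(f+1)*BATCH_SIZE, :] = net.blobs['pool5'].data.mean(axis=(2, 3))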

Rendering with driver.py: cannot navigate or tilt camera


Hi,

I am running the driver.py file following your instructions. How can I adjust the camera heading or elevation, or navigate to another viewpoint from here?

I did not see the arrows as expected. Also, where should I enter the navigation numbers to move to another viewpoint?

BTW, It would be great if you could provide some simple rendering code to help us play with the environment.

Thank you very much.

Zuo

about the image feature

Hi, when I run python tasks/R2R/eval.py, I could not find any difference whether or not I provide image features. Is there a difference, or are the image features actually useful?

Support for CPU parallelism

Hi,
Thanks for sharing this amazing repo and R2R dataset.

Do you have any plan for supporting CPU parallelism? If not, I might be able to help with that.

Some viewpoints in the path are not reachable?

It seems that for scan=8WUmhLawc2A, instr_id=6825_2, the robot at viewpoint 550d66ef28114bef8525d3a2d6db9cd2 cannot reach the next viewpoint 01b439d39a8f412fa1837be7afb45254 by adjusting heading and elevation. The viewpoint 01b439d39a8f412fa1837be7afb45254 may appear in the navigableLocations lists of the 36 combinations of heading and elevation, but its index is never 1. Hence, it cannot be reached by adjusting the heading and elevation angles and moving forward.
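
A sketch of how one might verify this, by sweeping all 36 discretized views and checking whether the target viewpoint ever appears at navigable index 1 (the helper name and loop structure are illustrative only):

import math

def forward_reachable(sim, scan_id, start_id, target_id):
    """Return True if target_id is the index-1 navigable location for any of the 36 views."""
    inc = math.radians(30)
    sim.newEpisode([scan_id], [start_id], [0], [-inc])   # start at the lowest elevation
    for ix in range(36):
        if ix > 0:
            sim.makeAction([0], [inc], [inc if ix % 12 == 0 else 0])
        locs = sim.getState()[0].navigableLocations
        if len(locs) > 1 and locs[1].viewpointId == target_id:
            return True
    return False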

Downloading matterport3d dataset too slow

I am interested in vision-and-language navigation and want to download the Matterport3D and R2R datasets. However, 1.7 TB of data is too large for me to download (around 10 days at a sustained 2 MB/s download speed).

I am asking whether I can train/test a model under the R2R setting with a small part of the data, as in MINOS, which uses only 6.3 GB of Matterport3D environment data.

niessner/Matterport#26

Red frames - Use of absolute paths in the simulator

Shaders are loaded using absolute paths. This causes issues when the simulator is called from a different path. It was always rendering red frames because of this.

const GLchar* vShaderSource = loadFile("src/lib/vertex.sh");

I have a simple solution where you can add a shadersPath member to the MatterSim class, and define a setShadersPath() method to assign the variable. The shaders path would then be accessed as shadersPath + "/vertex.sh" or shadersPath + "/fragment.sh". I can send a pull request if this is acceptable.

opencv built without opengl support

When I try to build after:

sudo apt-get install libopencv-dev python-opencv freeglut3 freeglut3-dev libglm-dev libjsoncpp-dev doxygen libosmesa6-dev libosmesa6 

as given in the README, cmake and make complain about missing dependencies. After I add cmake, libglew-dev and libpython2.7-dev to the list, the build goes through, but I get the following error when testing:

> build/mattersim_main 

(process:30174): Gtk-WARNING **: 23:19:46.291: Locale not supported by C library.
        Using the fallback 'C' locale.
OpenCV Error: No OpenGL support (Library was built without OpenGL support) in cvNamedWindow, file /build/opencv-L2vuMj/opencv-3.2.0+dfsg/modules/highgui/src/window_gtk.cpp, line 1064
terminate called after throwing an instance of 'cv::Exception'
  what():  /build/opencv-L2vuMj/opencv-3.2.0+dfsg/modules/highgui/src/window_gtk.cpp:1064: error: (-218) Library was built without OpenGL support in function cvNamedWindow

Does this mean I have to build OpenCV from scratch, and the libopencv-dev package from apt won't work? (I am using Ubuntu 18.04.1 LTS.)

Build options

We need to add some sort of config / build options to support OpenCV 2 and python 2.
Related to this, we should keep track of build instructions and dependencies in the main README.md.

How many images from Matterport3D dataset in total should we download?

Hi, it seems that downloading the Matterport3D dataset using their script directly is difficult for me due to annoying disconnection problems. Could you share the folder structure of all the matterport_skybox_images so that I can manually download every necessary zip file using a more robust download tool? Thank you so much!

access state.rgb error

I want to get an RGB image for each viewing angle. However, when I access the image, the program fails. The error is listed below:

im = state.rgb
TypeError: Unable to convert function return value to a Python type! The signature was
(self: MatterSim.SimState) -> object

I compiled the simulator with Python 3.6.9, NumPy 1.13.3 and OpenCV 3.1.0.
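
For reference, the access pattern documented in the Simulator API section above wraps the attribute in a NumPy array rather than using it directly (this may or may not resolve the conversion error):

import numpy as np
im = np.array(sim.getState()[0].rgb, copy=False)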

Rendering error

I am reproducing code that uses the Matterport3DSimulator. I have downloaded the dataset and placed it in the appropriate folder as specified, but I still get the following error. Does anyone know how to fix it?

tests is a Catch v2.0.1 host application.
Run with -? for options


RGB Image

/home/cjh/code/regretful-agent/src/test/main.cpp:302
...............................................................................

/home/cjh/code/regretful-agent/src/test/main.cpp:324: FAILED:
REQUIRE_NOTHROW( sim.newEpisode(scanId, viewpointId, heading, elevation) )
due to unexpected exception with message:
MatterSim: Could not open skybox files at: ./data/v1/scans/17DRP5sb8fy/
matterport_skybox_images/85c23efeaecd4d43a7dcd5b90137179e_skybox*_sami.jpg

===============================================================================
test cases: 5 | 4 passed | 1 failed
assertions: 118337 | 118336 passed | 1 failed

sim.initialize() fails with EGL error 0x3001

I was able to successfully perform a local installation of MatterSim without docker. However, running driver.py threw the following error:

Traceback (most recent call last):
  File "src/driver/driver.py", line 22, in <module>
    sim.initialize()
RuntimeError: EGL error 0x3001 at eglInitialize

My system configurations are as follows:

Ubuntu 16.04.5 LTS
Nvidia-driver version: 384.111
Cuda 9.0
CUDNN v7.1
