VIP Datasets

This repository accompanies the paper "3D Object Detection with VI-SLAM Point Clouds: The Impact of Object and Environment Characteristics on Model Performance", to appear in the Proceedings of IEEE ICRA 2024. It introduces VIP500, a dataset of 4772 VI-SLAM point clouds that covers 500 different object and environment configurations. We also provide VIP500-D, an accompanying dataset of 20 RGB-D point clouds of the object classes and shapes in VIP500, for comparison purposes.

Datasets

The full VIP500 and VIP500-D datasets can be downloaded here: https://github.com/timscargill/VIP-Datasets/tree/main/VIP-Datasets. The datasets follow the hierarchical file structure shown below:

VIP-Datasets
└───VIP500
│   │
│   └───carpet_chair1_1.txt
│   └───carpet_chair1_2.txt
│   └───carpet_chair1_3.txt
│   ...
│
└───VIP500-D
│   │
│   └───chair1.pcd
│   └───chair2.pcd
│   └───chair3.pcd
│   ...

VIP500

Format: Each point cloud in the VIP500 dataset is in .txt format, with each line specifying the coordinates of a point and the object class that point is associated with (x y z class). The numerical value in the class column refers to the following object class assignments (derived from the order in the ModelNet10 dataset):

0 - No object
3 - Chair
4 - Desk
8 - Sofa
9 - Table
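As an illustration of the format described above, the following is a minimal sketch of a reader for VIP500 files. It assumes the four columns are separated by whitespace; the function names `load_vip500` and `class_counts` are hypothetical helpers, and the sample values are made up rather than taken from the dataset.

```python
import io
from collections import Counter

# Class IDs as listed above.
CLASS_NAMES = {0: "no object", 3: "chair", 4: "desk", 8: "sofa", 9: "table"}


def load_vip500(lines):
    """Parse 'x y z class' lines into (points, labels).

    `lines` is any iterable of strings, e.g. an open text file.
    Assumes whitespace-separated columns; blank lines are skipped.
    """
    points, labels = [], []
    for line in lines:
        parts = line.split()
        if not parts:
            continue
        x, y, z = (float(v) for v in parts[:3])
        points.append((x, y, z))
        # int(float(...)) accepts labels written as "3" or "3.0".
        labels.append(int(float(parts[3])))
    return points, labels


def class_counts(labels):
    """Count points per class name."""
    return Counter(CLASS_NAMES.get(c, f"unknown ({c})") for c in labels)


# Made-up sample standing in for a file such as carpet_chair1_1.txt.
sample = io.StringIO("0.10 0.20 0.30 3\n1.50 0.00 2.00 0\n0.12 0.25 0.31 3\n")
points, labels = load_vip500(sample)
counts = class_counts(labels)
```

For a real file, the same function can be called as `load_vip500(open(path))`, where `path` points to one of the .txt files in the VIP500 folder.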

Creation: To generate this dataset, we used Virtual-Inertial SLAM, which facilitates the creation of semi-synthetic visual-inertial SLAM input data. We created virtual environments in Unity 2020.3.14f1, each consisting of a single object in an 8m×6m×4m room with blank walls and a textured floor. For each type of object (e.g., a chair) we created different configurations, in which we varied the object shape (by using different 3D models), the object texture, and the floor texture, as illustrated below:

[Figure: Virtual chair variations]
[Figure: Virtual desk, sofa and table variations]

We used the A4 trajectory in the SenseTime VI-SLAM dataset to generate a new sequence for each environment variant, then ran each sequence through ORB-SLAM3, a state-of-the-art open-source VI-SLAM algorithm. We modified the ORB-SLAM3 software to save the generated point cloud to a text file. A video of two VI-SLAM point clouds being generated with ORB-SLAM3 is shown below:

[Video: VIP500 point cloud generation]

Finally, we segmented and labeled these point clouds using the Open3D Python library. We applied plane detection and outlier removal to identify points not part of the object, and appended a new column to each line in the point cloud file indicating the object class for that point. Examples of the point clouds generated are shown below (object points in blue, environment points in red):

[Figure: Chair point cloud examples]
[Figure: Desk, sofa and table point cloud examples]
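To make the labeling step more concrete, here is a rough sketch of plane detection with a plain RANSAC loop. It is not the script used to build the dataset and does not use Open3D. It only illustrates how points on a dominant plane (such as the floor) can be assigned class 0 while the remaining points receive the object class. The outlier removal step is omitted, and the helper names, threshold, iteration count and toy data are all assumptions.

```python
import random


def fit_plane(p1, p2, p3):
    """Return (unit normal, d) of the plane through three points, or None if collinear."""
    ux, uy, uz = (p2[i] - p1[i] for i in range(3))
    vx, vy, vz = (p3[i] - p1[i] for i in range(3))
    n = (uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx)
    norm = sum(c * c for c in n) ** 0.5
    if norm < 1e-12:
        return None
    n = tuple(c / norm for c in n)
    d = -sum(n[i] * p1[i] for i in range(3))
    return n, d


def ransac_plane(points, threshold=0.02, iterations=200, seed=0):
    """Return the indices of the points on the best-supported plane."""
    rng = random.Random(seed)
    best = set()
    for _ in range(iterations):
        plane = fit_plane(*rng.sample(points, 3))
        if plane is None:
            continue
        n, d = plane
        inliers = {
            i for i, p in enumerate(points)
            if abs(n[0] * p[0] + n[1] * p[1] + n[2] * p[2] + d) <= threshold
        }
        if len(inliers) > len(best):
            best = inliers
    return best


def label_points(points, object_class, **ransac_args):
    """Label plane points as 0 (no object) and all other points as object_class."""
    plane = ransac_plane(points, **ransac_args)
    return [0 if i in plane else object_class for i in range(len(points))]


# Toy data (hypothetical): a flat floor grid plus a small cluster above it.
floor = [(0.1 * i, 0.1 * j, 0.0) for i in range(10) for j in range(10)]
blob = [(0.4 + 0.01 * k, 0.5 + 0.02 * (k % 3), 0.3 + 0.025 * k) for k in range(20)]
labels = label_points(floor + blob, object_class=3)
```

Open3D provides comparable operations on its point cloud type (`segment_plane` and `remove_statistical_outlier`), which would be the natural choice for real data.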

Contents: VIP500 consists of 4772 labeled VI-SLAM point clouds generated using the above process. It covers 500 different environment configurations: 4 common indoor object classes from the ModelNet10 dataset (chair, desk, sofa, and table) × 5 object shapes × 5 object textures × 5 floor textures. We ran 10 ORB-SLAM3 trials for each configuration. In some configurations, some trials lost tracking and produced invalid point clouds, which were excluded from the dataset.
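The configuration count can be checked with a few lines of arithmetic. The sketch below enumerates the combinations described above. The number of excluded trials is derived here on the assumption that exactly 10 trials were run for each configuration, and is not a figure reported by the authors.

```python
from itertools import product

OBJECT_CLASSES = ["chair", "desk", "sofa", "table"]
N_SHAPES = 5
N_OBJECT_TEXTURES = 5
N_FLOOR_TEXTURES = 5
TRIALS_PER_CONFIG = 10
N_VALID_POINT_CLOUDS = 4772  # size of VIP500 as stated above

# Every (class, shape, object texture, floor texture) combination.
configs = list(product(
    OBJECT_CLASSES,
    range(1, N_SHAPES + 1),
    range(1, N_OBJECT_TEXTURES + 1),
    range(1, N_FLOOR_TEXTURES + 1),
))

n_configs = len(configs)                        # 4 * 5 * 5 * 5 = 500
n_trials = n_configs * TRIALS_PER_CONFIG        # 500 * 10 = 5000
n_excluded = n_trials - N_VALID_POINT_CLOUDS    # 5000 - 4772 = 228
```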

VIP500-D

Format: Each point cloud in the VIP500-D dataset is in .pcd format.
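A .pcd file begins with a short text header (FIELDS, WIDTH, HEIGHT, POINTS, DATA and so on) followed by the point data. As a quick way to inspect a file without extra dependencies, the sketch below reads only that header. The function name `read_pcd_header` is hypothetical and the sample bytes are made up; the actual fields and data encoding of the VIP500-D files are not specified here.

```python
import io


def read_pcd_header(stream):
    """Read the text header of a PCD file from a binary stream.

    Returns a dict mapping each header keyword to its list of values.
    Reading stops after the DATA line, which ends the header.
    """
    header = {}
    for raw in stream:
        line = raw.decode("ascii", errors="replace").strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition(" ")
        header[key] = value.split()
        if key == "DATA":
            break
    return header


# Made-up two-point file used only to exercise the parser.
sample = io.BytesIO(
    b"# .PCD v0.7 - Point Cloud Data file format\n"
    b"VERSION 0.7\n"
    b"FIELDS x y z\n"
    b"SIZE 4 4 4\n"
    b"TYPE F F F\n"
    b"COUNT 1 1 1\n"
    b"WIDTH 2\n"
    b"HEIGHT 1\n"
    b"VIEWPOINT 0 0 0 1 0 0 0\n"
    b"POINTS 2\n"
    b"DATA ascii\n"
    b"0.1 0.2 0.3\n"
    b"0.4 0.5 0.6\n"
)
header = read_pcd_header(sample)
n_points = int(header["POINTS"][0])
```

To load the points themselves, a library such as Open3D can read .pcd files directly with `open3d.io.read_point_cloud`.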

Creation: We also created an RGB-D dataset to accompany VIP500, in order to study the differences between VI-SLAM point clouds and point clouds generated from 3D scanners. We generated this dataset using the same virtual environments and object shapes as those used in VIP500. We exported the virtual environments used to generate VIP500 (built in Unity) to FBX files and imported them into Unreal Engine 4.27.2. To generate the VIP500-D point clouds, we used the Unreal plugin AirSim, which facilitates the creation of RGB-D point clouds from captured RGB camera images and depth sensor readings. Examples of the RGB-D point clouds in VIP500-D are shown below:

[Figure: Chair model variations in VIP500-D]

Contents: VIP500-D contains four object classes (chair, desk, sofa, and table), each with five object shapes, for a total of 20 point clouds. We do not consider different object and floor textures because these characteristics have minimal influence on RGB-D point clouds.

Citation

If you use these datasets in an academic work, please cite:

@inproceedings{VIP-500,
  title={3D Object Detection with VI-SLAM Point Clouds: The Impact of Object and Environment Characteristics on Model Performance},
  author={Duan, Lin and Scargill, Tim and Chen, Ying and Gorlatova, Maria},
  booktitle={Proceedings of IEEE ICRA 2024},
  year={2024}
}

Acknowledgements

The authors of this repository are Tim Scargill, Ying Chen and Maria Gorlatova. Contact information of the authors:

  • Tim Scargill (timothyjames.scargill AT duke.edu)
  • Ying Chen (ying.chen151 AT duke.edu)
  • Maria Gorlatova (maria.gorlatova AT duke.edu)

This work was supported in part by NSF grants CSR-1903136, CNS-1908051, CNS-2312760, and CNS-2112562, NSF CAREER Award IIS-2046072, a CISCO Research Award, and a Meta Research Award.
