Neural Object Learning for 6D Pose Estimation

An official implementation of the paper, Neural Object Learning for 6D Pose Estimation Using a Few Cluttered Images, ECCV 2020 Spotlight, https://arxiv.org/abs/2005.03717

Requirements:

The best way to run the code is inside an nvidia-docker container.


Citation

If you use this code, please cite the following:

@InProceedings{Park_2020_ECCV,
author = {Park, Kiru and Patten, Timothy and Vincze, Markus},
title = {Neural Object Learning for 6D Pose Estimation Using a Few Cluttered Images},
booktitle = {European Conference on Computer Vision (ECCV)},
month = {Aug},
year = {2020}
}

Environment setting using Nvidia-docker

  1. Clone the repository: git clone --recurse-submodules https://github.com/kirumang/NOL.git
  2. Build the Docker image: bash docker_build.sh
  3. Edit docker_run.sh: vim docker_run.sh
    • Set data_mount_arg to link dataset folders with the docker container
    • E.g., -v <path_to_dataset_local>:<path_to_dataset_container>
  4. Start the container: bash docker_run.sh
  5. Test using the example script (rendering an image of the cracker_box in SMOT):
python3 examples/NOL_rendering_one.py cfg_camera/camera_SMOT.json sample_data/obj_01.hdf5

The code is ready to run if you can see a rendered image in the folder ./result/obj_01/, with the following message.

Generated: ./results/obj_01/0000.png

The result image should look like this:

Sample image


Render an object from uniformly sampled viewpoints in the upper hemisphere

This is example code for rendering an object from various viewpoints. The variable max_iter can be reduced to increase the rendering speed.

python3 examples/NOL_rendering.py <path/to/camera_cfg (.json)> <path/to/data (.hdf5)> <path/to/target (default: /result/filename/)>
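
As a rough illustration of what sampling viewpoints uniformly over the upper hemisphere can look like, the sketch below places camera positions with a Fibonacci (golden-angle) lattice. This is an assumption made for illustration only: the function name `sample_upper_hemisphere`, the `radius` parameter, and the lattice itself are hypothetical and are not taken from examples/NOL_rendering.py, which may sample views differently.

```python
import math


def sample_upper_hemisphere(n_views, radius=1.0):
    """Return n_views (x, y, z) positions spread roughly evenly over z > 0.

    Hypothetical helper for illustration; not part of the NOL code base.
    """
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n_views):
        # Evenly spaced heights give equal-area bands on a sphere,
        # so the points are not crowded towards the pole.
        z = (i + 0.5) / n_views
        r_xy = math.sqrt(1.0 - z * z)
        # Rotating by the golden angle spreads the points in azimuth.
        azimuth = i * golden_angle
        points.append((radius * r_xy * math.cos(azimuth),
                       radius * r_xy * math.sin(azimuth),
                       radius * z))
    return points


for x, y, z in sample_upper_hemisphere(8, radius=0.5):
    print(f"{x:+.3f} {y:+.3f} {z:+.3f}")
```

Each position would still need a camera orientation (for example, looking at the object origin) before it can be used as a pose; that step is omitted here.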

Preprocess source images for LineMOD and SMOT datasets

  • From the entire training set, a number of images are sampled as source images.
  • Please refer to the paper for more details of the process.
  1. Preprocess the LineMOD objects: the result file (.hdf5) will be saved in the sample_data/linemod folder. The LineMOD dataset is available from the BOP Challenge.
python3 data_processing/process_LineMOD_BOP.py [path/to/bop/lm] [obj_name e.g., obj_01,obj_02,...]
  2. Preprocess the SMOT objects: the result file (.hdf5) will be saved in the sample_data/smot folder.
python3 data_processing/process_SMOT.py [path/to/smot] [obj_name e.g., obj_01,obj_02,...] [icp=yes(1)/no(0),default=1]

Render a new object using your own images

A data file (.hdf5) must be created to render a new object from source images.

  1. Essential components and keys for N source images
  • "vertices_3d": 3D vertices (Vx3, numpy array)
  • "faces": face indices (Fx3, numpy array)
  • "images": cropped patches of source images (Nx256x256x3, numpy array)
  • "bboxes" : bboxes, defined in the original images, where the patches are cropped from (Nx4, numpy array, [v1,h1,v2,h2])
    • e.g., if a patch of the object is cropped from (100,200) - (150,400)
    • images[0] = a resized image of the region (100,200) - (150,400)
    • bboxes[0] = [100,200,150,400]
  • "poses" : object poses (transformation matrix) in the source images (Nx4x4, numpy array)
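
The relationship between "images" and "bboxes" can be sketched as follows. This is a minimal example under stated assumptions: `v` is taken to be the row index and `h` the column index, the second corner is treated as exclusive, and nearest-neighbour resizing stands in for whatever interpolation the original preprocessing uses. The helper name `crop_patch` is hypothetical.

```python
import numpy as np

PATCH_SIZE = 256  # patch resolution listed above (Nx256x256x3)


def crop_patch(image, bbox, size=PATCH_SIZE):
    """Crop bbox = [v1, h1, v2, h2] and resize it to size x size.

    Assumes v indexes rows and h indexes columns (not confirmed by the README).
    """
    v1, h1, v2, h2 = bbox
    region = image[v1:v2, h1:h2]
    # Map every output pixel back to a source row/column inside the region.
    rows = np.arange(size) * region.shape[0] // size
    cols = np.arange(size) * region.shape[1] // size
    return region[rows][:, cols]


image = np.zeros((480, 640, 3), dtype=np.uint8)  # placeholder source image
bbox = [100, 200, 150, 400]                      # example from the list above
images = np.stack([crop_patch(image, bbox)])     # one patch, N = 1
bboxes = np.array([bbox])
print(images.shape, bboxes.shape)
```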
  2. Save the components to an HDF5 file
import h5py
import numpy as np

# fn is the output path (.hdf5); the arrays are the components described in step 1
train_data = h5py.File(fn, "w")
train_data.create_dataset("vertices_3d", data=np.array(vertices_3d))
train_data.create_dataset("faces", data=np.array(faces))
train_data.create_dataset("images", data=np.array(input_imgs))
train_data.create_dataset("poses", data=np.array(poses_))
train_data.create_dataset("masks", data=np.array(masks))
train_data.create_dataset("bboxes", data=np.array(bboxes))
train_data.close()
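
A quick shape check before writing the file can catch mismatched arrays early. The sketch below compares each documented key with the shape given in the component list; `check_components` is a hypothetical helper, not part of the repository. The "masks" dataset is not checked because its shape is not stated in the list.

```python
import numpy as np


def check_components(data):
    """Return a list of shape problems for the documented keys (empty if none)."""
    n = np.asarray(data["images"]).shape[0]
    # None means "any length" (V vertices, F faces).
    expected = {
        "vertices_3d": (None, 3),
        "faces": (None, 3),
        "images": (n, 256, 256, 3),
        "bboxes": (n, 4),
        "poses": (n, 4, 4),
    }
    problems = []
    for key, shape in expected.items():
        actual = np.asarray(data[key]).shape
        ok = len(actual) == len(shape) and all(
            e is None or e == a for e, a in zip(shape, actual))
        if not ok:
            problems.append(f"{key}: expected {shape}, got {actual}")
    return problems


n = 2  # number of source images in this placeholder example
data = {
    "vertices_3d": np.zeros((8, 3)),
    "faces": np.zeros((12, 3), dtype=np.int64),
    "images": np.zeros((n, 256, 256, 3), dtype=np.uint8),
    "bboxes": np.zeros((n, 4), dtype=np.int64),
    "poses": np.tile(np.eye(4), (n, 1, 1)),
}
print(check_components(data))
```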
  3. Define a camera config (.json) file
  • See cfg_camera/camera_Default.json for a reference

Disclaimers:

  • The paper should be considered the main reference for this work. All the details of the algorithm and the training are reported there.
  • Feel free to contact us if a new function is required for custom images.

Contributors:
