

Deep Learning on 3D Object Detection for Automatic Plug-in Charging Using a Mobile Manipulator

The Challenge of Automatic Plug-in Charging (APC) / Automatic Charging and Plug-in (ACP)

This repository introduces the data prerequisites used in our project, which focuses on 3D detection of the charging station and the socket/plug and is mainly based on PV-RCNN.

3D Detection Techniques

Data Acquisition

In this project, all point clouds were acquired with a PMD camera and its development kit.

3D Point Cloud Labeling Tools

There are many tools (online or offline) for labeling point clouds, such as BasicFinder, Supervisely, and 3D BAT. We use the online tool Supervisely for labeling our 3D point clouds.

Dataset

Inspired by KITTI, we established separate training and evaluation datasets for the detection of the charging station and the socket/plug. To keep the coordinate system consistent with KITTI, and to meet the other requirements that ensure our acquired point clouds can be fed into the target deep network, a set of conversion tools was developed.
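As an illustration, mapping a camera-frame cloud into the KITTI coordinate convention can be sketched as below; the 7 m forward shift and the 1.425 m height offset (1.73 − 0.305) are the values discussed in this project's converter_pc2KITTIpc.py, while the function name and wrapper are ours:

```python
import numpy as np

# Homogeneous transform mapping PMD-camera points into the KITTI
# coordinate convention (values from converter_pc2KITTIpc.py):
# camera z becomes KITTI x (shifted 7 m forward), camera x becomes
# KITTI y, and camera y is offset by 1.73 - 0.305 = 1.425 m.
transform_matrix = np.array([
    [0, 0, 1, 7],
    [1, 0, 0, 0],
    [0, 1, 0, -1.425],
    [0, 0, 0, 1],
])

def camera_to_kitti(points):
    """Convert an (N, 3) camera-frame cloud to an (N, 3) KITTI-frame cloud."""
    n = points.shape[0]
    homogeneous = np.hstack([points, np.ones((n, 1))])  # (N, 4)
    return (transform_matrix @ homogeneous.T).T[:, :3]

# A point 2 m in front of the camera ends up 9 m along KITTI x.
print(camera_to_kitti(np.array([[0.0, 0.0, 2.0]])))
```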

PV-RCNN is a state-of-the-art deep network that achieves high performance on many autonomous driving benchmarks, such as KITTI, so we apply this learning-based technique to the challenge of Automatic Charging and Plug-in (ACP). Point clouds, the input data structure in our project, are the fundamental data source for 3D detection in PV-RCNN. PV-RCNN is implemented in OpenPCDet, which we modified for our task. We hope the ACP challenge can benefit from learning-based methods.

Charging Station Dataset

For training:

A Charging Station dataset consisting of training data (~1000 frames, 480 MB) and evaluation data (~100 frames, 53 MB).

Download the model (150 MB), trained on ~1000 frames for 250 epochs.

Detection Result:


Socket/Plug Dataset

For evaluation:

A Socket/Plug dataset consisting of training data (~1000 frames, 254 MB) and evaluation data (~100 frames, 48 MB).

Download the model (150 MB), trained on ~1000 frames for 250 epochs.

Detection Result:

3D Construction and Pin Detection

Thanks to the UR robot, multiple acquisition poses can be obtained and integrated to reconstruct a complete 3D environment, followed by feature-based strategies to identify the position and orientation of each pin. For more details, please refer to the papers below.
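A minimal sketch of this multi-pose integration, assuming each acquisition pose is available as a 4x4 camera-to-base homogeneous transform (e.g. from the UR's forward kinematics); merge_clouds is a hypothetical helper, not the repository's API:

```python
import numpy as np

def merge_clouds(clouds, poses):
    """Merge per-pose camera-frame clouds into one robot-base-frame cloud.

    clouds: list of (N_i, 3) point arrays, one per acquisition pose.
    poses:  list of 4x4 camera-to-robot-base homogeneous transforms.
    """
    merged = []
    for points, pose in zip(clouds, poses):
        homogeneous = np.hstack([points, np.ones((len(points), 1))])
        merged.append((pose @ homogeneous.T).T[:, :3])
    return np.vstack(merged)

# The same single-point cloud seen from two poses, the second shifted
# 0.5 m along x: the merged cloud holds both base-frame points.
identity = np.eye(4)
shifted = np.eye(4)
shifted[0, 3] = 0.5
cloud = np.array([[1.0, 0.0, 0.0]])
print(merge_clouds([cloud, cloud], [identity, shifted]))
```

In practice the merged cloud would typically be downsampled (e.g. with a voxel grid) before the feature-based pin localization step.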

Papers

If you find this work useful, please consider citing us:

@article{zhou2022learning,
  title={Learning-based object detection and localization for a mobile robot manipulator in SME production},
  author={Zhou, Zhengxue and Li, Leihui and F{\"u}rsterling, Alexander and Durocher, Hjalte Joshua and Mouridsen, Jesper and Zhang, Xuping},
  journal={Robotics and Computer-Integrated Manufacturing},
  volume={73},
  pages={102229},
  year={2022},
  publisher={Elsevier}
}

@inproceedings{zhou2021deep,
  title={Deep Learning on 3D Object Detection for Automatic Plug-in Charging Using a Mobile Manipulator},
  author={Zhou, Zhengxue and Li, Leihui and Wang, Riwei and Zhang, Xuping},
  booktitle={2021 IEEE International Conference on Robotics and Automation (ICRA)},
  pages={4148--4154},
  year={2021},
  organization={IEEE}
}

Contribution

This project is maintained by @Leihui Li and @Zhengxue Zhou; please feel free to contact us if you have any problems.


Issues

Sign problem in Translation

[[0, 0, 1, 7],
 [1, 0, 0, 0],
 [0, 1, 0, -1.425],
 [0, 0, 0, 1]]

You are translating the points from a height of 0.305 m up to KITTI's 1.73 m, so why do you use -1.425 instead of +1.425?
Thanks in advance.

About POINT_CLOUD_RANGE and anchor_sizes

Hi gltina,
Could you share the specific parameters used for the socket and plug in this project: 1. POINT_CLOUD_RANGE and 2. anchor_sizes? In OpenPCDet, the KITTI dataset uses POINT_CLOUD_RANGE: [0, -40, -3, 70.4, 40, 1] and, for Car, 'anchor_sizes': [[3.9, 1.6, 1.56]].
I used your pretrained model (checkpoint_epoch_250.pth), the Fine data, and the label files; converted the data into the same coordinate system as the KITTI dataset with converter_pc2KITTIpc; and modified some configurations according to the instructions for OpenPCDet custom datasets. Since the object sizes in my labels are different, I hope you can provide the specific values of the two configurations above. Thank you!
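For reference, one common way to choose anchor_sizes for a custom class is to average the labeled box dimensions. The sketch below assumes KITTI-format label lines (height, width, length in fields 8-10, as in the label examples elsewhere in these issues) and converts to OpenPCDet's [length, width, height] order; the helper name is ours:

```python
def mean_anchor_size(label_lines):
    """Average (l, w, h) over KITTI-format label lines for one class.

    KITTI labels store dimensions as height, width, length in
    fields 8-10; OpenPCDet anchor_sizes use [length, width, height].
    """
    dims = []
    for line in label_lines:
        f = line.split()
        h, w, l = float(f[8]), float(f[9]), float(f[10])
        dims.append((l, w, h))
    n = len(dims)
    return [sum(d[i] for d in dims) / n for i in range(3)]

labels = [
    "tractor 0.00 0 0.00 0 0 50 50 2.74 2.12 4.33 4.28 1.28 5.2 0.0",
]
print(mean_anchor_size(labels))  # -> [4.33, 2.12, 2.74]
```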

How to get evaluation result on own dataset

Hi Gltina, thanks for your amazing work.

I used the custom-dataset method that you mentioned here and got some decent results in KITTI format. I want to evaluate them, but the KITTI evaluation does not support my custom results, so I read your paper and would like to use your method to compute the overlap volumes between the ground truth and the final result.

I tried the evaluation.py script you provided, but I can't get the right result, which I think should be the overlap volumes. I placed the trained and standard label folders as you described and ran:

evaluation.py trained_label

A standard label example:

tractor 0.00 0 0.00 0 0 50 50 2.74 2.12 4.33 4.28 1.28 5.2 0.0

A trained label example:

tractor -1 -1 -3.8309 0.0000 0.0000 0.0000 0.0000 2.5715 2.2279 4.3080 4.1810 1.2036 5.1874 -3.1535 0.9790

I would appreciate it if you could help me figure out where my problem is. Many thanks in advance.
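For reference, the overlap volume of two 3D boxes can be sketched as below for the axis-aligned case; this ignores the yaw angle stored in the labels, so it is only a simplified check, and the (cx, cy, cz, l, w, h) box layout is our assumption rather than the script's format:

```python
def overlap_volume(a, b):
    """Overlap volume of two axis-aligned 3D boxes.

    Each box is (cx, cy, cz, l, w, h): center followed by extents.
    Yaw is ignored, so this is exact only for unrotated boxes.
    """
    vol = 1.0
    for center_a, center_b, size_a, size_b in zip(a[:3], b[:3], a[3:], b[3:]):
        lo = max(center_a - size_a / 2, center_b - size_b / 2)
        hi = min(center_a + size_a / 2, center_b + size_b / 2)
        if hi <= lo:
            return 0.0  # disjoint along this axis
        vol *= hi - lo
    return vol

# Two unit cubes offset by 0.5 m along x overlap in half a cube.
print(overlap_volume((0, 0, 0, 1, 1, 1), (0.5, 0, 0, 1, 1, 1)))  # -> 0.5
```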

Question about the 7 m shift in the point cloud converter

Hello, first of all thank you for you work,
I wanted to ask why you chose to shift the scene by exactly 7 m (it worked better for me as well, but I am wondering why).
I saw in a comment that, for KITTI, the scene is a bit far from the center (the Velodyne), but I wanted to know the exact reason for 7 m, or whether it was just a choice.

Thanks in advance

Regarding the output given dataset(Fine)

Hi,
I used your pretrained model (checkpoint_epoch_250.pth) and your Fine data to check the output, but when I run demo.py I do not get the output shown in your .gif file, i.e. no 3D box is detected.

how to convert json file from Supervisely to KITTI label format

Hi Gltina,
Thank you for providing this wonderful repository.
I used Supervisely to annotate my LiDAR data and got a simple JSON file, but its labels are quite different from KITTI labels. You provide a tool to convert JSON to KITTI labels; however, I am wondering whether you did any other conversion before feeding the JSON into converter_mylabel2KITTIlabel.py, since I get errors when passing my JSON directly to this script. If so, how did you do it? Thanks.
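A minimal sketch of such a conversion, assuming the JSON stores one cuboid per figure with position, dimensions, and rotation keys; the exact Supervisely point-cloud JSON schema differs between export versions, so treat the field names below as assumptions, and the fixed truncation/occlusion/2D-bbox placeholders mirror the standard label example above:

```python
import json

def cuboid_to_kitti_line(cls, fig):
    """Build a KITTI-style label line from one cuboid annotation.

    Assumed (hypothetical) JSON layout per figure:
      {"position": {"x":..,"y":..,"z":..},
       "dimensions": {"x": length, "y": width, "z": height},
       "rotation": {"z": yaw}}
    KITTI field order: type trunc occl alpha bbox(4) h w l x y z ry.
    """
    pos, dim, rot = fig["position"], fig["dimensions"], fig["rotation"]
    fields = [cls, 0.0, 0, 0.0, 0, 0, 50, 50,
              dim["z"], dim["y"], dim["x"],
              pos["x"], pos["y"], pos["z"], rot["z"]]
    return " ".join(str(f) for f in fields)

fig = json.loads('{"position": {"x": 4.28, "y": 1.28, "z": 5.2},'
                 ' "dimensions": {"x": 4.33, "y": 2.12, "z": 2.74},'
                 ' "rotation": {"z": 0.0}}')
print(cuboid_to_kitti_line("tractor", fig))
```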

converter_pc2KITTIpc.py

Hi @Gltina,
Can you explain where the values 0.305 m and 7 m come from? 1.73 m is the height of the Velodyne above the ground in KITTI.

Height info:
kitti_camera: 1.73 m
PMD_camera in this project: 0.796 m
1.73 - 0.305 = 1.425

Negative direction, x + 7 m:

transform_matrix = np.mat(np.array([
    [0, 0, 1, 7],
    [1, 0, 0, 0],
    [0, 1, 0, -1.425],
    [0, 0, 0, 1]]))

Train the model with custom dataset

Is it possible to train the model with a custom dataset? If so, how can I train and run inference with it? Thank you so much in advance.
