Here, we developed a deep learning based perception system which integrated semantic segmentation and object detection. By changing the cost function of YOLO, we can help the model focus better on small objects(remote cars) as well as keeping the real time performance.
The code will come later with the paper, here are two demo videos: