4d_net_pytorch's Introduction

4D Net Pytorch Implementation KITTI (RGB+LiDAR)

Weights available at: https://lchan017.asuscomm.com/s/4ytTaXQNmCLDzRA
- password: 4dnetpytorch
This repo is an attempt at implementing 4D-Net (https://arxiv.org/abs/2109.01066).
NO TEMPORAL ELEMENT (RGB + LiDAR only, no time dimension)
- This repo serves only as a tutorial for myself.
- I may have missed some details from the paper.
Feel free to download this repo and implement the temporal elements yourself.

Visual Results

Evaluation mAP

Evaluation Code from https://github.com/jacoblambert/3d_lidar_detection_evaluation

Model Details

The model consists of a PointNet processing model, an RGB processing model, a pseudo-image scattering layer, and an EfficientDet-style single-shot detector as the object detection head.
During training, the pseudo images are shown in TensorBoard, and important objects should become more pronounced as training progresses.
To match the targets to the predicted outputs, I used the Hungarian matcher from DETR/Deformable-DETR.
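
As a rough illustration of the Hungarian matching step mentioned above, here is a minimal sketch in NumPy/SciPy. It is not the matcher used in this repo: the function name `hungarian_match`, the cost weights, and the plain L1 box cost are assumptions chosen for the example. DETR-style matchers build a prediction-by-target cost matrix and solve it with `scipy.optimize.linear_sum_assignment`, which is what the sketch does.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def hungarian_match(pred_logits, pred_boxes, tgt_labels, tgt_boxes,
                    w_class=1.0, w_box=1.0):
    """One-to-one matching of predictions to targets (hypothetical helper).

    pred_logits: (n_pred, n_classes), pred_boxes: (n_pred, 7)
    tgt_labels:  (n_tgt,),            tgt_boxes:  (n_tgt, 7)
    """
    # Softmax over classes (shifted for numerical stability).
    z = pred_logits - pred_logits.max(axis=1, keepdims=True)
    prob = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    # Classification cost: negative probability of each target's class.
    cost_class = -prob[:, tgt_labels]                       # (n_pred, n_tgt)
    # Box cost: L1 distance between every prediction/target pair.
    cost_box = np.abs(pred_boxes[:, None, :] - tgt_boxes[None, :, :]).sum(axis=-1)
    cost = w_class * cost_class + w_box * cost_box
    # Solve the assignment problem on the rectangular cost matrix.
    pred_idx, tgt_idx = linear_sum_assignment(cost)
    return pred_idx, tgt_idx


# Toy data: 4 predictions, 2 targets, boxes as x,y,z,w,l,h,r.
tgt_boxes = np.array([[10.0, 0.0, 0.0, 2.0, 4.0, 1.5, 0.0],
                      [30.0, 5.0, 0.0, 2.0, 4.0, 1.5, 0.0]])
tgt_labels = np.array([1, 2])
pred_boxes = np.zeros((4, 7))
pred_boxes[2] = tgt_boxes[0] + 0.1   # prediction 2 sits near target 0
pred_boxes[0] = tgt_boxes[1] - 0.1   # prediction 0 sits near target 1
pred_logits = np.zeros((4, 4))

pred_idx, tgt_idx = hungarian_match(pred_logits, pred_boxes, tgt_labels, tgt_boxes)
print(dict(zip(pred_idx.tolist(), tgt_idx.tolist())))  # {0: 1, 2: 0}
```

Unmatched predictions (1 and 3 in the toy data) would normally be trained towards a background class.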
Half of the effort here goes into having the dataset collect the relevant RGB feature coordinates.
- These coordinates are used to gather the CNN features from the RGB image and create a separate pseudo image.
- This is then concatenated with the LiDAR PointPillars pseudo image later.
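
The gather, scatter, and concatenate steps described above can be sketched in NumPy as follows. This is only an illustration under assumed shapes, not the code in this repo: the helper names `scatter_to_pseudo_image` and `gather_rgb_features`, the tiny 6x6 grid, and the channel counts are all made up for the example.

```python
import numpy as np


def scatter_to_pseudo_image(features, coords, grid_hw):
    """Scatter (N, C) per-pillar features onto a (C, H, W) canvas.

    coords holds one integer (row, col) grid cell per pillar.
    """
    h, w = grid_hw
    canvas = np.zeros((features.shape[1], h, w), dtype=features.dtype)
    canvas[:, coords[:, 0], coords[:, 1]] = features.T
    return canvas


def gather_rgb_features(feature_map, pixel_uv):
    """Pick (N, C) features from a (C, Hf, Wf) CNN map at integer (u, v) pixels."""
    return feature_map[:, pixel_uv[:, 1], pixel_uv[:, 0]].T


rng = np.random.default_rng(0)
grid_hw = (6, 6)                                   # toy BEV grid
coords = np.array([[0, 1], [2, 3], [5, 5]])        # cells of 3 non-empty pillars
lidar_feats = rng.standard_normal((3, 4))          # 4 LiDAR channels per pillar
rgb_map = rng.standard_normal((8, 10, 20))         # 8-channel CNN feature map
pixel_uv = np.array([[3, 2], [19, 9], [0, 0]])     # image location of each pillar

lidar_img = scatter_to_pseudo_image(lidar_feats, coords, grid_hw)
rgb_img = scatter_to_pseudo_image(gather_rgb_features(rgb_map, pixel_uv), coords, grid_hw)
fused = np.concatenate([lidar_img, rgb_img], axis=0)  # stack along channels
print(fused.shape)  # (12, 6, 6)
```

Both pseudo images share the same grid coordinates, so channel-wise concatenation lines up LiDAR and RGB features cell by cell.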

Anchorbox Calculation

K-means clustering of the ground-truth boxes is used to derive the anchor boxes.
See Stats.ipynb.
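
For readers who do not want to open the notebook, here is a minimal sketch of anchor sizes from k-means. It is not the code in Stats.ipynb: the box sizes are synthetic numbers invented for the example (not KITTI statistics), k=3 is an assumption, and the farthest-point initialisation is swapped in only to keep the result deterministic.

```python
import numpy as np


def farthest_point_init(x, k):
    """Deterministic init: start at x[0], then repeatedly add the farthest point."""
    centers = [x[0]]
    for _ in range(k - 1):
        d = np.linalg.norm(x[:, None, :] - np.array(centers)[None, :, :], axis=-1)
        centers.append(x[np.argmax(d.min(axis=1))])
    return np.array(centers)


def kmeans(x, k, n_iter=100):
    """Plain Lloyd's algorithm; returns (centers, labels)."""
    centers = farthest_point_init(x, k)
    for _ in range(n_iter):
        dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=-1)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels


# Synthetic (w, l, h) box sizes around three made-up size groups.
rng = np.random.default_rng(0)
true_sizes = np.array([[0.6, 0.8, 1.7], [1.6, 3.9, 1.5], [2.5, 10.0, 3.0]])
boxes = np.concatenate([m + 0.05 * rng.standard_normal((100, 3)) for m in true_sizes])

anchors, labels = kmeans(boxes, k=3)
anchors = anchors[np.argsort(anchors[:, 1])]  # sort by length for readability
print(np.round(anchors, 1))
```

Each row of `anchors` is one candidate anchor size; on real data the cluster count would be chosen to match the number of anchors per scale.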

How to Train/Infer

Edit the dataset root location in train_exp_KITTI.py:

    import numpy as np
    from torch.utils.data import DataLoader

    from KITTI_dataset import kitti_dataset, KITTI_collate_fn
    from pillar_models import NET_4D_EffDet

    batch_size = 4
    # x_min, y_min, z_min, x_max, y_max, z_max
    xyz_range = np.array([0, -40.32, -2, 80.64, 40.32, 3])
    xy_voxel_size = np.array([0.16, 0.16])
    points_per_pillar = 32
    n_pillars = 12000

    # Point `root` at your local KITTI training folder.
    dataset = kitti_dataset(root="/home/conda/RAID_5_14TB/DATASETS/KITTI_dataset/training/", xyz_range=xyz_range, xy_voxel_size=xy_voxel_size, points_per_pillar=points_per_pillar, n_pillars=n_pillars)
    data_loader_train = DataLoader(dataset, batch_size=batch_size, collate_fn=KITTI_collate_fn, num_workers=8, shuffle=True)

    # Anchor boxes from the K-means analysis
    anchor_dict = np.load("./cluster_kitti_3scales_3anchor.npy", allow_pickle=True).item()
    model = NET_4D_EffDet(anchor_dict, n_classes=4)
    model.cuda()
    for i, (img, (pillars, coord, contains_pillars), (pillar_img_pts, rgb_coors, contains_rgb), targets) in enumerate(data_loader_train):
        pred, _, _ = model(img.cuda(), pillars.float().cuda(), coord.cuda(), contains_pillars.cuda(), pillar_img_pts.float().cuda(), rgb_coors.cuda(), contains_rgb.cuda())

    # Shapes of the last batch; the first dimension is the batch size (2 in the original run)
    print(pred["pred_logits"].shape)  # torch.Size([2, 15747, 4])
    print(pred["pred_boxes"].shape)   # torch.Size([2, 15747, 7]), i.e. x,y,z,w,l,h,r
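
To make the `xyz_range`, `xy_voxel_size`, `points_per_pillar`, and `n_pillars` settings above more concrete, here is a minimal NumPy sketch of how points could be grouped into pillars. It is not the code in `KITTI_dataset.py`: the helpers `grid_shape` and `pillarize` are hypothetical, and randomly sub-sampling pillars and points when the limits are exceeded is an assumption. Only the output names loosely mirror what the data loader yields.

```python
import numpy as np


def grid_shape(xyz_range, xy_voxel_size):
    """Number of pillars along x and y for the given range and pillar size."""
    extent = xyz_range[3:5] - xyz_range[0:2]
    return np.round(extent / xy_voxel_size).astype(int)


def pillarize(points, xyz_range, xy_voxel_size, points_per_pillar, n_pillars, seed=0):
    """Group (N, F) points (x, y, z first) into fixed-size pillar tensors."""
    rng = np.random.default_rng(seed)
    n_x, n_y = grid_shape(xyz_range, xy_voxel_size)
    lo, hi = xyz_range[:3], xyz_range[3:]
    # Drop points outside the detection range.
    points = points[np.all((points[:, :3] >= lo) & (points[:, :3] < hi), axis=1)]
    # Integer grid cell of every point, flattened to one id per pillar.
    ij = np.floor((points[:, :2] - lo[:2]) / xy_voxel_size).astype(int)
    ij = np.clip(ij, 0, [n_x - 1, n_y - 1])
    uniq, inverse = np.unique(ij[:, 0] * n_y + ij[:, 1], return_inverse=True)
    # Assumption: keep a random subset when there are too many pillars.
    keep = np.arange(len(uniq))
    if len(uniq) > n_pillars:
        keep = rng.choice(len(uniq), size=n_pillars, replace=False)
    pillars = np.zeros((n_pillars, points_per_pillar, points.shape[1]), dtype=points.dtype)
    coord = np.zeros((n_pillars, 2), dtype=int)
    contains_pillars = np.zeros(n_pillars, dtype=bool)
    for slot, u in enumerate(keep):
        idx = np.flatnonzero(inverse == u)
        if len(idx) > points_per_pillar:  # same idea for points within a pillar
            idx = rng.choice(idx, size=points_per_pillar, replace=False)
        pillars[slot, :len(idx)] = points[idx]
        coord[slot] = divmod(int(uniq[u]), int(n_y))
        contains_pillars[slot] = True
    return pillars, coord, contains_pillars


xyz_range = np.array([0, -40.32, -2, 80.64, 40.32, 3])
xy_voxel_size = np.array([0.16, 0.16])
print(grid_shape(xyz_range, xy_voxel_size))  # [504 504]

# Toy point cloud: x, y, z plus one intensity channel.
rng = np.random.default_rng(0)
points = rng.uniform(xyz_range[:3], xyz_range[3:], size=(1000, 3))
points = np.hstack([points, rng.uniform(size=(1000, 1))])
pillars, coord, contains_pillars = pillarize(
    points, xyz_range, xy_voxel_size, points_per_pillar=8, n_pillars=50)
print(pillars.shape)  # (50, 8, 4)
```

With the settings from the training script, the pseudo image is therefore a 504 x 504 grid, of which at most 12000 cells carry pillar features.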

4d_net_pytorch's Issues

model_KITTI_exp_CP2.pth

Hi, how can I get the model file "model_KITTI_exp_CP2.pth"?

some questions about the code ...

Thank you for your excellent work. I have some questions about parts of the code; could you help me answer them?
(1) Is the way labels are converted from the camera to the LiDAR coordinate system, and then filtered by range, too rough?

    label.columns = self.col_names
    df = label[["type","z","x","y","l","w","h","yaw"]]  # Camera coord; convert from the camera view to the LiDAR view
    df.columns = ["type","x","y","z","l","w","h","yaw"]  # LiDAR coord
    df["y"] = (-1*df["y"]).copy(deep=True)
    df = df[df["type"]!="DontCare"]  # exclude certain classes
    xy_filter = (df["x"].values <= self.xyz_range[3]) & (df["x"].values >= self.xyz_range[0]) & (df["y"].values <= self.xyz_range[4]) & (df["y"].values >= self.xyz_range[1])
    df = df[xy_filter]

(2) After setting 'max_voxels', should we select pillars randomly or design a selector?
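
For context on question (1), the axis swap in the quoted snippet can be sketched in NumPy as below. This is a sketch under assumptions, not an answer from the author. It uses the usual KITTI conventions (camera: x right, y down, z forward; LiDAR: x forward, y left, z up) and ignores the calibration matrices, so it is only an approximation. Unlike the quoted snippet, it also negates the vertical axis. The helper names and the sample box centres are made up.

```python
import numpy as np


def camera_to_lidar_approx(xyz_cam):
    """Axis permutation only (assumed conventions, no calibration applied).

    camera (x right, y down, z forward) -> LiDAR (x forward, y left, z up)
    """
    x_c, y_c, z_c = xyz_cam[:, 0], xyz_cam[:, 1], xyz_cam[:, 2]
    return np.stack([z_c, -x_c, -y_c], axis=1)


def in_xy_range(xyz_lidar, xyz_range):
    """Keep boxes whose centre lies inside the x/y detection range."""
    x, y = xyz_lidar[:, 0], xyz_lidar[:, 1]
    return ((x >= xyz_range[0]) & (x <= xyz_range[3]) &
            (y >= xyz_range[1]) & (y <= xyz_range[4]))


xyz_range = np.array([0, -40.32, -2, 80.64, 40.32, 3])
centres_cam = np.array([[1.0, 1.5, 10.0],     # 10 m ahead, slightly right
                        [-50.0, 1.6, 20.0],   # far to the left
                        [2.0, 1.4, 100.0]])   # too far ahead
centres_lidar = camera_to_lidar_approx(centres_cam)
mask = in_xy_range(centres_lidar, xyz_range)
print(mask.tolist())  # [True, False, False]
```

A more exact conversion would apply the per-frame calibration from the KITTI calib files instead of a fixed axis permutation.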


Repository: chanlilong / 4d_net_pytorch
