4d_net_pytorch's Introduction

4D Net Pytorch Implementation KITTI (RGB+LiDAR)

Visual Results

visual_result

Evaluation mAP

Model Details

4dnet

  • The model consists of a PointNet processing model, an RGB processing model, a pseudo-image scattering layer, and an EfficientDet-style single-shot detector as the object detection head
  • During training, the pseudo images will look like this in TensorBoard, and important objects should become more pronounced
  • For matching the targets to the predicted outputs, I used the Hungarian matcher from DETR/Deformable-DETR
  • Half of the effort here is getting the dataset to grab the relevant RGB feature coordinates
    • These coordinates are used to grab the CNN features from the RGB image to create a separate pseudo image
    • This is then concatenated with the LiDAR PointPillars pseudo image later
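
The scatter-and-concatenate step described above can be sketched in NumPy as follows. This is a minimal illustration, not the repository's code: the function names `scatter_to_pseudo_image` and `gather_rgb_features`, the toy shapes, and the assumption of one RGB pixel per pillar are all hypothetical.

```python
import numpy as np

def scatter_to_pseudo_image(pillar_features, coords, grid_hw):
    """Place per-pillar feature vectors onto a dense (C, H, W) canvas.

    pillar_features: (P, C) array, one feature vector per non-empty pillar
    coords:          (P, 2) integer (row, col) grid cell of each pillar
    grid_hw:         (H, W) size of the bird's-eye-view grid
    """
    n_channels = pillar_features.shape[1]
    canvas = np.zeros((n_channels, *grid_hw), dtype=pillar_features.dtype)
    # Cells without a pillar stay zero.
    canvas[:, coords[:, 0], coords[:, 1]] = pillar_features.T
    return canvas

def gather_rgb_features(feature_map, pixel_coords):
    """Pick CNN features at the given (row, col) pixels: (C, Hf, Wf) -> (P, C)."""
    return feature_map[:, pixel_coords[:, 0], pixel_coords[:, 1]].T

# Toy inputs (made up for illustration): 3 pillars on a 5x5 grid.
lidar_feats = np.arange(12, dtype=np.float32).reshape(3, 4) + 1  # (P=3, C=4)
rgb_map = np.arange(2 * 6 * 8, dtype=np.float32).reshape(2, 6, 8)  # (C=2, 6, 8)
coords = np.array([[0, 0], [2, 3], [4, 1]])   # pillar cells in the BEV grid
pixels = np.array([[1, 2], [3, 7], [5, 0]])   # assumed projected image pixels

lidar_img = scatter_to_pseudo_image(lidar_feats, coords, (5, 5))
rgb_img = scatter_to_pseudo_image(gather_rgb_features(rgb_map, pixels), coords, (5, 5))

# Channel-wise concatenation of the two pseudo images.
fused = np.concatenate([lidar_img, rgb_img], axis=0)
print(fused.shape)
```

Both pseudo images share the same grid, so the RGB features end up aligned cell by cell with the LiDAR features before the detection head sees them.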

Anchorbox Calculation

  • K-means clustering of the ground-truth boxes is used to compute the anchor boxes
  • See Stats.ipynb for the analysis
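
As a rough idea of what such a clustering might look like, here is a plain Lloyd's k-means over box sizes. It is a sketch under assumptions: the notebook's actual features, distance metric, and number of clusters are not stated here, so the `(w, l, h)` features, the Euclidean distance, the function name `kmeans_anchors`, and the toy box sizes are all hypothetical.

```python
import numpy as np

def kmeans_anchors(box_sizes, k, n_iters=50, seed=0):
    """Cluster (w, l, h) box sizes with Lloyd's k-means (Euclidean distance)."""
    rng = np.random.default_rng(seed)
    box_sizes = np.asarray(box_sizes, dtype=float)
    # Initialise the centers with k distinct boxes from the data.
    centers = box_sizes[rng.choice(len(box_sizes), size=k, replace=False)]
    for _ in range(n_iters):
        # Distance from every box to every center: shape (N, k).
        dists = np.linalg.norm(box_sizes[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center; keep the old one if its cluster is empty.
        new_centers = np.array([
            box_sizes[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Sort anchors by volume so the output order is stable.
    return centers[np.argsort(centers.prod(axis=1))]

# Toy ground-truth sizes (made up): four car-like and four pedestrian-like boxes.
toy_boxes = [
    [1.6, 3.9, 1.5], [1.7, 4.1, 1.6], [1.5, 3.7, 1.4], [1.6, 4.0, 1.5],
    [0.6, 0.8, 1.7], [0.7, 0.9, 1.8], [0.5, 0.7, 1.6], [0.6, 0.8, 1.7],
]
anchors = kmeans_anchors(toy_boxes, k=2)
print(anchors)
```

Each returned row is one anchor size; on real data, the ground-truth boxes parsed from the KITTI label files would replace `toy_boxes`.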

How to Train/Infer

  • Edit the dataset root location in train_exp_KITTI.py:
    import numpy as np
    from torch.utils.data import DataLoader

    from KITTI_dataset import kitti_dataset, KITTI_collate_fn
    from pillar_models import NET_4D_EffDet

    batch_size = 4
    # Point cloud range: x_min, y_min, z_min, x_max, y_max, z_max
    xyz_range = np.array([0, -40.32, -2, 80.64, 40.32, 3])
    xy_voxel_size = np.array([0.16, 0.16])  # pillar size along x and y
    points_per_pillar = 32
    n_pillars = 12000

    # Change `root` to the location of your KITTI training directory.
    dataset = kitti_dataset(
        root="/home/conda/RAID_5_14TB/DATASETS/KITTI_dataset/training/",
        xyz_range=xyz_range,
        xy_voxel_size=xy_voxel_size,
        points_per_pillar=points_per_pillar,
        n_pillars=n_pillars,
    )
    data_loader_train = DataLoader(dataset, batch_size=batch_size, collate_fn=KITTI_collate_fn, num_workers=8, shuffle=True)

    # Load the precomputed anchor boxes.
    anchor_dict = np.load("./cluster_kitti_3scales_3anchor.npy", allow_pickle=True).item()
    model = NET_4D_EffDet(anchor_dict, n_classes=4)
    model.cuda()
    for i, (img, (pillars, coord, contains_pillars), (pillar_img_pts, rgb_coors, contains_rgb), targets) in enumerate(data_loader_train):
        pred, _, _ = model(img.cuda(), pillars.float().cuda(), coord.cuda(), contains_pillars.cuda(), pillar_img_pts.float().cuda(), rgb_coors.cuda(), contains_rgb.cuda())

    # Example output shapes, shown here for a batch of 2
    print(pred["pred_logits"].shape)  # torch.Size([2, 15747, 4])
    print(pred["pred_boxes"].shape)   # torch.Size([2, 15747, 7]): x, y, z, w, l, h, r
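
To get a feel for the settings above, the size of the pillar grid implied by `xyz_range` and `xy_voxel_size` can be worked out directly. The helper name `pillar_grid_shape` is hypothetical; only the numeric settings are taken from the snippet, with `xyz_range` read as `x_min, y_min, z_min, x_max, y_max, z_max`.

```python
# Settings copied from the training snippet.
xyz_range = [0, -40.32, -2, 80.64, 40.32, 3]  # x_min, y_min, z_min, x_max, y_max, z_max
xy_voxel_size = [0.16, 0.16]
n_pillars = 12000

def pillar_grid_shape(xyz_range, xy_voxel_size):
    """Return (nx, ny): the number of pillar cells along x and y."""
    x_min, y_min, _, x_max, y_max, _ = xyz_range
    # round() guards against floating-point error in the division.
    nx = round((x_max - x_min) / xy_voxel_size[0])
    ny = round((y_max - y_min) / xy_voxel_size[1])
    return nx, ny

nx, ny = pillar_grid_shape(xyz_range, xy_voxel_size)
total_cells = nx * ny
print(f"grid: {nx} x {ny} = {total_cells} cells")
print(f"n_pillars covers at most {n_pillars / total_cells:.1%} of the grid")
```

With these values the grid is 504 x 504 cells, so `n_pillars = 12000` caps the number of non-empty pillars at a small fraction of the grid, which suits sparse LiDAR scans.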

4d_net_pytorch's Issues

some questions about the code ...

Thank you for your excellent work, but I have some questions about some code. Can you help me try to answer them?
(1) Is the selection of label ranges from cameras to LiDAR coordinate systems too rough?
    label.columns = self.col_names
    df = label[["type","z","x","y","l","w","h","yaw"]]  # Camera coord; convert from the camera view to the LiDAR view
    df.columns = ["type","x","y","z","l","w","h","yaw"]  # LiDAR coord
    df["y"] = (-1*df["y"]).copy(deep=True)
    df = df[df["type"]!="DontCare"]  # exclude certain classes
    xy_filter = (df["x"].values <= self.xyz_range[3]) & (df["x"].values >= self.xyz_range[0]) & (df["y"].values <= self.xyz_range[4]) & (df["y"].values >= self.xyz_range[1])
    df = df[xy_filter]
(2) After setting 'max_voxels', should we randomly select or design a selector?
