ganwanshui / simpleoccupancy
(IEEE TIV) A Comprehensive Framework for 3D Occupancy Estimation in Autonomous Driving
mask_mems = (torch.abs(feat_mems) > 0).float()  # valid-feature mask
feat_mem = basic.reduce_masked_mean(feat_mems, mask_mems, dim=1)  # masked mean over dim=1 -> B, C, Z, Y, X
feat_mem = feat_mem.permute(0, 1, 4, 3, 2)  # reorder spatial axes: ZYX -> XYZ
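For readers without the repository at hand, basic.reduce_masked_mean is not a standard PyTorch call; the sketch below is only my assumption of what it does (a mean over one dimension that counts only mask-valid entries), not the repository's actual implementation.

import torch

# Hypothetical sketch of reduce_masked_mean (assumed behaviour, not the repo's code):
# average x over `dim`, counting only the entries where mask == 1.
def reduce_masked_mean(x, mask, dim, eps=1e-6):
    numer = (x * mask).sum(dim=dim)
    denom = mask.sum(dim=dim).clamp(min=eps)
    return numer / denom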
Could you provide requirements.txt? Thank you very much.
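Until an official one is published, a minimal guess based only on the imports that appear in the snippets in these issues (torch, numpy, cv2) would be something like the list below; the authors' actual dependencies and pinned versions are unknown.

# Guessed minimal dependencies; not the authors' actual requirements.txt.
torch
numpy
opencv-python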
ind_norm = ((xyz - self.xyz_min) / (self.xyz_max - self.xyz_min)).flip((-1,)) * 2 - 1
In my opinion, the xyz has already been converted to the world coordinate frame. So if we flip xyz, the function will sample the grid along the wrong axis (x along the z-axis, z along the x-axis).
Is there anything wrong with my idea?
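For what it's worth, my understanding (an assumption, since only this one line is quoted) is that the flip matches F.grid_sample's coordinate convention rather than swapping world axes: for a 5D volume of shape (N, C, D, H, W), the last grid channel is read as (x, y, z), where x indexes W, y indexes H and z indexes D. If the feature volume is stored with its spatial dims ordered (X, Y, Z), the normalized (x, y, z) coordinate therefore has to be reversed to (z, y, x) before sampling. A tiny check of the convention:

import torch
import torch.nn.functional as F

# Toy volume laid out as (N, C, D, H, W) = (1, 1, 2, 3, 4), filled with distinct values.
vol = torch.arange(24, dtype=torch.float32).reshape(1, 1, 2, 3, 4)

# grid_sample reads the last grid channel as (x, y, z): channel 0 indexes W,
# channel 1 indexes H, channel 2 indexes D (all normalized to [-1, 1]).
grid = torch.tensor([-1.0, 1.0, 1.0]).view(1, 1, 1, 1, 3)  # x=-1 -> W=0, y=1 -> H=2, z=1 -> D=1
out = F.grid_sample(vol, grid, align_corners=True)
print(out.item())  # 20.0 == vol[0, 0, 1, 2, 0]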
@GANWANSHUI Hi! Are there supplementary materials available for this paper? There are multiple references for them in the main paper.
btw, thanks for the interesting work.
Hello, and thank you for your great work. I have encountered some problems and hope to discuss them with you. Forgive me, I need some space to explain my problem.
First of all, the depth rendered by your method achieves a lower Abs Rel than traditional depth estimation methods, but the visualized depth map looks worse. So my first question is: what causes this?
To explore this, I ran some experiments on KITTI (I don't have sufficient GPU resources, so I chose a smaller dataset). I expected better results than a traditional self-supervised depth estimation method (like Monodepth2), but the Abs Rel was only close to Monodepth2's, and training took more time. The visualized depth map also still looks worse than Monodepth2's.
Furthermore, I wanted to know whether using the GT occupancy labels can render a better depth map, so I used the semantic GT labels to render some depth maps and got the following results.
This looked strange, and I found it was a problem with the depth visualization function: I needed to pass direct=True, i.e. pred_depth_color = visualize_depth(pred_depth.copy(), direct=True), and then I got a normal result.
So my second question is: does volume rendering produce depth or disparity? Do I need depth or disparity to visualize the depth map? I think visualizing a depth map requires disparity rather than depth, so your method first needs to convert depth to disparity for visualization; this is also what Monodepth2 does.
import cv2
import numpy as np


def visualize_depth(depth, mask=None, depth_min=None, depth_max=None, direct=False):
    """Visualize the depth map with a colormap.

    Rescales the values so that depth_min and depth_max map to 0 and 1,
    respectively. If direct is False, the depth is first inverted into a
    disparity-like map before color mapping.
    """
    if not direct:
        depth = 1.0 / (depth + 1e-6)
    invalid_mask = np.logical_or(np.isnan(depth), np.logical_not(np.isfinite(depth)))
    if mask is not None:
        invalid_mask = np.logical_or(invalid_mask, np.logical_not(mask))
    if depth_min is None:
        depth_min = np.percentile(depth[np.logical_not(invalid_mask)], 5)
    if depth_max is None:
        depth_max = np.percentile(depth[np.logical_not(invalid_mask)], 95)
    depth[depth < depth_min] = depth_min
    depth[depth > depth_max] = depth_max
    depth[invalid_mask] = depth_max
    depth_scaled = (depth - depth_min) / (depth_max - depth_min)
    depth_scaled_uint8 = np.uint8(depth_scaled * 255)
    depth_color = cv2.applyColorMap(depth_scaled_uint8, cv2.COLORMAP_MAGMA)
    depth_color[invalid_mask, :] = 0
    return depth_color
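As a usage note (my reading of the function above, not guidance from the authors): with the default direct=False the values are first inverted into a disparity-like map before color mapping, which is what Monodepth2-style visualizations expect for metric depth, while direct=True color-maps the values as given. A hypothetical call on a rendered metric depth map:

import numpy as np

# pred_depth is assumed here to be an H x W metric depth map (e.g. in meters)
pred_depth = np.random.uniform(1.0, 80.0, size=(192, 640)).astype(np.float32)

vis_disp_style = visualize_depth(pred_depth.copy())               # invert to disparity, then colormap
vis_raw_values = visualize_depth(pred_depth.copy(), direct=True)  # colormap the values as-is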
But what confuses me is that when I use the GT occupancy labels instead of the predicted occupancy probabilities, the visualization is incorrect.
Have I misunderstood anything?
Looking forward to your reply!
I implemented a PyTorch version of the volume rendering code, but the rendering speed during training is too slow (about 20x slower). Looking forward to your open-source code.
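For reference, a fully vectorized compositing step in the style of standard NeRF volume rendering (a minimal sketch under my own assumptions, not the authors' implementation) looks roughly like the following; per-ray Python loops are usually what makes a naive version an order of magnitude slower.

import torch

def render_expected_depth(density, z_vals):
    # density: (N_rays, N_samples) non-negative densities along each ray
    # z_vals:  (N_rays, N_samples) sample depths along each ray, increasing
    dists = z_vals[:, 1:] - z_vals[:, :-1]
    dists = torch.cat([dists, torch.full_like(dists[:, :1], 1e10)], dim=-1)  # pad last interval

    alpha = 1.0 - torch.exp(-density * dists)                    # per-sample opacity
    trans = torch.cumprod(1.0 - alpha + 1e-10, dim=-1)           # accumulated transmittance
    trans = torch.cat([torch.ones_like(trans[:, :1]), trans[:, :-1]], dim=-1)  # shift so T_1 = 1
    weights = alpha * trans                                      # (N_rays, N_samples)

    depth = (weights * z_vals).sum(dim=-1)                       # expected depth per ray
    return depth, weights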
This is interesting work! I'm interested in the visualized results shown in the GIFs, especially the SDF visualization. Is there any plan to release the visualization code?
Could you also share the rendering settings, such as near, far, n_samples, etc.?
I see the point cloud processing file for nuScenes in tools, but it is incomplete.
Could you upload the complete file and the label path referenced in L41?
def get_gt_loss(self, inputs, scale, outputs, disp, depth_gt, mask):
    singel_scale_total_loss = 0

    if self.opt.volume_depth:
        if self.opt.l1_voxel != 'No':
            # densities at sample points supervised towards 1 (occupied)
            density_center = outputs[('density_center', 0)]
            label_true = torch.ones_like(density_center, requires_grad=False)

            # densities at sample points supervised towards 0 (empty)
            all_empty = outputs[('all_empty', 0)]
            label_false = torch.zeros_like(all_empty, requires_grad=False)

            if 'l1' in self.opt.l1_voxel:
                surface_loss_true = F.l1_loss(density_center, label_true, reduction='mean')
                surface_loss_false = F.l1_loss(all_empty, label_false, reduction='mean')
                total_grid_loss = self.opt.empty_w * surface_loss_false + surface_loss_true
            elif 'ce' in self.opt.l1_voxel:
                label = torch.cat((label_true, label_false))
                pred = torch.cat((density_center, all_empty))
                total_grid_loss = self.criterion(pred, label)

            if self.local_rank == 0 and scale == 0:
                print('ce loss:', total_grid_loss)
Hello,
I have read your paper. I was wondering whether you are going to publish the model trained on the nuScenes data? Furthermore, if possible, do you have a time schedule for finalizing and publishing the source code publicly?
Thanks in advance.
Hello,
Thank you for your interesting work! I found your formulation of depth loss with volume rendering really cool.
I was wondering if you have tried combining both of them, or combining depth loss with semantic loss?
Also, what sampling strategy did you use for the volume rendering?
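For context, the most common choice in NeRF-style pipelines is uniform sampling between near and far with per-interval jitter during training (stratified sampling); the sketch below is only that generic strategy, not necessarily what the paper uses.

import torch

def sample_along_rays(near, far, n_rays, n_samples, perturb=True, device='cpu'):
    # evenly spaced depths in [near, far], shared across rays
    t = torch.linspace(0.0, 1.0, n_samples, device=device)
    z_vals = near * (1.0 - t) + far * t
    z_vals = z_vals.expand(n_rays, n_samples).clone()

    if perturb:
        # stratified sampling: jitter each sample inside its own interval
        mids = 0.5 * (z_vals[:, 1:] + z_vals[:, :-1])
        upper = torch.cat([mids, z_vals[:, -1:]], dim=-1)
        lower = torch.cat([z_vals[:, :1], mids], dim=-1)
        z_vals = lower + (upper - lower) * torch.rand_like(z_vals)
    return z_vals  # (n_rays, n_samples)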
Hi,
Thank you for your excellent work!
Can you update your pre-trained model?
It will help a lot!