simpleoccupancy's People

Contributors

ganwanshui

simpleoccupancy's Issues

Why do you mask the feat here?

mask_mems = (torch.abs(feat_mems) > 0).float()
feat_mem = basic.reduce_masked_mean(feat_mems, mask_mems, dim=1) # B, C, Z, Y, X
feat_mem = feat_mem.permute(0, 1, 4, 3, 2) # [0, ...].unsqueeze(0) # ZYX -> XYZ
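
For context, here is a minimal sketch of what a masked mean over the camera dimension could look like. The helper name and tensor layout are taken from the snippet above and are assumptions, not verified against the repo; the point is that voxels a camera never projects into stay exactly zero, so the mask keeps them from dragging the average down.

import torch

def reduce_masked_mean(x, mask, dim, eps=1e-6):
    # Average only over entries flagged as valid; empty contributions are ignored.
    return (x * mask).sum(dim=dim) / (mask.sum(dim=dim) + eps)

# feat_mems: (B, S, C, Z, Y, X) features unprojected from S cameras (assumed layout).
feat_mems = torch.randn(2, 6, 32, 16, 16, 16)
feat_mems = feat_mems * (torch.rand(2, 6, 1, 16, 16, 16) > 0.5)  # simulate unseen voxels
mask_mems = (torch.abs(feat_mems) > 0).float()
feat_mem = reduce_masked_mean(feat_mems, mask_mems, dim=1)       # (B, C, Z, Y, X)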

Why do you flip the xyz here?

ind_norm = ((xyz - self.xyz_min) / (self.xyz_max - self.xyz_min)).flip((-1,)) * 2 - 1
In my opinion, the xyz has already been converted to the world coordinate frame. So if we flip the xyz, the function will sample the grid along the wrong axes (x along the z-axis, z along the x-axis).
Is there anything wrong with my understanding?
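
For reference, F.grid_sample interprets the last dimension of the sampling grid as (x, y, z), where x indexes the last (W) spatial dimension of the input volume. Assuming the voxel features are stored with spatial dims ordered (X, Y, Z), the normalized world coordinates therefore have to be flipped to (z, y, x) before sampling; the flip changes the coordinate order, not the axes themselves. A minimal check (shapes are illustrative only):

import torch
import torch.nn.functional as F

# Voxel grid stored as (N, C, X, Y, Z): grid_sample sees D=X, H=Y, W=Z.
vol = torch.arange(4 * 5 * 6, dtype=torch.float32).reshape(1, 1, 4, 5, 6)

# One world point as normalized (x, y, z) coordinates in [-1, 1].
xyz_norm = torch.tensor([[-1.0, -1.0, 1.0]])   # x -> first voxel, z -> last voxel

# grid_sample reads each grid entry as (coord for W, coord for H, coord for D),
# so (x, y, z) must be flipped to (z, y, x) to hit the intended (X, Y, Z) indices.
grid = xyz_norm.flip(-1).reshape(1, 1, 1, 1, 3)
out = F.grid_sample(vol, grid, align_corners=True)
print(out.item())  # 5.0 == vol[0, 0, 0, 0, 5], i.e. X index 0, Y index 0, Z index 5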

Supplements?

@GANWANSHUI Hi! Are there supplementary materials available for this paper? There are multiple references for them in the main paper.

btw, thanks for the interesting work.

Questions about depth map visualization and depth rendering

Hello, and thank you for your great work. I have encountered some problems and hope to discuss them with you. Forgive me, I need to spend some space to explain my problem.

  1. First of all, the depth rendered by your method achieves a lower Abs Rel than traditional depth estimation methods, but the visualization of the depth map looks worse. So my first question is, what causes this?
    [image]

  2. To explore this question, I did some experiments on KITTI (I don't have sufficient GPU resources, so I chose a smaller dataset). I expected to achieve better results than traditional self-supervised depth estimation methods (like monodepth2), but the Abs Rel was only close to monodepth2's, and it took more training time. At the same time, the visualization of the depth map still looks worse than monodepth2's.

  3. Further, I would like to know whether using the GT occupancy labels can render a better depth map. So I used the semantic GT labels to render some depth maps and got the following results.
    [image]

This looks strange; I found it is a problem with the depth visualization function. I need to pass direct=True to visualize_depth: pred_depth_color = visualize_depth(pred_depth.copy(), direct=True). Then I got a normal result.
[image]
So my second question is: does volume rendering produce depth or disparity? Do I need depth or disparity to visualize depth maps? I think visualizing depth maps requires disparity rather than depth, so your method needs to first convert depth to disparity for visualization; this is also what monodepth2 does.

import cv2
import numpy as np

def visualize_depth(depth, mask=None, depth_min=None, depth_max=None, direct=False):
    """Visualize the depth map with colormap.
       Rescales the values so that depth_min and depth_max map to 0 and 1,
       respectively.
    """
    if not direct:
        depth = 1.0 / (depth + 1e-6)
    invalid_mask = np.logical_or(np.isnan(depth), np.logical_not(np.isfinite(depth)))
    if mask is not None:
        invalid_mask += np.logical_not(mask)
    if depth_min is None:
        depth_min = np.percentile(depth[np.logical_not(invalid_mask)], 5) # 0.027  0.02
    if depth_max is None:
        depth_max = np.percentile(depth[np.logical_not(invalid_mask)], 95) # 9.99  0.169
    depth[depth < depth_min] = depth_min
    depth[depth > depth_max] = depth_max
    depth[invalid_mask] = depth_max

    depth_scaled = (depth - depth_min) / (depth_max - depth_min)
    depth_scaled_uint8 = np.uint8(depth_scaled * 255)
    depth_color = cv2.applyColorMap(depth_scaled_uint8, cv2.COLORMAP_MAGMA)
    depth_color[invalid_mask, :] = 0

    return depth_color
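
For clarity, here is how the two code paths behave in practice (a usage sketch; the array contents are made up):

import numpy as np

# Hypothetical rendered depth map in metres.
pred_depth = np.random.uniform(2.0, 50.0, size=(352, 640)).astype(np.float32)

# Default path (direct=False): depth is inverted to disparity (1 / depth) before
# colour-mapping, matching how monodepth2-style results are usually shown.
disp_vis = visualize_depth(pred_depth.copy())

# direct=True: colour-map the values as they are. Use this when the input is
# already a disparity map (or when you really want raw depth colours).
depth_vis = visualize_depth(pred_depth.copy(), direct=True)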

But what confuses me is that when I use the GT occupancy labels instead of the predicted occupancy probabilities, the visualization is incorrect.
Have I misunderstood anything?
Looking forward to your reply!

Rendering depth too slow

I implemented a PyTorch version of the volume rendering code, but the rendering speed during training is too slow (about 20x slower). Looking forward to your open-source code.
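
For reference, the standard NeRF-style expected-depth computation can be written fully batched in PyTorch; keeping it vectorised (no per-ray Python loops) is usually the main lever for speed. This is a generic sketch, not the authors' implementation; it assumes per-sample densities sigma and sample distances t along each ray.

import torch

def render_depth(sigma, t):
    # sigma: (R, N) non-negative densities for N samples on each of R rays
    # t:     (R, N) sample distances along each ray, monotonically increasing
    delta = torch.diff(t, dim=-1)                                    # (R, N-1)
    delta = torch.cat([delta, torch.full_like(delta[..., :1], 1e10)], dim=-1)
    alpha = 1.0 - torch.exp(-sigma * delta)                          # opacity per sample
    trans = torch.cumprod(
        torch.cat([torch.ones_like(alpha[..., :1]), 1.0 - alpha + 1e-10], dim=-1),
        dim=-1)[..., :-1]                                            # transmittance
    weights = alpha * trans                                          # (R, N)
    return (weights * t).sum(dim=-1)                                 # expected depth per ray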

Visualization implementation

This is an interesting work! I'm interested in the visualized results shown in the GIFs, especially the SDF visualization. Is there any plan to release the visualization code?
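
As a general note (not the authors' pipeline), a common way to visualise a predicted SDF volume is to extract its zero level set with marching cubes and export a mesh. A minimal sketch, assuming a dense (X, Y, Z) SDF grid saved as a NumPy array (the file name and voxel size are hypothetical):

import numpy as np
import trimesh
from skimage import measure

sdf = np.load('sdf_volume.npy')     # hypothetical dense SDF grid, shape (X, Y, Z)
voxel_size = 0.4                    # hypothetical metres per voxel

# The zero level set of the SDF is the estimated surface.
verts, faces, normals, _ = measure.marching_cubes(sdf, level=0.0)
mesh = trimesh.Trimesh(vertices=verts * voxel_size, faces=faces, vertex_normals=normals)
mesh.export('sdf_surface.ply')      # view in MeshLab, Open3D, etc.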

Why is this gt label not from the point cloud?

def get_gt_loss(self, inputs, scale, outputs, disp, depth_gt, mask):

    singel_scale_total_loss = 0

    if self.opt.volume_depth:

        if self.opt.l1_voxel != 'No':
            density_center = outputs[('density_center', 0)]
            label_true = torch.ones_like(density_center, requires_grad=False)

            all_empty = outputs[('all_empty', 0)]
            label_false = torch.zeros_like(all_empty, requires_grad=False)

            if 'l1' in self.opt.l1_voxel:
                surface_loss_true = F.l1_loss(density_center, label_true, size_average=True)
                surface_loss_false = F.l1_loss(all_empty, label_false, size_average=True)

                total_grid_loss = self.opt.empty_w * surface_loss_false + surface_loss_true

            elif 'ce' in self.opt.l1_voxel:
                label = torch.cat((label_true, label_false))
                pred = torch.cat((density_center, all_empty))

                total_grid_loss = self.criterion(pred, label)

                if self.local_rank == 0 and scale == 0:
                    print('ce loss:', total_grid_loss)

Release of source code and pretrained weights

Hello,

I have read your paper. I was wondering if you are going to publish the model trained on nuScenes data? Furthermore, if possible, do you have a timeline for finalizing and publishing the source code publicly?

Thanks in advance.

About the depth loss and volume rendering sampling strategy

Hello,
Thank you for your interesting work! I found your formulation of depth loss with volume rendering really cool.
I was wondering if you have tried combining both of them, or combining depth loss with semantic loss?
Also, what sampling strategy did you use for the volume rendering?
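
For reference, a common baseline is stratified (jittered uniform) sampling between per-ray near and far bounds; a generic sketch, not necessarily what the paper uses:

import torch

def stratified_samples(near, far, num_samples, perturb=True):
    # near, far: (R,) per-ray bounds; returns (R, num_samples) sample distances.
    bins = torch.linspace(0.0, 1.0, num_samples, device=near.device)      # (N,)
    t = near[:, None] * (1.0 - bins) + far[:, None] * bins                # (R, N)
    if perturb:
        mids = 0.5 * (t[:, 1:] + t[:, :-1])
        lower = torch.cat([t[:, :1], mids], dim=-1)
        upper = torch.cat([mids, t[:, -1:]], dim=-1)
        t = lower + (upper - lower) * torch.rand_like(t)                  # jitter inside each bin
    return t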

pretrained weights

Hi,

Thank you for your excellent work!
Could you release your pre-trained model?
It will help a lot!
