ganwanshui / simpleoccupancy
(IEEE TIV) A Comprehensive Framework for 3D Occupancy Estimation in Autonomous Driving
mask_mems = (torch.abs(feat_mems) > 0).float()  # valid-feature mask
feat_mem = basic.reduce_masked_mean(feat_mems, mask_mems, dim=1)  # masked mean over dim=1 -> B, C, Z, Y, X
feat_mem = feat_mem.permute(0, 1, 4, 3, 2)  # reorder spatial axes: ZYX -> XYZ
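For readers without the repository at hand, basic.reduce_masked_mean is not a standard PyTorch call; the sketch below is only my assumption of what it does (a mean over one dimension that counts only mask-valid entries), not the repository's actual implementation.

import torch

# Hypothetical sketch of reduce_masked_mean (assumed behaviour, not the repo's code):
# average x over `dim`, counting only the entries where mask == 1.
def reduce_masked_mean(x, mask, dim, eps=1e-6):
    numer = (x * mask).sum(dim=dim)
    denom = mask.sum(dim=dim).clamp(min=eps)
    return numer / denom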
Could you provide requirements.txt? Thank you very much.
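Until an official one is published, a minimal guess based only on the imports that appear in the snippets in these issues (torch, numpy, cv2) would be something like the list below; the authors' actual dependencies and pinned versions are unknown.

# Guessed minimal dependencies; not the authors' actual requirements.txt.
torch
numpy
opencv-python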
ind_norm = ((xyz - self.xyz_min) / (self.xyz_max - self.xyz_min)).flip((-1,)) * 2 - 1
In my opinion, the xyz has already been converted to the world coordinate frame. So if we flip xyz, the function will sample the grid along the wrong axis (x along the z-axis, z along the x-axis).
Is there anything wrong with my idea?
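For what it's worth, my understanding (an assumption, since only this one line is quoted) is that the flip matches F.grid_sample's coordinate convention rather than swapping world axes: for a 5D volume of shape (N, C, D, H, W), the last grid channel is read as (x, y, z), where x indexes W, y indexes H and z indexes D. If the feature volume is stored with its spatial dims ordered (X, Y, Z), the normalized (x, y, z) coordinate therefore has to be reversed to (z, y, x) before sampling. A tiny check of the convention:

import torch
import torch.nn.functional as F

# Toy volume laid out as (N, C, D, H, W) = (1, 1, 2, 3, 4), filled with distinct values.
vol = torch.arange(24, dtype=torch.float32).reshape(1, 1, 2, 3, 4)

# grid_sample reads the last grid channel as (x, y, z): channel 0 indexes W,
# channel 1 indexes H, channel 2 indexes D (all normalized to [-1, 1]).
grid = torch.tensor([-1.0, 1.0, 1.0]).view(1, 1, 1, 1, 3)  # x=-1 -> W=0, y=1 -> H=2, z=1 -> D=1
out = F.grid_sample(vol, grid, align_corners=True)
print(out.item())  # 20.0 == vol[0, 0, 1, 2, 0]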
@GANWANSHUI Hi! Are there supplementary materials available for this paper? There are multiple references for them in the main paper.
btw, thanks for the interesting work.
Hello, and thank you for your great work. I have encountered some problems and hope to discuss them with you. Forgive me, I need some space to explain my problem.
First of all, the depth rendered by your method achieves a lower Abs Rel than traditional depth estimation methods, but the visualized depth map looks worse. So my first question is: what causes this?
To explore this, I ran some experiments on KITTI (I don't have sufficient GPU resources, so I chose a smaller dataset). I expected better results than a traditional self-supervised depth estimation method (like Monodepth2), but the Abs Rel was only close to Monodepth2's, and training took more time. The visualized depth map also still looks worse than Monodepth2's.
Furthermore, I wanted to know whether using the GT occupancy labels can render a better depth map, so I used the semantic GT labels to render some depth maps and got the following results.
This looked strange, and I found it was a problem with the depth visualization function: I needed to pass direct=True, i.e. pred_depth_color = visualize_depth(pred_depth.copy(), direct=True), and then I got a normal result.
So my second question is: does volume rendering produce depth or disparity? Do I need depth or disparity to visualize the depth map? I think visualizing a depth map requires disparity rather than depth, so your method first needs to convert depth to disparity for visualization; this is also what Monodepth2 does.
import cv2
import numpy as np


def visualize_depth(depth, mask=None, depth_min=None, depth_max=None, direct=False):
    """Visualize the depth map with a colormap.

    Rescales the values so that depth_min and depth_max map to 0 and 1,
    respectively. If direct is False, the depth is first inverted into a
    disparity-like map before color mapping.
    """
    if not direct:
        depth = 1.0 / (depth + 1e-6)
    invalid_mask = np.logical_or(np.isnan(depth), np.logical_not(np.isfinite(depth)))
    if mask is not None:
        invalid_mask = np.logical_or(invalid_mask, np.logical_not(mask))
    if depth_min is None:
        depth_min = np.percentile(depth[np.logical_not(invalid_mask)], 5)
    if depth_max is None:
        depth_max = np.percentile(depth[np.logical_not(invalid_mask)], 95)
    depth[depth < depth_min] = depth_min
    depth[depth > depth_max] = depth_max
    depth[invalid_mask] = depth_max
    depth_scaled = (depth - depth_min) / (depth_max - depth_min)
    depth_scaled_uint8 = np.uint8(depth_scaled * 255)
    depth_color = cv2.applyColorMap(depth_scaled_uint8, cv2.COLORMAP_MAGMA)
    depth_color[invalid_mask, :] = 0
    return depth_color
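As a usage note (my reading of the function above, not guidance from the authors): with the default direct=False the values are first inverted into a disparity-like map before color mapping, which is what Monodepth2-style visualizations expect for metric depth, while direct=True color-maps the values as given. A hypothetical call on a rendered metric depth map:

import numpy as np

# pred_depth is assumed here to be an H x W metric depth map (e.g. in meters)
pred_depth = np.random.uniform(1.0, 80.0, size=(192, 640)).astype(np.float32)

vis_disp_style = visualize_depth(pred_depth.copy())               # invert to disparity, then colormap
vis_raw_values = visualize_depth(pred_depth.copy(), direct=True)  # colormap the values as-is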
But what confuses me is that when I use the GT occupancy labels instead of the predicted occupancy probabilities, the visualization is incorrect.
Have I misunderstood anything?
Looking forward to your reply!
I implemented a PyTorch version of the volume rendering code, but the rendering speed during training is too slow (about 20x slower). Looking forward to your open-source code.
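For reference, a fully vectorized compositing step in the style of standard NeRF volume rendering (a minimal sketch under my own assumptions, not the authors' implementation) looks roughly like the following; per-ray Python loops are usually what makes a naive version an order of magnitude slower.

import torch

def render_expected_depth(density, z_vals):
    # density: (N_rays, N_samples) non-negative densities along each ray
    # z_vals:  (N_rays, N_samples) sample depths along each ray, increasing
    dists = z_vals[:, 1:] - z_vals[:, :-1]
    dists = torch.cat([dists, torch.full_like(dists[:, :1], 1e10)], dim=-1)  # pad last interval

    alpha = 1.0 - torch.exp(-density * dists)                    # per-sample opacity
    trans = torch.cumprod(1.0 - alpha + 1e-10, dim=-1)           # accumulated transmittance
    trans = torch.cat([torch.ones_like(trans[:, :1]), trans[:, :-1]], dim=-1)  # shift so T_1 = 1
    weights = alpha * trans                                      # (N_rays, N_samples)

    depth = (weights * z_vals).sum(dim=-1)                       # expected depth per ray
    return depth, weights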
This is interesting work! I'm interested in the visualized results shown in the GIFs, especially the SDF visualization. Is there any plan to release the visualization code?
Could you also share the rendering settings, such as near, far, n_samples, etc.?
I see the point cloud processing file for nuScenes in tools, but it is incomplete.
Could you upload the complete file and the label path referenced in L41?
def get_gt_loss(self, inputs, scale, outputs, disp, depth_gt, mask):
    singel_scale_total_loss = 0

    if self.opt.volume_depth:
        if self.opt.l1_voxel != 'No':
            # densities at sample points supervised towards 1 (occupied)
            density_center = outputs[('density_center', 0)]
            label_true = torch.ones_like(density_center, requires_grad=False)

            # densities at sample points supervised towards 0 (empty)
            all_empty = outputs[('all_empty', 0)]
            label_false = torch.zeros_like(all_empty, requires_grad=False)

            if 'l1' in self.opt.l1_voxel:
                surface_loss_true = F.l1_loss(density_center, label_true, reduction='mean')
                surface_loss_false = F.l1_loss(all_empty, label_false, reduction='mean')
                total_grid_loss = self.opt.empty_w * surface_loss_false + surface_loss_true
            elif 'ce' in self.opt.l1_voxel:
                label = torch.cat((label_true, label_false))
                pred = torch.cat((density_center, all_empty))
                total_grid_loss = self.criterion(pred, label)

            if self.local_rank == 0 and scale == 0:
                print('ce loss:', total_grid_loss)
Hello,
I have read your paper. I was wondering whether you are going to publish the model trained on the nuScenes data? Furthermore, if possible, do you have a time schedule for finalizing and publishing the source code publicly?
Thanks in advance.
Hello,
Thank you for your interesting work! I found your formulation of depth loss with volume rendering really cool.
I was wondering if you have tried combining both of them, or combining depth loss with semantic loss?
Also, what sampling strategy did you use for the volume rendering?
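For context, the most common choice in NeRF-style pipelines is uniform sampling between near and far with per-interval jitter during training (stratified sampling); the sketch below is only that generic strategy, not necessarily what the paper uses.

import torch

def sample_along_rays(near, far, n_rays, n_samples, perturb=True, device='cpu'):
    # evenly spaced depths in [near, far], shared across rays
    t = torch.linspace(0.0, 1.0, n_samples, device=device)
    z_vals = near * (1.0 - t) + far * t
    z_vals = z_vals.expand(n_rays, n_samples).clone()

    if perturb:
        # stratified sampling: jitter each sample inside its own interval
        mids = 0.5 * (z_vals[:, 1:] + z_vals[:, :-1])
        upper = torch.cat([mids, z_vals[:, -1:]], dim=-1)
        lower = torch.cat([z_vals[:, :1], mids], dim=-1)
        z_vals = lower + (upper - lower) * torch.rand_like(z_vals)
    return z_vals  # (n_rays, n_samples)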
Hi,
Thank you for your excellent work!
Can you update your pre-trained model?
It will help a lot!