We find that you perform flipping, rescaling and cropping on the input images. Wh

The question about data augmentation (depth-from-motion, CLOSED)

tai-wang commented on July 20, 2024

from depth-from-motion.

Comments (5)

Tai-Wang commented on July 20, 2024

Correct. (I assume you are asking about the construction of the cost volume.)
That is not needed, because we only use cur2prevs when constructing the cost volume, and we always recover the plane-sweep feature in the canonical (real-world) space.

jichaofeng commented on July 20, 2024

Thanks for all your help. We changed the intrinsic matrix K to account for rescaling, cropping and flipping, applied horizontal flipping to the images, and flipped the 3D grid only when lifting the 2.5D coordinates to 3D, but the performance dropped slightly.
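As an illustration of the intrinsic adjustment described above, here is a minimal sketch in plain Python. It assumes the augmentations are applied in the order rescale, crop, horizontal flip, and that a flip maps pixel column u to (W - 1) - u. The helper names `adjust_projection` and `project`, and the numeric values, are hypothetical. This is not confirmed to be how depth-from-motion handles augmentation.

```python
def adjust_projection(P, scale=1.0, crop_xy=(0.0, 0.0), flip_width=None):
    """Return a copy of a 3x4 projection matrix after rescale -> crop -> flip.

    P          : 3x4 matrix as nested lists.
    scale      : resize factor applied to the image.
    crop_xy    : (x0, y0) top-left corner of the crop, in resized pixels.
    flip_width : width of the final image if flipped horizontally, else None.
    """
    P = [list(row) for row in P]
    # Rescaling multiplies pixel coordinates, so scale the first two rows.
    P[0] = [scale * a for a in P[0]]
    P[1] = [scale * a for a in P[1]]
    # Cropping shifts pixel coordinates: u' = u - x0, v' = v - y0.
    x0, y0 = crop_xy
    P[0] = [a - x0 * c for a, c in zip(P[0], P[2])]
    P[1] = [a - y0 * c for a, c in zip(P[1], P[2])]
    # Horizontal flip (assumed convention): u' = (W - 1) - u.
    if flip_width is not None:
        P[0] = [(flip_width - 1) * c - a for a, c in zip(P[0], P[2])]
    return P


def project(P, xyz):
    """Project a 3D camera-frame point to pixel coordinates (u, v)."""
    x, y, z = xyz
    h = [r[0] * x + r[1] * y + r[2] * z + r[3] for r in P]
    return h[0] / h[2], h[1] / h[2]


# Illustrative values only (not taken from the issue).
P = [[700.0, 0.0, 600.0, 45.0],
     [0.0, 700.0, 180.0, -0.1],
     [0.0, 0.0, 1.0, 0.003]]
point = (2.0, 1.0, 20.0)

u, v = project(P, point)
P_aug = adjust_projection(P, scale=0.5, crop_xy=(10.0, 5.0), flip_width=320)
u_aug, v_aug = project(P_aug, point)

# Apply the same augmentation to the original pixel directly.
u_expected = (320 - 1) - (0.5 * u - 10.0)
v_expected = 0.5 * v - 5.0
```

If the adjusted matrix is consistent with the image augmentation, `(u_aug, v_aug)` matches `(u_expected, v_expected)` up to floating-point error. Note that under this convention the flip changes the sign of the whole first row (relative to the scaled third row), not only of the translation entry.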

Tai-Wang commented on July 20, 2024

Do you mean passing the adjusted intrinsic matrix into the build_dfm_cost function? If yes, could you please show the code concretely? Maybe you can compare the 3D features obtained in these two different ways.
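One simple way to carry out such a comparison is to measure the maximum absolute difference and the cosine similarity between the two feature volumes. The sketch below uses plain Python on flattened features (for example, obtained with `tensor.flatten().tolist()`). The helper name `compare_features` and the sample values are hypothetical.

```python
import math


def compare_features(a, b):
    """Return (max absolute difference, cosine similarity) of two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    max_abs_diff = max(abs(x - y) for x, y in zip(a, b))
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Cosine similarity is undefined if either vector is all zeros.
    cosine = dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else float("nan")
    return max_abs_diff, cosine


# Illustrative values only.
feat_a = [0.10, 0.50, -0.20, 0.00]
feat_b = [0.10, 0.45, -0.20, 0.05]
diff, cos = compare_features(feat_a, feat_b)
```

With PyTorch tensors, the same quantities can be computed with `(a - b).abs().max()` and `torch.nn.functional.cosine_similarity` on the flattened tensors.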

jichaofeng commented on July 20, 2024

Yes. We will try to compare the 3D features obtained in these two different ways. Thank you.

jichaofeng commented on July 20, 2024

We found that there are still some errors in our implementation of data augmentation. Our code is shown below. We perform random flipping, rescaling and cropping on the input images. P2_ori is the adjusted intrinsic matrix and equals [755.64, 0, 638.37, 46.977; 0, 755.64, 76.298, -0.06; 0, 0, 1, 0.0027; 0, 0, 0, 1]. When flipping, P2_ori equals [755.64, 0, 638.37, -46.977; 0, 755.64, 76.298, -0.06; 0, 0, 1, 0.0027; 0, 0, 0, 1].

import torch
import torch.nn.functional as F

# points_img2cam / points_cam2img are projection helpers that are not
# defined in this snippet (presumably the ones from mmdet3d).


def forward(self, cur_features, prev_features, P2_ori, cur2prevs):
    batch_size = cur_features.shape[0]
    num_depths = self.downsampled_depth.shape[-1]
    h_out, w_out = cur_features.shape[-2:]
    ws = torch.linspace(0, w_out - 1, w_out).cuda()
    hs = torch.linspace(0, h_out - 1, h_out).cuda()
    # Plane-sweep grid of (x, y, depth) in feature-map pixel coordinates.
    ds_3d, ys_3d, xs_3d = torch.meshgrid(self.downsampled_depth, hs, ws)
    grid = torch.stack([xs_3d, ys_3d, ds_3d], dim=-1)
    grid = grid[None].repeat(batch_size, 1, 1, 1, 1)
    prev_cost_feats_list = []
    # Scale the projection matrix to feature-map resolution.
    # True division keeps the fractional part of the intrinsics.
    P2 = P2_ori.clone()
    P2[:, 0, :] = P2[:, 0, :] / self.downsample_scale
    P2[:, 1, :] = P2[:, 1, :] / self.downsample_scale
    for idx in range(batch_size):
        # Lift the 2.5D grid (x, y, depth) to 3D points in the current camera frame.
        grid3d = points_img2cam(grid[idx].view(-1, 3), P2[idx][:3])

        pad_ones = grid3d.new_ones(grid3d.shape[0], 1)
        homo_grid3d = torch.cat([grid3d, pad_ones], dim=1)

        # Move the points to the previous frame and project them to its image plane.
        prev_grid3d = (homo_grid3d @ cur2prevs[idx].transpose(0, 1))[:, :3]
        prev_grid = points_cam2img(prev_grid3d, P2[idx])[:, :2]
        prev_grid = prev_grid.view(1, 1, -1, 2)
        # Normalize pixel coordinates to [-1, 1] for grid_sample (align_corners=True).
        prev_grid[..., 0] = prev_grid[..., 0] / (w_out - 1) * 2 - 1
        prev_grid[..., 1] = prev_grid[..., 1] / (h_out - 1) * 2 - 1

        prev_cost_feats = F.grid_sample(
            prev_features[idx:idx + 1],
            prev_grid,
            mode='bilinear',
            padding_mode='zeros',
            align_corners=True)  # (1, C, 1, D*H_out*W_out)
        prev_cost_feats = prev_cost_feats.view(-1, num_depths, h_out, w_out)  # (C, D, H_out, W_out)
        prev_cost_feats_list.append(prev_cost_feats)
    prev_cost_feats_list = torch.stack(prev_cost_feats_list)  # (B, C, D, H_out, W_out)
    cur_cost_feats_list = cur_features.unsqueeze(2).repeat(1, 1, num_depths, 1, 1)
    # Correlation cost: channel-wise mean of the feature product.
    cost = (cur_cost_feats_list * prev_cost_feats_list).mean(dim=1)
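When a projection matrix is downscaled to feature-map resolution, floor division and true division give different values, which shifts every projected grid point. A minimal check in plain Python, using entries of the P2_ori matrix quoted above and assuming a downsample factor of 4 (the actual value of self.downsample_scale is not given in the post):

```python
downsample_scale = 4  # assumed; not stated in the post

# Entries taken from the first two rows of P2_ori.
fx, cx, tx, ty = 755.64, 638.37, 46.977, -0.06

floor_scaled = [value // downsample_scale for value in (fx, cx, tx, ty)]
true_scaled = [value / downsample_scale for value in (fx, cx, tx, ty)]

for name, f, t in zip(("fx", "cx", "tx", "ty"), floor_scaled, true_scaled):
    print(f"{name}: floor={f:.4f} true={t:.4f}")
```

Plain Python floats are used here. Tensor floor division also discards the fractional part, although its exact rounding of negative values may depend on the PyTorch version.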
