
Comments (13)

benjaminum commented on June 15, 2024

The ranges for the ground truth data are basically correct. inf values for the inverse depth should be avoided, and the optical flow can be larger than 1.0 or smaller than -1.0, but in practice this rarely happens.
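For example, a minimal way to avoid inf in the inverse depth is to clamp the depth before inverting; this is just an illustrative sketch (the cutoff value is mine, not from the DeMoN code):

```python
import numpy as np

def safe_inverse_depth(depth, min_depth=1e-3):
    # Clamp depth from below before inverting, so zero depth never
    # produces inf in the inverse-depth map. min_depth is an
    # illustrative cutoff, not a value from the DeMoN code.
    depth = np.asarray(depth, dtype=np.float64)
    return 1.0 / np.maximum(depth, min_depth)
```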

I assume that the values of the bootstrap net look reasonable.

Is this the first iteration of the iterative net?
Can you check what the output of the flow_to_depth operation looks like?
It is called in the depthmotion_block_demon_original function in blocks_original.py:

if data_format == 'channels_first':
    depth_from_flow = sops.flow_to_depth(
        flow=prev_flow2,
        intrinsics=intrinsics,
        rotation=prev_rotation,
        translation=prev_translation,
        normalized_flow=True,
        inverse_depth=True,
    )

from demon.

sampepose commented on June 15, 2024

Yes, the results of the bootstrap net look reasonable. Here are histograms of ground truth and output of bootstrap net:

[screenshots: histograms of ground truth (left) vs. bootstrap net output (right)]

This is the first iteration of the iterative net.

Here's the min/max/avg of the output of flow_to_depth for a few training iterations:

[min][max][avg]
[0][179.47673][0.16061339]
[0][1.0737418e+09][3646125]
[0][903.28778][0.2184152]
[0][209.7283][0.38293144]
[0][1789.4653][12.30909]
[0][180.58949][1.5996453]

I'm not sure why the second iteration has such a high max value.

The output of the iterative net visualized:

[screenshots: ground truth (left) vs. iterative net output (right)]

Depth predictions are in the range [-1e8, 1e8] and normals in the range [-4e4, 8e4].

sampepose commented on June 15, 2024

I printed the inputs to flow_to_depth as well, and they look reasonable.

[name][min][max][avg]

[translation][-0.96468806][0.37828878][-0.27458572]
[rotation][-0.021255214][0.012888308][-0.0051129474]
[flow][-0.06706503][0.13209946][0.0039939629]
[depth from flow][0][33554432][1267849]

[translation][-0.99621952][0.37635651][-0.31681082]
[rotation][-0.020021241][0.043232266][-0.0013534732]
[flow][-0.043860085][0.082544379][0.0022001718]
[depth from flow][0][1.3421773e+08][5803308.5]

sampepose commented on June 15, 2024

The structure of the given flow looks pretty reasonable too.
[screenshot: visualization of the given flow]

The ranges are roughly the same (ground truth vs iterative net prediction):
[screenshot: ranges of ground truth vs. iterative net prediction]

sampepose commented on June 15, 2024

I verified that the op looks fine for your given example:
[screenshot: flow_to_depth output for the given example]

However, it doesn't look fine even with my ground truth flow, rotation, and translation!
[screenshots: flow_to_depth output with ground truth inputs]

  • Ground truth flow is normalized by dividing the x and y components by the image width and height, respectively.
  • Ground truth translation is divided by its norm to make it a unit vector.
  • Ground truth rotation is given as a rotation matrix, so I convert it to a rotation axis with a magnitude equal to the rotation angle.
  • Intrinsics are left at your defaults.
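For reference, the preprocessing steps above can be sketched like this (the function and variable names are mine, not DeMoN's, and I'm assuming a (2, H, W) flow layout):

```python
import numpy as np

def normalize_inputs(flow, translation, R, width, height):
    # Sketch of the preprocessing steps described above.
    # flow is assumed to be (2, H, W): x-flow in channel 0, y-flow in channel 1.
    flow = flow.astype(np.float64).copy()
    flow[0] /= width    # normalize x component by image width
    flow[1] /= height   # normalize y component by image height

    # Translation divided by its norm -> unit vector.
    t_unit = translation / np.linalg.norm(translation)

    # Rotation matrix -> axis-angle vector whose magnitude is the angle.
    angle = np.arccos(np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0))
    axis = np.array([R[2, 1] - R[1, 2],
                     R[0, 2] - R[2, 0],
                     R[1, 0] - R[0, 1]])
    if angle > 1e-8:
        axis = axis / (2.0 * np.sin(angle))
    rotation = angle * axis
    return flow, t_unit, rotation
```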

sampepose commented on June 15, 2024

One issue is that the flow_to_depth op always expects intrinsics to be an (N, 4) tensor. See here: https://github.com/lmb-freiburg/lmbspecialops/blob/master/src/flowtodepth.cc#L361

This works fine with a batch size of 1, as in the example, but it produces unexpected results with larger batch sizes. The code should either be changed to use a single intrinsics vector for all images in the batch (which makes sense) or the validation should check for the correct tensor shape.
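As a sketch of the shape issue: a single intrinsics vector [fx, fy, cx, cy] (the values below are just illustrative normalized numbers, not DeMoN's) has to be tiled to (N, 4) before the call:

```python
import numpy as np

# Hypothetical illustration: the op wants intrinsics as an (N, 4)
# tensor, one [fx, fy, cx, cy] row per batch item, so a single
# 4-vector must be tiled across the batch before the call.
batch_size = 4
intrinsics = np.array([0.89, 1.19, 0.5, 0.5])  # illustrative normalized values
batched = np.tile(intrinsics[np.newaxis, :], (batch_size, 1))  # shape (4, 4)
```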

Nonetheless, after fixing this locally and recompiling (and verifying that I am getting valid intrinsic and fundamental matrices for every image in the batch), the depth maps still look like the ones in my last comment.

sampepose commented on June 15, 2024

Sorry for all of the comments, I just want to make sure you have enough info to help me debug this!

I've visualized the results here. This shows my ground truth images, the flow from image A to image B, the ground truth inverse depth, and the depth from flow_to_depth with both inverse_depth True and False. Note that when inverse_depth=False, the resulting depth map looks OK. When inverse_depth=True, the depth map is nonsense.
[screenshot: the visualization described above]

Here are the histograms for those images:
[screenshot: histograms for those images]

I noticed that some of the depth values from flow_to_depth were < 1.0 (when not using inverse depth). If I change the op to ignore those values when computing inverse depth, the inverse depth looks better. Here are those points and what the inverse depth looks like; I'm not sure why it's those points in particular.

[screenshot: the points with depth < 1.0 and the resulting inverse depth]
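The workaround above (ignoring depth values below 1.0 before inverting) can be sketched as follows; the function name and the way invalid pixels are handled are mine, not the op's:

```python
import numpy as np

def masked_inverse_depth(depth):
    # Only invert pixels whose (non-inverse) depth is >= 1.0; everything
    # else stays 0 in the inverse-depth map. The 1.0 cutoff mirrors the
    # observation above and is not a value from the original op.
    inv = np.zeros_like(depth, dtype=np.float64)
    valid = depth >= 1.0
    inv[valid] = 1.0 / depth[valid]
    return inv, valid
```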

benjaminum commented on June 15, 2024

One issue is that the flow_to_depth op always expects intrinsics to be an (N, 4) tensor. See here: https://github.com/lmb-freiburg/lmbspecialops/blob/master/src/flowtodepth.cc#L361

This works fine with a batch size of 1, as in the example, but it produces unexpected results with larger batch sizes. The code should either be changed to use a single intrinsics vector for all images in the batch (which makes sense) or the validation should check for the correct tensor shape.

Thanks for pointing this out. The documentation of the op is misleading; I will update it and add checks. We want to allow different intrinsics for each batch item, for flexibility.

Your comments definitely help with debugging this. From your last comment it seems that there is something seriously wrong with the op.
Would it be possible to export the inputs and outputs as numpy arrays (.npy) and attach them here? I will try to have a look as soon as possible, but I can't guarantee how quickly this can be resolved.

If you need a workaround fast, you can replace the output of the flow_to_depth op with the previously predicted depth during finetuning.

sampepose commented on June 15, 2024

Thanks, Benjamin. I've attached a zip with my ground truth as .npy.

gt.zip

The main issue is that the range produced by flow_to_depth is far too large to visualize properly. When normalized to [0, 255], almost all of the values end up at 0. Here's a box plot of the values:
[screenshot: box plot of the flow_to_depth output values]

The median is way down at 0.X, but there are a lot of values outside the whiskers. The top whisker is at Q3 + 1.5*IQR.
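One way to get a usable visualization despite the outliers is to clip to percentiles before scaling to [0, 255]; a sketch (the function name and percentile choices are arbitrary, not from any DeMoN code):

```python
import numpy as np

def to_uint8(values, lo_pct=2.0, hi_pct=98.0):
    # Clip to percentiles before scaling to [0, 255] so a handful of
    # huge outliers doesn't flatten the rest of the map to zero.
    # The percentile choices are arbitrary.
    lo, hi = np.percentile(values, [lo_pct, hi_pct])
    clipped = np.clip(values, lo, hi)
    return ((clipped - lo) / max(hi - lo, 1e-12) * 255.0).astype(np.uint8)
```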

benjaminum commented on June 15, 2024

We have now added a flow_to_depth2 op with bugfixes.

The data in gt.zip, however, still looks strange. Does the scale of the depth values correspond to the unit translation?

benjaminum commented on June 15, 2024

You simply scale the depth (not inverse depth) with the same scalar factor as the translation.
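A minimal sketch of that scaling (variable names are mine): dividing the translation by its norm s to get a unit vector means the (non-inverse) depth must be divided by the same s; equivalently, inverse depth is multiplied by s.

```python
import numpy as np

# Normalizing the translation to unit length by its norm s, and
# dividing the (non-inverse) depth by the same factor so the scene
# scale stays consistent with the unit translation.
t = np.array([0.2, -0.1, 0.4])
depth = np.array([[2.0, 4.0],
                  [1.0, 8.0]])

s = np.linalg.norm(t)
t_unit = t / s
depth_scaled = depth / s
```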

sampepose commented on June 15, 2024

Here's the ground truth with the depth properly scaled. The depth in this .zip is inverse depth.

gt.zip
