
Comments (15)

siamiz88 commented on June 12, 2024

Thank you for your quick response.
I look forward to your update :)


holynski commented on June 12, 2024

Very useful and detailed comments, thanks.

It turns out the code that was released is actually not the intended (final) version, but rather a much earlier (and incomplete) iteration. Unfortunately, I no longer have access to the original repository used for development, so I'm stuck re-implementing the missing parts.

This may take me another day or two. Apologies for the trouble.


holynski commented on June 12, 2024

Just committed a new version reimplementing the missing components. The code should now be feature-complete and working. Would you please let me know if everything works for you?


siamiz88 commented on June 12, 2024

The code is running successfully now. Contrary to my expectation, it takes significant time to generate each depth image (roughly one hour per image with 500 solver iterations). Is this normal '-'?

Anyway, thank you for your dedicated reply. I'm really touched.


holynski commented on June 12, 2024

For me, each frame is processed in about 2 minutes. This is running the sample data with default settings on my laptop (a MacBook Pro).

It might be that the first saved image takes longer to appear (since the code doesn't actually save the first X frames, to ensure the depth maps have initialized to a reasonable solution). If you'd like to disable this, you can set the value of skip_frames to zero before the main loop -- but keep in mind that the first few depth maps saved might not be great. A minimal sketch of that change is below.
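
A minimal sketch of that change (assuming the warm-up counter is the skip_frames variable from the released script):

    # Save every frame's depth map, including the earliest ones.
    # Caveat: the first few saved depth maps may look poor, since the
    # solver has not yet converged to a reasonable solution.
    skip_frames = 0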


holynski commented on June 12, 2024

Thanks for pointing this out. I'll look into this now.


wlsh24 commented on June 12, 2024

Hello, I am also trying to run this code and here is my status:

So, I slightly fixed the code and ran it again.
-> Solved the keyframe error.
However, this time the code couldn't compute the reference frame: there is no difference between kf[0].Position() and kf[1].Position(). So I checked the translation info in the dictionary of views and found that they were all the same: [ 0.17589158 -0.09372958 0.2583495 ]

  1. It looks like the translation and orientation are overwritten every time a new view is created. This can be avoided by making them instance attributes assigned in def __init__(self): (see the first sketch after this list).
    Furthermore, I believe the orientation is not set properly: according to http://kieranwynn.github.io/pyquaternion/, the w, x, y, z components should be accessed using [0], [1], [2], [3] and not .w, .x, .y, .z

  2. After correcting this problem, there is also an error when computing the optical flow:

        def GetFlow(image1, image2):
            flow = cv2.calcOpticalFlowFarneback(
                cv2.cvtColor(image1, cv2.COLOR_BGR2GRAY),
                cv2.cvtColor(image2, cv2.COLOR_BGR2GRAY),
                0.5, 3, 100, 100, 7, 1.5, 0)
            a, b, c = cv2.split(image1)
            return cv2.merge((a, b))

    This function doesn't really compute the flow, and I guess it should return flow instead? (See the second sketch after this list.)

  3. Also the following line:
    flow_grad_magnitude[reliability > max_reliability] = magnitude
    is producing an error, since it tries to index a 2D array with a boolean.

  4. Moreover, the reliability is not computed as in the paper; it is just defined as
    reliability = np.zeros((flow.shape[0], flow.shape[1]))
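
To make points 1 and 2 concrete, here are two minimal sketches. The first illustrates the shared-attribute pitfall with a hypothetical View class (not the exact code from the repository): an array defined at class level and mutated in place is shared by every instance, so each new view silently overwrites the pose of all the others.

    import numpy as np

    class View:
        # Shared class-level attribute: every instance sees the same array.
        translation = np.zeros(3)

        def SetTranslation(self, t):
            # In-place mutation changes the one shared array, so all
            # previously created views now report this translation too.
            self.translation[:] = t

    a, b = View(), View()
    a.SetTranslation([0.17589158, -0.09372958, 0.2583495])
    print(b.translation)  # same values, even though b was never set

    class FixedView:
        def __init__(self):
            # Instance attribute: each view owns its own array.
            self.translation = np.zeros(3)

        def SetTranslation(self, t):
            self.translation = np.asarray(t, dtype=float)

The second sketch is a corrected GetFlow, written against the OpenCV 3/4 signature (where the optional flow output is the third argument); the parameter values are copied from the snippet above:

    import cv2

    def GetFlow(image1, image2):
        # Dense Farneback optical flow between two BGR frames.
        flow = cv2.calcOpticalFlowFarneback(
            cv2.cvtColor(image1, cv2.COLOR_BGR2GRAY),
            cv2.cvtColor(image2, cv2.COLOR_BGR2GRAY),
            None, 0.5, 3, 100, 100, 7, 1.5, 0)
        return flow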

I hope this information will help fix the code!


flow-dev commented on June 12, 2024

How are you? The demo at SIGGRAPH Asia 2018 was fantastic. I will be waiting for your re-implementation!!!


flamehaze1115 commented on June 12, 2024

How are you? The demo at SIGGRAPH Asia 2018 was fantastic. I will be waiting for your re-implementation!!!

same!


siamiz88 commented on June 12, 2024

Happy new year~!
Thank you for the update. I will check it and leave some comments.

Could you let me know how to obtain a future frame in a real-time application?
Did you delay some frames to get 'past / current / future' frames?


holynski commented on June 12, 2024

Could you let me know how to obtain a future frame in a real-time application?

So, at the moment, the slow components (in order of slowest to fastest, tested on my laptop) are the following:

  • Solver: For the results presented in the paper, we used hierarchical preconditioning to reduce the number of overall solver iterations required (see the end of section 4.4 of the paper, or the original paper on the solver here). Adding this component is likely what will make the biggest difference.
  • Python loops: You may have noticed from my (less than elegant) Python code that this is my first real foray into Python; I otherwise code entirely in C++. The majority of the image processing ops here are written as C-style loops, which is very inefficient in Python. Replacing these loops with vectorizable operations (like OpenCV's matrix ops) will likely provide a big speedup; see the sketch after this list for an illustration, and for candidates, see the sections of code corresponding to Canny edges, reliability estimation, temporal median, and computing the initialization.
  • Flow: To get even faster results, you can change the settings of the flow estimation to the faster preset. More specifically, change the "2" in the following line to "1" or "0":
    dis = cv2.optflow.createOptFlow_DIS(2)
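
To illustrate the vectorization point above (a hypothetical example, not code from the repository), here is a per-pixel gradient magnitude computed with Python loops versus a vectorized equivalent; the two agree up to border handling:

    import numpy as np
    import cv2

    def grad_magnitude_loop(gray):
        # C-style loops: very slow in Python.
        h, w = gray.shape
        out = np.zeros((h, w), dtype=np.float32)
        for y in range(1, h - 1):
            for x in range(1, w - 1):
                gx = float(gray[y, x + 1]) - float(gray[y, x - 1])
                gy = float(gray[y + 1, x]) - float(gray[y - 1, x])
                out[y, x] = (gx * gx + gy * gy) ** 0.5
        return out

    def grad_magnitude_vectorized(gray):
        # Same central differences via OpenCV's vectorized ops
        # (ksize=1 gives the plain [-1, 0, 1] derivative kernel).
        gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=1)
        gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=1)
        return cv2.magnitude(gx, gy)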

Although -- if you're looking to run a real-time version of this code on a mobile phone, I would strongly suggest first porting this code to C/C++. The steps above should get you most of the way, and depending on your implementation, you may already be at real-time. To improve the speed even further, there are a number of other easy optimizations (mentioned in section 5.2 of the paper) that can be made to drastically reduce runtime. These include:

  • Implementing a real-time solver similar to the one described here.
  • Reducing the resolution at which the flow/reliability/gradient is computed. These are blurred after being computed anyway, so it likely doesn't matter if they're computed at a lower resolution and upsampled.
  • Optimizing/vectorizing the Canny implementation
  • Reducing the number of solver iterations at non-keyframes (places without new depth points), since we're not getting strong information about changes in 3D structure, and are mostly just moving object boundaries slightly.

Did you delay some frames to get 'past / current / future' frames?

In the provided code, the video is read all at once, and then frames are processed sequentially. When each frame is processed, optical flow is computed to both a past frame and a future frame, so yes, there is a slight delay from real-time. If you would like the code to instead ONLY rely on previous frames (i.e. remove the latency), you can do the following:

Remove the second loop from the function GetReferenceFrames() (remove the following code):

        for idx in range(view_id - 1, self.min_view_id, -1):
            if idx not in self.views:
                continue
            if (np.linalg.norm(pos -\
                              self.views[idx].Position()) > dist):
                ref.append(idx)
                break

Please let me know if this makes sense, and if you've managed to get the code working.


mpottinger commented on June 12, 2024

Just a heads up that the latest pyquaternion version has somehow broken this code. I had to downgrade to version 0.9.2 in order for it to work.

The original code seems to take about 2-4 minutes per frame on my Core i5-9600K 3.7 GHz desktop. Getting the first saved images took around an hour.

I was very interested in getting this to work in real time, so I tried my best at optimising it. I was able to get it down to 20 seconds per frame using Numba (a very nice JIT compiler for Python) and changing the way the sparse matrix is populated before the solver; a sketch of that change is below.

The solver part takes 10 seconds for me, and the rest of the code takes the other 10. I am not well versed enough in the math behind this algorithm to use a different solver, so I think I am going to give up on getting it to run in real time. Still, it was a fun exercise trying!
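
For anyone attempting the same optimisation, a minimal sketch of the sparse-matrix idea (hypothetical names and values, not the repository's actual system): collect the (row, col, value) triplets in flat arrays and construct the matrix in one call, instead of appending entries one at a time.

    import numpy as np
    from scipy import sparse

    # Flat triplet arrays; in practice these would be filled by a
    # numba.njit-compiled loop over pixels/constraints.
    rows = np.array([0, 0, 1, 2])
    cols = np.array([0, 2, 1, 2])
    vals = np.array([4.0, -1.0, 4.0, 4.0])

    # One-shot construction is far faster than incrementally growing
    # a lil_matrix entry by entry.
    A = sparse.csr_matrix((vals, (rows, cols)), shape=(3, 3))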

Looking forward to when ARKit/ARCore have this kind of functionality built in; that will be truly amazing!


holynski commented on June 12, 2024

Just a heads up that the latest pyquaternion version has somehow broken this code. I had to downgrade to version 0.9.2 in order for it to work.

Thanks for pointing this out. I went ahead and updated the code to support the newest version. In doing so, I also realized that OpenCV has moved around their implementation of DISOpticalFlow, so I changed that too. You may need to upgrade to OpenCV 4.0 for everything to work. Please let me know if this works for you, so I can close this issue (I should have done this a while ago :-) )

As for timing -- if you really wanted to get a real-time version working with the smallest amount of effort, I would suggest scaling down all of the inputs, and scaling them up at the very end using joint bilateral upsampling (a rough sketch is below). I'd be glad to help if you were interested in implementing this.
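
A rough sketch of that idea (illustrative scale factor and filter parameters; solve_depth is a hypothetical stand-in for the per-frame depth solve, and the joint bilateral filter from opencv-contrib's ximgproc module is used as a stand-in for true joint bilateral upsampling):

    import numpy as np
    import cv2

    def depth_at_low_res(color_full, solve_depth, scale=0.25):
        # Solve for depth on a downscaled input...
        small = cv2.resize(color_full, None, fx=scale, fy=scale,
                           interpolation=cv2.INTER_AREA)
        depth_small = solve_depth(small)  # float32 depth map

        # ...then upsample, using the full-resolution color image as
        # the guide so depth edges snap back to image edges.
        h, w = color_full.shape[:2]
        depth_up = cv2.resize(depth_small, (w, h),
                              interpolation=cv2.INTER_LINEAR)
        joint = color_full.astype(np.float32)  # match src/joint depths
        return cv2.ximgproc.jointBilateralFilter(joint, depth_up,
                                                 -1, 25, 7)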


roxanneluo commented on June 12, 2024

I tested it with pyquaternion 0.9.5 and OpenCV 4.0.0 and it works for me.


holynski commented on June 12, 2024

Great -- closing this issue.

Feel free to open another one if anything else comes up.

