Comments (14)

nlamprian avatar nlamprian commented on July 18, 2024

Hello drhalftone,

Thank you for all your feedback. Soon, I'll start working again on the related projects, and I'll apply the fixes.

At the moment, I'm not able to run any OpenCL code, so I can't do much for you. But let's give it a try:

Please, compile CLUtils and run its tests. Is everything OK there?

from icp.

drhalftone avatar drhalftone commented on July 18, 2024

DrHalftone:build dllau$ ./bin/clutils_vecAdd
Out of Range error: unordered_map::at: key not found (/Users/dllau/SourceTree/CLUtils/src/CLUtils.cpp:420)
DrHalftone:build dllau$ ./bin/clutils_tests
[==========] Running 5 tests from 4 test cases.
[----------] Global test environment set-up.
[----------] 2 tests from CLEnv
[ RUN ] CLEnv.BasicFunctionality
Out of Range error: unordered_map::at: key not found (/Users/dllau/SourceTree/CLUtils/src/CLUtils.cpp:420)

The above shows you what I get when I run the vecAdd as well as the test executables.

from icp.

drhalftone avatar drhalftone commented on July 18, 2024

To follow up on my emails, I just wanted to say that I really want to help debug this code, for two reasons. First, I’ve been doing a lot of OpenGL shading language programming, and now I’m eager to learn OpenCL programming. Second, I have been working for the past year on software for interfacing with RGB+D cameras. The code I have sends the video to my shader code, which outputs XYZW+RGBA as a floating-point buffer. Those buffers are then processed using libpointmatcher, which is a CPU implementation of ICP. I’ve been able to get it to run in real time, but it takes up all my CPU cores to do so. It also doesn’t compile on Windows. So I’m looking for a GPU implementation of ICP that I can incorporate into my RGB+D video processing application.

That all being said, I’m a Qt programmer. So I’m importing your CMake stuff into Qt Creator. I can compile the CLUtils project, and I can edit the source code when a compiler error comes up. I can run the code, but I can’t run the debugger. So when I see an error, I can’t stop the code at the offending line and figure out what’s wrong. Perhaps you could instruct me on how to get my debugger to work with your code inside Qt Creator, and then I can contribute some meaningful fixes.

Dr. Daniel L. Lau, Professor and Certified Professional Engineer
Department of Electrical and Computer Engineering
University of Kentucky
Lexington, KY 40506-0046

office: (859) 257-1787
fax: (859) 257-3092
cell: (859) 312-8047
web: http://www.engr.uky.edu/~dllau

from icp.

nlamprian avatar nlamprian commented on July 18, 2024

It's a problem with the kernel files. Try calling the readSource function and verify that you are able to access the files and the code is loaded properly. Beyond that point, I don't expect any errors since it's straightforward OpenCL workflow. If the problem persists, walk through the CLEnv constructor in CLUtils.cpp and check the intermediate variables.
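A quick way to run the file check suggested above, independent of CLUtils, is to read a kernel file yourself and fail loudly if it is missing or empty. This is only a sketch: `loadKernelSource` is a hypothetical helper written for illustration, not the library's `readSource` function, and the path you pass is whatever your build actually uses.

```cpp
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper (NOT the CLUtils readSource implementation):
// reads a whole kernel file into a string and throws if the file
// cannot be opened or turns out to be empty.
std::string loadKernelSource(const std::string& path)
{
    std::ifstream file(path, std::ios::in | std::ios::binary);
    if (!file)
        throw std::runtime_error("Cannot open kernel file: " + path);

    std::ostringstream buffer;
    buffer << file.rdbuf();  // copy the entire file contents
    std::string source = buffer.str();
    if (source.empty())
        throw std::runtime_error("Kernel file is empty: " + path);
    return source;
}
```

If this throws for a relative path but works for an absolute one, the executable is probably being launched from a different working directory than the kernel paths assume.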

As for a debugger, I used CodeXL, and only for profiling. I think it's available only on Windows/Linux.

OpenCL is the way to go for these kinds of things. Currently, there is too little work done on the CPU (e.g. the Kinect library is by far the biggest load), but the communication with the CPU at every iteration of the ICP limits the performance. There are things I plan to do to make the process more accurate and faster, but the biggest change will come when I add support for OpenCL 2.0.

from icp.

drhalftone avatar drhalftone commented on July 18, 2024

Out of curiosity, I noticed that you posted that the ICP implementation completes one iteration per 1.1 msec for the given point cloud parameters. I’m not really sure I understand the meaning of the cloud size parameters. I will read that paper in greater detail to figure that out. But in the meantime, can you post how long it took to merge just the two frames of RGB+D video that you show on GitHub? In other words, if I just plug my PrimeSense camera into your code and then slowly swing the camera from side to side, about how many frames per second do you think I can maintain with this code?

from icp.

drhalftone avatar drhalftone commented on July 18, 2024

I’ve switched to Xcode, so now I can run a debugger. I’d love to know how to set up Qt Creator to debug CMake projects, so maybe another Qt expert will chime in.

from icp.

drhalftone avatar drhalftone commented on July 18, 2024

Okay, I changed the name of the files to include their absolute paths. So they are being loaded. The issue now is that the CLUtil constructor is extracting the kernel name as “initRand\0” instead of “initRand”. So this line of code:

        unsigned int kIdx = kernelIdx.at (pgIdx).at (std::string (kernel_name));

is throwing an exception when the two strings don’t match.
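To illustrate why that lookup fails, here is a small standalone sketch (not CLUtils code; `lookupThrows` and `makeIndexWithNulKey` are hypothetical names made up for the demo). A `std::string` stores its length explicitly, so a string holding "initRand" plus a NUL character has 9 characters and does not compare equal to the 8-character "initRand". `unordered_map::at` then throws `std::out_of_range`, which matches the "key not found" message in the earlier output.

```cpp
#include <stdexcept>
#include <string>
#include <unordered_map>

// Returns true if looking up `key` with at() throws std::out_of_range.
bool lookupThrows(const std::unordered_map<std::string, unsigned int>& index,
                  const std::string& key)
{
    try { index.at(key); }
    catch (const std::out_of_range&) { return true; }
    return false;
}

// Builds an index whose only key carries a trailing NUL, mimicking the
// "initRand\0" name reported above (assumed scenario for illustration).
std::unordered_map<std::string, unsigned int> makeIndexWithNulKey()
{
    std::unordered_map<std::string, unsigned int> index;
    index[std::string("initRand\0", 9)] = 0;  // length 9: the NUL is stored
    return index;
}
```

With this setup, `lookupThrows(makeIndexWithNulKey(), "initRand")` reports a failed lookup, while the same lookup with the 9-character key succeeds.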

from icp.

nlamprian avatar nlamprian commented on July 18, 2024

The repo's description was written early in the development and I haven't updated it. Some further tests have shown that an ICP iteration takes about 1.3msec. This number alone can be misleading. I'll explain below. The mean time for the ICP to complete is 60msec. I was able to run OCLSLAM (which performs ICP and builds a map) at 10Hz.

In practice, the numbers vary greatly. The time it takes to perform the ICP algorithm depends on the surfaces in the scene, the number of points chosen for each image, how the points were chosen, how accurate the nearest neighbor search is, and what the initial transformation is. Also, from the perspective of the running application, there is a lot of dead time. If it needs 20 ICP iterations to complete, the time it really takes is greater than 20*1.3msec. The same goes for the gaps between consecutive ICP alignments. Let's lump all these delays together and attribute them to system load. So the 1.3msec figure for an ICP iteration actually says nothing about the actual system performance.

The set cardinalities (|F|=|M|) in the description refer to the number of the points used in the ICP calculation, and the representative points (|R|) are related to the RBC data structure.
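The arithmetic behind those figures can be sketched as follows. The three constants are the numbers quoted in this comment; the 20-iteration count is only the hypothetical from the text, not a measured value, and the helper names are made up for illustration.

```cpp
// Figures quoted in the comment above.
constexpr double kIterationMs = 1.3;   // one ICP iteration
constexpr double kMeanIcpMs   = 60.0;  // mean time for a full ICP alignment
constexpr double kSlamRateHz  = 10.0;  // reported OCLSLAM rate

// Time the iterations alone would take, ignoring all dead time.
constexpr double iterationOnlyMs(int iterations)
{
    return iterations * kIterationMs;
}

// Frame rate that a given per-frame time in milliseconds would allow.
constexpr double rateHz(double frameMs)
{
    return 1000.0 / frameMs;
}
```

For the hypothetical 20 iterations, `iterationOnlyMs(20)` is 26 ms, well under the 60 ms mean. Likewise `rateHz(kMeanIcpMs)` is about 16.7 Hz, above the 10 Hz reported for OCLSLAM, which also builds the map. Both gaps are the overhead that the per-iteration number hides.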

from icp.

drhalftone avatar drhalftone commented on July 18, 2024

Okay, it took some work because I’m not familiar with the Standard Template Library, since Qt provides its own versions of these containers. Basically, I had to delete the \0 from the character strings you were getting from the kernel files. I found an example online, so forgive me if it’s not an efficient means of removing the character. Here is what my loop looked like in the constructor of the CLUtil class:

        // Retrieve the kernels from program 0
        kernels.emplace_back ();
        kernelIdx.emplace_back ();
        for (unsigned int idx = 0; idx < kernel_names.size (); ++idx)
        {
            kernel_names[idx].erase(std::remove(kernel_names[idx].begin(), kernel_names[idx].end(), '\0'), kernel_names[idx].end());
            kernels[0].emplace_back (programs[0], kernel_names[idx].c_str ());
            kernelIdx[0][kernel_names[idx]] = idx;
        }
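For reference, the same cleanup can be written as a standalone helper. This is a sketch with hypothetical function names, not code from CLUtils. The erase-remove idiom used above is linear in the string length, so efficiency is not a concern; note that `std::remove` requires `<algorithm>`. If only trailing NULs are expected, a narrower variant avoids touching the rest of the name.

```cpp
#include <algorithm>
#include <string>

// Removes every NUL character (embedded or trailing) from `name` in place,
// using the erase-remove idiom. std::remove is declared in <algorithm>.
void stripNuls(std::string& name)
{
    name.erase(std::remove(name.begin(), name.end(), '\0'), name.end());
}

// Narrower alternative: drops only NUL characters at the end of `name`.
void stripTrailingNuls(std::string& name)
{
    while (!name.empty() && name.back() == '\0')
        name.pop_back();
}
```

Either helper could be called on each kernel name before it is used as a map key, so that a later lookup by the plain name finds it.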

from icp.

drhalftone avatar drhalftone commented on July 18, 2024

Here is my output:

[==========] Running 5 tests from 4 test cases.
[----------] Global test environment set-up.
[----------] 2 tests from CLEnv
[ RUN ] CLEnv.BasicFunctionality
unknown file: Failure
C++ exception with description "clEnqueueNDRangeKernel" thrown in the test body.
[ FAILED ] CLEnv.BasicFunctionality (115 ms)
[ RUN ] CLEnv.AddMoreCLObjects
[ OK ] CLEnv.AddMoreCLObjects (67 ms)
[----------] 2 tests from CLEnv (182 ms total)

[----------] 1 test from ProfilingInfo
[ RUN ] ProfilingInfo.BasicFunctionality
[ OK ] ProfilingInfo.BasicFunctionality (0 ms)
[----------] 1 test from ProfilingInfo (0 ms total)

[----------] 1 test from CPUTimer
[ RUN ] CPUTimer.BasicFunctionality
/Users/dllau/SourceTree/CLUtils/tests/tests.cpp:170: Failure
Expected: (timer.duration () - 100000) <= (1000), actual: 2794.41 vs 1000
[ FAILED ] CPUTimer.BasicFunctionality (103 ms)
[----------] 1 test from CPUTimer (103 ms total)

[----------] 1 test from GPUTimer
[ RUN ] GPUTimer.BasicFunctionality
[ OK ] GPUTimer.BasicFunctionality (11 ms)
[----------] 1 test from GPUTimer (12 ms total)

[----------] Global test environment tear-down
[==========] 5 tests from 4 test cases ran. (297 ms total)
[ PASSED ] 3 tests.
[ FAILED ] 2 tests, listed below:
[ FAILED ] CLEnv.BasicFunctionality
[ FAILED ] CPUTimer.BasicFunctionality

2 FAILED TESTS
Program ended with exit code: 1
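The CPUTimer failure reads as a tolerance check: the test appears to time a 100 ms wait and expects the result to land within 1000 of 100000, presumably in microseconds (an assumption based on the assertion text and the 103 ms test duration). The measured overshoot was 2794.41. How much a sleep overshoots on a given machine can be checked with a small sketch like this one, which uses only the standard library and a hypothetical function name:

```cpp
#include <chrono>
#include <thread>

// Sleeps for `requestedUs` microseconds and returns how far the measured
// wall-clock time overshot the request, in microseconds.
double sleepOvershootUs(long requestedUs)
{
    using clock = std::chrono::steady_clock;
    const auto start = clock::now();
    std::this_thread::sleep_for(std::chrono::microseconds(requestedUs));
    const std::chrono::duration<double, std::micro> elapsed =
        clock::now() - start;
    return elapsed.count() - static_cast<double>(requestedUs);
}
```

`sleep_for` only guarantees a minimum wait, so the overshoot depends on the scheduler and system load, and a fixed tolerance can fail on a busy machine.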

from icp.

drhalftone avatar drhalftone commented on July 18, 2024

Okay, I got CLUtils to pass all tests. All that I had to do was explicitly modify the code to only create GPU devices. Here is what I am now getting for output:

[==========] Running 5 tests from 4 test cases.
[----------] Global test environment set-up.
[----------] 2 tests from CLEnv
[ RUN ] CLEnv.BasicFunctionality
[ OK ] CLEnv.BasicFunctionality (83 ms)
[ RUN ] CLEnv.AddMoreCLObjects
[ OK ] CLEnv.AddMoreCLObjects (81 ms)
[----------] 2 tests from CLEnv (164 ms total)

[----------] 1 test from ProfilingInfo
[ RUN ] ProfilingInfo.BasicFunctionality
[ OK ] ProfilingInfo.BasicFunctionality (0 ms)
[----------] 1 test from ProfilingInfo (0 ms total)

[----------] 1 test from CPUTimer
[ RUN ] CPUTimer.BasicFunctionality
[ OK ] CPUTimer.BasicFunctionality (100 ms)
[----------] 1 test from CPUTimer (100 ms total)

[----------] 1 test from GPUTimer
[ RUN ] GPUTimer.BasicFunctionality
[ OK ] GPUTimer.BasicFunctionality (14 ms)
[----------] 1 test from GPUTimer (14 ms total)

[----------] Global test environment tear-down
[==========] 5 tests from 4 test cases ran. (279 ms total)
[ PASSED ] 5 tests.
Program ended with exit code: 0

from icp.

soulslicer avatar soulslicer commented on July 18, 2024

Hi, was this change pushed to the main branch? Perhaps that is why I am getting my current issue?

from icp.

drhalftone avatar drhalftone commented on July 18, 2024

No, I kept the changes to myself and hoped that the maintainer would make the corrections.

from icp.

soulslicer avatar soulslicer commented on July 18, 2024

Would you be able to send that codebase to my email?

from icp.
