
Comments (40)

JammyZhou avatar JammyZhou commented on May 19, 2024

Thanks for the suggestions. I will have a try with that.

from coriander.

hughperkins avatar hughperkins commented on May 19, 2024

Hmmm. right. not much info.

We can go in two directions:

  • in coriander, we can gradually drill down and find out where it's failing, though it looks like it's failing in the kernel launch call
  • we can also start with a basic opencl application, without using coriander, and gradually make it more complicated, till we reproduce the issue

Let's start with this second approach, since it's more satisfying to run stuff until it breaks than to repeatedly run something that doesn't work :-)

I'm thinking, perhaps you can start by installing EasyCL, build the unit tests, run those, see how far that gets? (If even easycl crashes, we'll go lower :-) )


JammyZhou avatar JammyZhou commented on May 19, 2024

I tried cuda_sample on my Intel machine, and it can work as expected.

OpenCL platform: Intel Gen OCL Driver
OpenCL device: Intel(R) HD Graphics Haswell CRW GT3 Desktop
hostFloats[2] 123
hostFloats[2] 222
hostFloats[2] 444

I used COCL_DUMP_CL=1 to dump the generated OpenCL kernel for cuda_sample on both platforms, and the kernel source is exactly the same as attached. So I'm suspecting that there is some problem in the PowerVR OpenCL driver on ARM64 platform. After adding some debug info in EasyCL, I found that the problem happened when calling clBuildProgram. Maybe it is another bug of the driver, or there is some compatibility issue. Will dig into it later.

0.cl.txt


hughperkins avatar hughperkins commented on May 19, 2024

Please run the easycl unit tests, and let me know what happens.


hughperkins avatar hughperkins commented on May 19, 2024

By the way, you might consider trying COCL_OFFSETS_32BIT=1, https://github.com/hughperkins/coriander/blob/master/doc/advanced_usage.md#cocl_offsets_32bit-for-beignet


JammyZhou avatar JammyZhou commented on May 19, 2024

@hughperkins only testscalars.test1 failed, and all other 62 tests passed. The same results with "COCL_OFFSETS_32BIT=1". And the failing exception seems a little strange.

$ ./easycl_unittests --gtest_filter=testscalars.test1
args: ./easycl_unittests --gtest_filter=testscalars.test1
Note: Google Test filter = testscalars.test1
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from testscalars
[ RUN      ] testscalars.test1
found opencl library
OpenCL platform: PowerVR Rogue
OpenCL device: PowerVR Rogue G6230
-156 -155 -154 -153 -152 
3000 3001 3002 3003 3004 
-2524653 -2524652 -2524651 -2524650 -2524649 
1353523545 1353523546 1353523547 1353523548 1353523549 
1.234 2.234 3.234 4.234 5.234 
unknown file: Failure
C++ exception with description "FAIL 4.2340 != 4.2340" thrown in the test body.
[  FAILED  ] testscalars.test1 (108 ms)
[----------] 1 test from testscalars (109 ms total)
 
[----------] Global test environment tear-down
[==========] 1 test from 1 test case ran. (109 ms total)
[  PASSED  ] 0 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ] testscalars.test1
 
 1 FAILED TEST


JammyZhou avatar JammyZhou commented on May 19, 2024

By the way, Clang/LLVM 3.3 is used by the OCL driver as I know. Will it make a difference? Since Clang/LLVM 4.0.0 is used by Coriander now.


hughperkins avatar hughperkins commented on May 19, 2024

> C++ exception with description "FAIL 4.2340 != 4.2340" thrown in the test body.

This is probably an artifact. If you pull down the latest (last 3 minutes or so) version of EasyCL, it probably passes now?


hughperkins avatar hughperkins commented on May 19, 2024

> By the way, Clang/LLVM 3.3 is used by the OCL driver as I know. Will it make a difference? Since Clang/LLVM 4.0.0 is used by Coriander now.

Maybe. We might need to link statically


hughperkins avatar hughperkins commented on May 19, 2024

That's a good observation actually, about using two versions of llvm. Let's try linking statically.


hughperkins avatar hughperkins commented on May 19, 2024

As for how to link statically... I'm not sure... but it might involve hacking around either in bin/cocl, and/or maybe some of the bash scripts in the coriander cmake directory.


hughperkins avatar hughperkins commented on May 19, 2024

Hmmm, seems it is already statically linked. easy to check: run nm libcocl.so | grep -i llvm. Any lines that start with a bunch of zeros, then T are linked dynamically. But there are no such lines.
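For reading the nm output, a small helper can tally the symbol type letters for the LLVM-related symbols. This is only a sketch: the sample lines below are made up for illustration, not real output from libcocl.so. In nm's usual conventions, a line with an address followed by T is a symbol defined in the library's text section, while a line with no address and the letter U is an undefined symbol that has to be resolved from another library at load time.

```python
from collections import Counter


def classify_nm_symbols(nm_output, needle="llvm"):
    """Count nm type letters for symbols whose name contains `needle`."""
    counts = Counter()
    for line in nm_output.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        fields = stripped.split(None, 2)
        if len(fields) == 3 and len(fields[1]) == 1:
            # "<address> <type> <name>": a symbol with an address
            sym_type, name = fields[1], fields[2]
        elif len(fields) >= 2 and len(fields[0]) == 1:
            # "<type> <name>": no address, e.g. an undefined (U) symbol
            sym_type, name = fields[0], stripped.split(None, 1)[1]
        else:
            continue
        if needle.lower() in name.lower():
            counts[sym_type] += 1
    return counts


# Illustrative sample only; these lines are invented for the example.
sample = """\
0000000000001a30 T _ZN4llvm5Value7exampleEv
                 U _ZN4llvm6Module7exampleEv
0000000000002b10 T cudaMalloc
"""
counts = classify_nm_symbols(sample)
print(dict(counts))
```

To use it on a real library, the output of nm libcocl.so could be captured to a file and passed in as a string.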


hughperkins avatar hughperkins commented on May 19, 2024

Let's turn on some logging. Can you go into the ccmake .. page, for the Coriander build, and turn on COCL_SPAM, then press c, then turn on all the COCL_SPAM_xxxx options, press c a couple of times, then g, then rebuild, and retest, please?


JammyZhou avatar JammyZhou commented on May 19, 2024

Yes, with latest EasyCL, the easycl_unittests can all pass now. With regard to COCL_SPAM_XXXX, I can only see COCL_SPAM option with ccmake .. (no more sub options). Did I miss something?


hughperkins avatar hughperkins commented on May 19, 2024

Maybe you need the latest coriander, from github master branch? Or maybe that's something in my dev branch, I suppose that's possible... as far as the latter, I'll probably be merging current dev to master in the next few hours.


JammyZhou avatar JammyZhou commented on May 19, 2024

With all COCL_SPAM_XXX options enabled, there is still no good clue.

$ sudo ./cuda_sample
OpenCL platform: PowerVR Rogue
OpenCL device: PowerVR Rogue G6230
[MEM] cudaMalloc using cl, size 4096 memory=0x55816c9580 fakePos=128
[LAUNCH] =========================================
[LAUNCH] setKernelArgGpuBuffer offset=0
[LAUNCH] setKernelArgInt32 2
[LAUNCH] setKernelArgFloat 123
[LAUNCH] kernelGo() kernel: _Z8setValuePfif
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Stack dump:
0.	<eof> parser at end of file


hughperkins avatar hughperkins commented on May 19, 2024

'parser at end of file'. Sounds like maybe it is crashing whilst parsing the bytecode?

Let's try sprinkling some couts around a bit. Can you open the file src/hostside_opencl_funcs.cpp, and add some lines like cout << "generateOpencl 1" << endl;, cout << "some text thats meaningful, anything ok" << endl; in the methods generateOpenCL and compileOpenCLKernel, and rebuild, and retest, and try to find out which line is seg faulting?


JammyZhou avatar JammyZhou commented on May 19, 2024

I did that already before. The segfault happened at clBuildProgram.


hughperkins avatar hughperkins commented on May 19, 2024

Oh cool. That's good info. Can you get hold of the program it's trying to build? You can do this by setting the environment variable COCL_DUMP_CL=1, like:

COCL_DUMP_CL=1 ./cuda_sample

The opencl file should appear at /tmp/0.cl

You can then run from this opencl file, by swapping COCL_DUMP_CL=1 with COCL_LOAD_CL=1:

COCL_LOAD_CL=1 ./cuda_sample

What you could then try is, open /tmp/0.cl in a text editor, and comment out everything except the kernel function declaration, and see whether that builds and runs ok. If it doesn't, then report back with that information. If it does run ok, then start to uncomment stuff, and try to find the single line that, if added, triggers the build to fail, and, if commented out, lets the build work ok.
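The uncomment-until-it-breaks procedure above can be sketched as a small loop. This is only an illustration: builds_ok is a hypothetical callback (not part of Coriander or EasyCL) that, in practice, would write the candidate source to /tmp/0.cl and run COCL_LOAD_CL=1 ./cuda_sample. Here it is faked so the sketch runs on its own, and the chunk contents are invented.

```python
def find_first_breaking_chunk(chunks, builds_ok):
    """Re-enable chunks one at a time, in order.

    Returns the index of the first chunk whose addition makes the build
    fail, or None if everything builds. Chunks that break the build are
    never kept, so the search stops at the first failure.
    """
    enabled = []
    for index, chunk in enumerate(chunks):
        candidate = enabled + [chunk]
        if builds_ok("\n".join(candidate)):
            enabled = candidate
        else:
            return index
    return None


def fake_builds_ok(source):
    # Hypothetical stand-in for "write /tmp/0.cl, run the sample, check
    # that it did not crash". It simply rejects one invented call.
    return "suspect_call" not in source


# Each chunk is a whole top-level definition, so any prefix is still
# syntactically complete. The contents are made up for the example.
chunks = [
    "kernel void test();",
    "float helper_a(float x) { return x; }",
    "float helper_b(float x) { return suspect_call(x); }",
    "float helper_c(float x) { return x + 1.0f; }",
]
breaking_index = find_first_breaking_chunk(chunks, fake_builds_ok)
print(breaking_index)
```

Working in whole functions or statements rather than raw lines keeps every intermediate file syntactically valid, so a failure can be attributed to the chunk that was just added.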


JammyZhou avatar JammyZhou commented on May 19, 2024

I dumped the CL kernel before with COCL_DUMP_CL, but didn't use COCL_LOAD_CL, which seems a good method to debug the kernel line by line. I will try it later today, thanks!


JammyZhou avatar JammyZhou commented on May 19, 2024

The bad_alloc exception is still there with only the kernel declaration in /tmp/0.cl. But I'm not quite sure if 0.cl was correctly loaded, because no additional error happened for the loading even if I removed this file.

$ sudo COCL_LOAD_CL=1 ./cuda_sample
OpenCL platform: PowerVR Rogue
OpenCL device: PowerVR Rogue G6230
[MEM] cudaMalloc using cl, size 4096 memory=0x557e4db580 fakePos=128
[LAUNCH] =========================================
[LAUNCH] setKernelArgGpuBuffer offset=0
[LAUNCH] setKernelArgInt32 2
[LAUNCH] setKernelArgFloat 123
[LAUNCH] kernelGo() kernel: _Z8setValuePfif
loading cl sourcecode from /tmp/0.cl
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Stack dump:
0.	<eof> parser at end of file


JammyZhou avatar JammyZhou commented on May 19, 2024

I added some cout in compileOpenCLKernel and verified that 0.cl is loaded successfully. So the problem happens no matter what the kernel source is (even empty).


JammyZhou avatar JammyZhou commented on May 19, 2024

Could the problem be that the CL calling sequence on the Coriander side is not compatible with the CL driver?


hughperkins avatar hughperkins commented on May 19, 2024


JammyZhou avatar JammyZhou commented on May 19, 2024

I did some experiment with testStructs.cl from easycl_unittests. It can run successfully as below.
$ sudo ./easycl_unittests --gtest_filter=testStructs.main

Then I copied testStructs.cl to /tmp/0.cl, and ran cuda_sample with COCL_LOAD_CL=1, but it failed as mentioned before.


hughperkins avatar hughperkins commented on May 19, 2024


JammyZhou avatar JammyZhou commented on May 19, 2024

The problem is still there if 0.cl contains only the line below. I agree with you that the problem may happen before clBuildProgram already.

kernel void test();


hughperkins avatar hughperkins commented on May 19, 2024


JammyZhou avatar JammyZhou commented on May 19, 2024

@hughperkins I'm still suspecting the failure is caused by LLVM version mismatch, although LLVM is statically linked. C++11 from LLVM 4.0 is used by Coriander, while C++11 is not supported by LLVM 3.3 used by the OpenCL driver. Is there some way to disable C++11 in Coriander side for a try?


hughperkins avatar hughperkins commented on May 19, 2024

'disable c++11'. I assume c++11 throughout the code. You can simply remove the -std=c++11 from the CMakeLists.txt, and the bin/cocl/bin/cocl.py script, but I am not sure that will compile. You'll probably need to modify a bunch of things.

What do you mean by 'c++11 is not supported by llvm 3.3'? Code that works on c++98 will run ok using the c++11 runtime. I'm like 98% sure of this point, though you might want to search for some evidence for/against.

The generated OpenCL uses neither c++11 nor c++98: it is written in OpenCL C, which is based on C99. You can check this point by looking at the generated OpenCL, as earlier.


JammyZhou avatar JammyZhou commented on May 19, 2024

Maybe my description was not accurate. Currently Coriander is built by GCC 6.3/LLVM 4.0 with c++11 enabled, while the OpenCL driver is built by GCC 4.9/LLVM 3.3 with c++11 disabled.


hughperkins avatar hughperkins commented on May 19, 2024

So, here's how I imagine it in my head:

https://docs.google.com/presentation/d/1ZLMkQkYXb89hhPuWmKjjiPqSSQ7ER9oeSSKB_MN7pgQ/edit?usp=sharing


Whilst everything links to c++11 runtime library, that library should be backwards compatible. In what way do you feel that libcocl.so itself using or not using c++11 functions will influence the behavior of gpu_driver.so ?


hughperkins avatar hughperkins commented on May 19, 2024

(I suspect the original thesis of linking with two llvm versions is possible though. If I was in your position, I might try making coriander work with llvm 3.3, but that sounds ooollllldddddd. I don't know if things that Coriander uses are supported or not. I have no idea whether that's a lot of work or not).

(after pondering a bit: another way forward could be to rename all the llvm symbols, to have some prefix, eg llvm40_. Since llvm can be built from source, this might not be as hard as it sounds. Though it does sound quite hard :-P . But .... maybe it can be done after compilation?)

[after googling a bit] Ok, I googled a bit, for things like 'linux linker rename symbols', and didn't find a solution, but found some interesting bread-crumbs:

I kind of like this idea: https://stackoverflow.com/questions/7204836/renaming-symbols-at-compile-time-without-changing-the-code-in-a-cross-platform-w/7215980#7215980

I know that llvm has a ton of symbols, but you can list them all out, after building the libraries, using nm, and then use that to generate, via some kind of a script, a huge list of #defines, in a .h file, and pass that to -include, or similar, at llvm compile time. So, you can rename the symbols from eg Instruction to llvm40_Instruction (slightly complicated by the C++ name mangling... not sure how big an issue that will be). Then, use the same define file for Coriander and ... hopefully all will work ok???
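The "generate a huge list of #defines" step could be scripted roughly as follows. This is a sketch under assumptions, not a tested recipe: it takes a list of plain identifier names (getting from mangled nm output to that list is the part complicated by name mangling, and is not attempted here), and make_rename_header and its prefix argument are names invented for this example.

```python
import re

# Matches a plain C/C++ identifier; anything else cannot be a macro name.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def make_rename_header(identifiers, prefix="llvm40_"):
    """Build the text of a header that renames each identifier via #define."""
    lines = ["#pragma once"]
    for name in sorted(set(identifiers)):
        if not IDENTIFIER.fullmatch(name):
            continue  # skip operators, templates, qualified names, etc.
        lines.append(f"#ifndef {name}")
        lines.append(f"#define {name} {prefix}{name}")
        lines.append("#endif")
    return "\n".join(lines) + "\n"


# Example input; in practice the names would come from the nm listing.
header = make_rename_header(["Value", "Instruction", "Value", "operator+"])
print(header)
```

The resulting text would be written to the .h file that is passed to -include. Because the preprocessor replaces every matching token, common names could also be rewritten in places that have nothing to do with llvm, which is one reason to treat this as an experiment.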


hughperkins avatar hughperkins commented on May 19, 2024

(Note that renaming the llvm symbols could be generally useful; Oclgrind seems to encounter similar llvm version clashes, plausibly:

)


hughperkins avatar hughperkins commented on May 19, 2024

(another link: JuliaLang/julia#12644 )


hughperkins avatar hughperkins commented on May 19, 2024

(note: just to be clear: you don't need to rebuild clang/clang++, or any of the clang/llvm tools, just the llvm libraries, ie the ones that have names like libLLVMCore.a, libLLVMSupport.a, ...)


hughperkins avatar hughperkins commented on May 19, 2024

Seems this might work :-) . I had a dabble at renaming llvm::Value, a particularly common llvm symbol to llvm::llvm40_Value. Here are nm dumps before and after

These are dumps of the libLLVMCore.a library, with the names unmangled, and grepping for ::Value. On Mac, to display these names, I did:

nm -A lib/libLLVMCore.a  | c++filt  | grep '::Value' | tee ~/Documents/value_after.txt

On linux, you'd probably use nm -C, instead of | c++filt. The -A is not necessary, I just included it by accident, whilst trying to figure out how to unmangle the names.

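To rename the values, I did the following:

  • created a header, defines.h, in the root of the llvm source tree (the file that the -include flag below points at), containing: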

#pragma once
#ifndef Value
#define Value llvm40_Value
#endif
  • added the following, at the start of CMakeLists.txt, at line 22:
set(CMAKE_CXX_FLAGS "-include ${CMAKE_SOURCE_DIR}/defines.h")
  • did the whole mkdir build; cd build; ccmake .. thing:
    • set LLVM_TARGETS_TO_BUILD to X86, so it doesnt build the whole world
    • deselected LLVM_BUILD_TOOLS, LLVM_BUILD_UTILS
    • (everything else was by default, I think)
  • ran:
make -j 8 LLVMCore


hughperkins avatar hughperkins commented on May 19, 2024

whoa, check this out :-P . Rather than spending hours renaming all of the eight thousand one hundred and forty seven symbols that libcocl.so uses, I simply renamed the namespace from llvm to llvm40 :-D https://gist.github.com/hughperkins/124e75193921681aa423460701102b5e

Did the whole set of symbols in one go :-)

The defines.h for this was:

#pragma once

#ifndef Value
#define Value llvm40_Value
#endif

#ifndef llvm
#define llvm llvm40
#endif


hughperkins avatar hughperkins commented on May 19, 2024

(Can I leave you to try this, and see how it goes? By the way, you might be able to simply add, in the CMakeLists.txt:

add_definitions(-Dllvm=llvm40)

)


hughperkins avatar hughperkins commented on May 19, 2024

Cool. Looking forward to knowing what happens on this :-) . Hoping this might solve a bunch of similar-ish problems.

