hsafoundation / gccbrig
HSAIL (BRIG) frontend for gcc
License: GNU General Public License v2.0
Compiling the following snippet: http://pastebin.com/X2tkqnFN
produces the following BRIG (shown as HSAIL text) with the GCC HSA BE:
prog function &foo(arg_s32 %res)(
arg_s32 %a,
arg_s32 %b)
{
ld_arg_align(4)_s32 $s0, [%a];
@BB_2362_2:
sub_s32 $s0, $s0, 1;
cvt_u64_s32 $d0, $s0;
sbr_u64 $d0 [@BB_2362_3, @BB_2362_3, @BB_2362_4, @BB_2362_4, @BB_2362_5, @BB_2362_5, @BB_2362_6, @BB_2362_6, @BB_2362_6, @BB_2362_6, @BB_2362_6];
br @BB_2362_7;
@BB_2362_3:
mov_s32 $s1, 1;
mov_s32 $s0, $s1;
br @BB_2362_8;
@BB_2362_4:
mov_s32 $s1, 2;
mov_s32 $s0, $s1;
br @BB_2362_8;
@BB_2362_5:
mov_s32 $s1, 3;
mov_s32 $s0, $s1;
br @BB_2362_8;
@BB_2362_6:
mov_s32 $s0, 4;
mov_s32 $s0, $s0;
br @BB_2362_8;
@BB_2362_7:
mov_s32 $s1, -1;
mov_s32 $s0, $s1;
@BB_2362_8:
st_arg_align(4)_s32 $s0, [%res];
ret;
};
The expected output for a == 0 is -1.
However, running gccbrig produces:
http://pastebin.com/u3MBXQSD
and the resulting binary prints:
$ ./a.out | head
foo: 0 = 1
foo: 1 = 1
foo: 2 = 1
foo: 3 = 2
foo: 4 = 2
foo: 5 = 3
foo: 6 = 3
foo: 7 = 4
foo: 8 = 4
foo: 9 = 4
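For reference, here is a minimal C model of what the HSAIL above computes. It is a sketch, not the pastebin source: the name foo_model is made up, and it assumes that an out-of-range sbr index falls through to the following br, which is what the expected result of -1 for a == 0 implies.

```c
#include <stdint.h>

/* Hypothetical C model of the HSAIL listing; not the original test source. */
static int32_t foo_model(int32_t a)
{
    /* One entry per sbr label: BB_3 -> 1, BB_4 -> 2, BB_5 -> 3, BB_6 -> 4. */
    static const int32_t table[11] = {1, 1, 2, 2, 3, 3, 4, 4, 4, 4, 4};

    /* sub_s32 followed by cvt_u64_s32: a negative index sign-extends to a
       huge unsigned value. The subtraction is done in 64 bits here only to
       avoid signed overflow in C; the index is out of range either way. */
    uint64_t index = (uint64_t)((int64_t)a - 1);

    if (index < 11)
        return table[index];

    /* Assumed fall-through path: br @BB_2362_7 stores -1. */
    return -1;
}
```

Under that assumption the model returns -1 for a == 0 and agrees with the gccbrig output for a = 1 through 9, so only the out-of-range case differs.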
Thanks,
Martin
Compiling the following with the GCC HSA BE:
http://pastebin.com/v6DVryXw
generates this HSAIL:
prog function &foo(align(4) arg_u8 %res[12])(align(4) arg_u8 %c[12])
{
align(4) private_u8 %__private_0[12];
align(4) private_u8 %__hsa_anonymous_2357[12];
ld_arg_align(8)_u64 $d0, [%c];
st_private_align(8)_u64 $d0, [%__private_0];
ld_arg_align(4)_u32 $s0, [%c][8];
st_private_align(4)_u32 $s0, [%__private_0][8];
@BB_2354_2:
st_private_align(4)_s32 11, [%__private_0];
ld_private_align(8)_u64 $d0, [%__private_0];
st_private_align(8)_u64 $d0, [%__hsa_anonymous_2357];
ld_private_align(4)_u32 $s0, [%__private_0][8];
st_private_align(4)_u32 $s0, [%__hsa_anonymous_2357][8];
@BB_2354_3:
ld_private_align(8)_u64 $d0, [%__hsa_anonymous_2357];
st_arg_align(8)_u64 $d0, [%res];
ld_private_align(4)_u32 $s0, [%__hsa_anonymous_2357][8];
st_arg_align(4)_u32 $s0, [%res][8];
ret;
};
This produces the following ICE:
http://pastebin.com/WdJYZ9Ne
Thanks,
Martin
Hello, I noticed that the hsafoundation site's database connection is failing. Is HSAFoundation still an active effort?
Hello.
Compiling http://pastebin.com/00S64WAs:
$ gcc -O2 -g -fdump-tree-hsagen -fno-tree-vectorize -fopenmp switch-3.c
$ ./a.out
dlopen() error: /tmp/phsa-finalizer-4gw71X/ec73e258e6d9.so: undefined symbol: CSWTCH_12
symbol 'main__omp_fn_0' got address 0!
Aborted (core dumped)
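The undefined symbol looks like one of the lookup tables that GCC's switch conversion pass creates (they appear as CSWTCH.N in GCC dumps). That reading is an assumption, and the code below is a hypothetical illustration in C, not the pastebin source: a switch of the kind that can be turned into a table, next to a hand-written table form. If the table is emitted as a separate data symbol, the finalized kernel needs a definition for it at load time.

```c
/* Hypothetical input: a dense switch that maps small integers to constants. */
static int classify(int x)
{
    switch (x) {
    case 0: return 10;
    case 1: return 20;
    case 2: return 30;
    case 3: return 40;
    default: return -1;
    }
}

/* Hand-written equivalent of the table form; the array plays the role of
   the compiler-generated lookup table. */
static const int cswtch_like_table[4] = {10, 20, 30, 40};

static int classify_table(int x)
{
    if ((unsigned)x < 4u)
        return cswtch_like_table[x];
    return -1;
}
```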
Thanks,
Martin
Running: http://pastebin.com/2L4bHmEv
with -O0 and the HSA BE produces the following memory error (it shows up as a double free; valgrind reports an invalid write past the end of a heap block):
==20984== Thread 2:
==20984== Invalid write of size 8
==20984== at 0xDC92C4B: _foo__omp_fn_0 (in /tmp/phsa-finalizer-cck4Oq/964ed0b53097.so)
==20984== by 0xE0B5953: phsa_execute_work_groups (workitems.c:486)
==20984== by 0xE0B5A45: __phsa_launch_wg_function (workitems.c:565)
==20984== by 0xDC92E58: foo__omp_fn_0 (in /tmp/phsa-finalizer-cck4Oq/964ed0b53097.so)
==20984== by 0x5EEED2A: CpuAgent::DoWork() (cpu_agent.cpp:155)
==20984== by 0x5EEE85F: CpuKernelExecutorThread(CpuAgent*) (cpu_agent.cpp:68)
==20984== by 0x5EF7E0C: void boost::_bi::list1<boost::_bi::value<CpuAgent*> >::operator()<void (*)(CpuAgent*), boost::_bi::list0>(boost::_bi::type<void>, void (*&)(CpuAgent*), boost::_bi::list0&, int) (bind.hpp:259)
==20984== by 0x5EF7AA2: boost::_bi::bind_t<void, void (*)(CpuAgent*), boost::_bi::list1<boost::_bi::value<CpuAgent*> > >::operator()() (bind.hpp:1222)
==20984== by 0x5EF76CB: boost::detail::thread_data<boost::_bi::bind_t<void, void (*)(CpuAgent*), boost::_bi::list1<boost::_bi::value<CpuAgent*> > > >::run() (thread.hpp:116)
==20984== by 0x6D4F724: ??? (in /usr/lib64/libboost_thread.so.1.60.0)
==20984== by 0x506D4A3: start_thread (in /lib64/libpthread-2.22.so)
==20984== by 0x536CDEC: clone (in /lib64/libc-2.22.so)
==20984== Address 0x8a0f408 is 8 bytes after a block of size 768 alloc'd
==20984== at 0x4C2C5A6: memalign (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==20984== by 0x4C2C6B1: posix_memalign (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==20984== by 0xE0B5877: phsa_execute_work_groups (workitems.c:463)
==20984== by 0xE0B5A45: __phsa_launch_wg_function (workitems.c:565)
==20984== by 0xDC92E58: foo__omp_fn_0 (in /tmp/phsa-finalizer-cck4Oq/964ed0b53097.so)
==20984== by 0x5EEED2A: CpuAgent::DoWork() (cpu_agent.cpp:155)
==20984== by 0x5EEE85F: CpuKernelExecutorThread(CpuAgent*) (cpu_agent.cpp:68)
==20984== by 0x5EF7E0C: void boost::_bi::list1<boost::_bi::value<CpuAgent*> >::operator()<void (*)(CpuAgent*), boost::_bi::list0>(boost::_bi::type<void>, void (*&)(CpuAgent*), boost::_bi::list0&, int) (bind.hpp:259)
==20984== by 0x5EF7AA2: boost::_bi::bind_t<void, void (*)(CpuAgent*), boost::_bi::list1<boost::_bi::value<CpuAgent*> > >::operator()() (bind.hpp:1222)
==20984== by 0x5EF76CB: boost::detail::thread_data<boost::_bi::bind_t<void, void (*)(CpuAgent*), boost::_bi::list1<boost::_bi::value<CpuAgent*> > > >::run() (thread.hpp:116)
==20984== by 0x6D4F724: ??? (in /usr/lib64/libboost_thread.so.1.60.0)
==20984== by 0x506D4A3: start_thread (in /lib64/libpthread-2.22.so)
Thanks
Martin
Running gccbrig on the attached BRIG file produces:
$ /Programming/bin/gccbrig/bin/gcc z.brig
In function ‘_main__omp_fn_0’:
brig1: internal compiler error: in build_address_operand, at brig/brigfrontend/brig-code-entry-handler.cc:750
0x585915 brig_code_entry_handler::build_address_operand(BrigInstBase const&, BrigOperandAddress const&)
../../gcc/brig/brigfrontend/brig-code-entry-handler.cc:750
0x585ed5 brig_code_entry_handler::build_operands(BrigInstBase const&)
../../gcc/brig/brigfrontend/brig-code-entry-handler.cc:1881
0x591341 brig_mem_inst_handler::operator()(BrigBase const*)
../../gcc/brig/brigfrontend/brig-mem-inst-handler.cc:172
0x595f8c brig_to_generic::parse(char const*)
../../gcc/brig/brigfrontend/brig_to_generic.cc:275
0x57f5d4 brig_langhook_parse_file
../../gcc/brig/brig-lang.c:216
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.
HSAIL source (please assemble it to BRIG first):
http://pastebin.com/gk8aNn5k
Martin
Compiling the following snippet with the GCC HSA BE:
http://pastebin.com/x93CvJDM
produces:
prog kernel &main__omp_fn_0(kernarg_u64 %_omp_data_i)
{
ld_kernarg_align(8)_u64 $d0, [%_omp_data_i];
@BB_3045_2:
ld_align(8)_u64 $d0, [$d0];
st_align(4)_u8 9, 0, 0, 0, [$d0+8];
ret;
};
When running gccbrig, I see the following incorrect store of zero (instead of 9):
__wi_loop_x:
d0_28 = MEM[(unsigned long *)__args_27(D)];
_29 = VIEW_CONVERT_EXPR<unsigned char *>(d0_28);
d0_30 = MEM[(unsigned long *)_29];
_31 = d0_30 + 8;
_32 = VIEW_CONVERT_EXPR<unsigned char *>(_31);
*_32 = 0;
__local_x_34 = __local_x_1 + 1;
if (__cur_wg_size_x_11 > __local_x_34)
goto <bb 5> (__wi_loop_x);
else
goto <bb 6>;
Thanks,
Martin
BRIG is little-endian, and gccbrig currently assumes that everything is little-endian. If the host that runs the finalizer is big-endian, there are probably issues that need to be resolved case by case.
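As an illustration of what host-independent handling could look like, here is a small C sketch that decodes a little-endian 32-bit value from raw bytes using shifts, so the result does not depend on the host byte order. The helper name read_le_u32 is hypothetical and is not part of gccbrig.

```c
#include <stdint.h>

/* Hypothetical helper: decode a little-endian u32 from raw bytes.
   Shifts operate on values, not memory layout, so this gives the same
   result on little-endian and big-endian hosts. */
static uint32_t read_le_u32(const unsigned char *bytes)
{
    return (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}
```

For the byte sequence 9, 0, 0, 0 this yields 9, and truncating that to 8 bits keeps the 9.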
A test case containing:
alloca_align(8)_u32 $s1, $s1;
produces the following ICE:
In function ‘main__omp_fn_0’:
brig1: internal compiler error: mem inst opcode 104 not implemented
0x590318 brig_mem_inst_handler::build_mem_access(BrigInstBase const*, tree_node*, tree_node*)
../../gcc/brig/brigfrontend/brig-mem-inst-handler.cc:40
0x5903aa brig_mem_inst_handler::operator()(BrigBase const*)
../../gcc/brig/brigfrontend/brig-mem-inst-handler.cc:155
0x593fec brig_to_generic::write_globals()
../../gcc/brig/brigfrontend/brig_to_generic.cc:625
0x581a9d brig_langhook_write_globals
../../gcc/brig/brig-lang.c:349
Thanks,
Martin
Building http://pastebin.com/99Cb9C9H:
$ gcc -fopenmp switch-3.c -O2
$ ./a.out
produces the following segfault (valgrind output):
==11791== For counts of detected and suppressed errors, rerun with: -v
==11791== ERROR SUMMARY: 9 errors from 9 contexts (suppressed: 0 from 0)
==11788== Thread 2:
==11788== Invalid read of size 1
==11788== at 0x9C9CE64: _main__omp_fn_0 (in /tmp/phsa-finalizer-rV0Pks/8959f5e78937.so)
==11788== by 0xD0BF953: phsa_execute_work_groups (workitems.c:486)
==11788== by 0xD0BFA45: __phsa_launch_wg_function (workitems.c:565)
==11788== by 0x9C9CF20: main__omp_fn_0 (in /tmp/phsa-finalizer-rV0Pks/8959f5e78937.so)
==11788== by 0x4EF870C: CpuAgent::DoWork() (cpu_agent.cpp:155)
==11788== by 0x4EF8241: CpuKernelExecutorThread(CpuAgent*) (cpu_agent.cpp:68)
==11788== by 0x4F01A80: void boost::_bi::list1<boost::_bi::value<CpuAgent*> >::operator()<void (*)(CpuAgent*), boost::_bi::list0>(boost::_bi::type<void>, void (*&)(CpuAgent*), boost::_bi::list0&, int) (bind.hpp:259)
==11788== by 0x4F01716: boost::_bi::bind_t<void, void (*)(CpuAgent*), boost::_bi::list1<boost::_bi::value<CpuAgent*> > >::operator()() (bind.hpp:1222)
==11788== by 0x4F0133F: boost::detail::thread_data<boost::_bi::bind_t<void, void (*)(CpuAgent*), boost::_bi::list1<boost::_bi::value<CpuAgent*> > > >::run() (thread.hpp:116)
==11788== by 0x6751724: ??? (in /usr/lib64/libboost_thread.so.1.60.0)
==11788== by 0x55AE4A3: start_thread (in /lib64/libpthread-2.22.so)
==11788== by 0x58ADDEC: clone (in /lib64/libc-2.22.so)
==11788== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==11788==
==11788==
==11788== Process terminating with default action of signal 11 (SIGSEGV)
==11788== Access not within mapped region at address 0x0
==11788== at 0x9C9CE64: _main__omp_fn_0 (in /tmp/phsa-finalizer-rV0Pks/8959f5e78937.so)
==11788== by 0xD0BF953: phsa_execute_work_groups (workitems.c:486)
==11788== by 0xD0BFA45: __phsa_launch_wg_function (workitems.c:565)
==11788== by 0x9C9CF20: main__omp_fn_0 (in /tmp/phsa-finalizer-rV0Pks/8959f5e78937.so)
==11788== by 0x4EF870C: CpuAgent::DoWork() (cpu_agent.cpp:155)
==11788== by 0x4EF8241: CpuKernelExecutorThread(CpuAgent*) (cpu_agent.cpp:68)
==11788== by 0x4F01A80: void boost::_bi::list1<boost::_bi::value<CpuAgent*> >::operator()<void (*)(CpuAgent*), boost::_bi::list0>(boost::_bi::type<void>, void (*&)(CpuAgent*), boost::_bi::list0&, int) (bind.hpp:259)
==11788== by 0x4F01716: boost::_bi::bind_t<void, void (*)(CpuAgent*), boost::_bi::list1<boost::_bi::value<CpuAgent*> > >::operator()() (bind.hpp:1222)
==11788== by 0x4F0133F: boost::detail::thread_data<boost::_bi::bind_t<void, void (*)(CpuAgent*), boost::_bi::list1<boost::_bi::value<CpuAgent*> > > >::run() (thread.hpp:116)
==11788== by 0x6751724: ??? (in /usr/lib64/libboost_thread.so.1.60.0)
==11788== by 0x55AE4A3: start_thread (in /lib64/libpthread-2.22.so)
==11788== by 0x58ADDEC: clone (in /lib64/libc-2.22.so)
==11788== If you believe this happened as a result of a stack
==11788== overflow in your program's main thread (unlikely but
==11788== possible), you can try to increase the size of the
==11788== main thread stack using the --main-stacksize= flag.
==11788== The main thread stack size used in this run was 8388608.
When I try to build gccbrig with make,
it fails with a memory error:
virtual memory exhausted: Cannot allocate memory
virtual memory exhausted: Cannot allocate memory
Checking with "top" shows:
GiB Mem: 3.247 total, 3.165 used, 0.081 free, 0.005 buffe
GiB Swap: 7.629 total, 7.571 used, 0.058 free. 0.044 cache
Does compiling it really need that much memory,
or is there something wrong with my configuration?
I found that some tests show illegal code motion at higher optimization levels: a 16-bit load is moved across a byte store that touches part of the same word.
This is likely a GCC codegen bug, but it could also be related to type-based alias analysis or something similar, because gccbrig heavily casts pointers to different types depending on the width of the memory access at hand.
;; ************ Correct code optimized with -O1:
leaq 128(%rbp,%rax), %rdx
;; byte 1 stored here
movb %r8b, (%rdx)
movzbl 1(%rsi), %r8d
leaq 129(%rbp,%rax), %rsi
;; byte 2 stored here
movb %r8b, (%rsi)
movq 8(%rbx), %rax
;; The 16b load that accesses both bytes
movswl (%rdx), %r8d
;; ************ Broken code optimized with -O2:
leaq 128(%rbp,%rax), %rdi
movq %rcx, %rdx
addq (%rbx), %rdx
movzbl (%rdx), %esi
;; byte 1 stored here
movb %sil, (%rdi)
movzbl 1(%rdx), %esi
;; Illegal hoist above the store below!
movswl (%rdi), %edx
;; this modifies the 2nd byte:
movb %sil, 129(%rax,%rbp)
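The access pattern in the listings can be written in C as below. This is a sketch under assumptions, not code from gccbrig: the function names are made up, and the 16-bit load is expressed with memcpy, which is one well-defined way to read a wider value from a byte buffer without going through a cast pointer. Whether the frontend's pointer casts are actually what triggers the hoist is not established here.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: a 16-bit load done through memcpy, which is valid
   for any byte buffer regardless of the type used to write it. */
static int16_t load_s16(const unsigned char *p)
{
    int16_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Mirrors the sequence in the assembly: two byte stores, then one 16-bit
   load that covers both bytes. The load must observe both stores. */
static int32_t store_bytes_then_load(unsigned char *dst,
                                     const unsigned char *src)
{
    dst[0] = src[0];
    dst[1] = src[1];
    return load_s16(dst); /* sign-extended to 32 bits, like movswl */
}
```

If the problem comes from alias analysis, building the generated code with GCC's -fno-strict-aliasing option would be one way to check.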