Comments (4)
Yes, it seems like a fundamental issue in roctracer (i.e. outside of the scope of omnitrace). I’ll pass on the bug report and see if it can get patched.
from omnitrace.
Could you provide a backtrace? There is usually one printed out when you hit Ctrl+C.
I suspect there is something funny going on in roctracer, which delivers callbacks to omnitrace about the HIP calls. Can you try disabling roctracer support and see if it still hangs? Could you also try running it with rocprof and seeing if it still hangs?
Below is the full output of the program after hitting Ctrl+C.
Unfortunately, I don't have time to investigate the other things right now; I will get back to it on Monday.
$ omnitrace-sample -- ./program.x
HSA_TOOLS_LIB=/pfs/lustrep1/users/homolaja/Apps/omnitrace/lib/libomnitrace-dl.so.1.10.2
HSA_TOOLS_REPORT_LOAD_FAILURE=1
LD_PRELOAD=/pfs/lustrep1/users/homolaja/Apps/omnitrace/lib/libomnitrace-dl.so.1.10.2
OMNITRACE_CRITICAL_TRACE=false
OMNITRACE_USE_PROCESS_SAMPLING=false
OMNITRACE_USE_SAMPLING=true
OMP_TOOL_LIBRARIES=/pfs/lustrep1/users/homolaja/Apps/omnitrace/lib/libomnitrace-dl.so.1.10.2
ROCP_HSA_INTERCEPT=1
ROCP_TOOL_LIB=/pfs/lustrep1/users/homolaja/Apps/omnitrace/lib/libomnitrace.so.1.10.2
[omnitrace][dl][11382] omnitrace_main
[omnitrace][11382][omnitrace_init_tooling] Instrumentation mode: Sampling
______ .___ ___. .__ __. __ .___________..______ ___ ______ _______
/ __ \ | \/ | | \ | | | | | || _ \ / \ / || ____|
| | | | | \ / | | \| | | | `---| |----`| |_) | / ^ \ | ,----'| |__
| | | | | |\/| | | . ` | | | | | | / / /_\ \ | | | __|
| `--' | | | | | | |\ | | | | | | |\ \----./ _____ \ | `----.| |____
\______/ |__| |__| |__| \__| |__| |__| | _| `._____/__/ \__\ \______||_______|
omnitrace v1.10.2 (rev: 0b751d2aef7d32d8b4fab184d0b34d4013b6d986, tag: v1.10.2, compiler: GNU v7.5.0, rocm: v5.2.x)
[761.763] perfetto.cc:58656 Configured tracing session 1, #sources:1, duration:0 ms, #buffers:1, total buffer size:1024000 KB, total sessions:1, uid:0 session name: ""
AAA
BBB
CCC
I am dummy kernel 1
DDD
EEE
^C
[omnitrace][11382][0] Signal 2 caught : Interrupt (Signal sent by the kernel 0 0)
### ERROR ### [omnitrace][PID=11382][TID=0] signal=2 (SIGINT) interrupt program. code: 128
Backtrace:
[PID=11382][TID=0][0/7] __restore_rt
[PID=11382][TID=0][1/7] hsa_amd_image_get_info_max_dim +0x5e9
[PID=11382][TID=0][2/7] hsa_amd_image_get_info_max_dim +0x4ba
[PID=11382][TID=0][7/7] main +0xfa
[PID=11382][TID=0][8/7] omnitrace_main +0x3bd
[PID=11382][TID=0][9/7] __libc_start_main +0xef
[PID=11382][TID=0][10/7] _start +0x2a
Backtrace (demangled):
[PID=11382][TID=0][0/11] /lib64/libpthread.so.0(+0x168c0) [0x1531bc0268c0]
[PID=11382][TID=0][1/11] /opt/rocm/lib/libhsa-runtime64.so.1(+0x4ce49) [0x1531b26b2e49]
[PID=11382][TID=0][2/11] /opt/rocm/lib/libhsa-runtime64.so.1(+0x4cd1a) [0x1531b26b2d1a]
[PID=11382][TID=0][3/11] /opt/rocm/lib/libhsa-runtime64.so.1(+0x40ce9) [0x1531b26a6ce9]
[PID=11382][TID=0][4/11] /opt/rocm/lib/libamdhip64.so.5(+0x27f2cb) [0x1531bae0d2cb]
[PID=11382][TID=0][5/11] /opt/rocm/lib/libamdhip64.so.5(+0x26e6ea) [0x1531badfc6ea]
[PID=11382][TID=0][6/11] /opt/rocm/lib/libamdhip64.so.5(hipDeviceSynchronize+0xe9) [0x1531bac13e09]
[PID=11382][TID=0][7/11] ./program.x() [0x20cd6a]
[PID=11382][TID=0][8/11] /pfs/lustrep1/users/homolaja/Apps/omnitrace/lib/libomnitrace-dl.so.1.10.2(+0x14f5d) [0x1531bc66af5d]
[PID=11382][TID=0][9/11] /lib64/libc.so.6(__libc_start_main+0xef) [0x1531ba3ab29d]
[PID=11382][TID=0][10/11] ./program.x() [0x20cb5a]
/proc/11382/maps:
00200000-0020c000 r--p 00000000 a67:eeb30 144119912674099405 /pfs/lustrep1/users/homolaja/tests/hip_host_function/program.x
0020c000-0020e000 r-xp 0000b000 a67:eeb30 144119912674099405 /pfs/lustrep1/users/homolaja/tests/hip_host_function/program.x
0020e000-0020f000 r--p 0000c000 a67:eeb30 144119912674099405 /pfs/lustrep1/users/homolaja/tests/hip_host_function/program.x
0020f000-00210000 rwxp 0000c000 a67:eeb30 144119912674099405 /pfs/lustrep1/users/homolaja/tests/hip_host_function/program.x
00210000-021c0000 rw-p 00000000 00:00 0 [heap]
152906710000-152914000000 rw-p 00000000 00:00 0
152914000000-152914043000 rw-p 00000000 00:00 0
152914043000-152918000000 ---p 00000000 00:00 0
15291bc00000-15291fc2f000 rw-p 00000000 00:00 0
15291fe00000-152923e2f000 rw-p 00000000 00:00 0
152924000000-152924021000 rw-p 00000000 00:00 0
152924021000-152928000000 ---p 00000000 00:00 0
15292b200000-152a2b200000 ---p 00000000 00:00 0
152a2b400000-152b2b400000 ---p 00000000 00:00 0
152b2b600000-152c2b600000 ---p 00000000 00:00 0
152c2b800000-152d2b800000 ---p 00000000 00:00 0
152d2ba00000-152e2ba00000 ---p 00000000 00:00 0
152e2bc00000-152f2bc00000 ---p 00000000 00:00 0
152f2be00000-15302be00000 ---p 00000000 00:00 0
15302c000000-15302c021000 rw-p 00000000 00:00 0
15302c021000-153030000000 ---p 00000000 00:00 0
153032800000-1530335d5000 rw-p 00000000 00:00 0
153033600000-153035900000 ---p 10a186000 00:05 1606 /dev/dri/renderD128
153035900000-153133600000 ---p 00000000 00:00 0
153138000000-153138021000 rw-p 00000000 00:00 0
153138021000-15313c000000 ---p 00000000 00:00 0
15313c5ff000-15313c600000 ---p 00000000 00:00 0
15313c600000-15313c800000 rwxp 00000000 00:00 0
15313c800000-15313ca00000 rw-s 10004c000 00:05 1649 /dev/dri/renderD135
15313cc00000-15313cd01000 rw-p 00000000 00:00 0
15313ce00000-15313cf01000 rw-p 00000000 00:00 0
15313d000000-15313d200000 rw-s 10004c000 00:05 1643 /dev/dri/renderD134
15313d400000-15313d501000 rw-p 00000000 00:00 0
15313d600000-15313d701000 rw-p 00000000 00:00 0
15313d7fe000-15313d7ff000 ---p 00000000 00:00 0
15313d7ff000-15317bfff000 rw-p 00000000 00:00 0
15317bfff000-15317c000000 ---p 00000000 00:00 0
15317c000000-15317c021000 rw-p 00000000 00:00 0
15317c021000-153180000000 ---p 00000000 00:00 0
153180000000-153180021000 rw-p 00000000 00:00 0
153180021000-153184000000 ---p 00000000 00:00 0
153184200000-153184400000 rw-s 10004c000 00:05 1637 /dev/dri/renderD133
153184600000-153184701000 rw-p 00000000 00:00 0
153184800000-153184901000 rw-p 00000000 00:00 0
153184a00000-153184c00000 rw-s 10004c000 00:05 1631 /dev/dri/renderD132
153184e00000-153184f01000 rw-p 00000000 00:00 0
153185000000-153185101000 rw-p 00000000 00:00 0
153185200000-153185400000 rw-s 10004c000 00:05 1625 /dev/dri/renderD131
153185600000-153185701000 rw-p 00000000 00:00 0
153185800000-153185901000 rw-p 00000000 00:00 0
153185a00000-153185c00000 rw-s 10004c000 00:05 1619 /dev/dri/renderD130
153185e00000-153185f01000 rw-p 00000000 00:00 0
153186000000-153186101000 rw-p 00000000 00:00 0
153186200000-153186400000 rw-s 10004c000 00:05 1613 /dev/dri/renderD129
153186600000-153186701000 rw-p 00000000 00:00 0
153186800000-153186901000 rw-p 00000000 00:00 0
153186a00000-153186c00000 rw-s 100257000 00:05 1606 /dev/dri/renderD128
153186d00000-153186d80000 rw-p 00000000 00:00 0
153186e00000-153186f01000 rw-p 00000000 00:00 0
153187000000-153187101000 rw-p 00000000 00:00 0
1531871a7000-1531871a8000 ---p 00000000 00:00 0
1531871a8000-1531873a8000 rwxp 00000000 00:00 0
1531873a8000-1531873eb000 r-xp 00000000 07:01 7774 /opt/rocm-5.2.3/lib/libhsa-amd-aqlprofile64.so.1.0.50203
1531873eb000-1531875eb000 ---p 00043000 07:01 7774 /opt/rocm-5.2.3/lib/libhsa-amd-aqlprofile64.so.1.0.50203
1531875eb000-1531875ee000 r--p 00043000 07:01 7774 /opt/rocm-5.2.3/lib/libhsa-amd-aqlprofile64.so.1.0.50203
1531875ee000-1531875fb000 rw-p 00046000 07:01 7774 /opt/rocm-5.2.3/lib/libhsa-amd-aqlprofile64.so.1.0.50203
1531875fb000-1531875fc000 ---p 00000000 00:00 0
1531875fc000-1531877fc000 rwxp 00000000 00:00 0
1531877fc000-1531877fd000 ---p 00000000 00:00 0
1531877fd000-1531879fd000 rwxp 00000000 00:00 0
1531879fd000-1531879fe000 ---p 00000000 00:00 0
1531879fe000-153187dfe000 rw-p 00000000 00:00 0
153187dfe000-153187dff000 ---p 00000000 00:00 0
153187dff000-153187e00000 ---p 00000000 00:00 0
153187e00000-153188000000 rwxp 00000000 00:00 0
153188000000-153188021000 rw-p 00000000 00:00 0
153188021000-15318c000000 ---p 00000000 00:00 0
15318c000000-15318c021000 rw-p 00000000 00:00 0
15318c021000-153190000000 ---p 00000000 00:00 0
153190000000-15319002b000 rw-p 00000000 00:00 0
15319002b000-153194000000 ---p 00000000 00:00 0
15319406b000-15319406c000 ---p 00000000 00:00 0
15319406c000-15319426c000 rwxp 00000000 00:00 0
15319426c000-15319426d000 ---p 00000000 00:00 0
15319426d000-15319446d000 rwxp 00000000 00:00 0
15319446d000-1531959c0000 rw-p 00000000 00:00 0
153195a1c000-153195a1d000 ---p 00000000 00:00 0
153195a1d000-153195a60000 rwxp 00000000 00:00 0
153195a60000-153195a64000 rw-p 00000000 00:00 0
153195a68000-153195a6c000 rw-s 1093a9000 00:05 1606 /dev/dri/renderD128
153195a70000-153195a79000 rw-p 00000000 00:00 0
153195a80000-153195ac0000 rw-p 00000000 00:00 0
153195ac6000-153195ac8000 rw-p 00000000 00:00 0
153195aca000-153195acb000 ---p 00000000 00:00 0
153195acc000-153195acd000 rw-p 00000000 00:00 0
153195ace000-153195acf000 rw-p 00000000 00:00 0
153195ad0000-153195ad9000 rw-s 10938f000 00:05 1606 /dev/dri/renderD128
153195ada000-153195adb000 rw-p 00000000 00:00 0
153195adc000-153195ade000 rw-p 00000000 00:00 0
153195ae0000-153195ae1000 rw-p 00000000 00:00 0
153195ae2000-153195ae3000 ---p 1052de000 00:05 1606 /dev/dri/renderD128
153195ae4000-153195ae5000 rw-p 00000000 00:00 0
153195ae6000-153195ae7000 rw-p 00000000 00:00 0
153195ae8000-153195ae9000 rw-p 00000000 00:00 0
153195aea000-153195aec000 rw-s fd31c00000000000 00:05 1603 /dev/kfd
153195aee000-153195aef000 ---p 101269000 00:05 1606 /dev/dri/renderD128
153195af0000-153195af1000 rw-p 00000000 00:00 0
153195af2000-153195af3000 rw-p 00000000 00:00 0
153195af4000-153195af5000 rw-p 00000000 00:00 0
153195af6000-153195af7000 rw-p 00000000 00:00 0
153195af8000-153195af9000 rw-p 00000000 00:00 0
153195afa000-153195afb000 rw-p 00000000 00:00 0
153195afc000-153195afd000 rw-p 00000000 00:00 0
153195afe000-153195aff000 rw-p 00000000 00:00 0
153195b00000-153195b08000 rw-s 100044000 00:05 1606 /dev/dri/renderD128
153195b0a000-153195b0b000 rw-p 00000000 00:00 0
153195b0c000-153195b0d000 ---p 00000000 00:00 0
153195b0d000-153195d0d000 rwxp 00000000 00:00 0
153195d0d000-153195d0e000 ---p 00000000 00:00 0
153195d0e000-153195f0e000 rwxp 00000000 00:00 0
153195f0e000-1531ae593000 rw-p 00000000 00:00 0
1531ae593000-1531ae617000 r-xp 00000000 07:01 7818 /opt/rocm-5.2.3/lib/librocm_smi64.so.5.0.50203
1531ae617000-1531ae817000 ---p 00084000 07:01 7818 /opt/rocm-5.2.3/lib/librocm_smi64.so.5.0.50203
1531ae817000-1531ae81a000 rwxp 00084000 07:01 7818 /opt/rocm-5.2.3/lib/librocm_smi64.so.5.0.50203
1531ae81a000-1531ae81b000 rwxp 00000000 07:02 22662 /opt/cray/pe/papi/6.0.0.17/lib/libpfm.so.4.12.1
1531ae81b000-1531aea0b000 r-xp 00001000 07:02 22662 /opt/cray/pe/papi/6.0.0.17/lib/libpfm.so.4.12.1
1531aea0b000-1531aec0a000 ---p 001f1000 07:02 22662 /opt/cray/pe/papi/6.0.0.17/lib/libpfm.so.4.12.1
1531aec0a000-1531aecb2000 r--p 001f0000 07:02 22662 /opt/cray/pe/papi/6.0.0.17/lib/libpfm.so.4.12.1
1531aecb2000-1531aecb3000 rwxp 00298000 07:02 22662 /opt/cray/pe/papi/6.0.0.17/lib/libpfm.so.4.12.1
1531aecb3000-1531aed14000 rw-p 00299000 07:02 22662 /opt/cray/pe/papi/6.0.0.17/lib/libpfm.so.4.12.1
1531aed14000-1531aed16000 rw-p 00000000 00:00 0
1531aed16000-1531aed58000 r-xp 00000000 07:01 7821 /opt/rocm-5.2.3/lib/librocprofiler64.so.1.0.50203
1531aed58000-1531aef58000 ---p 00042000 07:01 7821 /opt/rocm-5.2.3/lib/librocprofiler64.so.1.0.50203
1531aef58000-1531aef59000 r--p 00042000 07:01 7821 /opt/rocm-5.2.3/lib/librocprofiler64.so.1.0.50203
1531aef59000-1531aef5a000 rwxp 00043000 07:01 7821 /opt/rocm-5.2.3/lib/librocprofiler64.so.1.0.50203
1531aef5a000-1531aef96000 r-xp 00000000 07:01 7833 /opt/rocm-5.2.3/lib/libroctracer64.so.1.0.50203
1531aef96000-1531af196000 ---p 0003c000 07:01 7833 /opt/rocm-5.2.3/lib/libroctracer64.so.1.0.50203
1531af196000-1531af197000 r--p 0003c000 07:01 7833 /opt/rocm-5.2.3/lib/libroctracer64.so.1.0.50203
1531af197000-1531af198000 rwxp 0003d000 07:01 7833 /opt/rocm-5.2.3/lib/libroctracer64.so.1.0.50203
1531af198000-1531af199000 rw-p 00000000 00:00 0
1531af199000-1531b12d8000 r-xp 00000000 a67:eeb30 144119826271531182 /pfs/lustrep1/users/homolaja/Apps/omnitrace/lib/libomnitrace.so.1.10.2
1531b12d8000-1531b1320000 r--p 0213e000 a67:eeb30 144119826271531182 /pfs/lustrep1/users/homolaja/Apps/omnitrace/lib/libomnitrace.so.1.10.2
1531b1320000-1531b1321000 rwxp 02186000 a67:eeb30 144119826271531182 /pfs/lustrep1/users/homolaja/Apps/omnitrace/lib/libomnitrace.so.1.10.2
1531b1321000-1531b132f000 rwxp 02187000 a67:eeb30 144119826271531182 /pfs/lustrep1/users/homolaja/Apps/omnitrace/lib/libomnitrace.so.1.10.2
1531b132f000-1531b13a0000 rw-p 02195000 a67:eeb30 144119826271531182 /pfs/lustrep1/users/homolaja/Apps/omnitrace/lib/libomnitrace.so.1.10.2
1531b13a0000-1531b19de000 rw-p 00000000 00:00 0
1531b19de000-1531b19e7000 r-xp 00000000 00:2d 853 /usr/lib64/libdrm_amdgpu.so.1.0.0
1531b19e7000-1531b1be6000 ---p 00009000 00:2d 853 /usr/lib64/libdrm_amdgpu.so.1.0.0
1531b1be6000-1531b1be7000 r--p 00008000 00:2d 853 /usr/lib64/libdrm_amdgpu.so.1.0.0
1531b1be7000-1531b1be8000 rwxp 00009000 00:2d 853 /usr/lib64/libdrm_amdgpu.so.1.0.0
1531b1be8000-1531b1bfb000 r-xp 00000000 00:2d 861 /usr/lib64/libdrm.so.2.4.0
1531b1bfb000-1531b1dfa000 ---p 00013000 00:2d 861 /usr/lib64/libdrm.so.2.4.0
1531b1dfa000-1531b1dfb000 r--p 00012000 00:2d 861 /usr/lib64/libdrm.so.2.4.0
1531b1dfb000-1531b1dfc000 rwxp 00013000 00:2d 861 /usr/lib64/libdrm.so.2.4.0
1531b1dfc000-1531b1e13000 r-xp 00000000 00:2d 883 /usr/lib64/libelf-0.185.so
1531b1e13000-1531b2013000 ---p 00017000 00:2d 883 /usr/lib64/libelf-0.185.so
1531b2013000-1531b2014000 r--p 00017000 00:2d 883 /usr/lib64/libelf-0.185.so
1531b2014000-1531b2015000 rwxp 00018000 00:2d 883 /usr/lib64/libelf-0.185.so
1531b2015000-1531b203b000 r-xp 00000000 00:2d 57 /lib64/libtinfo.so.6.1
1531b203b000-1531b223a000 ---p 00026000 00:2d 57 /lib64/libtinfo.so.6.1
1531b223a000-1531b223b000 r--p 00025000 00:2d 57 /lib64/libtinfo.so.6.1
1531b223b000-1531b223c000 rwxp 00026000 00:2d 57 /lib64/libtinfo.so.6.1
1531b223c000-1531b2243000 rw-p 00027000 00:2d 57 /lib64/libtinfo.so.6.1
1531b2243000-1531b2259000 r-xp 00000000 00:2d 65 /lib64/libz.so.1.2.11
1531b2259000-1531b2458000 ---p 00016000 00:2d 65 /lib64/libz.so.1.2.11
1531b2458000-1531b2459000 rwxp 00015000 00:2d 65 /lib64/libz.so.1.2.11
1531b2459000-1531b245a000 rw-p 00016000 00:2d 65 /lib64/libz.so.1.2.11
1531b245a000-1531b2465000 r-xp 00000000 00:2d 1192 /usr/lib64/libnuma.so.1.0.0
1531b2465000-1531b2664000 ---p 0000b000 00:2d 1192 /usr/lib64/libnuma.so.1.0.0
1531b2664000-1531b2665000 r--p 0000a000 00:2d 1192 /usr/lib64/libnuma.so.1.0.0
1531b2665000-1531b2666000 rwxp 0000b000 00:2d 1192 /usr/lib64/libnuma.so.1.0.0
1531b2666000-1531b278c000 r-xp 00000000 07:01 7777 /opt/rocm-5.2.3/lib/libhsa-runtime64.so.1.5.50203
1531b278c000-1531b298b000 ---p 00126000 07:01 7777 /opt/rocm-5.2.3/lib/libhsa-runtime64.so.1.5.50203
1531b298b000-1531b2993000 r--p 00125000 07:01 7777 /opt/rocm-5.2.3/lib/libhsa-runtime64.so.1.5.50203
1531b2993000-1531b2995000 rwxp 0012d000 07:01 7777 /opt/rocm-5.2.3/lib/libhsa-runtime64.so.1.5.50203
1531b2995000-1531b2aad000 rw-p 0012f000 07:01 7777 /opt/rocm-5.2.3/lib/libhsa-runtime64.so.1.5.50203
1531b2aad000-1531b2ab0000 rw-p 00000000 00:00 0
1531b2ab0000-1531b9a35000 r-xp 00000000 07:01 7748 /opt/rocm-5.2.3/lib/libamd_comgr.so.2.4.50203
1531b9a35000-1531b9c35000 ---p 06f85000 07:01 7748 /opt/rocm-5.2.3/lib/libamd_comgr.so.2.4.50203
1531b9c35000-1531ba101000 r--p 06f85000 07:01 7748 /opt/rocm-5.2.3/lib/libamd_comgr.so.2.4.50203
1531ba101000-1531ba103000 rwxp 07451000 07:01 7748 /opt/rocm-5.2.3/lib/libamd_comgr.so.2.4.50203
1531ba103000-1531ba10f000 rw-p 07453000 07:01 7748 /opt/rocm-5.2.3/lib/libamd_comgr.so.2.4.50203
1531ba10f000-1531ba172000 rw-p 00000000 00:00 0
1531ba172000-1531ba175000 r-xp 00000000 00:2d 32 /lib64/libdl-2.31.so
1531ba175000-1531ba374000 ---p 00003000 00:2d 32 /lib64/libdl-2.31.so
1531ba374000-1531ba375000 rwxp 00002000 00:2d 32 /lib64/libdl-2.31.so
1531ba375000-1531ba376000 rw-p 00003000 00:2d 32 /lib64/libdl-2.31.so
1531ba376000-1531ba55c000 r-xp 00000000 00:2d 30 /lib64/libc-2.31.so
1531ba55c000-1531ba75c000 ---p 001e6000 00:2d 30 /lib64/libc-2.31.so
1531ba75c000-1531ba75e000 r--p 001e6000 00:2d 30 /lib64/libc-2.31.so
1531ba75e000-1531ba75f000 rwxp 001e8000 00:2d 30 /lib64/libc-2.31.so
1531ba75f000-1531ba767000 rw-p 001e9000 00:2d 30 /lib64/libc-2.31.so
1531ba767000-1531ba76b000 rw-p 00000000 00:00 0
1531ba76b000-1531ba97e000 r-xp 00000000 00:2d 1335 /usr/lib64/libstdc++.so.6.0.30
1531ba97e000-1531bab7d000 ---p 00213000 00:2d 1335 /usr/lib64/libstdc++.so.6.0.30
1531bab7d000-1531bab88000 r--p 00212000 00:2d 1335 /usr/lib64/libstdc++.so.6.0.30
1531bab88000-1531bab8b000 rwxp 0021d000 00:2d 1335 /usr/lib64/libstdc++.so.6.0.30
1531bab8b000-1531bab8e000 rwxp 00000000 00:00 0
1531bab8e000-1531bab8f000 rwxp 00000000 07:01 7751 /opt/rocm-5.2.3/lib/libamdhip64.so.5.2.50203
1531bab8f000-1531baf33000 r-xp 00001000 07:01 7751 /opt/rocm-5.2.3/lib/libamdhip64.so.5.2.50203
1531baf33000-1531bb133000 ---p 003a5000 07:01 7751 /opt/rocm-5.2.3/lib/libamdhip64.so.5.2.50203
1531bb133000-1531bb139000 r--p 003a5000 07:01 7751 /opt/rocm-5.2.3/lib/libamdhip64.so.5.2.50203
1531bb139000-1531bb13b000 rwxp 003ab000 07:01 7751 /opt/rocm-5.2.3/lib/libamdhip64.so.5.2.50203
1531bb13b000-1531bbaab000 rw-p 003ad000 07:01 7751 /opt/rocm-5.2.3/lib/libamdhip64.so.5.2.50203
1531bbaab000-1531bbabc000 rw-p 00000000 00:00 0
1531bbabc000-1531bbac4000 r-xp 00000000 00:2d 52 /lib64/librt-2.31.so
1531bbac4000-1531bbcc3000 ---p 00008000 00:2d 52 /lib64/librt-2.31.so
1531bbcc3000-1531bbcc4000 rwxp 00007000 00:2d 52 /lib64/librt-2.31.so
1531bbcc4000-1531bbcc5000 rw-p 00008000 00:2d 52 /lib64/librt-2.31.so
1531bbcc5000-1531bbe0d000 r-xp 00000000 00:2d 34 /lib64/libm-2.31.so
1531bbe0d000-1531bc00d000 ---p 00148000 00:2d 34 /lib64/libm-2.31.so
1531bc00d000-1531bc00e000 rwxp 00148000 00:2d 34 /lib64/libm-2.31.so
1531bc00e000-1531bc010000 rw-p 00149000 00:2d 34 /lib64/libm-2.31.so
1531bc010000-1531bc02e000 r-xp 00000000 00:2d 48 /lib64/libpthread-2.31.so
1531bc02e000-1531bc22d000 ---p 0001e000 00:2d 48 /lib64/libpthread-2.31.so
1531bc22d000-1531bc22e000 rwxp 0001d000 00:2d 48 /lib64/libpthread-2.31.so
1531bc22e000-1531bc22f000 rw-p 0001e000 00:2d 48 /lib64/libpthread-2.31.so
1531bc22f000-1531bc233000 rw-p 00000000 00:00 0
1531bc233000-1531bc250000 r-xp 00000000 00:2d 489 /lib64/libgcc_s.so.1
1531bc250000-1531bc450000 ---p 0001d000 00:2d 489 /lib64/libgcc_s.so.1
1531bc450000-1531bc451000 r--p 0001d000 00:2d 489 /lib64/libgcc_s.so.1
1531bc451000-1531bc452000 rwxp 0001e000 00:2d 489 /lib64/libgcc_s.so.1
1531bc452000-1531bc47c000 r-xp 00000000 00:2d 17 /lib64/ld-2.31.so
1531bc47e000-1531bc47f000 rw-p 00000000 00:00 0
1531bc480000-1531bc481000 rw-p 00000000 00:00 0
1531bc482000-1531bc483000 rw-p 00000000 00:00 0
1531bc484000-1531bc485000 rw-p 00000000 00:00 0
1531bc486000-1531bc487000 rw-s 21c3800000000000 00:05 1603 /dev/kfd
1531bc488000-1531bc489000 rw-s 2b84c00000000000 00:05 1603 /dev/kfd
1531bc48a000-1531bc48b000 rw-s 3c52800000000000 00:05 1603 /dev/kfd
1531bc48c000-1531bc48d000 rw-s 2617c00000000000 00:05 1603 /dev/kfd
1531bc48e000-1531bc5d8000 rw-p 00000000 00:00 0
1531bc5d8000-1531bc5e6000 r-xp 00000000 a67:eeb30 144119826271531190 /pfs/lustrep1/users/homolaja/Apps/omnitrace/lib/omnitrace/libunwind.so.99.0.0
1531bc5e6000-1531bc5e7000 rwxp 0000e000 a67:eeb30 144119826271531190 /pfs/lustrep1/users/homolaja/Apps/omnitrace/lib/omnitrace/libunwind.so.99.0.0
1531bc5e7000-1531bc5e8000 rw-p 0000f000 a67:eeb30 144119826271531190 /pfs/lustrep1/users/homolaja/Apps/omnitrace/lib/omnitrace/libunwind.so.99.0.0
1531bc5e8000-1531bc5f2000 rw-p 00000000 00:00 0
1531bc5f2000-1531bc5fb000 r-xp 00000000 a67:eeb30 144119826271531238 /pfs/lustrep1/users/homolaja/Apps/omnitrace/lib/omnitrace/libgotcha.so.2.1.0
1531bc5fb000-1531bc5fc000 rwxp 00009000 a67:eeb30 144119826271531238 /pfs/lustrep1/users/homolaja/Apps/omnitrace/lib/omnitrace/libgotcha.so.2.1.0
1531bc5fc000-1531bc5fd000 rw-p 0000a000 a67:eeb30 144119826271531238 /pfs/lustrep1/users/homolaja/Apps/omnitrace/lib/omnitrace/libgotcha.so.2.1.0
1531bc5fd000-1531bc632000 r-xp 00000000 00:2d 177 /usr/lib64/libudev.so.1.7.2
1531bc632000-1531bc633000 rwxp 00034000 00:2d 177 /usr/lib64/libudev.so.1.7.2
1531bc633000-1531bc634000 rw-p 00035000 00:2d 177 /usr/lib64/libudev.so.1.7.2
1531bc634000-1531bc63f000 rw-p 00000000 00:00 0
1531bc640000-1531bc641000 rw-s 38ed800000000000 00:05 1603 /dev/kfd
1531bc642000-1531bc643000 rw-s 22a6c00000000000 00:05 1603 /dev/kfd
1531bc644000-1531bc645000 rw-s 3b7f400000000000 00:05 1603 /dev/kfd
1531bc645000-1531bc646000 rw-s 00000000 00:31 750293 /dev/shm/hsakmt_shared_mem
1531bc646000-1531bc647000 rw-s 3d31c00000000000 00:05 1603 /dev/kfd
1531bc647000-1531bc648000 rw-s 00000000 00:31 750292 /dev/shm/N2DkIP (deleted)
1531bc648000-1531bc649000 rw-s 00000000 00:31 750287 /dev/shm/rocm_smi_card7
1531bc649000-1531bc64a000 rw-s 00000000 00:31 750286 /dev/shm/rocm_smi_card6
1531bc64a000-1531bc64b000 rw-s 00000000 00:31 750285 /dev/shm/rocm_smi_card5
1531bc64b000-1531bc64c000 rw-s 00000000 00:31 750284 /dev/shm/rocm_smi_card4
1531bc64c000-1531bc64d000 rw-s 00000000 00:31 750283 /dev/shm/rocm_smi_card3
1531bc64d000-1531bc64e000 rw-s 00000000 00:31 750282 /dev/shm/rocm_smi_card2
1531bc64e000-1531bc64f000 rw-s 00000000 00:31 750281 /dev/shm/rocm_smi_card1
1531bc64f000-1531bc650000 rw-s 00000000 00:31 750280 /dev/shm/rocm_smi_card0
1531bc650000-1531bc652000 rw-p 00000000 00:00 0
1531bc652000-1531bc654000 r-xp 00000000 a67:eeb30 144119826271531269 /pfs/lustrep1/users/homolaja/Apps/omnitrace/lib/libomnitrace-user.so.1.10.2
1531bc654000-1531bc655000 rwxp 00001000 a67:eeb30 144119826271531269 /pfs/lustrep1/users/homolaja/Apps/omnitrace/lib/libomnitrace-user.so.1.10.2
1531bc655000-1531bc656000 rw-p 00002000 a67:eeb30 144119826271531269 /pfs/lustrep1/users/homolaja/Apps/omnitrace/lib/libomnitrace-user.so.1.10.2
1531bc656000-1531bc678000 r-xp 00000000 a67:eeb30 144119826271531161 /pfs/lustrep1/users/homolaja/Apps/omnitrace/lib/libomnitrace-dl.so.1.10.2
1531bc678000-1531bc679000 rwxp 00021000 a67:eeb30 144119826271531161 /pfs/lustrep1/users/homolaja/Apps/omnitrace/lib/libomnitrace-dl.so.1.10.2
1531bc679000-1531bc67a000 rw-p 00022000 a67:eeb30 144119826271531161 /pfs/lustrep1/users/homolaja/Apps/omnitrace/lib/libomnitrace-dl.so.1.10.2
1531bc67a000-1531bc67c000 rw-p 00000000 00:00 0
1531bc67c000-1531bc67d000 rwxp 0002a000 00:2d 17 /lib64/ld-2.31.so
1531bc67d000-1531bc67f000 rw-p 0002b000 00:2d 17 /lib64/ld-2.31.so
7fff58cdf000-7fff58d18000 rwxp 00000000 00:00 0 [stack]
7fff58d18000-7fff58d21000 rw-p 00000000 00:00 0
7fff58d9d000-7fff58da1000 r--p 00000000 00:00 0 [vvar]
7fff58da1000-7fff58da3000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Backtrace (demangled):
[PID=11382][TID=0][0/7] __restore_rt
[PID=11382][TID=0][1/7] hsa_amd_image_get_info_max_dim +0x5e9
[PID=11382][TID=0][2/7] hsa_amd_image_get_info_max_dim +0x4ba
[PID=11382][TID=0][7/7] main +0xfa
[PID=11382][TID=0][8/7] omnitrace_main +0x3bd
[PID=11382][TID=0][9/7] __libc_start_main +0xef
[PID=11382][TID=0][10/7] _start +0x2a
Backtrace (lineinfo):
[PID=11382][TID=0][0/9]
[/lib64/libpthread.so.0:?] __restore_rt
[PID=11382][TID=0][1/9]
[/opt/rocm-5.2.3/lib/libhsa-runtime64.so.1.5.50203:?] hsa_amd_image_get_info_max_dim
[PID=11382][TID=0][2/9]
[/opt/rocm/lib/libhsa-runtime64.so.1:?] no unwind info found
[PID=11382][TID=0][3/9]
[/opt/rocm/lib/libamdhip64.so.5:?] no unwind info found
[PID=11382][TID=0][4/9]
[/opt/rocm/lib/libamdhip64.so.5:?] no unwind info found
[PID=11382][TID=0][5/9]
[/opt/rocm-5.2.3/lib/libamdhip64.so.5.2.50203:?] no unwind info found
[PID=11382][TID=0][6/9]
[/pfs/lustrep1/users/homolaja/tests/hip_host_function/source.hip.cpp:10] main
[PID=11382][TID=0][7/9]
[/home/omnitrace/source/lib/omnitrace-dl/dl.cpp:1443] omnitrace_main
[PID=11382][TID=0][8/9]
[/lib64/libc-2.31.so:?] __libc_start_main
[omnitrace][11382] Finalizing after signal 2 :: Signal: SIGINT (signal number: 2)
interrupt program
[omnitrace][11382][0][omnitrace_finalize] finalizing...
[omnitrace][11382][0][omnitrace_finalize]
[omnitrace][11382][0][omnitrace_finalize] omnitrace/process/11382 : 13.174841 sec wall_clock, 252.280 MB peak_rss, 251.036 MB page_rss, 25.120000 sec cpu_clock, 190.7 % cpu_util [laps: 1]
[omnitrace][11382][0][omnitrace_finalize] omnitrace/process/11382/thread/0 : 13.173039 sec wall_clock, 12.751506 sec thread_cpu_clock, 96.8 % thread_cpu_util, 252.280 MB peak_rss [laps: 1]
[omnitrace][11382][0][omnitrace_finalize] omnitrace/process/11382/thread/1 : 0.002652 sec wall_clock, 0.000530 sec thread_cpu_clock, 19.9 % thread_cpu_util, 0.000 MB peak_rss [laps: 1]
[omnitrace][11382][0][omnitrace_finalize]
[omnitrace][11382][0][omnitrace_finalize] Finalizing perfetto...
[omnitrace][11382][perfetto]> Outputting '/users/homolaja/tests/hip_host_function/omnitrace-program.x-output/2023-09-22_16.27/perfetto-trace-11382.proto' (3331.21 KB / 3.33 MB / 0.00 GB)... Done
[omnitrace][11382][metadata]> Outputting 'omnitrace-program.x-output/2023-09-22_16.27/metadata-11382.json' and 'omnitrace-program.x-output/2023-09-22_16.27/functions-11382.json'
[omnitrace][11382][0][omnitrace_finalize] Finalized: 0.276525 sec wall_clock, 318.508 MB peak_rss, 12.448 MB page_rss, 0.550000 sec cpu_clock, 198.9 % cpu_util
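The raw frame addresses in the backtrace above can be matched against the /proc/11382/maps listing to see which library each frame falls in. Below is a minimal sketch in Python, not part of omnitrace: the helper names (parse_maps, resolve) are hypothetical, and only four executable mappings and four frame addresses are copied from the output above.

```python
# Sketch: resolve raw backtrace addresses against /proc/<pid>/maps lines.
# The mapping lines and addresses are copied from the output above;
# parse_maps and resolve are illustrative helpers, not an omnitrace API.

MAPS = """\
0020c000-0020e000 r-xp 0000b000 a67:eeb30 144119912674099405 /pfs/lustrep1/users/homolaja/tests/hip_host_function/program.x
1531b2666000-1531b278c000 r-xp 00000000 07:01 7777 /opt/rocm-5.2.3/lib/libhsa-runtime64.so.1.5.50203
1531bab8f000-1531baf33000 r-xp 00001000 07:01 7751 /opt/rocm-5.2.3/lib/libamdhip64.so.5.2.50203
1531bc010000-1531bc02e000 r-xp 00000000 00:2d 48 /lib64/libpthread-2.31.so
"""


def parse_maps(text):
    """Return (start, end, file_offset, path) for every file-backed mapping."""
    regions = []
    for line in text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping without a path
        start, end = (int(part, 16) for part in fields[0].split("-"))
        regions.append((start, end, int(fields[2], 16), fields[5]))
    return regions


def resolve(addr, regions):
    """Return (path, offset_in_file) for addr, or None if it is unmapped."""
    for start, end, file_offset, path in regions:
        if start <= addr < end:
            return path, addr - start + file_offset
    return None


regions = parse_maps(MAPS)

# Frames 0, 1, 6 and 7 of the raw backtrace above.
for addr in (0x1531BC0268C0, 0x1531B26B2E49, 0x1531BAC13E09, 0x20CD6A):
    hit = resolve(addr, regions)
    if hit is None:
        print(f"{addr:#x}: not in any listed mapping")
    else:
        path, offset = hit
        print(f"{addr:#x}: {path} (+{offset:#x})")
```

For frame 1 this gives offset 0x4ce49 in libhsa-runtime64, the same value the backtrace prints as `(+0x4ce49)`. Such an offset could then be passed to a symbolizer together with the library's debug symbols, if those are available.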
So I tried running it with rocprof, and it also hangs, seemingly in the first hipDeviceSynchronize():
$ rocprof --sys-trace ./program.x
RPL: on '230925_095831' from '/opt/rocm-5.2.3' in '/users/homolaja/tests/hip_host_function'
RPL: profiling '"./program.x"'
RPL: input file ''
RPL: output dir '/tmp/rpl_data_230925_095831_123756'
RPL: result dir '/tmp/rpl_data_230925_095831_123756/input_results_230925_095831'
AAA
ROCTracer (pid=123776):
HSA-trace()
HIP-trace()
If I comment out the hipStreamAddCallback call, it does not hang.
Running the program with omnitrace-sample --exclude roctracer -- ./program.x works fine without any issue (but I don't get the GPU tracing data).