open-power-sdk / curt
Compute processor utilization and system call processing metrics based on "perf" trace data
License: GNU General Public License v2.0
Currently, curt only prints a per-process summary over ALL cpus. Add a per-CPU summary for each process.
Some versions of perf pass an additional parameter, perf_sample_dict, to the event handlers, in an incompatible fashion. It would be nice to have curt be able to support running with both the older and newer perf APIs.
Currently, the only hypervisor events supported are powerpc:hcall_entry and powerpc:hcall_exit, which are (unsurprisingly) specific to powerpc platforms. Other hypervisors, like KVM, have their own set of interesting events, which should be supported similarly.
Report which syscalls invoked which hcalls, along with relevant statistics.
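A minimal sketch of the "which syscalls invoked which hcalls" bookkeeping: when an hcall entry event arrives, attribute it to the syscall the task is currently executing. The data layout and function names here are illustrative, not taken from curt.py:

```python
# Illustrative bookkeeping for attributing hcalls to the syscall in
# flight on the same task.  All names here are hypothetical.
hcalls_by_syscall = {}  # (syscall id, hcall id) -> count
current_syscall = {}    # tid -> syscall id currently in flight

def on_sys_enter(tid, syscall_id):
    current_syscall[tid] = syscall_id

def on_hcall_entry(tid, hcall_id):
    # syscall_id is None if the hcall did not occur inside a syscall
    syscall_id = current_syscall.get(tid)
    key = (syscall_id, hcall_id)
    hcalls_by_syscall[key] = hcalls_by_syscall.get(key, 0) + 1
```

A report pass could then group the counts by syscall id to show which syscalls trigger which hcalls, and how often.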
curt will not work on Ubuntu until support for the required Python bindings is added to perf there. Document this in the README.
Track interrupts and report associated statistics, per-task, per-process, and system-wide. (also per-cpu?)
Per IRQ:
I believe the associated events are
irq:irq_handler_entry
irq:irq_handler_exit
and perhaps:
irq:softirq_entry
irq:softirq_exit
and perhaps:
workqueue:workqueue_execute_end
workqueue:workqueue_execute_start
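As a sketch of how the irq:irq_handler_entry/exit pair could be consumed, here is a pair of handlers in the style of perf's Python scripting interface, accumulating per-IRQ count and elapsed time. The bookkeeping structures are assumptions; the field lists follow the tracepoint formats:

```python
# Hypothetical per-IRQ statistics from irq:irq_handler_entry/exit.
irq_stats = {}       # irq number -> {'count': n, 'elapsed': ns}
irq_entry_time = {}  # (cpu, irq) -> entry timestamp in ns

def irq__irq_handler_entry(event_name, context, common_cpu,
                           common_secs, common_nsecs, common_pid,
                           common_comm, common_callchain, irq, name):
    timestamp = common_secs * 1000000000 + common_nsecs
    irq_entry_time[(common_cpu, irq)] = timestamp

def irq__irq_handler_exit(event_name, context, common_cpu,
                          common_secs, common_nsecs, common_pid,
                          common_comm, common_callchain, irq, ret):
    timestamp = common_secs * 1000000000 + common_nsecs
    entry = irq_entry_time.pop((common_cpu, irq), None)
    if entry is None:
        return  # exit without a matching entry (trace started mid-IRQ)
    stats = irq_stats.setdefault(irq, {'count': 0, 'elapsed': 0})
    stats['count'] += 1
    stats['elapsed'] += timestamp - entry
```

The softirq and workqueue events would get analogous handlers, keyed by vector number and work function respectively.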
Output occasionally (always?) includes some references to a "None" syscall:
73572:
-- [ task] command cpu user sys busy idle util% moves
[ 73572] test 64 0.657500 24.068404 0.000000 2.723860
[ 73572] test 65 0.001740 0.010420 0.000000 0.338130
[ 73572] test 66 0.105708 0.437624 0.000000 1.332782
[ 73572] test ALL 0.764948 24.516448 0.000000 4.394772 85.2% 2
-- ( id)name count elapsed pending average minimum maximum
(120)clone 1 23.148990 0.000000 23.148990 23.148990 23.148990
( 0)None 1 0.004508 0.000000 0.004508 0.004508 0.004508
This syscall with id=0 might be a bug in perf, as I haven't seen anything similar with the raw kernel trace data in /sys/kernel/debug/tracing/trace. I think it's always associated with a new task exiting its initial clone syscall.
Currently, determining which perf API to use is left to the user via the --api parameter, with the default being the older API, which works everywhere but does not report process-specific information.
Use variadic arguments instead, like:
def irq__irq_handler_exit(event_name, context, common_cpu, common_secs, common_nsecs, common_pid, common_comm, common_callchain, irq, ret, *args):
with perf_sample_dict present as the first element of args, or absent. The pid can then either be extracted from it or set to 'unknown', based on that.
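A minimal sketch of that variadic approach, assuming the newer API's perf_sample_dict carries the pid under ['sample']['pid'] (hedged: the exact dict layout should be checked against the perf version in use):

```python
# Sketch: accept trailing *args so the handler works whether or not
# perf passes perf_sample_dict as an extra final argument.
def irq__irq_handler_exit(event_name, context, common_cpu, common_secs,
                          common_nsecs, common_pid, common_comm,
                          common_callchain, irq, ret, *args):
    if args:
        # newer perf API: perf_sample_dict is the first extra argument
        perf_sample_dict = args[0]
        pid = perf_sample_dict['sample']['pid']
    else:
        # older perf API: no sample dict available
        pid = 'unknown'
    return pid
```

The same pattern would apply to every event handler in curt.py, so it could be factored into a small helper that all handlers call.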
Flattening nested data structures, for example changing task_state[tid]['mode'] to task_mode[tid], or aggregating per-task state in a class, should help performance.
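A tiny illustration of the flattening suggested above: independent dictionaries keyed by tid avoid one level of dictionary lookup on every event. The names come from the issue; whether this actually helps would need profiling:

```python
# Current layout: one nested dict per task (illustrative values).
task_state = {1234: {'mode': 'sys', 'cpu': 7}}

# Flattened layout: one top-level dict per field, keyed by tid.
task_mode = {tid: state['mode'] for tid, state in task_state.items()}
task_cpu = {tid: state['cpu'] for tid, state in task_state.items()}
```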
At some point, the perf command added an additional parameter, "perf_sample_dict", to the event handlers. curt.py currently uses this to get the process ID (pid). On most current Linux distributions, this parameter is not present, and running perf script -s ./curt.py on those systems results in an error:
/usr/bin/perf script -s ../curt.py
TypeError: powerpc__hcall_entry() takes exactly 10 arguments (9 given)
It would be nice to be able to support both APIs, old and new.
seccomp can apparently reject system calls, and when they return, the syscall ID is set to -1:
perf 5375 [007] 59632.478528: raw_syscalls:sys_enter: NR 1 (3, 9fb888, 8, 2d83740, 1, 7ffff)
perf 5375 [007] 59632.478532: raw_syscalls:sys_exit: NR 1 = 8
perf 5375 [007] 59632.478538: raw_syscalls:sys_enter: NR 15 (11, 7ffffca734b0, 7ffffca73380, 2d83740, 1, 7ffff)
perf 5375 [007] 59632.478539: raw_syscalls:sys_exit: NR -1 = 8
perf 5375 [007] 59632.478543: raw_syscalls:sys_enter: NR 16 (4, 2401, 0, 2d83740, 1, 0)
perf 5375 [007] 59632.478551: raw_syscalls:sys_exit: NR 16 = 0
This confuses the current code, which expects the syscall ID for sys_exit events to match the syscall ID of the preceding sys_enter, and ends up with the syscall None appearing in the report:
5375:
-- [ task] command cpu user sys irq hv busy idle | runtime sleep wait blocked iowait unaccounted | util% moves
[ 5375] perf 7 2.093589 4.979433 0.000000 0.000000 0.000000 0.015286 | 6.242727 0.000000 0.000000 0.000000 0.000000 0.000000 | 99.8%
[ 5375] perf ALL 2.093589 4.979433 0.000000 0.000000 0.000000 0.015286 | 6.242727 0.000000 0.000000 0.000000 0.000000 0.000000 | 99.8% 0
-- ( ID)name count elapsed pending average minimum maximum
( 1)write 1290 4.924252 0.000000 0.003817 0.003055 0.019906
( 16)ioctl 7 0.040657 0.000000 0.005808 0.004360 0.007880
( 3)close 2 0.001038 0.000000 0.000519 0.000509 0.000529
( -1)None 1 7.033008 0.000000 7.033008 7.033008 7.033008
( 0)read 1 0.020837 0.000000 0.020837 0.020837 0.020837
( 2)open 1 0.006891 0.000000 0.006891 0.006891 0.006891
( 15)rt_sigreturn 0 0.000000 0.000000 -- -- --
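One possible fix, sketched here with illustrative function names (not curt's actual handlers): when sys_exit reports id -1, fall back to the syscall id recorded at the matching sys_enter for that task, instead of treating -1 as a distinct "None" syscall:

```python
# Hedged sketch: recover the real syscall id when seccomp rewrites
# the id to -1 at sys_exit.  pending_syscall is an assumed structure.
pending_syscall = {}  # tid -> syscall id seen at sys_enter

def sys_enter(tid, id):
    pending_syscall[tid] = id

def sys_exit(tid, id):
    if id == -1:
        # seccomp set the id to -1 on return; use the id from entry
        id = pending_syscall.get(tid, -1)
    pending_syscall.pop(tid, None)
    return id
```

With this, the rt_sigreturn entry in the example report would get its exit accounted to id 15 rather than to a phantom id -1.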
Let's add a few snapshots of the output, so readers can see at a glance what curt is capable of.
syscall 102, "socketcall", gets routed by its first parameter (see net/socket.c) to a large set of interesting system calls. These are defined in the kernel source include/uapi/linux/net.h:
#define SYS_SOCKET 1 /* sys_socket(2) */
#define SYS_BIND 2 /* sys_bind(2) */
#define SYS_CONNECT 3 /* sys_connect(2) */
#define SYS_LISTEN 4 /* sys_listen(2) */
#define SYS_ACCEPT 5 /* sys_accept(2) */
#define SYS_GETSOCKNAME 6 /* sys_getsockname(2) */
#define SYS_GETPEERNAME 7 /* sys_getpeername(2) */
#define SYS_SOCKETPAIR 8 /* sys_socketpair(2) */
#define SYS_SEND 9 /* sys_send(2) */
#define SYS_RECV 10 /* sys_recv(2) */
#define SYS_SENDTO 11 /* sys_sendto(2) */
#define SYS_RECVFROM 12 /* sys_recvfrom(2) */
#define SYS_SHUTDOWN 13 /* sys_shutdown(2) */
#define SYS_SETSOCKOPT 14 /* sys_setsockopt(2) */
#define SYS_GETSOCKOPT 15 /* sys_getsockopt(2) */
#define SYS_SENDMSG 16 /* sys_sendmsg(2) */
#define SYS_RECVMSG 17 /* sys_recvmsg(2) */
#define SYS_ACCEPT4 18 /* sys_accept4(2) */
#define SYS_RECVMMSG 19 /* sys_recvmmsg(2) */
#define SYS_SENDMMSG 20 /* sys_sendmmsg(2) */
It would be helpful to treat each of these as independent syscalls with respect to the reports generated.
I'm not sure this was intended, but this appears to be the case.
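Demultiplexing could be sketched as follows, using the SYS_* values from include/uapi/linux/net.h quoted above; the helper name and report-label format are assumptions:

```python
# Map socketcall's first argument to the underlying network call,
# per include/uapi/linux/net.h.
SOCKETCALL_NAMES = {
    1: 'socket', 2: 'bind', 3: 'connect', 4: 'listen', 5: 'accept',
    6: 'getsockname', 7: 'getpeername', 8: 'socketpair', 9: 'send',
    10: 'recv', 11: 'sendto', 12: 'recvfrom', 13: 'shutdown',
    14: 'setsockopt', 15: 'getsockopt', 16: 'sendmsg', 17: 'recvmsg',
    18: 'accept4', 19: 'recvmmsg', 20: 'sendmmsg',
}

def syscall_name(id, args):
    # args: the tuple of raw syscall arguments captured at sys_enter
    if id == 102 and args and args[0] in SOCKETCALL_NAMES:
        return 'socketcall(%s)' % SOCKETCALL_NAMES[args[0]]
    return str(id)
```

Each distinct label would then accumulate its own count/elapsed/pending statistics in the report.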
git clone https://github.ibm.com/sdk/curt.git
Cloning into 'curt'...
Username for 'https://github.ibm.com':
Password for 'https://github.ibm.com':
remote: Anonymous access denied
fatal: Authentication failed for 'https://github.ibm.com/sdk/curt.git/'
Report the CPU utilization for multiple nodes in a cluster. Also, consider what interfaces could be provided to help integration with cluster management/reporting software.
Currently, curt only prints system-wide totals over ALL cpus. Add system-wide per-CPU totals.
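A sketch of what the per-CPU system-wide totals could look like: sum each field across all tasks, keyed by CPU, alongside the existing ALL total. The record layout is illustrative, not curt's internal structure:

```python
from collections import defaultdict

def per_cpu_totals(tasks):
    # tasks: iterable of per-task-per-cpu records,
    # e.g. {'cpu': n, 'user': seconds, 'sys': seconds}
    totals = defaultdict(lambda: {'user': 0.0, 'sys': 0.0})
    for t in tasks:
        totals[t['cpu']]['user'] += t['user']
        totals[t['cpu']]['sys'] += t['sys']
    return dict(totals)
```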
In trace_end(), if the task's final state was idle, and the resume-mode is sys, that pending time is also accumulated in that syscall's pending time:
elif task.mode == 'idle':
    delta = curr_timestamp - task.timestamp
    cpu = task.cpu
    task.cpus[cpu].idle += delta
    task.cpus[cpu].unaccounted += delta
    if task.resume_mode == 'sys':
        id = task.syscall
        cpu = task.cpu
        delta = curr_timestamp - task.syscalls[id].timestamp
        task.syscalls[id].pending += delta
I believe similar accumulation should be done if the resume-mode is hv or irq.
curt is getting stuck in an infinite relaunch loop:
$ ./curt.py
Relaunching under "perf" command...
Relaunching under "perf" command...
Relaunching under "perf" command...
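One possible guard against the relaunch loop described above: set a marker in the environment before re-executing under perf, and bail out instead of relaunching again if the marker is already present. The variable name CURT_RELAUNCHED and the perf command line are illustrative:

```python
import os
import sys

def maybe_relaunch(argv):
    # If we already relaunched once and still aren't running inside
    # perf script, something is wrong; abort rather than loop forever.
    if os.environ.get('CURT_RELAUNCHED'):
        sys.exit('Relaunch under "perf" failed; aborting instead of looping.')
    os.environ['CURT_RELAUNCHED'] = '1'
    print('Relaunching under "perf" command...')
    os.execvp('perf', ['perf', 'script', '-s'] + argv)
```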
Currently, curt's output can look like the following:
-- [ task] command cpu user sys irq hv busy idle | runtime sleep wait blocked iowait unaccounted | util% moves
[ 34368] Scheduled Execu 33 0.004520 0.005744 0.000000 0.000000 0.000000 0.047514 | 0.019036 0.000000 0.000000 0.000000 0.000000 0.000000 | 17.8%
[ 34368] Scheduled Execu 27 0.000000 0.000000 0.000000 0.000000 0.000000 0.002582 | 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 | 0.0%
[ 34368] Scheduled Execu 28 0.097846 0.046144 0.000000 0.000000 0.000000 57.444170 | 0.146746 0.000000 0.000000 0.000000 0.000000 57.439282 | 0.3%
[ 34368] Scheduled Execu ALL 0.102366 0.051888 0.000000 0.000000 0.000000 57.494266 | 0.165782 0.000000 0.000000 0.000000 0.000000 57.439282 | 0.3% 2
I think it would look a little better if the per-CPU lines were sorted by CPU number: 27, 28, 33.
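The fix is small if the per-CPU records live in a dict keyed by CPU number (an assumption about curt's internals): iterate the keys in sorted order before printing the ALL line.

```python
# Illustrative per-CPU report lines keyed by cpu number.
cpus = {33: 'line for cpu 33', 27: 'line for cpu 27', 28: 'line for cpu 28'}

# Emit lines in numeric CPU order.
ordered = [cpus[cpu] for cpu in sorted(cpus)]
```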
fork and exec support in curt.py both rely on creating synthetic system call events, for "clone" and "execve", respectively. However, these events use hardcoded system call numbers which only match powerpc and not x86. Some of the results are thus skewed/incorrect on x86.
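One way to address this, sketched under the assumption that a small per-architecture table is acceptable: look up the clone/execve numbers for the running machine instead of hardcoding the powerpc values. The numbers below come from the respective kernel syscall tables:

```python
import platform

# clone/execve syscall numbers per architecture:
# powerpc uses 120/11, x86_64 uses 56/59.
SYSCALL_NUMBERS = {
    'ppc64le': (120, 11),
    'ppc64': (120, 11),
    'x86_64': (56, 59),
}

def synthetic_syscall_ids():
    arch = platform.machine()
    # Fall back to the powerpc values (curt's current hardcoded ones)
    # for architectures not in the table.
    return SYSCALL_NUMBERS.get(arch, SYSCALL_NUMBERS['ppc64le'])
```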
Add support for parsing output from reading the kernel trace buffer (usually /sys/kernel/debug/tracing/trace). This is text-based:
sshd-15165 [000] .... 4338237.604419: sys_enter: NR 142 (e, 1001a717b30, 1001a729ee0, 0, 0, 2008)
sshd-15165 [000] d... 4338237.604424: sched_switch: prev_comm=sshd prev_pid=15165 prev_prio=120 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120
bash-15453 [004] .... 4338237.604655: sched_process_exec: filename=/bin/cat pid=15453 old_pid=15453
bash-15453 [004] .... 4338237.604658: sys_exit: NR 0 = 0
bash-15453 [004] .... 4338237.604690: sys_enter: NR 45 (0, 0, 3fff7d6b0000, 3fffe5aac332, 80, 3fff7d6f0680)
bash-15453 [004] .... 4338237.604691: sys_exit: NR 45 = 1099686346752
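A parsing sketch for the text format shown above. The regular expression covers the common "comm-pid [cpu] flags timestamp: event: body" shape; real trace output has more variants (e.g. irq context markers, missing flags on older kernels) than this handles:

```python
import re

# "comm-pid [cpu] flags timestamp: event: body"
TRACE_LINE = re.compile(
    r'^\s*(?P<comm>.+)-(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]\s+'
    r'(?P<flags>\S+)\s+(?P<timestamp>[\d.]+):\s+'
    r'(?P<event>\w+):\s+(?P<body>.*)$')

def parse_trace_line(line):
    m = TRACE_LINE.match(line)
    return m.groupdict() if m else None
```

The event-specific bodies (NR/argument lists for sys_enter, key=value pairs for sched_switch) would then need their own secondary parsers before feeding the existing event handlers.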