Not consistently reproducible. This is occurring on a cluster of ~1944 nodes (sierra), of which <1% are exhibiting this behavior after boot. This reportedly started after upgrading the cluster to CHAOS 5.
munged is expected to use minimal CPU time. Instead, top shows it consuming as much of a core as it can. Attaching to the process with strace shows repeated stat() calls on /etc/group:
13:50:27.956403 stat("/etc/group", {st_dev=makedev(0, 19), st_ino=8006,
st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096,
st_blocks=104, st_size=49948, st_atime=2012/05/21-13:05:58,
st_mtime=2012/05/21-11:05:01, st_ctime=2012/05/21-13:05:29}) = 0
13:50:27.956457 stat("/etc/group", {st_dev=makedev(0, 19), st_ino=8006,
st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096,
st_blocks=104, st_size=49948, st_atime=2012/05/21-13:05:58,
st_mtime=2012/05/21-11:05:01, st_ctime=2012/05/21-13:05:29}) = 0
13:50:27.956507 stat("/etc/group", {st_dev=makedev(0, 19), st_ino=8006,
st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096,
st_blocks=104, st_size=49948, st_atime=2012/05/21-13:05:58,
st_mtime=2012/05/21-11:05:01, st_ctime=2012/05/21-13:05:29}) = 0
$ cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.2 (Santiago)
$ grep "model name" /proc/cpuinfo | sort -u
model name : Intel(R) Xeon(R) CPU X5660 @ 2.80GHz
$ uname -a
Linux sierra972 2.6.32-220.13.1.2chaos.ch5.x86_64 #1 SMP Thu Apr 19 12:15:29
PDT 2012 x86_64 x86_64 x86_64 GNU/Linux
(gdb) print ts_now
$1 = {tv_sec = 2604158256, tv_nsec = 519236000}
(gdb) print time(0)
$2 = 1337732647
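
The two gdb values above can be compared directly. The following is a minimal sketch, not part of munged: the helper name clock_skew and the constant names are made up for illustration, and the numbers are copied from the gdb session. It only shows how far ts_now.tv_sec sits from what time(0) returned; it does not establish why the two differ or whether this is what causes the busy loop.

```python
from datetime import datetime, timezone

# Values copied verbatim from the gdb session above.
TS_NOW_SEC = 2604158256      # ts_now.tv_sec
WALL_CLOCK_SEC = 1337732647  # result of time(0)


def clock_skew(ts_sec: int, wall_sec: int) -> int:
    """Return how many seconds ts_sec is ahead of wall_sec (negative if behind)."""
    return ts_sec - wall_sec


def as_utc(epoch_sec: int) -> datetime:
    """Interpret a Unix epoch value as a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(epoch_sec, tz=timezone.utc)


if __name__ == "__main__":
    skew = clock_skew(TS_NOW_SEC, WALL_CLOCK_SEC)
    print("ts_now  :", as_utc(TS_NOW_SEC).isoformat())
    print("time(0) :", as_utc(WALL_CLOCK_SEC).isoformat())
    print("skew    :", skew, "seconds, about", skew // 86400, "days")
```

If ts_now is meant to hold the current time, a gap of this size (roughly 40 years ahead) would be worth checking on the affected nodes against the healthy ones.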