On Wednesday, March 11, 2015 02:00:41 PM Nick Frampton wrote:
> On 11/03/15 07:59, Mark Johnston wrote:
> > On Tue, Mar 10, 2015 at 02:10:09PM -0400, John Baldwin wrote:
> >> Often, loops in programs using libkvm are due to the programs trying to
> >> read kernel data structures while they are changing. However, if you use
> >> sysctls to fetch this data instead, you should be able to get a stable
> >> snapshot of the system state without getting stuck in a possible loop. I
> >> believe for libkvm to use sysctl instead of /dev/kmem you have to pass a
> >> NULL for the kernel and "/dev/null" for the core image.
>
> In our code, we're invoking kvm_openfiles as you suggest:
> kd = kvm_openfiles(NULL, _PATH_DEVNULL, NULL, O_RDONLY, errbuf);
>
>
> > It sounds like this issue might be the one fixed in r272566: if the
> > KERN_PROC_ALL sysctl is read with an insufficiently large buffer, an
> > sbuf error return value could bubble up and be treated as ERESTART,
> > resulting in a loop.
> >
> > This can be confirmed with something like
> >
> > dtrace -n 'syscall:::entry /pid == $target/{@[probefunc] = count();} tick-3s {exit(0);}' -p <pid of looping proc>
> >
> > If the output consists solely of __sysctl, this bug is likely the
> > culprit.
>
> Unfortunately, I accidentally killed fstat this morning before I could do
> any further debug.
>
> I ran truss -p on it yesterday and it was spinning solely on __sysctl.
>
> I'll try compiling with debug symbols in case it happens again. I haven't
> been able to reproduce the problem in a reasonable time frame so it could
> be days or weeks before we see it happen again.

The truss output is consistent with Mark's suggestion, so I would try
his suggested fix of r272566.
--
John Baldwin