thr3ads.net - dtrace discuss - [dtrace-discuss] Iterating over all LWPs [Mar 2008]

Roman Shaposhnik

2008-Mar-24 17:35 UTC

[dtrace-discuss] Iterating over all LWPs

Hi!

Here's the problem I'm facing: I need to sample a particular
attribute of all the LWPs in a given process. Now, sampling
itself is no problem -- I can use the profile/tick provider and
all is good. The problem is: I can't seem to think of a
DTrace-friendly way to iterate over all the LWPs and report
the value of the attribute I'm interested in.

Any suggestions?

Thanks,
Roman.

Nicolas Williams

2008-Mar-24 17:37 UTC

[dtrace-discuss] Iterating over all LWPs

On Mon, Mar 24, 2008 at 10:35:43AM -0700, Roman Shaposhnik wrote:
> Any suggestions?

Either don't use DTrace if you need to iterate, or start the target via
DTrace so you can keep track of all the LWPs in your D script.

Roman Shaposhnik

2008-Mar-24 17:56 UTC

[dtrace-discuss] Iterating over all LWPs

On Mon, 2008-03-24 at 12:37 -0500, Nicolas Williams wrote:
> On Mon, Mar 24, 2008 at 10:35:43AM -0700, Roman Shaposhnik wrote:
> > Any suggestions?
>
> Either don't use DTrace if you need to iterate, or start the target via
> DTrace so you can keep track of all the LWPs in your D script.

And how exactly is starting the target via DTrace going to make it
easier? As far as I can tell I have two problems to deal with:

  1. reporting N values every X milliseconds.
  2. getting N values

#1 can be sort of solved by using aggregations (although what I would
really like to have is a %A specifier in printf for formatting and
outputting associative arrays). #2 is way trickier and I really don't
know how to solve it short of digging up all the functions inside the
kernel that actually modify the attributes I'm looking for and
using fbt::funcs:entry to keep my array/aggregation in sync with what's
going on inside the kernel structures. If you happen to think that
starting the target via DTrace can help address #2 -- please tell me
more.

Thanks,
Roman.
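
The aggregation approach mentioned for #1 might look like the sketch
below. It is only an illustration under stated assumptions: it counts
syscall entries per LWP from the moment tracing starts (keyed on the
built-in tid variable), which is not the same as reading an existing
kernel attribute, and it assumes the script is attached with -p or
started with -c so that $target is set. The aggregation name @sysc is
made up for this example.

```d
#!/usr/sbin/dtrace -s

#pragma D option quiet

/* Count system calls per LWP of the target process. */
/* $target is set when dtrace is run with -p <pid> or -c <cmd>. */
syscall:::entry
/pid == $target/
{
        @sysc[tid] = count();
}

/* Once a second, print one line per LWP, then start a fresh interval. */
tick-1s
{
        printa("lwp %d: %@d syscalls\n", @sysc);
        trunc(@sysc);
}
```

LWPs that make no system calls during an interval do not appear in that
interval's output, so this reports activity rather than a full list of
LWPs.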

Nicolas Williams

2008-Mar-24 18:01 UTC

[dtrace-discuss] Iterating over all LWPs

What I meant by the "don't use DTrace if you need to iterate" option is
this: try using MDB.  Given what you say you might find it a lot easier
to sample the items you need via MDB.

Jon Haslam

2008-Mar-25 14:03 UTC

[dtrace-discuss] Iterating over all LWPs

Hi Roman,

> Here's the problem I'm facing: I need to sample a particular
> attribute of all the LWPs in a given process. Now, sampling
> itself is no problem -- I can use profile/tick provider and
> all is good. The problem is: I can't seem to think of a
> DTrace-friendly way to iterate over all the LWPs and report
> the value of the attribute I'm interested in.
>
> Any suggestions?

To be honest, I'm not completely sure what you're after but
I'll throw this in there anyway. If you just want to iterate
over all the LWPs in a given process you can use a tick probe
to do this (you seem to already know this though...). Here is an
example for looking at 'nscd', based upon the example Adam posted
on his blog the other day:


#!/usr/sbin/dtrace -s

#pragma D option quiet

BEGIN
{
        self->pidp = `pidhash[$1 & (`pid_hashsz - 1)];
        pidp = self->pidp;
        printf("pid = %d\n", self->pidp->pid_id);
}

BEGIN
/self->pidp->pid_id == $1/
{
        this->slot = (*(uint32_t *)self->pidp) >> 8;
        procp = `procdir[this->slot].pe_proc;
        procname = stringof(procp->p_user.u_comm);
        t = procp->p_tlist;
}

tick-50ms
/pidp && t != NULL/
{
        printf("%s lwps %d/thr (%d): %d syscalls\n", procname,
            procp->p_lwpcnt, t->t_tid, t->t_lwp->lwp_ru.sysc);
        t = t->t_forw;
}


This produces:
# ./lwp.d 100179
pid = 100179
nscd lwps 33/thr (1): 57 syscalls
nscd lwps 33/thr (2): 2050124 syscalls
nscd lwps 33/thr (3): 5 syscalls
nscd lwps 33/thr (4): 25402 syscalls
nscd lwps 33/thr (5): 18 syscalls
nscd lwps 33/thr (6): 3306 syscalls
nscd lwps 33/thr (7): 71123 syscalls
nscd lwps 33/thr (8): 2798 syscalls
nscd lwps 33/thr (9): 2908 syscalls
nscd lwps 33/thr (10): 470 syscalls
nscd lwps 33/thr (11): 11695 syscalls
nscd lwps 33/thr (12): 470 syscalls
nscd lwps 33/thr (13): 2679 syscalls
nscd lwps 33/thr (14): 27171 syscalls
nscd lwps 33/thr (15): 83908 syscalls
nscd lwps 33/thr (16): 4113 syscalls
nscd lwps 33/thr (17): 470 syscalls
nscd lwps 33/thr (18): 8048 syscalls
nscd lwps 33/thr (19): 470 syscalls
nscd lwps 33/thr (20): 378 syscalls
nscd lwps 33/thr (21): 238 syscalls
nscd lwps 33/thr (22): 414 syscalls
nscd lwps 33/thr (23): 240 syscalls
nscd lwps 33/thr (24): 28 syscalls
nscd lwps 33/thr (25): 8747 syscalls
nscd lwps 33/thr (26): 589 syscalls
nscd lwps 33/thr (27): 246 syscalls
nscd lwps 33/thr (28): 11681 syscalls
nscd lwps 33/thr (29): 470 syscalls
nscd lwps 33/thr (30): 69942 syscalls
nscd lwps 33/thr (31): 5171 syscalls
nscd lwps 33/thr (32): 470 syscalls
nscd lwps 33/thr (499): 1680 syscalls

There may well be bugs in this, or much better ways of doing
it, but it appears to work.

Jon.

Roman Shaposhnik

2008-Mar-26 01:02 UTC

[dtrace-discuss] Iterating over all LWPs

On Tue, 2008-03-25 at 14:03 +0000, Jon Haslam wrote:
> Hi Roman,
>
> > Here's the problem I'm facing: I need to sample a particular
> > attribute of all the LWPs in a given process. Now, sampling
> > itself is no problem -- I can use profile/tick provider and
> > all is good. The problem is: I can't seem to think of a
> > DTrace-friendly way to iterate over all the LWPs and report
> > the value of the attribute I'm interested in.
> >
> > Any suggestions?
>
> To be honest, I'm not completely sure what you're after

Well, I guess that makes you a mind reader, 'cause the rest
of your reply seems to be exactly what I was looking for. ;-)
Now, as far as implementation goes, I still have a few
questions:

>         self->pidp = `pidhash[$1 & (`pid_hashsz - 1)];

What is the backtick doing here in front of pidhash? Is it a way
of accessing an arbitrary kernel variable?

> BEGIN
> /self->pidp->pid_id == $1/

Wow! Could you please elaborate on why exactly
the predicate is needed here? My reading of the first
BEGIN statement seems to suggest that the following
will always be true:

  self->pidp->pid_id == $1

Or is it just a safety measure that prevents us from
getting garbage from pidhash?

> {
>         this->slot = (*(uint32_t *)self->pidp) >> 8;
>         procp = `procdir[this->slot].pe_proc;
>         procname = stringof(procp->p_user.u_comm);
>         t = procp->p_tlist;

Wow! That's some powerful kernel magic, if you ask me. ;-)

> tick-50ms
> /pidp && t != NULL/
> {
>         printf("%s lwps %d/thr (%d): %d syscalls\n", procname,
>             procp->p_lwpcnt, t->t_tid, t->t_lwp->lwp_ru.sysc);
>         t = t->t_forw;

Now, here comes the crucial question: AFAIK, p_tlist points to
a circular list of kernel threads. We are traversing this list
using t = t->t_forw. Now, what happens if 't' points to
a member of the list that used to be valid but has been
deallocated in between the two ticks of tick-50ms?

Thanks,
Roman.

Jon Haslam

2008-Mar-26 14:13 UTC

[dtrace-discuss] Iterating over all LWPs

>>         self->pidp = `pidhash[$1 & (`pid_hashsz - 1)];
>
> what is backtick doing here in front of pidhash? Is it a way
> of accessing an arbitrary kernel variable?

Yes. The backquote is a scoping operator for kernel variables.
See the External Variables section in the docs:

http://wikis.sun.com/display/DTrace/Variables#Variables-ExternalVariables

>> BEGIN
>> /self->pidp->pid_id == $1/
>
> Wow! Could you, please, elaborate on why exactly
> the predicate is needed here? My reading of the first
> BEGIN statement seems to suggest that the following
> will always be true:
>   self->pidp->pid_id == $1
>
> Or is it just a safety measure that prevents us from
> getting garbage from pidhash?

Yes, it's just there to ensure that we have extracted the
correct struct pid from the pidhash.

>>         this->slot = (*(uint32_t *)self->pidp) >> 8;
>>         procp = `procdir[this->slot].pe_proc;
>>         procname = stringof(procp->p_user.u_comm);
>>         t = procp->p_tlist;
>
> Wow! That's some powerful kernel magic, if you ask me. ;-)

I stole this from Adam's last blog entry so he's the sorcerer.
Check it out if you haven't seen it as it has a brief explanation
of what he's doing there.

>> tick-50ms
>> /pidp && t != NULL/
>> {
>>         printf("%s lwps %d/thr (%d): %d syscalls\n", procname,
>>             procp->p_lwpcnt, t->t_tid, t->t_lwp->lwp_ru.sysc);
>>         t = t->t_forw;
>
> Now, here comes the crucial question: AFAIK, p_tlist points to
> a circular list of kernel threads. We are traversing this list
> using t = t->t_forw. Now, what happens if 't' points to
> a member of the list that used to be valid but has been
> deallocated in between the two ticks of tick-50ms?

Using time based probes to iterate over data structures is
problematic as the data structures may well change beneath you.
I offered this up as an example of how to iterate over data structures
using a tick probe as it gets referenced quite a bit but there aren't
that many examples around of how to do it. The important point about
this technique is for the user to be aware of its limitations and
how the data they are observing is modified.

Jon.
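
Since p_tlist is described above as a circular list, one way to keep a
tick-based walk from cycling indefinitely is to remember the starting
thread and stop once the walk returns to it. The following is a minimal
sketch of that idea, not a tested replacement for the script earlier in
the thread. The variables head and visited are introduced here for
illustration, the lookup still only checks the first entry of the pid
hash chain, and the walk remains subject to the limitation just
described: the list can change between ticks.

```d
#!/usr/sbin/dtrace -s

#pragma D option quiet

/* Find the struct pid at the head of the hash chain for PID $1. */
BEGIN
{
        pidp = `pidhash[$1 & (`pid_hashsz - 1)];
}

/* Give up if the head of the chain is not the PID we asked for. */
BEGIN
/pidp == NULL || pidp->pid_id != $1/
{
        printf("pid %d not found at head of hash chain\n", $1);
        exit(1);
}

/* Same lookup as the earlier script; remember where the walk starts. */
BEGIN
/pidp != NULL && pidp->pid_id == $1/
{
        this->slot = (*(uint32_t *)pidp) >> 8;
        procp = `procdir[this->slot].pe_proc;
        head = procp->p_tlist;
        t = head;
        visited = 0;
}

/* Report one thread per tick and advance along the list. */
tick-50ms
/t != NULL/
{
        printf("thr (%d): %d syscalls\n", t->t_tid, t->t_lwp->lwp_ru.sysc);
        t = t->t_forw;
        visited++;
}

/* Stop once the walk has come back around to the starting thread. */
tick-50ms
/visited > 0 && t == head/
{
        exit(0);
}
```

If the starting thread exits during the walk, the termination test never
matches, so this narrows the problem without removing it.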

Roman Shaposhnik

2008-Mar-26 18:39 UTC

[dtrace-discuss] Iterating over all LWPs

On Wed, 2008-03-26 at 14:13 +0000, Jon Haslam wrote:
> >> tick-50ms
> >> /pidp && t != NULL/
> >> {
> >>         printf("%s lwps %d/thr (%d): %d syscalls\n", procname,
> >>             procp->p_lwpcnt, t->t_tid, t->t_lwp->lwp_ru.sysc);
> >>         t = t->t_forw;
> >
> > Now, here comes the crucial question: AFAIK, p_tlist points to
> > a circular list of kernel threads. We are traversing this list
> > using t = t->t_forw. Now, what happens if 't' points to
> > a member of the list that used to be valid but has been
> > deallocated in between the two ticks of tick-50ms?
>
> Using time based probes to iterate over data structures is
> problematic as the data structures may well change beneath you.
> I offered this up as an example of how to iterate over data structures
> using a tick probe as it gets referenced quite a bit but there aren't
> that many examples around of how to do it. The important point about
> this technique is for the user to be aware of its limitations and
> how the data they are observing is modified.

Thanks for confirming my hunch. I guess at this point the only question
I have left is: what is really going to happen the next time
I do t->t_forw? A SEGV? Or will I just be off chasing these pointers
forever? See, I really don't know the kernel well enough to make
even an educated guess here. Any help would be appreciated.

Thanks,
Roman.

Jon Haslam

2008-Mar-26 21:38 UTC

[dtrace-discuss] Iterating over all LWPs

>> Using time based probes to iterate over data structures is
>> problematic as the data structures may well change beneath you.
>> I offered this up as an example of how to iterate over data structures
>> using a tick probe as it gets referenced quite a bit but there aren't
>> that many examples around of how to do it. The important point about
>> this technique is for the user to be aware of its limitations and
>> how the data they are observing is modified.
>
> Thanks for confirming my hunch. I guess at this point the only question
> I have left is: what is really going to happen the next time
> I do t->t_forw? A SEGV? Or I'll be just off to chasing these pointers
> forever? See, I really don't know kernel well enough to make
> even an educated guess here. Any help would be appreciated.

If the data that you're using becomes invalid the logic of your
script may be affected or you may see runtime errors reported by
dtrace(1M) when you try to dereference invalid pointers (for example).
However, you shouldn't see any failures (such as a SEGV or panic) as
safety is baked into the design of DTrace - you can dereference all the
bad pointers you want from within your D script and all you see is a ton
of error messages reported back to you.

Jon.
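
The behaviour described above can be observed directly with the
dtrace:::ERROR probe, which fires when a clause hits a run-time error
such as an invalid pointer dereference. The sketch below follows the
pattern of the ERROR probe example in the DTrace documentation. The
NULL dereference is deliberate, and stopping the script with exit() is
a choice made for this illustration rather than something the thread
prescribes.

```d
#!/usr/sbin/dtrace -s

#pragma D option quiet

BEGIN
{
        /* Deliberately dereference an invalid (NULL) pointer. */
        trace(*(char *)NULL);
}

/*
 * Fires when any clause hits a run-time error. arg4 is the fault
 * type and arg5 is a fault-specific value (the address for
 * bad-address faults).
 */
ERROR
{
        printf("caught fault type %d at address 0x%x\n", arg4, arg5);
        exit(1);
}
```

A script that walks kernel lists from a tick probe could use a similar
ERROR clause to stop cleanly at the first bad pointer instead of
printing an error on every tick.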
