thr3ads.net - dtrace discuss - [dtrace-discuss] Trouble with speculations and running out of speculative buffers. [Jun 2008]

If this information is useful, please help other people find it:
Share via:

Brian Utterback

2008-Jun-06 18:58 UTC

[dtrace-discuss] Trouble with speculations and running out of speculative buffers.

A while back, I asked how to detect speculations that didn''t complete. 
At that time, I was told that the return value of speculation() is 
simply an integer in the range from 1 up to the value of nspec, and so 
you could tap into any speculation by using an integer.

Here was and is my problem. I have a customer who have a call to the 
connect syscall that hangs and never returns. This is using UNIX 
domain sockets, so that really shouldn''t happen at all. I am trying to 
find out what is different about the threads that hang. So, I create a 
speculation and trace the thread. The only thing is, how do I commit 
the speculation based on something that doesn''t happen?

The suggestion at the time was to place a timestamp into an array when 
the call starts, and then use the tick-100hz trick to iterate over the 
array, looking for timestamps above some threshold in the past. When I 
found one, use the index as the speculation number and commit it.

That sounded good, but it didn''t work. I thought maybe I was getting 
too clever with the tick trick and all that, so based on a message 
earlier this week, I though about simplifying the whole thing using 
END probes and unrolling the loop. You see, the customer can see the 
hung processes, so I figured if I committed all of the outstanding 
speculations when the DTrace script is interrupted, that would be just 
as good and much simpler, less chance something could go wrong.

So, just to test it out before giving it to my customer, I wrote a 
script that uses the pollsys command and traces the kernel threads 
that are in pollsys at any given moment, and also it prints a stack 
trace when it goes into the resume call (which is the last kernel 
function called in pollsys when it blocks).

I wrote a macro to print out each speculation, along with how many 
milliseconds it had been in the pollsys call. I stored the timestamp 
for each speculation for when it was created, and then printed out the 
difference when the script is interrupted. I used a this variable to 
iterate through the list of speculations.

It seems to work, except for some reason I get output for speculations 
with no speculation buffer. Here is the script:

#!/usr/sbin/dtrace  -Cs

#pragma D option nspec=16

#define printspec       END \
/ts[this->spec]/ \
{ \
         speculate(this->spec); \
         trace(this->spec); \
         trace((timestamp-ts[this->spec])/1000000); \
 
trace("\n===================================================\n\n"); \
} \
END \
/ts[this->spec]/ \
{ \
         commit(this->spec); \
         this->spec++; \
}

long ts[17];

syscall::pollsys:entry
{
         self->spec = speculation();
         speculate(self->spec);
         ts[self->spec] = timestamp;
         trace(self->spec);
         trace(tid);
         trace(pid);
         trace(execname);
}

fbt::resume:entry
/self->spec/
{
         speculate(self->spec);
         stack();
}

syscall::pollsys:return
/self->spec/
{
         ts[self->spec] = 0;
         discard(self->spec);
         self->spec = 0;
}

END{
         trace("end");
         trace(timestamp);
         this->spec = 1;
}

printspec
printspec
printspec
printspec
printspec
printspec
printspec
printspec
printspec
printspec
printspec
printspec
printspec
printspec
printspec
printspec


And here is the output after 15 seconds:

dtrace: script ''./test1.d'' matched 36 probes
^C
CPU     ID    FUNCTION:NAME
   1      2    :END   end      2071629828966800
   1   7359    pollsys:entry    1   1 19878  snmpd
   1  11509    resume:entry
               genunix`cv_timedwait_sig+0x160
               genunix`cv_waituntil_sig+0x84
               genunix`poll_common+0x498
               genunix`pollsys+0xcc
               genunix`dtrace_systrace_syscall+0xa8
               unix`syscall_trap+0xac

   1      2    :END         1         675
==================================================

   1   7359    pollsys:entry    2   1 550  sendmail
   1  11509    resume:entry
               genunix`cv_timedwait_sig+0x160
               genunix`cv_waituntil_sig+0x84
               genunix`poll_common+0x498
               genunix`pollsys+0xcc
               genunix`dtrace_systrace_syscall32+0xa8
               unix`syscall_trap32+0xcc

   1      2    :END         2        4748
==================================================

   1   7359    pollsys:entry    3   1  456  dhcpagent
   1  11509    resume:entry
               genunix`cv_timedwait_sig+0x160
               genunix`cv_waituntil_sig+0x84
               genunix`poll_common+0x498
               genunix`pollsys+0xcc
               genunix`dtrace_systrace_syscall32+0xa8
               unix`syscall_trap32+0xcc

   1      2    :END         3         818
==================================================

   1   7359    pollsys:entry    4   12   911  java
   1  11509    resume:entry
               genunix`cv_timedwait_sig+0x160
               genunix`cv_waituntil_sig+0x84
               genunix`poll_common+0x13c
               genunix`pollsys+0xcc
               genunix`dtrace_systrace_syscall32+0xa8
               unix`syscall_trap32+0xcc

   1      2    :END         4          29
==================================================

   1   7359    pollsys:entry    5   14   911  java
   1  11509    resume:entry
               genunix`cv_timedwait_sig+0x160
               genunix`cv_waituntil_sig+0x84
               genunix`poll_common+0x13c
               genunix`pollsys+0xcc
               genunix`dtrace_systrace_syscall32+0xa8
               unix`syscall_trap32+0xcc

   1      2    :END         5        9163
==================================================

   1      2    :END         6        1686
==================================================


See the problem? In this case speculations 5 and 6 both have non-zero 
  ts array entries (or they would not have printed) but there is 
nothing in the speculation buffer (otherwise there would have been the 
first set of traces printed). Any ideas why this is happening?
-- 
blu

There are two rules in life:
Rule 1- Don''t tell people everything you know
----------------------------------------------------------------------
Brian Utterback - Solaris RPE, Sun Microsystems, Inc.
Ph:877-259-7345, Em:brian.utterback-at-ess-you-enn-dot-kom

Brian Utterback

2008-Jun-07 12:20 UTC

head link

[dtrace-discuss] Trouble with speculations and running out of speculative buffers.

Is it possible that speculation buffers are per CPU? I ran a program 
that reads data from a socket and mostly juts sits in pollsys. When I 
ran the program I enclosed, it often did not print out anything, even 
though I know it went into pollsys. I even changed the tests to print 
out all of the speculation buffers without regard to the setting of 
the ts variable, and I still did not see anything, but I almost always 
had indices where the ts variable was non-zero but the commit didn''t 
print anything. I thought maybe the speculation buffer might need to 
be on the samne CPU or something.

This is really a problem, I have to be sure I print out the hung 
threads. It takes a while for the processes and I can''t keep going 
back and saying, "well, that script didn''t get it, try this
one."


Brian Utterback wrote:> A while back, I asked how to detect speculations that didn''t
complete.
> At that time, I was told that the return value of speculation() is 
> simply an integer in the range from 1 up to the value of nspec, and so 
> you could tap into any speculation by using an integer.
> 
> Here was and is my problem. I have a customer who have a call to the 
> connect syscall that hangs and never returns. This is using UNIX 
> domain sockets, so that really shouldn''t happen at all. I am
trying to
> find out what is different about the threads that hang. So, I create a 
> speculation and trace the thread. The only thing is, how do I commit 
> the speculation based on something that doesn''t happen?
> 
> The suggestion at the time was to place a timestamp into an array when 
> the call starts, and then use the tick-100hz trick to iterate over the 
> array, looking for timestamps above some threshold in the past. When I 
> found one, use the index as the speculation number and commit it.
> 
> That sounded good, but it didn''t work. I thought maybe I was
getting
> too clever with the tick trick and all that, so based on a message 
> earlier this week, I though about simplifying the whole thing using 
> END probes and unrolling the loop. You see, the customer can see the 
> hung processes, so I figured if I committed all of the outstanding 
> speculations when the DTrace script is interrupted, that would be just 
> as good and much simpler, less chance something could go wrong.
> 
> So, just to test it out before giving it to my customer, I wrote a 
> script that uses the pollsys command and traces the kernel threads 
> that are in pollsys at any given moment, and also it prints a stack 
> trace when it goes into the resume call (which is the last kernel 
> function called in pollsys when it blocks).
> 
> I wrote a macro to print out each speculation, along with how many 
> milliseconds it had been in the pollsys call. I stored the timestamp 
> for each speculation for when it was created, and then printed out the 
> difference when the script is interrupted. I used a this variable to 
> iterate through the list of speculations.
> 
> It seems to work, except for some reason I get output for speculations 
> with no speculation buffer. Here is the script:
> 
> #!/usr/sbin/dtrace  -Cs
> 
> #pragma D option nspec=16
> 
> #define printspec       END \
> /ts[this->spec]/ \
> { \
>          speculate(this->spec); \
>          trace(this->spec); \
>          trace((timestamp-ts[this->spec])/1000000); \
>  
>
trace("\n===================================================\n\n"); \
> } \
> END \
> /ts[this->spec]/ \
> { \
>          commit(this->spec); \
>          this->spec++; \
> }
> 
> long ts[17];
> 
> syscall::pollsys:entry
> {
>          self->spec = speculation();
>          speculate(self->spec);
>          ts[self->spec] = timestamp;
>          trace(self->spec);
>          trace(tid);
>          trace(pid);
>          trace(execname);
> }
> 
> fbt::resume:entry
> /self->spec/
> {
>          speculate(self->spec);
>          stack();
> }
> 
> syscall::pollsys:return
> /self->spec/
> {
>          ts[self->spec] = 0;
>          discard(self->spec);
>          self->spec = 0;
> }
> 
> END{
>          trace("end");
>          trace(timestamp);
>          this->spec = 1;
> }
> 
> printspec
> printspec
> printspec
> printspec
> printspec
> printspec
> printspec
> printspec
> printspec
> printspec
> printspec
> printspec
> printspec
> printspec
> printspec
> printspec
> 
> 
> And here is the output after 15 seconds:
> 
> dtrace: script ''./test1.d'' matched 36 probes
> ^C
> CPU     ID    FUNCTION:NAME
>    1      2    :END   end      2071629828966800
>    1   7359    pollsys:entry    1   1 19878  snmpd
>    1  11509    resume:entry
>                genunix`cv_timedwait_sig+0x160
>                genunix`cv_waituntil_sig+0x84
>                genunix`poll_common+0x498
>                genunix`pollsys+0xcc
>                genunix`dtrace_systrace_syscall+0xa8
>                unix`syscall_trap+0xac
> 
>    1      2    :END         1         675
> ==================================================> 
> 
>    1   7359    pollsys:entry    2   1 550  sendmail
>    1  11509    resume:entry
>                genunix`cv_timedwait_sig+0x160
>                genunix`cv_waituntil_sig+0x84
>                genunix`poll_common+0x498
>                genunix`pollsys+0xcc
>                genunix`dtrace_systrace_syscall32+0xa8
>                unix`syscall_trap32+0xcc
> 
>    1      2    :END         2        4748
> ==================================================> 
> 
>    1   7359    pollsys:entry    3   1  456  dhcpagent
>    1  11509    resume:entry
>                genunix`cv_timedwait_sig+0x160
>                genunix`cv_waituntil_sig+0x84
>                genunix`poll_common+0x498
>                genunix`pollsys+0xcc
>                genunix`dtrace_systrace_syscall32+0xa8
>                unix`syscall_trap32+0xcc
> 
>    1      2    :END         3         818
> ==================================================> 
> 
>    1   7359    pollsys:entry    4   12   911  java
>    1  11509    resume:entry
>                genunix`cv_timedwait_sig+0x160
>                genunix`cv_waituntil_sig+0x84
>                genunix`poll_common+0x13c
>                genunix`pollsys+0xcc
>                genunix`dtrace_systrace_syscall32+0xa8
>                unix`syscall_trap32+0xcc
> 
>    1      2    :END         4          29
> ==================================================> 
> 
>    1   7359    pollsys:entry    5   14   911  java
>    1  11509    resume:entry
>                genunix`cv_timedwait_sig+0x160
>                genunix`cv_waituntil_sig+0x84
>                genunix`poll_common+0x13c
>                genunix`pollsys+0xcc
>                genunix`dtrace_systrace_syscall32+0xa8
>                unix`syscall_trap32+0xcc
> 
>    1      2    :END         5        9163
> ==================================================> 
> 
>    1      2    :END         6        1686
> ==================================================> 
> 
> 
> See the problem? In this case speculations 5 and 6 both have non-zero 
>   ts array entries (or they would not have printed) but there is 
> nothing in the speculation buffer (otherwise there would have been the 
> first set of traces printed). Any ideas why this is happening?
-- 
blu

There are two rules in life:
Rule 1- Don''t tell people everything you know
----------------------------------------------------------------------
Brian Utterback - Solaris RPE, Sun Microsystems, Inc.
Ph:877-259-7345, Em:brian.utterback-at-ess-you-enn-dot-kom

Pramod Batni

2008-Jun-09 09:35 UTC

head link

[dtrace-discuss] Trouble with speculations and running out of speculative buffers.

>
>    1      2    :END         5        9163
> ==================================================>
>
>    1      2    :END         6        1686
> ==================================================>
>
>
> See the problem? In this case speculations 5 and 6 both have non-zero 
>   ts array entries (or they would not have printed) but there is 
> nothing in the speculation buffer (otherwise there would have been the 
> first set of traces printed). Any ideas why this is happening?
>   Consider the following 2 probes firing on 2 different cpu''s at 
approximately the same time.> } \
> END \
> /ts[this->spec]/ \
> { \
>          commit(this->spec); \
>          this->spec++; \
> }
>
>
>
> syscall::pollsys:return
> /self->spec/
> {
>          ts[self->spec] = 0;
>          discard(self->spec);
>          self->spec = 0;
> }
>
>   The interleaved execution of the predicate and the clause could lead to 
the kind of output you are observing.
The first probe would find ts[this->spec] to be non-zero and before it 
can execute ''commit(this->spec)'' the second
probe would fire and execute ''discard(self->spec)'' . The
contents of the
speculative buffer are discarded and the
''commit(this->spec)''
would become a nop.
Such a sequence could explain the missing data for speculation 5 and 6.


Pramod

Brian Utterback

2008-Jun-09 11:36 UTC

head link

[dtrace-discuss] Trouble with speculations and running out of speculative buffers.

Actually, I thought that there might be some kind of race condition 
like that, but that really can''t be it. See the time stamps? 
Speculation 5 was set 9 seconds earlier, and speculation 6 was set 1.6 
  seconds earlier. The point is that I am seeing a very small number 
of syspoll calls and they tend to wait a relatively long time. I could 
see getting that kind of race once in awhile at that rate, or even all 
of the time if the rate of pollsys calls was very high, but that isn''t 
the case. Also, there is the one process that I know is waiting in 
syspoll that doesn''t show up here at all.

Pramod Batni wrote:> 
> 
> 
>>
>>    1      2    :END         5        9163
>> ==================================================>>
>>
>>    1      2    :END         6        1686
>> ==================================================>>
>>
>>
>> See the problem? In this case speculations 5 and 6 both have non-zero 
>>   ts array entries (or they would not have printed) but there is 
>> nothing in the speculation buffer (otherwise there would have been the 
>> first set of traces printed). Any ideas why this is happening?
>>   
> Consider the following 2 probes firing on 2 different cpu''s at 
> approximately the same time.
>> } \
>> END \
>> /ts[this->spec]/ \
>> { \
>>          commit(this->spec); \
>>          this->spec++; \
>> }
>>
>>
>>
>> syscall::pollsys:return
>> /self->spec/
>> {
>>          ts[self->spec] = 0;
>>          discard(self->spec);
>>          self->spec = 0;
>> }
>>
>>   
> The interleaved execution of the predicate and the clause could lead to 
> the kind of output you are observing.
> The first probe would find ts[this->spec] to be non-zero and before it 
> can execute ''commit(this->spec)'' the second
> probe would fire and execute ''discard(self->spec)'' .
The contents of the
> speculative buffer are discarded and the
''commit(this->spec)''
> would become a nop.
> Such a sequence could explain the missing data for speculation 5 and 6.
> 
> 
> Pramod
-- 
blu

There are two rules in life:
Rule 1- Don''t tell people everything you know
----------------------------------------------------------------------
Brian Utterback - Solaris RPE, Sun Microsystems, Inc.
Ph:877-259-7345, Em:brian.utterback-at-ess-you-enn-dot-kom

Brian Utterback

2008-Jun-09 12:16 UTC

head link

[dtrace-discuss] Trouble with speculations and running out of speculative buffers.

Ah, Pramod was right, for at least part of my problem. I just re-read 
the chapter on speculative tracing, and it says that when a commit is 
done, the speculative buffer from the cpu that does the commit is 
immediately copied to the principle buffer, with the speculative 
buffer from other cpus copied sometime later. This greatly increases 
the chances of the the discard happening before the buffer is copied, 
but after the value of ts is examined. It still doesn''t explain the 
missing processes though.

I worked around this problem by having a global variable on the clause 
with the discard. In the very first END clause, I change the global so 
the clause with the discard is thus always false and no more discard 
occur. This got rid of all the empty output. Are global variables 
coherent across CPUs?

Brian Utterback wrote:> Actually, I thought that there might be some kind of race condition 
> like that, but that really can''t be it. See the time stamps? 
> Speculation 5 was set 9 seconds earlier, and speculation 6 was set 1.6 
>   seconds earlier. The point is that I am seeing a very small number 
> of syspoll calls and they tend to wait a relatively long time. I could 
> see getting that kind of race once in awhile at that rate, or even all 
> of the time if the rate of pollsys calls was very high, but that
isn''t
> the case. Also, there is the one process that I know is waiting in 
> syspoll that doesn''t show up here at all.
> 
> Pramod Batni wrote:
>>
>>
>>>    1      2    :END         5        9163
>>> ==================================================>>>
>>>
>>>    1      2    :END         6        1686
>>> ==================================================>>>
>>>
>>>
>>> See the problem? In this case speculations 5 and 6 both have
non-zero
>>>   ts array entries (or they would not have printed) but there is 
>>> nothing in the speculation buffer (otherwise there would have been
the
>>> first set of traces printed). Any ideas why this is happening?
>>>   
>> Consider the following 2 probes firing on 2 different cpu''s at
>> approximately the same time.
>>> } \
>>> END \
>>> /ts[this->spec]/ \
>>> { \
>>>          commit(this->spec); \
>>>          this->spec++; \
>>> }
>>>
>>>
>>>
>>> syscall::pollsys:return
>>> /self->spec/
>>> {
>>>          ts[self->spec] = 0;
>>>          discard(self->spec);
>>>          self->spec = 0;
>>> }
>>>
>>>   
>> The interleaved execution of the predicate and the clause could lead to
>> the kind of output you are observing.
>> The first probe would find ts[this->spec] to be non-zero and before
it
>> can execute ''commit(this->spec)'' the second
>> probe would fire and execute ''discard(self->spec)''
. The contents of the
>> speculative buffer are discarded and the
''commit(this->spec)''
>> would become a nop.
>> Such a sequence could explain the missing data for speculation 5 and 6.
>>
>>
>> Pramod
> 
-- 
blu

There are two rules in life:
Rule 1- Don''t tell people everything you know
----------------------------------------------------------------------
Brian Utterback - Solaris RPE, Sun Microsystems, Inc.
Ph:877-259-7345, Em:brian.utterback-at-ess-you-enn-dot-kom

Brian Utterback

2008-Jun-09 14:15 UTC

head link

[dtrace-discuss] Trouble with speculations and running out of speculative buffers.

Actually, some more testing and I found what I think is either a bug, 
or perhaps a mis-feature. The problem has nothing to do with a race 
between the commit and the discard. It has all to do with commit 
processing in an END clause.

The missing processes always have their missing speculative data 
having been recorded on a cpu that different from the one executing 
the END clauses. When I do a commit of a speculation in an END clause, 
I never, ever, see any of the speculative data that was recorded on 
any other cpu. From this fact and from what I said in my last message 
about when speculative buffers from other cpus get copied into the 
principle buffer, I will speculate (heh) that the root cause is that 
after the END clauses finish, dtrace exits and thus never gives itself 
  a chance to do the copying. According to the docs, it does this 
copying at a rate set by the "cleanrate". My guess is that dtrace 
exits and does no more copying after the last END clause finishes.

I tried putting in one last END clause with a chill in it that was 
longer than the cleanrate, put that didn''t help. I suspect that the 
chill happens in the kernel context and the other stuff is in the 
dtrace context, so the chill always executes first, even though it is 
positionally last. In fact, it had the interesting side effect of 
stopping any  data from being printed out at all in my END clauses. I 
suspect that the de-allocation of some the variables had already 
occurred. Boy, the END processing can get interesting.

Brian Utterback wrote:> Ah, Pramod was right, for at least part of my problem. I just re-read 
> the chapter on speculative tracing, and it says that when a commit is 
> done, the speculative buffer from the cpu that does the commit is 
> immediately copied to the principle buffer, with the speculative 
> buffer from other cpus copied sometime later. This greatly increases 
> the chances of the the discard happening before the buffer is copied, 
> but after the value of ts is examined. It still doesn''t explain
the
> missing processes though.
> 
> I worked around this problem by having a global variable on the clause 
> with the discard. In the very first END clause, I change the global so 
> the clause with the discard is thus always false and no more discard 
> occur. This got rid of all the empty output. Are global variables 
> coherent across CPUs?
> 
> Brian Utterback wrote:
>> Actually, I thought that there might be some kind of race condition 
>> like that, but that really can''t be it. See the time stamps? 
>> Speculation 5 was set 9 seconds earlier, and speculation 6 was set 1.6 
>>   seconds earlier. The point is that I am seeing a very small number 
>> of syspoll calls and they tend to wait a relatively long time. I could 
>> see getting that kind of race once in awhile at that rate, or even all 
>> of the time if the rate of pollsys calls was very high, but that
isn''t
>> the case. Also, there is the one process that I know is waiting in 
>> syspoll that doesn''t show up here at all.
>>
>> Pramod Batni wrote:
>>>
>>>>    1      2    :END         5        9163
>>>>
==================================================>>>>
>>>>
>>>>    1      2    :END         6        1686
>>>>
==================================================>>>>
>>>>
>>>>
>>>> See the problem? In this case speculations 5 and 6 both have
non-zero
>>>>   ts array entries (or they would not have printed) but there
is
>>>> nothing in the speculation buffer (otherwise there would have
been the
>>>> first set of traces printed). Any ideas why this is happening?
>>>>   
>>> Consider the following 2 probes firing on 2 different
cpu''s at
>>> approximately the same time.
>>>> } \
>>>> END \
>>>> /ts[this->spec]/ \
>>>> { \
>>>>          commit(this->spec); \
>>>>          this->spec++; \
>>>> }
>>>>
>>>>
>>>>
>>>> syscall::pollsys:return
>>>> /self->spec/
>>>> {
>>>>          ts[self->spec] = 0;
>>>>          discard(self->spec);
>>>>          self->spec = 0;
>>>> }
>>>>
>>>>   
>>> The interleaved execution of the predicate and the clause could
lead to
>>> the kind of output you are observing.
>>> The first probe would find ts[this->spec] to be non-zero and
before it
>>> can execute ''commit(this->spec)'' the second
>>> probe would fire and execute
''discard(self->spec)'' . The contents of the
>>> speculative buffer are discarded and the
''commit(this->spec)''
>>> would become a nop.
>>> Such a sequence could explain the missing data for speculation 5
and 6.
>>>
>>>
>>> Pramod
> 
-- 
blu

There are two rules in life:
Rule 1- Don''t tell people everything you know
----------------------------------------------------------------------
Brian Utterback - Solaris RPE, Sun Microsystems, Inc.
Ph:877-259-7345, Em:brian.utterback-at-ess-you-enn-dot-kom

Pramod Batni

2008-Jun-09 14:16 UTC

head link

[dtrace-discuss] Trouble with speculations and running out of speculative buffers.

Brian Utterback wrote:> Ah, Pramod was right, for at least part of my problem. I just re-read 
> the chapter on speculative tracing, and it says that when a commit is 
> done, the speculative buffer from the cpu that does the commit is 
> immediately copied to the principle buffer, with the speculative 
> buffer from other cpus copied sometime later. This greatly increases 
> the chances of the the discard happening before the buffer is copied, 
> but after the value of ts is examined. It still doesn''t explain
the
> missing processes though.
   The dtrace provider chapter mentions 
(http://wikis.sun.com/display/DTrace/dtrace+Provider) :
  ''The END probe fires after all other probes. This probe will not fire
until all other probe clauses have completed''

  As per the above assertion the race [as mentioned in the previous 
mail] between the commit and discard cannot happen because in the
  script the commit is done when the END  probe fires and the END probe 
fires when all the other probe clauses
  (discard clause) have completed.

Pramod>
> I worked around this problem by having a global variable on the clause 
> with the discard. In the very first END clause, I change the global so 
> the clause with the discard is thus always false and no more discard 
> occur. This got rid of all the empty output. Are global variables 
> coherent across CPUs?
>
> Brian Utterback wrote:
>> Actually, I thought that there might be some kind of race condition 
>> like that, but that really can''t be it. See the time stamps? 
>> Speculation 5 was set 9 seconds earlier, and speculation 6 was set 
>> 1.6   seconds earlier. The point is that I am seeing a very small 
>> number of syspoll calls and they tend to wait a relatively long time. 
>> I could see getting that kind of race once in awhile at that rate, or 
>> even all of the time if the rate of pollsys calls was very high, but 
>> that isn''t the case. Also, there is the one process that I
know is
>> waiting in syspoll that doesn''t show up here at all.
>>
>> Pramod Batni wrote:
>>>
>>>
>>>>    1      2    :END         5        9163
>>>>
==================================================>>>>
>>>>
>>>>    1      2    :END         6        1686
>>>>
==================================================>>>>
>>>>
>>>>
>>>> See the problem? In this case speculations 5 and 6 both have 
>>>> non-zero   ts array entries (or they would not have printed)
but
>>>> there is nothing in the speculation buffer (otherwise there
would
>>>> have been the first set of traces printed). Any ideas why this
is
>>>> happening?
>>>>   
>>> Consider the following 2 probes firing on 2 different
cpu''s at
>>> approximately the same time.
>>>> } \
>>>> END \
>>>> /ts[this->spec]/ \
>>>> { \
>>>>          commit(this->spec); \
>>>>          this->spec++; \
>>>> }
>>>>
>>>>
>>>>
>>>> syscall::pollsys:return
>>>> /self->spec/
>>>> {
>>>>          ts[self->spec] = 0;
>>>>          discard(self->spec);
>>>>          self->spec = 0;
>>>> }
>>>>
>>>>   
>>> The interleaved execution of the predicate and the clause could
lead
>>> to the kind of output you are observing.
>>> The first probe would find ts[this->spec] to be non-zero and
before
>>> it can execute ''commit(this->spec)'' the second
>>> probe would fire and execute
''discard(self->spec)'' . The contents of
>>> the speculative buffer are discarded and the
''commit(this->spec)''
>>> would become a nop.
>>> Such a sequence could explain the missing data for speculation 5
and 6.
>>>
>>>
>>> Pramod
>>
>

dtrace discuss - Jun 2008 - Trouble with speculations and running out of speculative buffers.

[dtrace-discuss] Trouble with speculations and running out of speculative buffers.

[dtrace-discuss] Trouble with speculations and running out of speculative buffers.

[dtrace-discuss] Trouble with speculations and running out of speculative buffers.

[dtrace-discuss] Trouble with speculations and running out of speculative buffers.

[dtrace-discuss] Trouble with speculations and running out of speculative buffers.

[dtrace-discuss] Trouble with speculations and running out of speculative buffers.

[dtrace-discuss] Trouble with speculations and running out of speculative buffers.