Brian Utterback
2008-Jun-06 18:58 UTC
[dtrace-discuss] Trouble with speculations and running out of speculative buffers.
A while back, I asked how to detect speculations that didn''t complete. At that time, I was told that the return value of speculation() is simply an integer in the range from 1 up to the value of nspec, and so you could tap into any speculation by using an integer. Here was and is my problem. I have a customer who have a call to the connect syscall that hangs and never returns. This is using UNIX domain sockets, so that really shouldn''t happen at all. I am trying to find out what is different about the threads that hang. So, I create a speculation and trace the thread. The only thing is, how do I commit the speculation based on something that doesn''t happen? The suggestion at the time was to place a timestamp into an array when the call starts, and then use the tick-100hz trick to iterate over the array, looking for timestamps above some threshold in the past. When I found one, use the index as the speculation number and commit it. That sounded good, but it didn''t work. I thought maybe I was getting too clever with the tick trick and all that, so based on a message earlier this week, I though about simplifying the whole thing using END probes and unrolling the loop. You see, the customer can see the hung processes, so I figured if I committed all of the outstanding speculations when the DTrace script is interrupted, that would be just as good and much simpler, less chance something could go wrong. So, just to test it out before giving it to my customer, I wrote a script that uses the pollsys command and traces the kernel threads that are in pollsys at any given moment, and also it prints a stack trace when it goes into the resume call (which is the last kernel function called in pollsys when it blocks). I wrote a macro to print out each speculation, along with how many milliseconds it had been in the pollsys call. I stored the timestamp for each speculation for when it was created, and then printed out the difference when the script is interrupted. I used a this variable to iterate through the list of speculations. It seems to work, except for some reason I get output for speculations with no speculation buffer. Here is the script: #!/usr/sbin/dtrace -Cs #pragma D option nspec=16 #define printspec END \ /ts[this->spec]/ \ { \ speculate(this->spec); \ trace(this->spec); \ trace((timestamp-ts[this->spec])/1000000); \ trace("\n===================================================\n\n"); \ } \ END \ /ts[this->spec]/ \ { \ commit(this->spec); \ this->spec++; \ } long ts[17]; syscall::pollsys:entry { self->spec = speculation(); speculate(self->spec); ts[self->spec] = timestamp; trace(self->spec); trace(tid); trace(pid); trace(execname); } fbt::resume:entry /self->spec/ { speculate(self->spec); stack(); } syscall::pollsys:return /self->spec/ { ts[self->spec] = 0; discard(self->spec); self->spec = 0; } END{ trace("end"); trace(timestamp); this->spec = 1; } printspec printspec printspec printspec printspec printspec printspec printspec printspec printspec printspec printspec printspec printspec printspec printspec And here is the output after 15 seconds: dtrace: script ''./test1.d'' matched 36 probes ^C CPU ID FUNCTION:NAME 1 2 :END end 2071629828966800 1 7359 pollsys:entry 1 1 19878 snmpd 1 11509 resume:entry genunix`cv_timedwait_sig+0x160 genunix`cv_waituntil_sig+0x84 genunix`poll_common+0x498 genunix`pollsys+0xcc genunix`dtrace_systrace_syscall+0xa8 unix`syscall_trap+0xac 1 2 :END 1 675 ================================================== 1 7359 pollsys:entry 2 1 550 sendmail 1 11509 resume:entry genunix`cv_timedwait_sig+0x160 genunix`cv_waituntil_sig+0x84 genunix`poll_common+0x498 genunix`pollsys+0xcc genunix`dtrace_systrace_syscall32+0xa8 unix`syscall_trap32+0xcc 1 2 :END 2 4748 ================================================== 1 7359 pollsys:entry 3 1 456 dhcpagent 1 11509 resume:entry genunix`cv_timedwait_sig+0x160 genunix`cv_waituntil_sig+0x84 genunix`poll_common+0x498 genunix`pollsys+0xcc genunix`dtrace_systrace_syscall32+0xa8 unix`syscall_trap32+0xcc 1 2 :END 3 818 ================================================== 1 7359 pollsys:entry 4 12 911 java 1 11509 resume:entry genunix`cv_timedwait_sig+0x160 genunix`cv_waituntil_sig+0x84 genunix`poll_common+0x13c genunix`pollsys+0xcc genunix`dtrace_systrace_syscall32+0xa8 unix`syscall_trap32+0xcc 1 2 :END 4 29 ================================================== 1 7359 pollsys:entry 5 14 911 java 1 11509 resume:entry genunix`cv_timedwait_sig+0x160 genunix`cv_waituntil_sig+0x84 genunix`poll_common+0x13c genunix`pollsys+0xcc genunix`dtrace_systrace_syscall32+0xa8 unix`syscall_trap32+0xcc 1 2 :END 5 9163 ================================================== 1 2 :END 6 1686 ================================================== See the problem? In this case speculations 5 and 6 both have non-zero ts array entries (or they would not have printed) but there is nothing in the speculation buffer (otherwise there would have been the first set of traces printed). Any ideas why this is happening? -- blu There are two rules in life: Rule 1- Don''t tell people everything you know ---------------------------------------------------------------------- Brian Utterback - Solaris RPE, Sun Microsystems, Inc. Ph:877-259-7345, Em:brian.utterback-at-ess-you-enn-dot-kom
Brian Utterback
2008-Jun-07 12:20 UTC
[dtrace-discuss] Trouble with speculations and running out of speculative buffers.
Is it possible that speculation buffers are per CPU? I ran a program that reads data from a socket and mostly juts sits in pollsys. When I ran the program I enclosed, it often did not print out anything, even though I know it went into pollsys. I even changed the tests to print out all of the speculation buffers without regard to the setting of the ts variable, and I still did not see anything, but I almost always had indices where the ts variable was non-zero but the commit didn''t print anything. I thought maybe the speculation buffer might need to be on the samne CPU or something. This is really a problem, I have to be sure I print out the hung threads. It takes a while for the processes and I can''t keep going back and saying, "well, that script didn''t get it, try this one." Brian Utterback wrote:> A while back, I asked how to detect speculations that didn''t complete. > At that time, I was told that the return value of speculation() is > simply an integer in the range from 1 up to the value of nspec, and so > you could tap into any speculation by using an integer. > > Here was and is my problem. I have a customer who have a call to the > connect syscall that hangs and never returns. This is using UNIX > domain sockets, so that really shouldn''t happen at all. I am trying to > find out what is different about the threads that hang. So, I create a > speculation and trace the thread. The only thing is, how do I commit > the speculation based on something that doesn''t happen? > > The suggestion at the time was to place a timestamp into an array when > the call starts, and then use the tick-100hz trick to iterate over the > array, looking for timestamps above some threshold in the past. When I > found one, use the index as the speculation number and commit it. > > That sounded good, but it didn''t work. I thought maybe I was getting > too clever with the tick trick and all that, so based on a message > earlier this week, I though about simplifying the whole thing using > END probes and unrolling the loop. You see, the customer can see the > hung processes, so I figured if I committed all of the outstanding > speculations when the DTrace script is interrupted, that would be just > as good and much simpler, less chance something could go wrong. > > So, just to test it out before giving it to my customer, I wrote a > script that uses the pollsys command and traces the kernel threads > that are in pollsys at any given moment, and also it prints a stack > trace when it goes into the resume call (which is the last kernel > function called in pollsys when it blocks). > > I wrote a macro to print out each speculation, along with how many > milliseconds it had been in the pollsys call. I stored the timestamp > for each speculation for when it was created, and then printed out the > difference when the script is interrupted. I used a this variable to > iterate through the list of speculations. > > It seems to work, except for some reason I get output for speculations > with no speculation buffer. Here is the script: > > #!/usr/sbin/dtrace -Cs > > #pragma D option nspec=16 > > #define printspec END \ > /ts[this->spec]/ \ > { \ > speculate(this->spec); \ > trace(this->spec); \ > trace((timestamp-ts[this->spec])/1000000); \ > > trace("\n===================================================\n\n"); \ > } \ > END \ > /ts[this->spec]/ \ > { \ > commit(this->spec); \ > this->spec++; \ > } > > long ts[17]; > > syscall::pollsys:entry > { > self->spec = speculation(); > speculate(self->spec); > ts[self->spec] = timestamp; > trace(self->spec); > trace(tid); > trace(pid); > trace(execname); > } > > fbt::resume:entry > /self->spec/ > { > speculate(self->spec); > stack(); > } > > syscall::pollsys:return > /self->spec/ > { > ts[self->spec] = 0; > discard(self->spec); > self->spec = 0; > } > > END{ > trace("end"); > trace(timestamp); > this->spec = 1; > } > > printspec > printspec > printspec > printspec > printspec > printspec > printspec > printspec > printspec > printspec > printspec > printspec > printspec > printspec > printspec > printspec > > > And here is the output after 15 seconds: > > dtrace: script ''./test1.d'' matched 36 probes > ^C > CPU ID FUNCTION:NAME > 1 2 :END end 2071629828966800 > 1 7359 pollsys:entry 1 1 19878 snmpd > 1 11509 resume:entry > genunix`cv_timedwait_sig+0x160 > genunix`cv_waituntil_sig+0x84 > genunix`poll_common+0x498 > genunix`pollsys+0xcc > genunix`dtrace_systrace_syscall+0xa8 > unix`syscall_trap+0xac > > 1 2 :END 1 675 > ==================================================> > > 1 7359 pollsys:entry 2 1 550 sendmail > 1 11509 resume:entry > genunix`cv_timedwait_sig+0x160 > genunix`cv_waituntil_sig+0x84 > genunix`poll_common+0x498 > genunix`pollsys+0xcc > genunix`dtrace_systrace_syscall32+0xa8 > unix`syscall_trap32+0xcc > > 1 2 :END 2 4748 > ==================================================> > > 1 7359 pollsys:entry 3 1 456 dhcpagent > 1 11509 resume:entry > genunix`cv_timedwait_sig+0x160 > genunix`cv_waituntil_sig+0x84 > genunix`poll_common+0x498 > genunix`pollsys+0xcc > genunix`dtrace_systrace_syscall32+0xa8 > unix`syscall_trap32+0xcc > > 1 2 :END 3 818 > ==================================================> > > 1 7359 pollsys:entry 4 12 911 java > 1 11509 resume:entry > genunix`cv_timedwait_sig+0x160 > genunix`cv_waituntil_sig+0x84 > genunix`poll_common+0x13c > genunix`pollsys+0xcc > genunix`dtrace_systrace_syscall32+0xa8 > unix`syscall_trap32+0xcc > > 1 2 :END 4 29 > ==================================================> > > 1 7359 pollsys:entry 5 14 911 java > 1 11509 resume:entry > genunix`cv_timedwait_sig+0x160 > genunix`cv_waituntil_sig+0x84 > genunix`poll_common+0x13c > genunix`pollsys+0xcc > genunix`dtrace_systrace_syscall32+0xa8 > unix`syscall_trap32+0xcc > > 1 2 :END 5 9163 > ==================================================> > > 1 2 :END 6 1686 > ==================================================> > > > See the problem? In this case speculations 5 and 6 both have non-zero > ts array entries (or they would not have printed) but there is > nothing in the speculation buffer (otherwise there would have been the > first set of traces printed). Any ideas why this is happening?-- blu There are two rules in life: Rule 1- Don''t tell people everything you know ---------------------------------------------------------------------- Brian Utterback - Solaris RPE, Sun Microsystems, Inc. Ph:877-259-7345, Em:brian.utterback-at-ess-you-enn-dot-kom
Pramod Batni
2008-Jun-09 09:35 UTC
[dtrace-discuss] Trouble with speculations and running out of speculative buffers.
> > 1 2 :END 5 9163 > ==================================================> > > 1 2 :END 6 1686 > ==================================================> > > > See the problem? In this case speculations 5 and 6 both have non-zero > ts array entries (or they would not have printed) but there is > nothing in the speculation buffer (otherwise there would have been the > first set of traces printed). Any ideas why this is happening? >Consider the following 2 probes firing on 2 different cpu''s at approximately the same time.> } \ > END \ > /ts[this->spec]/ \ > { \ > commit(this->spec); \ > this->spec++; \ > } > > > > syscall::pollsys:return > /self->spec/ > { > ts[self->spec] = 0; > discard(self->spec); > self->spec = 0; > } > >The interleaved execution of the predicate and the clause could lead to the kind of output you are observing. The first probe would find ts[this->spec] to be non-zero and before it can execute ''commit(this->spec)'' the second probe would fire and execute ''discard(self->spec)'' . The contents of the speculative buffer are discarded and the ''commit(this->spec)'' would become a nop. Such a sequence could explain the missing data for speculation 5 and 6. Pramod
Brian Utterback
2008-Jun-09 11:36 UTC
[dtrace-discuss] Trouble with speculations and running out of speculative buffers.
Actually, I thought that there might be some kind of race condition like that, but that really can''t be it. See the time stamps? Speculation 5 was set 9 seconds earlier, and speculation 6 was set 1.6 seconds earlier. The point is that I am seeing a very small number of syspoll calls and they tend to wait a relatively long time. I could see getting that kind of race once in awhile at that rate, or even all of the time if the rate of pollsys calls was very high, but that isn''t the case. Also, there is the one process that I know is waiting in syspoll that doesn''t show up here at all. Pramod Batni wrote:> > > >> >> 1 2 :END 5 9163 >> ==================================================>> >> >> 1 2 :END 6 1686 >> ==================================================>> >> >> >> See the problem? In this case speculations 5 and 6 both have non-zero >> ts array entries (or they would not have printed) but there is >> nothing in the speculation buffer (otherwise there would have been the >> first set of traces printed). Any ideas why this is happening? >> > Consider the following 2 probes firing on 2 different cpu''s at > approximately the same time. >> } \ >> END \ >> /ts[this->spec]/ \ >> { \ >> commit(this->spec); \ >> this->spec++; \ >> } >> >> >> >> syscall::pollsys:return >> /self->spec/ >> { >> ts[self->spec] = 0; >> discard(self->spec); >> self->spec = 0; >> } >> >> > The interleaved execution of the predicate and the clause could lead to > the kind of output you are observing. > The first probe would find ts[this->spec] to be non-zero and before it > can execute ''commit(this->spec)'' the second > probe would fire and execute ''discard(self->spec)'' . The contents of the > speculative buffer are discarded and the ''commit(this->spec)'' > would become a nop. > Such a sequence could explain the missing data for speculation 5 and 6. > > > Pramod-- blu There are two rules in life: Rule 1- Don''t tell people everything you know ---------------------------------------------------------------------- Brian Utterback - Solaris RPE, Sun Microsystems, Inc. Ph:877-259-7345, Em:brian.utterback-at-ess-you-enn-dot-kom
Brian Utterback
2008-Jun-09 12:16 UTC
[dtrace-discuss] Trouble with speculations and running out of speculative buffers.
Ah, Pramod was right, for at least part of my problem. I just re-read the chapter on speculative tracing, and it says that when a commit is done, the speculative buffer from the cpu that does the commit is immediately copied to the principle buffer, with the speculative buffer from other cpus copied sometime later. This greatly increases the chances of the the discard happening before the buffer is copied, but after the value of ts is examined. It still doesn''t explain the missing processes though. I worked around this problem by having a global variable on the clause with the discard. In the very first END clause, I change the global so the clause with the discard is thus always false and no more discard occur. This got rid of all the empty output. Are global variables coherent across CPUs? Brian Utterback wrote:> Actually, I thought that there might be some kind of race condition > like that, but that really can''t be it. See the time stamps? > Speculation 5 was set 9 seconds earlier, and speculation 6 was set 1.6 > seconds earlier. The point is that I am seeing a very small number > of syspoll calls and they tend to wait a relatively long time. I could > see getting that kind of race once in awhile at that rate, or even all > of the time if the rate of pollsys calls was very high, but that isn''t > the case. Also, there is the one process that I know is waiting in > syspoll that doesn''t show up here at all. > > Pramod Batni wrote: >> >> >>> 1 2 :END 5 9163 >>> ==================================================>>> >>> >>> 1 2 :END 6 1686 >>> ==================================================>>> >>> >>> >>> See the problem? In this case speculations 5 and 6 both have non-zero >>> ts array entries (or they would not have printed) but there is >>> nothing in the speculation buffer (otherwise there would have been the >>> first set of traces printed). Any ideas why this is happening? >>> >> Consider the following 2 probes firing on 2 different cpu''s at >> approximately the same time. >>> } \ >>> END \ >>> /ts[this->spec]/ \ >>> { \ >>> commit(this->spec); \ >>> this->spec++; \ >>> } >>> >>> >>> >>> syscall::pollsys:return >>> /self->spec/ >>> { >>> ts[self->spec] = 0; >>> discard(self->spec); >>> self->spec = 0; >>> } >>> >>> >> The interleaved execution of the predicate and the clause could lead to >> the kind of output you are observing. >> The first probe would find ts[this->spec] to be non-zero and before it >> can execute ''commit(this->spec)'' the second >> probe would fire and execute ''discard(self->spec)'' . The contents of the >> speculative buffer are discarded and the ''commit(this->spec)'' >> would become a nop. >> Such a sequence could explain the missing data for speculation 5 and 6. >> >> >> Pramod >-- blu There are two rules in life: Rule 1- Don''t tell people everything you know ---------------------------------------------------------------------- Brian Utterback - Solaris RPE, Sun Microsystems, Inc. Ph:877-259-7345, Em:brian.utterback-at-ess-you-enn-dot-kom
Brian Utterback
2008-Jun-09 14:15 UTC
[dtrace-discuss] Trouble with speculations and running out of speculative buffers.
Actually, some more testing and I found what I think is either a bug, or perhaps a mis-feature. The problem has nothing to do with a race between the commit and the discard. It has all to do with commit processing in an END clause. The missing processes always have their missing speculative data having been recorded on a cpu that different from the one executing the END clauses. When I do a commit of a speculation in an END clause, I never, ever, see any of the speculative data that was recorded on any other cpu. From this fact and from what I said in my last message about when speculative buffers from other cpus get copied into the principle buffer, I will speculate (heh) that the root cause is that after the END clauses finish, dtrace exits and thus never gives itself a chance to do the copying. According to the docs, it does this copying at a rate set by the "cleanrate". My guess is that dtrace exits and does no more copying after the last END clause finishes. I tried putting in one last END clause with a chill in it that was longer than the cleanrate, put that didn''t help. I suspect that the chill happens in the kernel context and the other stuff is in the dtrace context, so the chill always executes first, even though it is positionally last. In fact, it had the interesting side effect of stopping any data from being printed out at all in my END clauses. I suspect that the de-allocation of some the variables had already occurred. Boy, the END processing can get interesting. Brian Utterback wrote:> Ah, Pramod was right, for at least part of my problem. I just re-read > the chapter on speculative tracing, and it says that when a commit is > done, the speculative buffer from the cpu that does the commit is > immediately copied to the principle buffer, with the speculative > buffer from other cpus copied sometime later. This greatly increases > the chances of the the discard happening before the buffer is copied, > but after the value of ts is examined. It still doesn''t explain the > missing processes though. > > I worked around this problem by having a global variable on the clause > with the discard. In the very first END clause, I change the global so > the clause with the discard is thus always false and no more discard > occur. This got rid of all the empty output. Are global variables > coherent across CPUs? > > Brian Utterback wrote: >> Actually, I thought that there might be some kind of race condition >> like that, but that really can''t be it. See the time stamps? >> Speculation 5 was set 9 seconds earlier, and speculation 6 was set 1.6 >> seconds earlier. The point is that I am seeing a very small number >> of syspoll calls and they tend to wait a relatively long time. I could >> see getting that kind of race once in awhile at that rate, or even all >> of the time if the rate of pollsys calls was very high, but that isn''t >> the case. Also, there is the one process that I know is waiting in >> syspoll that doesn''t show up here at all. >> >> Pramod Batni wrote: >>> >>>> 1 2 :END 5 9163 >>>> ==================================================>>>> >>>> >>>> 1 2 :END 6 1686 >>>> ==================================================>>>> >>>> >>>> >>>> See the problem? In this case speculations 5 and 6 both have non-zero >>>> ts array entries (or they would not have printed) but there is >>>> nothing in the speculation buffer (otherwise there would have been the >>>> first set of traces printed). Any ideas why this is happening? >>>> >>> Consider the following 2 probes firing on 2 different cpu''s at >>> approximately the same time. >>>> } \ >>>> END \ >>>> /ts[this->spec]/ \ >>>> { \ >>>> commit(this->spec); \ >>>> this->spec++; \ >>>> } >>>> >>>> >>>> >>>> syscall::pollsys:return >>>> /self->spec/ >>>> { >>>> ts[self->spec] = 0; >>>> discard(self->spec); >>>> self->spec = 0; >>>> } >>>> >>>> >>> The interleaved execution of the predicate and the clause could lead to >>> the kind of output you are observing. >>> The first probe would find ts[this->spec] to be non-zero and before it >>> can execute ''commit(this->spec)'' the second >>> probe would fire and execute ''discard(self->spec)'' . The contents of the >>> speculative buffer are discarded and the ''commit(this->spec)'' >>> would become a nop. >>> Such a sequence could explain the missing data for speculation 5 and 6. >>> >>> >>> Pramod >-- blu There are two rules in life: Rule 1- Don''t tell people everything you know ---------------------------------------------------------------------- Brian Utterback - Solaris RPE, Sun Microsystems, Inc. Ph:877-259-7345, Em:brian.utterback-at-ess-you-enn-dot-kom
Pramod Batni
2008-Jun-09 14:16 UTC
[dtrace-discuss] Trouble with speculations and running out of speculative buffers.
Brian Utterback wrote:> Ah, Pramod was right, for at least part of my problem. I just re-read > the chapter on speculative tracing, and it says that when a commit is > done, the speculative buffer from the cpu that does the commit is > immediately copied to the principle buffer, with the speculative > buffer from other cpus copied sometime later. This greatly increases > the chances of the the discard happening before the buffer is copied, > but after the value of ts is examined. It still doesn''t explain the > missing processes though.The dtrace provider chapter mentions (http://wikis.sun.com/display/DTrace/dtrace+Provider) : ''The END probe fires after all other probes. This probe will not fire until all other probe clauses have completed'' As per the above assertion the race [as mentioned in the previous mail] between the commit and discard cannot happen because in the script the commit is done when the END probe fires and the END probe fires when all the other probe clauses (discard clause) have completed. Pramod> > I worked around this problem by having a global variable on the clause > with the discard. In the very first END clause, I change the global so > the clause with the discard is thus always false and no more discard > occur. This got rid of all the empty output. Are global variables > coherent across CPUs? > > Brian Utterback wrote: >> Actually, I thought that there might be some kind of race condition >> like that, but that really can''t be it. See the time stamps? >> Speculation 5 was set 9 seconds earlier, and speculation 6 was set >> 1.6 seconds earlier. The point is that I am seeing a very small >> number of syspoll calls and they tend to wait a relatively long time. >> I could see getting that kind of race once in awhile at that rate, or >> even all of the time if the rate of pollsys calls was very high, but >> that isn''t the case. Also, there is the one process that I know is >> waiting in syspoll that doesn''t show up here at all. >> >> Pramod Batni wrote: >>> >>> >>>> 1 2 :END 5 9163 >>>> ==================================================>>>> >>>> >>>> 1 2 :END 6 1686 >>>> ==================================================>>>> >>>> >>>> >>>> See the problem? In this case speculations 5 and 6 both have >>>> non-zero ts array entries (or they would not have printed) but >>>> there is nothing in the speculation buffer (otherwise there would >>>> have been the first set of traces printed). Any ideas why this is >>>> happening? >>>> >>> Consider the following 2 probes firing on 2 different cpu''s at >>> approximately the same time. >>>> } \ >>>> END \ >>>> /ts[this->spec]/ \ >>>> { \ >>>> commit(this->spec); \ >>>> this->spec++; \ >>>> } >>>> >>>> >>>> >>>> syscall::pollsys:return >>>> /self->spec/ >>>> { >>>> ts[self->spec] = 0; >>>> discard(self->spec); >>>> self->spec = 0; >>>> } >>>> >>>> >>> The interleaved execution of the predicate and the clause could lead >>> to the kind of output you are observing. >>> The first probe would find ts[this->spec] to be non-zero and before >>> it can execute ''commit(this->spec)'' the second >>> probe would fire and execute ''discard(self->spec)'' . The contents of >>> the speculative buffer are discarded and the ''commit(this->spec)'' >>> would become a nop. >>> Such a sequence could explain the missing data for speculation 5 and 6. >>> >>> >>> Pramod >> >