Chad Mynhier
2006-Aug-08 12:51 UTC
[dtrace-discuss] DTracing reads/writes from a socketpair (trying to debug a race condition)
I''m trying to trace data being written to/read from the two file descriptors being returned from socketpair(), but I don''t seem to be doing what I want. (It''s also feasible that I''m trying to track down the underlying problem in the wrong way. I go into detail below for anyone who cares to read that far.) I''m trying to track things based on the address of the file_t associated with the file descriptor. In so_socketpair(), there''s a setf() call that I use to grab that address, and then I need a little bit of speculation to find the write()/read() associated with that file_t (committing after the getf() that returns a file_t that I can compare to.) Is there a different approach I should take? If I''m taking a reasonable approach, what am I doing wrong in the following code? fbt:sockfs:so_socketpair:entry / execname == "sshd" / { printf("sshd (%d) called fbt so_socketpair\n", pid); self->trace_pipe = 1; self->first_fp = 0; } fbt::setf:entry / self->trace_pipe / { printf("fd == %d fp == 0x%x\n",arg0,arg1); self->fp = arg1; } fbt:sockfs:so_socketpair:return / self->trace_pipe / { self->trace_pipe = 0; } syscall::write:entry, syscall::read:entry { self->spec = speculation(); speculate(self->spec); printf("%s %d bytes to fd %d\n",probefunc,arg2,arg0); } fbt::getf:return / self->spec && arg1 == self->fp / { commit(self->spec); self->spec = 0; } fbt::getf:return / self->spec && arg1 != self->fp / { discard(self->spec); self->spec = 0; } syscall::write:entry / execname == "cat" && arg2 == 415159 / { printf("CAT: Writing %d bytes to fd %d\n",arg2,arg0); self->trace_cat = 1; } syscall::write:return / self->trace_cat / { printf("CAT: Wrote %d bytes\n",arg0); self->trace_cat = 0; } WRT the underlying problem, I''m trying to track down a race condition I''m seeing with OpenSSH. The problem can be summarized in the following: server1> wc -l /var/tmp/bigfile 6000 /var/tmp/bigfile server1> server2> ssh -n server1 "/bin/sh -c ''cat /var/tmp/bigfile''" | wc -l 5718 server2> I see this behavior 100% of the time when server1 is an otherwise-idle x2100, about 20% of the time with an otherwise-idle v440 (the other 80% of the time, all of the data is transferred correctly), about 25% of the time when there''s 50% CPU utilization on the x2100 and never when there''s 100% CPU utilization. I never see this on slower servers. I also never see this if I stick ''&& sleep 1'' after the cat command, or even the smallest microsleep I can effectively get (~10,000 us.) Thus my assumption that this is a race condition somewhere. Some data points: - I only see the problem on Solaris 10 and not on Solaris 8, although this may simply be an artifact of server speeds. - I see this with OpenSSH versions 4.3.p2 and 3.7.1.p1. - OpenSSH sets O_NONBLOCK for the file descriptor it''s reading from. - During failure, the amount of data transferred falls on 64KB boundaries. (OpenSSH is reading from the child process with a 16KB buffer.) - The write() performed by cat returns the correct file size (415159.) The sum of the return values of the read()s performed by sshd is incorrect (393216.) - The final read() returns 0, not -1 with EAGAIN set. (A possibility because of the O_NONBLOCK, but OpenSSH handles this case correctly.) In tracing the reads/writes from the two ends of the socketpair, I''m attempting to determine where 415159 bytes gets truncated to 393216. If anyone has better suggestions for tracing this problem, I''d appreciate hearing them. I''ve done some bug database searching, but I don''t see anything that seems to fit this problem. I''ve also tried to replicate the problem in a small program, but I''m unable to do so. Chad Mynhier
Casper.Dik at Sun.COM
2006-Aug-08 13:02 UTC
[dtrace-discuss] DTracing reads/writes from a socketpair (trying to debug a race condition)
>server1> wc -l /var/tmp/bigfile > 6000 /var/tmp/bigfile >server1> > >server2> ssh -n server1 "/bin/sh -c ''cat /var/tmp/bigfile''" | wc -l > 5718 >server2>This looks very much like 6227895 Unix domain sockets still losing data on last close, affects sshd (Truss and just counting bytes send/received helped me find a problem like this)> >- I only see the problem on Solaris 10 and not on Solaris 8, although >this may simply be an artifact of server speeds.Which Solaris 10 release? Casper
Chad Mynhier
2006-Aug-08 13:25 UTC
[dtrace-discuss] DTracing reads/writes from a socketpair (trying to debug a race condition)
On 8/8/06, Casper.Dik at sun.com <Casper.Dik at sun.com> wrote:> >server1> wc -l /var/tmp/bigfile > > 6000 /var/tmp/bigfile > >server1> > > > >server2> ssh -n server1 "/bin/sh -c ''cat /var/tmp/bigfile''" | wc -l > > 5718 > >server2> > > > This looks very much like > > 6227895 Unix domain sockets still losing data on last close, affects sshd > > (Truss and just counting bytes send/received helped me find a problem > like this)Yes, this appears to be the same problem.> >- I only see the problem on Solaris 10 and not on Solaris 8, although > >this may simply be an artifact of server speeds. > > Which Solaris 10 release?Solaris 10 03/05, so the problem has been fixed since. Thanks, Chad Mynhier