Chad Mynhier
2006-Aug-08 12:51 UTC
[dtrace-discuss] DTracing reads/writes from a socketpair (trying to debug a race condition)
I''m trying to trace data being written to/read from the two file
descriptors being returned from socketpair(), but I don''t seem to be
doing what I want. (It''s also feasible that I''m trying to
track down
the underlying problem in the wrong way. I go into detail below for
anyone who cares to read that far.)
I''m trying to track things based on the address of the file_t
associated with the file descriptor. In so_socketpair(), there''s a
setf() call that I use to grab that address, and then I need a little
bit of speculation to find the write()/read() associated with that
file_t (committing after the getf() that returns a file_t that I can
compare to.)
Is there a different approach I should take? If I''m taking a
reasonable approach, what am I doing wrong in the following code?
fbt:sockfs:so_socketpair:entry
/ execname == "sshd" /
{
printf("sshd (%d) called fbt so_socketpair\n", pid);
self->trace_pipe = 1;
self->first_fp = 0;
}
fbt::setf:entry
/ self->trace_pipe /
{
printf("fd == %d fp == 0x%x\n",arg0,arg1);
self->fp = arg1;
}
fbt:sockfs:so_socketpair:return
/ self->trace_pipe /
{
self->trace_pipe = 0;
}
syscall::write:entry,
syscall::read:entry
{
self->spec = speculation();
speculate(self->spec);
printf("%s %d bytes to fd %d\n",probefunc,arg2,arg0);
}
fbt::getf:return
/ self->spec && arg1 == self->fp /
{
commit(self->spec);
self->spec = 0;
}
fbt::getf:return
/ self->spec && arg1 != self->fp /
{
discard(self->spec);
self->spec = 0;
}
syscall::write:entry
/ execname == "cat" && arg2 == 415159 /
{
printf("CAT: Writing %d bytes to fd %d\n",arg2,arg0);
self->trace_cat = 1;
}
syscall::write:return
/ self->trace_cat /
{
printf("CAT: Wrote %d bytes\n",arg0);
self->trace_cat = 0;
}
WRT the underlying problem, I''m trying to track down a race condition
I''m seeing with OpenSSH. The problem can be summarized in the
following:
server1> wc -l /var/tmp/bigfile
6000 /var/tmp/bigfile
server1>
server2> ssh -n server1 "/bin/sh -c ''cat
/var/tmp/bigfile''" | wc -l
5718
server2>
I see this behavior 100% of the time when server1 is an otherwise-idle
x2100, about 20% of the time with an otherwise-idle v440 (the other
80% of the time, all of the data is transferred correctly), about 25%
of the time when there''s 50% CPU utilization on the x2100 and never
when there''s 100% CPU utilization. I never see this on slower
servers. I also never see this if I stick ''&& sleep
1'' after the cat
command, or even the smallest microsleep I can effectively get
(~10,000 us.) Thus my assumption that this is a race condition
somewhere.
Some data points:
- I only see the problem on Solaris 10 and not on Solaris 8, although
this may simply be an artifact of server speeds.
- I see this with OpenSSH versions 4.3.p2 and 3.7.1.p1.
- OpenSSH sets O_NONBLOCK for the file descriptor it''s reading from.
- During failure, the amount of data transferred falls on 64KB
boundaries. (OpenSSH is reading from the child process with a 16KB
buffer.)
- The write() performed by cat returns the correct file size (415159.)
The sum of the return values of the read()s performed by sshd is
incorrect (393216.)
- The final read() returns 0, not -1 with EAGAIN set. (A possibility
because of the O_NONBLOCK, but OpenSSH handles this case correctly.)
In tracing the reads/writes from the two ends of the socketpair, I''m
attempting to determine where 415159 bytes gets truncated to 393216.
If anyone has better suggestions for tracing this problem, I''d
appreciate hearing them.
I''ve done some bug database searching, but I don''t see
anything that
seems to fit this problem. I''ve also tried to replicate the problem
in a small program, but I''m unable to do so.
Chad Mynhier
Casper.Dik at Sun.COM
2006-Aug-08 13:02 UTC
[dtrace-discuss] DTracing reads/writes from a socketpair (trying to debug a race condition)
>server1> wc -l /var/tmp/bigfile > 6000 /var/tmp/bigfile >server1> > >server2> ssh -n server1 "/bin/sh -c ''cat /var/tmp/bigfile''" | wc -l > 5718 >server2>This looks very much like 6227895 Unix domain sockets still losing data on last close, affects sshd (Truss and just counting bytes send/received helped me find a problem like this)> >- I only see the problem on Solaris 10 and not on Solaris 8, although >this may simply be an artifact of server speeds.Which Solaris 10 release? Casper
Chad Mynhier
2006-Aug-08 13:25 UTC
[dtrace-discuss] DTracing reads/writes from a socketpair (trying to debug a race condition)
On 8/8/06, Casper.Dik at sun.com <Casper.Dik at sun.com> wrote:> >server1> wc -l /var/tmp/bigfile > > 6000 /var/tmp/bigfile > >server1> > > > >server2> ssh -n server1 "/bin/sh -c ''cat /var/tmp/bigfile''" | wc -l > > 5718 > >server2> > > > This looks very much like > > 6227895 Unix domain sockets still losing data on last close, affects sshd > > (Truss and just counting bytes send/received helped me find a problem > like this)Yes, this appears to be the same problem.> >- I only see the problem on Solaris 10 and not on Solaris 8, although > >this may simply be an artifact of server speeds. > > Which Solaris 10 release?Solaris 10 03/05, so the problem has been fixed since. Thanks, Chad Mynhier