thr3ads.net - dtrace discuss - [dtrace-discuss] Context of tcp_connect

Wee Yeh Tan

2005-Nov-23 08:29 UTC

[dtrace-discuss] Context of tcp_connect_finish & squeue

Hi,

I'm trying to locate a function that I can use to tie an incoming tcp
packet to the thread that finally accepts the connection.

I noted that Brendan's tcpsnoop uses tcp_accept_finish() but that is
not always correct given that tcp_accept_finish() is called by
squeue_enter(), which may fire under a different thread.

Is there an easy solution or do I really have to work through the
squeue stuff for this (I'm sure it'll be rewarding though)?


--
Just me,
Wire ...

Brendan Gregg

2005-Nov-23 11:55 UTC

[dtrace-discuss] Context of tcp_connect_finish & squeue

G'Day Folks,

On Wed, 23 Nov 2005, Wee Yeh Tan wrote:
> Hi,
>
> I'm trying to locate a function that I can use to tie an incoming tcp
> packet to the thread that finally accepts the connection.
>
> I noted that Brendan's tcpsnoop uses tcp_accept_finish() but that is
> not always correct given that tcp_accept_finish() is called by
> squeue_enter(), which may fire under a different thread.

If any of my scripts are problematic, please post samples that demonstrate
it, including instructions on how I can recreate it, or just email me.

As an example, the following tests tcp_accept_finish() on a single-CPU
system under varying load, with a client system connecting 10000 times
using finger (via inetd):

   # dtrace -n 'fbt::tcp_accept_finish:entry { @[execname] = count(); }'
   dtrace: description 'fbt::tcp_accept_finish:entry ' matched 1 probe
   ^C
     inetd                                                         10000

That's 100% correct. Now a 4-CPU server (which I rarely have access to).
This time 1000 times (10000 took too long!),

   # dtrace -n 'fbt::tcp_accept_finish:entry { @[execname] = count(); }'
   dtrace: description 'fbt::tcp_accept_finish:entry ' matched 1 probe
   ^C
     inetd                                                          1000

Great. Now the same test but different server load (now mostly idle
rather than busy),

   # dtrace -n 'fbt::tcp_accept_finish:entry { @[execname] = count(); }'
   dtrace: description 'fbt::tcp_accept_finish:entry ' matched 1 probe
   ^C
     sched                                                             2
     inetd                                                           998

Bugger. I'm not at all happy with 99.8% correct for some scenarios. Looks
like I'll need to update all my tcp tools.

Brendan

Wee Yeh Tan

2005-Nov-24 01:25 UTC

[dtrace-discuss] Context of tcp_connect_finish & squeue

Hi Brendan,

On 11/23/05, Brendan Gregg <brendan.gregg at tpg.com.au> wrote:
> If any of my scripts are problematic, please post samples that demonstrate
> it, including instructions on how I can recreate it, or just email me.

Sure :).

It didn't cross my mind that your script was problematic, though.  I spent
half a day trying to weave through tcp.c (and many others) and was
looking at tcpsnoop for ideas on what function to look at before I
gave up and posted to the board.

>    # dtrace -n 'fbt::tcp_accept_finish:entry { @[execname] = count(); }'
>    dtrace: description 'fbt::tcp_accept_finish:entry ' matched 1 probe
>    ^C
>      sched                                                             2
>      inetd                                                           998

Technically, the situation should arise whenever a thread drains the
squeue and works a request that it does not own.

> Bugger. I'm not at all happy with 99.8% correct for some scenarios. Looks
> like I'll need to update all my tcp tools.

Please let me know if you see any solution.  I've been trying to do
this on and off but would love to get something going.
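
One rough way to see the squeue-drain case directly might be to aggregate on
the kernel stack as well as on execname, so that entries reached while some
other thread drains the squeue show a different call chain. This is an
untested sketch, not something taken from tcpsnoop; it only assumes that the
fbt provider exposes tcp_accept_finish, as in the one-liner quoted above.

```d
#!/usr/sbin/dtrace -s

/*
 * Sketch only: count tcp_accept_finish() entries by the process name
 * that is on-CPU and by the kernel stack that led there, so that calls
 * made while another thread drains the squeue stand out.
 */
fbt::tcp_accept_finish:entry
{
        @[execname, stack()] = count();
}
```

The aggregation is printed when the script is interrupted with ^C; stacks
reported under sched would be the ones worth reading first.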


--
Just me,
Wire ...

Kais Belgaied

2005-Nov-28 19:43 UTC

[dtrace-discuss] Context of tcp_connect_finish & squeue

Wee Yeh Tan wrote On 11/23/05 00:29,:
> Hi,
>
> I'm trying to locate a function that I can use to tie an incoming tcp
> packet to the thread that finally accepts the connection.
>
> I noted that Brendan's tcpsnoop uses tcp_accept_finish() but that is
> not always correct given that tcp_accept_finish() is called by
> squeue_enter(), which may fire under a different thread.

You gotta catch it at the socket level, where the context of execution
is the acceptor thread's. It happens upstream from tcp_accept_finish().

The acceptor thread calls the socket function sotpi_accept(), which
in turn calls sowaitconnind() and blocks until a connection indication
arrives. sotpi_accept() then produces a TPI M_PROTO message that carries the
T_conn_res (the socket's response to the incoming TCP connection). It sends
that stream message downstream to ip (the TCP/IP module). There is a
putnext() call in the sending, thus the squeue_enter().
TCP handles that message via tcp_accept_finish().

To use a variant of Brendan's 1-liner, you may try catching the returns
from sowaitconnind():

# dtrace -n 'fbt:sockfs:sowaitconnind:return { trace(execname); }'
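
The call chain described above could also be checked with a small script
that marks a thread while it is inside sotpi_accept() and classifies each
tcp_accept_finish() call by whether it runs on such a thread. This is an
untested sketch; the variable name in_accept and the two labels are made up
for illustration, and only the probe names come from the discussion.

```d
#!/usr/sbin/dtrace -s

#pragma D option quiet

/* Mark the current thread while it is inside sotpi_accept(). */
fbt:sockfs:sotpi_accept:entry
{
        self->in_accept = 1;
}

/*
 * Classify each tcp_accept_finish() call by whether the thread running
 * it is currently inside sotpi_accept(). The thread-local flag is 0 on
 * any thread that never entered sotpi_accept().
 */
fbt::tcp_accept_finish:entry
{
        @[self->in_accept ? "acceptor thread" : "other thread"] = count();
}

/* Clear the mark when the thread leaves sotpi_accept(). */
fbt:sockfs:sotpi_accept:return
{
        self->in_accept = 0;
}
```

One caveat: a thread that is inside sotpi_accept() for one connection could
in principle drain the squeue and run tcp_accept_finish() for another, and
this sketch would still count that as "acceptor thread".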

	Kais.
> 
> Is there an easy solution or do I really have to work through the
> squeue stuff for this (I'm sure it'll be rewarding though).
> 
> 
> --
> Just me,
> Wire ...

Wee Yeh Tan

2005-Dec-02 10:10 UTC

[dtrace-discuss] Context of tcp_connect_finish & squeue

Kais,

Thanks...

I'm gonna need to work on this a little more.  I wasn't able to extract
anything from the sonode *.  Will look at this over the weekend.


--
Just me,
Wire ...

On 11/29/05, Kais Belgaied <Kais.Belgaied at sun.com> wrote:
> You gotta catch it at the socket level, where the context of execution
> is the acceptor thread's. It happens upstream from tcp_accept_finish().
>
> The acceptor thread calls the socket function sotpi_accept(), which
> in turn calls sowaitconnind() and blocks until a connection indication
> arrives. sotpi_accept() then produces a TPI M_PROTO message that carries the
> T_conn_res (the socket's response to the incoming TCP connection). It sends
> that stream message downstream to ip (the TCP/IP module). There is a
> putnext() call in the sending, thus the squeue_enter().
> TCP handles that message via tcp_accept_finish().
>
> To use a variant of Brendan's 1-liner, you may try catching the returns
> from sowaitconnind():
>
> # dtrace -n 'fbt:sockfs:sowaitconnind:return { trace(execname); }'
>
>         Kais.

Brendan Gregg

2005-Dec-03 14:42 UTC

[dtrace-discuss] Context of tcp_connect_finish & squeue

G'Day Folks,

On Mon, 28 Nov 2005, Kais Belgaied wrote:
> Wee Yeh Tan wrote On 11/23/05 00:29,:
> > Hi,
> >
> > I'm trying to locate a function that I can use to tie an incoming tcp
> > packet to the thread that finally accepts the connection.

So to recap the problem:

* We have tcp events, and we'd like to know thread/execname/pid details.
* The tcp events can be keyed by an (int)(conn_t *).
* If we can build hashes of the form,
	tname[(conn_t *)] = execname
	tpid[(conn_t *)] = pid
	...
  then we can easily look up thread details for tcp events.
  (I deliberately didn't use a hash of structs, btw.)

I tried,

   fbt:ip:tcp_accept_finish:entry
   {
           self->connp = (conn_t *)arg0;
           tname[(int)self->connp] = execname;
           tpid[(int)self->connp] = pid;
           tuid[(int)self->connp] = uid;
   }

Which we now know doesn't always work on multi-CPU servers. And being
slightly wrong isn't good enough (unless clearly marked as an estimation
tool).

A number of people in the past have suggested I look at socket events, for
which there are a few problems I won't go into yet. The main one was no
obvious path from the socket events (struct sonode *) to anything from
the tcp events, either (conn_t *) or (tcp_t *) (which are linked to each
other). I never figured this one out (maybe I should mention that I began
writing these tools without access to the kernel code (the safe public
access we enjoy today!)).

> > I noted that Brendan's tcpsnoop uses tcp_accept_finish() but that is
> > not always correct given that tcp_accept_finish() is called by
> > squeue_enter(), which may fire under a different thread.
>
> You gotta catch it at the socket level, where the context of execution
> is the acceptor thread's. It happens upstream from tcp_accept_finish().
>
> The acceptor thread calls the socket function sotpi_accept(), which
> in turn calls sowaitconnind() and blocks until a connection indication
> arrives. sotpi_accept() then produces a TPI M_PROTO message that carries the
> T_conn_res (the socket's response to the incoming TCP connection). It sends
> that stream message downstream to ip (the TCP/IP module). There is a
> putnext() call in the sending, thus the squeue_enter().
> TCP handles that message via tcp_accept_finish().
>
> To use a variant of Brendan's 1-liner, you may try catching the returns
> from sowaitconnind():
>
> # dtrace -n 'fbt:sockfs:sowaitconnind:return { trace(execname); }'

Ahh, thank you Kais - this gives me a sensible reference point where I can
trust execname/etc, and you have explained the process well. (At one point
I tried tracing putnext()s, but I've steered away from them as they can
occur rapidly.)

sowaitconnind looks suitable in testing (100% correct so far), however at
that point the new socket and its details aren't populated.

If I wait until sotpi_accept:return, then I can fetch the details we need to
find the (conn_t *). In testing, sotpi_accept:return is also 100% correct;
I just need to think carefully about what could happen between the
sowaitconnind:return and the sotpi_accept:return to cause the acceptor
thread to be incorrect again (hopefully nothing).


Testing went like this,

--------socktest.d--------
#!/usr/sbin/dtrace -s

#pragma D option quiet

fbt::tcp_accept_finish:entry    { @tcp[execname] = count(); }
fbt:sockfs:sowaitconnind:return { @sowait[execname] = count(); }
fbt:sockfs:sotpi_accept:return  { @soaccept[execname] = count(); }

dtrace:::END
{
        printf("tcp:\n"); printa(@tcp);
        printf("sowait:\n"); printa(@sowait);
        printf("soaccept:\n"); printa(@soaccept);
}
--------socktest.d--------

Then after 10000 connections on a loaded multi-CPU server,

   # ./socktest.d
   ^C
   tcp:

     sched                                                             6
     inetd                                                          9994
   sowait:

     inetd                                                         10000
   soaccept:

     inetd                                                         10000

Cool. tcp_accept_finish is wrong (as we now expect), and both
sowaitconnind:return and sotpi_accept:return are 100% correct.


So, back to what I WAS using,

   fbt:ip:tcp_accept_finish:entry
   {
           self->connp = (conn_t *)arg0;
           tname[(int)self->connp] = execname;
           tpid[(int)self->connp] = pid;
           tuid[(int)self->connp] = uid;
   }

And now the NEW code (that I'm still testing),

----------------
fbt:sockfs:sotpi_accept:entry
{
        self->sop = args[0];
}

fbt:sockfs:sotpi_create:return
/self->sop/
{
        self->nsop = (struct sonode *)arg1;
}

fbt:sockfs:sotpi_accept:return
/self->nsop/
{
        this->tcpp = (tcp_t *)self->nsop->so_priv;
        this->connp = (conn_t *)this->tcpp->tcp_connp;
        tname[(int)this->connp] = execname;
        tpid[(int)this->connp] = pid;
        tuid[(int)this->connp] = uid;
}

fbt:sockfs:sotpi_accept:return
{
        self->nsop = 0;
        self->sop = 0;
}
----------------

... the key is tracking the new sonodes, knowing that so_priv will
contain the (tcp_t *), and knowing when to read it (sotpi_accept:return).
I'll add some code so that it skips localhost connections, which I'm
usually not interested in (maybe my tools should have a switch for that).
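
For illustration, here is a sketch of how the tname/tpid/tuid tables might
be consumed from a tcp event once they are populated this way. The first
four clauses are the ones shown above; the last clause is hypothetical. It
assumes that tcp_rput_data() in the ip module receives the (conn_t *) as
arg0, which is not established in this thread and should be checked against
the kernel source before relying on it.

```d
#!/usr/sbin/dtrace -s

#pragma D option quiet

/* Populate the lookup tables from the acceptor thread's context. */
fbt:sockfs:sotpi_accept:entry
{
        self->sop = args[0];
}

fbt:sockfs:sotpi_create:return
/self->sop/
{
        self->nsop = (struct sonode *)arg1;
}

fbt:sockfs:sotpi_accept:return
/self->nsop/
{
        this->tcpp = (tcp_t *)self->nsop->so_priv;
        this->connp = (conn_t *)this->tcpp->tcp_connp;
        tname[(int)this->connp] = execname;
        tpid[(int)this->connp] = pid;
        tuid[(int)this->connp] = uid;
}

fbt:sockfs:sotpi_accept:return
{
        self->nsop = 0;
        self->sop = 0;
}

/*
 * ASSUMPTION (not from the thread): tcp_rput_data() is called with the
 * conn_t pointer as arg0. Look up the owner recorded at accept time and
 * print uid, pid and execname for connections we have seen accepted.
 */
fbt:ip:tcp_rput_data:entry
/tpid[(int)arg0]/
{
        printf("%5d %6d %s\n", tuid[(int)arg0], tpid[(int)arg0],
            tname[(int)arg0]);
}
```

Because the predicate tests tpid, a connection accepted by pid 0 would be
skipped; entries are also never removed here, so a real tool would need to
clear them when the connection closes.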

cheers,

Brendan
