Brendan Gregg - Sun Microsystems
2006-Oct-05 18:33 UTC
[dtrace-discuss] DTrace TCP Provider
G''Day Folks,
Yesterday at the CEC 2006 conference, Bryan and I gave a demonstration
of a prototype DTrace tcp provider, which would be one of a suggested
collection of DTrace network providers. I''ve just put together a
website
to list the demonstrations performed,
http://www.opensolaris.org/os/community/dtrace/NetworkProvider/CEC2006/
Below are a couple of examples from the website. tcpio2.d traces TCP
traffic and provides IP address, TCP port, TCP payload size and
TCP flag information:
# ./tcpio2.d
LADDR:PORT RADDR:PORT BYTES FLAGS
192.168.1.185:79 <- 192.168.1.109:56145 0 (SYN)
192.168.1.185:79 -> 192.168.1.109:56145 0 (SYN|ACK)
192.168.1.185:79 <- 192.168.1.109:56145 0 (ACK)
192.168.1.185:79 <- 192.168.1.109:56145 2 (PUSH|ACK)
192.168.1.185:79 -> 192.168.1.109:56145 0 (ACK)
192.168.1.185:79 -> 192.168.1.109:56145 126 (PUSH|ACK)
192.168.1.185:79 <- 192.168.1.109:56145 0 (ACK)
192.168.1.185:79 -> 192.168.1.109:56145 0 (FIN|ACK)
192.168.1.185:79 <- 192.168.1.109:56145 0 (ACK)
192.168.1.185:79 <- 192.168.1.109:56145 0 (FIN|ACK)
192.168.1.185:79 -> 192.168.1.109:56145 0 (ACK)
^C
And tcpaccept2.d aggregates inbound connections by hostname and
service name,
# ./tcpaccept2.d
dtrace: script ''./tcpaccept2.d'' matched 1 probe
Tracing... Hit Ctrl-C to end.
^C
HOSTNAME PORT COUNT
marlin6 finger 1
bass login 2
marlin ssh 2
marlin telnet 2
marlin finger 3
The source to these scripts and numerous other demonstrations are on
the website.
We began writing this prototype tcp provider last monday, based on the
design and feedback posted on dtrace-discuss during the last few weeks.
This prototype has highlighted to us possible maintainability issues,
which has led to several probes being dropped, and many arguments.
We want to make this prototype widely available soon, which will probably
be done as patches to the latest OpenSolaris build.
cheers,
Brendan
--
Brendan
[CA, USA]
Brendan Gregg - Sun Microsystems wrote:>G''Day Folks, > >Yesterday at the CEC 2006 conference, Bryan and I gave a demonstration >of a prototype DTrace tcp provider, which would be one of a suggested >collection of DTrace network providers. I''ve just put together a website >to list the demonstrations performed, > >http://www.opensolaris.org/os/community/dtrace/NetworkProvider/CEC2006/ > >Below are a couple of examples from the website. tcpio2.d traces TCP >traffic and provides IP address, TCP port, TCP payload size and >TCP flag information: > > # ./tcpio2.d > LADDR:PORT RADDR:PORT BYTES FLAGS > 192.168.1.185:79 <- 192.168.1.109:56145 0 (SYN) > 192.168.1.185:79 -> 192.168.1.109:56145 0 (SYN|ACK) > 192.168.1.185:79 <- 192.168.1.109:56145 0 (ACK) > 192.168.1.185:79 <- 192.168.1.109:56145 2 (PUSH|ACK) > 192.168.1.185:79 -> 192.168.1.109:56145 0 (ACK) > 192.168.1.185:79 -> 192.168.1.109:56145 126 (PUSH|ACK) > 192.168.1.185:79 <- 192.168.1.109:56145 0 (ACK) > 192.168.1.185:79 -> 192.168.1.109:56145 0 (FIN|ACK) > 192.168.1.185:79 <- 192.168.1.109:56145 0 (ACK) > 192.168.1.185:79 <- 192.168.1.109:56145 0 (FIN|ACK) > 192.168.1.185:79 -> 192.168.1.109:56145 0 (ACK) >Is this, by itself, a worthwhile goal of dtrace''s TCP provider? Why would I care to use dtrace for this when snoop/tcpdump does it? Or to put it differently, I''d never use dtrace for this task and I''d never suggest/recommend anyone use it for this task either. In my mind it is (and will always be) the wrong tool for this particular job. Show me something I can''t do with tcpdump/snoop...examining loopback traffic on Solaris is interesting, but there''s a good chance that the IP observability project will deliver this before dtrace.>And tcpaccept2.d aggregates inbound connections by hostname and >service name, > > # ./tcpaccept2.d > dtrace: script ''./tcpaccept2.d'' matched 1 probe > Tracing... Hit Ctrl-C to end. > ^C > HOSTNAME PORT COUNT > marlin6 finger 1 > bass login 2 > marlin ssh 2 > marlin telnet 2 > marlin finger 3 >I find this a more interesting demonstration of dtrace and TCP. Can you tell me the average lifespan of each connection? And that per-host? What about for http, can you tell me the number of connections my web server gets for each host, on average how long each one takes (from accept to close) and the average number of bytes sent by TCP? Darren
Brendan Gregg - Sun Microsystems
2006-Oct-12 04:12 UTC
[dtrace-discuss] DTrace TCP Provider
G''Day Darren, On Tue, Oct 10, 2006 at 03:21:40PM -0700, Darren.Reed at sun.com wrote: [...]> > # ./tcpio2.d > > LADDR:PORT RADDR:PORT BYTES FLAGS > > 192.168.1.185:79 <- 192.168.1.109:56145 0 (SYN) > > 192.168.1.185:79 -> 192.168.1.109:56145 0 (SYN|ACK) > > 192.168.1.185:79 <- 192.168.1.109:56145 0 (ACK) > > 192.168.1.185:79 <- 192.168.1.109:56145 2 (PUSH|ACK) > > 192.168.1.185:79 -> 192.168.1.109:56145 0 (ACK) > > 192.168.1.185:79 -> 192.168.1.109:56145 126 (PUSH|ACK) > > 192.168.1.185:79 <- 192.168.1.109:56145 0 (ACK) > > 192.168.1.185:79 -> 192.168.1.109:56145 0 (FIN|ACK) > > 192.168.1.185:79 <- 192.168.1.109:56145 0 (ACK) > > 192.168.1.185:79 <- 192.168.1.109:56145 0 (FIN|ACK) > > 192.168.1.185:79 -> 192.168.1.109:56145 0 (ACK) > > > > Is this, by itself, a worthwhile goal of dtrace''s TCP provider? > Why would I care to use dtrace for this when snoop/tcpdump does it?Noone said it was a worthwhile goal. This output is from the CEC 2006 demonstration, and was used to demonstrate some of the capabilities of the provider. Which it does nicely. It demonstrates that one can acess IP and TCP information with the rich programming functionality and low performance costs of DTrace. No, it doesn''t show why the provider trumps snoop - so perhaps you are right, and I shouldn''t copy-n-paste it as demonstration #1.> Or to put it differently, I''d never use dtrace for this task and > I''d never suggest/recommend anyone use it for this task either. > In my mind it is (and will always be) the wrong tool for this > particular job.Such a snoop like output is supposed to trigger the imagination for what customisations and enhancements one can make with the DTrace programming language.> Show me something I can''t do with tcpdump/snoop...examining loopback > traffic on Solaris is interesting, but there''s a good chance that the > IP observability project will deliver this before dtrace.How about snoop without the performance cost, # prstat PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP 1835 nobody 2456K 1640K sleep 41 0 0:01:39 51% snoop/1 1838 root 2360K 944K run 53 0 0:00:20 11% rsh/1 1836 root 2108K 1868K sleep 59 0 0:00:09 4.9% dd/1 744 brendan 216M 194M run 59 0 2:33:39 1.2% Xorg/1 1875 brendan 142M 120M sleep 49 0 0:29:54 0.7% opera/2 That was a "snoop -rs64" on a busy single NIC host. Good thing it didn''t have busy apps - as application performance may well suffer with snoop wanting 51% CPU. Anyhow - I don''t want to make this about why snoop may be bad, it''s more about what we can do with the provider, what we can solve, and what new performance metrics we can create.> >And tcpaccept2.d aggregates inbound connections by hostname and > >service name, > > > > # ./tcpaccept2.d > > dtrace: script ''./tcpaccept2.d'' matched 1 probe > > Tracing... Hit Ctrl-C to end. > > ^C > > HOSTNAME PORT COUNT > > marlin6 finger 1 > > bass login 2 > > marlin ssh 2 > > marlin telnet 2 > > marlin finger 3 > > > > I find this a more interesting demonstration of dtrace and TCP. > > Can you tell me the average lifespan of each connection? > And that per-host?Great idea (and it rings a bell - customers have asked me this before) - I''ve added it to the website (actual screenshots), http://www.opensolaris.org/os/community/dtrace/NetworkProvider/#tcp18 # ./tcplifespan1.d ^C Inbound Connection Lifespan Report (ms), deimos http value ------------- Distribution ------------- count 1 | 0 2 |@@@@@@@@@@@@@@@@@@@@@@@ 7 4 |@@@@@@@@@@ 3 8 |@@@ 1 16 | 0 32 |@@@ 1 64 | 0 deimos finger value ------------- Distribution ------------- count 8 | 0 16 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 4 32 |@@@@@@@@ 1 64 | 0 deimos ssh value ------------- Distribution ------------- count 2048 | 0 4096 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 1 8192 | 0 I wrote varients which I haven''t put on the website yet, including, # ./tcplifespan2.d Tracing... Hit Ctrl-C to end. ^C Inbound Connection Lifespan Report, HOSTNAME PORT AVG(ms) MAX(ms) COUNT deimos http 3 4 3 deimos finger 40 56 3 deimos ssh 2238 2238 1> What about for http, can you tell me the number of connections > my web server gets for each host, on average how long each one > takes (from accept to close) and the average number of bytes > sent by TCP?Yes - I should add a bytes column to tcplifespan2.d. :) These are good questions - customers have asked me for such metrics before, and they are the exact type of questions that the provider will answer best. cheers, Brendan -- Brendan [CA, USA]
Brendan Gregg - Sun Microsystems
2006-Oct-12 17:59 UTC
[dtrace-discuss] DTrace TCP Provider
G''Day Darren, On Tue, Oct 10, 2006 at 03:21:40PM -0700, Darren.Reed at sun.com wrote: [...]> Show me something I can''t do with tcpdump/snoop...examining loopback > traffic on Solaris is interesting, but there''s a good chance that the > IP observability project will deliver this before dtrace.You mean Bug ID 4085089 will be completed soon? Sure, the IP observabality project should provide loopback snooping, and that will be useful in numerous ways. truss is useful in numerous ways; but that''s no reason why we shouldn''t have, in addition, a syscall provider. Here is where I''m at with loopback tracing, # ./tcpio3.d LADDR:PORT RADDR:PORT BYTES FLAGS 127.0.0.1:51309 -> 127.0.0.1:79 0 (SYN) 127.0.0.1:79 <- 127.0.0.1:51309 0 (SYN) 127.0.0.1:79 -> 127.0.0.1:51309 0 (SYN|ACK) 127.0.0.1:51309 <- 127.0.0.1:79 0 (SYN|ACK) 127.0.0.1:51309 -> 127.0.0.1:79 0 (ACK) 127.0.0.1:79 <- 127.0.0.1:51309 0 (ACK) 127.0.0.1:51309 -> 127.0.0.1:79 2 (loopback) 127.0.0.1:79 <- 127.0.0.1:51309 2 (loopback) 127.0.0.1:79 -> 127.0.0.1:51309 207 (loopback) 127.0.0.1:51309 <- 127.0.0.1:79 207 (loopback) 127.0.0.1:79 <- 127.0.0.1:51309 0 (loopback) 127.0.0.1:51309 <- 127.0.0.1:79 0 (loopback) 127.0.0.1:79 -> 127.0.0.1:51309 0 (FIN|ACK) 127.0.0.1:51309 <- 127.0.0.1:79 0 (FIN|ACK) 127.0.0.1:51309 -> 127.0.0.1:79 0 (ACK) 127.0.0.1:79 <- 127.0.0.1:51309 0 (ACK) 127.0.0.1:51309 -> 127.0.0.1:79 0 (FIN|ACK) 127.0.0.1:79 <- 127.0.0.1:51309 0 (FIN|ACK) 127.0.0.1:79 -> 127.0.0.1:51309 0 (ACK) 127.0.0.1:51309 <- 127.0.0.1:79 0 (ACK) ^C That was a trace of a localhost finger connection. Events appear duplicatied as we see both the send and receive. IPv6 works fine too. ftp is also interesting to check, as we can initiate some known sized data transfers (assuming no tcp retransmits), and check that the numbers match. Here I send a "contents" file to /tmp from both deimos and localhost, # ls -l /tmp/contents -rw-r--r-- 1 fred other 21380410 Oct 12 10:16 /tmp/contents # ./tcpbytes3.d Tracing... Hit Ctrl-C to end. ^C HOST PORT BYTES deimos ftp 536 deimos ssh 1568 localhost ftp-data 21380410 localhost 60228 21380410 deimos ftp-data 21380410 Both the ftp-data connections match the file size. The 60228 port is the client sending data, our output shows both send and receive totals. cheers, Brendan -- Brendan [CA, USA]
Sebastien Roy
2006-Oct-12 18:43 UTC
[networking-discuss] Re: [dtrace-discuss] DTrace TCP Provider
On Thu, 2006-10-12 at 10:59 -0700, Brendan Gregg - Sun Microsystems wrote:> You mean Bug ID 4085089 will be completed soon?Yes, and if you''d like a test-drive, you can download early-access development bits for this work from our Clearview project downloads page: http://www.opensolaris.org/os/project/clearview/downloads/ bfu, reboot, then type "snoop -d lo0". :-) -Seb
Dan McDonald
2006-Oct-12 19:11 UTC
[networking-discuss] Re: [dtrace-discuss] DTrace TCP Provider
On Thu, Oct 12, 2006 at 02:43:41PM -0400, Sebastien Roy wrote:> Yes, and if you''d like a test-drive, you can download early-access > development bits for this work from our Clearview project downloads > page: > > http://www.opensolaris.org/os/project/clearview/downloads/ > > bfu, reboot, then type "snoop -d lo0". :-)You forgot to mention that you can also snoop ip.tunN, even if IPsec is enabled. Not that I have any interest in tunnels... ;) Dan
Sebastien Roy
2006-Oct-12 19:16 UTC
[networking-discuss] Re: [dtrace-discuss] DTrace TCP Provider
On Thu, 2006-10-12 at 15:11 -0400, Dan McDonald wrote:> On Thu, Oct 12, 2006 at 02:43:41PM -0400, Sebastien Roy wrote: > > Yes, and if you''d like a test-drive, you can download early-access > > development bits for this work from our Clearview project downloads > > page: > > > > http://www.opensolaris.org/os/project/clearview/downloads/ > > > > bfu, reboot, then type "snoop -d lo0". :-) > > You forgot to mention that you can also snoop ip.tunN, even if IPsec is > enabled. Not that I have any interest in tunnels... ;)That''s because the early access bits above don''t yet have support for that, but it''s coming soon as development progresses. -Seb
Brendan Gregg - Sun Microsystems
2006-Oct-12 19:28 UTC
[networking-discuss] Re: [dtrace-discuss] DTrace TCP Provider
G''Day Sebastien, On Thu, Oct 12, 2006 at 02:43:41PM -0400, Sebastien Roy wrote:> On Thu, 2006-10-12 at 10:59 -0700, Brendan Gregg - Sun Microsystems > wrote: > > You mean Bug ID 4085089 will be completed soon? > > Yes, and if you''d like a test-drive, you can download early-access > development bits for this work from our Clearview project downloads > page: > > http://www.opensolaris.org/os/project/clearview/downloads/ > > bfu, reboot, then type "snoop -d lo0". :-)Thanks, I just tried it out. So, I know you said early-access, but when I run "snoop -d lo0" I get the following, # snoop -Sd lo0 Using device /dev/lo0 (promiscuous mode) localhost -> localhost length: 66 FTP-DATA R port=34166 localhost -> localhost length: 66 FTP-DATA C port=34166 localhost -> localhost length: 54 FTP-DATA R port=34166 localhost -> localhost length: 54 FTP-DATA C port=34166 localhost -> localhost length: 54 FTP-DATA R port=34166 localhost -> localhost length: 54 FTP-DATA R port=34166 localhost -> localhost length: 54 FTP-DATA C port=34166 ^C # That was an ftp-data transfer of the 20 Mbyte "contents" file, which I was demo''ing earlier. I don''t see the file data being transferred. DTrace can see all of loopback traffic, including fused data transfers, # ./tcpio3.d LADDR:PORT RADDR:PORT BYTES FLAGS 127.0.0.1:20 -> 127.0.0.1:63566 0 (SYN) 127.0.0.1:63566 <- 127.0.0.1:20 0 (SYN) 127.0.0.1:63566 -> 127.0.0.1:20 0 (SYN|ACK) 127.0.0.1:20 <- 127.0.0.1:63566 0 (SYN|ACK) 127.0.0.1:20 -> 127.0.0.1:63566 0 (ACK) 127.0.0.1:63566 <- 127.0.0.1:20 0 (ACK) 127.0.0.1:63566 -> 127.0.0.1:20 16384 (loopback) 127.0.0.1:63566 -> 127.0.0.1:20 16384 (loopback) 127.0.0.1:63566 -> 127.0.0.1:20 16384 (loopback) 127.0.0.1:20 <- 127.0.0.1:63566 49152 (loopback) 127.0.0.1:63566 -> 127.0.0.1:20 16384 (loopback) 127.0.0.1:63566 -> 127.0.0.1:20 16384 (loopback) 127.0.0.1:63566 -> 127.0.0.1:20 16384 (loopback) 127.0.0.1:20 <- 127.0.0.1:63566 49152 (loopback) 127.0.0.1:63566 -> 127.0.0.1:20 16384 (loopback) [...] and the byte sizes sum up correctly. Sure, watching the rate of loopback connections is itself useful, so I can see that Clearview can already solve many customer problems. Once it can see fused data traffic also (without actually switching fusion off), that would be great. :-) cheers, Brendan -- Brendan [CA, USA]
Sebastien Roy
2006-Oct-12 19:44 UTC
[networking-discuss] Re: [dtrace-discuss] DTrace TCP Provider
On Thu, 2006-10-12 at 12:28 -0700, Brendan Gregg - Sun Microsystems wrote:> Thanks, I just tried it out. So, I know you said early-access, but when > I run "snoop -d lo0" I get the following, > > # snoop -Sd lo0 > Using device /dev/lo0 (promiscuous mode) > localhost -> localhost length: 66 FTP-DATA R port=34166 > localhost -> localhost length: 66 FTP-DATA C port=34166 > localhost -> localhost length: 54 FTP-DATA R port=34166 > localhost -> localhost length: 54 FTP-DATA C port=34166 > localhost -> localhost length: 54 FTP-DATA R port=34166 > localhost -> localhost length: 54 FTP-DATA R port=34166 > localhost -> localhost length: 54 FTP-DATA C port=34166 > ^C > # > > That was an ftp-data transfer of the 20 Mbyte "contents" file, which I > was demo''ing earlier. I don''t see the file data being transferred. > > DTrace can see all of loopback traffic, including fused data transfers,That''s a documented shortcoming of these early access bits. See the section entitled "Known issues specific to Clearview" in the release notes for the bfu archives you downloaded. In short, for now, setting "ip:do_tcp_fusion = 0x0" in /etc/system is the workaround as we work on the fix.> > and the byte sizes sum up correctly.Fabulous.> > Sure, watching the rate of loopback connections is itself useful, so I > can see that Clearview can already solve many customer problems. > Once it can see fused data traffic also (without actually switching > fusion off), that would be great. :-)Yes, obviously. Thanks, -Seb
Brendan Gregg - Sun Microsystems wrote:>G''Day Darren, > >On Tue, Oct 10, 2006 at 03:21:40PM -0700, Darren.Reed at sun.com wrote: >[...] > > >>> # ./tcpio2.d >>> LADDR:PORT RADDR:PORT BYTES FLAGS >>> 192.168.1.185:79 <- 192.168.1.109:56145 0 (SYN) >>> 192.168.1.185:79 -> 192.168.1.109:56145 0 (SYN|ACK) >>> 192.168.1.185:79 <- 192.168.1.109:56145 0 (ACK) >>> 192.168.1.185:79 <- 192.168.1.109:56145 2 (PUSH|ACK) >>> 192.168.1.185:79 -> 192.168.1.109:56145 0 (ACK) >>> 192.168.1.185:79 -> 192.168.1.109:56145 126 (PUSH|ACK) >>> 192.168.1.185:79 <- 192.168.1.109:56145 0 (ACK) >>> 192.168.1.185:79 -> 192.168.1.109:56145 0 (FIN|ACK) >>> 192.168.1.185:79 <- 192.168.1.109:56145 0 (ACK) >>> 192.168.1.185:79 <- 192.168.1.109:56145 0 (FIN|ACK) >>> 192.168.1.185:79 -> 192.168.1.109:56145 0 (ACK) >>> >>> >>> >>Is this, by itself, a worthwhile goal of dtrace''s TCP provider? >>Why would I care to use dtrace for this when snoop/tcpdump does it? >> >> > >Noone said it was a worthwhile goal. This output is from the CEC 2006 >demonstration, and was used to demonstrate some of the capabilities of >the provider. Which it does nicely. > >It demonstrates that one can acess IP and TCP information with the >rich programming functionality and low performance costs of DTrace. > >No, it doesn''t show why the provider trumps snoop - so perhaps >you are right, and I shouldn''t copy-n-paste it as demonstration #1. > >This was the point of my comment. Not that it can''t be done, just that it''s not really an incredibly interesting thing to see.>>Or to put it differently, I''d never use dtrace for this task and >>I''d never suggest/recommend anyone use it for this task either. >>In my mind it is (and will always be) the wrong tool for this >>particular job. >> >> > >Such a snoop like output is supposed to trigger the imagination for >what customisations and enhancements one can make with the DTrace >programming language. > >Ahh....so when will dtrace take plugins for protocol descriptions? :)> ... > >>>And tcpaccept2.d aggregates inbound connections by hostname and >>>service name, >>> >>> # ./tcpaccept2.d >>> dtrace: script ''./tcpaccept2.d'' matched 1 probe >>> Tracing... Hit Ctrl-C to end. >>> ^C >>> HOSTNAME PORT COUNT >>> marlin6 finger 1 >>> bass login 2 >>> marlin ssh 2 >>> marlin telnet 2 >>> marlin finger 3 >>> >>> >>> >>I find this a more interesting demonstration of dtrace and TCP. >> >>Can you tell me the average lifespan of each connection? >>And that per-host? >> >> > >Great idea (and it rings a bell - customers have asked me this before) >- I''ve added it to the website (actual screenshots), >... > > >>What about for http, can you tell me the number of connections >>my web server gets for each host, on average how long each one >>takes (from accept to close) and the average number of bytes >>sent by TCP? >> >> > >Yes - I should add a bytes column to tcplifespan2.d. :) > >These are good questions - customers have asked me for such metrics >before, and they are the exact type of questions that the provider >will answer best. > >:-) I think if you can work from here and start answering other things, like average number of dup-ACK''s per TCP connection for each host, etc. These things are very hard to do with tcpdump/snoop (you''ll need to add some perl in there.) There is a whole host of questions around here that we might have wanted to ask in the past but for which we could never get an easy answer. Darren