eric kustarz
2006-Dec-08 03:06 UTC
[zfs-discuss] Re: [nfs-discuss] A Plea for Help: Thumper/ZFS/NFS/B43
Ben Rockwood wrote:
> I wanted to add one more piece of information to this problem that may
> or may not be helpful.
>
> On an NFS client, if we just do "ls" commands over and over and over we
> can snoop the wire and see TCP retransmits whenever the CPU is burned
> up. nfsstat doesn't record these retransmits; they are happening lower
> down. Here's an example packet exchange:
>
> Frame  Time   Packet            Time Delta
> 85     7.91   GETATTR Call      0.00
> 89     8.31   Retransmit of 85  0.407s
> 93     11.81  GETATTR Reply     3.497s
>
>
> When the CPU on the Thumper is "normal", transactions look like this:
>
> 22:38:16.52564 private.atlantis -> 10.71.165.6 NFS C GETATTR3 FH=0E05
> 22:38:16.52574 10.71.165.6 -> private.atlantis NFS R GETATTR3 OK
> 22:38:16.57938 private.atlantis -> 10.71.165.6 TCP D=2049 S=992
>     Ack=1937283399 Seq=4211323728 Len=0 Win=49640
>
>
> When the CPU is tapped out, it looks like this:
>
> 22:37:50.55974 private.atlantis -> 10.71.165.6 NFS C GETATTR3 FH=0E05
> 22:37:50.96940 private.atlantis -> 10.71.165.6 NFS C GETATTR3 FH=0E05
>     (retransmit)
> 22:37:50.96949 10.71.165.6 -> private.atlantis TCP D=992 S=2049
>     Ack=4211321824 Seq=1937281311 Len=0 Win=49640
> 22:37:53.84858 10.71.165.6 -> private.atlantis NFS R GETATTR3 OK
> 22:37:53.89939 private.atlantis -> 10.71.165.6 TCP D=2049 S=992
>     Ack=1937281427 Seq=4211321824 Len=0 Win=49640
>
>
> This finding caused us to re-evaluate the entire network scheme.
> However, I can't ignore the fact that this only happens when the CPUs
> are tapped out. Perhaps there is a correlation, perhaps not. Just wanted
> to throw it out there.
>
> benr.

Thanks for providing more info...

Yeah, I would think it's the TCP/NFS server dropping packets, or not getting
around to processing them in a timely manner, because it doesn't have enough
resources (CPU). There may be network inefficiencies in your setup, but I
think this is simply due to a lack of CPU on the server.

eric
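
To put numbers on the two traces quoted above, here is a minimal sketch that converts the snoop timestamps to seconds and compares the GETATTR call-to-reply latency in the "normal" and "tapped out" cases. The timestamps are copied from Ben's traces; the helper names (to_seconds, latency) and the dictionary layout are only illustrative assumptions, not part of snoop or any NFS tool.

```python
def to_seconds(stamp: str) -> float:
    """Convert an HH:MM:SS.fraction snoop timestamp to seconds since midnight."""
    hours, minutes, seconds = stamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def latency(trace: dict) -> float:
    """Seconds between the first GETATTR call and the GETATTR reply."""
    return to_seconds(trace["reply"]) - to_seconds(trace["call"])


# Timestamps copied from the two snoop traces quoted above.
normal = {"call": "22:38:16.52564", "reply": "22:38:16.52574"}
busy = {
    "call": "22:37:50.55974",
    "retransmit": "22:37:50.96940",
    "reply": "22:37:53.84858",
}

normal_latency = latency(normal)
busy_latency = latency(busy)
# How long the client waited before resending the unanswered call.
retransmit_gap = to_seconds(busy["retransmit"]) - to_seconds(busy["call"])

print(f"normal call -> reply: {normal_latency * 1000:.2f} ms")
print(f"busy call -> reply:   {busy_latency:.3f} s")
print(f"busy call -> resend:  {retransmit_gap:.3f} s")
```

With these timestamps the normal reply arrives in roughly 0.1 ms, while under load the client retransmits after about 0.41 s and the reply takes about 3.29 s in total. A server-side delay that large, with the retransmit acknowledged within a fraction of a millisecond, is consistent with the server being slow to process the request rather than the packet being lost on the wire.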