thr3ads.net - zfs discuss - [zfs-discuss] diagnosing read performance problem [Oct 2008]

Matt Harrison

2008-Oct-24 23:57 UTC

[zfs-discuss] diagnosing read performance problem

Hi all,

I've got a lot of video files on a zfs/cifs fileserver running SXCE. A
little while ago the dual onboard NICs died and I had to replace them with a
PCI 10/100 NIC. The system was fine for a couple of weeks but now the
performance when viewing a video file from the cifs share is appalling. Videos
stop and jerk with audio distortion.

I have tried this from several client machines so I'm pretty certain it lies
with the server but I'm unsure of the next step to find out the source of
the problem.

Is there any tool I should be using to find out if this is a zfs, network or
other problem?

Grateful for any ideas

Thanks

Matt

Bob Friesenhahn

2008-Oct-25 04:59 UTC

[zfs-discuss] diagnosing read performance problem

On Sat, 25 Oct 2008, Matt Harrison wrote:
> I've got a lot of video files on a zfs/cifs fileserver running SXCE. A
> little while ago the dual onboard NICs died and I had to replace them with a
> PCI 10/100 NIC. The system was fine for a couple of weeks but now the
> performance when viewing a video file from the cifs share is appalling. Videos
> stop and jerk with audio distortion.
>
> I have tried this from several client machines so I'm pretty certain it lies
> with the server but I'm unsure of the next step to find out the source of
> the problem.
Other people on this list who experienced the exact same problem 
ultimately determined that the problem was with the network card.  I 
recall that Intel NICs were the recommended solution.

Note that 100MBit is now considered to be a slow link and PCI is also 
considered to be slow.

Bob
=====================================
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

Matt Harrison

2008-Oct-25 07:57 UTC

[zfs-discuss] diagnosing read performance problem

Bob Friesenhahn wrote:
> Other people on this list who experienced the exact same problem
> ultimately determined that the problem was with the network card.  I
> recall that Intel NICs were the recommended solution.
>
> Note that 100MBit is now considered to be a slow link and PCI is also
> considered to be slow.
Thanks for the reply,

Yes I understand that 100mbit and pci are a bit outdated, unfortunately
I'm still campaigning to have our switches upgraded to gbit or 10gbit.

I will see if I can acquire an Intel NIC to test it with, however before
the problem with NICs started it was operating fine. It seems though that
there is an ongoing problem with NICs on this machine.

The onboard ones haven't so much died (they still allow me to use them
from the OS) but they just won't start up or accept there is a cable
plugged in. The PCI NIC does seem to be working and transfers to/from
the server seem ok except when there's video being moved.

I will do some testing and see if I can come up with a more definite
reason for the performance problems.

Thanks

Matt

Bob Friesenhahn

2008-Oct-25 16:10 UTC

[zfs-discuss] diagnosing read performance problem

On Sat, 25 Oct 2008, Matt Harrison wrote:
> The onboard ones haven't so much died (they still allow me to use them
> from the OS) but they just won't start up or accept there is a cable
> plugged in. The PCI nic does seem to be working and transfers to/from
> the server seem ok except when there's video being moved.
Hmmm, this may indicate that there is an ethernet cable problem.  Use
'netstat -I interface' (where interface is the interface name shown by
'ifconfig -a') to see if the interface error count is increasing.  If
you are using a "smart" switch, use the switch administrative interface
and see if the error count is increasing for the attached switch port.
Unfortunately your host can only see errors for packets it receives
and it may be that errors are occurring for packets it sends.

If the ethernet cable is easy to replace, then it may be easiest to 
simply replace it and use a different switch port to see if the 
problem just goes away.
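
The "is the error count increasing" check can be sketched in Python by sampling the counters twice and comparing them. This is only an illustration: the two sample outputs below are made up, and the column names (Ipkts, Ierrs, Opkts, Oerrs) assume the usual Solaris 'netstat -I' header. The parser looks columns up by header name rather than by position.

```python
def parse_counters(netstat_text, fields=("Ipkts", "Ierrs", "Opkts", "Oerrs")):
    """Return {field: int} from the first data row of 'netstat -I <if>' output."""
    lines = [ln for ln in netstat_text.splitlines() if ln.strip()]
    header = lines[0].split()
    row = lines[1].split()
    return {name: int(row[header.index(name)]) for name in fields}


def error_deltas(before, after):
    """Growth of the input and output error counters between two samples."""
    return {name: after[name] - before[name] for name in ("Ierrs", "Oerrs")}


# Hypothetical samples taken a minute apart (values invented for illustration).
SAMPLE_1 = """\
Name  Mtu  Net/Dest  Address  Ipkts   Ierrs Opkts   Oerrs Collis Queue
rtls0 1500 exodus    exodus   1204500 0     2310400 12    0      0
"""
SAMPLE_2 = """\
Name  Mtu  Net/Dest  Address  Ipkts   Ierrs Opkts   Oerrs Collis Queue
rtls0 1500 exodus    exodus   1209800 0     2325100 47    0      0
"""

deltas = error_deltas(parse_counters(SAMPLE_1), parse_counters(SAMPLE_2))
for name, growth in deltas.items():
    status = "increasing" if growth > 0 else "steady"
    print(f"{name}: +{growth} ({status})")
```

A counter that keeps growing between samples while traffic is flowing is the signal to look for; a single non-zero value on its own says little.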

Bob
=====================================
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/

Matt Harrison

2008-Oct-25 16:33 UTC

[zfs-discuss] diagnosing read performance problem

On Sat, Oct 25, 2008 at 11:10:42AM -0500, Bob Friesenhahn wrote:
> Hmmm, this may indicate that there is an ethernet cable problem.  Use
> 'netstat -I interface' (where interface is the interface name shown by
> 'ifconfig -a') to see if the interface error count is increasing.  If you
> are using a "smart" switch, use the switch administrative interface and see
> if the error count is increasing for the attached switch port.
> Unfortunately your host can only see errors for packets it receives and it
> may be that errors are occurring for packets it sends.
>
> If the ethernet cable is easy to replace, then it may be easiest to simply
> replace it and use a different switch port to see if the problem just goes
> away.
Ok, I've just tried 2 other cables, one doesn't even get a link light so
it's probably dead. The other one I had suspected was bad and indeed the
connection is terrible and the Oerr field in netstat does increase.

On the other hand, the Oerr field doesn't increase with the original cable,
however the video performance is still bad (although not as bad as with the
2nd replacement cable).

I will make up some new cables, and also place an order for an Intel
Pro100, as they are supposed to be really reliable.

Thanks

Matt

Nigel Smith

2008-Oct-26 01:50 UTC

[zfs-discuss] diagnosing read performance problem

Hi Matt
What chipset is your PCI network card?
(obviously, it's not Intel, but what is it?)
Do you know which driver the card is using?

You say '..The system was fine for a couple of weeks..'.
At that point did you change any software - do any updates or upgrades?
For instance, did you upgrade to a new build of OpenSolaris?

If not, then I would guess it's some sort of hardware problem.
Can you try different cables and a different switch - anything
in the path between client & server is suspect.

A mismatch of Ethernet duplex settings can cause problems - are
you sure this is OK?

To get an idea of how the network is running try this:

On the Solaris box, do an Ethernet capture with 'snoop' to a file.
http://docs.sun.com/app/docs/doc/819-2240/snoop-1m?a=view

 # snoop -d {device} -o {filename}

.. then while capturing, try to play your video file through the network.
Control-C to stop the capture.

You can then use Ethereal or WireShark to analyze the capture file.
On the 'Analyze' menu, select 'Expert Info'.
This will look through all the packets and will report
any warnings or errors it sees.
Regards
Nigel Smith
--
This message posted from opensolaris.org

Matt Harrison

2008-Oct-26 14:40 UTC

[zfs-discuss] diagnosing read performance problem

On Sat, Oct 25, 2008 at 06:50:46PM -0700, Nigel Smith wrote:
> Hi Matt
> What chipset is your PCI network card?
> (obviously, it's not Intel, but what is it?)
> Do you know which driver the card is using?
I believe it's some sort of Realtek (8139 probably). It's coming up as rtls0
> You say '..The system was fine for a couple of weeks..'.
> At that point did you change any software - do any updates or upgrades?
> For instance, did you upgrade to a new build of OpenSolaris?
No, since the original problem with the onboard NICs it hasn't been upgraded
or anything.
> If not, then I would guess it's some sort of hardware problem.
> Can you try different cables and a different switch - anything
> in the path between client & server is suspect.
Have tried different cables and switch ports, I will try a different switch
as soon as I can get some space on one of the others.
> A mismatch of Ethernet duplex settings can cause problems - are
> you sure this is OK?
Not 100% sure, but I will check as best I can.
> To get an idea of how the network is running try this:
>
> On the Solaris box, do an Ethernet capture with 'snoop' to a file.
> http://docs.sun.com/app/docs/doc/819-2240/snoop-1m?a=view
>
>  # snoop -d {device} -o {filename}
>
> .. then while capturing, try to play your video file through the network.
> Control-C to stop the capture.
>
> You can then use Ethereal or WireShark to analyze the capture file.
> On the 'Analyze' menu, select 'Expert Info'.
> This will look through all the packets and will report
> any warnings or errors it sees.
It's coming up with a huge number of "TCP Bad Checksum" errors, a few
"Previous Segment Lost" and a few "Fast retransmission".

Thanks

Matt

Nigel Smith

2008-Oct-27 00:42 UTC

[zfs-discuss] diagnosing read performance problem

Ok on the answers to all my questions.
There's nothing that really stands out as being obviously wrong.
Just out of interest, what build of OpenSolaris are you using?

One thing you could try on the Ethernet capture file, is to set
the WireShark 'Time' column like this:
"View > Time Display Format > Seconds Since Previous Displayed Packet"

Then look down the time column for any unusually high time delays
between packets. Any unusually high delays during
a data transfer phase, may indicate a problem.
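
The same scan of the time column can be sketched in Python, assuming the packet timestamps have been exported from the capture as a list of seconds. The timestamps below are invented, and the 0.2 second threshold is an arbitrary choice for illustration, not a recommended value.

```python
def long_gaps(timestamps, threshold=0.2):
    """Return (packet_index, gap_seconds) for every gap above the threshold.

    packet_index is the zero-based index of the packet that arrived
    after the gap.
    """
    gaps = []
    for i in range(1, len(timestamps)):
        gap = timestamps[i] - timestamps[i - 1]
        if gap > threshold:
            gaps.append((i, gap))
    return gaps


# Hypothetical capture times in seconds since the first packet.
capture_times = [0.000, 0.001, 0.003, 0.610, 0.611, 0.612, 0.900]

for index, gap in long_gaps(capture_times):
    print(f"packet {index}: {gap:.3f}s since previous packet")
```

This is equivalent to sorting the "Seconds Since Previous Displayed Packet" column and reading off the largest values, with the advantage that the packet positions are kept.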


Another thing you could try is measuring network performance
with a utility called 'iperf'.
It's not part of Solaris, so you would need to compile it.
Download the source from here:
http://sourceforge.net/projects/iperf/

I've just compiled the latest version 2.0.4 on snv_93
without problem, using the normal "configure, make, make install".

If you want to run 'iperf' on a windows box, you can
download a '.exe' of an older version here:
http://www.noc.ucf.edu/Tools/Iperf/

You can find tutorials on how to use it at these links:
http://www.openmaniak.com/iperf.php
http://www.enterprisenetworkingplanet.com/netos/article.php/3657236

I've just tried 'iperf' between my OpenSolaris pc & an old
Windows pc, both with low-cost realtek gigabit cards and
linked via a low-cost NetGear switch. I measured a TCP
bandwidth of 196 Mbit/sec in one direction and
145 Mbit/sec in the opposite direction.
(On OpenSolaris, Iperf was not able to increase
the default TCP window size of 48K bytes.)
Regards
Nigel Smith

Matt Harrison

2008-Oct-27 01:16 UTC

[zfs-discuss] diagnosing read performance problem

Nigel Smith wrote:
> Ok on the answers to all my questions.
> There's nothing that really stands out as being obviously wrong.
> Just out of interest, what build of OpenSolaris are you using?
>
> One thing you could try on the Ethernet capture file, is to set
> the WireShark 'Time' column like this:
> "View > Time Display Format > Seconds Since Previous Displayed Packet"
>
> Then look down the time column for any unusually high time delays
> between packets. Any unusually high delays during
> a data transfer phase, may indicate a problem.
Along with the errors that I noted previously, some of the packets
seem to be taking a rather long time (>0.5s).

I've taken a cap file from wireshark in the hope it clears up some
information. The capture is less than a minute of playing a video over
the cifs share.

It's a little too large to send in a mail so I've posted it at

http://distfiles.genestate.com/_00001_20081027010354.zip
> Another thing you could try is measuring network performance
> with a utility called 'iperf'.
Thanks for pointing this program out, I've just run it to the gentoo
firewall we've got, and it's reporting good speeds for the network.

Thanks

Matt

Matt Harrison

2008-Oct-27 01:17 UTC

[zfs-discuss] diagnosing read performance problem

Nigel Smith wrote:
> Ok on the answers to all my questions.
> There's nothing that really stands out as being obviously wrong.
> Just out of interest, what build of OpenSolaris are you using?
Damn forgot to add that, I'm running SXCE snv_97.

Thanks

Matt

Casper.Dik at Sun.COM

2008-Oct-27 16:24 UTC

[zfs-discuss] Scrub is suddenly done.

I'm running a scrub and I'm running "zpool status" every 5 minutes.

This happens:

  pool: export
 state: ONLINE
 scrub: scrub in progress for 1h16m, 44.91% done, 1h34m to go
config:

        NAME        STATE     READ WRITE CKSUM
        export      ONLINE       0     0     0
          c0d0s7    ONLINE       0     0     0

errors: No known data errors

And then 5 minutes later (or probably even 1 minute), this happens:

Mon Oct 27 17:03:21 MET 2008
  pool: export
 state: ONLINE
 scrub: scrub completed after 1h17m with 0 errors on Mon Oct 27 16:59:30 2008
config:

        NAME        STATE     READ WRITE CKSUM
        export      ONLINE       0     0     0
          c0d0s7    ONLINE       0     0     0

Is this normal?  

Casper

Nigel Smith

2008-Oct-28 01:18 UTC

[zfs-discuss] diagnosing read performance problem

Hi Matt
Unfortunately, I'm having problems un-compressing that zip file.
I tried with 7-zip and WinZip reports this:

skipping _00001_20081027010354.cap: this file was compressed using an unknown compression method.
   Please visit www.winzip.com/wz54.htm for more information.
   The compression method used for this file is 98.

Please can you check it out, and if necessary use a more standard
compression algorithm.
Download File Size was 8,782,584 bytes.
Thanks
Nigel Smith

Matt Harrison

2008-Oct-28 13:26 UTC

[zfs-discuss] diagnosing read performance problem

On Mon, Oct 27, 2008 at 06:18:59PM -0700, Nigel Smith wrote:
> Hi Matt
> Unfortunately, I'm having problems un-compressing that zip file.
> I tried with 7-zip and WinZip reports this:
>
> skipping _00001_20081027010354.cap: this file was compressed using an unknown compression method.
>    Please visit www.winzip.com/wz54.htm for more information.
>    The compression method used for this file is 98.
>
> Please can you check it out, and if necessary use a more standard
> compression algorithm.
> Download File Size was 8,782,584 bytes.
Apologies, I had let winzip compress it with whatever it thought was best,
apparently this was the best method for size, not compatibility.

There's a new upload under the same URL compressed with 2.0 compatible
compression. Fingers crossed that works better for you.

Thanks

Matt

Nigel Smith

2008-Oct-29 00:30 UTC

[zfs-discuss] diagnosing read performance problem

Hi Matt.
Ok, got the capture and successfully 'unzipped' it.
(Sorry, I guess I'm using old software to do this!)

I see 12840 packets. The capture is a TCP conversation
between two hosts using the SMB aka CIFS protocol.

10.194.217.10 is the client - Presumably Windows?
10.194.217.3 is the server - Presumably OpenSolaris - CIFS server?

Using WireShark,
Menu: 'Statistics > Endpoints' shows:

The Client has transmitted 4849 packets, and
the Server has transmitted 7991 packets.

Menu: 'Analyze > Expert info Composite':
The 'Errors' tab shows:
4849 packets with a 'Bad TCP checksum' error - These are all
transmitted by the Client.

(Apply a filter of 'ip.src_host == "10.194.217.10"'
to confirm this.)

The 'Notes' tab shows:
..numerous 'Duplicate ACKs'
For example, for 60 different ACK packets, the exact same packet was
re-transmitted 7 times!
Packet #3718 was duplicated 17 times.
Packet #8215 was duplicated 16 times.
Packet #6421 was duplicated 15 times, etc.
These bursts of duplicate ACK packets are all coming from the client side.
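
A rough Python sketch of that duplicate-ACK count, assuming the capture has been reduced to (source address, ACK number) pairs in capture order. This is a simplification: WireShark's own duplicate-ACK detection also looks at window size, payload length and other fields, and the packet list here is invented for illustration.

```python
from itertools import groupby

CLIENT = "10.194.217.10"
SERVER = "10.194.217.3"


def duplicate_ack_runs(packets, sender, min_dups=3):
    """Find bursts where one sender repeats the same ACK number.

    packets: iterable of (src, ack_number) in capture order.
    Returns (ack_number, duplicates) pairs, where duplicates is the
    number of repeats after the first ACK with that number.
    """
    acks = [ack for src, ack in packets if src == sender]
    runs = []
    for ack, group in groupby(acks):
        dups = sum(1 for _ in group) - 1
        if dups >= min_dups:
            runs.append((ack, dups))
    return runs


# Hypothetical packet list: the client repeats ACK 2460 while the
# server keeps sending data in between.
packets = [
    (CLIENT, 1000),
    (CLIENT, 2460),
    (SERVER, 500),
    (CLIENT, 2460),
    (SERVER, 500),
    (CLIENT, 2460),
    (CLIENT, 2460),
    (CLIENT, 3920),
]

for ack, dups in duplicate_ack_runs(packets, CLIENT):
    print(f"ACK {ack} duplicated {dups} times by {CLIENT}")
```

Filtering to one sender first matters because the other side's packets are interleaved with the repeated ACKs in a real capture.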

This certainly looks strange to me - I've not seen anything like this before.
It's not going to help the speed to unnecessarily duplicate packets like
that, and these bursts are often closely followed by a short delay, ~0.2 seconds.
And as far as I can see, it looks to point towards the client as the source
of the problem.
If you are seeing the same problem with other client PCs, then I guess we need to
suspect the 'switch' that connects them.

Ok, that's my thoughts & conclusion for now.
Maybe you could get some more snoop captures with other clients, and
with a different switch, and do a similar analysis.
Regards
Nigel Smith

Richard Elling

2008-Oct-29 00:45 UTC

[zfs-discuss] diagnosing read performance problem

I replied to Matt directly, but didn't hear back.  It may be a driver issue
with checksum offloading.  Certainly the symptoms are consistent.
To test with a workaround see
    http://bugs.opensolaris.org/view_bug.do?bug_id=6686415
 -- richard

Matt Harrison

2008-Oct-29 09:29 UTC

[zfs-discuss] diagnosing read performance problem

On Tue, Oct 28, 2008 at 05:30:55PM -0700, Nigel Smith wrote:
> Hi Matt.
> Ok, got the capture and successfully 'unzipped' it.
> (Sorry, I guess I'm using old software to do this!)
>
> I see 12840 packets. The capture is a TCP conversation
> between two hosts using the SMB aka CIFS protocol.
>
> 10.194.217.10 is the client - Presumably Windows?
> 10.194.217.3 is the server - Presumably OpenSolaris - CIFS server?
All correct so far
> Using WireShark,
> Menu: 'Statistics > Endpoints' shows:
>
> The Client has transmitted 4849 packets, and
> the Server has transmitted 7991 packets.
>
> Menu: 'Analyze > Expert info Composite':
> The 'Errors' tab shows:
> 4849 packets with a 'Bad TCP checksum' error - These are all
> transmitted by the Client.
>
> (Apply a filter of 'ip.src_host == "10.194.217.10"' to confirm this.)
>
> The 'Notes' tab shows:
> ..numerous 'Duplicate ACKs'
> For example, for 60 different ACK packets, the exact same packet was
> re-transmitted 7 times!
> Packet #3718 was duplicated 17 times.
> Packet #8215 was duplicated 16 times.
> Packet #6421 was duplicated 15 times, etc.
> These bursts of duplicate ACK packets are all coming from the client side.
>
> This certainly looks strange to me - I've not seen anything like this before.
> It's not going to help the speed to unnecessarily duplicate packets like
> that, and these bursts are often closely followed by a short delay, ~0.2 seconds.
> And as far as I can see, it looks to point towards the client as the source
> of the problem.
> If you are seeing the same problem with other client PCs, then I guess we need to
> suspect the 'switch' that connects them.
I have another switch on the way to move to. I will see if this helps.

Thanks for your input

Matt

Matt Harrison

2008-Oct-29 09:31 UTC

[zfs-discuss] diagnosing read performance problem

On Tue, Oct 28, 2008 at 05:45:48PM -0700, Richard Elling wrote:
> I replied to Matt directly, but didn't hear back.  It may be a driver issue
> with checksum offloading.  Certainly the symptoms are consistent.
> To test with a workaround see
>     http://bugs.opensolaris.org/view_bug.do?bug_id=6686415
Hi, Sorry for not replying, we had some problems with our email provider
yesterday and I was up all night restoring backups.

I did try the workaround, but it didn't have any effect, presumably because
it's not using the rge driver as you stated before.

I'll try swapping the switch out and post back my results.

Many Thanks

Matt

Richard Elling

2008-Oct-29 15:50 UTC

[zfs-discuss] diagnosing read performance problem

Matt Harrison wrote:
> On Tue, Oct 28, 2008 at 05:45:48PM -0700, Richard Elling wrote:
>
>> I replied to Matt directly, but didn't hear back.  It may be a driver issue
>> with checksum offloading.  Certainly the symptoms are consistent.
>> To test with a workaround see
>>     http://bugs.opensolaris.org/view_bug.do?bug_id=6686415
>
> Hi, Sorry for not replying, we had some problems with our email provider
> yesterday and I was up all night restoring backups.
>
> I did try the workaround, but it didn't have any effect, presumably because
> it's not using the rge driver as you stated before.
The dohwcksum is not an rge option, it is an ip option (hence it is
named ip:dohwcksum) and it transcends NIC drivers.  But if that
didn't fix anything, then you should be able to safely ignore it.
> I'll try swapping the switch out and post back my results.
Yeah, I've seen this sort of thing before, too.  I once had a switch
that lost its mind and wouldn't let big packets through unscathed.
We could telnet, ping, ftp, and do all sorts of things through the switch,
but we couldn't push NFS through.  These sorts of failures can be
difficult to isolate.
 -- richard

Nigel Smith

2008-Oct-29 17:01 UTC

[zfs-discuss] diagnosing read performance problem

Hi Matt
Can you just confirm if that Ethernet capture file, that you made available,
was done on the client, or on the server. I'm beginning to suspect you
did it on the client.

You can get a capture file on the server (OpenSolaris) using the 'snoop'
command, as per one of my previous emails.  You can still view the
capture file with WireShark as it supports the 'snoop' file format.

Normally it would not be too important where the capture was obtained,
but here, where something strange is happening, it could be critical to
understanding what is going wrong and where.

It would be interesting to do two separate captures - one on the client
and one on the server, at the same time, as this would show if the
switch was causing disruption.  Try to have the clocks on the client &
server synchronised as closely as possible.
Thanks
Nigel Smith

Matt Harrison

2008-Oct-29 18:25 UTC

[zfs-discuss] diagnosing read performance problem

On Wed, Oct 29, 2008 at 10:01:09AM -0700, Nigel Smith wrote:
> Hi Matt
> Can you just confirm if that Ethernet capture file, that you made available,
> was done on the client, or on the server. I'm beginning to suspect you
> did it on the client.
That capture was done from the client.
> You can get a capture file on the server (OpenSolaris) using the 'snoop'
> command, as per one of my previous emails.  You can still view the
> capture file with WireShark as it supports the 'snoop' file format.
I am uploading a snoop from the server to

http://distfiles.genestate.com/snoop.zip

Please note this snoop will include traffic to ssh as I can't work out how
to filter that out :P
> Normally it would not be too important where the capture was obtained,
> but here, where something strange is happening, it could be critical to
> understanding what is going wrong and where.
>
> It would be interesting to do two separate captures - one on the client
> and one on the server, at the same time, as this would show if the
> switch was causing disruption.  Try to have the clocks on the client &
> server synchronised as closely as possible.
Clocks are synced via ntp as we're using Active Directory with CIFS.

On another note, I've just moved the offending network to another switch and
it's even worse I think. I've noticed that under high load, the link light
for the server's connection blinks on and off, not quite steadily but about
every 2 seconds.

This appears in /var/adm/messages:

Oct 29 18:24:22 exodus mac: [ID 435574 kern.info] NOTICE: rtls0 link up, 100 Mbps, full duplex
Oct 29 18:24:24 exodus mac: [ID 486395 kern.info] NOTICE: rtls0 link down
Oct 29 18:24:25 exodus mac: [ID 435574 kern.info] NOTICE: rtls0 link up, 100 Mbps, full duplex
Oct 29 18:24:27 exodus mac: [ID 486395 kern.info] NOTICE: rtls0 link down
Oct 29 18:24:28 exodus mac: [ID 435574 kern.info] NOTICE: rtls0 link up, 100 Mbps, full duplex
Oct 29 18:24:30 exodus mac: [ID 486395 kern.info] NOTICE: rtls0 link down
Oct 29 18:24:31 exodus mac: [ID 435574 kern.info] NOTICE: rtls0 link up, 100 Mbps, full duplex

I think it's got to be the NIC, the network runs full duplex quite happily
so I don't think it's an auto-neg problem.
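
To put numbers on the link flapping, the messages can be summarised with a short Python sketch. The log text is the excerpt quoted above; the regular expression is an assumption based on that excerpt only, and the time arithmetic assumes all events fall on the same day.

```python
import re

# Matches the time of day, the interface name and the new link state.
LINK_EVENT = re.compile(
    r"(\d\d):(\d\d):(\d\d) \S+ mac: .* NOTICE: (\S+) link (up|down)"
)

LOG = """\
Oct 29 18:24:22 exodus mac: [ID 435574 kern.info] NOTICE: rtls0 link up, 100 Mbps, full duplex
Oct 29 18:24:24 exodus mac: [ID 486395 kern.info] NOTICE: rtls0 link down
Oct 29 18:24:25 exodus mac: [ID 435574 kern.info] NOTICE: rtls0 link up, 100 Mbps, full duplex
Oct 29 18:24:27 exodus mac: [ID 486395 kern.info] NOTICE: rtls0 link down
Oct 29 18:24:28 exodus mac: [ID 435574 kern.info] NOTICE: rtls0 link up, 100 Mbps, full duplex
Oct 29 18:24:30 exodus mac: [ID 486395 kern.info] NOTICE: rtls0 link down
Oct 29 18:24:31 exodus mac: [ID 435574 kern.info] NOTICE: rtls0 link up, 100 Mbps, full duplex
"""


def link_events(log_text):
    """Return (seconds_since_midnight, interface, state) for each link event."""
    events = []
    for line in log_text.splitlines():
        match = LINK_EVENT.search(line)
        if match:
            hours, minutes, seconds, nic, state = match.groups()
            stamp = int(hours) * 3600 + int(minutes) * 60 + int(seconds)
            events.append((stamp, nic, state))
    return events


events = link_events(LOG)
downs = [stamp for stamp, _, state in events if state == "down"]
intervals = [later - earlier for earlier, later in zip(downs, downs[1:])]

print(f"link down events: {len(downs)}")
print(f"seconds between drops: {intervals}")
```

Run against the whole of /var/adm/messages, the same summary would show whether the drops only occur under load or continue when the link is idle.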

Thanks for sticking with this :)

Matt

Nigel Smith

2008-Oct-30 00:32 UTC

[zfs-discuss] diagnosing read performance problem

Hi Matt
In your previous capture, (which you have now confirmed was done
on the Windows client), all those 'Bad TCP checksum' packets sent by the client
are explained, because you must be doing hardware TCP checksum offloading
on the client network adaptor.  WireShark will capture the packets before
that hardware calculation is done, so the checksums all appear to be wrong,
as they have not yet been calculated!

  http://wiki.wireshark.org/TCP_checksum_offload
  http://www.wireshark.org/docs/wsug_html_chunked/ChAdvChecksums.html
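
Why an unfilled checksum field looks like an error can be shown with a small Python sketch of the 16-bit ones'-complement checksum from RFC 1071. It is a toy: the real TCP checksum also covers a pseudo-header and the full TCP header, and the 2-byte "checksum field" plus 4-byte payload below are invented for illustration.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def verifies(segment: bytes) -> bool:
    """A segment that carries a correct checksum field sums to zero."""
    return internet_checksum(segment) == 0


payload = b"\x12\x34\x56\x78"

# What a capture on the sending host may see: the checksum field is
# still a placeholder because the NIC has not filled it in yet.
as_captured = b"\x00\x00" + payload

# What goes on the wire after the NIC computes the checksum.
checksum = internet_checksum(as_captured)
on_the_wire = checksum.to_bytes(2, "big") + payload

print(f"computed checksum: {checksum:#06x}")
print(f"captured before offload verifies: {verifies(as_captured)}")
print(f"segment on the wire verifies:     {verifies(on_the_wire)}")
```

The segment the receiver sees is fine; only the copy taken on the sending host, before the hardware has done its work, fails verification.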

Ok, so let's look at the new capture, 'snoop'ed on the OpenSolaris box.

I was surprised how small that snoop capture file was
 - only 753400 bytes after unzipping.
I soon realized why...

The strange thing is that I'm only seeing half of the conversation!
I see packets sent from client to server.
That is from source: 10.194.217.10 to destination: 10.194.217.3

I can also see some packets from
source: 10.194.217.5 (Your AD domain controller) to destination: 10.194.217.3

But you've not captured anything transmitted from your
OpenSolaris server - source: 10.194.217.3

(I checked, and I did not have any filters applied in WireShark
that would cause the missing half!)
Strange! I'm not sure how you did that.

The half of the conversation that I can see looks fine - there
does not seem to be any problem.  I'm not seeing any duplication
of ACKs from the client in this capture.
(So again somewhat strange, unless you've fixed the problem!)

I'm assuming you're using a single network card in the Solaris server,
but maybe you had better just confirm that.

Regarding not capturing SSH traffic and only capturing traffic from
(& hopefully to) the client, try this:

 # snoop -o test.cap -d rtls0 host 10.194.217.10 and not port 22

Regarding those 'link down', 'link up' messages in '/var/adm/messages':
I can tie up some of those events with your snoop capture file,
but it just shows that no packets are being received while the link is down,
which is exactly what you would expect.
But dropping the link for a second will surely disrupt your video playback!

If the switch is ok, and the cable from the switch is ok, then it does
now point towards the network card in the OpenSolaris box.
Maybe as simple as a bad mechanical connection on the cable socket....

BTW, just run '/usr/X11/bin/scanpci' and identify the 'vendor id' and
'device id' for the network card, just in case it turns out to be a driver bug.
Regards
Nigel Smith

Matt Harrison

2008-Oct-30 01:26 UTC

[zfs-discuss] diagnosing read performance problem

On Wed, Oct 29, 2008 at 05:32:39PM -0700, Nigel Smith
wrote:> Hi Matt
> In your previous capture, (which you have now confirmed was done
> on the Windows client), all those ''Bad TCP checksum''
packets sent by the client,
> are explained, because you must be doing hardware TCP checksum offloading
> on the client network adaptor.  WireShark will capture the packets before
> that hardware calculation is done, so the checksum all appear to be wrong,
> as they have not yet been calculated!
I know that the client I was using has an nForce board with nVidia network
controllers. There is an option to offload to hardware but I believe that
was disabled.
> The strange thing is that I''m only seeing half of the
conversation!
> I see packets sent from client to server.
> That is from source: 10.194.217.10 to destination: 10.194.217.3
> 
> I can also see some packets from
> source: 10.194.217.5 (Your AD domain controller) to destination 
10.194.217.3
> 
> But you''ve not capture anything transmitted from your
> OpenSolaris server - source: 10.194.217.3
> 
> (I checked, and I did not have any filters applied in WireShark
> that would cause the missing half!)
> Strange! I''m not sure how you did that.
I believe i was using the wrong filter expression...my bad :(
> The half of the conversation that I can see looks fine - there
> does not seem to be any problem.  I'm not seeing any duplication
> of ACKs from the client in this capture.
> (So again somewhat strange, unless you've fixed the problem!)
>
> I'm assuming you're using a single network card in the Solaris server,
> but maybe you had better just confirm that.
Confirmed, there is a single PCI NIC that I'm using (there are the dual
onboard NICs but they don't work for me anymore).
> Regarding not capturing SSH traffic and only capturing traffic from
> (& hopefully to) the client, try this:
> 
>  # snoop -o test.cap -d rtls0 host 10.194.217.10 and not port 22
Much better thanks. I am attaching a second snoop from the server with the
full conversation.

http://distfiles.genestate.com/snoop2.zip
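
One way to check that a capture really holds both directions before posting it is to count packets per source address. The following is a minimal sketch and not something from the thread: `count_by_source` is a hypothetical helper, and it assumes snoop's one-line summary output prints each packet with `SRC -> DST` as separate fields.

```shell
# Count packets per source address in snoop summary output read from stdin.
# Assumption: every summary line contains "SRC -> DST" as whitespace-separated fields.
count_by_source() {
    awk '{
        for (i = 2; i < NF; i++) {
            # The field just before "->" is taken as the sender.
            if ($i == "->") { seen[$(i - 1)]++; break }
        }
    }
    END { for (src in seen) print src, seen[src] }' | LC_ALL=C sort
}

# Possible usage on the server (-r keeps addresses numeric, -i reads a saved capture):
#   snoop -r -i test.cap | count_by_source
```

If the server's own address (10.194.217.3 here) never shows up as a source, the capture is missing the server-to-client half.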

Incidentally, this is talking to a different client, which, although it
doesn't show checksum errors, does still have a load of duplicate ACKs. If this
confuses the issue, I can do it from the old client as soon as it becomes
free.
> Regarding those 'link down', 'link up' messages, '/var/adm/messages'.
> I can tie up some of those events with your snoop capture file,
> but it just shows that no packets are being received while the link is down,
> which is exactly what you would expect.
> But dropping the link for a second will surely disrupt your video playback!
> 
> If the switch is ok, and the cable from the switch is ok, then it does
> now point towards the network card in the OpenSolaris box.  
> Maybe as simple as a bad mechanical connection on the cable socket....
Very possible. I have an Intel Pro 1000 and a new GB switch on the way.
> BTW, just run '/usr/X11/bin/scanpci' and identify the 'vendor id' and
> 'device id' for the network card, just in case it turns out to be a driver bug.
pci bus 0x0001 cardnum 0x06 function 0x00: vendor 0x10ec device 0x8139
 Realtek Semiconductor Co., Ltd. RTL-8139/8139C/8139C+

and the two onboards that no longer function:

pci bus 0x0000 cardnum 0x08 function 0x00: vendor 0x10de device 0x0373
 nVidia Corporation MCP55 Ethernet

pci bus 0x0000 cardnum 0x09 function 0x00: vendor 0x10de device 0x0373
 nVidia Corporation MCP55 Ethernet

Thanks

Matt

Nigel Smith

2008-Oct-31 00:22 UTC

[zfs-discuss] diagnosing read performance problem

Hi Matt
Well this time you have filtered out any SSH traffic on port 22 successfully.

But I'm still only seeing half of the conversation!
I see packets sent from client to server.
That is from source: 10.194.217.12 to destination: 10.194.217.3
So a different client IP this time

And the Duplicate ACK packets (often long bursts) are back in this capture.
I've looked at these a little bit more carefully this time,
and I now notice it's using the 'TCP selective acknowledgement' feature (SACK)
on those packets.

Now this is not something I've come across before, so I need to do some
googling!  SACK is defined in RFC 2018.

 http://www.ietf.org/rfc/rfc2018.txt

I found this explanation of when SACK is used:

 http://thenetworkguy.typepad.com/nau/2007/10/one-of-the-most.html
 http://thenetworkguy.typepad.com/nau/2007/10/tcp-selective-a.html

This seems to indicate these 'SACK' packets are triggered as a result
of 'lost packets'; in this case, it must be the packets sent back from
your server to the client, that is, during your video playback.

Of course I'm not seeing ANY of those packets in this capture
because there are none captured from server to client!
I'm still not sure why you cannot seem to capture these packets!

Oh, by the way, I probably should advise you to run...

 # netstat -i

..on the OpenSolaris box, to see if any errors are being counted
on the network interface.

Are you still seeing the link going up/down in '/var/adm/messages'?
You are never going to do any good while that is happening.
I think you need to try a different network card in the server.
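
A quick way to answer the link up/down question is to count the matching lines in the system log. This is only a sketch: `count_link_events` is a hypothetical helper, and the message text is driver-specific, so the 'link down' and 'link up' patterns are assumptions that may need adjusting.

```shell
# Count "link down" and "link up" lines in a syslog-style file.
# Assumption: the NIC driver logs those exact phrases (case-insensitive match).
count_link_events() {
    logfile=${1:-/var/adm/messages}
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    down=$(grep -ci 'link down' "$logfile" || true)
    up=$(grep -ci 'link up' "$logfile" || true)
    echo "down=$down up=$up"
}

# Possible usage:
#   count_link_events /var/adm/messages
```

A count that keeps growing between runs means the link is still flapping.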
Regards
Nigel Smith
-- 
This message posted from opensolaris.org

Matt Harrison

2008-Oct-31 01:03 UTC

[zfs-discuss] diagnosing read performance problem

Nigel Smith wrote:
> Hi Matt
> Well this time you have filtered out any SSH traffic on port 22 successfully.
> 
> But I'm still only seeing half of the conversation!
Grr this is my day, I think I know what the problem was... user error as 
I'm not used to snoop.
> I see packets sent from client to server.
> That is from source: 10.194.217.12 to destination: 10.194.217.3
> So a different client IP this time
> 
> And the Duplicate ACK packets (often long bursts) are back in this capture.
> I've looked at these a little bit more carefully this time,
> and I now notice it's using the 'TCP selective acknowledgement' feature (SACK)
> on those packets.
> 
> Now this is not something I've come across before, so I need to do some
> googling!  SACK is defined in RFC 2018.
> 
>  http://www.ietf.org/rfc/rfc2018.txt
> 
> I found this explanation of when SACK is used:
> 
>  http://thenetworkguy.typepad.com/nau/2007/10/one-of-the-most.html
>  http://thenetworkguy.typepad.com/nau/2007/10/tcp-selective-a.html
> 
> This seems to indicate these 'SACK' packets are triggered as a result
> of 'lost packets'; in this case, it must be the packets sent back from
> your server to the client, that is, during your video playback.
Well that's a bit above me. I can understand the lost packets though, it 
sounds about right for the situation.
> Of course I'm not seeing ANY of those packets in this capture
> because there are none captured from server to client!
> I'm still not sure why you cannot seem to capture these packets!
I think I know the problem: I thought I should enable promiscuous mode, 
so I quickly scanned the help output and added the -P switch. However, 
that does the opposite of what I thought and takes snoop out of 
promiscuous mode.
> Oh, by the way, I probably should advise you to run...
> 
>  # netstat -i
Yes one of the previous replies to this thread advised me to try that. 
The count does increase, although quite slowly as far as I can tell.

After being up for 6 hours with a few video playback tests the Oerr 
count sits at 92 currently.
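
To track a counter like that Oerr figure over time, the error columns can be pulled out of `netstat -i` and compared between runs. This is a minimal sketch under stated assumptions: `iface_errors` is a hypothetical helper, and it assumes the output has a header row starting with `Name` that contains `Ierrs` and `Oerrs` columns.

```shell
# Print "interface Ierrs Oerrs" for each interface in `netstat -i` output on stdin.
# Columns are located by header name, because the layout differs between systems.
iface_errors() {
    awk '
    $1 == "Name" {
        for (i = 1; i <= NF; i++) {
            if ($i == "Ierrs") in_col = i
            if ($i == "Oerrs") out_col = i
        }
        next
    }
    # Skip blank or short lines; print only once both columns are known.
    in_col && out_col && NF >= out_col { print $1, $in_col, $out_col }'
}

# Possible usage:
#   netstat -i | iface_errors
```

Running it before and after a test playback shows whether the output error count moves while video is being streamed.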
> ..on the OpenSolaris box, to see if any errors are being counted
> on the network interface.
> 
> Are you still seeing the link going up/down in '/var/adm/messages'?
> You are never going to do any good while that is happening.
> I think you need to try a different network card in the server.
Strangely the link up/down problem was only present on the second switch 
I tried (which works perfectly for other connections). On the first 
switch the link appears stable at first glance; however, we're getting 
these duplicate ACKs and checksum errors (although the checksum errors might be 
caused by the hardware offloading of that client as you pointed out).

I've got a couple of brand new Intel Pro 1000s and a new switch arriving
by courier tomorrow morning, so with any luck I should see some 
difference.

I'm getting a bit busy but I will attempt to make another snoop 
*without* disabling promiscuous mode.

Thanks for all your input

Matt

Matt Harrison

2008-Oct-31 11:52 UTC

[zfs-discuss] diagnosing read performance problem

Ok, I have received a new set of NICs and a new switch and the problem still
remains.

Just for something to do I ran some tests:

Copying a 200Mb file over scp from the main problem workstation to a totally
unrelated gentoo linux box. Absolutely no problems.

So I thought it was down to the zfs fileserver. Then I ran the same test to
the zfs filer just to check. Absolutely no problems again...."!$%?&%^

I may just be getting light-headed from the hair pulling, but it seems that
the problem only occurs when the traffic is going thru the CIFS server. 

I'm going to write a new thread to cifs-discuss and provide them some
captures, maybe they have a clue why this might happen.

I'm also going to switch back to the snv_95 BE I still have on the server,
it's possible it might have some effect.

Thanks

Matt

Matt Harrison

2008-Oct-31 12:12 UTC

[zfs-discuss] diagnosing read performance problem

On Fri, Oct 31, 2008 at 11:52:09AM +0000, Matt Harrison wrote:
> Ok, I have received a new set of NICs and a new switch and the problem still
> remains.
> 
> Just for something to do I ran some tests:
> 
> Copying a 200Mb file over scp from the main problem workstation to a totally
> unrelated gentoo linux box. Absolutely no problems.
> 
> So I thought it was down to the zfs fileserver. Then I ran the same test to
> the zfs filer just to check. Absolutely no problems again...."!$%?&%^
> 
> I may just be getting light-headed from the hair pulling, but it seems that
> the problem only occurs when the traffic is going thru the CIFS server. 
Wrong, wrong, wrong. The problem only manifests when copying data *from* the
filer, not to.
> I'm going to write a new thread to cifs-discuss and provide them some
> captures, maybe they have a clue why this might happen.
> 
> I'm also going to switch back to the snv_95 BE I still have on the server,
> it's possible it might have some effect.
Well I would, but the 95 BE keeps rebooting on me.

<me>goes off for a cry</me>

Matt

Matt Harrison

2008-Oct-31 17:30 UTC

[zfs-discuss] diagnosing read performance problem

Well, somehow it's fixed:

Since putting in the new Intel card, the transfer from the box dropped so
badly, I couldn't even copy a snoop from it.

So I removed the dohwcksum line from /etc/system and rebooted. Then just to
clean up a bit I disabled the onboard NICs in the BIOS.
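
The post does not show the line that was removed. On Solaris builds of that era, hardware checksum offload was commonly switched off with an /etc/system entry of the following form; the exact line on this server is an assumption, shown only to illustrate what was taken out.

```
* Assumed form of the removed entry; lines starting with * are comments in /etc/system.
* Setting the tunable to 0 makes the IP module compute checksums in software.
set ip:dohwcksum=0
```

Deleting the entry and rebooting returns the system to its default behaviour, where the NIC driver may offload checksums to hardware if it supports that.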

Now I'm still seeing the duplicate ACKs and the checksum errors from that
client, but the transfers have sped right up.

Video playback and all other copying from the server are now working again
without any problems so far.

Thanks to all that have contributed to this thread, it really did help me
organise my thoughts.

Matt
