Janåke Rönnblom
2008-Oct-08 19:58 UTC
[zfs-discuss] Troubleshooting ZFS performance with SIL3124 cards
Hi!

I have a problem with ZFS and most likely the SATA PCI-X controllers. I run
OpenSolaris 2008.11 snv_98, and my hardware is a Sun Netra X4200 M2 with
3 SIL3124 PCI-X cards with 4 eSATA ports each, connected to 3 1U disk
chassis which each hold 4 SATA disks (Seagate ES.2, 500 and 750 GB) for a
total of 12 disks. Every disk has its own eSATA cable connected to a port
on the PCI-X cards.

The problem I have is that disk access seems to stop for a few seconds and
then continue. This happens every few seconds, and the end result is that
the performance is terrible and unusable.

The idea was to use this box for serving iSCSI to a Windows 2003 Server.
However, with IOmeter on the Windows box and looking at Task Manager, I
noticed that the speed pulses from 90% to 0% all the time. Investigating
further, I noticed that I get the same behavior during a simple cp on the
localhost.

/usr/X11/bin/scanpci gives me this information:

pci bus 0x0006 cardnum 0x01 function 0x00: vendor 0x1095 device 0x3124
 Silicon Image, Inc. SiI 3124 PCI-X Serial ATA Controller

pci bus 0x0084 cardnum 0x01 function 0x00: vendor 0x1095 device 0x3124
 Silicon Image, Inc. SiI 3124 PCI-X Serial ATA Controller

pci bus 0x0088 cardnum 0x01 function 0x00: vendor 0x1095 device 0x3124
 Silicon Image, Inc. SiI 3124 PCI-X Serial ATA Controller

c5*, c6* and c7* are the eSATA disks.

zpool create -f zfsatan mirror c5t0d0 c5t1d0 mirror c5t2d0 c5t3d0 \
  mirror c6t0d0 c6t1d0 mirror c6t2d0 c6t3d0 \
  mirror c7t0d0 c7t1d0 mirror c7t2d0 c7t3d0
zfs create zfsatan/fs01

-bash-3.2# time dd if=/dev/zero bs=1024x1024x1024 count=8 of=/zfsatan/fs01/storfil
8+0 records in
8+0 records out

real    2m58.863s
user    0m0.001s
sys     0m10.636s

For this run, 8192 MB / 179 s gives me around 46 MB/s... That is really
sucky speed for 12 drives. However, this speed varies, since the hang-ups
seem to occur at random and for a random length of time.

If you look at the output from iostat -cxn 1 below, you find that the first
sample is okay, but in the second the disks are at 100 %w... and they stay
at 100 %w for a few seconds.
     cpu
 us sy wt id
  0 34  0 66
                    extended device statistics
    r/s    w/s   kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c3t0d0
    0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c4t0d0
    0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c0t0d0
    0.0  400.9    0.0 49560.1 14.1  0.5   35.2    1.2  47  48 c5t0d0
    0.0  156.0    0.0 18327.1  4.6  0.2   29.5    1.1  17  18 c5t1d0
    0.0    7.0    0.0   132.0  2.7  0.0  386.0    4.9  56   2 c5t2d0
    0.0  293.0    0.0 36735.2 13.4  0.3   45.6    1.1  89  34 c5t3d0
    0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c8t0d0
    0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c8t1d0
    0.0  142.0    0.0 17409.8  4.9  0.2   34.9    1.4  20  20 c6t0d0
    0.0  350.0    0.0 44030.5 12.6  0.4   36.0    1.3  44  44 c6t1d0
    0.0  291.0    0.0 34599.7  9.6  0.3   33.1    1.2  34  35 c6t2d0
    0.0  334.0    0.0 40231.0 11.3  0.4   34.0    1.2  39  40 c6t3d0
    0.0  241.0    0.0 28210.0 18.1  0.3   75.0    1.1  77  27 c7t0d0
    0.0  317.0    0.0 38064.8 10.6  0.4   33.4    1.2  38  38 c7t1d0
    0.0  162.0    0.0 18455.7  4.5  0.2   27.6    1.1  18  18 c7t2d0
    0.0  162.0    0.0 18455.7  4.5  0.2   27.7    1.1  18  18 c7t3d0
     cpu
 us sy wt id
  0 22  0 78
                    extended device statistics
    r/s    w/s   kr/s    kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c3t0d0
    0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c4t0d0
    0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c0t0d0
    0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c5t0d0
    0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c5t1d0
    0.0    0.0    0.0     0.0  5.0  0.0    0.0    0.0 100   0 c5t2d0
    0.0    0.0    0.0     0.0  5.0  0.0    0.0    0.0 100   0 c5t3d0
    0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c8t0d0
    0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c8t1d0
    0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c6t0d0
    0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c6t1d0
    0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c6t2d0
    0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c6t3d0
    0.0    0.0    0.0     0.0 21.0  0.0    0.0    0.0 100   0 c7t0d0
    0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c7t1d0
    0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c7t2d0
    0.0    0.0    0.0     0.0  0.0  0.0    0.0    0.0   0   0 c7t3d0

Perhaps related bugs:

Disk access stops for minutes with 100% blocking
http://bugs.opensolaris.org/view_bug.do?bug_id=6544624

si3124 driver loses interrupts.
http://bugs.opensolaris.org/view_bug.do?bug_id=6566207

Any ideas? Should I ditch the SIL3124 cards, as they seem to have a bad rep
on this mailing list? I have ordered a Sun SG-XPCI8SAS-E-Z, which is a SAS
PCI-X card, but it will cost me a lot more money without adding any extra
benefit... Except that it might actually work ;)

-J

-----------------------------------------------------
Janåke Rönnblom
Phone  : +46-910-699 180
Mobile : 070-397 07 43
URL    : http://www.ronnblom.se
-----------------------------------------------------
"Those who do not understand Unix are condemned to reinvent it, poorly."
 -- Henry Spencer
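One rough way to quantify how long and how often the disks sit at 100 %w is
a small watcher like the sketch below. This is only an illustration, not
part of the original report: it assumes the iostat -xn column order shown
above (%w in field 9, %b in field 10, device name in field 11) and that
nawk is available.

  # Print a timestamp whenever a pool disk reports 100 %w but 0 %b,
  # i.e. I/Os queued on the host side with nothing active on the disk.
  iostat -xn 1 | nawk '
      $11 ~ /^c[0-9]/ && $9 == 100 && $10 == 0 {
          cmd = "date +%H:%M:%S"; cmd | getline t; close(cmd)
          printf "%s  %s  wait=%s  %%w=%s  %%b=%s\n", t, $11, $5, $9, $10
      }'

Counting how many consecutive seconds each disk stays in that state gives a
feel for whether the stalls line up with the few-second hangs seen from the
Windows side.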
Richard Elling
2008-Oct-08 22:28 UTC
[zfs-discuss] Troubleshooting ZFS performance with SIL3124 cards
comment below...

Janåke Rönnblom wrote:

> I have a problem with ZFS and most likely the SATA PCI-X controllers.
> [...]
> If you look at the output from iostat -cxn 1 below, you find that the
> first sample is okay, but in the second the disks are at 100 %w... and
> they stay at 100 %w for a few seconds.
>
> [...]

In iostat, the wait, wsvc_t, and %w are for I/Os that are queuing to the
HBA. Similarly, the actv, asvc_t, and %b are for I/Os that are queuing to
a device.

> Perhaps related bugs:
>
> Disk access stops for minutes with 100% blocking
> http://bugs.opensolaris.org/view_bug.do?bug_id=6544624
>
> si3124 driver loses interrupts.
> http://bugs.opensolaris.org/view_bug.do?bug_id=6566207

Based on the above iostat data, I would suspect the HBA. CR 6544624 was
marked as a duplicate of CR 6429205, which was fixed in snv_87. CR 6566207
was fixed in snv_71. There may be a new bug lurking here.
 -- richard

> Any ideas? Should I ditch the SIL3124 cards, as they seem to have a bad
> rep on this mailing list? I have ordered a Sun SG-XPCI8SAS-E-Z, which is
> a SAS PCI-X card, but it will cost me a lot more money without adding
> any extra benefit... Except that it might actually work ;)
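To make the distinction Richard describes visible in one table, a quick
aggregation over a run of samples can help. Again, this is only a sketch
under the same assumptions as above (iostat -xn column order, nawk
available); it is not from the thread itself.

  # Average host-side queue length (wait) vs. device-active count (actv)
  # per disk over 60 one-second samples.
  iostat -xn 1 60 | nawk '
      $11 ~ /^c[0-9]/ { w[$11] += $5; a[$11] += $6; n[$11]++ }
      END {
          printf "%-10s %10s %10s\n", "device", "avg wait", "avg actv"
          for (d in n) printf "%-10s %10.2f %10.2f\n", d, w[d]/n[d], a[d]/n[d]
      }'

Disks that show a large average wait with actv near zero are spending their
time queued at the HBA/driver rather than busy on the media, which is
consistent with a controller or si3124 driver problem rather than slow
disks.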