A little back story: I have a Norco DS-1220, a 12-bay SATA box. It is connected via eSATA (SiI3124) on PCI-X; two drives are straight connections, and the other two ports go to 5x port multipliers within the box. My hope for this was to use 12 500GB drives and ZFS to make a very large & simple data dump spot on my network for other servers to rsync to daily, use ZFS snapshots for some quick backup, and, if things worked out, start saving up towards getting a Thumper someday.

The trouble is it is too slow to really be usable. At times it is fast enough, ~13MB/s write, but this lasts for only a few minutes. It then just stalls, doing nothing. iostat shows 100% busy for one of the drives in the pool.

I can, however, use dd to read or write directly to/from all the disks at the same time with good speed (~30MB/s according to dd).

The test pools I have tried are either 2 raidz vdevs of 6 drives or 3 raidz vdevs of 4 drives. The system is an Athlon 64 3500+ with 1GB of RAM.

Any suggestions on what I could do to make this usable? More RAM? Too many drives for ZFS? Any tests to find the real slowdown?

I would really like to use ZFS & Solaris for this. Linux was able to use the same hardware, via some beta kernel modules for the SATA multipliers and its software RAID, at an acceptable speed, but I would like to finally rid my network of Linux boxen.

Thanks,
Stuart
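For anyone wanting to repeat the raw-disk versus ZFS comparison described above, a minimal sketch of the test and monitoring commands follows. The device names (c2t0d0 etc.) and the pool name (tank) are placeholders, not the actual configuration from this thread, and the dd writes destroy whatever is on the target slices, so use scratch disks only:

    # Raw sequential throughput per disk, in parallel, outside ZFS
    # (WARNING: destructive to the named slices; placeholders below).
    for d in c2t0d0 c2t1d0 c3t0d0; do
        dd if=/dev/zero of=/dev/rdsk/${d}s0 bs=1024k count=1024 &
    done
    wait

    # While a ZFS write load runs, watch per-device service times,
    # queue depth (actv) and percent busy (%b):
    iostat -xn 5

    # Per-vdev view of the same load (pool name is a placeholder):
    zpool iostat -v tank 5

If the raw dd numbers hold steady while the ZFS load shows one device pinned at 100% busy, that points at the path to that one device rather than at the filesystem.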
Stuart Glenn wrote:
> The trouble is it is too slow to really be usable. At times it is fast
> enough, ~13MB/s write, but this lasts for only a few minutes. It then
> just stalls, doing nothing. iostat shows 100% busy for one of the
> drives in the pool.
> [...]
> Any suggestions on what I could do to make this usable? More RAM? Too
> many drives for ZFS? Any tests to find the real slowdown?

I have similar issues on my home workstation. They started happening when I put Seagate SATA-II drives with NCQ on a SiI3124. I do not believe this to be an issue with ZFS. I've largely dismissed the issue as hardware caused, although I may be wrong. This system has had several problems with SATA-II drives which hardware forums suggest are issues with the nForce4 chipset and SATA-II.

Anyway, you're not alone, but it's not a ZFS issue. It's possible a tunable parameter in the SATA drivers would help. If I find an answer I'll let you know.

benr.
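For anyone who wants to chase the "tunable parameter in the SATA drivers" idea, the sketch below only shows the general Solaris mechanism: a kernel variable set in /etc/system and read back with mdb -k. The variable name sata_max_queue_depth is an assumption used for illustration, not a verified tunable; check the sata/si3124 driver documentation or source for what your build actually exposes.

    # A busy "actv" column in iostat output suggests commands are being
    # queued (NCQ) to the drives:
    iostat -xn 5

    # Hypothetical /etc/system entry to limit queueing (the variable
    # name below is assumed, not verified; a reboot is needed):
    #   set sata:sata_max_queue_depth = 1

    # Read a kernel variable back after boot to confirm the setting
    # took (again, substitute the real symbol name):
    echo "sata_max_queue_depth/D" | mdb -k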
On Dec 15, 2006, at 13:49, Ben Rockwood wrote:
> I have similar issues on my home workstation. They started happening
> when I put Seagate SATA-II drives with NCQ on a SiI3124. I do not
> believe this to be an issue with ZFS. I've largely dismissed the
> issue as hardware caused, although I may be wrong. This system has
> had several problems with SATA-II drives which hardware forums
> suggest are issues with the nForce4 chipset and SATA-II.
>
> Anyway, you're not alone, but it's not a ZFS issue. It's possible a
> tunable parameter in the SATA drivers would help. If I find an answer
> I'll let you know.

The drives are WD5000YS, which are SATA-II, so it could very well be whatever NCQ & SiI3124 issues there are. Sadly, working in grant-funded, not-for-profit medical research, we don't have the money for any more hardware.

If there is anything to tune anywhere to give me steady/sustainable performance, I would greatly appreciate it. The built-in compression & snapshots make ZFS worth it, but not at such a slow speed.

Thanks,
Stuart
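As a side note on the compression and snapshot workflow described above, the ZFS commands involved are short; the dataset name tank/backups below is just a placeholder:

    # Enable compression on the dataset that receives the rsync runs
    # (affects newly written blocks only):
    zfs set compression=on tank/backups

    # After each nightly rsync, take a dated snapshot and review them:
    zfs snapshot tank/backups@`date +%Y%m%d`
    zfs list -t snapshot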
On Fri, 15 Dec 2006, Ben Rockwood wrote:
> [...]
> Anyway, you're not alone, but it's not a ZFS issue. It's possible a
> tunable parameter in the SATA drivers would help. If I find an answer
> I'll let you know.

I've seen this issue also, after "hot wiring" up a drive to a 3124-based controller on a system with Build 53 installed, while attempting to tar up a copy of the drive to a ZFS filesystem via an NFS mount.

Looking at the output of "iostat -xcn 5" (for the source drive only), the %b number starts to climb, depending on the average file size of the data currently being archived. When a directory is being archived that contains many, many small files, the read I/O ops/sec figure increases[1] and the %b number starts to climb. It will do so for several iterations of the iostat command (every 5 seconds). Then it reaches (roughly) 96% and the tar process stops completely. Subsequent iostat output indicates zero data transferred and zero read (or write) I/O operations for about a minute (IIRC) or longer. Usually longer. IOW, it looks like the drive has died, or is about to die... and I fully expect to see a timeout reported. But after 60 or 90 seconds (I didn't time it), the tar process continues as if nothing had happened... and the cycle repeats when a directory structure is encountered with many small files.

I don't think it's a ZFS issue. My gut feeling is that it's a 3124 (driver) or sd driver issue. Something to do with how the process that sources the data stream is throttled. I have no idea how Solaris throttles back individual I/O-bound processes when the devices involved start to get too busy (and slow)[2].
But this is where I think this bug might be lurking. Again, this is just an observation of an issue that seems to be similar to your (and Ben's) experiences. A lot more work needs to be done before this simplistic observation can be described accurately enough to be submitted as a bug report.

BTW: the disk drive (from an Ultra 20 that died) backup did (eventually!) complete successfully after about 6 periods of being completely comatose as described above.

[1] it seems to max out at about 1,500 ops/sec.
[2] OpenSolaris gurus please enlighten me!

Al Hopper  Logical Approach Inc, Plano, TX.  al at logical-approach.com
           Voice: 972.379.2133  Fax: 972.379.2134  Timezone: US CDT
OpenSolaris.Org Community Advisory Board (CAB) Member - Apr 2005
OpenSolaris Governing Board (OGB) Member - Feb 2006
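A follow-up note for anyone trying to pin down where the stall described above sits: a small DTrace sketch using the io provider can show, per device, how long each physical I/O takes, which helps distinguish "the drive stopped answering" from "nothing was issued to it". This is a generic sketch, not something taken from the thread; run it while the tar workload is stalled and the histograms print on Ctrl-C.

    # Per-device I/O completion latency (in ms), aggregated until Ctrl-C.
    cat > iolat.d <<'EOF'
    io:::start
    {
            start[arg0] = timestamp;
    }

    io:::done
    /start[arg0]/
    {
            /* latency in milliseconds, keyed by device name */
            @lat[args[1]->dev_statname] =
                quantize((timestamp - start[arg0]) / 1000000);
            start[arg0] = 0;
    }
    EOF
    dtrace -s iolat.d

A device whose latency distribution jumps into the tens of seconds during the stall, rather than simply going idle, is the one to investigate first.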