Hi,

I've had my RAIDz volume working well on SNV_131, but it has come to my attention that there have been some read issues with the drives. Previously I thought this was a CIFS problem, but I'm noticing that when transferring files or uncompressing some fairly large 7z files (1-2 GB), or even smaller RARs (200-300 MB), running iostat will occasionally show %b at 100 for a drive or two.

I have the Western Digital EADS 1 TB drives (the Green ones) and not the more expensive Black or enterprise drives (our sysadmin's fault).

The pool in question spans 4x 1 TB drives.

What exactly does this mean? Is it a controller problem, a disk problem, or a cable problem? I've got this on commodity hardware, as it's only used for a small business with 4-5 staff accessing our media server. It's using the Intel ICHR SATA controller. I've already changed the cables and swapped out the odd drive that exhibited this issue, and the only thing I can think of is to buy an Intel or LSI SATA card.

The scrub sessions take almost a day and a half now (previously 12 hours at most!), but 70% of the space is also in use (file-wise they're mostly chunky MPG files and compressed artwork), and no errors are reported.

Does anyone have any ideas?

Thanks,
Em
On Fri, May 7, 2010 at 8:07 AM, Emily Grettel <emilygrettelisnow at hotmail.com> wrote:
> I'm noticing that when transferring files or uncompressing some fairly
> large 7z files (1-2 GB), or even smaller RARs (200-300 MB), running
> iostat will occasionally show %b at 100 for a drive or two.

That's the percentage of time the disk is busy (has transactions in progress) - see iostat(1M).

> What exactly does this mean? Is it a controller problem, a disk problem,
> or a cable problem?
> [...]
> Does anyone have any ideas?

You might be maxing out your drives' I/O capacity. That can happen when ZFS is committing transactions to disk every 30 seconds, but if %b is constantly high your disks might not be keeping up with the performance requirements.

We've had some servers showing high asvc_t times, but it turned out to be a firmware issue in the disk controller. It was very erratic (1-2 drives out of 24 would show it).

If you look in the archives, people have posted a few averaged I/O performance numbers that you could compare against your workload.

--
Giovanni
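P.S. To tell whether %b is just spiking when ZFS flushes a transaction group or is pinned the whole time, it helps to watch it over a few minutes. A minimal example using standard iostat(1M) flags, nothing specific to your setup:

    # extended per-device statistics, logical device names, 5-second samples
    iostat -xn 5

If %b and actv stay at 100 across whole intervals, rather than briefly every 30 seconds or so, the disks aren't keeping up with the workload.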
Hi Giovanni,

Thanks for the reply.

Here's a bit of iostat after uncompressing a 2.4 GB RAR file containing one DWF file that we use.

                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    1.0   13.0   26.0   18.0  0.0  0.0    0.0    0.8   0   1 c7t1d0
    2.0    5.0   77.0   12.0  2.4  1.0  343.8  142.8 100 100 c7t2d0
    1.0   16.0   25.5   15.5  0.0  0.0    0.0    0.3   0   0 c7t3d0
    0.0   10.0    0.0   17.0  0.0  0.0    3.2    1.2   1   1 c7t4d0
    1.0   12.0   25.5   15.5  0.4  0.1   32.4   10.9  14  14 c7t5d0
    1.0   15.0   25.5   18.0  0.0  0.0    0.1    0.1   0   0 c0t1d0
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0  2.0  1.0    0.0    0.0 100 100 c7t2d0
    1.0    0.0    0.5    0.0  0.0  0.0    0.0    0.1   0   0 c7t0d0
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    5.0   15.0  128.0   18.0  0.0  0.0    0.0    1.8   0   3 c7t1d0
    1.0    9.0   25.5   18.0  2.0  1.8  199.7  179.4 100 100 c7t2d0
    3.0   13.0  102.5   14.5  0.0  0.1    0.0    5.2   0   5 c7t3d0
    3.0   11.0  102.0   16.5  0.0  0.1    2.3    4.2   1   6 c7t4d0
    1.0    4.0   25.5    2.0  0.4  0.8   71.3  158.9  12  79 c7t5d0
    5.0   16.0  128.5   19.0  0.0  0.1    0.1    2.6   0   5 c0t1d0
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    4.0    0.0    2.0  2.0  2.0  496.1  498.0  99 100 c7t2d0
    0.0    0.0    0.0    0.0  0.0  1.0    0.0    0.0   0 100 c7t5d0
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    7.0    0.0  204.5    0.0  0.0  0.0    0.0    0.2   0   0 c7t1d0
    1.0    0.0   25.5    0.0  3.0  1.0 2961.6 1000.0  99 100 c7t2d0
    8.0    0.0  282.0    0.0  0.0  0.0    0.0    0.3   0   0 c7t3d0
    6.0    0.0  282.5    0.0  0.0  0.0    6.1    2.3   1   1 c7t4d0
    0.0    3.0    0.0    5.0  0.5  1.0  165.4  333.3  18 100 c7t5d0
    7.0    0.0  204.5    0.0  0.0  0.0    0.0    1.6   0   1 c0t1d0
    2.0    2.0   89.0   12.0  0.0  0.0    3.1    6.1   1   2 c3t0d0
    0.0    2.0    0.0   12.0  0.0  0.0    0.0    0.2   0   0 c3t1d0

Sometimes two or more disks are going at 100. How does one solve this issue if it's a firmware bug? I tried looking around for Western Digital firmware for the WD10EADS but couldn't find any available.

Would adding an SSD or two help here?

Thanks,
Em
The drive (c7t2d0) is bad and should be replaced. The second drive (c7t5d0) is either bad or going bad. This is exactly the kind of problem that can force a Thumper to its knees: ZFS performance is horrific, and as soon as you drop the bad disks things magically return to normal.

My first recommendation is to pull the SMART data from the disks if you can. I wrote a blog entry about SMART back in 2008 that addresses exactly the behavior you're seeing:

http://www.cuddletech.com/blog/pivot/entry.php?id=993

Yes, people will claim that SMART data is useless for predicting failures, but in a case like yours you are just looking for data to corroborate a hypothesis.

To test this condition, "zpool offline ..." c7t2d0, which emulates removal, and see if performance improves. On Thumpers I'd build a list of "suspect disks" based on iostat output like you show, correlate it with the SMART data, and then systematically offline disks to see if they really were the problem.

In my experience, the only other reason you'll legitimately see really weird "bottoming out" of I/O like this is if you hit the max concurrent I/O limit in ZFS (until recently that limit was 35), so you'd see actv=35, and then when the device finally processed the I/Os the thing would snap back to life. But even in those cases you shouldn't see request times (asvc_t) rise above 200ms.

All that to say: replace those disks, or at least test them. SSDs won't help; one or more drives are toast.

benr.
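P.S. A minimal sketch of that offline test, assuming your pool is named "tank" (substitute your real pool name); these are plain zpool(1M) commands:

    # take the suspect disk out of service (a raidz1 stays up, but with no redundancy left)
    zpool offline tank c7t2d0

    # re-run the uncompress/copy workload and watch iostat -xn while it runs

    # bring the disk back and make sure it resilvers cleanly
    zpool online tank c7t2d0
    zpool status tank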
Hi Ben,

> The drive (c7t2d0) is bad and should be replaced.
> The second drive (c7t5d0) is either bad or going bad.

Dagnabbit. I'm glad you told me this, but I would have thought that running a scrub would have alerted me to some fault?

> and as soon as you drop the bad disks things magically return to
> normal.

Being a raidz, is it OK for me to actually do zpool offline for one drive without degrading the entire pool?

I'm wondering whether I should keep using the WD10EADS or ask the business to invest in the Black versions. I was thinking of the WD1002FAEX (which is SATA-III, but my cards only do SATA-II), which seems to be better suited to NAS use. What are other people's thoughts on this?

Here's my current layout - 1, 2 & 3 are 320 GB drives.

       0. c0t1d0 <ATA-WDC WD10EADS-00P-0A01-931.51GB>
          /pci@0,0/pci1002,597a@4/pci1458,b000@0/disk@1,0
       4. c7t1d0 <ATA-WDC WD10EADS-00L-1A01-931.51GB>
          /pci@0,0/pci1458,b002@11/disk@1,0
       5. c7t2d0 <ATA-WDC WD10EADS-00P-0A01-931.51GB>
          /pci@0,0/pci1458,b002@11/disk@2,0
       6. c7t3d0 <ATA-WDC WD10EADS-00P-0A01-931.51GB>
          /pci@0,0/pci1458,b002@11/disk@3,0
       7. c7t4d0 <ATA-WDC WD10EADS-00P-0A01-931.51GB>
          /pci@0,0/pci1458,b002@11/disk@4,0
       8. c7t5d0 <ATA-WDC WD10EADS-00P-0A01-931.51GB>
          /pci@0,0/pci1458,b002@11/disk@5,0

The other thing I was thinking of was redoing the way the pool is set up: instead of a straight raidz layout, adopting a stripe and mirror - so 3 disks in RAID-0, then mirror them to the other three?

> http://www.cuddletech.com/blog/pivot/entry.php?id=993

Great blog entry! Unfortunately the SUNWhd package isn't available in the repo and I haven't been able to locate a similar SMART reader :( But your explanations are very valuable.

> In my experience, the only other reason you'll legitimately see really
> weird "bottoming out" of I/O like this is if you hit the max concurrent
> I/O limit in ZFS (until recently that limit was 35), so you'd see
> actv=35, and then when the device finally processed the I/Os the thing
> would snap back to life. But even in those cases you shouldn't see
> request times (asvc_t) rise above 200ms.

Hmm, I did remember another admin tweaking the ZFS configuration. Are these to blame by chance?

/etc/system:

    set pcplusmp:apic_intr_policy=1
    set zfs:zfs_txg_synctime=1
    set zfs:zfs_vdev_max_pending=10

I've tried to avoid tweaking anything in the ZFS configuration for fear it may give worse performance.

> All that to say: replace those disks, or at least test them. SSDs won't
> help; one or more drives are toast.

Thanks mate, I really appreciate some backing on this :-)

Cheers,
Em
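P.S. I believe I can check which of those values are actually in effect with mdb - just a sketch, using the same variable names as the /etc/system lines above:

    echo zfs_vdev_max_pending/D | mdb -k
    echo zfs_txg_synctime/D | mdb -k

Happy to post what those report if it's useful.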
On Sat, May 8 at 23:39, Ben Rockwood wrote:
> The drive (c7t2d0) is bad and should be replaced. The second drive
> (c7t5d0) is either bad or going bad. This is exactly the kind of
> problem that can force a Thumper to its knees: ZFS performance is
> horrific, and as soon as you drop the bad disks things magically
> return to normal.

Problem is the OP is mixing client 4k drives with 512b drives. They may not actually be bad, but they appear to be getting "misused" in this application.

I doubt they're "broken" per se; they're just dramatically slower than their peers in this workload.

As a replacement recommendation, we've been beating on the WD 1TB RE3 drives for 18 months or so, and we're happy with both the performance and the price for what we get: $160/ea with a 5-year warranty.

--eric

--
Eric D. Mudama
edmudama at mail.bounceswoosh.org
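P.S. If you want to sanity-check what the OS sees, prtvtoc reports the sector size - a sketch only, and the device name here is just an example:

    prtvtoc /dev/rdsk/c7t2d0s0 | grep "bytes/sector"

Keep in mind that 4k "Advanced Format" drives generally emulate 512-byte sectors, so they will still report 512 there; the model number and spec sheet are the more reliable check.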
Hi Eric,

> Problem is the OP is mixing client 4k drives with 512b drives.

How do you come to that assessment? Here's what I have:

    Ap_Id                  Information
    sata1/1::dsk/c7t1d0    Mod: WDC WD10EADS-00L5B1 FRev: 01.01A01
    sata1/2::dsk/c7t2d0    Mod: WDC WD10EADS-00P8B0 FRev: 01.00A01
    sata1/3::dsk/c7t3d0    Mod: WDC WD10EADS-00P8B0 FRev: 01.00A01
    sata1/4::dsk/c7t4d0    Mod: WDC WD10EADS-00P8B0 FRev: 01.00A01
    sata1/5::dsk/c7t5d0    Mod: WDC WD10EADS-00P8B0 FRev: 01.00A01
    sata2/1::dsk/c0t1d0    Mod: WDC WD10EADS-00P8B0 FRev: 01.00A01

They all seem to indicate the older 512b type according to the WDC site, unless I'm not understanding their spec sheets.

> I doubt they're "broken" per se; they're just dramatically slower
> than their peers in this workload.

It does make sense though! My read speed (trying to copy 683 GB across to another machine) is roughly 7-8 Mbps, where I used to get 30-40 Mbps on average.

> As a replacement recommendation, we've been beating on the WD 1TB RE3

Cool, either the RE3 or the Black drives it is :-)

Thanks,
Em