Hi all,

I've been reading a little, and it seems using WD Green drives isn't very popular in here. Can someone explain why these are so much worse than others? Usually I see drives go bad with more or less the same frequency...

Best regards,

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy at karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented intelligibly. It is an elementary imperative for all pedagogues to avoid excessive use of idioms of foreign origin. In most cases, adequate and relevant synonyms exist in Norwegian.
I am new to OSOL/ZFS myself -- just placed an order for my first system last week. However, I have been reading these forums for a while. A lot of the data seems to be anecdotal, but here is what I have gathered as to why the WD Green drives are not a good fit for a RAIDZ(n) system.

(1) They seem to have a firmware setting (that may not be modifiable, depending on revision) that "parks" the drive heads after 8 seconds of inactivity to save power. These drives are rated for a certain number of park/unpark operations -- I think 300,000. Using these drives in a NAS results in a lot of parking and unparking.

(2) They are big and slow. This seems to be a very bad combination for RAIDZ(n). I have seen people report resilvering times of 2 to 4 days. I am not sure how much that impacts performance, but that can be a long time to run in a degraded state for some people. I don't know how the resilvering process works, so I don't know if it is a ZFS issue or not. Regardless of the drive speed, it seems like more than 2 days to write 1 to 2 TB worth of data is ridiculous -- but no one seems to complain that it is ZFS's fault, so there must be a lot involved in resilvering that I don't understand. I think that small and slow would be OK -- if you had 500GB green drives you might be fine. But people tend to look at the green drives because they have so much capacity for the money, so I haven't seen anyone say that a Green 500GB drive works well.

(3) They seem to have a lot of platters -- 3 or 4. More platters == more heat == more failure... apparently.

(4) The larger WD drives use 4k sectors, which is fine by itself, but because Windows XP doesn't like that, they have some odd firmware in there to emulate a different sector size. I am not sure whether it can be disabled or whether it is in fact a problem, but I have seen it mentioned in various gripes.

Like I said, I am no expert.
But these factors made me choose 1TB Samsung Spinpoints for the RAIDZ2 config that I just ordered. I may look into some of these green drives for a secondary mirror zpool at some point -- I have some data where I don't need great redundancy and can trade off speed for cost.
--
This message posted from opensolaris.org
----- "Brian" <broconne at vt.edu> wrote:

> (1) They seem to have a firmware setting (that may not be modified
> depending on revision) that has to do with the drive "parking" the
> drive after 8 seconds of inactivity to save power. These drives are
> rated for a certain number of park/unpark operations -- I think
> 300,000. Using these drives in a NAS results in a lot of park/unpark.

8 seconds? Is it really that low?

> (2) They are big and slow. This seems to be a very bad combination
> for RAIDZ(n). I have seen people report resilvering times of 2 to 4
> days. [...]

We have a green array of 3x7 2TB drives in raidz2 (27TiB), almost full, and it takes some two and a half days to scrub it. Does anyone have scrub times for similar setups with, say, Black drives?

> (3) They seem to have a lot of platters.. 3 or 4. More platters ==
> more heat == more failure... apparently.

I somehow doubt Black drives have fewer platters.

Best regards,

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy at karlsbakk.net
http://blogg.karlsbakk.net/
On 13 May, 2010 - Roy Sigurd Karlsbakk sent me these 2,9K bytes:

> ----- "Brian" <broconne at vt.edu> wrote:
> > (1) They seem to have a firmware setting (that may not be modified
> > depending on revision) that has to do with the drive "parking" the
> > drive after 8 seconds of inactivity to save power. [...]
>
> 8 seconds? Is it really that low?

Yes. My disk went through 180k in like 2-3 months.. Then I told smartd to poll the disk every 5 seconds to prevent it from falling asleep.

/Tomas
--
Tomas Ögren, stric at acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
(3) was more about the size than the Green vs. Black issue. This is all assuming most people are looking at green drives for the cost benefits associated with their large sizes. You are correct that Green and Black would most likely have the same number of platters per size.
On Thu, May 13, 2010 at 3:19 AM, Roy Sigurd Karlsbakk <roy at karlsbakk.net> wrote:

> I've been reading a little, and it seems using WD Green drives isn't very
> popular in here. Can someone explain why these are so much worse than
> others? Usually I see drives go bad with more or less the same frequency...

1. They're 5900 RPM drives, not 7200, making them even slower than normal SATA drives.

2. They come with an idle timeout of 8 seconds, after which the drive heads are parked. This shows up as the Load Cycle counter in SMART output. They're rated for around 300,000 or 500,000 load cycles, which can happen in only a few short months on a server (we had over 40,000 in a week on one drive). On some firmware versions, this can be disabled completely using the wdidle3 DOS app. On other firmware versions, it can't be disabled, but can be set to 362 seconds (or something like that). Each time the heads are parked, it takes a couple of seconds to bring the drive back up to speed. This can drop your pool's disk I/O through the floor.

3. The firmware on the drives disables the time-limited error recovery (TLER) feature, and it cannot be enabled like on other WD drives.

4. Some of the Green drives are "Advanced Format" with 4 KB sectors, except that the drives all lie and say they use 512 B sectors, leading to all kinds of alignment issues and even more slow-downs. No matter which firmware you run, you cannot get the drives to report the actual physical size of a disk sector; it always reports 512 B. This makes it very hard to align partitions and filesystems, further degrading performance.

If you are building a system that needs to be quiet and power-efficient with 2 TB of storage, then maybe using a single WD Green drive would be okay -- a home media server, say. However, going with 2.5" drives may be better.
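As a sanity check on the load-cycle numbers in the message above (the 300,000-cycle rating and the 40,000 cycles/week observation are the poster's figures; the arithmetic is only illustration):

```python
# Back-of-the-envelope: how fast a WD Green can burn through its rated
# load/unload cycles. Figures taken from the message above.

rated_cycles = 300_000        # conservative end of the 300k-500k rating
observed_per_week = 40_000    # one drive in the poster's server

weeks_to_rating = rated_cycles / observed_per_week
print(f"{weeks_to_rating:.1f} weeks to reach the rated cycle count")
# -> 7.5 weeks to reach the rated cycle count

# Worst case with the 8-second idle timer: one park per 8 s of idle time.
max_cycles_per_day = 24 * 3600 // 8
print(max_cycles_per_day, "cycles/day if the drive keeps idling in 8 s bursts")
```

So at the observed rate, one such drive reaches its rating in under two months, which matches the "few short months" claim.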
But for any kind of bulk storage or non-home-desktop setup, you'll want to avoid all of the WD Green drives (including the RE Green-power), and also avoid any 5900 RPM drives from other manufacturers (some Seagate 2 TB, for example).

We made the mistake of putting 8 WD Green 1.5 TB drives into one of our storage servers, as they were on sale for $100 CDN. Throughput on that server has dropped quite a bit (~200 MB/s instead of the 300+ MB/s we had with all WD RE Black drives). It takes over 65 hours to resilver a single 1.5 TB drive, and a scrub on the entire pool takes over 3 days.

When upgrading our secondary storage server, we went with Seagate 7200.11 1.5 TB drives. Resilver of a drive takes around 35 hours (the first drive was over 35 hours, the 6th just under). Haven't scrubbed the pool yet (still replacing drives in the raidz2 vdev). Performance has improved slightly, though.

--
Freddie Cash
fjwcash at gmail.com
1. Even though they're 5900, not 7200, benchmarks I've seen show they are quite good.

3. What is TLER?

4. I thought most partitions were aligned at 4k these days?

We don't need too much speed on this system; we're still limited to 1Gbps ethernet, and it's mostly archive data, with no reads exceeding the ethernet bandwidth.

Kind regards,

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy at karlsbakk.net
http://blogg.karlsbakk.net/
On Thu, May 13, 2010 at 9:09 AM, Roy Sigurd Karlsbakk <roy at karlsbakk.net> wrote:

> 1. Even though they're 5900, not 7200, benchmarks I've seen show they are
> quite good

They have a 64 MB cache onboard, which hides some of their slowness. But they are slow.

> 3. What is TLER?

Time-Limited Error Recovery. It's a RAID feature where the drive will only try for X seconds to read a sector and then fail, so that the RAID controller can take action. Non-RAID drives will tend to keep trying to read a sector for aeons before giving up, leading to timeouts and whatnot further up the storage stack. For ZFS systems, this may not be too big of a deal.

> 4. I thought most partitions were aligned at 4k these days?

Nope. Most disk partitioning tools still use the old "start after 63 sectors" cylinder alignment, which creates a non-4 KB-aligned first partition, and thus non-aligned filesystems. This is why the WD Advanced Format drives come with a hardware jumper to shift numbering by 1 (the partition is created at the 63rd logical sector, which is actually the 64th physical sector).

However, Unix disk partitioning tools are getting better at this, and it seems that the new "standard" will be to create the first partition at the 1 MB mark. This is then aligned for everything up to 256 KB sectors or something like that. :) There's a list somewhere that shows the status of all the Linux partitioning tools. Not sure about Solaris. FreeBSD partitioning tools don't "do the right thing" yet, but allow you to manually specify the starting offset so you can align things yourself.

> We don't need too much speed on this system, we're still limited to 1Gbps
> ethernet, and it's mostly archive data, no reads exceeding the ethernet
> bandwidth

If you absolutely must use these drives, then download the wdidle3 utility, stick it on a DOS boot disk, attach the disk to a SATA port on the motherboard, boot to DOS, and disable the idle timeout.
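The alignment point above can be checked with a few lines of arithmetic. This sketch uses the standard numbers from the message (512 B logical sectors, 4 KB physical sectors, the legacy 63-sector start, and the newer 2048-sector / 1 MiB start):

```python
# A partition start is safe on an Advanced Format drive when its byte
# offset is a whole multiple of the 4096-byte physical sector size.

LOGICAL_SECTOR = 512
PHYSICAL_SECTOR = 4096

def aligned(start_sector):
    """True if a partition starting at this logical sector is 4 KB aligned."""
    return (start_sector * LOGICAL_SECTOR) % PHYSICAL_SECTOR == 0

print(aligned(63))    # legacy DOS/cylinder start -> False
print(aligned(64))    # what WD's jumper effectively shifts you to -> True
print(aligned(2048))  # the newer 1 MiB convention -> True
```

A misaligned start means every 4 KB filesystem block straddles two physical sectors, turning each write into a read-modify-write inside the drive.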
Do this for every disk, *before* you put it into the system where they'll be used. You'll save yourself a lot of headaches. :) And the drives will last longer than 3 months or so.

(If they've removed the download for wdidle3, I have a copy here.)

--
Freddie Cash
fjwcash at gmail.com
On Thu, May 13, 2010 at 3:19 AM, Roy Sigurd Karlsbakk <roy at karlsbakk.net> wrote:

> I've been reading a little, and it seems using WD Green drives isn't very popular in here. Can someone explain why these are so much worse than others? Usually I see drives go bad with more or less the same frequency...

I've been using 8 in a raidz2 for about a year and haven't had any serious problems with them, but I changed the idle timer and TLER settings before doing anything else.

They are slow, especially on random read or write access. Adding a slog / l2arc on SSD has helped a lot with this. Sequential access is fast. On the plus side, the Greens are 5400rpm and generate less heat and noise.

If you're aware of their potential shortcomings and what to expect performance-wise, there's no real problem with them. When it comes time to replace them (in 1.5 years, while they are still under warranty) I will probably go with a faster drive. I'd also consider using 2.5" drives at this point.

-B
--
Brandon High : bhigh at freaks.com
On Thu, May 13, 2010 at 06:09:55PM +0200, Roy Sigurd Karlsbakk wrote:

> 1. Even though they're 5900, not 7200, benchmarks I've seen show they are quite good

Minor correction: they are 5400rpm. Seagate makes some 5900rpm drives.

The "green" drives have a reasonable raw throughput rate, due to the extremely high platter density nowadays. However, due to their low spin speed, their average access time is significantly slower than 7200rpm drives. For bulk archive data containing large files, this is less of a concern.

Regarding slow resilvering times: in the absence of other disk activity, I think that should really be limited by the throughput rate, not the relatively slow random I/O performance -- again assuming large files (and low fragmentation, which is what I'd expect if the archive is write-and-never-delete).

One test I saw suggests 60MB/sec avg throughput on the 2TB drives. That works out to 9.25 hours to read the entire 2TB. At a conservative 50MB/sec it's 11 hours. This assumes that you have enough I/O bandwidth and CPU on the system to saturate all your disks.

If there's other disk activity during a resilver, though, it turns into random I/O. Which is slow on these drives.

danno
--
Dan Pritts, Sr. Systems Engineer
Internet2
office: +1-734-352-4953 | mobile: +1-734-834-7224
Visit our website: www.internet2.edu
Follow us on Twitter: www.twitter.com/internet2
Become a Fan on Facebook: www.internet2.edu/facebook
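Dan's 9.25- and 11-hour figures follow directly from the stated throughput rates; a quick sketch of the arithmetic (using vendor-decimal terabytes and megabytes):

```python
# Sequential-read lower bound for resilvering a full 2 TB drive.

capacity_bytes = 2 * 10**12   # 2 TB, vendor (decimal) terabytes

for mb_per_s in (60, 50):
    hours = capacity_bytes / (mb_per_s * 10**6) / 3600
    print(f"{mb_per_s} MB/s -> {hours:.2f} h")
# 60 MB/s -> 9.26 h   (the ~9.25 h in the message)
# 50 MB/s -> 11.11 h  (the ~11 h in the message)
```

This is a best case: it assumes the resilver streams the whole surface sequentially, which the follow-ups below dispute for ZFS.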
> On Thu, May 13, 2010 at 06:09:55PM +0200, Roy Sigurd Karlsbakk wrote:
> > 1. Even though they're 5900, not 7200, benchmarks I've seen show they are quite good
>
> Minor correction: they are 5400rpm. Seagate makes some 5900rpm drives.
>
> [...]
>
> Regarding slow resilvering times: in the absence of other disk activity,
> I think that should really be limited by the throughput rate, not the
> relatively slow random I/O performance -- again assuming large files
> (and low fragmentation, which is what I'd expect if the archive is
> write-and-never-delete).

My experience is that they resilver fairly quickly, and scrubs aren't slow either (300GB in 2hrs).

Casper
On 17 May, 2010 - Dan Pritts sent me these 1,6K bytes:

> [...]
>
> One test I saw suggests 60MB/sec avg throughput on the 2TB drives.
> That works out to 9.25 hours to read the entire 2TB. At a conservative
> 50MB/sec it's 11 hours. This assumes that you have enough I/O bandwidth
> and CPU on the system to saturate all your disks.
>
> If there's other disk activity during a resilver, though, it turns into
> random I/O. Which is slow on these drives.

Resilver does a whole lot of random I/O itself, not bulk reads. It reads the filesystem tree, not "block 0, block 1, block 2...". You won't get 60MB/s sustained, not even close.

/Tomas
--
Tomas Ögren, stric at acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
On Mon, May 17, 2010 at 9:25 AM, Tomas Ögren <stric at acc.umu.se> wrote:

> [...]
>
> Resilver does a whole lot of random I/O itself, not bulk reads. It reads
> the filesystem tree, not "block 0, block 1, block 2...". You won't get
> 60MB/s sustained, not even close.

Resilver time for a 1.5 TB WD Green drive, with the wdidle3 setting "disabled", in an 8-drive raidz2 vdev, is over 65 hours, with ~500 GB of data per drive.

We just replaced 8 WD 500 GB RE Black drives with 8 1.5 TB WD Green drives, not realising just how horrible these drives are. :( So much for the $100 CDN bargain price.
Resilver time for a 1.5 TB Seagate 7200.11 drive, in an 8-drive raidz2 vdev, is about 35 hours, with ~500 GB of data per drive.

We just replaced 8 WD Black 500 GB drives and Seagate 7200.11 500 GB drives with 8 1.5 TB Seagate 7200.11s. Much nicer drives, and way better performance than the WD Greens.

Both servers are identical hardware (motherboard, CPU, RAM, RAID controllers, etc.), and both are using ZFSv14. The first is 64-bit FreeBSD 7.3-RELEASE, the second is FreeBSD 8-STABLE.

For a home media server, the WD Greens may be okay. For anything else, they're crap. Plain and simple.

--
Freddie Cash
fjwcash at gmail.com
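Converting the two resilver reports above into effective per-drive throughput (~500 GB in 65 h vs. 35 h -- the posters' figures) shows how far below the earlier sequential estimates both fall:

```python
# Effective resilver rate = data per drive / resilver time.

data_bytes = 500 * 10**9   # ~500 GB of data per drive, per the posts

for name, hours in (("WD Green 1.5 TB", 65), ("Seagate 7200.11 1.5 TB", 35)):
    mb_per_s = data_bytes / (hours * 3600) / 10**6
    print(f"{name}: {mb_per_s:.1f} MB/s effective")
# WD Green 1.5 TB: 2.1 MB/s effective
# Seagate 7200.11 1.5 TB: 4.0 MB/s effective
```

Both are an order of magnitude below the 50-60 MB/s sequential figure discussed earlier, consistent with the claim that resilver is dominated by random I/O rather than raw throughput.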
On Mon, May 17, 2010 at 06:25:18PM +0200, Tomas Ögren wrote:

> Resilver does a whole lot of random I/O itself, not bulk reads. It reads
> the filesystem tree, not "block 0, block 1, block 2...". You won't get
> 60MB/s sustained, not even close.

Even with large, unfragmented files?

danno
--
Dan Pritts, Sr. Systems Engineer
Internet2
On Mon, 2010-05-17 at 12:54 -0400, Dan Pritts wrote:

> On Mon, May 17, 2010 at 06:25:18PM +0200, Tomas Ögren wrote:
> > Resilver does a whole lot of random I/O itself, not bulk reads. It reads
> > the filesystem tree, not "block 0, block 1, block 2...". You won't get
> > 60MB/s sustained, not even close.
>
> Even with large, unfragmented files?

Having large, unfragmented files will certainly help keep sustained throughput. But you also have to consider the amount of deletion done on the pool.

For instance, let's say you wrote files A, B, and C one right after another, and they're all big files. Doing a resilver, you'd be pretty well off on getting reasonable throughput reading A, then B, then C, since they're going to be contiguous on the drive (both internally, and across the three files). However, if you deleted B at some point, and say wrote a file D (where D < B in size) into B's old space, then you seek to A, read A, seek forward to C, read C, seek back to D, etc.

Thus, you'll get good throughput for resilver on these drives in pretty much just ONE case: large files with NO deletions. If you're using them for write-once/read-many/no-delete archives, then you're OK. Anything else is going to suck.

:-)

--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
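Erik's A/B/C/D example can be sketched as a toy model: a resilver walks files in filesystem order, so every gap between the end of one extent and the start of the next costs a seek. All extent positions and sizes below are made up for illustration.

```python
def count_seeks(extents):
    """Count head movements for reading extents in the given order.
    extents: list of (start, length) in logical block units."""
    seeks = 0
    pos = 0
    for start, length in extents:
        if start != pos:      # next read doesn't begin where the last ended
            seeks += 1
        pos = start + length
    return seeks

# Case 1: A, B, C written back to back and never deleted.
contiguous = [(0, 100), (100, 100), (200, 100)]

# Case 2: B deleted, a smaller D written into part of B's old space.
# Read order follows the filesystem tree: A, C, then back to D.
fragmented = [(0, 100), (200, 100), (100, 40)]

print(count_seeks(contiguous))  # -> 0
print(count_seeks(fragmented))  # -> 2
```

Each extra seek on a 5400rpm drive costs on the order of 15 ms; multiplied over millions of blocks, that is where the multi-day resilver times come from.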
On Mon, May 17, 2010 at 03:12:44PM -0700, Erik Trimble wrote:

> [...]
>
> Thus, you'll get good throughput for resilver on these drives in pretty
> much just ONE case: large files with NO deletions. If you're using
> them for write-once/read-many/no-delete archives, then you're OK.
> Anything else is going to suck.

So basically, if you have a lot of small files with a lot of changes and deletions, resilver is going to be really slow. Sounds like traditional RAID would be better/faster to rebuild in this case..

-- Pasi
On Tue, May 18, 2010 at 09:40:15AM +0300, Pasi Kärkkäinen wrote:

> > Thus, you'll get good throughput for resilver on these drives in pretty
> > much just ONE case: large files with NO deletions. If you're using
> > them for write-once/read-many/no-delete archives, then you're OK.
> > Anything else is going to suck.

Thanks for pointing out the obvious. :)

Still, though, this is basically true for ANY drive. It's worse for slower RPM drives, but it's not like resilvers will be exactly fast with 7200rpm drives, either.

danno
--
Dan Pritts, Sr. Systems Engineer
Internet2