FYI, this is actually a pretty good article which talks about
improvements in SSDs. Don't bet against Moore's Law :-)

Intel boosts speed, cuts prices of solid-state drives
http://news.cnet.com/8301-13924_3-10291582-64.html?tag=newsEditorsPicksArea.0

 -- richard
On Tue, 21 Jul 2009, Richard Elling wrote:

> FYI, this is actually a pretty good article which talks about
> improvements in SSDs. Don't bet against Moore's Law :-)
>
> Intel boosts speed, cuts prices of solid-state drives
> http://news.cnet.com/8301-13924_3-10291582-64.html?tag=newsEditorsPicksArea.0

This is pretty exciting stuff. We just have to switch to Windows 7. :-)

I notice that the life of the drives depends considerably on write
cycles. 20GB/day 24/7 does not support a data-intensive environment,
which could easily write 10-20X that much.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
Bob Friesenhahn wrote:

> On Tue, 21 Jul 2009, Richard Elling wrote:
>
>> FYI, this is actually a pretty good article which talks about
>> improvements in SSDs. Don't bet against Moore's Law :-)
>>
>> Intel boosts speed, cuts prices of solid-state drives
>> http://news.cnet.com/8301-13924_3-10291582-64.html?tag=newsEditorsPicksArea.0
>
> This is pretty exciting stuff. We just have to switch to Windows 7. :-)
>
> I notice that the life of the drives depends considerably on write
> cycles. 20GB/day 24/7 does not support a data-intensive environment,
> which could easily write 10-20X that much.

The X25-M drives referred to are Intel's Mainstream drives, using MLC
flash.

The Enterprise grade drives are X25-E, which currently use SLC flash
(less dense, more reliable, much longer lasting/more writes). The
expected lifetime is similar to an Enterprise grade hard drive.

--
Andrew
On Tue, 21 Jul 2009, Andrew Gabriel wrote:

> The X25-M drives referred to are Intel's Mainstream drives, using MLC
> flash.
>
> The Enterprise grade drives are X25-E, which currently use SLC flash
> (less dense, more reliable, much longer lasting/more writes). The
> expected lifetime is similar to an Enterprise grade hard drive.

Yes, but they store hardly any data. The X25-M sizes they mention are
getting to the point that you could use them for a data drive.

With wear leveling and zfs you would probably discover that the drive
suddenly starts to wear out all at once, once it reaches the end of its
lifetime. Unless drive ages are carefully staggered, or different types
of drives are intentionally used, it might be that data redundancy does
not help. Poof!

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
On Jul 21, 2009, at 12:49 PM, Bob Friesenhahn wrote:

> On Tue, 21 Jul 2009, Andrew Gabriel wrote:
>
>> The X25-M drives referred to are Intel's Mainstream drives, using
>> MLC flash.
>>
>> The Enterprise grade drives are X25-E, which currently use SLC
>> flash (less dense, more reliable, much longer lasting/more writes).
>> The expected lifetime is similar to an Enterprise grade hard drive.
>
> Yes, but they store hardly any data. The X25-M sizes they mention
> are getting to the point that you could use them for a data drive.
>
> With wear leveling and zfs you would probably discover that the
> drive suddenly starts to wear out all at once, once it reaches the
> end of its lifetime. Unless drive ages are carefully staggered, or
> different types of drives are intentionally used, it might be that
> data redundancy does not help. Poof!

Eh? Would you care to share how you calculate this?
 -- richard
Richard Elling wrote:

> On Jul 21, 2009, at 12:49 PM, Bob Friesenhahn wrote:
>
>> On Tue, 21 Jul 2009, Andrew Gabriel wrote:
>>
>>> The X25-M drives referred to are Intel's Mainstream drives, using
>>> MLC flash.
>>>
>>> The Enterprise grade drives are X25-E, which currently use SLC flash
>>> (less dense, more reliable, much longer lasting/more writes). The
>>> expected lifetime is similar to an Enterprise grade hard drive.
>>
>> Yes, but they store hardly any data. The X25-M sizes they mention are
>> getting to the point that you could use them for a data drive.
>>
>> With wear leveling and zfs you would probably discover that the drive
>> suddenly starts to wear out all at once, once it reaches the end of
>> its lifetime. Unless drive ages are carefully staggered, or different
>> types of drives are intentionally used, it might be that data
>> redundancy does not help. Poof!
>
> Eh? Would you care to share how you calculate this?

Well, I'm assuming something like this: if all your drives have
*exactly* the same lifetime, you really don't want them all to fail at
the same time... so you should ideally arrange for them to fail a month
or so apart. That should leave you plenty of time to replace the failed
device without all your data going bye-bye at the same time.

Or maybe that wasn't the part you wanted clarified. My bad :)

Matt
On Wed 22/07/09 08:21 , "Richard Elling" richard.elling at gmail.com sent:

> On Jul 21, 2009, at 12:49 PM, Bob Friesenhahn wrote:
>
>> With wear leveling and zfs you would probably discover that the
>> drive suddenly starts to wear out all at once, once it reaches the
>> end of its lifetime. Unless drive ages are carefully staggered, or
>> different types of drives are intentionally used, it might be that
>> data redundancy does not help. Poof!
>
> Eh? Would you care to share how you calculate this?

The better the wear leveling, the higher the chances of multiple cells
failing at one time?

--
Ian.
On Tue, 21 Jul 2009, Richard Elling wrote:

>> With wear leveling and zfs you would probably discover that the drive
>> suddenly starts to wear out all at once, once it reaches the end of
>> its lifetime. Unless drive ages are carefully staggered, or different
>> types of drives are intentionally used, it might be that data
>> redundancy does not help. Poof!
>
> Eh? Would you care to share how you calculate this?

It assumes that the devices are manufactured perfectly, with quite
uniform properties, and very well-designed wear leveling which exposes
all cells to the same degree of wear. Perfection theoretically results
in Poof!

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
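For what it's worth, the concern can be made concrete with a rough
back-of-the-envelope sketch (not from the thread; every number below is
a made-up assumption, not a measured figure): if wear leveling is
near-perfect and mirrored drives see identical writes, their estimated
wear-out times differ only by the small manufacturing spread.

    # Hypothetical sketch of the "all drives wear out together" concern.
    # Capacity, P/E cycles, and variance are invented for illustration.
    import random

    def drive_lifetime_days(daily_writes_gb, capacity_gb=160,
                            pe_cycles=10000, variance=0.02):
        # Ideal wear leveling spreads writes evenly, so the whole drive
        # reaches its program/erase limit at about the same time; the
        # small 'variance' stands in for manufacturing spread.
        cycles = pe_cycles * random.uniform(1 - variance, 1 + variance)
        write_budget_gb = capacity_gb * cycles
        return write_budget_gb / daily_writes_gb

    # Mirrored drives see identical writes, so their estimated lifetimes
    # differ only by that small variance -- they tend to fail together.
    lifetimes = sorted(drive_lifetime_days(200) for _ in range(2))
    print(lifetimes, "days; gap of only", lifetimes[1] - lifetimes[0])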
On Jul 21, 2009, at 2:24 PM, Bob Friesenhahn wrote:

> On Tue, 21 Jul 2009, Richard Elling wrote:
>
>>> With wear leveling and zfs you would probably discover that the
>>> drive suddenly starts to wear out all at once, once it reaches the
>>> end of its lifetime. Unless drive ages are carefully staggered,
>>> or different types of drives are intentionally used, it might be
>>> that data redundancy does not help. Poof!
>>
>> Eh? Would you care to share how you calculate this?
>
> It assumes that the devices are manufactured perfectly, with quite
> uniform properties, and very well-designed wear leveling which
> exposes all cells to the same degree of wear. Perfection
> theoretically results in Poof!

Bad assumption. Not only are semiconductor processes quite variable,
but performance also depends on environmental conditions like
temperature.

But to put this in perspective, you would have to *delete* 20 GBytes of
data a day on a ZFS file system for 5 years (according to Intel) to
reach the expected endurance. I don't know many people who delete that
much data continuously (I suspect that the satellite data vendors might
in their staging servers... not exactly a market for SSDs).
 -- richard
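To put a number on that figure (a quick sketch using only the 20
GBytes/day and 5-year values quoted above):

    # Quick arithmetic on the endurance figure quoted above.
    daily_gb = 20                 # rated host writes per day
    years = 5
    total_tb = daily_gb * 365 * years / 1000.0
    print(f"write budget: about {total_tb:.1f} TB over {years} years")
    # -> about 36.5 TB of writes before the rated endurance is reached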
On Tue, Jul 21, 2009 at 02:45:57PM -0700, Richard Elling wrote:

> But to put this in perspective, you would have to *delete* 20 GBytes

Or overwrite (since the overwrites turn into COW writes of new blocks,
and the old blocks are released if not referred to from a snapshot).

> of data a day on a ZFS file system for 5 years (according to Intel) to
> reach the expected endurance. I don't know many people who delete
> that much data continuously (I suspect that the satellite data vendors
> might in their staging servers... not exactly a market for SSDs)

Don't forget atime updates. If you just read, you're still writing.

Of course, the writes from atime updates will generally be less than the
number of data blocks read, so you might have to read many times more
than what you say in order to get the same effect.

(Speaking of atime updates, I run my root datasets with atime updates
disabled. I don't have hard data, but it stands to reason that things
go faster that way. I also mount filesystems in VMs with atime
disabled.)

Yes, I'm picking nits; sorry.

Nico
--
Louis-Frédéric Feuillette
2009-Jul-21 22:00 UTC
[zfs-discuss] SSDs get faster and less expensive
On Tue, 2009-07-21 at 14:45 -0700, Richard Elling wrote:

> But to put this in perspective, you would have to *delete* 20 GBytes of
> data a day on a ZFS file system for 5 years (according to Intel) to
> reach the expected endurance.

Forgive my ignorance, but is this not exactly what an SSD ZIL does? A
ZIL would need to "delete" its data when it flushes to disk. I know this
thread is about consumer SSDs, but are the enterprise SSDs that much
better in terms of write cycles (not speed, I know they differ in some
cases dramatically)?

Richard, do you have a blog post about SSDs that I missed in my travels?

--
Louis-Frédéric Feuillette <jebnor at gmail.com>
On Tue, 21 Jul 2009, Richard Elling wrote:

> But to put this in perspective, you would have to *delete* 20 GBytes of
> data a day on a ZFS file system for 5 years (according to Intel) to
> reach the expected endurance. I don't know many people who delete that
> much data continuously (I suspect that the satellite data vendors might
> in their staging servers... not exactly a market for SSDs)

Any application which repeatedly overwrites files, or writes new ones
while deleting the old ones, will result in huge deletion of data. Zfs
deletes data at the rate it is re-written as long as the data is not
retained by a snapshot. The film post-production industry has no
difficulty trundling through several terabytes of data in a day.

One application that I have become more familiar with lately is the
image processing nodes at Flickr. The image processing work is done
locally on a server and then saved off to bulk storage. The local disks
on the server surely see massive amounts of use.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
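As a rough illustration of how quickly a write-heavy workload eats into
the budget from the earlier calculation (the daily rates below are
hypothetical, not measured Flickr or post-production figures):

    # Hypothetical: how long a ~36.5 TB rated write budget lasts at
    # heavier per-drive daily write rates.
    budget_tb = 20 * 365 * 5 / 1000.0    # same Intel-derived figure
    for daily_tb in (0.2, 0.5, 2.0):     # hypothetical write rates
        print(f"{daily_tb} TB/day -> budget lasts about "
              f"{budget_tb / daily_tb:.0f} days")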
Louis-Frédéric Feuillette wrote:

> On Tue, 2009-07-21 at 14:45 -0700, Richard Elling wrote:
>
>> But to put this in perspective, you would have to *delete* 20 GBytes of
>> data a day on a ZFS file system for 5 years (according to Intel) to
>> reach the expected endurance.
>
> Forgive my ignorance, but is this not exactly what an SSD ZIL does? A
> ZIL would need to "delete" its data when it flushes to disk. I know
> this thread is about consumer SSDs, but are the enterprise SSDs that
> much better in terms of write cycles (not speed, I know they differ in
> some cases dramatically)?

Around 50 times better.

The other thing to note is that the wear-out failure mode is very
different from that of a hard drive. Throughout the life of the SSD,
the spare capacity is slowly used to replace blocks that are getting
weak. Failure happens when the spare capacity is all used up. At this
point, you can't write new data to the SSD, but it still has all your
existing data available for reading. (At least for Enterprise SSDs --
I don't know much about the MLC consumer drives.)

--
Andrew
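A minimal toy model of that wear-out mode (the spare-block count and
retirement rate below are invented purely for illustration):

    # Toy model of the failure mode described above: weak blocks are
    # retired to spares, and the drive stops accepting new writes once
    # the spare pool is exhausted.  Counts are invented.
    def days_until_spares_exhausted(spare_blocks=10000,
                                    weak_blocks_per_day=5):
        remaining, days = spare_blocks, 0
        while remaining > 0:
            remaining -= weak_blocks_per_day   # retire weak blocks
            days += 1
        return days

    print(days_until_spares_exhausted(), "days until writes stop")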
On 07/21/09 03:00 PM, Nicolas Williams wrote:

> On Tue, Jul 21, 2009 at 02:45:57PM -0700, Richard Elling wrote:
>
>> But to put this in perspective, you would have to *delete* 20 GBytes
>
> Or overwrite (since the overwrites turn into COW writes of new blocks,
> and the old blocks are released if not referred to from a snapshot).
>
>> of data a day on a ZFS file system for 5 years (according to Intel) to
>> reach the expected endurance. I don't know many people who delete
>> that much data continuously (I suspect that the satellite data vendors
>> might in their staging servers... not exactly a market for SSDs)
>
> Don't forget atime updates. If you just read, you're still writing.
>
> Of course, the writes from atime updates will generally be less than the
> number of data blocks read, so you might have to read many times more
> than what you say in order to get the same effect.
>
> (Speaking of atime updates, I run my root datasets with atime updates
> disabled. I don't have hard data, but it stands to reason that things
> go faster that way. I also mount filesystems in VMs with atime
> disabled.)

You might find this useful:
http://www.sun.com/bigadmin/features/articles/nvm_boot.jsp
It's from a year ago.

In general though, regardless of how you set things in the article, I
was involved in some destructive testing on nand flash memory, both SLC
and MLC, in 2007. Our team found that when used as a boot disk, the
amount of writes, with current wear-leveling techniques, was such that
we estimated the device would not fail during the anticipated service
life of the motherboard (5 to 7 years). Using an SSD as a data drive or
storage cache drive is an entirely different situation.

Solaris had been optimized to reduce writes to the boot disk long before
SSDs, in an attempt to maximize performance and reliability. So, for
example, when using a CF card as a boot disk with unmodified Solaris,
the writes were so low per 24 hours that Mike and Krister's team
calculated a best-case device life of 779 years and a worst case under
abuse of approx 68,250 hours. The calculations change with device size,
wear-level algorithm, etc. Current SSDs are better.

But the above calculations did not take into account random electronics
failures (MTBF), just the failure mode of exhausting the maximum write
count.

So I really sleep fine at night if the SSD or CF is a boot disk,
especially with atime disabled. If it's for a cache, well, that might
require some additional testing/modeling/calculation. If it were a
write cache for critical data, I would calculate, and then simply
replace it periodically *before* it fails.

Neal

> Yes, I'm picking nits; sorry.
>
> Nico
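The lifetime numbers Neal mentions come from this same kind of
arithmetic; a hedged sketch of the calculation (the device parameters
below are placeholders, not the figures his team actually used):

    # Sketch of a boot-disk lifetime estimate: measure the writes the
    # device actually sees per day, then divide the total program/erase
    # budget by that rate.  Parameters are placeholders.
    def estimated_life_years(capacity_gb, pe_cycles, measured_gb_per_day,
                             wear_level_efficiency=0.9):
        budget_gb = capacity_gb * pe_cycles * wear_level_efficiency
        return budget_gb / measured_gb_per_day / 365

    # A lightly written boot disk lasts a very long time on paper...
    print(estimated_life_years(8, 100000, 0.1))
    # ...while heavy abuse shortens the estimate dramatically.
    print(estimated_life_years(8, 100000, 100))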
On Jul 21, 2009, at 3:00 PM, Louis-Frédéric Feuillette wrote:

> On Tue, 2009-07-21 at 14:45 -0700, Richard Elling wrote:
>
>> But to put this in perspective, you would have to *delete* 20
>> GBytes of data a day on a ZFS file system for 5 years (according to
>> Intel) to reach the expected endurance.
>
> Forgive my ignorance, but is this not exactly what an SSD ZIL does? A
> ZIL would need to "delete" its data when it flushes to disk. I know
> this thread is about consumer SSDs, but are the enterprise SSDs that
> much better in terms of write cycles (not speed, I know they differ in
> some cases dramatically)?

Good question. I don't know of a sync workload that does that sort of
traffic where I would only use one SSD, but I suppose it is possible.

Yes, SSDs can vary in several ways:
  1. the space reserved for endurance (spares)
  2. ECC algorithms
  3. failure management algorithms
  4. DRAM buffers
  5. SLC:MLC ratios
  6. page size
  7. quality of components

Some of these are closely guarded secrets, so resorting to black box
testing may be the best you can hope for. But even black box testing
can be difficult, because the failure rate is so low, especially in
early life, that the confidence is low. It is easier to get high
confidence in the failure behaviour of black boxes with a more
consistent (and low) failure rate. It will be interesting to see what
happens over time :-)

> Richard, do you have a blog post about SSDs that I missed in my
> travels?

I think there are a few on my todo list... :-)
 -- richard
Where is the best place to read about the latest ZFS support for SSDs
and its roadmap, now that the latest ZFS release adds SSD management to
ZFS?

- Henry

----- Original Message ----
From: Richard Elling <richard.elling at gmail.com>
To: Louis-Frédéric Feuillette <jebnor at gmail.com>
Cc: zfs-discuss at opensolaris.org
Sent: Tuesday, July 21, 2009 5:43:23 PM
Subject: Re: [zfs-discuss] SSDs get faster and less expensive
> Where is the best place to read about the latest ZFS support for SSDs
> and its roadmap, now that the latest ZFS release adds SSD management
> to ZFS?

I recommend these blog posts, if you have not read them yet.

ZFS L2ARC
http://blogs.sun.com/brendan/entry/test

ZIL
http://blogs.sun.com/perrin/entry/slog_blog_or_blogging_on
http://blogs.sun.com/perrin/entry/the_lumberjack
http://blogs.sun.com/realneel/entry/the_zfs_intent_log

--
Irie Shin