Hello all,

I am shopping around for 3.5" SSDs that I can mount into my storage and use as ZIL drives.
So far I have only found 3.5" models with the SandForce 1200, which was not recommended on this list.
Does anyone know of a model that has the SandForce 1500 and is 3.5"? Or can anyone recommend another 3.5" SSD?

Cheers,
budy
We've always bought 2.5" drives and adapters for the Supermicro cradles - works well, no issues to report here. Normally Intel or Samsung, though we also use STEC.

---
W. A. Khushil Dep - khushil.dep at gmail.com - 07905374843
Visit my blog at http://www.khushil.com/

On 22 December 2010 10:36, Stephan Budach <stephan.budach at jvm.de> wrote:
> I am shopping around for 3.5" SSDs that I can mount into my storage and
> use as ZIL drives.
On Wed, Dec 22, 2010 at 11:36:48AM +0100, Stephan Budach wrote:
> So far I have only found 3.5" models with the SandForce 1200, which
> was not recommended on this list.

I think the "recommendation" was not to use SSDs at all for ZIL,
not just specifically SandForce controllers?

-- Pasi
On 22.12.10 12:41, Pasi Kärkkäinen wrote:
> I think the "recommendation" was not to use SSDs at all for ZIL,
> not just specifically SandForce controllers?

I think the recommendation has been either the Intel X25 or SandForce 1500-based SSDs.

Cheers,
budy
Hello,

I was thinking of buying a couple of SSDs until I found out that TRIM is only supported on SATA drives. I'm not sure whether TRIM works with ZFS. I was concerned that without TRIM support, SSD life and write throughput will suffer.

Does anybody have any thoughts on this?

On 22 December 2010 10:36, Stephan Budach <stephan.budach at jvm.de> wrote:
> I am shopping around for 3.5" SSDs that I can mount into my storage and
> use as ZIL drives.

--
Thanks

A Jabbar Azam
> So far I have only found 3.5" models with the SandForce 1200, which was
> not recommended on this list.

I actually bought an SF-1200 based OCZ Agility 2 (60G) for use as a ZIL/L2ARC (haven't installed it yet, however; I definitely jumped the gun on this purchase...) based on some recommendations from fellow users. Why are these not recommended? Is it performance related, or more "the workload will degrade and kill this thing in no time" related?

--khd
> I'm not sure whether TRIM works with ZFS.

Neither ZFS nor the ZIL code in particular supports TRIM.

> I was concerned that without TRIM support, SSD life and
> write throughput will suffer.

Your concerns about sustainable write performance (IOPS) for a Flash based SSD are valid; the resulting degradation will vary depending on the controller used.

Best regards,

Christopher George
Founder/CTO
www.ddrdrive.com
On Wed, Dec 22, 2010 at 05:43:35AM -0800, Jabbar wrote:
> I was thinking of buying a couple of SSDs until I found out that TRIM is
> only supported on SATA drives. I'm not sure whether TRIM works with ZFS.
>
> Does anybody have any thoughts on this?

We have been using X25-Es as ZIL for over a year. Cheap enough to replace a drive when they last that long... (still not seeing any reason to replace our current batch yet, either).

Ray
On Dec 22, 2010, at 08:43, Jabbar wrote:
> I was thinking of buying a couple of SSDs until I found out that TRIM is
> only supported on SATA drives. I'm not sure whether TRIM works with ZFS.

Basic support for TRIM was added to b146, but ZFS does not make use of it yet:

http://bugs.opensolaris.org/view_bug.do?bug_id=6866610
http://sparcv9.blogspot.com/2010/07/sata-trim-command-in-b146.html

The statement "only supported on SATA drives" is a bit confusing. SATA is merely a protocol between the host and the storage unit. Whether the storage unit is an SSD or spinning rust is irrelevant, as either can talk to the outside world via SATA. Similarly, the SCSI world (which now runs over SAS at the transport layer) has a corresponding UNMAP command.
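As an aside, both commands can be exercised by hand today from a Linux box with sg3_utils and hdparm, independent of any filesystem support. A minimal sketch; the device paths and the LBA range here are placeholders, and unmapping or trimming live data is destructive:

    # ask a SAS/SCSI device to unmap 8 blocks starting at LBA 4096
    sg_unmap --lba=4096 --num=8 /dev/sg1

    # the ATA analogue (TRIM); hdparm insists on the extra safety flag
    hdparm --please-destroy-my-drive --trim-sector-ranges 4096:8 /dev/sdX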
On Dec 22, 2010, at 09:55, Krunal Desai wrote:
> Why are these not recommended? Is it performance related, or more
> "the workload will degrade and kill this thing in no time" related?

There are two main reasons why the SF-1500 based devices are generally recommended instead: first, they usually come with a supercap or other battery system that helps preserve the buffers if the power goes out; second, they generally respect 'flush cache' commands. When a SYNC command is sent to many other disks/SSDs, they answer back "yes, the data is on stable storage" when in fact it is not. Lying to ZFS about what's on stable storage causes problems when the power goes out.

This is for slog devices. These shortcomings don't matter (as much?) for cache / L2ARC devices, since those are mostly read-only. For mostly-write I/O, however, it can cause problems when it comes to pool recovery. Some recent threads on the subject:

http://mail.opensolaris.org/pipermail/zfs-discuss/2010-May/thread.html#41326
http://mail.opensolaris.org/pipermail/zfs-discuss/2010-May/thread.html#41588
http://mail.opensolaris.org/pipermail/zfs-discuss/2010-June/thread.html#42298

SandForce has recently announced the 2000-series chipset, of which the SF-2500 and SF-2600 are labelled as "enterprise":

http://www.sandforce.com/index.php?id=21

Note that a slog / ZIL device doesn't have to be very big (at most 1/2 of physical RAM), so if your system has 16 GB of memory your ZIL will be at most 8 GB:

http://tinyurl.com/34ac5vv
http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide#Separate_Log_Devices

Which is why you may want to purchase less storage but go with a "better" SSD. For L2ARC, bigger can be construed to be better.
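The mechanics of attaching such devices are a one-liner each way. A minimal sketch, assuming a pool named tank and device names that will differ on your system (removing a log device requires pool version 19 or later):

    # mirror the slog: an unmirrored log device that dies can cost you
    # the last few seconds of synchronous writes
    zpool add tank log mirror c4t0d0 c4t1d0

    # L2ARC blocks are checksum-verified against the pool, so no mirror needed
    zpool add tank cache c4t2d0

    # a mirrored log vdev can later be removed by its vdev name
    zpool remove tank mirror-1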
> I actually bought an SF-1200 based OCZ Agility 2 (60G)...
> Why are these not recommended?

The OCZ Agility 2, or any SF-1200 based SSD, is an excellent choice for the L2ARC. Its on-board volatile memory does *not* need power protection, because the L2ARC contents are not required to survive a host power loss (at this time). Also, checksum fallback to the pool provides data redundancy.

The ZIL accelerator's requirements differ from the L2ARC's, as its very purpose is to guarantee *all* data written to the log can be replayed (on next reboot) in case of host failure.

Best regards,

Christopher George
Founder/CTO
www.ddrdrive.com
> The ZIL accelerator's requirements differ from the L2ARC's, as its very
> purpose is to guarantee *all* data written to the log can be replayed
> (on next reboot) in case of host failure.

Ah, so this would be why, say, a supercapacitor-backed SSD can be very helpful, as it will have some backup power present.

Luckily, my use case is not a high-availability server, but a NAS in my basement. I've got it attached to a UPS with very conservative shut-down timing. Or are there other host failures aside from power that a ZIL would be vulnerable to (system hard-locks?)?
> I've got it attached to a UPS with very conservative shut-down timing. Or
> are there other host failures aside from power that a ZIL would be
> vulnerable to (system hard-locks?)?

Correct, a system hard-lock is another example...

Best regards,

Christopher George
Founder/CTO
www.ddrdrive.com
On Wed, Dec 22, 2010 at 01:43:35PM +0000, Jabbar wrote:
> I was thinking of buying a couple of SSDs until I found out that TRIM is
> only supported on SATA drives.

Yes, because TRIM is an ATA command. SATA means Serial ATA. SCSI (SAS) drives have the UNMAP command (and WRITE SAME with the unmap bit), which is the equivalent there.

-- Pasi
> > I've got it attached to a UPS with very conservative shut-down timing.
> > Or are there other host failures aside from power that a ZIL would be
> > vulnerable to (system hard-locks?)?
>
> Correct, a system hard-lock is another example...

How about comparing a non-battery-backed ZIL to running a ZFS dataset with sync=disabled. Which is more risky?

This has been an educational thread for me... I was not aware that SSD drives had some DRAM in front of the flash.
> How about comparing a non-battery-backed ZIL to running a
> ZFS dataset with sync=disabled. Which is more risky?

Most likely, the 3.5" SSD's on-board volatile (not power protected) memory would be small relative to the transaction group (txg) size, and thus less "risky" than sync=disabled.

Best regards,

Christopher George
Founder/CTO
www.ddrdrive.com
On 12/22/2010 7:05 AM, Christopher George wrote:
> Neither ZFS nor the ZIL code in particular supports TRIM.
>
> Your concerns about sustainable write performance (IOPS)
> for a Flash based SSD are valid; the resulting degradation
> will vary depending on the controller used.

Christopher is correct, in that SSDs will suffer from (non-trivial) performance degradation once they've exhausted their free list and haven't been told to reclaim emptied space. True battery-backed DRAM is the only permanent solution currently available which never runs into this problem. Even TRIM-supported SSDs eventually need reconditioning.

However, this *can* be overcome by frequently re-formatting the SSD (not the Solaris format, a low-level format using a vendor-supplied utility). It's generally a simple thing, but requires pulling the SSD from the server, connecting it to either a Linux or Windows box, running the reformatter, then replacing the SSD. Which is a PITA.

But, still a bit cheaper than buying a DDRdrive. <wink>

--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
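For the Linux detour, one common path is hdparm's ATA Security Erase, which is the command many vendor utilities issue under the hood. A rough sketch, assuming the SSD shows up as /dev/sdX and is not "frozen" by the BIOS; this wipes the whole drive, so remove it from the pool first:

    # confirm the security feature set is supported and the drive is not frozen
    hdparm -I /dev/sdX | grep -A8 Security

    # set a throwaway password, then issue the erase
    # (a successful erase also clears the password)
    hdparm --user-master u --security-set-pass tmppass /dev/sdX
    hdparm --user-master u --security-erase tmppass /dev/sdX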
On 12/22/2010 10:04 PM, Christopher George wrote:
> Most likely, the 3.5" SSD's on-board volatile (not power protected)
> memory would be small relative to the transaction group (txg) size,
> and thus less "risky" than sync=disabled.

To the OP: First off, what do you mean by "sync=disabled"? There is no such parameter for a mount option or attribute for ZFS, nor is there for exporting anything in NFS, nor for client-side NFS mounts. If you meant "disable the ZIL", well, DON'T.

http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Disabling_the_ZIL_.28Don.27t.29

Moreover, disabling the ZIL on a per-dataset basis is not possible. As noted in the ETG, disabling the ZIL can cause NFS-client-side corruption.

If you absolutely must do without a log SSD, however, you will get more reliable transactions than with a non-SuperCap'd SSD, by virtue of the fact that any sync write on such a fileserver will not return as complete until the data has reached the backing store. Which, in most cases, will tank (no pun intended) your synchronous performance. About the only case where it won't cripple performance is when your backing store uses some sort of NVRAM to buffer writes to the disks (as most large array controllers do - but make sure that cache is battery backed). But even there, it can be a relatively simple thing to overwhelm the very limited cache on such a controller, in which case your performance tanks again.

--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
The ACARD ANS-9010 is good enough in this aspect, if you need extremely high IOPS...

Fred

Erik Trimble wrote:
> However, this *can* be overcome by frequently re-formatting the SSD (not
> the Solaris format, a low-level format using a vendor-supplied utility).
> ...
> But, still a bit cheaper than buying a DDRdrive. <wink>
> It's generally a simple thing, but requires pulling the SSD from the
> server, connecting it to either a Linux or Windows box, running
> the reformatter, then replacing the SSD. Which is a PITA.

This procedure is more commonly known as a "Secure Erase", and it will return a Flash based SSD to its original or "new" performance.

But as demonstrated in my presentation comparing Flash to DRAM based SSDs for ZIL accelerator applicability, the most dramatic write IOPS degradation occurs in less than 10 minutes of sustained use. For reference:

http://www.ddrdrive.com/zil_accelerator.pdf

So for the tested devices (OCZ Vertex 2 EX / Vertex 2 Pro) to come close to matching the vendor-promised random write IOPS, one would have to remove the log device from the pool and Secure Erase it after every ~10 minutes of sustained ZIL use.

Would having to perform a Secure Erase every hour, day, or even week really be the most cost-effective use of an administrator's time?

Best regards,

Christopher George
Founder/CTO
www.ddrdrive.com
The ACARD ANS-9010 is good enough in this aspect, if you DON'T need extremely high IOPS...

Sorry for the typo.

Fred
> To the OP: First off, what do you mean by "sync=disabled"?

I believe he is referring to ZIL synchronicity (PSARC/2010/108):

http://arc.opensolaris.org/caselog/PSARC/2010/108/20100401_neil.perrin

The following presentation by Robert Milkowski does an excellent job of placing it in a larger context:

http://www.oug.org/files/presentations/zfszilsynchronicity.pdf

Best regards,

Christopher George
Founder/CTO
www.ddrdrive.com
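For those on a build recent enough to have it (the sync property integrated around b140), it is set per dataset and takes effect immediately. A quick sketch; the dataset name is a placeholder:

    zfs get sync tank/vms             # default is "standard"
    zfs set sync=disabled tank/vms    # treat synchronous writes as asynchronous
    zfs set sync=always tank/vms      # force every write through the ZIL
    zfs set sync=standard tank/vms    # restore POSIX-honoring behavior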
Erik Trimble wrote:
> However, this *can* be overcome by frequently re-formatting the SSD (not
> the Solaris format, a low-level format using a vendor-supplied utility).

We should get the reformatter(s) ported to illumos/solaris, if source is available. Something to consider.

- Garrett
If anybody does know of any source to the secure erase / reformatter tools, I'll happily volunteer to do the port and then maintain it.

I'm currently in talks with several SSD and driver-chip hardware peeps with regard to getting datasheets for some SSD products etc. for the purpose of better support under the OI/Solaris driver model, but these things can take a while to obtain, so if anybody knows of existing open source versions I'll jump on it.

Thanks,
Deano

Garrett D'Amore wrote:
> We should get the reformatter(s) ported to illumos/solaris, if source is
> available. Something to consider.
On Thu, Dec 23, 2010 at 07:35:29AM -0800, Deano wrote:
> If anybody does know of any source to the secure erase / reformatter
> tools, I'll happily volunteer to do the port and then maintain it.

A tool to help the end user know *when* they should run the reformatter tool would be helpful too.

I know we can just wait until performance "degrades", but it would be nice to see what % of blocks are in use, etc.

Ray
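Short of such a tool, SMART gives a coarse view of drive wear today, though not free-list occupancy, which no public attribute exposes. A sketch using smartmontools; attribute names and numbers are vendor-specific, so treat the Intel ones below as an example rather than a rule, and adjust the device path for your OS:

    # dump the vendor attribute table
    smartctl -A /dev/sdX

    # Intel X25-E/M drives report attribute 233 (Media_Wearout_Indicator),
    # which counts down from 100 as the flash wears
    smartctl -A /dev/sdX | grep -i wearout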
In an ideal world, if we could obtain details on how to reset/format blocks of an SSD, we could do it automatically, running behind the ZIL. As a log it goes in one direction, so a background task could clean up behind it, making the performance loss over time a non-issue for the ZIL. A first step may be calling unmap/trim on those blocks (which I was surprised to find is already coded up in the SATA driver source, just not used yet), but really a reset would be better.

But as you say, a tool to say whether it needs doing would be a good start. They certainly exist in closed source form...

Deano

Ray Van Dolson wrote:
> A tool to help the end user know *when* they should run the reformatter
> tool would be helpful too.
> However, this *can* be overcome by frequently re-formatting the SSD (not
> the Solaris format, a low-level format using a vendor-supplied utility).

For those looking to "Secure Erase" an OCZ SandForce based SSD to reclaim performance, the following OCZ forum thread might be of interest:

http://www.ocztechnologyforum.com/forum/showthread.php?75773-Secure-Erase-TRIM-and-anything-else-Sandforce

OCZ uses the term "DuraClass" as a catch-all for the algorithms controlling wear leveling, drive longevity, etc. There is a direct correlation between Secure Erase frequency and expected SSD lifetime.

Post #1, detailing a recommended frequency of Secure Erase use:

"3) Secure erase a drive every 6 months to free up previously read only blocks; secure erase every 2 days to get round DuraClass and you will kill the drive very quickly"

Post #5, explaining DuraClass and its relationship to TRIM:

"DuraClass is limiting the speed of the drive NOT TRIM. TRIM is used along with wear levelling."

Post #6 provides more detail on DuraClass and TRIM:

"Now DuraClass monitors all writes and controls encryption and compression; this is what affects the speed of the blocks being written to.. NOT the fact they have been TRIM'd or not TRIM'd."

"You guys have become fixated on TRIM not speeding up the drive and forget that DuraClass controls all writes incurred by the drive once a GC map has been written."

The above excerpts were written by an OCZ-employed thread moderator (Tony).

Best regards,

Christopher George
Founder/CTO
www.ddrdrive.com
Secure Erase is currently a whole-drive function: it resets every cell, and it also updates the firmware's GC maps so the firmware knows the drive is clean. TRIM just gives the firmware more information that a block is unused (normally a delete is just an update to an index table, and the firmware has no way of knowing which cells are no longer needed by the OS).

Current firmware is meant to help conventional filesystem usage. However, ZIL isn't normal usage, and as such *IF* - and it's a big if - we can effectively bypass the firmware trying to be clever, or at least help it be clever, then we can avoid the degradation over time. In particular, if we could secure erase a few cells at a time as required, the lifetime would be much longer. I'd even argue that taking the wear leveling out of the drive's hands would be useful in the ZIL case.

The other thing is that the slowdown only occurs once the SSD fills and has to start getting clever about where to put things and which cells to change; for a ZIL that is again something we could avoid in software fairly easily.

It's also worth putting this in perspective: a complete secure erase every night to restore performance to your ZIL would still let the SSD last for *years*. And given how cheap some SSDs are, it is probably still cheaper to effectively burn the ZIL out and just replace it once a year. Maybe not a classic level of RAID, but the very essence of the idea: lots of cheap can be better than expensive if you know what you are doing.

Bye,
Deano

Christopher George wrote:
> There is a direct correlation between Secure Erase frequency and
> expected SSD lifetime.
On 12/23/2010 7:57 AM, Deano wrote:
> In an ideal world, if we could obtain details on how to reset/format
> blocks of an SSD, we could do it automatically, running behind the ZIL.

AFAIK, all the reformatter utilities are closed source, direct from the SSD manufacturer. They talk directly to the drive firmware, so they're decidedly implementation-specific (I'd be flabbergasted if one worked on two different manufacturers' SSDs, even if they used the same basic controller). Many are DOS-based.

As Christopher noted, you'll get a drop-off in performance as soon as you collect enough sync writes to have written (in the aggregate) slightly more than the total capacity of the SSD (including the "extra" that most SSDs now have). That said, I would expect full TRIM support to improve this, as it could free up partially-used pages more frequently and thus increase the time before performance drops (which is due to the page remapping/reshuffling demands on the SSD controller).

But, yes, SSDs are inherently less fast than DRAM. Their utility is entirely dependent on what your use case (and performance demands) are.

The longer-term solution is to have SSDs change how they are designed, moving away from the current one-page-of-multiple-blocks as the atomic entity of writing and straight to a one-block-per-page setup. Don't hold your breath.

--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
On Wed, Dec 22 at 23:29, Christopher George wrote:
> Would having to perform a Secure Erase every hour, day, or even
> week really be the most cost-effective use of an administrator's time?

You're assuming that the "into an empty device" performance is required by their application. For many users, the worst-case steady state of the device (6k IOPS on the Vertex 2 EX, depending on workload, as per slide 48 in your presentation) is so much faster than a rotating drive (roughly 60x, assuming a rotating drive with its cache disabled does about 100 IOPS with queueing) that it'll still provide a huge performance boost when used as a ZIL in their system.

For a huge ZFS box providing tens of ZFS filesystems in a pool, all with huge user loads, sure, a RAM-based device makes sense, but I imagine it's overkill for some large percentage of ZFS users.

--
Eric D. Mudama
edmudama at mail.bounceswoosh.org
On Thu, Dec 23 at 9:14, Erik Trimble wrote:
> The longer-term solution is to have SSDs change how they are
> designed, moving away from the current one-page-of-multiple-blocks as
> the atomic entity of writing and straight to a one-block-per-page
> setup. Don't hold your breath.

That will never happen using NAND technology. Non-NAND SSDs may or may not have similar or related limitations.

--
Eric D. Mudama
edmudama at mail.bounceswoosh.org
On Thu, Dec 23 at 17:11, Deano wrote:
> In particular, if we could secure erase a few cells at a time as
> required, the lifetime would be much longer. I'd even argue that taking
> the wear leveling out of the drive's hands would be useful in the ZIL
> case.

In most cases, an SSD knows something isn't valuable when it is overwritten. If the allocator for the ZIL would rewrite sectors that are no longer needed, instead of walking sequentially across the entire available LBA space, the slowdown of a ZIL would likely never occur on a NAND SSD, since the drive would always have a good idea which sectors were free and which were still in use.

--
Eric D. Mudama
edmudama at mail.bounceswoosh.org
> You're assuming that the "into an empty device" performance is
> required by their application.

My assumption was stated in the paragraph prior, i.e. vendor-promised random write IOPS. Based on the inquiries we receive, most *actually* expect an OCZ SSD to perform as specified, which is 50K 4KB random writes for both the Vertex 2 EX and the Vertex 2 Pro.

The point I was trying to make is that Secure Erase is not a viable solution to the write IOPS degradation of the above listed SSDs, relative to published specifications. I think we can all agree that if "Secure Erase" could magically solve the problem, it would already be implemented by the SSD controller.

> For many users, the worst-case steady state of the device is so much
> faster than a rotating drive that it'll still provide a huge performance
> boost when used as a ZIL in their system.

I agree 100%. I never intended to insinuate otherwise :-)

Best regards,

Christopher George
Founder/CTO
www.ddrdrive.com
On Thu, Dec 23 at 10:49, Christopher George wrote:
> My assumption was stated in the paragraph prior, i.e. vendor-promised
> random write IOPS. Based on the inquiries we receive, most *actually*
> expect an OCZ SSD to perform as specified, which is 50K 4KB
> random writes for both the Vertex 2 EX and the Vertex 2 Pro.

Okay, I understand where you're coming from. Yes, buyers must be aware of the test methodologies behind published benchmark results, especially those used to sell drives by the vendors themselves. "Up to" is generally a poor basis for a buying decision.

--eric

--
Eric D. Mudama
edmudama at mail.bounceswoosh.org
On Dec 22, 2010, at 8:57 PM, Bill Werner <werner at cubbyhole.com> wrote:
> How about comparing a non-battery-backed ZIL to running a ZFS dataset
> with sync=disabled. Which is more risky?

Disabling the ZIL is always more risky. But more importantly, disabling the ZIL is a policy decision. If the user is happy with that policy, then they should be happy with the consequence.

-- richard
"Friends don''t let friends disable the ZIL" - right Richard? :-) On 24 Dec 2010 20:34, "Richard Elling" <richard.elling at gmail.com> wrote: -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20101225/a1264cec/attachment.html>
On Dec 25, 2010, at 10:19 AM, Khushil Dep wrote:
> "Friends don't let friends disable the ZIL" - right, Richard? :-)

Or, if you care about your data enough to bother with RAID, don't disable the ZIL :-)

-- richard
Stephan Budach wrote:
> I am shopping around for 3.5" SSDs that I can mount into my storage and
> use as ZIL drives.
> Does anyone know of a model that has the SandForce 1500 and is 3.5"? Or
> can anyone recommend another 3.5" SSD?

I have not personally used one, but I have received recommendations for the STEC ZeusRAM. It is a 3.5" SAS RAM-based, flash-backed SSD. I was quoted a shade under $3K USD for an 8GB unit. My understanding is that these have to be ordered through an integrator and there is a significant lead time.

There are 2.5"-to-3.5" adapter shells that you should be able to use for any/all of the 2.5" SSDs on the market.

-Will
Does anyone have a contact from whom I could purchase an STEC SSD?

Thank you,
Jordan

Saxon, Will wrote:
> I have not personally used one, but I have received recommendations for
> the STEC ZeusRAM. It is a 3.5" SAS RAM-based, flash-backed SSD.