Jason J. W. Williams
2006-Dec-15 18:53 UTC
[zfs-discuss] Instructions for ignoring ZFS write cache flushing on intelligent arrays
Hi Folks,

Roch Bourbonnais and Richard Elling helped me tremendously with the issue of ZFS killing performance on arrays with battery-backed cache. Since this seems to have been mentioned a bit recently, and there are no instructions on how to fix it on Sun StorageTek/Engenio arrays, I wanted to post back the solution that worked for me. It's posted here:

http://blogs.digitar.com/jjww/?itemid=44

The instructions will tell you how to configure the array to ignore SCSI cache flushes/syncs on Engenio arrays. If anyone has additional instructions for other arrays, please let me know and I'll be happy to add them!

Best Regards,
Jason
Jeremy Teo
2006-Dec-15 19:07 UTC
[zfs-discuss] Instructions for ignoring ZFS write cache flushing on intelligent arrays
> The instructions will tell you how to configure the array to ignore
> SCSI cache flushes/syncs on Engenio arrays. If anyone has additional
> instructions for other arrays, please let me know and I'll be happy to
> add them!

Wouldn't it be more appropriate to allow the administrator to disable ZFS from issuing the write cache enable command during a commit? (assuming expensive high end battery backed cache etc etc)

--
Regards,
Jeremy
Jason J. W. Williams
2006-Dec-15 19:10 UTC
[zfs-discuss] Instructions for ignoring ZFS write cache flushing on intelligent arrays
Hi Jeremy,

It would be nice if you could tell ZFS to turn off fsync() for ZIL writes on a per-zpool basis. That being said, I'm not sure there's a consensus on that...and I'm sure not smart enough to be a ZFS contributor. :-)

The behavior is a reality we had to deal with and work around, so I posted the instructions to hopefully help others in a similar boat.

I think this is a valuable discussion point though...at least for us. :-)

Best Regards,
Jason
Richard Elling
2006-Dec-16 05:18 UTC
[zfs-discuss] Instructions for ignoring ZFS write cache flushing on intelligent arrays
Jason J. W. Williams wrote:
> Hi Jeremy,
>
> It would be nice if you could tell ZFS to turn off fsync() for ZIL
> writes on a per-zpool basis. That being said, I'm not sure there's a
> consensus on that...and I'm sure not smart enough to be a ZFS
> contributor. :-)
>
> The behavior is a reality we had to deal with and work around, so I
> posted the instructions to hopefully help others in a similar boat.
>
> I think this is a valuable discussion point though...at least for us. :-)

This is one of those systems engineering problems that can be difficult to identify, as Jason discovered, and with no perfect solution. I hope someone here can help with the design concept behind the cache flush on the FLX 210, as well as other RAID arrays. I think there is ample discussion here already of the behaviour of ZFS.

-- richard
Jeremy Teo
2006-Dec-16 07:24 UTC
[zfs-discuss] Instructions for ignoring ZFS write cache flushing on intelligent arrays
On 12/16/06, Richard Elling wrote:
> This is one of those systems engineering problems that can
> be difficult to identify, as Jason discovered, and with no
> perfect solution. I hope someone here can help with the
> design concept behind the cache flush on the FLX 210, as well
> as other RAID arrays. I think there is ample discussion here
> already of the behaviour of ZFS.
> -- richard

I presume you feel that the current behaviour of ZFS always issuing the write cache flush is the best compromise.

Are there actually storage arrays with battery backed cache that *don't* allow themselves to be configured to ignore cache flush commands?

--
Regards,
Jeremy
Torrey McMahon
2006-Dec-16 18:08 UTC
[zfs-discuss] Instructions for ignoring ZFS write cache flushing on intelligent arrays
Richard Elling wrote:
> This is one of those systems engineering problems that can
> be difficult to identify, as Jason discovered, and with no
> perfect solution. I hope someone here can help with the
> design concept behind the cache flush on the FLX 210, as well
> as other RAID arrays. I think there is ample discussion here
> already of the behaviour of ZFS.
> -- richard

Maybe there is a systems company that can help solve ... ummm ... wait a second. :-P

Seriously - Ping the array team. I can get you some names off alias if you want them.
Torrey McMahon
2006-Dec-16 18:09 UTC
[zfs-discuss] Instructions for ignoring ZFS write cache flushing on intelligent arrays
Jeremy Teo wrote:
> Are there actually storage arrays with battery backed cache that
> *don't* allow themselves to be configured to ignore cache flush
> commands?

The arrays/controllers that LSI makes are well known for being extremely configurable. I would presume that most arrays you find on the market aren't going to have that sort of configuration option....though I'll gladly take a correction.
Richard Elling
2006-Dec-16 22:56 UTC
[zfs-discuss] Instructions for ignoring ZFS write cache flushing on intelligent arrays
Jeremy Teo wrote:
> I presume you feel that the current behaviour of ZFS always issuing
> the write cache flush is the best compromise.
>
> Are there actually storage arrays with battery backed cache that
> *don't* allow themselves to be configured to ignore cache flush
> commands?

I don't think the problem is as simple as that. As I see it, the problem is that the effects of the cache flush policies on perceived performance are unknown. On the one hand, this is predictable, as these sorts of problems often arise when integrating subsystems. OTOH, now that we have some light shining on it, we may find that this is an isolated event, restricted to older array technology (presuming that as the arrays become more powerful, the designers are able to implement more sophisticated algorithms). I could foresee similar issues if ZFS tries to get too clever with its scheduling algorithm (as Bill has noted previously).

Regardless, I expect to see more of these unknown unknowns become known over the next few years.

-- richard
Gregory Shaw
2006-Dec-16 23:29 UTC
[zfs-discuss] Instructions for ignoring ZFS write cache flushing on intelligent arrays
That sounds like a really good idea. If you trust your high-end arrays (EMC, Engenio, HDS, Sun, etc.), I would think that a pool-level don't-fsync-ZIL option would be very beneficial.

As stated in the article, doing this on a storage solution without battery-backed cache is a very bad idea. However, for larger environments, I believe we'll see this request more and more.

I wonder what sort of performance change this would have for those encountering NFS/ZFS performance issues? If you're using intelligent arrays with NFS, I would think it could have a big impact.

On Dec 15, 2006, at 12:07 PM, Jeremy Teo wrote:
>> The instructions will tell you how to configure the array to ignore
>> SCSI cache flushes/syncs on Engenio arrays. If anyone has additional
>> instructions for other arrays, please let me know and I'll be
>> happy to add them!
>
> Wouldn't it be more appropriate to allow the administrator to disable
> ZFS from issuing the write cache enable command during a commit?
> (assuming expensive high end battery backed cache etc etc)
> --
> Regards,
> Jeremy

-----
Gregory Shaw, IT Architect
IT CTO Group, Sun Microsystems Inc.
Roch - PAE
2006-Dec-19 12:02 UTC
[zfs-discuss] Instructions for ignoring ZFS write cache flushing on intelligent arrays
Jason J. W. Williams writes:
> Hi Jeremy,
>
> It would be nice if you could tell ZFS to turn off fsync() for ZIL
> writes on a per-zpool basis. That being said, I'm not sure there's a
> consensus on that...and I'm sure not smart enough to be a ZFS
> contributor. :-)
>
> The behavior is a reality we had to deal with and work around, so I
> posted the instructions to hopefully help others in a similar boat.
>
> I think this is a valuable discussion point though...at least for us. :-)
>
> Best Regards,
> Jason

To summarize:

Today, ZFS sends an ioctl to the storage that says "flush the write cache", while what it really wants is "make sure the data is on stable storage". The storage should then flush the cache or not, depending on whether the cache is considered stable (only the storage knows that).

Soon ZFS (more precisely, the sd driver) will be sending a 'qualified' ioctl to clarify the requested behavior.

In parallel, storage vendors will need to implement that qualified ioctl. ZFS customers of third-party storage probably have more influence in getting those vendors to support the qualified behavior.

http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6462690

With sd fixed and storage vendor support in place, there will be no more need to tune anything.

-r
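The difference between "flush the write cache" and "make sure data is on stable storage" can be sketched with a toy model. This is only an illustration of the idea in the summary above, not how any array firmware or the sd driver is actually written; the class `Array`, its fields, and the `qualified` flag are hypothetical names invented for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Array:
    """Toy model of a storage array with a write cache (hypothetical)."""
    cache_is_nonvolatile: bool                    # e.g. battery-backed cache
    cache: list = field(default_factory=list)     # blocks still in cache
    disk: list = field(default_factory=list)      # blocks destaged to disk

    def write(self, block):
        # Writes are acknowledged once they land in the cache.
        self.cache.append(block)

    def flush(self, qualified=False):
        """Handle a flush request; return True if the cache was destaged.

        qualified=False models "flush the write cache": always destage.
        qualified=True models "make sure data is on stable storage":
        destage only when the cache itself is not stable.
        """
        if qualified and self.cache_is_nonvolatile:
            return False          # data is already stable, nothing to do
        self.disk.extend(self.cache)
        self.cache.clear()
        return True


battery_backed = Array(cache_is_nonvolatile=True)
battery_backed.write("zil-record")
print(battery_backed.flush(qualified=True))   # False: no destage needed
print(battery_backed.flush())                 # True: unqualified flush destages
```

In this model an unqualified flush always pays the destage cost, which is the performance problem described at the start of the thread, while a qualified flush lets the array decide.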
Jason J. W. Williams
2006-Dec-19 19:59 UTC
[zfs-discuss] Instructions for ignoring ZFS write cache flushing on intelligent arrays
Hi Roch,

That sounds like a most excellent resolution to me. :-) I believe Engenio devices support SBC-2. It seems to me making intelligent decisions for end-users is generally a good policy.

Best Regards,
Jason
Roch Bourbonnais
2007-Jan-10 16:26 UTC
[zfs-discuss] Instructions for ignoring ZFS write cache flushing
It seems, though, that the critical feature we need was optional in the SBC-2 spec. So we still need some development to happen on the storage end. But we'll get there...
Brad
2010-Jan-25 23:12 UTC
[zfs-discuss] Instructions for ignoring ZFS write cache flushing on intelligent arrays
Hi! So after reading through this thread and checking the bug report... do we still need to tell ZFS to disable cache flushes?

set zfs:zfs_nocacheflush=1

--
This message posted from opensolaris.org
Cindy Swearingen
2010-Jan-26 23:35 UTC
[zfs-discuss] Instructions for ignoring ZFS write cache flushing on intelligent arrays
Brad,

If you are referring to this thread that started in 2006, then I would review this updated section:

http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Cache_Flushes

Check to see if your array is described, or let us know which array you are referring to...

Thanks,
Cindy
Brad
2010-Jan-27 18:43 UTC
[zfs-discuss] Instructions for ignoring ZFS write cache flushing on intelligent arrays
Cindy,

It does not list our SAN (LSI/STK/NetApp)... I'm confused about disabling cache flushes from the wiki entries.

Should we disable it by turning off ZFS cache syncs via

echo zfs_nocacheflush/W0t1 | mdb -kw

or specify it per storage device via the sd.conf method, where the array ignores cache flushes from ZFS?

Brad
Cindy Swearingen
2010-Jan-27 20:17 UTC
[zfs-discuss] Instructions for ignoring ZFS write cache flushing on intelligent arrays
Brad,

It depends on the Solaris release. What Solaris release are you running?

Thanks,
Cindy
Brad
2010-Jan-27 22:47 UTC
[zfs-discuss] Instructions for ignoring ZFS write cache flushing on intelligent arrays
We're running 10/09 on the dev box, but prodqa is on 11/06.
Cindy Swearingen
2010-Jan-27 23:24 UTC
[zfs-discuss] Instructions for ignoring ZFS write cache flushing on intelligent arrays
Hi Brad,

You should see better performance on the dev box running 10/09 with the sd and ssd drivers as is, because they should properly handle the SYNC_NV bit in this release.

If you have determined that the 11/06 system is affected by this issue, then the best method is to set this parameter in the /kernel/drv/*conf file.

Because we're discussing this over email, I'm unclear whether you understand all the implications of disabling this parameter. Someone with more experience with tuning this parameter should weigh in. Brad is using a SAN (LSI/STK/NetApp).

Thanks,
Cindy
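For readers wondering what the SYNC_NV bit looks like on the wire, here is a minimal sketch that builds a SCSI SYNCHRONIZE CACHE (10) command descriptor block in Python. The field layout is an assumption based on a reading of SBC-2 (opcode 0x35, SYNC_NV as bit 2 and IMMED as bit 1 of byte 1); check the spec before relying on it. The function name `build_sync_cache_cdb` is hypothetical, and this is not code from the sd or ssd drivers.

```python
import struct

SYNCHRONIZE_CACHE_10 = 0x35  # SCSI operation code
IMMED = 0x02                 # byte 1, bit 1: return before the sync completes
SYNC_NV = 0x04               # byte 1, bit 2: syncing to non-volatile cache suffices


def build_sync_cache_cdb(lba=0, num_blocks=0, sync_nv=False, immed=False):
    """Return a 10-byte SYNCHRONIZE CACHE (10) CDB (layout assumed from SBC-2)."""
    flags = (SYNC_NV if sync_nv else 0) | (IMMED if immed else 0)
    # Big-endian: opcode, flags, 4-byte LBA, group number,
    # 2-byte number of blocks, control byte.
    return struct.pack(">BBIBHB", SYNCHRONIZE_CACHE_10, flags, lba, 0, num_blocks, 0)


plain = build_sync_cache_cdb()                  # "flush everything to the medium"
qualified = build_sync_cache_cdb(sync_nv=True)  # "non-volatile cache is good enough"
print(plain.hex())
print(qualified.hex())
```

With `lba=0` and `num_blocks=0` the request covers the whole device; the only difference between the two CDBs is the single flag bit, which is why array-side support matters more than host-side complexity.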