On Tue, Oct 18, 2011 at 9:38 AM, Gregory Shaw <greg.shaw at oracle.com> wrote:

> Another item that made me nervous was my experience with ZFS. Even when
> called 'ready for production', a number of bugs were found that were pretty nasty.
> They've since been fixed (years ago), but there were some surprises there that I'd
> rather not encounter on a Linux system.

I know that I have been really spoiled by UFS. It has been around
for so long that it has been really optimized, some might even say,
optimized beyond the point of diminishing returns :-) UFS is amazingly
stable and has very reasonable performance, given its roots. I did not
have to live through the early days of UFS and the pain of finding bugs.
I _am_ living through that with ZFS :-(

Having said that, I have yet to have any data loss with ZFS. I
have developed a number of simple rules I follow with ZFS:

1. OS and DATA go on different zpools on different physical drives (if
at all possible).

2. Do NOT move drives around without first exporting any zpools on
those drives.

3. Do NOT let a system see drives with more than one OS zpool at the
same time (I know you _can_ do this safely, but I have seen too many
horror stories on this list that I just avoid it).

-- 
{--------1---------2---------3---------4---------5---------6---------7---------}
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company
( http://www.sloctheater.org/ )
-> Technical Advisor, RPI Players
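A minimal sketch of rule 2 in practice: the pool name "tank" is hypothetical,
and the commands are the standard zpool export/import pair.

  # On the original host: cleanly export the pool before pulling its
  # drives, so no system still considers it in use.
  zpool export tank

  # ...physically move the drives to the other host...

  # On the new host: scan the attached devices and import the pool by name.
  zpool import tank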
Cindy Swearingen
2011-Oct-18 16:53 UTC
[zfs-discuss] FS Reliability WAS: about btrfs and zfs
Hi Paul,

Your 1-3 is very sensible advice and I must ask about this statement:

> I have yet to have any data loss with ZFS.

Maybe this goes without saying, but I think you are using ZFS redundancy.

Thanks,

Cindy

On 10/18/11 08:52, Paul Kraus wrote:
> On Tue, Oct 18, 2011 at 9:38 AM, Gregory Shaw <greg.shaw at oracle.com> wrote:
>
>> Another item that made me nervous was my experience with ZFS. Even when
>> called 'ready for production', a number of bugs were found that were pretty nasty.
>> They've since been fixed (years ago), but there were some surprises there that I'd
>> rather not encounter on a Linux system.
>
> I know that I have been really spoiled by UFS. It has been around
> for so long that it has been really optimized, some might even say,
> optimized beyond the point of diminishing returns :-) UFS is amazingly
> stable and has very reasonable performance, given its roots. I did not have
> to live through the early days of UFS and the pain of finding bugs. I
> _am_ living through that with ZFS :-(
>
> Having said that, I have yet to have any data loss with ZFS. I
> have developed a number of simple rules I follow with ZFS:
>
> 1. OS and DATA go on different zpools on different physical drives (if
> at all possible).
>
> 2. Do NOT move drives around without first exporting any zpools on those drives.
>
> 3. Do NOT let a system see drives with more than one OS zpool at the
> same time (I know you _can_ do this safely, but I have seen too many
> horror stories on this list that I just avoid it).
>
On Tue, Oct 18, 2011 at 12:53 PM, Cindy Swearingen
<cindy.swearingen at oracle.com> wrote:

> Your 1-3 is very sensible advice

Unfortunately, I don't think I have ever seen the recommendations I
made stated quite so plainly.

> and I must ask about this statement:
>
>> I have yet to have any data loss with ZFS.
>
> Maybe this goes without saying, but I think you are using
> ZFS redundancy.

Yes. I felt I needed to make that statement about ZFS because there
_have_ been incidents recounted here on this list where people _have_
lost data. I have also seen data loss from HW RAID failures, through no
fault of the people who configured the storage, but through corner cases
the manufacturer did not account for. I have also had an fsck of a UFS
file system come back with lost data. If you work in this business long
enough and in enough varied environments, you will see some data loss.

So far, ZFS is one of the technologies that has not let me down. Of
course, in some cases it has taken weeks if not months to resolve or
work around a "bug" in the code, but in all cases the data was recovered.

-- 
{--------1---------2---------3---------4---------5---------6---------7---------}
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company
( http://www.sloctheater.org/ )
-> Technical Advisor, RPI Players
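One way to follow rule 1 while keeping the ZFS redundancy Cindy refers to is
a separate mirrored data pool on its own disks; the pool name "datapool" and
the device names below are only illustrative.

  # Root pool stays on its own drives; data goes to a separate mirrored pool.
  zpool create datapool mirror c1t2d0 c1t3d0

  # Verify pool health and the redundancy layout.
  zpool status datapool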
> 3. Do NOT let a system see drives with more than one OS zpool at the
> same time (I know you _can_ do this safely, but I have seen too many
> horror stories on this list that I just avoid it).
>

Can you elaborate on #3? In what situation will it happen?

Thanks.

Fred
On Fri, Oct 21, 2011 at 8:02 PM, Fred Liu <Fred_Liu at issi.com> wrote:
>
>> 3. Do NOT let a system see drives with more than one OS zpool at the
>> same time (I know you _can_ do this safely, but I have seen too many
>> horror stories on this list that I just avoid it).
>>
>
> Can you elaborate on #3? In what situation will it happen?

Some people have trained their fingers to use the -f option on every
command that supports it to force the operation. For instance, how
often do you do rm -rf vs. rm -r and answer questions about every
file?

If various zpool commands (import, create, replace, etc.) are used
against the wrong disk with a force option, you can clobber a zpool
that is in active use by another system. In a previous job, my lab
environment had a bunch of LUNs presented to multiple boxes. This was
done for convenience in an environment where there would be little
impact if an errant command were issued. I'd never do that in
production without some form of I/O fencing in place.

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
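A rough illustration of the failure mode Mike describes, with a hypothetical
pool "tank" that is visible to two hosts at once:

  # Without -f, ZFS refuses to import a pool that appears to be in use
  # by (or was last accessed by) another system.
  zpool import tank

  # Forcing it overrides that safety check; if the pool really is active
  # on the other host, both systems will now write to it and the pool can
  # be corrupted beyond repair.
  zpool import -f tank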
Recently someone posted to this list about that _exact_ situation: they
loaded an OS onto a pair of drives while a pair of different drives
containing an OS were still attached. The zpool on the first pair ended
up not being able to be imported, and was corrupted. I can post more
info when I am back in the office on Monday.

On Friday, October 21, 2011, Fred Liu <Fred_Liu at issi.com> wrote:
>
>> 3. Do NOT let a system see drives with more than one OS zpool at the
>> same time (I know you _can_ do this safely, but I have seen too many
>> horror stories on this list that I just avoid it).
>>
>
> Can you elaborate on #3? In what situation will it happen?
>
>
> Thanks.
>
> Fred
>

-- 
{--------1---------2---------3---------4---------5---------6---------7---------}
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company
( http://www.sloctheater.org/ )
-> Technical Advisor, RPI Players
Fajar A. Nugraha
2011-Oct-22 07:03 UTC
[zfs-discuss] FS Reliability WAS: about btrfs and zfs
On Sat, Oct 22, 2011 at 11:36 AM, Paul Kraus <paul at kraus-haus.org> wrote:
> Recently someone posted to this list about that _exact_ situation: they
> loaded an OS onto a pair of drives while a pair of different drives
> containing an OS were still attached. The zpool on the first pair ended
> up not being able to be imported, and was corrupted. I can post more
> info when I am back in the office on Monday.

That is nasty :P

I wonder if Darik's approach for zfsonlinux is better. In Ubuntu's
(currently unofficial) zfs root support, the startup script
force-imports rpool (or whatever pool the user specifies on the kernel
command line, if explicitly specified), and drops to a rescue shell if
there is more than one pool with the same name.

This means:
- no problem with pools previously imported on another system
- no corruption due to duplicate pool names, as when that happens the
user needs to manually take action to import the correct pool by id
- the other pool remains untouched, and (if necessary) the user can
reimport it under a different name

-- 
Fajar

> On Friday, October 21, 2011, Fred Liu <Fred_Liu at issi.com> wrote:
>>
>>> 3. Do NOT let a system see drives with more than one OS zpool at the
>>> same time (I know you _can_ do this safely, but I have seen too many
>>> horror stories on this list that I just avoid it).
>>>
>>
>> Can you elaborate on #3? In what situation will it happen?
>>
>>
>> Thanks.
>>
>> Fred
>>
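When two importable pools share a name (for example, two rpools), the manual
recovery Fajar describes amounts to importing by the pool's numeric GUID and
optionally giving it a new name; the GUID and the name "oldrpool" below are
made up for illustration.

  # List importable pools without importing them; pools with duplicate
  # names are distinguished by their "id:" field.
  zpool import

  # Import the specific pool by GUID, renaming it so it cannot collide
  # with the pool that is already active.
  zpool import 1234567890123456789 oldrpool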
On Sat, Oct 22, 2011 at 12:36 AM, Paul Kraus <paul at kraus-haus.org> wrote:

> Recently someone posted to this list about that _exact_ situation: they
> loaded an OS onto a pair of drives while a pair of different drives
> containing an OS were still attached. The zpool on the first pair ended
> up not being able to be imported, and was corrupted. I can post more
> info when I am back in the office on Monday.

See the thread started on Tue, Aug 2, 2011 at 12:23 PM with a Subject
of "[zfs-discuss] Wrong rpool used after reinstall!", the followups,
and at least one additional related thread.

While I agree that you _should_ be able to have multiple unrelated
boot environments on hard drives at once, it seems prudent to me NOT to
do so. I assume you _can_ manage multiple ZFS based boot environments
using Live Upgrade (or whatever has replaced it in 11). NOTE that I
have not done this (managed multiple ZFS boot environments with Live
Upgrade), but I ASSUME you can.

I suspect that the "root" of this potential problem is in the ZFS
boot code and the use of the same zpool name for multiple zpools at
once. By having the boot loader use the zpool directly, you get the
benefit of the redundancy of ZFS much earlier in the boot process (the
only thing that appears to load off of a single drive is the boot
loader; everything from there on loads from the mirrored zpool, at
least on my NCP 3 system, my first foray into ZFS root). The danger is
that if there are multiple zpools with the same (required) name, the
boot loader may become confused, especially if drives get physically
moved around.

-- 
{--------1---------2---------3---------4---------5---------6---------7---------}
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company
( http://www.sloctheater.org/ )
-> Technical Advisor, RPI Players
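Before attaching drives that carry another installation's root pool, it may
help to record which devices and boot dataset back the pool you actually
booted from; this sketch assumes the conventional root pool name "rpool".

  # Which physical devices make up the running root pool (and its mirror).
  zpool status rpool

  # Which dataset the boot loader is configured to boot from.
  zpool get bootfs rpool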
Paul,

Thanks. I understand now.

Fred

> -----Original Message-----
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-
> bounces at opensolaris.org] On Behalf Of Paul Kraus
> Sent: Monday, October 24, 2011 22:38
> To: ZFS Discussions
> Subject: Re: [zfs-discuss] FS Reliability WAS: about btrfs and zfs
>
> On Sat, Oct 22, 2011 at 12:36 AM, Paul Kraus <paul at kraus-haus.org> wrote:
>
> > Recently someone posted to this list about that _exact_ situation: they
> > loaded an OS onto a pair of drives while a pair of different drives
> > containing an OS were still attached. The zpool on the first pair ended
> > up not being able to be imported, and was corrupted. I can post more
> > info when I am back in the office on Monday.
>
> See the thread started on Tue, Aug 2, 2011 at 12:23 PM with a Subject
> of "[zfs-discuss] Wrong rpool used after reinstall!", the followups,
> and at least one additional related thread.
>
> While I agree that you _should_ be able to have multiple unrelated
> boot environments on hard drives at once, it seems prudent to me NOT to
> do so. I assume you _can_ manage multiple ZFS based boot environments
> using Live Upgrade (or whatever has replaced it in 11). NOTE that I
> have not done this (managed multiple ZFS boot environments with Live
> Upgrade), but I ASSUME you can.
>
> I suspect that the "root" of this potential problem is in the ZFS
> boot code and the use of the same zpool name for multiple zpools at
> once. By having the boot loader use the zpool directly, you get the
> benefit of the redundancy of ZFS much earlier in the boot process (the
> only thing that appears to load off of a single drive is the boot
> loader; everything from there on loads from the mirrored zpool, at
> least on my NCP 3 system, my first foray into ZFS root). The danger is
> that if there are multiple zpools with the same (required) name, the
> boot loader may become confused, especially if drives get physically
> moved around.
>
> --
> {--------1---------2---------3---------4---------5---------6---------7---------}
> Paul Kraus
> -> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
> -> Sound Coordinator, Schenectady Light Opera Company
> ( http://www.sloctheater.org/ )
> -> Technical Advisor, RPI Players
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss at opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
> Some people have trained their fingers to use the -f option on every
> command that supports it to force the operation. For instance, how
> often do you do rm -rf vs. rm -r and answer questions about every
> file?
>
> If various zpool commands (import, create, replace, etc.) are used
> against the wrong disk with a force option, you can clobber a zpool
> that is in active use by another system. In a previous job, my lab
> environment had a bunch of LUNs presented to multiple boxes. This was
> done for convenience in an environment where there would be little
> impact if an errant command were issued. I'd never do that in
> production without some form of I/O fencing in place.
>

I also have that habit, and it is a good practice to bear in mind.

Thanks.

Fred
On Fri, Oct 21, 2011 at 9:33 PM, Mike Gerdts <mgerdts at gmail.com> wrote:

> Some people have trained their fingers to use the -f option on every
> command that supports it to force the operation. For instance, how
> often do you do rm -rf vs. rm -r and answer questions about every
> file?

The last time I tried it, the Sun / Oracle Java Web Console plugin for
ZFS used the -f option on every command issued, including zpool create
and zpool import ... very dangerous in my opinion, which is part of the
reason I don't use that interface.

-- 
{--------1---------2---------3---------4---------5---------6---------7---------}
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company
( http://www.sloctheater.org/ )
-> Technical Advisor, RPI Players