Can btrfs deal reasonably gracefully with sudden shutdowns? (I'm mainly thinking of power outages which lead to logical structure damage but not physical media damage.)

What would be the risk points, file-system-wise?

Can for example a rotating snapshot schedule mitigate some or all issues relating to sudden shutdowns, if any? (_For example_, take a snapshot every minute, keeping the last five; if the main file system fails to mount, then could the most recent usable snapshot be used as a fallback, or is it likely to be equally damaged or inconsistent?)

Obviously a UPS or other form of fallback power is preferable to no UPS if power outages are a concern, so as to allow a controlled system shutdown (or fail-over to a more long-term backup power supply) in the event of a prolonged power outage, but I'm wondering about situations where such don't exist or even fail.

-- 
Michael Kjörling • http://michael.kjorling.se • michael@kjorling.se
“People who think they know everything really annoy those of us who know we don’t.” (Bjarne Stroustrup)
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Nov 06, 2012 at 12:33:08PM +0000, Michael Kjörling wrote:
> Can btrfs deal reasonably gracefully with sudden shutdowns? (I'm
> mainly thinking of power outages which lead to logical structure
> damage but not physical media damage.)

In theory (i.e. by the design of the FS), you should be able to pull the plug on btrfs at any point, and the FS will always be consistent. This makes some assumptions:

 - That writing a single page to the FS is atomic.
 - That the hardware reports barriers to the OS reliably, i.e. if the hardware says it's fully stored data without losing it, then it actually has.

There are also some caveats: while the FS should always be consistent, the latest transaction write may not have been completed, so you could potentially lose up to 30 seconds of writes to the FS from immediately before the crash.

If the FS does corrupt over a power failure, and the hardware can be demonstrated to be good, then we have a bug that needs to be tracked down. (There have been a number of these over the development of the FS so far, but they do get fixed.)

> What would be the risk points, file-system-wise?
>
> Can for example a rotating snapshot schedule mitigate some or all
> issues relating to sudden shutdowns, if any? (_For example_, take a
> snapshot every minute, keeping the last five; if the main file system
> fails to mount, then could the most recent usable snapshot be used as
> a fallback, or is it likely to be equally damaged or inconsistent?)

No, snapshots give you no additional guarantees -- if the FS corrupts and is unmountable, a snapshot is part of the same FS and will also be unmountable.

> Obviously a UPS or other form of fallback power is preferable to no
> UPS if power outages are a concern, so as to allow a controlled system
> shutdown (or fail-over to a more long-term backup power supply) in the
> event of a prolonged power outage, but I'm wondering about situations
> where such don't exist or even fail.

As I said above, the FS structures _should_ be completely reliable in the face of power loss; that they haven't been in the past is definitely a bug, and those bugs have been / are being fixed as they're found. We've had very few transid match failures recently, which used to be the main failure mode for these bugs. I don't know whether that's because people aren't reporting them, or because they're not happening nearly so often these days. I suspect the latter.

I guess the question for you is: are you after the _expected_ behaviour of the FS (should always be consistent on good hardware, but you may lose up to 30 seconds of writes), or are you after mitigation strategies in the face of FS bugs (keep off-site backups and be prepared to use them)?

Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ==
PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- emacs: Eighty Megabytes And Constantly Swapping. ---
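For reference, the rotation asked about in the question (a snapshot every minute, keeping the last five) could be sketched roughly as below. This is only a minimal sketch and was not posted in the thread: the `snap-YYYYmmdd-HHMMSS` naming scheme, the paths and the helper names are all hypothetical, and the functions only build `btrfs subvolume snapshot` and `btrfs subvolume delete` argument lists without running them. As the reply says, such snapshots live in the same file system, so they do not help if the file system itself becomes unmountable.

```python
from datetime import datetime
from typing import List

# Hypothetical naming convention for snapshot directories.
SNAP_FORMAT = "snap-%Y%m%d-%H%M%S"


def snapshot_command(source: str, snap_dir: str, now: datetime) -> List[str]:
    """Build the argv for a read-only snapshot of `source` under `snap_dir`."""
    dest = f"{snap_dir}/{now.strftime(SNAP_FORMAT)}"
    return ["btrfs", "subvolume", "snapshot", "-r", source, dest]


def snapshots_to_delete(names: List[str], keep: int = 5) -> List[str]:
    """Return the snapshot names to prune (oldest first), keeping the newest `keep`."""
    ordered = sorted(names, key=lambda n: datetime.strptime(n, SNAP_FORMAT))
    if keep <= 0:
        return ordered
    return ordered[:-keep]


def delete_commands(snap_dir: str, names: List[str]) -> List[List[str]]:
    """Build one `btrfs subvolume delete` argv per snapshot name."""
    return [["btrfs", "subvolume", "delete", f"{snap_dir}/{n}"] for n in names]
```

A cron job or timer could call `snapshot_command` once a minute and pass the resulting lists to `subprocess.run`; building the commands separately from running them keeps the pruning logic easy to check on its own.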
On Tue, Nov 06, 2012 at 12:33:08PM +0000, Michael Kjörling wrote:
> Can btrfs deal reasonably gracefully with sudden shutdowns? (I'm
> mainly thinking of power outages which lead to logical structure
> damage but not physical media damage.)
>

AFAIK, yes, because btrfs is COW by nature, which means you can at least roll back to the latest stable state.

> What would be the risk points, file-system-wise?
>

Data loss is possible if you're not writing with O_SYNC or doing fsync after a write.

> Can for example a rotating snapshot schedule mitigate some or all
> issues relating to sudden shutdowns, if any? (_For example_, take a
> snapshot every minute, keeping the last five; if the main file system
> fails to mount, then could the most recent usable snapshot be used as
> a fallback, or is it likely to be equally damaged or inconsistent?)
>

In your case, when we finish creating a snapshot, the whole FS is in a stable state (both metadata and data are safely written to disk). So yes, you can use the latest snapshot as a fallback or backup or something.

I'd note here that btrfs somewhat suffers from ENOSPC cases, where it may recover itself or put you into a read-only state, but your data is safe at least.

thanks,
liubo
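The point above about O_SYNC and fsync can be illustrated with a small sketch. It assumes a POSIX system (Linux here) and uses only Python's standard `os` module; the helper name `durable_write` is made up for illustration. The idea is that data an application cares about should be flushed explicitly rather than left to the next periodic commit.

```python
import os


def durable_write(path: str, data: bytes) -> None:
    """Write `data` to `path` and ask the kernel to flush it to stable storage."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while len(view) > 0:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Flush file data and metadata before reporting success.
        os.fsync(fd)
    finally:
        os.close(fd)

    # Also fsync the containing directory so the new directory entry is
    # flushed too (works on Linux; an assumption for other platforms).
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

Opening the file with `os.O_SYNC` added to the flags would be the other option mentioned; it makes every write synchronous, whereas an explicit `os.fsync` lets the application choose when to pay that cost. Either way this relies on the hardware honouring flushes, as discussed elsewhere in the thread.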
On 6 Nov 2012 12:48 +0000, from hugo@carfax.org.uk (Hugo Mills):
> There are also some caveats: while the FS should always be
> consistent, the latest transaction write may not have been completed,
> so you could potentially lose up to 30 seconds of writes to the FS
> from immediately before the crash.

I'd rather lose the most recent 30 seconds of writes but have a consistent file system with as-consistent-as-can-be-expected data, than end up with a corrupted file system.

On that note; can this value be tuned currently, is it hardcoded, or is it stored in metadata somewhere but the tooling to tune it is not yet available?

> If the FS does corrupt over a power failure, and the hardware can
> be demonstrated to be good, then we have a bug that needs to be
> tracked down. (There have been a number of these over the development
> of the FS so far, but they do get fixed).

Is there a simple way to tell ahead of time whether the hardware meets the assumptions made by the file system with regards to write barriers etc.?

> I guess the question for you is: are you after the _expected_
> behaviour of the FS (should always be consistent on good hardware, but
> you may lose up to 30 seconds of writes), or are you after mitigation
> strategies in the face of FS bugs (keep off-site backups and be
> prepared to use them)?

I already have full, daily on-site backups on an external drive that is logically unmounted except for when backups are running, as well as partial off-site backups to cloud storage - and of course, taking advantage of btrfs's snapshotting support there is no real reason why I couldn't increase the backup frequency while retaining data consistency. Losing half a minute of writes is fairly inconsequential for personal use as long as the file system remains consistent, and in the face of disastrous corruption it is at least possible to do a full restore to bare metal from rescue media and backup without losing too much. Not trivial time-wise (that's currently 1.4 TB over USB 2.0), but possible.

-- 
Michael Kjörling • http://michael.kjorling.se • michael@kjorling.se
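To put "not trivial time-wise" into numbers, here is a rough back-of-the-envelope sketch. The 480 Mbit/s figure is the USB 2.0 high-speed signalling rate, so it gives a lower bound that real transfers will not reach; the 30 MB/s "practical" rate is purely an assumption for illustration, not a figure from the thread, and 1.4 TB is taken as decimal terabytes.

```python
def transfer_hours(size_bytes: float, rate_bytes_per_s: float) -> float:
    """Hours needed to move `size_bytes` at a sustained `rate_bytes_per_s`."""
    return size_bytes / rate_bytes_per_s / 3600


BACKUP_SIZE = 1.4e12               # 1.4 TB, decimal units assumed
USB2_SIGNALLING_RATE = 480e6 / 8   # 480 Mbit/s = 60 MB/s, theoretical ceiling
ASSUMED_PRACTICAL_RATE = 30e6      # assumption: 30 MB/s sustained

best_case = transfer_hours(BACKUP_SIZE, USB2_SIGNALLING_RATE)
assumed_case = transfer_hours(BACKUP_SIZE, ASSUMED_PRACTICAL_RATE)

print(f"theoretical minimum: {best_case:.1f} h")
print(f"at the assumed rate: {assumed_case:.1f} h")
```

So even at the bus's theoretical ceiling a full restore takes roughly six and a half hours, and about twice that at the assumed practical rate.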
On Tue, Nov 06, 2012 at 01:47:02PM +0000, Michael Kjörling wrote:
> On 6 Nov 2012 12:48 +0000, from hugo@carfax.org.uk (Hugo Mills):
> > There are also some caveats: while the FS should always be
> > consistent, the latest transaction write may not have been completed,
> > so you could potentially lose up to 30 seconds of writes to the FS
> > from immediately before the crash.
>
> I'd rather lose the most recent 30 seconds of writes but have a
> consistent file system with as-consistent-as-can-be-expected data,
> than end up with a corrupted file system.
>
> On that note; can this value be tuned currently, is it hardcoded, or
> is it stored in metadata somewhere but the tooling to tune it is not
> yet available?

As far as I understand, no, it's hard-coded.

> > If the FS does corrupt over a power failure, and the hardware can
> > be demonstrated to be good, then we have a bug that needs to be
> > tracked down. (There have been a number of these over the development
> > of the FS so far, but they do get fixed).
>
> Is there a simple way to tell ahead of time whether the hardware meets
> the assumptions made by the file system with regards to write barriers
> etc.?

"Most" hardware does. I think there's a "barriers disabled" warning in the kernel logs on mounting the FS, and some time ago there were rumours of a tool to check for it (from Red Hat, but I don't know if it ever saw the light of day).

That's all for the case where the hardware explicitly states that it doesn't support barriers. More concerning is the out-of-spec hardware which claims to support barriers and utterly fails to do so. I don't think there's much you can do to detect that case, other than force failures and try to catch it out -- then return it to the manufacturer under whatever consumer protection laws you have, on the grounds that it's not fit for purpose. I think the number of hard disks that actually do this is fairly small, but they are out there. I'm not aware of a blacklist/quirks list for them.

> > I guess the question for you is: are you after the _expected_
> > behaviour of the FS (should always be consistent on good hardware, but
> > you may lose up to 30 seconds of writes), or are you after mitigation
> > strategies in the face of FS bugs (keep off-site backups and be
> > prepared to use them)?
>
> I already have full, daily on-site backups on an external drive that
> is logically unmounted except for when backups are running, as well as
> partial off-site backups to cloud storage - and of course, taking
> advantage of btrfs's snapshotting support there is no real reason why
> I couldn't increase the backup frequency while retaining data
> consistency. Losing half a minute of writes is fairly inconsequential
> for personal use as long as the file system remains consistent, and in
> the face of disastrous corruption it is at least possible to do a full
> restore to bare metal from rescue media and backup without losing too
> much. Not trivial time-wise (that's currently 1.4 TB over USB 2.0),
> but possible.

OK, so I hope I've managed to answer your question satisfactorily. Let us know if there are any outstanding queries you want cleared up. :)

Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ==
PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- "I will not be pushed, filed, stamped, indexed, briefed, ---
debriefed or numbered. My life is my own."
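One narrow check that can be scripted is whether a btrfs file system has been mounted with barriers explicitly turned off. The sketch below is an illustration only: it parses text in the `/proc/mounts` format and looks for the `nobarrier` mount option, on the assumption that the option shows up there when it is in effect. The function name is hypothetical. It says nothing about the more worrying case described above, where hardware claims to honour barriers and does not.

```python
from typing import List


def btrfs_mounts_without_barriers(mounts_text: str) -> List[str]:
    """Return mount points of btrfs file systems mounted with `nobarrier`.

    `mounts_text` is expected in /proc/mounts format:
    device, mount point, fs type, comma-separated options, dump, pass.
    """
    flagged = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        _device, mount_point, fs_type, options = fields[:4]
        if fs_type == "btrfs" and "nobarrier" in options.split(","):
            flagged.append(mount_point)
    return flagged
```

On a live system this could be fed with `open("/proc/mounts").read()`; an empty result only means that barriers were not disabled by a mount option.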
Michael Kjörling <michael <at> kjorling.se> writes:
>
> Can btrfs deal reasonably gracefully with sudden shutdowns? (I'm
> mainly thinking of power outages which lead to logical structure
> damage but not physical media damage.)
>

Really rather well! We've had a sequence of power-cuts around here and I've scrubbed each time, finding only one corruption overall, which was fixed by the scrub with no data lost.