David Dyer-Bennet
2009-Jan-23 15:26 UTC
[zfs-discuss] Is scrubbing "safe" in 101b? (OpenSolaris 2008.11)
I thought I'd noticed that my crashes tended to occur when I was running a scrub, and saw at least one open bug that was scrub-related that could cause such a crash. However, I eventually tracked my problem down (as it got worse) to a bad piece of memory (it's been nearly a week since I replaced the memory, and no more problems).

Which leaves me wondering, how safe is running a scrub? Scrub is one of the things that made ZFS so attractive to me, and my automatic reaction when I first hook up the data disks during a recovery is "run a scrub!".

--
David Dyer-Bennet, dd-b at dd-b.net; http://dd-b.net/
Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/
Photos: http://dd-b.net/photography/gallery/
Dragaera: http://dragaera.info
Casper.Dik at Sun.COM
2009-Jan-23 15:52 UTC
> I thought I'd noticed that my crashes tended to occur when I was running a
> scrub, and saw at least one open bug that was scrub-related that could
> cause such a crash. However, I eventually tracked my problem down (as it
> got worse) to a bad piece of memory (been nearly a week since I replaced
> the memory, and no more problems).

I had a problem, and I think that it was a bad motherboard; it too panicked during a scrub, and it even said "scrub finished" (it went from 50% to finished, immediately). I replaced the system (motherboard, hard disk) and re-ran the scrub; no problem with scrub that time, and it took the amount of time it should have taken.

> Which leaves me wondering, how safe is running a scrub? Scrub is one of
> the things that made ZFS so attractive to me, and my automatic reaction
> when I first hook up the data disks during a recovery is "run a scrub!".

If your memory is bad, anything can happen. A scrub can rewrite bad data, but it can be the case that the disk is fine and the memory is bad. Then, if the data is replicated, it can be copied and rewritten; it is then possible to write incorrect data (and if the checksum needs to be recomputed, then oops).

Casper
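Casper's scenario can be illustrated with a toy model. The Python sketch below is not ZFS code and makes no claim about how ZFS implements scrub internally; it only assumes a two-way mirror whose blocks carry a stored checksum. Every name in it (`scrub_block`, `on_read`, `on_write`, `FlakyOnce`) is hypothetical. The first run shows the normal case, where a copy that is bad on disk is repaired from the good one. The second run shows healthy disks behind faulty RAM: a bit flip on read makes a good copy look bad, and a second flip corrupts the "repair" data on its way back to disk.

```python
import hashlib
from typing import Callable, List

Hook = Callable[[bytes], bytes]


def checksum(data: bytes) -> str:
    """Checksum stored alongside the block (SHA-256 chosen for the sketch)."""
    return hashlib.sha256(data).hexdigest()


def flip_first_bit(data: bytes) -> bytes:
    """Simulate a single-bit error in the first byte."""
    return bytes([data[0] ^ 0x01]) + data[1:]


def scrub_block(copies: List[bytearray], expected: str,
                on_read: Hook = bytes, on_write: Hook = bytes) -> int:
    """Toy scrub of one mirrored block.

    Reads every copy into memory, finds one whose checksum verifies, and
    rewrites any copy that failed verification. The on_read/on_write hooks
    stand in for RAM: identity for healthy memory, a bit flip for bad memory.
    Returns the number of copies rewritten.
    """
    buffers = [on_read(bytes(c)) for c in copies]
    good = next((b for b in buffers if checksum(b) == expected), None)
    if good is None:
        return 0  # no copy verifies, so there is nothing to repair from
    repaired = 0
    for copy, buf in zip(copies, buffers):
        if checksum(buf) != expected:
            copy[:] = on_write(good)  # data passes through RAM again here
            repaired += 1
    return repaired


class FlakyOnce:
    """Hypothetical bad-RAM hook: corrupts only the first buffer it sees."""

    def __init__(self) -> None:
        self.calls = 0

    def __call__(self, data: bytes) -> bytes:
        self.calls += 1
        return flip_first_bit(data) if self.calls == 1 else data


block = b"family photo"
stored = checksum(block)

# Case 1: healthy RAM, one copy really is bad on disk. Scrub repairs it.
mirror = [bytearray(flip_first_bit(block)), bytearray(block)]
repaired_ok = scrub_block(mirror, stored)

# Case 2: both disks are fine, but RAM flips bits on read and on write.
healthy_disks = [bytearray(block), bytearray(block)]
repaired_bad = scrub_block(healthy_disks, stored,
                           on_read=FlakyOnce(), on_write=flip_first_bit)
```

In the second case the copy that was perfectly good on disk ends up holding wrong data, which is the "anything can happen" outcome described above.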
David Dyer-Bennet
2009-Jan-23 16:32 UTC
On Fri, January 23, 2009 09:52, Casper.Dik at Sun.COM wrote:

>> Which leaves me wondering, how safe is running a scrub? Scrub is one of
>> the things that made ZFS so attractive to me, and my automatic reaction
>> when I first hook up the data disks during a recovery is "run a scrub!".
>
> If your memory is bad, anything can happen. A scrub can rewrite bad
> data; but it can be the case that the disk is fine but the memory is
> bad. Then, if the data is replicated it can be copied and rewritten;
> it is then possible to write incorrect data (and if they need to recompute
> the checksum, then oops)

The memory was ECC, so it *should* have mostly detected problems early enough to avoid writing bad data. And so far nothing has been detected as bad in the pool during light use. But I haven't yet run a scrub since fixing the memory, so I have no idea what horrors may be lurking in wait.

The pool is two mirror vdevs, and then I have two backups on external hard drives, and then I have two sets of optical disks of the photos, one of them off-site (I'd lose several months of photos if I had to fall back to the optical disks; I'm a bit behind there). So I'm not yet in great fear of actually losing anything, and have very little risk of actually losing a LOT.

But what I'm wondering is, are there known bugs in 101b that make scrubbing inadvisable with that code? I'd love to *find out* what horrors may be lurking.

--
David Dyer-Bennet
Glenn Lagasse
2009-Jan-23 18:01 UTC
* David Dyer-Bennet (dd-b at dd-b.net) wrote:

> On Fri, January 23, 2009 09:52, Casper.Dik at Sun.COM wrote:
>
>>> Which leaves me wondering, how safe is running a scrub? Scrub is one of
>>> the things that made ZFS so attractive to me, and my automatic reaction
>>> when I first hook up the data disks during a recovery is "run a scrub!".
>>
>> If your memory is bad, anything can happen. A scrub can rewrite bad
>> data; but it can be the case that the disk is fine but the memory is
>> bad. Then, if the data is replicated it can be copied and rewritten;
>> it is then possible to write incorrect data (and if they need to recompute
>> the checksum, then oops)
>
> The memory was ECC, so it *should* have mostly detected problems early
> enough to avoid writing bad data. And so far nothing has been detected as
> bad in the pool during light use. But I haven't yet run a scrub since
> fixing the memory, so I have no idea what horrors may be lurking in wait.
>
> The pool is two mirror vdevs, and then I have two backups on external hard
> drives, and then I have two sets of optical disks of the photos, one of
> them off-site (I'd lose several months of photos if I had to fall back to
> the optical disks, I'm a bit behind there). So I'm not yet in great fear
> of actually losing anything, and have very little risk of actually losing
> a LOT.
>
> But what I'm wondering is, are there known bugs in 101b that make
> scrubbing inadvisable with that code? I'd love to *find out* what horrors
> may be lurking.

There's nothing in the release notes for 2008.11 (based on 101b) about issues running scrub. I've been using 101b for some time now and haven't seen or heard of any issues running scrub.

There's always bugs. But I'm pretty certain there isn't a known 'zfs scrub is inadvisable under any and all conditions' bug laying about. I've certainly not heard of such a thing (and it would be pretty big news for 2008.11 if true).

Cheers,

--
Glenn
David Dyer-Bennet
2009-Jan-23 18:30 UTC
On Fri, January 23, 2009 12:01, Glenn Lagasse wrote:

> * David Dyer-Bennet (dd-b at dd-b.net) wrote:
>
>> But what I'm wondering is, are there known bugs in 101b that make
>> scrubbing inadvisable with that code? I'd love to *find out* what
>> horrors may be lurking.
>
> There's nothing in the release notes for 2008.11 (based on 101b) about
> issues running scrub. I've been using 101b for some time now and
> haven't seen or heard of any issues running scrub.
>
> There's always bugs. But I'm pretty certain there isn't a known 'zfs
> scrub is inadvisable under any and all conditions' bug laying about.
> I've certainly not heard of such a thing (and it would be pretty big
> news for 2008.11 if true).

Thanks. I appreciate the qualifications, and agree that there are, indeed, always bugs.

Okay; I think I'll make one last check of my own through the buglist (so it won't be your fault if I muck up my data :-)), and if nothing turns up that scares me I will run the scrub and see what happens.

Hmmm; maybe update ONE of the two backups first. It's at times like this when I really miss relatively cheap backup tapes (current tapes that I've seen aren't relatively cheap).

--
David Dyer-Bennet
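For anyone following the same plan, "run the scrub and see what happens" comes down to two commands: `zpool scrub <pool>` to start it and `zpool status -v <pool>` to watch progress and list any errors. Below is a minimal Python sketch of wrapping those two calls. The pool name `tank` and the helper names are placeholders, and the check for the line "errors: No known data errors" assumes the status output uses that wording on your build, so treat the helper as a convenience and read the full status text yourself.

```python
import subprocess
from typing import List


def scrub_commands(pool: str) -> List[List[str]]:
    """Argument lists for starting a scrub and for querying its status."""
    return [["zpool", "scrub", pool], ["zpool", "status", "-v", pool]]


def looks_clean(status_text: str) -> bool:
    """True if the status text reports no known data errors.

    Assumes the 'errors: No known data errors' wording; adjust if your
    zpool prints something different.
    """
    return "errors: No known data errors" in status_text


def run_scrub(pool: str) -> str:
    """Start a scrub and return the current status text.

    The scrub runs in the background, so the status returned here normally
    shows it in progress; query the status again later for the final result.
    """
    start, status = scrub_commands(pool)
    subprocess.run(start, check=True)
    result = subprocess.run(status, check=True, capture_output=True, text=True)
    return result.stdout


# Hypothetical pool name; nothing is executed until run_scrub() is called.
commands = scrub_commands("tank")
```

Calling `run_scrub("tank")` needs a real pool and the privileges to scrub it. Updating one of the backups first, as planned above, remains the cheap insurance.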