A brief search didn't show anything relevant, so here goes:

Would it be feasible to support a scrub per-filesystem rather than per-pool?

The reason is that on a large system, a scrub of a pool can take excessively long (and, indeed, may never complete). Running a scrub on each filesystem allows it to be broken up into smaller chunks, which would be much easier to arrange. (For example, I could scrub one filesystem a night and not have it run into working hours.)

Another reason might be that I have both busy and quiet filesystems. The busy ones are regularly backed up, and the data regularly read anyway; the quiet ones are neither read nor backed up, so it would be nice to be able to validate those.

-- 
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
Peter,

That's a great suggestion. And as fortune would have it, we have the code to do it already. Scrubbing in ZFS is driven from the logical layer, not the physical layer. When you scrub a pool, you're really just scrubbing the pool-wide metadata, then scrubbing each filesystem.

At 50,000 feet, it's as simple as adding a zfs(1M) scrub subcommand and having it invoke the already-existing DMU traverse interface.

Closer to ground, there are a few details to work out -- we need an option to specify whether to include snapshots, whether to descend recursively (in the case of nested filesystems), and how to handle branch points (which are created by clones). Plus we need some way to name the MOS (meta-object set, which is where we keep all pool metadata) so you can ask to scrub only that.

Sounds like a nice tidy project for a summer intern!

Jeff

On Sat, Mar 29, 2008 at 05:14:20PM +0000, Peter Tribble wrote:
> Would it be feasible to support a scrub per-filesystem
> rather than per-pool?
> [...]
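[To make the 50,000-foot description above concrete, here is a minimal sketch in C of the shape such a per-dataset scrub might take. The names traverse_dataset(), scrub_read(), and scrub_blkptr_cb() are stand-ins for whatever the gate's actual DMU traverse entry point and scrub I/O path are called; this is an illustration of the idea, not code from onnv.]

    /*
     * Illustrative sketch only: traverse_dataset() and scrub_read()
     * are assumed names, not the literal gate API.
     */
    static int
    scrub_blkptr_cb(spa_t *spa, blkptr_t *bp, void *arg)
    {
        /*
         * Read the block with checksum verification; a bad copy is
         * detected here and can be repaired from a good one.
         */
        return (scrub_read(spa, bp));
    }

    static int
    zfs_scrub_dataset(dsl_dataset_t *ds)
    {
        /*
         * Walk every block reachable from this dataset's object
         * set -- its own metadata tree -- invoking the callback on
         * each block pointer exactly once.
         */
        return (traverse_dataset(ds, scrub_blkptr_cb, NULL));
    }

[From the command line, the user-visible half would presumably look something like "zfs scrub tank/home" -- syntax invented here, mirroring the existing zpool scrub usage.]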
I would be very happy having a filesystem-based zfs scrub.

We have an 18TB zpool, and it takes more than 2 days to do the scrub. Since we cannot take snapshots during the scrub, this is unacceptable.

Kristof
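[A quick back-of-envelope on those numbers, assuming the pool is mostly full: scrubbing 18 TB in exactly two days would mean a sustained average of roughly 18 x 10^12 bytes / 172,800 s, or about 104 MB/s across the whole pool; taking longer than two days means the effective scrub rate is below that. Per-filesystem scrubbing wouldn't reduce the total work, but it would let it be split into separate windows.]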
kristof wrote:
> I would be very happy having a filesystem-based zfs scrub.
>
> We have an 18TB zpool, and it takes more than 2 days to do the scrub.
> Since we cannot take snapshots during the scrub, this is unacceptable.

We have recently discovered the same issue on one of our internal build machines. We have a daily bringover of the Teamware onnv-gate that is snapshotted when it completes, and as such we can never run a full scrub. Given some of our storage is reaching (or past) EOSL, I really want to be able to scrub the important datasets (i.e. all those other than the clones of onnv).

-- 
Darren J Moffat
On Mar 31, 2008, at 10:41 AM, kristof wrote:
> I would be very happy having a filesystem-based zfs scrub.
>
> We have an 18TB zpool, and it takes more than 2 days to do the scrub.
> Since we cannot take snapshots during the scrub, this is unacceptable.

While per-dataset scrubbing would certainly be a coarse-grained solution to your problem, work is underway to address the problematic interaction between scrubs and snapshots.

Adam

-- 
Adam Leventhal, Fishworks
http://blogs.sun.com/ahl
zfs-discuss-bounces at opensolaris.org wrote on 04/01/2008 04:25:39 AM:

> kristof wrote:
> > I would be very happy having a filesystem-based zfs scrub.
> >
> > We have an 18TB zpool, and it takes more than 2 days to do the scrub.
> > Since we cannot take snapshots during the scrub, this is unacceptable.
>
> We have recently discovered the same issue on one of our internal build
> machines. We have a daily bringover of the Teamware onnv-gate that is
> snapshotted when it completes, and as such we can never run a full scrub.
> Given some of our storage is reaching (or past) EOSL, I really want to
> be able to scrub the important datasets (i.e. all those other than the
> clones of onnv).
>
> --
> Darren J Moffat

Aye, or better yet -- give the scrub/resilver/snap reset issue fix very high priority. As it stands, snapshots are impossible when you need to resilver and scrub (even on supposedly Sun-supported Thumper configs).

-Wade Stuart
>> We have recently discovered the same issue on one of our internal build
>> machines. We have a daily bringover of the Teamware onnv-gate that is
>> snapshotted when it completes, and as such we can never run a full scrub.
>> Given some of our storage is reaching (or past) EOSL, I really want to
>> be able to scrub the important datasets (i.e. all those other than the
>> clones of onnv).
>
> Aye, or better yet -- give the scrub/resilver/snap reset issue fix very
> high priority. As it stands, snapshots are impossible when you need to
> resilver and scrub (even on supposedly Sun-supported Thumper configs).

Since the scrub walks down the metadata tree, and the filesystem definitions sit somewhere near the top of that tree, it shouldn't be too hard to make the scrub start from that point instead of at the uberblock, should it?

-mg
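[Roughly right. A whole-pool scrub and a per-dataset scrub would differ mainly in which block pointer the walk is rooted at. In sketch form -- scrub_from() is a made-up name, though spa_uberblock.ub_rootbp and ds_phys->ds_bp are the fields as I recall them from the OpenSolaris source:]

    /*
     * scrub_from() is hypothetical; the field names below are
     * meant to follow the actual on-disk structures.
     */

    /*
     * Whole-pool scrub: root the walk at the uberblock's root
     * block pointer, from which the MOS and every dataset hang.
     */
    scrub_from(spa, &spa->spa_uberblock.ub_rootbp);

    /*
     * Per-filesystem scrub: root the walk at one dataset's objset
     * block pointer instead, touching nothing else in the pool.
     */
    scrub_from(spa, &ds->ds_phys->ds_bp);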
> Aye, or better yet -- give the scrub/resilver/snap reset issue fix very
> high priority. As it stands, snapshots are impossible when you need to
> resilver and scrub (even on supposedly Sun-supported Thumper configs).

No argument. One of our top engineers is working on this as we speak. I say we all buy him a drink when he integrates the fix.

Jeff
Jeff Bonwick <Jeff.Bonwick at sun.com> wrote on 04/05/2008 01:33:05 AM:

> > Aye, or better yet -- give the scrub/resilver/snap reset issue fix very
> > high priority. As it stands, snapshots are impossible when you need to
> > resilver and scrub (even on supposedly Sun-supported Thumper configs).
>
> No argument. One of our top engineers is working on this as we speak.
> I say we all buy him a drink when he integrates the fix.
>
> Jeff

Let me know where to send the 6 pack.

-Wade
Jeff,

On Mon, Mar 31, 2008 at 9:01 AM, Jeff Bonwick <Jeff.Bonwick at sun.com> wrote:
> Peter,
>
> That's a great suggestion. And as fortune would have it, we have the
> code to do it already. Scrubbing in ZFS is driven from the logical
> layer, not the physical layer. When you scrub a pool, you're really
> just scrubbing the pool-wide metadata, then scrubbing each filesystem.

Thanks for the encouraging response. I was hoping that it would be little more than starting the traversal in the correct place! I've logged CR 6685106 to cover this request.

> At 50,000 feet, it's as simple as adding a zfs(1M) scrub subcommand
> and having it invoke the already-existing DMU traverse interface.
>
> Closer to ground, there are a few details to work out -- we need an
> option to specify whether to include snapshots, whether to descend
> recursively (in the case of nested filesystems), and how to handle
> branch points (which are created by clones). Plus we need some way
> to name the MOS (meta-object set, which is where we keep all pool
> metadata) so you can ask to scrub only that.

Devil's in the details and all that...

> Sounds like a nice tidy project for a summer intern!
>
> Jeff
> [...]

-- 
-Peter Tribble
http://www.petertribble.co.uk/ - http://ptribble.blogspot.com/
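[For what it's worth, the details Jeff lists map fairly directly onto option parsing in the zfs(1M) command. A purely hypothetical sketch of what a zfs_do_scrub() handler might look like inside zfs_main.c -- none of these flags exist, scrub_dataset() is invented, and only the getopt() pattern mirrors how existing zfs subcommands parse their options:]

    /*
     * Hypothetical handler for a "zfs scrub" subcommand.  Flag
     * letters and scrub_dataset() are invented for illustration.
     * Assumes the usual zfs_main.c surroundings (usage(), libzfs).
     */
    static int
    zfs_do_scrub(int argc, char **argv)
    {
        boolean_t recurse = B_FALSE;    /* -r: descend into children */
        boolean_t snapshots = B_FALSE;  /* -s: include snapshots */
        int c;

        while ((c = getopt(argc, argv, "rs")) != -1) {
            switch (c) {
            case 'r':
                recurse = B_TRUE;
                break;
            case 's':
                snapshots = B_TRUE;
                break;
            default:
                usage(B_FALSE);
            }
        }
        argc -= optind;
        argv += optind;

        if (argc != 1)
            usage(B_FALSE);

        /*
         * A reserved name (say, "MOS") could let the user ask to
         * scrub only the pool-wide metadata object set.
         */
        return (scrub_dataset(argv[0], recurse, snapshots));
    }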