We all know that data corruption may happen, even on the most reliable of hardware. That's why zfs has pool scrubbing.

Could we introduce a zpool option (as in zpool set <optionname> <pool>) for "scrub period", in "number of hours" (with 0 being no automatic scrubbing)?

I see several modern RAID controllers (such as the LSI Megaraid MFI line) have such features (called "patrol reads") already built into them. Why shouldn't zfs have the same? Having the zpool automagically handle this (probably a good thing to default it to 168 hours, or one week) would also mean that the scrubbing feature is independent of cron, and since scrub already has lower priority than ... actual work, it really shouldn't annoy anybody (except those having their server under their bed).

Of course I'm more than willing to stand corrected if someone can tell me where this is already implemented, or why it's not needed. Proper flames over this should start with a "warning, flame" header, so I can don my asbestos longjohns. ;)

//Svein

--
Sending mail from a temporary set up workstation, as my primary W500 is off for service. PGP not installed.
On Sat, Mar 20, 2010 at 4:07 PM, Svein Skogen <svein at stillbilde.net> wrote:

> We all know that data corruption may happen, even on the most reliable of hardware. That's why zfs has pool scrubbing.
>
> Could we introduce a zpool option (as in zpool set <optionname> <pool>) for "scrub period", in "number of hours" (with 0 being no automatic scrubbing).
>
> I see several modern RAID controllers (such as the LSI Megaraid MFI line) have such features (called "patrol reads") already built into them. Why shouldn't zfs have the same? Having the zpool automagically handle this (probably a good thing to default it to 168 hours, or one week) would also mean that the scrubbing feature is independent of cron, and since scrub already has lower priority than ... actual work, it really shouldn't annoy anybody (except those having their server under their bed).
>
> Of course I'm more than willing to stand corrected if someone can tell me where this is already implemented, or why it's not needed. Proper flames over this should start with a "warning, flame" header, so I can don my asbestos longjohns. ;)

That would add unnecessary code to the ZFS layer for something that cron can handle in one line.

Someone could hack zfs.c to automatically handle editing the crontab, but I don't know if it's worth the effort.

Are you worried that cron will fail, or is it just an aesthetic requirement?

--
Giovanni
On 20.03.2010 20:53, Giovanni Tirloni wrote:

> On Sat, Mar 20, 2010 at 4:07 PM, Svein Skogen <svein at stillbilde.net> wrote:
>
>> We all know that data corruption may happen, even on the most reliable of hardware. That's why zfs has pool scrubbing.
>>
>> Could we introduce a zpool option (as in zpool set <optionname> <pool>) for "scrub period", in "number of hours" (with 0 being no automatic scrubbing).
>>
>> I see several modern RAID controllers (such as the LSI Megaraid MFI line) have such features (called "patrol reads") already built into them. Why shouldn't zfs have the same? Having the zpool automagically handle this (probably a good thing to default it to 168 hours, or one week) would also mean that the scrubbing feature is independent of cron, and since scrub already has lower priority than ... actual work, it really shouldn't annoy anybody (except those having their server under their bed).
>>
>> Of course I'm more than willing to stand corrected if someone can tell me where this is already implemented, or why it's not needed. Proper flames over this should start with a "warning, flame" header, so I can don my asbestos longjohns. ;)
>
> That would add unnecessary code to the ZFS layer for something that cron can handle in one line.

It would add some code, but it could quite possibly reside in the same area that already handles automatic rebuilds (for hot spares). The reason I'm thinking it belongs there is that ZFS has a rather good counter of "how many hours of runtime", while cron only has knowledge of wall time.

> Someone could hack zfs.c to automatically handle editing the crontab but I don't know if it's worth the effort.

This would be a possible workaround, but there are ... several implementations of cron out there with more than one syntax...

> Are you worried that cron will fail or is it just an aesthetic requirement ?

No, I'm thinking more along the lines of "zfs could be ported to pure storage boxes that don't really need a lot of other daemons running" (ZFS and cronstar with a decent management frontend would beat a _LOT_ of the cheap NAS/SAN boxes out there). Besides, I don't like "relying on external software" for filesystem services. ;) (And you can call that aesthetic if you like.)

//Svein
On Mar 20, 2010, at 12:07 PM, Svein Skogen wrote:

> We all know that data corruption may happen, even on the most reliable of hardware. That's why zfs has pool scrubbing.
>
> Could we introduce a zpool option (as in zpool set <optionname> <pool>) for "scrub period", in "number of hours" (with 0 being no automatic scrubbing).

Currently you can do this with cron, of course (or at). The ZFS-based appliances in the market offer simple ways to manage such jobs -- NexentaStor, Oracle's Sun OpenStorage, etc.

> I see several modern RAID controllers (such as the LSI Megaraid MFI line) have such features (called "patrol reads") already built into them. Why shouldn't zfs have the same? Having the zpool automagically handle this (probably a good thing to default it to 168 hours, or one week) would also mean that the scrubbing feature is independent of cron, and since scrub already has lower priority than ... actual work, it really shouldn't annoy anybody (except those having their server under their bed).
>
> Of course I'm more than willing to stand corrected if someone can tell me where this is already implemented, or why it's not needed. Proper flames over this should start with a "warning, flame" header, so I can don my asbestos longjohns. ;)

Prepare your longjohns! Ha!
Just kidding... the solution exists, just turn it on. And remember the UNIX philosophy.
http://en.wikipedia.org/wiki/Unix_philosophy

-- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com
I'm not sure I like this at all. Some of my pools take hours to scrub. I have a cron job run scrubs in sequence... Start one pool's scrub and then poll until it's finished, start the next and wait, and so on so I don't create too much load and bring all I/O to a crawl.

The job is launched once a week, so the scrubs have plenty of time to finish. :)

Scrubs every hour? Some of my pools would be in continuous scrub.

--
This message posted from opensolaris.org
On Sat, Mar 20, 2010 at 4:00 PM, Richard Elling <richard.elling at gmail.com> wrote:

> On Mar 20, 2010, at 12:07 PM, Svein Skogen wrote:
>
>> We all know that data corruption may happen, even on the most reliable of hardware. That's why zfs has pool scrubbing.
>>
>> Could we introduce a zpool option (as in zpool set <optionname> <pool>) for "scrub period", in "number of hours" (with 0 being no automatic scrubbing).
>
> Currently you can do this with cron, of course (or at). The ZFS-based appliances in the market offer simple ways to manage such jobs -- NexentaStor, Oracle's Sun OpenStorage, etc.

Right, but I rather agree with Svein. It would be nice to have it integrated. I would argue that, at the very least, it should become an integrated service much like auto-snapshot (which could/was also done from cron). Doing a basic cron job means that if you have lots of pools, you might start triggering several scrubs at the same time, which may or may not crush the system with I/O load. So the answer is "well then query to see if the last scrub is done", and suddenly we've gone from a simple cron job to custom scripting based on what could be a myriad of variables.

>> I see several modern RAID controllers (such as the LSI Megaraid MFI line) have such features (called "patrol reads") already built into them. Why shouldn't zfs have the same? Having the zpool automagically handle this (probably a good thing to default it to 168 hours, or one week) would also mean that the scrubbing feature is independent of cron, and since scrub already has lower priority than ... actual work, it really shouldn't annoy anybody (except those having their server under their bed).
>>
>> Of course I'm more than willing to stand corrected if someone can tell me where this is already implemented, or why it's not needed. Proper flames over this should start with a "warning, flame" header, so I can don my asbestos longjohns. ;)
>
> Prepare your longjohns! Ha!
> Just kidding... the solution exists, just turn it on. And remember the UNIX philosophy.
> http://en.wikipedia.org/wiki/Unix_philosophy
> -- richard

Funny (ironic?) you'd quote the UNIX philosophy when the Linux folks have been running around since day one claiming the basic concept of ZFS flies in the face of that very concept. Rather than do one thing well, it's unifying two things (file system and raid/disk management) into one. :)

--Tim
On Sat, Mar 20, 2010 at 5:00 PM, Gary Gendel <gary at genashor.com> wrote:

> I'm not sure I like this at all. Some of my pools take hours to scrub. I have a cron job run scrubs in sequence... Start one pool's scrub and then poll until it's finished, start the next and wait, and so on so I don't create too much load and bring all I/O to a crawl.
>
> The job is launched once a week, so the scrubs have plenty of time to finish. :)
>
> Scrubs every hour? Some of my pools would be in continuous scrub.

Who said anything about scrubs every hour? I see he mentioned hours being the granularity of the frequency, but that hardly means you'd HAVE to run scrubs every hour. Nobody is stopping you from setting it to 3600 hours if you so choose.

--Tim
On 20.03.2010 23:00, Gary Gendel wrote:

> I'm not sure I like this at all. Some of my pools take hours to scrub. I have a cron job run scrubs in sequence... Start one pool's scrub and then poll until it's finished, start the next and wait, and so on so I don't create too much load and bring all I/O to a crawl.
>
> The job is launched once a week, so the scrubs have plenty of time to finish. :)
>
> Scrubs every hour? Some of my pools would be in continuous scrub.

If I'm not mistaken, I suggested a default value of 168 hours, which is ... a week. ;)

//Svein
On Sat, 20 Mar 2010, Tim Cook wrote:

> Funny (ironic?) you'd quote the UNIX philosophy when the Linux folks have been running around since day one claiming the basic concept of ZFS flies in the face of that very concept. Rather than do one thing well, it's unifying two things (file system and raid/disk management) into one. :)

Most software introduced in Linux clearly violates the "UNIX philosophy". Instead of small and simple parts we have huge and complex parts, with many programs requiring 70 or 80 libraries in order to run. Zfs's intermingling of layers is benign in comparison.

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
On Sat, Mar 20, 2010 at 5:36 PM, Bob Friesenhahn <bfriesen at simple.dallas.tx.us> wrote:

> On Sat, 20 Mar 2010, Tim Cook wrote:
>
>> Funny (ironic?) you'd quote the UNIX philosophy when the Linux folks have been running around since day one claiming the basic concept of ZFS flies in the face of that very concept. Rather than do one thing well, it's unifying two things (file system and raid/disk management) into one. :)
>
> Most software introduced in Linux clearly violates the "UNIX philosophy". Instead of small and simple parts we have huge and complex parts, with many programs requiring 70 or 80 libraries in order to run. Zfs's intermingling of layers is benign in comparison.
>
> Bob

You can take that up with them :) I'm just pointing out the obvious irony of claiming separation as an excuse for not adding features when the product is based on the very idea of unification of layers/features/functionality.

--Tim
To add my 0.2 cents...

I think starting/stopping scrub belongs to cron, smf, etc. and not to zfs itself.

However, what would be nice to have is an ability to freeze/resume a scrub and also limit its rate of scrubbing. One of the reasons is that when working in SAN environments one has to take into account more than just the server where a scrub will be running: while it might not impact the server, it might cause an issue for others, etc.

--
Robert Milkowski
http://milek.blogspot.com
Edward Ned Harvey
2010-Mar-21  13:20 UTC
[zfs-discuss] Proposition of a new zpool property.
> That would add unnecessary code to the ZFS layer for something that cron can handle in one line.

Actually ... Why should there be a ZFS property to share NFS, when you can already do that with "share" and "dfstab"? And still the zfs property exists.

I think the proposed existence of a ZFS scrub period property makes just as much sense. Which is to say, I don't see any point to either one. ;-) Personally, I use dfstab, and cron.
Edward Ned Harvey
2010-Mar-21  13:26 UTC
[zfs-discuss] Proposition of a new zpool property.
> Most software introduced in Linux clearly violates the "UNIX philosophy".

Hehehe, don't get me started on OSX. ;-) And for the love of all things sacred, never say OSX is not UNIX. I made that mistake once. Which is not to say I was proven wrong or anything - but it's apparently a subject that people are oversensitive and emotional about. Seriously out of control. Avoid the subject. Please.
Casper.Dik at Sun.COM
2010-Mar-21  13:58 UTC
[zfs-discuss] Proposition of a new zpool property.
>> That would add unnecessary code to the ZFS layer for something that cron can handle in one line.
>
> Actually ... Why should there be a ZFS property to share NFS, when you can already do that with "share" and "dfstab"? And still the zfs property exists.

Probably because it is easy to create new filesystems and clone them; as NFS only works per filesystem, you need to edit dfstab every time you add a filesystem. With the nfs property, zfs creates the NFS export, etc.

Casper
On 21.03.2010 14:26, Edward Ned Harvey wrote:

>> Most software introduced in Linux clearly violates the "UNIX philosophy".
>
> Hehehe, don't get me started on OSX. ;-) And for the love of all things sacred, never say OSX is not UNIX. I made that mistake once. Which is not to say I was proven wrong or anything - but it's apparently a subject that people are oversensitive and emotional about. Seriously out of control. Avoid the subject. Please.

<Sarcasm Alert>
You talking about the Church of St. Jobs Advanced Tactical Response Team headquartered in Cupertino?
</Sarcasm Alert>

//Svein
Edward Ned Harvey
2010-Mar-22  01:13 UTC
[zfs-discuss] Proposition of a new zpool property.
>> Actually ... Why should there be a ZFS property to share NFS, when you can already do that with "share" and "dfstab"? And still the zfs property exists.
>
> Probably because it is easy to create new filesystems and clone them; as NFS only works per filesystem you need to edit dfstab every time when you add a filesystem. With the nfs property, zfs create the NFS export, etc.

Either I'm missing something, or you are.

If I export /somedir and then I create a new zfs filesystem /somedir/foo/bar then I don't have to mess around with dfstab, because it's a subdirectory of an exported directory, it's already accessible via NFS. So unless I misunderstand what you're saying, you're wrong.

This is the only situation I can imagine, where you would want to create a ZFS filesystem and have it default to NFS exported.
Robert Milkowski wrote:

> To add my 0.2 cents...
>
> I think starting/stopping scrub belongs to cron, smf, etc. and not to zfs itself.
>
> However what would be nice to have is an ability to freeze/resume a scrub and also limit its rate of scrubbing. One of the reason is that when working in SAN environments one have to take into account more that just a server where a scrub will be running as while it might not impact the server it might cause an issue for others, etc.

There's an RFE for this (pause/resume a scrub), or rather there was - unfortunately, it got subsumed into another RFE/BUG and the pause/resume requirement got lost. I'll see about reinstating it.

--
Andrew
On 22.03.2010 02:13, Edward Ned Harvey wrote:

>>> Actually ... Why should there be a ZFS property to share NFS, when you can already do that with "share" and "dfstab"? And still the zfs property exists.
>>
>> Probably because it is easy to create new filesystems and clone them; as NFS only works per filesystem you need to edit dfstab every time when you add a filesystem. With the nfs property, zfs create the NFS export, etc.
>
> Either I'm missing something, or you are.
>
> If I export /somedir and then I create a new zfs filesystem /somedir/foo/bar then I don't have to mess around with dfstab, because it's a subdirectory of an exported directory, it's already accessible via NFS. So unless I misunderstand what you're saying, you're wrong.
>
> This is the only situation I can imagine, where you would want to create a ZFS filesystem and have it default to NFS exported.

Actually, I can see some reasons for this. Some of us want directories mounted "the same place" on all servers. Consider the following:

    zfs inherit sharenfs pool/nfs
    zfs create -o mountpoint=/home pool/nfs/home
    zfs create -o mountpoint=/webpages pool/nfs/www
    zfs create -o mountpoint=/someotherdir pool/nfs/otherdir

etc.

So, I do see the point of the sharenfs attribute. ;)

//Svein
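For readers who want to try the inheritance behaviour described above, here is a minimal sketch. It assumes the intent was to turn sharing on for the parent dataset (`zfs set sharenfs=on`) so that children created afterwards inherit it; `zfs inherit` on its own clears a local setting rather than enabling sharing. The helper name `create_shared_tree` and the `name=mountpoint` argument format are made up for illustration, and the dataset names are the ones from the message.

```shell
#!/bin/sh
# Hypothetical helper, not part of ZFS: enable NFS sharing on a parent
# dataset, then create children that pick the property up by inheritance.
create_shared_tree() {
    parent=$1
    shift
    # Assumption: sharing is switched on at the parent level.
    zfs set sharenfs=on "$parent" || return 1
    for spec in "$@"; do               # each spec is name=mountpoint
        name=${spec%%=*}               # text before the first '='
        mountpoint=${spec#*=}          # text after the first '='
        zfs create -o mountpoint="$mountpoint" "$parent/$name" || return 1
    done
    # Show the property on the parent and every child, including its source.
    zfs get -r sharenfs "$parent"
}

# Example call (not executed here):
#   create_shared_tree pool/nfs home=/home www=/webpages otherdir=/someotherdir
```

The final `zfs get -r` is there so the SOURCE column can be checked; children should report the value as inherited from the parent rather than set locally.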
On 21.03.2010 01:25, Robert Milkowski wrote:

> To add my 0.2 cents...
>
> I think starting/stopping scrub belongs to cron, smf, etc. and not to zfs itself.
>
> However what would be nice to have is an ability to freeze/resume a scrub and also limit its rate of scrubbing. One of the reason is that when working in SAN environments one have to take into account more that just a server where a scrub will be running as while it might not impact the server it might cause an issue for others, etc.

Does cron happen to know how many other scrubs are running, bogging down your IO system? If the scrub scheduling were integrated into zfs itself, it would be a small step to include smf/sysctl settings for "maximum number of parallel scrubs", meaning the next scrub could "sit waiting" until the running ones are finished.

//Svein
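The "maximum number of parallel scrubs" idea can be approximated outside ZFS today. The following is only a sketch under stated assumptions: that `zpool status` prints the phrase "scrub in progress" for a pool that is being scrubbed (worth verifying on your release), and that polling is acceptable. `MAX_SCRUBS`, `POLL_INTERVAL`, `running_scrubs` and `scrub_when_allowed` are hypothetical names, not existing ZFS settings or commands.

```shell
#!/bin/sh
# Hypothetical tunables for this sketch (not ZFS properties).
MAX_SCRUBS=${MAX_SCRUBS:-1}          # scrubs allowed to run at once
POLL_INTERVAL=${POLL_INTERVAL:-300}  # seconds between checks

# Print how many imported pools currently report a scrub in progress.
running_scrubs() {
    count=0
    for pool in $(zpool list -H -o name); do
        if zpool status "$pool" | grep -q "scrub in progress"; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

# Wait until fewer than MAX_SCRUBS scrubs are running, then start one on $1.
scrub_when_allowed() {
    while [ "$(running_scrubs)" -ge "$MAX_SCRUBS" ]; do
        sleep "$POLL_INTERVAL"
    done
    zpool scrub "$1"
}

# Example call (not executed here):
#   scrub_when_allowed tank
```

Note that this only sees pools imported on the local host, so it does nothing for the shared-SAN case raised earlier in the thread.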
On 22/03/2010 01:13, Edward Ned Harvey wrote:

>>> Actually ... Why should there be a ZFS property to share NFS, when you can already do that with "share" and "dfstab"? And still the zfs property exists.
>>
>> Probably because it is easy to create new filesystems and clone them; as NFS only works per filesystem you need to edit dfstab every time when you add a filesystem. With the nfs property, zfs create the NFS export, etc.
>
> Either I'm missing something, or you are.
>
> If I export /somedir and then I create a new zfs filesystem /somedir/foo/bar then I don't have to mess around with dfstab, because it's a subdirectory of an exported directory, it's already accessible via NFS. So unless I misunderstand what you're saying, you're wrong.

No, it is not a subdirectory; it is a filesystem mounted on top of the subdirectory. So unless you use NFSv4 with mirror mounts or an automounter, other NFS versions will show you the contents of a directory and not of the filesystem. It doesn't matter if it is zfs or not.

--
Robert Milkowski
http://milek.blogspot.com
On 22/03/2010 08:49, Andrew Gabriel wrote:

> Robert Milkowski wrote:
>
>> To add my 0.2 cents...
>>
>> I think starting/stopping scrub belongs to cron, smf, etc. and not to zfs itself.
>>
>> However what would be nice to have is an ability to freeze/resume a scrub and also limit its rate of scrubbing. One of the reason is that when working in SAN environments one have to take into account more that just a server where a scrub will be running as while it might not impact the server it might cause an issue for others, etc.
>
> There's an RFE for this (pause/resume a scrub), or rather there was - unfortunately, it's got subsumed into another RFE/BUG and the pause/resume requirement got lost. I'll see about reinstating it.

Have you got the rfe/bug numbers? I will try to find some time and get it implemented...

--
Robert Milkowski
http://milek.blogspot.com
Edward Ned Harvey
2010-Mar-22  12:35 UTC
[zfs-discuss] Proposition of a new zpool property.
> Does cron happen to know how many other scrubs are running, bogging down your IO system? If the scrub scheduling was integrated into zfs itself,

It doesn't need to.

Crontab entry: /root/bin/scruball.sh

/root/bin/scruball.sh:

    #!/usr/bin/bash
    for filesystem in filesystem1 filesystem2 filesystem3 ; do
      zfs scrub $filesystem
    done

If you were talking about something else, for example, multiple machines all scrubbing a SAN at the same time, then ZFS can't solve that any better than cron, because it would require inter-machine communication to coordinate. I contend a shell script could actually handle that better than a built-in zfs property anyway.
On 22.03.2010 13:35, Edward Ned Harvey wrote:

>> Does cron happen to know how many other scrubs are running, bogging down your IO system? If the scrub scheduling was integrated into zfs itself,
>
> It doesn't need to.
>
> Crontab entry: /root/bin/scruball.sh
>
> /root/bin/scruball.sh:
> #!/usr/bin/bash
> for filesystem in filesystem1 filesystem2 filesystem3 ; do
>   zfs scrub $filesystem
> done
>
> If you were talking about something else, for example, multiple machines all scrubbing a SAN at the same time, then ZFS can't solve that any better than cron, because it would require inter-machine communication to coordinate. I contend a shell script could actually handle that better than a built-in zfs property anyway.

IIRC it's "zpool scrub", and last time I checked, the zpool command exited (with status 0) as soon as it had started the scrub. Your command would start _ALL_ scrubs in parallel as a result.

//Svein
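Because `zpool scrub` returns as soon as the scrub has been started, a loop that really serializes scrubs has to poll for completion, much as Gary described earlier in the thread. A minimal sketch of that approach follows. It assumes `zpool status` prints "scrub in progress" while a scrub is running, which should be checked against your release; the function names, the `POLL_INTERVAL` variable and the pool names in the example are illustrative only.

```shell
#!/bin/sh
# Hypothetical tunable: seconds to wait between status checks.
POLL_INTERVAL=${POLL_INTERVAL:-300}

# Succeeds (exit 0) while pool $1 reports a scrub in progress.
scrub_running() {
    zpool status "$1" | grep -q "scrub in progress"
}

# Scrub each pool named on the command line, one after the other.
scrub_serially() {
    for pool in "$@"; do
        # Skip pools where the scrub could not be started.
        zpool scrub "$pool" || continue
        while scrub_running "$pool"; do
            sleep "$POLL_INTERVAL"
        done
    done
}

# Example call (not executed here):
#   scrub_serially tank backup
```

Saved as a script with a final line that calls `scrub_serially` on your own pools, it could be launched weekly from cron, for example with `0 2 * * 0 /root/bin/scruball.sh` (02:00 every Sunday).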
Edward Ned Harvey
2010-Mar-22  12:50 UTC
[zfs-discuss] Proposition of a new zpool property.
> no, it is not a subdirectory it is a filesystem mounted on top of the subdirectory. So unless you use NFSv4 with mirror mounts or an automounter other NFS version will show you contents of a directory and not a filesystem. It doesn't matter if it is a zfs or not.

Ok, I learned something here, that I want to share:

If you create a new zfs filesystem as a subdir of a zfs filesystem which is exported via nfs and shared via cifs ...

The cifs clients see the contents of the child zfs filesystems. But, as Robert said above, nfs clients do not see the contents of the child zfs filesystem.

So, if you nest zfs filesystems inside each other (I don't) then the sharenfs property of a parent can be inherited by a child, and if that's your desired behavior, it's a cool feature.

For that matter, even if you do set the property, and you create a new child filesystem with inheritance, that only means the server will auto-export the filesystem. It doesn't mean the client will auto-mount it, right? So what's the 2nd half of the solution? Assuming you want the clients to see the subdirectories as the server does.
Edward Ned Harvey
2010-Mar-22  12:54 UTC
[zfs-discuss] Proposition of a new zpool property.
> IIRC it's "zpool scrub", and last time I checked, the zpool command exited (with status 0) as soon as it had started the scrub. Your command would start _ALL_ scrubs in parallel as a result.

You're right. I did that wrong. Sorry 'bout that.

So either way, if there's a zfs property for scrub, that still doesn't prevent multiple scrubs from running simultaneously. So ... Presently there's no way to avoid the simultaneous scrubs either way, right? You have to home-cook scripts to detect which scrubs are running on which filesystems, and serialize the scrubs. With, or without the property.

Don't get me wrong - I'm not discouraging the creation of the property. But if you want to avoid simul-scrub, you'd first have to create a mechanism for that, and then you could create the autoscrub.
On 22/03/2010 12:50, Edward Ned Harvey wrote:

>> no, it is not a subdirectory it is a filesystem mounted on top of the subdirectory. So unless you use NFSv4 with mirror mounts or an automounter other NFS version will show you contents of a directory and not a filesystem. It doesn't matter if it is a zfs or not.
>
> Ok, I learned something here, that I want to share:
>
> If you create a new zfs filesystem as a subdir of a zfs filesystem which is exported via nfs and shared via cifs ...
>
> The cifs clients see the contents of the child zfs filesystems. But, as Robert said above, nfs clients do not see the contents of the child zfs filesystem.
>
> So, if you nest zfs filesystems inside each other (I don't) then the sharenfs property of a parent can be inherited by a child, and if that's your desired behavior, it's a cool feature.
>
> For that matter, even if you do set the property, and you create a new child filesystem with inheritance, that only means the server will auto-export the filesystem. It doesn't mean the client will auto-mount it, right? So what's the 2nd half of the solution? Assuming you want the clients to see the subdirectories as the server does.

Look for the mirror mounts feature in NFSv4.

--
Robert Milkowski
http://milek.blogspot.com
On 22.03.2010 13:54, Edward Ned Harvey wrote:

>> IIRC it's "zpool scrub", and last time I checked, the zpool command exited (with status 0) as soon as it had started the scrub. Your command would start _ALL_ scrubs in parallel as a result.
>
> You're right. I did that wrong. Sorry 'bout that.
>
> So either way, if there's a zfs property for scrub, that still doesn't prevent multiple scrubs from running simultaneously. So ... Presently there's no way to avoid the simultaneous scrubs either way, right? You have to home-cook scripts to detect which scrubs are running on which filesystems, and serialize the scrubs. With, or without the property.
>
> Don't get me wrong - I'm not discouraging the creation of the property. But if you want to avoid simul-scrub, you'd first have to create a mechanism for that, and then you could create the autoscrub.

Which is exactly why I wanted it "cooked in" in the zfs code itself. zfs "knows" how many fs'es it's scrubbing.

//Svein
On Mar 22, 2010, at 7:30 AM, Svein Skogen wrote:

> On 22.03.2010 13:54, Edward Ned Harvey wrote:
>
>>> IIRC it's "zpool scrub", and last time I checked, the zpool command exited (with status 0) as soon as it had started the scrub. Your command would start _ALL_ scrubs in parallel as a result.
>>
>> You're right. I did that wrong. Sorry 'bout that.
>>
>> So either way, if there's a zfs property for scrub, that still doesn't prevent multiple scrubs from running simultaneously. So ... Presently there's no way to avoid the simultaneous scrubs either way, right? You have to home-cook scripts to detect which scrubs are running on which filesystems, and serialize the scrubs. With, or without the property.
>>
>> Don't get me wrong - I'm not discouraging the creation of the property. But if you want to avoid simul-scrub, you'd first have to create a mechanism for that, and then you could create the autoscrub.
>
> Which is exactly why I wanted it "cooked in" in the zfs code itself. zfs "knows" how many fs'es it's scrubbing.

Nit: ZFS does not scrub file systems. ZFS scrubs pools. In most deployments I've done or seen there are very few pools, with many file systems.

For appliances like NexentaStor or Oracle's Sun OpenStorage platforms, the default smallest unit of deployment is one disk. In other words, there is no case where multiple scrubs compete for the resources of a single disk because a single disk only participates in one pool. In general, resource management works when you are resource constrained. Hence, it is quite acceptable to implement concurrent scrubs.

Bottom line: systems engineering is still required for optimal system operation.

-- richard
On 22.03.2010 18:10, Richard Elling wrote:
> Nit: ZFS does not scrub file systems. ZFS scrubs pools. In most deployments
> I've done or seen there are very few pools, with many file systems.
>
> For appliances like NexentaStor or Oracle's Sun OpenStorage platforms, the
> default smallest unit of deployment is one disk. In other words, there is no
> case where multiple scrubs compete for the resources of a single disk because
> a single disk only participates in one pool. In general, resource management
> works when you are resource constrained. Hence, it is quite acceptable to
> implement concurrent scrubs.
>
> Bottom line: systems engineering is still required for optimal system operation.
> -- richard

When you hook up a monstrosity like 96 disks (the limit of those Supermicro 2.5"-drive SAS enclosures discussed on this list recently) to two 4-lane SAS controllers, the bottleneck is likely to be your controller, your PCI Express bus, or your memory bandwidth. You still want to be able to put some constraints on how much you're pushing the hardware. ;)

//Svein
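One way to apply such a constraint from outside ZFS is to cap how many scrubs run at once. The Python sketch below is illustrative only and is not something proposed in the thread. It assumes that `zpool status <pool>` prints "scrub in progress" while a scrub is running; `run_zpool`, `scrub_with_limit`, and `max_active` are hypothetical names.

```python
import subprocess
import time


def run_zpool(*args):
    """Run `zpool <args>` and return its standard output as text."""
    result = subprocess.run(
        ["zpool", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def scrub_with_limit(pools, max_active, run=run_zpool, sleep=time.sleep,
                     poll_seconds=60):
    """Scrub every pool, keeping at most `max_active` scrubs running."""
    if max_active < 1:
        raise ValueError("max_active must be at least 1")
    pending = list(pools)
    active = []
    while pending or active:
        # Forget pools whose scrub is no longer reported as running.
        # Assumption: `zpool status` prints "scrub in progress" meanwhile.
        active = [p for p in active
                  if "scrub in progress" in run("status", p)]
        # Start new scrubs until the cap is reached.
        while pending and len(active) < max_active:
            pool = pending.pop(0)
            run("scrub", pool)
            active.append(pool)
        if active:
            sleep(poll_seconds)
```

The cap counts pools, not disks or controllers, so picking a sensible `max_active` for a given enclosure is still left to whoever knows the hardware.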
On Mar 22, 2010, at 10:36 AM, Svein Skogen wrote:
> When you hook up a monstrosity like 96 disks (the limit of those Supermicro
> 2.5"-drive SAS enclosures discussed on this list recently) to two 4-lane
> SAS controllers, the bottleneck is likely to be your controller, your
> PCI Express bus, or your memory bandwidth. You still want to be able to put
> some constraints on how much you're pushing the hardware. ;)

Scrub tends to be a random workload dominated by IOPS, not bandwidth. But if you are so inclined to create an unbalanced system...

Bottom line: systems engineering is still required for optimal system operation :-)
-- richard
On 03/22/10 11:02, Richard Elling wrote:
> Scrub tends to be a random workload dominated by IOPS, not bandwidth.

You may want to look at this again post build 128; the addition of metadata prefetch to scrub/resilver in that build appears to have dramatically changed how it performs (largely for the better).

- Bill
On Mar 22, 2010, at 11:33 AM, Bill Sommerfeld wrote:
> On 03/22/10 11:02, Richard Elling wrote:
>> Scrub tends to be a random workload dominated by IOPS, not bandwidth.
>
> you may want to look at this again post build 128; the addition of
> metadata prefetch to scrub/resilver in that build appears to have
> dramatically changed how it performs (largely for the better).

Yes, it is better. But still nowhere near platter speed. All it takes is one little seek...
-- richard
Edward Ned Harvey
2010-Mar-22  21:19 UTC
[zfs-discuss] Proposition of a new zpool property.
> In other words, there is no case where multiple scrubs compete for the
> resources of a single disk because a single disk only participates in
> one pool.

Excellent point. However, the problem scenario was described as SAN. I can easily imagine a scenario where some SAN administrator created a pool of raid 5+1 or raid 0+1, and the pool is divided up into 3 LUNs which are presented to 3 different machines. Hence, when Machine A is hammering on the disks, it could also affect Machine B or C.

The "catch" that I keep repeating is that even a zfs property couldn't possibly solve that problem.
On Mon, Mar 22, 2010 at 12:21 PM, Richard Elling <richard.elling at gmail.com> wrote:
> Yes, it is better. But still nowhere near platter speed. All it takes is
> one little seek...

True, dat.

I find that scrubs start very slow (< 20MB/s) with the disks at near-100% utilization. Towards the end of the scrub, speeds are up in the 250+ MB/s range. It's on very slow disks (8x WD Green), so the seek penalty is high. I suspect this is because data and metadata have been scattered across the disk due to churn from snapshots, etc.

I've never noticed a slowdown in regular use though; in fact, local disk on my clients tends to be the bottleneck when copying files.

-B

--
Brandon High : bhigh at freaks.com