While playing with ZFS at home on my DIY NAS, to try out ZFS and raidz, I set up two disks in a raidz pool. That got me thinking: does a two-disk raidz actually provide any protection in this scenario? I don't know the internals of how ZFS lays out parity data, and it didn't complain when I created the pool.

Beyond that, if I only have two disks to start, should I start with a raidz or a mirror? Are there issues with future expansion and disk replacement?

Thanks,
Greg
--
This message posted from opensolaris.org
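One way to check the protection question empirically is with file-backed vdevs, which ZFS accepts for experimentation. The sketch below uses hypothetical file paths and sizes; offlining one member of the two-disk raidz should leave the pool importable and readable in a DEGRADED state, which is what "tolerates one disk failure" means in practice.

  # Experiment only: file-backed vdevs, hypothetical paths and sizes.
  mkfile 128m /var/tmp/vdev1 /var/tmp/vdev2

  # A two-disk raidz pool built on the files.
  zpool create testpool raidz /var/tmp/vdev1 /var/tmp/vdev2
  zpool status testpool

  # Simulate losing one "disk"; the pool should stay up, DEGRADED.
  zpool offline testpool /var/tmp/vdev1
  zpool status testpool

  # Bring it back and clean up.
  zpool online testpool /var/tmp/vdev1
  zpool destroy testpool
  rm /var/tmp/vdev1 /var/tmp/vdev2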
Hi Greg,

With two disks, I would start with a mirror. Then, you could add two more disks for expansion. You can also detach disks in a mirrored configuration. Or, you could attach another disk to create a 3-way mirror.

With a RAIDZ configuration, you would not be able to expand the two disks to three disks. You would have to add two more disks.

See the examples in this section of the ZFS Admin Guide:

http://docs.sun.com/app/docs/doc/819-5461/gayrd?a=view

Cindy

On 10/15/09 10:45, Gregory Gee wrote:
> While playing with ZFS at home on my DIY NAS, to try out ZFS and raidz, I set
> up two disks in a raidz pool. That got me thinking: does a two-disk raidz
> actually provide any protection in this scenario? I don't know the internals
> of how ZFS lays out parity data, and it didn't complain when I created the pool.
>
> Beyond that, if I only have two disks to start, should I start with a raidz or
> a mirror? Are there issues with future expansion and disk replacement?
>
> Thanks,
> Greg
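A sketch of the mirror-first path Cindy describes, with hypothetical pool and device names: start with a two-way mirror, and later either attach a third disk to the same vdev or add a second mirror pair to grow the pool.

  # Start with a two-disk mirror (hypothetical device names).
  zpool create tank mirror c0t0d0 c0t1d0

  # Option 1: attach a third disk to make a 3-way mirror;
  # zpool detach shrinks it back to a 2-way mirror.
  zpool attach tank c0t0d0 c0t2d0
  zpool detach tank c0t2d0

  # Option 2: expand capacity by adding a second mirror pair.
  zpool add tank mirror c0t2d0 c0t3d0

  # Replacing a failed or outgrown disk works the same way in either layout.
  zpool replace tank c0t1d0 c0t4d0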
On Thu, Oct 15, 2009 at 11:09:32AM -0600, Cindy Swearingen wrote:
> Hi Greg,
>
> With two disks, I would start with a mirror. Then, you could add

Additionally, with a two-disk RAIDZ1 you are doing parity calculations for no good reason. I would recommend a mirror.

-brian

--
"Coding in C is like sending a 3 year old to do groceries. You gotta tell them exactly what you want or you'll end up with a cupboard full of pop tarts and pancake mix." -- IRC User (http://www.bash.org/?841435)
> Additionally, with a two-disk RAIDZ1 you are doing parity calculations
> for no good reason. I would recommend a mirror.

Will there ever be the possibility to extend a RAIDZ vdev with another disk?

Linux can do this since kernel 2.6.17 (at least).

I know this is not really an enterprise feature, but enterprise storage arrays are able to expand a RAID5/6 (even online).

IMHO, being able to expand bulk storage (RAID5/6/Z/Z2) with another spindle would convince some small enterprises to switch to ZFS.

What do other ZFS guys think of this possibility?
Matthias Appel wrote:
>> Additionally, with a two-disk RAIDZ1 you are doing parity calculations
>> for no good reason. I would recommend a mirror.
>
> Will there ever be the possibility to extend a RAIDZ vdev with another disk?
>
> Linux can do this since kernel 2.6.17 (at least).
>
> I know this is not really an enterprise feature, but enterprise storage
> arrays are able to expand a RAID5/6 (even online).
>
> IMHO, being able to expand bulk storage (RAID5/6/Z/Z2) with another spindle
> would convince some small enterprises to switch to ZFS.
>
> What do other ZFS guys think of this possibility?

Expanding a RAIDZ (i.e. adding another disk for data, not parity) is a constantly-asked-for feature.

It's decidedly non-trivial (frankly, I've been staring at the code for a year now, trying to figure out how, and I'm just not up to the task). The biggest issue is interrupted expansion - i.e. I've got code to do the expansion, but it breaks all over the place when I interrupt it: horrible pool corruption all the time. And I think that's the big problem - how to do the expansion in stages while keeping the pool active. At this point, the only way I can get it to work is to offline (i.e. export) the whole pool, and then pray that nothing interrupts the expansion process.

That all said, I'm not a /real/ developer, so maybe someone else has some free time to try.

--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
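For reference, what ZFS does support today (sketched with hypothetical pool and device names) is growing a pool by adding another whole top-level raidz vdev, or growing an existing raidz vdev by replacing every member with a larger disk; adding a single extra disk to an existing raidz vdev is the part that is not supported.

  # Add a second raidz vdev; the pool stripes across both vdevs,
  # but each vdev keeps its original width.
  zpool add tank raidz c1t0d0 c1t1d0 c1t2d0

  # Alternatively, grow an existing raidz by replacing each member
  # with a larger disk, one at a time, letting each resilver finish.
  zpool replace tank c0t0d0 c2t0d0
  zpool status tank   # wait for the resilver before the next replace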
On Thu, Oct 15, 2009 at 03:32:44PM -0700, Erik Trimble wrote:
> Expanding a RAIDZ (i.e. adding another disk for data, not parity) is a
> constantly-asked-for feature.
>
> It's decidedly non-trivial (frankly, I've been staring at the code for a
> year now, trying to figure out how, and I'm just not up to the task).
> The biggest issue is interrupted expansion - i.e. I've got code to do
> the expansion, but it breaks all over the place when I interrupt it:
> horrible pool corruption all the time. And I think that's the big
> problem - how to do the expansion in stages while keeping the pool
> active. At this point, the only way I can get it to work is to offline
> (i.e. export) the whole pool, and then pray that nothing interrupts the
> expansion process.

Does anyone know how Drobo does it?

--
Eugen* Leitl <a href="http://leitl.org">leitl</a> http://leitl.org
______________________________________________________________
ICBM: 48.07100, 11.36820 http://www.ativel.com http://postbiota.org
8B29F6BE: 099D 78BA 2FD3 B014 B08A 7779 75B0 2443 8B29 F6BE
Add the new disk and start writing new blocks to it, instead of waiting to re-lay out all the stripes. Then, when the disk is not active, do a slow/safe copy-on-write pass to rebalance all the blocks?
--
This message posted from opensolaris.org
Prasad Unnikrishnan wrote:
> Add the new disk and start writing new blocks to it, instead of waiting
> to re-lay out all the stripes. Then, when the disk is not active, do a
> slow/safe copy-on-write pass to rebalance all the blocks?

Conceptually, yes - doing a zpool expansion while the pool is live isn't hard to map out.

As always, the devil is in the details. In this case, the primary problem I'm having is maintaining two different block mapping schemes (one for the old disk layout and one for the new disk layout) while still being able to interrupt the expansion process. My main difficulty is that I have to keep both schemes in memory during the migration, and if something should happen (i.e. reboot, panic, etc.) then I lose the current state of the zpool, and everything goes to hell in a handbasket.

--
Erik Trimble
Java System Support
Mailstop: usca22-123
Phone: x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)
On Fri, Oct 16, 2009 at 1:40 PM, Erik Trimble <Erik.Trimble at sun.com> wrote:
>> Add the new disk and start writing new blocks to it, instead of waiting
>> to re-lay out all the stripes. Then, when the disk is not active, do a
>> slow/safe copy-on-write pass to rebalance all the blocks?
>
> Conceptually, yes - doing a zpool expansion while the pool is live isn't
> hard to map out.
>
> As always, the devil is in the details. In this case, the primary problem
> I'm having is maintaining two different block mapping schemes (one for the
> old disk layout and one for the new disk layout) while still being able to
> interrupt the expansion process. My main difficulty is that I have to keep
> both schemes in memory during the migration, and if something should happen
> (i.e. reboot, panic, etc.) then I lose the current state of the zpool, and
> everything goes to hell in a handbasket.

In a way, I think the key to making this work is the code for device removal. When you remove a device, you take data from one device and put it on another; it should be much easier to reuse that code and move 1/N of the existing data onto a new device using functions from the device removal work. I could be wrong, but it may not be as far off as people fear. Device removal was mentioned in the "Next Word for ZFS" video.

James Dickens
http://uadmin.blogspot.com
jamesd.wi at gmail.com
From: Unnikrishnan, Prasad (GE, Corporate)
Date: 2009-Oct-20 07:37 UTC
Subject: [zfs-discuss] Stupid to have 2 disk raidz?
It will be wonderful to see that working; we badly need the capability to remove a disk from a pool.

________________________________
From: James Dickens [mailto:jamesd.wi at gmail.com]
Sent: 20 October 2009 03:39
To: Erik Trimble
Cc: Unnikrishnan, Prasad (GE, Corporate); zfs-discuss at opensolaris.org
Subject: Re: [zfs-discuss] Stupid to have 2 disk raidz?

In a way, I think the key to making this work is the code for device removal. When you remove a device, you take data from one device and put it on another; it should be much easier to reuse that code and move 1/N of the existing data onto a new device using functions from the device removal work. I could be wrong, but it may not be as far off as people fear. Device removal was mentioned in the "Next Word for ZFS" video.

James Dickens
http://uadmin.blogspot.com
jamesd.wi at gmail.com
Erik Trimble wrote:
> As always, the devil is in the details. In this case, the primary problem
> I'm having is maintaining two different block mapping schemes (one for the
> old disk layout and one for the new disk layout) while still being able to
> interrupt the expansion process. My main difficulty is that I have to keep
> both schemes in memory during the migration, and if something should happen
> (i.e. reboot, panic, etc.) then I lose the current state of the zpool, and
> everything goes to hell in a handbasket.

It might not be that bad, if only zfs would allow mirroring a raidz pool.

Back when I did storage admin for a smaller company where availability was hyper-critical (but we couldn't afford EMC/Veritas), we had a hardware RAID5 array. After a few years of service, we ran into some problems:

* Need to restripe the array? Screwed.
* Need to replace the array because the current one is EOL? Screwed.
* Array controller barfed for whatever reason? Screwed.
* Need to flash the controller with the latest firmware? Screwed.
* Need to replace a component on the array, e.g. NIC, controller or power supply? Screwed.
* Need to relocate the array? Screwed.

If we could stomach downtime or short-lived storage solutions, none of this would have mattered.

To get around this, we took two hardware RAID arrays and mirrored them in software. We could offline/restripe/replace/upgrade/relocate/whatever-we-wanted on an individual array, since it was only one side of a mirror which we could offline/online or detach/attach.

I suspect this could be simulated today by setting up a mirrored pool on top of zvols carved from raidz pools. That involves a lot of overhead, doing parity/checksum calculations multiple times for the same data. On the plus side, setting this up might make it possible to defrag a pool.

Should zfs simply allow mirroring one pool with another, then with a few spare disks lying around, altering the geometry of an existing pool could be done with zero downtime using steps similar to the following (see the command sketch below):

1. Create spare_pool as large as current_pool using spare disks
2. Attach spare_pool to current_pool
3. Wait for resilver to complete
4. Detach and destroy current_pool
5. Create new_pool the way you want it now
6. Attach new_pool to spare_pool
7. Wait for resilver to complete
8. Detach/destroy spare_pool
9. Chuckle at the fact that you completely remade your production pool while fully available

I did this dance several times over the course of many years back in the Disksuite days.

Thoughts?

Marty
--
This message posted from opensolaris.org
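A rough sketch of the zvol-based simulation and of Marty's attach/detach reshaping, with hypothetical pool, device, and volume names and sizes. The zvol-over-raidz layering is only an illustration of the idea and carries the double parity/checksum overhead he mentions.

  # Simulating "mirror one pool with another" via zvols (all names hypothetical).
  zpool create bulk1 raidz c1t0d0 c1t1d0 c1t2d0
  zpool create bulk2 raidz c2t0d0 c2t1d0 c2t2d0
  zfs create -V 500g bulk1/vol
  zfs create -V 500g bulk2/vol
  zpool create datapool mirror /dev/zvol/dsk/bulk1/vol /dev/zvol/dsk/bulk2/vol

  # Reshaping one side with zero downtime, roughly Marty's steps 1-8:
  zpool create sparepool raidz c3t0d0 c3t1d0 c3t2d0    # spare disks
  zfs create -V 500g sparepool/vol
  zpool attach datapool /dev/zvol/dsk/bulk1/vol /dev/zvol/dsk/sparepool/vol
  zpool status datapool          # wait for the resilver to complete
  zpool detach datapool /dev/zvol/dsk/bulk1/vol
  zpool destroy bulk1            # recreate bulk1 with the desired geometry,
                                 # attach its new zvol, resilver again, then
                                 # detach and destroy sparepool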