Is it possible to force ZFS to "nicely" re-organize data inside a zpool after a new root-level vdev has been introduced? e.g. take a pool with one vdev consisting of a 2-disk mirror. Populate some arbitrary files using about 50% of the capacity. Then add another 2 mirrored disks to the pool. It seems (judging from zpool iostat) that going forward new data will be striped as expected, but existing data is not striped. This raises the question of what happens when the original mirror set runs out of space. Does the striping stop automagically? How does this impact resilvering and recovering from failures?
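For anyone who wants to reproduce the scenario, a minimal sketch using file-backed vdevs (the pool name and file paths are made up for illustration; this needs root on a ZFS-capable system):

```shell
# Create four 256 MB backing files to stand in for disks (hypothetical paths).
mkfile 256m /var/tmp/d1 /var/tmp/d2 /var/tmp/d3 /var/tmp/d4

# Pool with a single 2-way mirror vdev.
zpool create testpool mirror /var/tmp/d1 /var/tmp/d2

# Fill to roughly 50% of capacity.
dd if=/dev/urandom of=/testpool/fill bs=1m count=128

# Add a second mirror vdev; existing data stays where it was written.
zpool add testpool mirror /var/tmp/d3 /var/tmp/d4

# Watch per-vdev allocation and I/O: new writes stripe across both
# mirrors, but the old data remains entirely on the first one.
zpool iostat -v testpool 5
```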
Richard Elling
2007-Sep-05 21:26 UTC
[zfs-discuss] Consequences of adding a root vdev later?
Solaris wrote:
> Is it possible to force ZFS to "nicely" re-organize data inside a zpool
> after a new root level vdev has been introduced?

Currently, ZFS will not reorganize the existing data for such cases. You can force this to occur by copying the data and removing the old, but that seems like a lot of extra work for most cases.

> e.g. Take a pool with 1 vdev consisting of a 2 disk mirror. Populate
> some arbitrary files using about 50% of the capacity. Then add another
> 2 mirrored disks to the pool.
>
> It seems like (judging from zpool iostat) that going forward new data
> will be striped as expected, but existing data is not striped. This
> presents a question of what happens when the original mirror set runs
> out of space? Does the striping stop automagically? How does this
> impact resilvering and recovering from failures?

Yes. AFAIK, nobody has characterized resilvering, though this is about the 4th time this week someone has brought the topic up. Has anyone done work here that we don't know about? If so, please speak up :-)

-- richard
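The copy-and-remove workaround Richard mentions can be sketched like this (the pool/dataset names are hypothetical, and it assumes enough free space to hold a second copy while both exist):

```shell
# Snapshot the dataset whose blocks are concentrated on the old vdev.
zfs snapshot tank/data@rebalance

# Replicate it into a new dataset; the received copy is written fresh,
# so its blocks are allocated across the current, larger set of vdevs.
zfs send tank/data@rebalance | zfs receive tank/data.new

# Swap the datasets once the copy is verified, then free the old blocks.
zfs rename tank/data tank/data.old
zfs rename tank/data.new tank/data
zfs destroy -r tank/data.old
```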
Bill Sommerfeld
2007-Sep-06 05:21 UTC
[zfs-discuss] Consequences of adding a root vdev later?
On Wed, 2007-09-05 at 14:26 -0700, Richard Elling wrote:
> AFAIK, nobody has characterized resilvering, though this is about the 4th
> time this week someone has brought the topic up. Has anyone done work here
> that we don't know about? If so, please speak up :-)

I haven't been conducting controlled experiments, but I have been moving a large pool around recently via a series of zpool replace operations, and so have been keeping an eye on a bunch of resilvering.

The one conclusion I have so far is that, for the pool I'm moving, the time to complete a disk-replacement resilver seems to be largely independent of the number of disks being resilvered (so far, I've done batches of up to seven replacements) and in the same ballpark as a scrub.

To be conservative, I'm moving only one disk per raidz group per "pass".

- Bill
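Bill's procedure, one replacement per raidz group per pass, looks roughly like this (pool and device names are illustrative, not from his setup):

```shell
# Replace one disk in each raidz group; ZFS resilvers them in parallel,
# but each group still has its full redundancy minus one disk.
zpool replace tank c1t0d0 c3t0d0
zpool replace tank c1t4d0 c3t4d0

# Monitor resilver progress; wait for completion before the next "pass".
zpool status tank
```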
Richard Elling
2007-Sep-06 20:59 UTC
[zfs-discuss] Consequences of adding a root vdev later?
Bill Sommerfeld wrote:
> On Wed, 2007-09-05 at 14:26 -0700, Richard Elling wrote:
>
>> AFAIK, nobody has characterized resilvering, though this is about the 4th
>> time this week someone has brought the topic up. Has anyone done work here
>> that we don't know about? If so, please speak up :-)
>
> I haven't been conducting controlled experiments, but I have been moving
> a large pool around recently via a series of zpool replace operations,
> and so have been keeping an eye on a bunch of resilvering.
>
> The one conclusion I have so far is that, for the pool I'm moving, the
> time to complete a disk-replacement resilver seems to be largely
> independent of the number of disks being resilvered (so far, I've done
> batches of up to seven replacements) and in the same ballpark as a
> scrub.
>
> To be conservative, I'm moving only one disk per raidz group per
> "pass".
>
> - Bill

Thanks, Bill. I've put together some tests, and thus far the bottleneck is on the read side. It may take another week or so to finish my characterizations and analyze the data, though.

-- richard
Curtis Schiewek
2007-Sep-10 13:15 UTC
[zfs-discuss] Consequences of adding a root vdev later?
So, if I have a pool made up of 2 raidz vdevs, all data is striped across both? So if I somehow lose a vdev, I lose all my data?!
Mario Goebbels
2007-Sep-10 14:33 UTC
[zfs-discuss] Consequences of adding a root vdev later?
> If I have a pool made up of 2 raidz vdevs, all data is striped across
> both? So if I somehow lose a vdev, I lose all my data?!

If your vdevs are RAID-Zs, it would take a fairly unlikely coincidence to break the pool (two disks failing in the same RAID-Z group at the same time). But yeah, ZFS spreads blocks across the different vdevs, trying to balance the bandwidths of the vdevs.

-mg
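To make the failure domain concrete, a pool with two raidz vdevs (device names made up) can be created like this; losing any single disk is survivable, but losing two disks in the *same* raidz vdev takes the whole pool with it, since every file's blocks are spread across both vdevs:

```shell
# One pool, two single-parity raidz vdevs; blocks are spread across both.
zpool create tank \
    raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0 \
    raidz c2t0d0 c2t1d0 c2t2d0 c2t3d0

# zpool status lists each raidz group separately; single-parity raidz
# tolerates one failed disk per group, not one per pool.
zpool status tank
```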
Mario Goebbels
2007-Sep-10 15:40 UTC
[zfs-discuss] Consequences of adding a root vdev later?
> I'm more worried about the availability of my data in the event of a
> controller failure. I plan on using 4-chan SATA controllers and
> creating multiple 4 disk RAIDZ vdevs. I want to use a single pool, but
> it looks like I can't as controller failure = ZERO access, although the
> same can be said for any other non-redundant components.

ZFS vdevs carry on-disk labels identifying them as a whole drive, slice, or partition belonging to a pool. ZFS can easily reconstruct the pool from these labels after a controller failure.

ZFS's preferred mode of operation is to use dumb disks, leaving jobs like redundancy to the file system. So unless you're using hardware RAID, which makes the disk configuration dependent on the controller, changing the controller won't make your data blow up.

(I'm not sure what actually happens if you change the controller and let the system boot: whether it automagically scans all controllers and disks if zpool.cache doesn't match the system configuration, or whether it blows up and requires manual intervention. I haven't yet had an occasion to try it.)

-mg
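If the pool doesn't come up cleanly after the disks move to a replacement controller, the usual recovery is an explicit export/import, which rescans device labels rather than trusting the stale paths in zpool.cache (the pool name here is illustrative):

```shell
# Export the pool if it imported in a degraded or confused state.
zpool export tank

# Scan attached devices for pool labels and re-import by name; -d points
# the scan at a specific device directory if the defaults miss the disks.
zpool import tank
# zpool import -d /dev/dsk tank
```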