1. I have a 4x18GB drive setup as RAIDZ. Thinking about it in RAID5 terms I would expect to get (4-1)x18GB worth of drive space, but df -h shows 4x18GB. Is this a bug or do I not understand?

2. Again in RAID5 terms: if I have 4x18GB and 12x9GB drives and I want to make a RAIDZ of all of them, I would expect the 18GB drives to be treated as 9GB, so the RAIDZ would be 16x9GB. Is that correct?

Oh, running NV B33.

Thanks!
On Wed, Apr 05, 2006 at 07:36:14PM -0700, Sean Hafeez wrote:

> 1. I have a 4x18GB drive setup as RAIDZ. Thinking about it in RAID5 terms I
> would expect to get (4-1)x18GB worth of drive space, but df -h shows 4x18GB.
> Is this a bug or do I not understand?

Yes, this is a known bug, 6288488 (though its description is not exactly what you are seeing; the related bugs 6298446 and 6350956 describe the issue).

> 2. Again in RAID5 terms: if I have 4x18GB and 12x9GB drives and I want to
> make a RAIDZ of all of them, I would expect the 18GB drives to be treated
> as 9GB, so the RAIDZ would be 16x9GB. Is that correct?

That is correct.

grant.
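A minimal sketch of the comparison being discussed (pool and device names are placeholders, and the exact numbers reported will vary by build):

  # create a 4-disk raidz pool from placeholder device names
  zpool create tank raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0
  # zpool list reports the raw pool capacity (4 x 18GB in Sean's case)
  zpool list tank
  # df -h should report usable space (roughly 3 x 18GB); on the affected
  # builds it reports the raw size instead
  df -h /tank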
On Apr 5, 2006, at 7:40 PM, grant beattie wrote:

>> 2. Again in RAID5 terms: if I have 4x18GB and 12x9GB drives and I want to
>> make a RAIDZ of all of them, I would expect the 18GB drives to be treated
>> as 9GB, so the RAIDZ would be 16x9GB. Is that correct?
>
> That is correct.

Thanks.

Another thought: if I stripe 2 of the 9GB drives together to make an 18GB drive, so my 12x9GB becomes 6x18GB (6 sets of striped 2x9GB), and then make RAIDZ on top of that including my 4x18GB drives, that would give me 10x18GB of RAIDZ. The way I am thinking, I would not have to worry about one of the 2x9GB drives failing, because if it did the RAIDZ on top would recover.

Does that make sense?

Thanks!
AFAIK, it is not possible to create stripes in a vdev. That is, you can't have a RAIDZ pool contain any subdisks which are really stripes. Per the zpool(1M) man page, vdevs can only be:

  block device (typically a disk or a HW raid array presented as a single device)
  file
  mirror
  raidz

Erik Trimble
Java System Support
Mailstop: usca14-102
Phone: x17195
Santa Clara, CA

Sean Hafeez wrote:

> On Apr 5, 2006, at 7:40 PM, grant beattie wrote:
>
>>> 2. Again in RAID5 terms: if I have 4x18GB and 12x9GB drives and I want to
>>> make a RAIDZ of all of them, I would expect the 18GB drives to be treated
>>> as 9GB, so the RAIDZ would be 16x9GB. Is that correct?
>>
>> That is correct.
>
> Thanks.
>
> Another thought: if I stripe 2 of the 9GB drives together to make an 18GB
> drive, so my 12x9GB becomes 6x18GB (6 sets of striped 2x9GB), and then make
> RAIDZ on top of that including my 4x18GB drives, that would give me 10x18GB
> of RAIDZ. The way I am thinking, I would not have to worry about one of the
> 2x9GB drives failing, because if it did the RAIDZ on top would recover.
>
> Does that make sense?
>
> Thanks!
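To make the allowed forms concrete, a quick sketch (placeholder pool and device names) of what zpool create will accept; note that there is no syntax for nesting a stripe inside a raidz group:

  zpool create p1 c1t0d0                       # plain block device vdev
  zpool create p2 mirror c1t0d0 c1t1d0         # mirror vdev
  zpool create p3 raidz c1t0d0 c1t1d0 c1t2d0   # raidz vdev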
On Wed, 2006-04-05 at 22:36, Sean Hafeez wrote:

> 2. Again in RAID5 terms: if I have 4x18GB and 12x9GB drives and I want to
> make a RAIDZ of all of them, I would expect the 18GB drives to be treated
> as 9GB, so the RAIDZ would be 16x9GB. Is that correct?

Yep, but with that disk collection you'd probably be happier setting up multiple raidz groups in the pool. For instance:

  4x18 + 4x9 + 4x9 + 4x9

or

  4x18 + 6x9 + 6x9

ZFS will balance allocations across multiple top-level raidz groups.

					- Bill
Hum. OK. Lack of understanding here on my side. I really just want one big space called /data that is RAID5-like... So 3 raidz sets with raidz on top of it?

On Apr 5, 2006, at 10:12 PM, Bill Sommerfeld wrote:

> On Wed, 2006-04-05 at 22:36, Sean Hafeez wrote:
>> 2. Again in RAID5 terms: if I have 4x18GB and 12x9GB drives and I want to
>> make a RAIDZ of all of them, I would expect the 18GB drives to be treated
>> as 9GB, so the RAIDZ would be 16x9GB. Is that correct?
>
> Yep, but with that disk collection you'd probably be happier setting up
> multiple raidz groups in the pool. For instance:
>
>   4x18 + 4x9 + 4x9 + 4x9
>
> or
>   4x18 + 6x9 + 6x9
>
> ZFS will balance allocations across multiple top-level raidz groups.
>
> 					- Bill
On Thu, 2006-04-06 at 17:01, Sean Hafeez wrote:

> Hum.
>
> OK. Lack of understanding here on my side. I really just want one
> big space called /data that is RAID5-like...
>
> So 3 raidz sets with raidz on top of it?

Nope, one pool with 3 raidz groups. There's an implicit raid-0-ish striping among top-level vdevs.

  zpool create data raidz c0t0d0 c1t0d0 c2t0d0 c3t0d0 ...
  zpool add data raidz c0t1d0 c1t1d0 c2t1d0 c3t1d0 ...
  zpool add data raidz c0t2d0 c1t2d0 c2t2d0 c3t2d0 ...

					- Bill
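Worth noting, as a sketch of the default behavior with a hypothetical filesystem name: the pool's top-level filesystem is mounted at /<poolname> automatically, so after the commands above /data already exists as one big space, and any extra filesystems carved out of the pool share its free capacity:

  zfs create data/projects   # mounts at /data/projects by default
  zfs list                   # every filesystem draws from the same pool space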
On Apr 6, 2006, at 2:14 PM, Bill Sommerfeld wrote:

> On Thu, 2006-04-06 at 17:01, Sean Hafeez wrote:
>> Hum.
>>
>> OK. Lack of understanding here on my side. I really just want one
>> big space called /data that is RAID5-like...
>>
>> So 3 raidz sets with raidz on top of it?
>
> Nope, one pool with 3 raidz groups. There's an implicit raid-0-ish
> striping among top-level vdevs.
>
>   zpool create data raidz c0t0d0 c1t0d0 c2t0d0 c3t0d0 ...
>   zpool add data raidz c0t1d0 c1t1d0 c2t1d0 c3t1d0 ...
>   zpool add data raidz c0t2d0 c1t2d0 c2t2d0 c3t2d0 ...
>
> 					- Bill

OK. So the 1st line makes a raidz called data with the 1st set of drives. The second line creates a raidz of the second set of drives and adds it to the pool (data) in a stripe-like way. Same for the 3rd line... Think I got that right. Trying to overcome years of old-school RAID thinking...
OK... I am missing something or found a bug.

# cfgadm -al
Ap_Id                Type         Receptacle   Occupant      Condition
c0                   scsi-bus     connected    configured    unknown
c0::dsk/c0t0d0       disk         connected    configured    unknown
c0::dsk/c0t1d0       disk         connected    configured    unknown
c0::dsk/c0t6d0       CD-ROM       connected    configured    unknown
c1                   scsi-bus     connected    unconfigured  unknown
c2                   scsi-bus     connected    configured    unknown
c2::dsk/c2t2d0       disk         connected    configured    unknown
c2::dsk/c2t3d0       disk         connected    configured    unknown
c2::dsk/c2t4d0       disk         connected    configured    unknown
c2::dsk/c2t5d0       disk         connected    configured    unknown
c2::dsk/c2t8d0       disk         connected    configured    unknown
c2::dsk/c2t9d0       disk         connected    configured    unknown
c2::dsk/c2t10d0      disk         connected    configured    unknown
c2::dsk/c2t11d0      disk         connected    configured    unknown
c2::dsk/c2t12d0      disk         connected    configured    unknown
c2::dsk/c2t13d0      disk         connected    configured    unknown
c2::dsk/c2t14d0      disk         connected    configured    unknown
#

So c2t2, c2t3, c2t4, c2t5, c2t8 are 18GB drives... the rest are 9GB drives...

# zpool create -f test raidz c2t2d0 c2t3d0 c2t4d0 c2t5d0 c2t8d0
# zpool list
NAME                    SIZE    USED   AVAIL    CAP  HEALTH     ALTROOT
test                   84.5G    210K   84.5G     0%  ONLINE     -

# zpool add test raidz c2t9d0 c2t10d0 c2t11d0 c2t12d0 c2t13d0 c2t14d0
invalid vdev specification
use '-f' to override the following errors:
/dev/dsk/c2t9d0s0 is part of exported or potentially active ZFS pool pc. Please see zpool(1M).

# ls c2*
c2t10d0    c2t11d0    c2t12d0    c2t13d0    c2t14d0    c2t2d0    c2t3d0    c2t4d0    c2t5d0    c2t8d0    c2t9d0
c2t10d0s0  c2t11d0s0  c2t12d0s0  c2t13d0s0  c2t14d0s0  c2t2d0s0  c2t3d0s0  c2t4d0s0  c2t5d0s0  c2t8d0s0  c2t9d0s0
c2t10d0s1  c2t11d0s1  c2t12d0s1  c2t13d0s1  c2t14d0s1  c2t2d0s1  c2t3d0s1  c2t4d0s1  c2t5d0s1  c2t8d0s1  c2t9d0s1
c2t10d0s2  c2t11d0s2  c2t12d0s2  c2t13d0s2  c2t14d0s2  c2t2d0s2  c2t3d0s2  c2t4d0s2  c2t5d0s2  c2t8d0s2  c2t9d0s2
c2t10d0s3  c2t11d0s3  c2t12d0s3  c2t13d0s3  c2t14d0s3  c2t2d0s3  c2t3d0s3  c2t4d0s3  c2t5d0s3  c2t8d0s3  c2t9d0s3
c2t10d0s4  c2t11d0s4  c2t12d0s4  c2t13d0s4  c2t14d0s4  c2t2d0s4  c2t3d0s4  c2t4d0s4  c2t5d0s4  c2t8d0s4  c2t9d0s4
c2t10d0s5  c2t11d0s5  c2t12d0s5  c2t13d0s5  c2t14d0s5  c2t2d0s5  c2t3d0s5  c2t4d0s5  c2t5d0s5  c2t8d0s5  c2t9d0s5
c2t10d0s6

What did I miss?

On Apr 6, 2006, at 2:14 PM, Bill Sommerfeld wrote:

> On Thu, 2006-04-06 at 17:01, Sean Hafeez wrote:
>> Hum.
>>
>> OK. Lack of understanding here on my side. I really just want one
>> big space called /data that is RAID5-like...
>>
>> So 3 raidz sets with raidz on top of it?
>
> Nope, one pool with 3 raidz groups. There's an implicit raid-0-ish
> striping among top-level vdevs.
>
>   zpool create data raidz c0t0d0 c1t0d0 c2t0d0 c3t0d0 ...
>   zpool add data raidz c0t1d0 c1t1d0 c2t1d0 c3t1d0 ...
>   zpool add data raidz c0t2d0 c1t2d0 c2t2d0 c3t2d0 ...
>
> 					- Bill
> So c2t2, c2t3, c2t4, c2t5, c2t8 are 18GB drives... the rest are 9GB drives...
>
> # zpool create -f test raidz c2t2d0 c2t3d0 c2t4d0 c2t5d0 c2t8d0
> # zpool list
> NAME                    SIZE    USED   AVAIL    CAP  HEALTH     ALTROOT
> test                   84.5G    210K   84.5G     0%  ONLINE     -
>
> # zpool add test raidz c2t9d0 c2t10d0 c2t11d0 c2t12d0 c2t13d0 c2t14d0
> invalid vdev specification
> use '-f' to override the following errors:
> /dev/dsk/c2t9d0s0 is part of exported or potentially active ZFS pool pc.
> Please see zpool(1M).
>
> ...
>
> What did I miss?

I don't think you missed anything. The "zpool add" command tells you that it thinks c2t9d0 is in use and that you have to use -f to override. Did you try that and it didn't work, or am I missing something?

Incidentally, you could do the zpool create in one step with:

  zpool create -f test \
      raidz c2t2d0 c2t3d0 c2t4d0 c2t5d0 c2t8d0 \
      raidz c2t9d0 c2t10d0 c2t11d0 c2t12d0 c2t13d0 c2t14d0

I'm guessing that zpool will also complain that you're adding two RAID-Z devices with different stripe sizes. Again, -f will squelch the hand-holding.


--Bill
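For completeness, the override Bill refers to would presumably just be the same add command with the force flag (an untested sketch against the device names above):

  # force the add despite the "potentially active pool" warning
  zpool add -f test raidz c2t9d0 c2t10d0 c2t11d0 c2t12d0 c2t13d0 c2t14d0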
OK. I think I follow.

The drive used to be part of a pool that was deleted, so it had some kind of signature on the disk? Seems to me that when you delete the pool the signature should be removed, or the error msg should be made clearer. It in no way gave me a "wow, what now?" moment, since I knew the drive was not part of any pool. Maybe the msg could be changed to reflect that the drive has a signature on it, as opposed to being part of a pool?

Just my 2 cents. Thanks everyone for the help. I got everything up and running at this point on the U60... now I just need to get Solaris x86 to see the SIL680 PCI card I just dropped into another box.

-Sean

On Apr 9, 2006, at 3:49 AM, Bill Moore wrote:

>> So c2t2, c2t3, c2t4, c2t5, c2t8 are 18GB drives... the rest are 9GB drives...
>>
>> # zpool create -f test raidz c2t2d0 c2t3d0 c2t4d0 c2t5d0 c2t8d0
>> # zpool list
>> NAME                    SIZE    USED   AVAIL    CAP  HEALTH     ALTROOT
>> test                   84.5G    210K   84.5G     0%  ONLINE     -
>>
>> # zpool add test raidz c2t9d0 c2t10d0 c2t11d0 c2t12d0 c2t13d0 c2t14d0
>> invalid vdev specification
>> use '-f' to override the following errors:
>> /dev/dsk/c2t9d0s0 is part of exported or potentially active ZFS pool pc.
>> Please see zpool(1M).
>>
>> ...
>>
>> What did I miss?
>
> I don't think you missed anything. The "zpool add" command tells you
> that it thinks c2t9d0 is in use and that you have to use -f to override.
> Did you try that and it didn't work, or am I missing something?
>
> Incidentally, you could do the zpool create in one step with:
>
>   zpool create -f test \
>       raidz c2t2d0 c2t3d0 c2t4d0 c2t5d0 c2t8d0 \
>       raidz c2t9d0 c2t10d0 c2t11d0 c2t12d0 c2t13d0 c2t14d0
>
> I'm guessing that zpool will also complain that you're adding two
> RAID-Z devices with different stripe sizes. Again, -f will squelch
> the hand-holding.
>
> --Bill
Sean Hafeez wrote:

> OK. I think I follow.
>
> The drive used to be part of a pool that was deleted, so it had some kind
> of signature on the disk? Seems to me that when you delete the pool the
> signature should be removed, or the error msg should be made clearer. It
> in no way gave me a "wow, what now?" moment, since I knew the drive was
> not part of any pool. Maybe the msg could be changed to reflect that the
> drive has a signature on it, as opposed to being part of a pool?

Perhaps this is fixed by CR 6400742?

Dana
On Sun, Apr 09, 2006 at 10:55:34AM -0700, Sean Hafeez wrote:

> OK. I think I follow.
>
> The drive used to be part of a pool that was deleted, so it had some kind
> of signature on the disk? Seems to me that when you delete the pool the
> signature should be removed, or the error msg should be made clearer. It
> in no way gave me a "wow, what now?" moment, since I knew the drive was
> not part of any pool. Maybe the msg could be changed to reflect that the
> drive has a signature on it, as opposed to being part of a pool?
>
> Just my 2 cents.

If you destroy a pool, then a subsequent 'zpool create' will not complain. Note that there was a bug (6400742) introduced in build 37 and fixed in build 38 that prevented this from working properly, but it doesn't look like you're hitting this. My only guess is that you destroyed a pool while one of the drives was FAULTED/OFFLINE, in which case we couldn't write the label to indicate that it was destroyed. This is covered in the ZFS Administration Guide, but it's a difficult concept to convey in general. If you find that you can destroy a pool and then get a warning when recreating it, please file a bug.

The reason this message is phrased this way is that if you have shared (i.e. SAN) storage, this case is indistinguishable from trying to create a pool on top of something actively in use from another machine. We have some plans to put the hostid or something else in the label, so that if we find a pool marked active, last written from the current host, but otherwise not present on the system, we don't need to print this warning. An easier RFE would be to distinguish the exported case from the potentially active case:

  "cannot use X: is part of exported pool 'foo'"
  "cannot use X: may be part of active pool 'foo'"

The other time this can come up is if the /etc/zfs/zpool.cache file is damaged or missing, or if you are booted from an alternate root. In this case it really is active from the same system, but we just don't know about it. The pool may be importable, so we likely don't want admins to blindly 'zpool create' on top of it.

I'm not sure what you mean by "no way indicated ..." because the message says "use -f to override". Was there a reason you didn't try the '-f' flag? How exactly would you phrase this message, keeping in mind that we do want to indicate the potential that the pool really is active on another system, or active when booted from an alternate root?

Thanks,

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
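A quick sketch of the behavior Eric describes (pool and device names are placeholders): after a clean destroy, recreating a pool on the same disks should not require -f.

  zpool destroy test                          # marks the labels on healthy disks as destroyed
  zpool create test raidz c1t0d0 c1t1d0 c1t2d0   # should not warn
  # the warning only comes back if the labels could not be updated, e.g. a
  # FAULTED/OFFLINE disk at destroy time, or a SAN disk written by another host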
On Apr 9, 2006, at 11:40 AM, Eric Schrock wrote:

> On Sun, Apr 09, 2006 at 10:55:34AM -0700, Sean Hafeez wrote:
>> OK. I think I follow.
>>
>> The drive used to be part of a pool that was deleted, so it had some kind
>> of signature on the disk? Seems to me that when you delete the pool the
>> signature should be removed, or the error msg should be made clearer. It
>> in no way gave me a "wow, what now?" moment, since I knew the drive was
>> not part of any pool. Maybe the msg could be changed to reflect that the
>> drive has a signature on it, as opposed to being part of a pool?
>>
>> Just my 2 cents.
>
> If you destroy a pool, then a subsequent 'zpool create' will not
> complain. Note that there was a bug (6400742) introduced in build 37
> and fixed in build 38 that prevented this from working properly, but it
> doesn't look like you're hitting this. My only guess is that you
> destroyed a pool while one of the drives was FAULTED/OFFLINE, in which
> case we couldn't write the label to indicate that it was destroyed.
> This is covered in the ZFS Administration Guide, but it's a difficult
> concept to convey in general. If you find that you can destroy a pool
> and then get a warning when recreating it, please file a bug.

I was messing a bit with my 711 boxes and such with ZFS, but I am 99% sure that I did zpool destroy each time. Running NV B33. I have been waiting on ZFS boot to upgrade...

Thanks!

> I'm not sure what you mean by "no way indicated ..." because the message
> says "use -f to override". Was there a reason you didn't try the '-f'
> flag? How exactly would you phrase this message, keeping in mind that
> we do want to indicate the potential that the pool really is active on
> another system, or active when booted from an alternate root?

Sorry, my phrasing was bad. What I meant to say was that it left me without a clear idea what was going on and where to go from there.

I think I would change it to say something like this:

  "Warning: /dev/dsk/c2t9d0s0 is signed as being part of an exported or
  potentially active ZFS pool. Please verify before continuing. -f will
  override this warning and create the pool."

...or some such thing. If you really want to be sure, you can do what they did in Linux software RAID, which told you a disk was in a RAID set and told you to use "--force" to override it. Then when you did it with "--force", it said: if you are really sure you want to do this, do it again with "--really-force". ;)

Thanks!
Roch Bourbonnais - Performance Engineering    2006-Apr-10 08:57 UTC
[zfs-discuss] A few Newbie questions about RAIDZ
Sean Hafeez writes:

> On Apr 9, 2006, at 11:40 AM, Eric Schrock wrote:
>
>> I'm not sure what you mean by "no way indicated ..." because the message
>> says "use -f to override". Was there a reason you didn't try the '-f'
>> flag? How exactly would you phrase this message, keeping in mind that
>> we do want to indicate the potential that the pool really is active on
>> another system, or active when booted from an alternate root?
>
> Sorry, my phrasing was bad. What I meant to say was that it left me
> without a clear idea what was going on and where to go from there.
>
> I think I would change it to say something like this:
>
>   "Warning: /dev/dsk/c2t9d0s0 is signed as being part of an exported or
>   potentially active ZFS pool. Please verify before continuing. -f will
>   override this warning and create the pool."

As a general principle, I actually try to avoid override flags as much as possible. If a complex zpool create command fails with a warning, I'd rather fix the problem than give a blank check to zpool. The issue is that, if I fix a problem reported by zpool, I don't know, from the outside, that there would not be another one, yet unnoticed, waiting to bite me. I am not saying there is one, but a big override leaves me with an uncomfortable feeling.

So what I think would be good here is some way of fixing the identified problem such that a subsequent zpool create can be issued without the -f flag:

  zpool deregister c2t9d0

This makes more sense to me, since I may know something about that piece of hardware that gives me confidence that I am not about to cause a catastrophe.

____________________________________________________________________________________
Roch Bourbonnais                   Sun Microsystems, Icnc-Grenoble
Senior Performance Analyst         180, Avenue De L'Europe, 38330,
                                   Montbonnot Saint Martin, France
Performance & Availability Engineering
http://icncweb.france/~rbourbon    http://blogs.sun.com/roller/page/roch
Roch.Bourbonnais at Sun.Com        (+33).4.76.18.83.20
On Mon, Apr 10, 2006 at 10:57:16AM +0200, Roch Bourbonnais - Performance Engineering wrote:

> So what I think would be good here is some way of fixing the
> identified problem such that a subsequent zpool create can be issued
> without the -f flag:
>
>   zpool deregister c2t9d0
>
> This makes more sense to me, since I may know something about that
> piece of hardware that gives me confidence that I am not about to
> cause a catastrophe.

Yes, this would be useful. But it needs to be generic enough to scrub any disk of identifying characteristics. We have the same problem if a disk is formatted for use with UFS or VxFS. There should be some generic way to wipe a disk of any identifying characteristics (without having to wipe the entire disk).

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
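In the meantime, one blunt workaround people reach for (a sketch only; double-check the device name, and note this destroys whatever is on the disk) is to overwrite the start of the slice with dd so the old label no longer survives at the front of the device:

  # zero the first 10MB of the slice that carried the old pool label;
  # ZFS also keeps label copies near the end of the device, so a stubborn
  # disk may need its tail overwritten as well
  dd if=/dev/zero of=/dev/rdsk/c2t9d0s0 bs=1024k count=10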