I was asked a few interesting questions by a co-worker regarding ZFS and, after much googling, I still can't find answers.  I've seen several people ask these questions, only to have them answered indirectly.

First: if I have a pool that consists of one raidz1 vdev of three 500GB disks, and I go to the store, buy a fourth 500GB disk, and add it to the pool as a second vdev, what happens when that fourth disk has a hardware failure?

Second: let's say I have two disks and I create a non-parity pool [2 vdevs creating /tank] with a single child filesystem [/tank/fscopies2] that has the copies=2 attribute.  If I lose one of these disks, will I still have access to my files?  And if I were to add a third disk to this pool as a third vdev at some future point, is there any scenario in which a hardware failure would leave the rest of the pool unreadable?
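For reference, the two layouts I have in mind would be built roughly like this (device names are just placeholders):

-----8<----
# layout 1: a three-disk raidz1, plus the new disk as a second,
# unprotected vdev (zpool warns about the mismatched replication
# level and needs -f)
zpool create tank raidz1 c1t0d0 c1t1d0 c1t2d0
zpool add -f tank c1t3d0

# layout 2: two plain disks striped together, with a child filesystem
# that asks for two copies of every block
zpool create tank c1t0d0 c1t1d0
zfs create -o copies=2 tank/fscopies2
-----8<----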
On Thu, Nov 20, 2008 at 9:54 AM, Kam <cam at mrlane.com> wrote:

> First: if I have a pool that consists of one raidz1 vdev of three 500GB
> disks, and I go to the store, buy a fourth 500GB disk, and add it to the
> pool as a second vdev, what happens when that fourth disk has a hardware
> failure?

Your pool is toast.  If you set copies=2, and get lucky enough that copies
of every block on the standalone are copied to the raidz vdev, you might be
able to survive, but the odds are pretty slim.  Long story short, don't
ever do it.  Not only will performance suffer, you've basically wasted the
drive used for parity in the raidz vdev.  The pool isn't any better than a
raid-0 at this point.

> Second: let's say I have two disks and I create a non-parity pool [2 vdevs
> creating /tank] with a single child filesystem [/tank/fscopies2] that has
> the copies=2 attribute.  If I lose one of these disks, will I still have
> access to my files?  And if I were to add a third disk to this pool as a
> third vdev at some future point, is there any scenario in which a hardware
> failure would leave the rest of the pool unreadable?

Same as above: not necessarily.  There's nothing guaranteeing where the two
copies will exist, just that there will be two.  They may both be on one
disk, or they may not.  This is more to protect against corrupt blocks if
you only have a single drive, than against losing an entire drive.

Moral of the story continues to be: if you want protection against a failed
disk, use a raid algorithm that provides it.

--Tim
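P.S.  If the goal is to grow that raidz pool without giving up redundancy, the usual approach is to add a whole redundant vdev rather than a bare disk.  Roughly (device names made up):

-----8<----
# add a second three-disk raidz1 vdev; zfs stripes new writes across
# both vdevs, and either vdev can still survive losing one disk
zpool add tank raidz1 c2t0d0 c2t1d0 c2t2d0

# or, for a pool built from mirrors, add another mirror pair
zpool add tank mirror c2t0d0 c2t1d0
-----8<----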
>>>>> "t" == Tim <tim at tcsac.net> writes:

    >> a fourth 500GB disk and add
    >> it to the pool as a second vdev, what happens when that
    >> fourth disk has a hardware failure?

     t> If you set copies=2, and get lucky enough
     t> that copies of every block on the standalone are copied to the
     t> raidz vdev, you might be able to survive,

no, of course you won't survive.  Just try it with file vdevs before
pontificating about it.

-----8<----
bash-3.00# mkfile 64m t0
bash-3.00# mkfile 64m t1
bash-3.00# mkfile 64m t2
bash-3.00# mkfile 64m t00
bash-3.00# pwd -P
/usr/export
bash-3.00# zpool create foolpool raidz1 /usr/export/t{0..2}
bash-3.00# zpool add foolpool /usr/export/t00
invalid vdev specification
use '-f' to override the following errors:
mismatched replication level: pool uses raidz and new vdev is file
bash-3.00# zpool add -f foolpool /usr/export/t00
bash-3.00# zpool status -v foolpool
  pool: foolpool
 state: ONLINE
 scrub: none requested
config:

        NAME                STATE     READ WRITE CKSUM
        foolpool            ONLINE       0     0     0
          raidz1            ONLINE       0     0     0
            /usr/export/t0  ONLINE       0     0     0
            /usr/export/t1  ONLINE       0     0     0
            /usr/export/t2  ONLINE       0     0     0
          /usr/export/t00   ONLINE       0     0     0

errors: No known data errors
bash-3.00# zfs set copies=2 foolpool
bash-3.00# cd /
bash-3.00# pax -rwpe sbin foolpool/
bash-3.00# > /usr/export/t00
bash-3.00# pax -w foolpool/ > /dev/null
bash-3.00# zpool status -v foolpool
  pool: foolpool
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: none requested
config:

        NAME                STATE     READ WRITE CKSUM
        foolpool            DEGRADED     4     0    21
          raidz1            ONLINE       0     0     0
            /usr/export/t0  ONLINE       0     0     0
            /usr/export/t1  ONLINE       0     0     0
            /usr/export/t2  ONLINE       0     0     0
          /usr/export/t00   DEGRADED     4     0    21  too many errors

errors: No known data errors
bash-3.00# zpool offline foolpool /usr/export/t00
cannot offline /usr/export/t00: no valid replicas
bash-3.00# zpool export foolpool
panic[cpu0]/thread=2a1016b7ca0: assertion failed: vdev_config_sync(rvd, txg) == 0, file: ../../common/fs/zfs/spa.c, line: 3125

000002a1016b7850 genunix:assfail+78 (7b72c668, 7b72b680, c35, 183d800, 1285c00, 0)
  %l0-3: 0000000000000422 0000000000000081 000003001df5e580 0000000070170880
  %l4-7: 0000060016076c88 0000000000000000 0000000001887800 0000000000000000
000002a1016b7900 zfs:spa_sync+244 (3001df5e580, 42, 30043434e30, 7b72c400, 7b72b400, 4)
  %l0-3: 0000000000000000 000003001df5e6b0 000003001df5e670 00000600155cce80
  %l4-7: 0000030056703040 0000060013659200 000003001df5e708 00000000018c2e98
000002a1016b79c0 zfs:txg_sync_thread+120 (60013659200, 42, 2a1016b7a70, 60013659320, 60013659312, 60013659310)
  %l0-3: 0000000000000000 00000600136592d0 00000600136592d8 0000060013659316
  %l4-7: 0000060013659314 00000600136592c8 0000000000000043 0000000000000042

syncing file systems... done
[...first reboot...]
WARNING: ZFS replay transaction error 30, dataset boot/usr, seq 0x134c, txtype 9

NOTICE: iscsi: SendTarget discovery failed (11)     [``patiently waits'' forever]

~#Type 'go' to resume
{0} ok boot -m milestone=none
Resetting ...
[...second reboot...]

# /sbin/mount -o remount,rw /
# /sbin/mount /usr
# iscsiadm remove discovery-address 10.100.100.135
# iscsiadm remove discovery-address 10.100.100.138
# cd /usr/export
# mkdir hide
# mv t0 t1 t2 t00 hide
mv: cannot access t00                               [haha ZFS.]
# sync
# reboot
syncing file systems... done
[...third reboot...]
SunOS Release 5.11 Version snv_71 64-bit
Copyright 1983-2007 Sun Microsystems, Inc.  All rights reserved.
Use is subject to license terms.
NOTICE: mddb: unable to get devid for 'sd', 0xffffffff
NOTICE: mddb: unable to get devid for 'sd', 0xffffffff
NOTICE: mddb: unable to get devid for 'sd', 0xffffffff
WARNING: /pci at 8,700000/usb at 5,3 (ohci0): Connecting device on port 4 failed
Hostname: terabithia.th3h.inner.chaos
/usr/sbin/pmconfig: "/etc/power.conf" line 37, cannot find ufs mount point for "/usr/.CPR"
Reading ZFS config: done.
Mounting ZFS filesystems: (9/9)

terabithia.th3h.inner.chaos console login: root
Password:
Nov 20 13:09:30 terabithia.th3h.inner.chaos login: ROOT LOGIN /dev/console
Last login: Mon Aug 18 03:04:12 on console
Sun Microsystems Inc.   SunOS 5.11      snv_71  October 2007
You have new mail.
# exec bash
bash-3.00# zpool status -v foolpool
  pool: foolpool
 state: UNAVAIL
status: One or more devices could not be opened.  There are insufficient
        replicas for the pool to continue functioning.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-3C
 scrub: none requested
config:

        NAME                STATE     READ WRITE CKSUM
        foolpool            UNAVAIL      0     0     0  insufficient replicas
          raidz1            UNAVAIL      0     0     0  insufficient replicas
            /usr/export/t0  UNAVAIL      0     0     0  cannot open
            /usr/export/t1  UNAVAIL      0     0     0  cannot open
            /usr/export/t2  UNAVAIL      0     0     0  cannot open
          /usr/export/t00   UNAVAIL      0     0     0  cannot open
bash-3.00# cd /usr/export
bash-3.00# mv hide/* .
bash-3.00# zpool clear foolpool
cannot open 'foolpool': pool is unavailable
bash-3.00# zpool status -v foolpool
  pool: foolpool
 state: UNAVAIL
status: One or more devices could not be used because the label is missing
        or invalid.  There are insufficient replicas for the pool to continue
        functioning.
action: Destroy and re-create the pool from a backup source.
   see: http://www.sun.com/msg/ZFS-8000-5E
 scrub: none requested
config:

        NAME                STATE     READ WRITE CKSUM
        foolpool            UNAVAIL      0     0     0  insufficient replicas
          raidz1            ONLINE       0     0     0
            /usr/export/t0  ONLINE       0     0     0
            /usr/export/t1  ONLINE       0     0     0
            /usr/export/t2  ONLINE       0     0     0
          /usr/export/t00   UNAVAIL      0     0     0  corrupted data
bash-3.00# zpool export foolpool
bash-3.00# zpool import -d /usr/export foolpool
internal error: Value too large for defined data type
Abort (core dumped)
bash-3.00# rm core
bash-3.00# zpool import -d /usr/export
  pool: foolpool
    id: 8355048046034000632
 state: FAULTED
status: One or more devices contains corrupted data.
action: The pool cannot be imported due to damaged devices or data.
   see: http://www.sun.com/msg/ZFS-8000-5E
config:

        foolpool            UNAVAIL  insufficient replicas
          raidz1            ONLINE
            /usr/export/t0  ONLINE
            /usr/export/t1  ONLINE
            /usr/export/t2  ONLINE
          /usr/export/t00   UNAVAIL  corrupted data
bash-3.00# ls -l
total 398218
drwxr-xr-x   2 root     root           2 Nov 20 13:10 hide
drwxr-xr-x   6 root     root           6 Oct  5 02:04 nboot
-rw------T   1 root     root     67108864 Nov 20 12:50 t0
-rw------T   1 root     root     67028992 Nov 20 12:50 t00   [<- lol*2. where did THAT come from?
-rw------T   1 root     root     67108864 Nov 20 12:50 t1     and slightly too small, see?]
-rw------T   1 root     root     67108864 Nov 20 12:50 t2
-----8<----

     t> They may both be on one disk, or they may not.  This is more
     t> to protect against corrupt blocks if you only have a single
     t> drive, than against losing an entire drive.

It's not because there is some data which is not spread across both
drives.  It's because ZFS won't let you get at ANY of the data, even
what's spread, because of a variety of sanity checks, core dumps, and
kernel panics.
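And to be clear about what that knob is actually for: copies is a per-dataset property, it only applies to blocks written after it is set, and it is aimed at isolated bad sectors on an unredundant pool, not at a whole missing vdev.  Something like (dataset names are just examples):

-----8<----
# only data written from now on gets two (possibly same-disk) copies;
# blocks already on disk keep whatever copy count they were born with
zfs set copies=2 tank/home
zfs get -r copies tank
-----8<----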
copies=2 is a really silly feature, IMHO.
On Thu, Nov 20, 2008 at 12:35 PM, Miles Nordin <carton at ivy.net> wrote:

>      t> If you set copies=2, and get lucky enough
>      t> that copies of every block on the standalone are copied to the
>      t> raidz vdev, you might be able to survive,
>
> no, of course you won't survive.  Just try it with file vdevs before
> pontificating about it.
>
> [test transcript snipped]
>
>      t> They may both be on one disk, or they may not.  This is more
>      t> to protect against corrupt blocks if you only have a single
>      t> drive, than against losing an entire drive.
>
> It's not because there is some data which is not spread across both
> drives.  It's because ZFS won't let you get at ANY of the data, even
> what's spread, because of a variety of sanity checks, core dumps, and
> kernel panics.
>
> copies=2 is a really silly feature, IMHO.

Pretty sure ALL of the above are settings that can be changed.  Not to
mention that *backline*, from what I've seen, can still get the data if you
have valid copies of the blocks.

--Tim
>>>>> "t" == Tim <tim at tcsac.net> writes:

     t> Pretty sure ALL of the above are settings that can be changed.

nope.  But feel free to be more specific.  Or repeat the test
yourself---it only takes like fifteen minutes.  It'd take five if not
for the rebooting.

     t> Not to mention that *backline*, from what I've seen, can still
     t> get the data if you have valid copies of the blocks.

Can you elaborate?  There have been a lot of posts here about pools in
FAULTED state that won't import---I'm sure those posters would be even
more interested than I to know how they can recover their data using
*backline*.
On Thu, Nov 20, 2008 at 5:02 PM, Miles Nordin <carton at ivy.net> wrote:

>      t> Pretty sure ALL of the above are settings that can be changed.
>
> nope.  But feel free to be more specific.  Or repeat the test
> yourself---it only takes like fifteen minutes.  It'd take five if not
> for the rebooting.

Uhh, yes.  There's more than one post here describing how to set what the
system does when the pool is in a faulted state or loses a drive: wait,
panic, or proceed.  If we had any sort of decent search function I'd
already have the response.  As such, it'll take a bit of time for me to
dig it up.

>      t> Not to mention that *backline*, from what I've seen, can still
>      t> get the data if you have valid copies of the blocks.
>
> Can you elaborate?  There have been a lot of posts here about pools in
> FAULTED state that won't import---I'm sure those posters would be even
> more interested than I to know how they can recover their data using
> *backline*.

I'd have to search the list, but I've read of more than one person here
who has worked with a Sun engineer to manually re-create the metadata.
On Thu, Nov 20, 2008 at 5:02 PM, Miles Nordin <carton at ivy.net> wrote:

>      t> Pretty sure ALL of the above are settings that can be changed.
>
> nope.  But feel free to be more specific.  Or repeat the test
> yourself---it only takes like fifteen minutes.  It'd take five if not
> for the rebooting.

Here's this one:

PSARC 2007/567 zpool failmode property
http://prefetch.net/blog/index.php/2008/03/01/configuring-zfs-to-gracefully-deal-with-failures/
http://mail.opensolaris.org/pipermail/onnv-notify/2007-October/012782.html
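On builds that have it, failmode is just a pool property (pool name is a placeholder):

-----8<----
# failmode controls what the pool does after a catastrophic failure:
# 'wait' (the default) blocks I/O until the devices come back,
# 'continue' returns EIO to new writes, and 'panic' panics the host
zpool set failmode=continue tank
zpool get failmode tank
-----8<----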
>>>>> "t" == Tim <tim at tcsac.net> writes:

     t> Uhh, yes.  There's more than one post here describing how to
     t> set what the system does when the pool is in a faulted state
     t> or loses a drive: wait, panic, or proceed.

(a) search for it, try it, then report your results.  Don't counter a
    reasonably-designed albeit quick test with stardust and fantasy
    based on vague innuendos.

(b) ``ALL of the above are settings that can be changed.''  Please
    tell me how to fix ``ALL'' my problems, which you claim are all
    fixable without enumerating them.  I will:

    1. panic on exporting the pool

    2. 'zpool import foolpool' coredumps

    3. pools backed by unavailable iSCSI targets and stored in
       zpool.cache prevent bootup

    4. WARNING: ZFS replay transaction error 30, dataset boot/usr,
       seq 0x134c, txtype 9

    5. disappearing, reappearing /usr/export/t00 (the backing for the
       file vdev was another ZFS pool, so this is also a ZFS problem)

Yeah, I'm pressing the issue.  But I'm annoyed because I gave the
question an honest go and got a clear result, and instead of repeating
my test as you easily could, you dismiss the result without even
reading it carefully.  I think it's possible to give much better advice
by trying the thing out than by extrapolating from marketing claims.

     t> I've read of more than one person here who has worked with a
     t> Sun engineer to manually re-create the metadata.

(a) not in this situation, they didn't.  The loss was much less
    catastrophic than a whole vdev---in fact it was never proven that
    anything was lost at all by the underlying storage.  I think the
    pool may also have had a simpler structure, like a single-device
    vdev exported as a LUN from a SAN.

(b) that's not the same thing as *backline*, whatever that is.

(c) that's pretty far short of ``if you get lucky enough ... you might
    be able to survive.''  My claim: there is zero chance of importing
    the pool in the normal, supported way after this failure scenario.

(d) I don't think the OP had in mind counting on support from Sun
    engineering when you responded by suggesting an unredundant pool
    with copies=2 might suit his needs.
On Thu, Nov 20, 2008 at 5:37 PM, Miles Nordin <carton at ivy.net> wrote:

> (b) ``ALL of the above are settings that can be changed.''  Please
>     tell me how to fix ``ALL'' my problems, which you claim are all
>     fixable without enumerating them.
>
> [list of five problems snipped]
>
> Yeah, I'm pressing the issue.  But I'm annoyed because I gave the
> question an honest go and got a clear result, and instead of repeating
> my test as you easily could, you dismiss the result without even
> reading it carefully.

I gave you the PSARC.

> (d) I don't think the OP had in mind counting on support from Sun
>     engineering when you responded by suggesting an unredundant pool
>     with copies=2 might suit his needs.

Which is why I told him not to do it, and that his chances were slim...
>>>>> "t" == Tim <tim at tcsac.net> writes:

     t> I gave you the PSARC.

test it.

     t> Which is why I told him not to do it, and that his chances
     t> were slim...

his chances are zero.
>>>>> "t" == Tim <tim at tcsac.net> writes:

     t> PSARC 2007/567 zpool failmode property

I think this is in s10u6, so it's been in stable Solaris for twenty-one
days.  It's not in the release I tested, so it's worth trying again on a
newer release.

I tried on snv_83a, and instead of getting a panic when I export the
pool, I get a panic when I try to import it:

-----8<-----
bash-3.2# mkfile 64m t0
bash-3.2# mkfile 64m t1
bash-3.2# mkfile 64m t2
bash-3.2# mkfile 64m t00
bash-3.2# pwd
/export/v
bash-3.2# zpool create foolpool raidz /export/v/t{0..2}
bash-3.2# zpool add foolpool /export/v/t00
invalid vdev specification
use '-f' to override the following errors:
mismatched replication level: pool uses raidz and new vdev is file
bash-3.2# zpool add -f foolpool /export/v/t00
bash-3.2# zpool set failmode=continue foolpool
bash-3.2# zfs set copies=2 foolpool
bash-3.2# cd /
bash-3.2# pax -rwpe sbin foolpool/
bash-3.2# zpool status -v foolpool
  pool: foolpool
 state: ONLINE
 scrub: none requested
config:

        NAME              STATE     READ WRITE CKSUM
        foolpool          ONLINE       0     0     0
          raidz1          ONLINE       0     0     0
            /export/v/t0  ONLINE       0     0     0
            /export/v/t1  ONLINE       0     0     0
            /export/v/t2  ONLINE       0     0     0
          /export/v/t00   ONLINE       0     0     0

errors: No known data errors
bash-3.2# > /export/v/t00
bash-3.2# zpool status -v foolpool
  pool: foolpool
 state: ONLINE
 scrub: none requested
config:

        NAME              STATE     READ WRITE CKSUM
        foolpool          ONLINE       0     0     0
          raidz1          ONLINE       0     0     0
            /export/v/t0  ONLINE       0     0     0
            /export/v/t1  ONLINE       0     0     0
            /export/v/t2  ONLINE       0     0     0
          /export/v/t00   ONLINE       0     0     0

errors: No known data errors
bash-3.2# pax -w foolpool/ > /dev/null
bash-3.2# zpool status -v foolpool
  pool: foolpool
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: none requested
config:

        NAME              STATE     READ WRITE CKSUM
        foolpool          DEGRADED     5     0    22
          raidz1          ONLINE       0     0     0
            /export/v/t0  ONLINE       0     0     0
            /export/v/t1  ONLINE       0     0     0
            /export/v/t2  ONLINE       0     0     0
          /export/v/t00   DEGRADED     5     0    22  too many errors

errors: No known data errors
bash-3.2# zpool export foolpool
bash-3.2# zpool import -d /export/v
  pool: foolpool
    id: 2069059560600538796
 state: DEGRADED
status: One or more devices contains corrupted data.
action: The pool can be imported despite missing or damaged devices.  The
        fault tolerance of the pool may be compromised if imported.
   see: http://www.sun.com/msg/ZFS-8000-4J
config:

        foolpool          DEGRADED
          raidz1          ONLINE
            /export/v/t0  ONLINE
            /export/v/t1  ONLINE
            /export/v/t2  ONLINE
          /export/v/t00   UNAVAIL  corrupted data
bash-3.2# zpool import -d /export/v foolpool
panic: assertion failed: vdev_config_sync(rvd->vdev_child, rvd->vdev_children, txg) == 0 (0x5 == 0x0), file: ../../common/fs/zfs/spa.c, line: 4095
-----8<-----