I'm in the process of standing up a couple of T5440s whose config will
eventually end up in another data center 6,000 miles from the original,
and I'm supposed to send disks to that data center and we'll start from
there (yes, I know how to flar and jumpstart; when the boss says do
something, sometimes you *just* have to do it).
Having already run into the failsafe boot when moving a root disk from
one SPARC host to another, I recently found out that a sys-unconfig'd
disk does not suffer from the same problem.
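(Side note, in case anyone else trips over the failsafe thing: assuming
the failsafe trip is the usual stale-boot-archive complaint you get when
a root disk moves to a different host, the standard recovery, as far as
I know, runs along these lines; the BE name below is just a placeholder:

    ok boot -F failsafe
    # from the failsafe shell, mount the root BE on /a, e.g.
    mount -F zfs rpool/ROOT/s10be /a
    # rebuild the boot archive for that root, then reboot
    bootadm update-archive -R /a
    umount /a
    reboot

sys-unconfig just let me skip that dance entirely.)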
While I am probably going to be told I shouldn't be doing this, I ran
into an interesting "semantics" issue that I think ZFS should at least
be able to avoid (and which I have seen in other, non-abusive
configurations... ;-)
The setup: 2 ZFS disks, root mirrored, c2t0 and c2t1.

hot-unplug c2t0 (I should probably have detached the busted half of the
mirror from the pool on c2t1 at this point, but I didn't; see the sketch
after the status output)
sys-unconfig the disk in c2t1
move that disk to the new t5440
boot it, and it enumerates everything correctly, and then I notice zpool
thinks it's degraded. (The new mirror half had already been added by the
time I took the output below, after I realized I wanted to run this by
the list....)
  pool: rpool
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid. Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-4J
 scrub: resilver completed after 0h7m with 0 errors on Thu Jan 7 12:10:03 2010
config:

        NAME          STATE     READ WRITE CKSUM
        rpool         DEGRADED     0     0     0
          mirror      DEGRADED     0     0     0
            c2t0d0s0  ONLINE       0     0     0
            c2t0d0s0  FAULTED      0     0     0  corrupted data
            c2t3d0s0  ONLINE       0     0     0  13.8G resilvered
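For the record, the detach I skipped would have been something like the
following (same device names as above; a sketch, not a transcript of
what I actually typed):

    # on the original host, after pulling c2t0d0: drop the now-missing
    # half of the mirror so the pool doesn't carry a stale vdev around
    zpool detach rpool c2t0d0s0

    # then unconfigure and shut down before moving the remaining disk
    /usr/sbin/sys-unconfig

and on the new T5440, once the moved disk booted and showed up as
c2t0d0s0, the re-mirror was roughly the usual attach:

    # attach a fresh disk to the surviving half and let it resilver
    zpool attach rpool c2t0d0s0 c2t3d0s0

    # plus the SPARC bootblock on the new half
    installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/c2t3d0s0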
Anyway, should zfs report a faulted drive with the same ctd# as one that
is already active? I understand why this happened, but from a logistics
perspective, shouldn't zfs be smart enough to ignore a faulted disk like
this? This is not the first time I've had this scenario happen (I had an
x4500 that had suffered through months of marvell driver bugs and
corruption, and we probably had 2 or 3 of these types of things happen
while trying to "soft" fix the problems). It also happened with hot
spares, which caused support to spend some time with back-line to figure
out a procedure to clear those faulted disks that had the same ctd# as a
working hot spare...
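(The only general-purpose trick I'm aware of for the duplicate-ctd# case
is to stop using the device name altogether and address the phantom vdev
by its numeric GUID; if I remember right, detach/replace will accept the
GUID in place of the name. Something like this, with a made-up GUID:

    # dump the cached pool config and pick out the GUID of the faulted vdev
    zdb -C rpool | grep -i guid
    # then detach the phantom by GUID rather than the ambiguous c2t0d0s0
    zpool detach rpool 9203387253918297777

If that works the way I remember, it's still exactly the kind of thing
back-line shouldn't have to walk people through.)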
Ben