kristof
2008-Oct-27 17:54 UTC
[zfs-discuss] zpool import: all devices online but: insufficient replicas
I have created a zpool on server1 with 3 mirrors. Every mirror consists of 1
local disk slice + 1 iscsi disk from server2.
I export the pool on server1, and try to import the pool on server2.
Without connecting over iscsi to the local targets, no zpool is seen.
After connecting over iscsi to my locally exposed disks, I thought I should be
able to import my pool, since for every mirror 1 submirror (the disk exposed as
a local iscsi target) is available.
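(In principle a one-sided mirror should still import; a quick file-backed sanity
check along these lines, with made-up paths, gives a successful but DEGRADED
import:)

mkfile 128m /var/tmp/d1 /var/tmp/d2
zpool create testmir mirror /var/tmp/d1 /var/tmp/d2
zpool export testmir
mv /var/tmp/d2 /var/tmp/d2.hidden     # hide one half of the mirror
zpool import -d /var/tmp testmir      # still imports, pool comes up DEGRADED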
When I try to import the pool I get:
bash-3.2# zpool import
pool: box3
id: 13669273719601162928
state: UNAVAIL
action: The pool cannot be imported due to damaged devices or data.
config:
        box3                                       UNAVAIL  insufficient replicas
          mirror                                   UNAVAIL  corrupted data
            c5t1d0s0                               ONLINE
            c8t600144F048FFCC000000E081B33B9800d0  ONLINE
          mirror                                   ONLINE
            c6t0d0s0                               ONLINE
            c8t600144F048FFCCC50000E081B33B9800d0  ONLINE
          mirror                                   ONLINE
            c6t1d0s0                               ONLINE
            c8t600144F048FFCCD80000E081B33B9800d0  ONLINE
But the pool can still be imported on server1, so the pool is still OK.
How come I cannot import the pool on another node?
Thanks in advance.
Kristof
--
This message posted from opensolaris.org
Miles Nordin
2008-Oct-27 19:23 UTC
[zfs-discuss] zpool import: all devices online but: insufficient replicas
>>>>> "k" == kristof <kristof at aserver.com> writes:k> I have create a zpool on server1 with 3 mirrors. Every mirror k> consists out: 1 local disk slice + 1 iscsi disk from server2. k> I export the pool on server1, and try to import the pool on k> server2 I can''t follow your situation. what might help: * describe what you did in chronological order * give the servers names, and refer to them by name in your chronological description. Say ``server1'''', ``server3'''', do not say ``the other server'''' * when you reconfigure something,k take time to describe the change like it''s a big deal. Don''t just say ``then I connected the local iscsi target''''. WTF is a local iSCSI target? isn''t the point of iSCSI that things are remote? is it a target and initiator running on the same node? Has something that was not a target before become a target now? * use words with rigorously consistent meanings, like do not use ''mirror'' to refer to a server. Use ''mirror'' to talk about a mirrored ZFS vdev, and ''node'' to talk about a server. Also be sure to use the terms ``iSCSI target'''' and ``iSCSI initiator'''' clearly and not quietly swap them for each other. * when words are only two or three letters long, spell them correctly. Don''t say ''out'' when you mean ''of''. HTH, HAND. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 304 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20081027/0d395d02/attachment.bin>
Hi Miles, thanks for your reply.  Sorry for my bad English, it's not my first
language.  Ok, I will try to explain again.

My goal is to have a zpool that can move between 2 nodes (via zpool
export/import).

So I start by creating 3 equal slices on Server A: I create a 450 GB slice 0 on
each of the disks c5t1d0, c6t0d0 and c6t1d0.

Then on Server B I do the same, and there I create 3 iscsi targets with those
slices:

iscsitadm create target /dev/rdsk/c5t1d0s0 disk1
iscsitadm create target /dev/rdsk/c6t0d0s0 disk2
iscsitadm create target /dev/rdsk/c6t1d0s0 disk3

Then, back on Server A, I add 3 static configs (for the above iscsi targets) (*)
and I run:

iscsiadm modify discovery -s enable
devfsadm -i iscsi

Now I have 3 extra disks.  Then, still on Server A, I create a zpool:

zpool create -f box3 \
  mirror c5t1d0s0 c8t600144F048FFCC000000E081B33B9800d0 \
  mirror c6t0d0s0 c8t600144F048FFCCC50000E081B33B9800d0 \
  mirror c6t1d0s0 c8t600144F048FFCCD80000E081B33B9800d0

So far everything works fine.

Ok, now I would like to test if this zpool can be imported on Server B (in case
of disaster).  On Server A I export the zpool and log out of the iscsi targets
(iscsiadm modify discovery -s disable).

On Server B I try to import a zpool, but none is found.  On Server B I then add
the same static configs (*), but this time with 127.0.0.1 as the IP.  I run
iscsiadm modify discovery -s enable and devfsadm -i iscsi, and try the import
again.  This time zpool import finds a zpool, but it cannot be imported; see
the error in my previous post.

If something is still unclear, please let me know.

Kristof
--
This message posted from opensolaris.org
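(For reference, the static configs in step (*) amount to something like the
following on the initiator side; the IQNs and the 192.168.1.2 address here are
placeholders, not the exact values used:)

# one static-config entry per target exposed by Server B (placeholder IQN/address)
iscsiadm add static-config iqn.1986-03.com.sun:02:<target-uuid>.disk1,192.168.1.2:3260
iscsiadm add static-config iqn.1986-03.com.sun:02:<target-uuid>.disk2,192.168.1.2:3260
iscsiadm add static-config iqn.1986-03.com.sun:02:<target-uuid>.disk3,192.168.1.2:3260

# enable static discovery and build the device nodes
iscsiadm modify discovery --static enable
devfsadm -i iscsi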
>>>>> "k" == kristof <kristof at aserver.com> writes:

    k> iscsitadm create target /dev/rdsk/c5t1d0s0 disk1

    k> zpool create -f box3 mirror c5t1d0s0
    k> c8t600144F048FFCC000000E081B33B9800d0

I wonder if ZFS is silently making an EFI label inside slice 0.  I've never
used the Solaris iSCSI target, but you might try this on server B:

 format -e /dev/rdsk/c5t1d0s2
  part
  label
  [pick EFI label]

verify that /dev/{,r}dsk/c5t1d0 exist, confirming(?) it is really an EFI
label.

 iscsitadm destroy ...
 iscsitadm create target /dev/dsk/c5t1d0 disk1

then start over creating the pool on server A, exporting, shutting off the
iSCSI initiator, trying to import on server B.  I'm kind of hurling sticky
stuff at the wall without a clue, though.
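Something along these lines (untested; device name taken from your mail) should
also show what kind of label is on the disk without going through format:

# a whole-disk node like this should only exist when the disk carries an EFI label
ls -l /dev/dsk/c5t1d0 /dev/rdsk/c5t1d0

# prtvtoc prints the partition map from whatever label (VTOC or EFI) is on the disk
prtvtoc /dev/rdsk/c5t1d0s0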
Nigel Smith
2008-Oct-27 23:34 UTC
[zfs-discuss] zpool import: all devices online but: insufficient replicas
Hi Kristof

Please could you post back to this forum the output from

 # zdb -l /dev/rdsk/...

... for each of the storage devices in your pool, while it is in a working
condition on Server1.  (Maybe best as an attachment)

Then do the same again with the pool on Server2.

What is the reported 'status' of your zpool on Server2?
(You have not provided a 'zpool status')

Thanks
Nigel Smith
--
This message posted from opensolaris.org
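(A sketch of how those labels could be collected in one go; the device names
are taken from the zpool import output earlier in this thread, and the
whole-disk iSCSI devices may need an s0 suffix depending on how ZFS labelled
them:)

for dev in c5t1d0s0 c6t0d0s0 c6t1d0s0 \
           c8t600144F048FFCC000000E081B33B9800d0 \
           c8t600144F048FFCCC50000E081B33B9800d0 \
           c8t600144F048FFCCD80000E081B33B9800d0
do
        echo "===== $dev ====="     # header so the labels stay separated
        zdb -l /dev/rdsk/$dev       # dumps the four ZFS labels on the device
done > /tmp/zdb-labels.txt 2>&1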
kristof
2008-Oct-28 19:22 UTC
[zfs-discuss] zpool import: all devices online but: insufficient replicas
Hi,
Today I tried one more time from scratch.
I re-installed server B with the latest available opensolaris 2008.11 (b99);
b.t.w. server A runs opensolaris 2008 b98.
I also re-labeled all my disks.
This time I can successfully import the pool on server B:
root@box4:~# zpool import
pool: box3
id: 12004712858660209674
state: ONLINE
status: One or more devices contains corrupted data.
action: The pool can be imported using its name or numeric identifier.
see: http://www.sun.com/msg/ZFS-8000-4J
config:
        box3                                       ONLINE
          mirror                                   ONLINE
            c5t1d0s0                               UNAVAIL  corrupted data
            c8t600144F047BAB22E0000E081B33B9800d0  ONLINE
          mirror                                   ONLINE
            c6t0d0s0                               UNAVAIL  corrupted data
            c8t600144F047BAB22F0000E081B33B9800d0  ONLINE
          mirror                                   ONLINE
            c6t1d0s0                               UNAVAIL  corrupted data
            c8t600144F047BAB2310000E081B33B9800d0  ONLINE
zpool import box3
root@box4:~# zpool status
pool: box3
state: ONLINE
status: One or more devices could not be used because the label is missing or
invalid. Sufficient replicas exist for the pool to continue
functioning in a degraded state.
action: Replace the device using 'zpool replace'.
see: http://www.sun.com/msg/ZFS-8000-4J
scrub: none requested
config:
        NAME                                       STATE     READ WRITE CKSUM
        box3                                       ONLINE       0     0     0
          mirror                                   ONLINE       0     0     0
            7459600399687860000                    UNAVAIL      0     0     0  was /dev/dsk/c5t1d0s0
            c8t600144F047BAB22E0000E081B33B9800d0  ONLINE       0     0     0
          mirror                                   ONLINE       0     0     0
            10898717263963950690                   UNAVAIL      0     0     0  was /dev/dsk/c6t0d0s0
            c8t600144F047BAB22F0000E081B33B9800d0  ONLINE       0     0     0
          mirror                                   ONLINE       0     0     0
            7304718211036211049                    UNAVAIL      0     0     0  was /dev/dsk/c6t1d0s0
            c8t600144F047BAB2310000E081B33B9800d0  ONLINE       0     0     0
errors: No known data errors
Then I export the pool again on server B and log out of the iscsi targets.
On server A I re-enable the iscsi targets and try to import the pool back; this
is what I get:
-bash-3.2# zpool import
pool: box3
id: 12004712858660209674
state: DEGRADED
status: One or more devices are missing from the system.
action: The pool can be imported despite missing or damaged devices. The
fault tolerance of the pool may be compromised if imported.
see: http://www.sun.com/msg/ZFS-8000-2Q
config:
        box3                                       DEGRADED
          mirror                                   DEGRADED
            c5t1d0s0                               ONLINE
            c0t600144F047BAB22E0000E081B33B9800d0  UNAVAIL  cannot open
          mirror                                   DEGRADED
            c6t0d0s0                               ONLINE
            c0t600144F047BAB22F0000E081B33B9800d0  UNAVAIL  cannot open
          mirror                                   DEGRADED
            c6t1d0s0                               ONLINE
            c0t600144F047BAB2310000E081B33B9800d0  UNAVAIL  cannot open
-bash-3.2# format
Searching for disks...
Error: can't open selected disk '/dev/rdsk/c0t600144F047BAB22E0000E081B33B9800d0p0'.
Error: can't open selected disk '/dev/rdsk/c0t600144F047BAB22F0000E081B33B9800d0p0'.
Error: can't open selected disk '/dev/rdsk/c0t600144F047BAB2310000E081B33B9800d0p0'.
done
c0t600144F047BAB22E0000E081B33B9800d0: configured with capacity of 446.24GB
c0t600144F047BAB22F0000E081B33B9800d0: configured with capacity of 446.24GB
c0t600144F047BAB2310000E081B33B9800d0: configured with capacity of 446.24GB
AVAILABLE DISK SELECTIONS:
       0. c0t600144F047BAB22E0000E081B33B9800d0 <SUN-SOLARIS-1 cyl 56 alt 2 hd 255 sec 65535>
          /scsi_vhci/disk@g600144f047bab22e0000e081b33b9800
       1. c0t600144F047BAB22F0000E081B33B9800d0 <SUN-SOLARIS-1 cyl 56 alt 2 hd 255 sec 65535>
          /scsi_vhci/disk@g600144f047bab22f0000e081b33b9800
       2. c0t600144F047BAB2310000E081B33B9800d0 <SUN-SOLARIS-1 cyl 56 alt 2 hd 255 sec 65535>
          /scsi_vhci/disk@g600144f047bab2310000e081b33b9800
       3. c5t0d0 <DEFAULT cyl 60798 alt 2 hd 255 sec 63>
          /pci@0,0/pci10de,cb84@5/disk@0,0
       4. c5t1d0 <ATA-WDCWD5001ABYS-0-1D01 cyl 60798 alt 2 hd 255 sec 63>
          /pci@0,0/pci10de,cb84@5/disk@1,0
       5. c6t0d0 <ATA-WDCWD1000FYPS-0-1B01 cyl 60797 alt 2 hd 255 sec 126>
          /pci@0,0/pci10de,cb84@5,1/disk@0,0
       6. c6t1d0 <DEFAULT cyl 60797 alt 2 hd 255 sec 126>
          /pci@0,0/pci10de,cb84@5,1/disk@1,0
       7. c7t0d0 <DEFAULT cyl 60797 alt 2 hd 255 sec 126>
          /pci@0,0/pci10de,cb84@5,2/disk@0,0
       8. c7t1d0 <DEFAULT cyl 60797 alt 2 hd 255 sec 126>
          /pci@0,0/pci10de,cb84@5,2/disk@1,0
Specify disk (enter its number): 0
selecting c0t600144F047BAB22E0000E081B33B9800d0
[disk unformatted]
Error: can't open disk '/dev/rdsk/c0t600144F047BAB22E0000E081B33B9800d0p0'.
Error: can't open disk '/dev/rdsk/c0t600144F047BAB22E0000E081B33B9800d0p0'.
Disk not labeled. Label it now?
So for some reason the labels are corrupt or something ...
When I try to relabel the disk I get:
Warning: error writing VTOC.
Warning: error reading backup label.
Warning: error reading backup label.
Warning: error reading backup label.
Warning: error reading backup label.
Warning: error reading backup label.
Warning: no backup labels
Write label failed
FORMAT MENU:
disk - select a disk
type - select (define) a disk type
partition - select (define) a partition table
current - describe the current disk
format - format and analyze the disk
fdisk - run the fdisk program
repair - repair a defective sector
label - write label to the disk
analyze - surface analysis
defect - defect list management
backup - search for backup labels
verify - read and display labels
save - save new disk/partition definitions
inquiry - show vendor, product and revision
volname - set 8-character volume name
!<cmd> - execute <cmd>, then return
quit
format> fdisk
fdisk: Cannot open device /dev/rdsk/c0t600144F047BAB22E0000E081B33B9800d0p0.
Bad read of fdisk partition. Status = ffffffff
Cannot read fdisk partition information.
No fdisk solaris partition found
dmesg shows me:
Oct 28 20:11:00 box3 genunix: [ID 408114 kern.info] /scsi_vhci/disk@g600144f047bab22e0000e081b33b9800 (sd16) offline
Oct 28 20:11:00 box3 genunix: [ID 834635 kern.info] /scsi_vhci/disk@g600144f047bab22e0000e081b33b9800 (sd16) multipath status: failed, path /iscsi (iscsi0) to target address: 0000iqn.1986-03.com.sun:02:b2570926-f5f4-ee41-a3b4-89d7f965e31b.disk3FFFF,0 is offline Load balancing: round-robin
Oct 28 20:11:00 box3 genunix: [ID 408114 kern.info] /scsi_vhci/disk@g600144f047bab22e0000e081b33b9800 (sd16) offline
Oct 28 20:11:00 box3 iscsi: [ID 732673 kern.info] NOTICE: iscsi session(172) iqn.1986-03.com.sun:02:b2570926-f5f4-ee41-a3b4-89d7f965e31b.disk3 offline
Oct 28 20:11:00 box3 genunix: [ID 408114 kern.info] /scsi_vhci/disk@g600144f047bab22f0000e081b33b9800 (sd17) offline
Oct 28 20:11:00 box3 genunix: [ID 834635 kern.info] /scsi_vhci/disk@g600144f047bab22f0000e081b33b9800 (sd17) multipath status: failed, path /iscsi (iscsi0) to target address: 0000iqn.1986-03.com.sun:02:5c9dd9a7-e1a9-c68e-8b2a-ee0e3a97596d.disk2FFFF,0 is offline Load balancing: round-robin
Oct 28 20:11:00 box3 genunix: [ID 408114 kern.info] /scsi_vhci/disk@g600144f047bab22f0000e081b33b9800 (sd17) offline
Oct 28 20:11:00 box3 iscsi: [ID 732673 kern.info] NOTICE: iscsi session(169) iqn.1986-03.com.sun:02:5c9dd9a7-e1a9-c68e-8b2a-ee0e3a97596d.disk2 offline
Oct 28 20:11:00 box3 genunix: [ID 408114 kern.info] /scsi_vhci/disk@g600144f047bab2310000e081b33b9800 (sd18) offline
Oct 28 20:11:00 box3 genunix: [ID 834635 kern.info] /scsi_vhci/disk@g600144f047bab2310000e081b33b9800 (sd18) multipath status: failed, path /iscsi (iscsi0) to target address: 0000iqn.1986-03.com.sun:02:dcdc3bdd-2f56-e18c-d0c8-caeacd624f08.disk1FFFF,0 is offline Load balancing: round-robin
Oct 28 20:11:00 box3 genunix: [ID 408114 kern.info] /scsi_vhci/disk@g600144f047bab2310000e081b33b9800 (sd18) offline
Oct 28 20:11:00 box3 iscsi: [ID 732673 kern.info] NOTICE: iscsi session(166) iqn.1986-03.com.sun:02:dcdc3bdd-2f56-e18c-d0c8-caeacd624f08.disk1 offline
Oct 28 20:11:00 box3 pcplusmp: [ID 444295 kern.info] pcplusmp: ide (ata) instance #1 vector 0xf ioapic 0x2 intin 0xf is bound to cpu 1
Oct 28 20:11:04 box3 last message repeated 2 times
Oct 28 20:11:13 box3 iscsi: [ID 559844 kern.info] NOTICE: iscsi session(184) iqn.1986-03.com.sun:02:b2570926-f5f4-ee41-a3b4-89d7f965e31b.disk3 online
Oct 28 20:11:13 box3 scsi: [ID 799468 kern.info] sd16 at scsi_vhci0: name g600144f047bab22e0000e081b33b9800, bus address g600144f047bab22e0000e081b33b9800 f_tpgs
Oct 28 20:11:13 box3 genunix: [ID 936769 kern.info] sd16 is /scsi_vhci/disk@g600144f047bab22e0000e081b33b9800
Oct 28 20:11:13 box3 scsi_vhci: [ID 734749 kern.warning] WARNING: vhci_scsi_reset 0x1
Oct 28 20:11:13 box3 last message repeated 1 time
Oct 28 20:11:13 box3 genunix: [ID 408114 kern.info] /scsi_vhci/disk@g600144f047bab22e0000e081b33b9800 (sd16) online
Oct 28 20:11:13 box3 genunix: [ID 834635 kern.info] /scsi_vhci/disk@g600144f047bab22e0000e081b33b9800 (sd16) multipath status: degraded, path /iscsi (iscsi0) to target address: 0000iqn.1986-03.com.sun:02:b2570926-f5f4-ee41-a3b4-89d7f965e31b.disk3FFFF,0 is online Load balancing: round-robin
Oct 28 20:11:13 box3 iscsi: [ID 559844 kern.info] NOTICE: iscsi session(181) iqn.1986-03.com.sun:02:5c9dd9a7-e1a9-c68e-8b2a-ee0e3a97596d.disk2 online
Oct 28 20:11:13 box3 scsi: [ID 799468 kern.info] sd17 at scsi_vhci0: name g600144f047bab22f0000e081b33b9800, bus address g600144f047bab22f0000e081b33b9800 f_tpgs
Oct 28 20:11:13 box3 genunix: [ID 936769 kern.info] sd17 is /scsi_vhci/disk@g600144f047bab22f0000e081b33b9800
Oct 28 20:11:13 box3 scsi_vhci: [ID 734749 kern.warning] WARNING: vhci_scsi_reset 0x1
Oct 28 20:11:13 box3 last message repeated 1 time
Oct 28 20:11:13 box3 genunix: [ID 408114 kern.info] /scsi_vhci/disk@g600144f047bab22f0000e081b33b9800 (sd17) online
Oct 28 20:11:13 box3 genunix: [ID 834635 kern.info] /scsi_vhci/disk@g600144f047bab22f0000e081b33b9800 (sd17) multipath status: degraded, path /iscsi (iscsi0) to target address: 0000iqn.1986-03.com.sun:02:5c9dd9a7-e1a9-c68e-8b2a-ee0e3a97596d.disk2FFFF,0 is online Load balancing: round-robin
Oct 28 20:11:13 box3 iscsi: [ID 559844 kern.info] NOTICE: iscsi session(178) iqn.1986-03.com.sun:02:dcdc3bdd-2f56-e18c-d0c8-caeacd624f08.disk1 online
Oct 28 20:11:13 box3 scsi: [ID 799468 kern.info] sd18 at scsi_vhci0: name g600144f047bab2310000e081b33b9800, bus address g600144f047bab2310000e081b33b9800 f_tpgs
Oct 28 20:11:13 box3 genunix: [ID 936769 kern.info] sd18 is /scsi_vhci/disk@g600144f047bab2310000e081b33b9800
Oct 28 20:11:13 box3 scsi_vhci: [ID 734749 kern.warning] WARNING: vhci_scsi_reset 0x1
Oct 28 20:11:13 box3 last message repeated 1 time
Oct 28 20:11:13 box3 genunix: [ID 408114 kern.info] /scsi_vhci/disk@g600144f047bab2310000e081b33b9800 (sd18) online
Oct 28 20:11:13 box3 genunix: [ID 834635 kern.info] /scsi_vhci/disk@g600144f047bab2310000e081b33b9800 (sd18) multipath status: degraded, path /iscsi (iscsi0) to target address: 0000iqn.1986-03.com.sun:02:dcdc3bdd-2f56-e18c-d0c8-caeacd624f08.disk1FFFF,0 is online Load balancing: round-robin
Oct 28 20:11:13 box3 pcplusmp: [ID 444295 kern.info] pcplusmp: ide (ata) instance #1 vector 0xf ioapic 0x2 intin 0xf is bound to cpu 1
Oct 28 20:11:15 box3 scsi_vhci: [ID 734749 kern.warning] WARNING: vhci_scsi_reset 0x1
Oct 28 20:11:15 box3 last message repeated 5 times
Oct 28 20:11:20 box3 pcplusmp: [ID 444295 kern.info] pcplusmp: ide (ata) instance #1 vector 0xf ioapic 0x2 intin 0xf is bound to cpu 1
Oct 28 20:11:24 box3 scsi_vhci: [ID 734749 kern.warning] WARNING: vhci_scsi_reset 0x1
Oct 28 20:11:27 box3 last message repeated 30 times
Oct 28 20:11:27 box3 pcplusmp: [ID 444295 kern.info] pcplusmp: ide (ata) instance #1 vector 0xf ioapic 0x2 intin 0xf is bound to cpu 1
Oct 28 20:11:28 box3 scsi_vhci: [ID 734749 kern.warning] WARNING: vhci_scsi_reset 0x1
-bash-3.2#
Does anyone know what is going wrong here?
--
This message posted from opensolaris.org