kristof
2008-Oct-27 17:54 UTC
[zfs-discuss] zpool import: all devices online but: insufficient replicas
I have created a zpool on server1 with 3 mirrors. Every mirror consists out: 1 local disk slice + 1 iscsi disk from server2. I export the pool on server1, and try to import the pool on server2.

Without connecting over iscsi to the local targets, no zpool is seen. After connecting over iscsi to my locally exposed disks, I thought I should be able to import my pool, since for every mirror 1 submirror (the local iscsi target) is available. When I try to import the pool I get:

bash-3.2# zpool import
  pool: box3
    id: 13669273719601162928
 state: UNAVAIL
action: The pool cannot be imported due to damaged devices or data.
config:

        box3                                       UNAVAIL  insufficient replicas
          mirror                                   UNAVAIL  corrupted data
            c5t1d0s0                               ONLINE
            c8t600144F048FFCC000000E081B33B9800d0  ONLINE
          mirror                                   ONLINE
            c6t0d0s0                               ONLINE
            c8t600144F048FFCCC50000E081B33B9800d0  ONLINE
          mirror                                   ONLINE
            c6t1d0s0                               ONLINE
            c8t600144F048FFCCD80000E081B33B9800d0  ONLINE

But the pool can still be imported on server1, so the pool is still OK. How come I cannot import the pool on another node?

Thanks in advance.

Kristof
--
This message posted from opensolaris.org
Miles Nordin
2008-Oct-27 19:23 UTC
[zfs-discuss] zpool import: all devices online but: insufficient replicas
>>>>> "k" == kristof <kristof at aserver.com> writes:

     k> I have create a zpool on server1 with 3 mirrors. Every mirror
     k> consists out: 1 local disk slice + 1 iscsi disk from server2.
     k> I export the pool on server1, and try to import the pool on
     k> server2

I can't follow your situation. What might help:

 * describe what you did in chronological order

 * give the servers names, and refer to them by name in your
   chronological description. Say ``server1'', ``server3'', do not
   say ``the other server''

 * when you reconfigure something, take time to describe the change
   like it's a big deal. Don't just say ``then I connected the local
   iscsi target''. WTF is a local iSCSI target? Isn't the point of
   iSCSI that things are remote? Is it a target and initiator running
   on the same node? Has something that was not a target before
   become a target now?

 * use words with rigorously consistent meanings, like do not use
   'mirror' to refer to a server. Use 'mirror' to talk about a
   mirrored ZFS vdev, and 'node' to talk about a server. Also be sure
   to use the terms ``iSCSI target'' and ``iSCSI initiator'' clearly
   and not quietly swap them for each other.

 * when words are only two or three letters long, spell them
   correctly. Don't say 'out' when you mean 'of'.

HTH, HAND.
Hi Miles, thanks for your reply. Sorry for my bad English, it's not my first language. OK, I will try to explain again.

My goal is to have a zpool that can move between 2 nodes (via zpool export/import).

So I start by creating 3 equal slices on Server A: I create a 450 GB slice 0 on disks c5t1d0, c6t0d0 and c6t1d0.

Then on Server B I do the same, and I create 3 iscsi targets with those slices:

  iscsitadm create target /dev/rdsk/c5t1d0s0 disk1
  iscsitadm create target /dev/rdsk/c6t0d0s0 disk2
  iscsitadm create target /dev/rdsk/c6t1d0s0 disk3

Then again on Server A I add 3 static configs (for the above iscsi targets) (*) and I run:

  iscsiadm modify discovery -s enable
  devfsadm -i iscsi

Now I have 3 extra disks. Then again on Server A I create a zpool:

  zpool create -f box3 \
    mirror c5t1d0s0 c8t600144F048FFCC000000E081B33B9800d0 \
    mirror c6t0d0s0 c8t600144F048FFCCC50000E081B33B9800d0 \
    mirror c6t1d0s0 c8t600144F048FFCCD80000E081B33B9800d0

So far everything works fine.

OK, now I would like to test if this zpool can be imported on Server B (in case of disaster). On Server A I export the zpool and log out the iscsi targets (iscsiadm modify discovery -s disable). On Server B I try to import a zpool, but none is found.

On Server B I add the same static configs (*) but this time with 127.0.0.1 as IP. I run iscsiadm modify discovery -s enable & devfsadm -i iscsi, and try the import again. This time zpool import finds a zpool, but it cannot be imported; see error in previous post.

If something is still unclear, please let me know.

Kristof
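[Editor's note: putting the two nodes' steps side by side, the failover test described above amounts to the following sketch. The commands are the ones given in the thread; the ordering is my reading of it, not a tested recipe, and it assumes the static configs for the three targets are already in place on each node.]

```shell
# Server A: hand the pool off
zpool export box3
iscsiadm modify discovery -s disable    # log out of the iSCSI targets

# Server B: reach its own targets via the loopback portal, then rescan
# (static configs for the three targets were added beforehand with 127.0.0.1)
iscsiadm modify discovery -s enable
devfsadm -i iscsi                       # create device nodes for the new LUNs

# Server B: the pool should now show up with one live half per mirror
zpool import
zpool import box3
```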
>>>>> "k" == kristof <kristof at aserver.com> writes:

     k> iscsitadm create target /dev/rdsk/c5t1d0s0 disk1

     k> zpool create -f box3 mirror c5t1d0s0
     k> c8t600144F048FFCC000000E081B33B9800d0

I wonder if ZFS is silently making an EFI label inside slice 0. I've never used the Solaris iSCSI target, but you might try this on server B:

  format -e /dev/rdsk/c5t1d0s2
    part
    label
    [pick EFI label]

verify that /dev/{,r}dsk/c5t1d0 exist, confirming(?) it is really an EFI label.

  iscsitadm destroy ...
  iscsitadm create target /dev/dsk/c5t1d0 disk1

then start over creating the pool on server A, exporting, shutting off the iSCSI initiator, trying to import on server B. I'm kind of hurling sticky stuff at the wall without a clue, though.
Nigel Smith
2008-Oct-27 23:34 UTC
[zfs-discuss] zpool import: all devices online but: insufficient replicas
Hi Kristof

Please could you post back to this forum the output from

  # zdb -l /dev/rdsk/...

... for each of the storage devices in your pool, while it is in a working condition on Server1. (Maybe best as an attachment)

Then do the same again with the pool on Server2.

What is the reported 'status' of your zpool on Server2? (You have not provided a 'zpool status')

Thanks
Nigel Smith
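[Editor's note: one way to collect the labels Nigel is asking for in a single pass is a loop like the one below. This is a sketch only; the device list is taken from the earlier posts in the thread, and `zdb -l` prints all four on-disk ZFS labels for each device.]

```shell
# Dump all four ZFS labels from each device in the pool.
# Run on Server1 while the pool is healthy, then again on Server2.
for dev in c5t1d0s0 c6t0d0s0 c6t1d0s0 \
           c8t600144F048FFCC000000E081B33B9800d0 \
           c8t600144F048FFCCC50000E081B33B9800d0 \
           c8t600144F048FFCCD80000E081B33B9800d0; do
    echo "== /dev/rdsk/$dev =="
    zdb -l "/dev/rdsk/$dev"
done > zdb-labels.txt 2>&1
```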
kristof
2008-Oct-28 19:22 UTC
[zfs-discuss] zpool import: all devices online but: insufficient replicas
Hi,

Today I tried one more time from scratch. I re-installed Server B with the latest available OpenSolaris 2008.11 (b99); b.t.w. Server A runs OpenSolaris 2008 b98. I also re-labeled all my disks.

This time I can successfully import the pool on Server B:

root at box4:~# zpool import
  pool: box3
    id: 12004712858660209674
 state: ONLINE
status: One or more devices contains corrupted data.
action: The pool can be imported using its name or numeric identifier.
   see: http://www.sun.com/msg/ZFS-8000-4J
config:

        box3                                       ONLINE
          mirror                                   ONLINE
            c5t1d0s0                               UNAVAIL  corrupted data
            c8t600144F047BAB22E0000E081B33B9800d0  ONLINE
          mirror                                   ONLINE
            c6t0d0s0                               UNAVAIL  corrupted data
            c8t600144F047BAB22F0000E081B33B9800d0  ONLINE
          mirror                                   ONLINE
            c6t1d0s0                               UNAVAIL  corrupted data
            c8t600144F047BAB2310000E081B33B9800d0  ONLINE

root at box4:~# zpool import box3
root at box4:~# zpool status
  pool: box3
 state: ONLINE
status: One or more devices could not be used because the label is
        missing or invalid.  Sufficient replicas exist for the pool to
        continue functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-4J
 scrub: none requested
config:

        NAME                                       STATE     READ WRITE CKSUM
        box3                                       ONLINE       0     0     0
          mirror                                   ONLINE       0     0     0
            7459600399687860000                    UNAVAIL      0     0     0  was /dev/dsk/c5t1d0s0
            c8t600144F047BAB22E0000E081B33B9800d0  ONLINE       0     0     0
          mirror                                   ONLINE       0     0     0
            10898717263963950690                   UNAVAIL      0     0     0  was /dev/dsk/c6t0d0s0
            c8t600144F047BAB22F0000E081B33B9800d0  ONLINE       0     0     0
          mirror                                   ONLINE       0     0     0
            7304718211036211049                    UNAVAIL      0     0     0  was /dev/dsk/c6t1d0s0
            c8t600144F047BAB2310000E081B33B9800d0  ONLINE       0     0     0

errors: No known data errors

Then I export the pool again on Server B, and log out the iscsi targets. On Server A I re-enable the iscsi targets and try to import the pool back; this is what I get:

-bash-3.2# zpool import
  pool: box3
    id: 12004712858660209674
 state: DEGRADED
status: One or more devices are missing from the system.
action: The pool can be imported despite missing or damaged devices.
        The fault tolerance of the pool may be compromised if imported.
   see: http://www.sun.com/msg/ZFS-8000-2Q
config:

        box3                                       DEGRADED
          mirror                                   DEGRADED
            c5t1d0s0                               ONLINE
            c0t600144F047BAB22E0000E081B33B9800d0  UNAVAIL  cannot open
          mirror                                   DEGRADED
            c6t0d0s0                               ONLINE
            c0t600144F047BAB22F0000E081B33B9800d0  UNAVAIL  cannot open
          mirror                                   DEGRADED
            c6t1d0s0                               ONLINE
            c0t600144F047BAB2310000E081B33B9800d0  UNAVAIL  cannot open

-bash-3.2# format
Searching for disks...
Error: can't open selected disk '/dev/rdsk/c0t600144F047BAB22E0000E081B33B9800d0p0'.
Error: can't open selected disk '/dev/rdsk/c0t600144F047BAB22F0000E081B33B9800d0p0'.
Error: can't open selected disk '/dev/rdsk/c0t600144F047BAB2310000E081B33B9800d0p0'.
done

c0t600144F047BAB22E0000E081B33B9800d0: configured with capacity of 446.24GB
c0t600144F047BAB22F0000E081B33B9800d0: configured with capacity of 446.24GB
c0t600144F047BAB2310000E081B33B9800d0: configured with capacity of 446.24GB

AVAILABLE DISK SELECTIONS:
       0. c0t600144F047BAB22E0000E081B33B9800d0 <SUN-SOLARIS-1 cyl 56 alt 2 hd 255 sec 65535>
          /scsi_vhci/disk@g600144f047bab22e0000e081b33b9800
       1. c0t600144F047BAB22F0000E081B33B9800d0 <SUN-SOLARIS-1 cyl 56 alt 2 hd 255 sec 65535>
          /scsi_vhci/disk@g600144f047bab22f0000e081b33b9800
       2. c0t600144F047BAB2310000E081B33B9800d0 <SUN-SOLARIS-1 cyl 56 alt 2 hd 255 sec 65535>
          /scsi_vhci/disk@g600144f047bab2310000e081b33b9800
       3. c5t0d0 <DEFAULT cyl 60798 alt 2 hd 255 sec 63>
          /pci@0,0/pci10de,cb84@5/disk@0,0
       4. c5t1d0 <ATA-WDCWD5001ABYS-0-1D01 cyl 60798 alt 2 hd 255 sec 63>
          /pci@0,0/pci10de,cb84@5/disk@1,0
       5. c6t0d0 <ATA-WDCWD1000FYPS-0-1B01 cyl 60797 alt 2 hd 255 sec 126>
          /pci@0,0/pci10de,cb84@5,1/disk@0,0
       6. c6t1d0 <DEFAULT cyl 60797 alt 2 hd 255 sec 126>
          /pci@0,0/pci10de,cb84@5,1/disk@1,0
       7. c7t0d0 <DEFAULT cyl 60797 alt 2 hd 255 sec 126>
          /pci@0,0/pci10de,cb84@5,2/disk@0,0
       8. c7t1d0 <DEFAULT cyl 60797 alt 2 hd 255 sec 126>
          /pci@0,0/pci10de,cb84@5,2/disk@1,0
Specify disk (enter its number): 0
selecting c0t600144F047BAB22E0000E081B33B9800d0
[disk unformatted]
Error: can't open disk '/dev/rdsk/c0t600144F047BAB22E0000E081B33B9800d0p0'.
Error: can't open disk '/dev/rdsk/c0t600144F047BAB22E0000E081B33B9800d0p0'.

Disk not labeled.  Label it now?

So for some reason the labels are corrupt or something ... trying to relabel the disk I get:

Warning: error writing VTOC.
Warning: error reading backup label.
Warning: error reading backup label.
Warning: error reading backup label.
Warning: error reading backup label.
Warning: error reading backup label.
Warning: no backup labels
Write label failed

FORMAT MENU:
        disk       - select a disk
        type       - select (define) a disk type
        partition  - select (define) a partition table
        current    - describe the current disk
        format     - format and analyze the disk
        fdisk      - run the fdisk program
        repair     - repair a defective sector
        label      - write label to the disk
        analyze    - surface analysis
        defect     - defect list management
        backup     - search for backup labels
        verify     - read and display labels
        save       - save new disk/partition definitions
        inquiry    - show vendor, product and revision
        volname    - set 8-character volume name
        !<cmd>     - execute <cmd>, then return
        quit
format> fdisk
fdisk: Cannot open device /dev/rdsk/c0t600144F047BAB22E0000E081B33B9800d0p0.
Bad read of fdisk partition. Status = ffffffff
Cannot read fdisk partition information.
No fdisk solaris partition found

dmesg shows me:

Oct 28 20:11:00 box3 genunix: [ID 408114 kern.info] /scsi_vhci/disk@g600144f047bab22e0000e081b33b9800 (sd16) offline
Oct 28 20:11:00 box3 genunix: [ID 834635 kern.info] /scsi_vhci/disk@g600144f047bab22e0000e081b33b9800 (sd16) multipath status: failed, path /iscsi (iscsi0) to target address: 0000iqn.1986-03.com.sun:02:b2570926-f5f4-ee41-a3b4-89d7f965e31b.disk3FFFF,0 is offline Load balancing: round-robin
Oct 28 20:11:00 box3 genunix: [ID 408114 kern.info] /scsi_vhci/disk@g600144f047bab22e0000e081b33b9800 (sd16) offline
Oct 28 20:11:00 box3 iscsi: [ID 732673 kern.info] NOTICE: iscsi session(172) iqn.1986-03.com.sun:02:b2570926-f5f4-ee41-a3b4-89d7f965e31b.disk3 offline
Oct 28 20:11:00 box3 genunix: [ID 408114 kern.info] /scsi_vhci/disk@g600144f047bab22f0000e081b33b9800 (sd17) offline
Oct 28 20:11:00 box3 genunix: [ID 834635 kern.info] /scsi_vhci/disk@g600144f047bab22f0000e081b33b9800 (sd17) multipath status: failed, path /iscsi (iscsi0) to target address: 0000iqn.1986-03.com.sun:02:5c9dd9a7-e1a9-c68e-8b2a-ee0e3a97596d.disk2FFFF,0 is offline Load balancing: round-robin
Oct 28 20:11:00 box3 genunix: [ID 408114 kern.info] /scsi_vhci/disk@g600144f047bab22f0000e081b33b9800 (sd17) offline
Oct 28 20:11:00 box3 iscsi: [ID 732673 kern.info] NOTICE: iscsi session(169) iqn.1986-03.com.sun:02:5c9dd9a7-e1a9-c68e-8b2a-ee0e3a97596d.disk2 offline
Oct 28 20:11:00 box3 genunix: [ID 408114 kern.info] /scsi_vhci/disk@g600144f047bab2310000e081b33b9800 (sd18) offline
Oct 28 20:11:00 box3 genunix: [ID 834635 kern.info] /scsi_vhci/disk@g600144f047bab2310000e081b33b9800 (sd18) multipath status: failed, path /iscsi (iscsi0) to target address: 0000iqn.1986-03.com.sun:02:dcdc3bdd-2f56-e18c-d0c8-caeacd624f08.disk1FFFF,0 is offline Load balancing: round-robin
Oct 28 20:11:00 box3 genunix: [ID 408114 kern.info] /scsi_vhci/disk@g600144f047bab2310000e081b33b9800 (sd18) offline
Oct 28 20:11:00 box3 iscsi: [ID 732673 kern.info] NOTICE: iscsi session(166) iqn.1986-03.com.sun:02:dcdc3bdd-2f56-e18c-d0c8-caeacd624f08.disk1 offline
Oct 28 20:11:00 box3 pcplusmp: [ID 444295 kern.info] pcplusmp: ide (ata) instance #1 vector 0xf ioapic 0x2 intin 0xf is bound to cpu 1
Oct 28 20:11:04 box3 last message repeated 2 times
Oct 28 20:11:13 box3 iscsi: [ID 559844 kern.info] NOTICE: iscsi session(184) iqn.1986-03.com.sun:02:b2570926-f5f4-ee41-a3b4-89d7f965e31b.disk3 online
Oct 28 20:11:13 box3 scsi: [ID 799468 kern.info] sd16 at scsi_vhci0: name g600144f047bab22e0000e081b33b9800, bus address g600144f047bab22e0000e081b33b9800 f_tpgs
Oct 28 20:11:13 box3 genunix: [ID 936769 kern.info] sd16 is /scsi_vhci/disk@g600144f047bab22e0000e081b33b9800
Oct 28 20:11:13 box3 scsi_vhci: [ID 734749 kern.warning] WARNING: vhci_scsi_reset 0x1
Oct 28 20:11:13 box3 last message repeated 1 time
Oct 28 20:11:13 box3 genunix: [ID 408114 kern.info] /scsi_vhci/disk@g600144f047bab22e0000e081b33b9800 (sd16) online
Oct 28 20:11:13 box3 genunix: [ID 834635 kern.info] /scsi_vhci/disk@g600144f047bab22e0000e081b33b9800 (sd16) multipath status: degraded, path /iscsi (iscsi0) to target address: 0000iqn.1986-03.com.sun:02:b2570926-f5f4-ee41-a3b4-89d7f965e31b.disk3FFFF,0 is online Load balancing: round-robin
Oct 28 20:11:13 box3 iscsi: [ID 559844 kern.info] NOTICE: iscsi session(181) iqn.1986-03.com.sun:02:5c9dd9a7-e1a9-c68e-8b2a-ee0e3a97596d.disk2 online
Oct 28 20:11:13 box3 scsi: [ID 799468 kern.info] sd17 at scsi_vhci0: name g600144f047bab22f0000e081b33b9800, bus address g600144f047bab22f0000e081b33b9800 f_tpgs
Oct 28 20:11:13 box3 genunix: [ID 936769 kern.info] sd17 is /scsi_vhci/disk@g600144f047bab22f0000e081b33b9800
Oct 28 20:11:13 box3 scsi_vhci: [ID 734749 kern.warning] WARNING: vhci_scsi_reset 0x1
Oct 28 20:11:13 box3 last message repeated 1 time
Oct 28 20:11:13 box3 genunix: [ID 408114 kern.info] /scsi_vhci/disk@g600144f047bab22f0000e081b33b9800 (sd17) online
Oct 28 20:11:13 box3 genunix: [ID 834635 kern.info] /scsi_vhci/disk@g600144f047bab22f0000e081b33b9800 (sd17) multipath status: degraded, path /iscsi (iscsi0) to target address: 0000iqn.1986-03.com.sun:02:5c9dd9a7-e1a9-c68e-8b2a-ee0e3a97596d.disk2FFFF,0 is online Load balancing: round-robin
Oct 28 20:11:13 box3 iscsi: [ID 559844 kern.info] NOTICE: iscsi session(178) iqn.1986-03.com.sun:02:dcdc3bdd-2f56-e18c-d0c8-caeacd624f08.disk1 online
Oct 28 20:11:13 box3 scsi: [ID 799468 kern.info] sd18 at scsi_vhci0: name g600144f047bab2310000e081b33b9800, bus address g600144f047bab2310000e081b33b9800 f_tpgs
Oct 28 20:11:13 box3 genunix: [ID 936769 kern.info] sd18 is /scsi_vhci/disk@g600144f047bab2310000e081b33b9800
Oct 28 20:11:13 box3 scsi_vhci: [ID 734749 kern.warning] WARNING: vhci_scsi_reset 0x1
Oct 28 20:11:13 box3 last message repeated 1 time
Oct 28 20:11:13 box3 genunix: [ID 408114 kern.info] /scsi_vhci/disk@g600144f047bab2310000e081b33b9800 (sd18) online
Oct 28 20:11:13 box3 genunix: [ID 834635 kern.info] /scsi_vhci/disk@g600144f047bab2310000e081b33b9800 (sd18) multipath status: degraded, path /iscsi (iscsi0) to target address: 0000iqn.1986-03.com.sun:02:dcdc3bdd-2f56-e18c-d0c8-caeacd624f08.disk1FFFF,0 is online Load balancing: round-robin
Oct 28 20:11:13 box3 pcplusmp: [ID 444295 kern.info] pcplusmp: ide (ata) instance #1 vector 0xf ioapic 0x2 intin 0xf is bound to cpu 1
Oct 28 20:11:15 box3 scsi_vhci: [ID 734749 kern.warning] WARNING: vhci_scsi_reset 0x1
Oct 28 20:11:15 box3 last message repeated 5 times
Oct 28 20:11:20 box3 pcplusmp: [ID 444295 kern.info] pcplusmp: ide (ata) instance #1 vector 0xf ioapic 0x2 intin 0xf is bound to cpu 1
Oct 28 20:11:24 box3 scsi_vhci: [ID 734749 kern.warning] WARNING: vhci_scsi_reset 0x1
Oct 28 20:11:27 box3 last message repeated 30 times
Oct 28 20:11:27 box3 pcplusmp: [ID 444295 kern.info] pcplusmp: ide (ata) instance #1 vector 0xf ioapic 0x2 intin 0xf is bound to cpu 1
Oct 28 20:11:28 box3 scsi_vhci: [ID 734749 kern.warning] WARNING: vhci_scsi_reset 0x1
-bash-3.2#

Does anyone know what is going wrong here?
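[Editor's note: one way to narrow down whether the labels are actually gone from disk or merely unreadable over the iSCSI transport is to compare `zdb -l` output on both sides of the link. This is a sketch only; the device names are taken from the thread, and the pairing of a particular iSCSI GUID to a particular backing slice is my assumption.]

```shell
# On Server B, read the ZFS labels straight off one target's backing slice:
zdb -l /dev/rdsk/c5t1d0s0

# On Server A, read the same device as the iSCSI initiator sees it:
zdb -l /dev/rdsk/c0t600144F047BAB22E0000E081B33B9800d0s0

# If the first command shows valid labels and the second shows none, the
# data is intact on disk and the problem is in how the slice is exposed
# over iSCSI (e.g. the size or offset of the LUN), not label corruption.
```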