Hi all!

I want to update my Debian Lenny Xen servers to Squeeze. I am testing with a new install. Everything installs OK, but when I install the multipath package I get a kernel crash.

I have searched a bit with Google but have not found any solution.

Is anybody on the list working with this configuration?

PS: If I boot the server without Xen, with a standard kernel, multipath and everything else work fine.

Best regards and thanks for the help,
Agustin
Henrik Langos
2011-Feb-11 14:18 UTC
Re: [Xen-users] Debian Squeeze, xen, multipath and iscsi
On Fri, Feb 11, 2011 at 02:35:53PM +0100, Agustin Lopez wrote:
> I want to update my Debian Lenny Xen servers to Squeeze. I am testing
> with a new install. Everything installs OK, but when I install the
> multipath package I get a kernel crash.
> [...]
> PS: If I boot the server without Xen, with a standard kernel, multipath
> and everything else work fine.

What exactly crashes? dom0? Do you get a kernel dump on the console?

I have pretty much the same setup here (iSCSI + multipath + Xen + Squeeze dom0 + Lenny/Etch PVM domUs) and I had some trouble with multipath and iSCSI being a little touchy.

Basically my dom0 kernel hates fast iSCSI logout/login sequences.

You'll have to give multipathd some time to cleanly remove multipath devices before you do another login.

Otherwise I get stuff like this, where kpartx (the thing that manages the device nodes for partitions) triggers some race condition:

Feb 10 06:46:43 xenhost03 kernel: [225060.039126] BUG: unable to handle kernel paging request at ffff88001558b010
Feb 10 06:46:43 xenhost03 kernel: [225060.039172] IP: [<ffffffff8100e428>] xen_set_pmd+0x15/0x2c
Feb 10 06:46:43 xenhost03 kernel: [225060.039210] PGD 1002067 PUD 1006067 PMD 18a067 PTE 801000001558b065
Feb 10 06:46:43 xenhost03 kernel: [225060.039253] Oops: 0003 [#1] SMP
Feb 10 06:46:43 xenhost03 kernel: [225060.039284] last sysfs file: /sys/devices/virtual/block/dm-6/dm/suspended
Feb 10 06:46:43 xenhost03 kernel: [225060.039319] CPU 0
Feb 10 06:46:43 xenhost03 kernel: [225060.039344] Modules linked in: tun dm_round_robin crc32c xt_tcpudp nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack xt_physdev iptable_filter ip_tables x_tables bridge stp xen_evtchn xenfs ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi dm_multipath scsi_dh loop snd_hda_intel snd_hda_codec snd_hwdep snd_pcm i915 drm_kms_helper drm snd_timer i2c_i801 evdev parport_pc psmouse serio_raw pcspkr i2c_algo_bit parport i2c_core snd soundcore video output snd_page_alloc button processor acpi_processor ext3 jbd mbcache dm_mod sd_mod crc_t10dif usbhid hid uhci_hcd ata_generic ata_piix libata ehci_hcd scsi_mod e1000e usbcore nls_base thermal thermal_sys [last unloaded: scsi_wait_scan]
Feb 10 06:46:43 xenhost03 kernel: [225060.039851] Pid: 9259, comm: kpartx_id Not tainted 2.6.32-5-xen-amd64 #1 To Be Filled By O.E.M.
Feb 10 06:46:43 xenhost03 kernel: [225060.039904] RIP: e030:[<ffffffff8100e428>] [<ffffffff8100e428>] xen_set_pmd+0x15/0x2c
Feb 10 06:46:43 xenhost03 kernel: [225060.039959] RSP: e02b:ffff880013ad3b18 EFLAGS: 00010246
Feb 10 06:46:43 xenhost03 kernel: [225060.039990] RAX: 0000000000000000 RBX: ffff88001558b010 RCX: ffff880000000000
Feb 10 06:46:43 xenhost03 kernel: [225060.040006] RDX: ffffea0000000000 RSI: 0000000001cc0000 RDI: ffff88001558b010
Feb 10 06:46:43 xenhost03 kernel: [225060.040006] RBP: 0000000000000000 R08: 0000000001cc0000 R09: ffff880073c03100
Feb 10 06:46:43 xenhost03 kernel: [225060.040006] R10: 0000000000000000 R11: ffff88002ce3bd78 R12: 000000000061c000
Feb 10 06:46:43 xenhost03 kernel: [225060.040006] R13: 0000000000400000 R14: ffff88001558b010 R15: ffff88002e156000
Feb 10 06:46:43 xenhost03 kernel: [225060.040006] FS: 00007f5094f64700(0000) GS:ffff880003630000(0000) knlGS:0000000000000000
Feb 10 06:46:43 xenhost03 kernel: [225060.040006] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
Feb 10 06:46:43 xenhost03 kernel: [225060.040006] CR2: ffff88001558b010 CR3: 0000000011a2b000 CR4: 0000000000002660
Feb 10 06:46:43 xenhost03 kernel: [225060.040006] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Feb 10 06:46:43 xenhost03 kernel: [225060.040006] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Feb 10 06:46:43 xenhost03 kernel: [225060.040006] Process kpartx_id (pid: 9259, threadinfo ffff880013ad2000, task ffff880002747810)
Feb 10 06:46:43 xenhost03 kernel: [225060.040006] Stack:
Feb 10 06:46:43 xenhost03 kernel: [225060.040006]  ffff880000000000 0000000000600000 0000000000400000 ffffffff810cf886
Feb 10 06:46:43 xenhost03 kernel: [225060.040006] <0> ffff880013ad3fd8 0000000017ab2067 ffff880002159180 0000000000000000
Feb 10 06:46:43 xenhost03 kernel: [225060.040636] <0> 0000000000000000 000000000061bfff 000000000061bfff 0000000001c00000
Feb 10 06:46:43 xenhost03 kernel: [225060.040636] Call Trace:
Feb 10 06:46:43 xenhost03 kernel: [225060.040636]  [<ffffffff810cf886>] ? free_pgd_range+0x226/0x3bf
Feb 10 06:46:43 xenhost03 kernel: [225060.040636]  [<ffffffff810cfabb>] ? free_pgtables+0x9c/0xbd
Feb 10 06:46:43 xenhost03 kernel: [225060.040636]  [<ffffffff810d129d>] ? exit_mmap+0xef/0x148
Feb 10 06:46:43 xenhost03 kernel: [225060.040636]  [<ffffffff8104cb09>] ? mmput+0x3c/0xdf
Feb 10 06:46:43 xenhost03 kernel: [225060.040636]  [<ffffffff810f44d6>] ? flush_old_exec+0x45c/0x548
Feb 10 06:46:43 xenhost03 kernel: [225060.040636]  [<ffffffff811270d0>] ? load_elf_binary+0x0/0x1954
Feb 10 06:46:43 xenhost03 kernel: [225060.040636]  [<ffffffff8112746d>] ? load_elf_binary+0x39d/0x1954
Feb 10 06:46:43 xenhost03 kernel: [225060.040636]  [<ffffffff810cc572>] ? follow_page+0x2ad/0x303
Feb 10 06:46:43 xenhost03 kernel: [225060.040636]  [<ffffffff810ce136>] ? __get_user_pages+0x3ea/0x47b
Feb 10 06:46:43 xenhost03 kernel: [225060.040636]  [<ffffffff810f4fcb>] ? get_arg_page+0x61/0x110
Feb 10 06:46:43 xenhost03 kernel: [225060.040636]  [<ffffffff811270d0>] ? load_elf_binary+0x0/0x1954
Feb 10 06:46:43 xenhost03 kernel: [225060.040636]  [<ffffffff810f3caa>] ? search_binary_handler+0xb4/0x245
Feb 10 06:46:43 xenhost03 kernel: [225060.040636]  [<ffffffff810f54a7>] ? do_execve+0x1e4/0x2c3
Feb 10 06:46:43 xenhost03 kernel: [225060.040636]  [<ffffffff81010500>] ? sys_execve+0x35/0x4c
Feb 10 06:46:43 xenhost03 kernel: [225060.040636]  [<ffffffff81011f9a>] ? stub_execve+0x6a/0xc0
Feb 10 06:46:43 xenhost03 kernel: [225060.040636] Code: fb ff ff e8 c6 f4 01 00 bf 01 00 00 00 e8 c9 ea ff ff 59 5e 5b c3 55 48 89 f5 53 48 89 fb 48 83 ec 08 e8 6e e3 ff ff 84 c0 75 08 <48> 89 2b 41 59 5b 5d c3 41 58 48 89 df 48 89 ee 5b 5d e9 7e ff
Feb 10 06:46:43 xenhost03 kernel: [225060.042981] RIP [<ffffffff8100e428>] xen_set_pmd+0x15/0x2c
Feb 10 06:46:43 xenhost03 kernel: [225060.042981] RSP <ffff880013ad3b18>
Feb 10 06:46:43 xenhost03 kernel: [225060.042981] CR2: ffff88001558b010
Feb 10 06:46:43 xenhost03 kernel: [225060.042981] ---[ end trace 9939eec096f5a2de ]---

Also I noticed dom0 lockups of more than a minute when starting HVM domUs while another domU was creating heavy I/O load. Those only disappeared when I gave my dom0 a fixed amount of RAM instead of ballooning it down.

Other than that I had no bad trouble. (Well, live migration of Lenny 32-bit domUs on a 64-bit dom0 doesn't work because the Lenny domU kernel is not good at that.)

I didn't do a new install of Squeeze though. I started with Lenny and upgraded to Squeeze.

cheers
-henrik
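[Editor's note: for reference, a minimal sketch of the pacing Henrik describes above, that is, logging out, letting multipathd and kpartx tear the device-mapper maps down, and only then logging back in. The portal address and the 10-second settle delay are illustrative assumptions, not values from this thread.]

    #!/bin/sh
    # Sketch only: avoid fast iSCSI logout/login sequences.
    # PORTAL and the delay below are made-up example values.
    PORTAL=192.168.0.1:3260

    # Log out of all sessions on this portal...
    iscsiadm -m node -p "$PORTAL" --logout

    # ...and give multipathd/kpartx time to remove the multipath maps
    # before logging in again.
    sleep 10
    multipath -ll    # should no longer list maps for the removed paths

    iscsiadm -m node -p "$PORTAL" --login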
Agustin Lopez
2011-Feb-16 11:43 UTC
Re: [Xen-users] Debian Squeeze, xen, multipath and iscsi
Hi!

Thanks for the answer. Below is the kernel message I am repeatedly getting in the log. The system crashes only with the interaction between Xen, iSCSI and multipath.

Again, the system is Debian Squeeze running on a Fujitsu PRIMERGY RX200 S4 with 8 cores, Intel(R) Xeon(R) CPU E5405 @ 2.00GHz. The iSCSI storage comes from an EMC array.

Any help, please?

Agustin

Modules linked in: dm_round_robin scsi_dh_emc crc32c ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi bridge stp xen_evtchn xenfs dm_multipath dm_mod scsi_dh loop i2c_i801 usbhid ioatdma dca hid shpchp i2c_cor
Feb 16 12:08:01 ariete kernel: [  358.633003] Pid: 1672, comm: dmsetup Tainted: G D 2.6.32-5-xen-amd64 #1 PRIMERGY RX200 S4
Feb 16 12:08:01 ariete kernel: [  358.633003] RIP: e030:[<ffffffff8130cb16>] [<ffffffff8130cb16>] _spin_lock+0x13/0x1b
Feb 16 12:08:01 ariete kernel: [  358.633003] RSP: e02b:ffff8807dbccdb10 EFLAGS: 00000297
Feb 16 12:08:01 ariete kernel: [  358.633003] RAX: 0000000000000022 RBX: ffff8807dbccdb28 RCX: ffff8807dbccdb68
Feb 16 12:08:01 ariete kernel: [  358.633003] RDX: 0000000000000021 RSI: 0000000000000200 RDI: ffff8807dbe1c300
Feb 16 12:08:01 ariete kernel: [  358.633003] RBP: 0000000000000200 R08: 0000000000000008 R09: ffffffff814eb870
Feb 16 12:08:01 ariete kernel: [  358.633003] R10: 000000000000000b R11: ffff8807dbe1c280 R12: ffff8807dbe1c280
Feb 16 12:08:01 ariete kernel: [  358.633003] R13: 000000000000c580 R14: ffff8807dbccdb28 R15: ffffffff814eb830
Feb 16 12:08:01 ariete kernel: [  358.633003] FS: 00007fe9c607a7a0(0000) GS:ffff8800280c7000(0000) knlGS:0000000000000000
Feb 16 12:08:01 ariete kernel: [  358.633003] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
Feb 16 12:08:01 ariete kernel: [  358.633003] CR2: 00007fe9c5803420 CR3: 0000000001001000 CR4: 0000000000002660
Feb 16 12:08:01 ariete kernel: [  358.633003] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Feb 16 12:08:01 ariete kernel: [  358.633003] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Feb 16 12:08:01 ariete kernel: [  358.633003] Call Trace:
Feb 16 12:08:01 ariete kernel: [  358.633003]  [<ffffffff8100dd87>] ? xen_exit_mmap+0xf8/0x136
Feb 16 12:08:01 ariete kernel: [  358.633003]  [<ffffffff810d1208>] ? exit_mmap+0x5a/0x148
Feb 16 12:08:01 ariete kernel: [  358.633003]  [<ffffffff8104cb09>] ? mmput+0x3c/0xdf
Feb 16 12:08:01 ariete kernel: [  358.633003]  [<ffffffff81050702>] ? exit_mm+0x102/0x10d
Feb 16 12:08:01 ariete kernel: [  358.633003]  [<ffffffff8130ca72>] ? _spin_lock_irq+0x7/0x22
Feb 16 12:08:01 ariete kernel: [  358.633003]  [<ffffffff81052127>] ? do_exit+0x1f8/0x6c6
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffff8100ecdf>] ? xen_restore_fl_direct_end+0x0/0x1
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffff8130cb3a>] ? _spin_unlock_irqrestore+0xd/0xe
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffff8104f3af>] ? release_console_sem+0x17e/0x1af
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffff8130d9dd>] ? oops_end+0xaf/0xb4
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffff810135f0>] ? do_invalid_op+0x8b/0x95
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffff8100c694>] ? pin_pagetable_pfn+0x2d/0x36
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffffa01bb9ea>] ? copy_params+0x71/0xb1 [dm_mod]
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffff810baf07>] ? __alloc_pages_nodemask+0x11c/0x5f5
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffff8101293b>] ? invalid_op+0x1b/0x20
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffff8100c694>] ? pin_pagetable_pfn+0x2d/0x36
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffff8100c690>] ? pin_pagetable_pfn+0x29/0x36
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffff810cd4e2>] ? __pte_alloc+0x6b/0xc6
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffff810cb394>] ? pmd_alloc+0x28/0x5b
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffff810cd60b>] ? handle_mm_fault+0xce/0x80f
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffff810fbc5c>] ? do_vfs_ioctl+0x48d/0x4cb
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffff8130f016>] ? do_page_fault+0x2e0/0x2fc
Feb 16 12:08:01 ariete kernel: [  358.642915]  [<ffffffff8130ceb5>] ? page_fault+0x25/0x30

On 11/02/2011 15:18, Henrik Langos wrote:
> What exactly crashes? dom0? Do you get a kernel dump on the console?
>
> I have pretty much the same setup here (iSCSI + multipath + Xen +
> Squeeze dom0 + Lenny/Etch PVM domUs) and I had some trouble with
> multipath and iSCSI being a little touchy.
>
> Basically my dom0 kernel hates fast iSCSI logout/login sequences.
>
> You'll have to give multipathd some time to cleanly remove multipath
> devices before you do another login.
> [...]
davide.vaghetti@ing.unipi.it
2011-Apr-06 07:08 UTC
[Xen-users] Re: Debian Squeeze, xen, multipath and iscsi
Hi,

I had the very same problem. After almost going mad trying to fix the issue with different combinations of boot options, I followed Henrik's advice: let multipath be completely loaded __before__ the iSCSI daemon. That is, don't let open-iscsi start at boot time, or at least remove it from the relevant runlevels and start it via rc.local. In my environment that fixed the issue (the latest kernel update from Debian does not make any difference).

good luck
davide

--
Davide Vaghetti
Faculty of Engineering
University of Pisa - Italy
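[Editor's note: on Debian Squeeze's sysvinit, the reordering Davide describes can be done roughly as below. This is only a sketch; "open-iscsi" is the usual Squeeze init script name, adjust if your setup differs.]

    # Sketch only: take open-iscsi out of the normal boot sequence so that
    # multipathd (started by multipath-tools) is fully up before any iSCSI login.
    update-rc.d -f open-iscsi remove

    # Then start it late instead, e.g. by adding this line to /etc/rc.local
    # (before the final "exit 0"):
    #
    #   /etc/init.d/open-iscsi start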
Henrik Langos
2011-Apr-06 08:55 UTC
Re: [Xen-users] Re: Debian Squeeze, xen, multipath and iscsi
On Wed, Apr 06, 2011 at 12:08:37AM -0700, davide.vaghetti@ing.unipi.it wrote:
> I had the very same problem. After almost going mad trying to fix the
> issue with different combinations of boot options, I followed Henrik's
> advice: let multipath be completely loaded __before__ the iSCSI daemon.
> That is, don't let open-iscsi start at boot time, or at least remove it
> from the relevant runlevels and start it via rc.local. In my environment
> that fixed the issue (the latest kernel update from Debian does not make
> any difference).

Hi Davide,

I had to reboot my Xen host recently, and upon iSCSI login I repeatedly ran into that same problem. I don't do any automatic iSCSI logins during boot and I don't start any Xen guests on boot, so the system _was_ completely up and idle. Still, every time I did the iSCSI login (which currently logs in to 12 volumes via 2 paths each) I ended up with CPUs stuck for a minute, or with something plainly wrong like this:

[225060.039126] BUG: unable to handle kernel paging request at ffff88001558b010
[225060.039172] IP: [<ffffffff8100e428>] xen_set_pmd+0x15/0x2c
[225060.039210] PGD 1002067 PUD 1006067 PMD 18a067 PTE 801000001558b065
[225060.039253] Oops: 0003 [#1] SMP
[225060.039284] last sysfs file: /sys/devices/virtual/block/dm-6/dm/suspended
[225060.039319] CPU 0
...

where kpartx_id or udevd or any other part of the device-mapper ecosystem would trigger a memory management problem.

The "hold your breath and cross your fingers"-style workaround was to

- stop the multipath daemon, then
- do the iSCSI login, and then
- restart the multipath daemon,

like this:

/etc/init.d/multipath-tools stop
sleep 5
iscsiadm -m node -p '192.168.0.1:3260' --login
sleep 5
/etc/init.d/multipath-tools start

( while alternately praying and cursing. ;-) )

That way I was able to log in to all iSCSI targets without immediately triggering bugs / race conditions.

Actually I am not sure that I need multipathd at all. I have, after all, a pretty simple setup.

# multipath -ll
...
36090a068302e3e73a9d4041bd000e0b3 dm-11 EQLOGIC,100E-00
size=10G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=2 status=active
  |- 24:0:0:0 sdt 65:48 active ready running
  `- 23:0:0:0 sdo 8:224 active ready running
36090a068302e4ee710d5341bd000a04b dm-16 EQLOGIC,100E-00
size=38G features='1 queue_if_no_path' hwhandler='0' wp=rw
`-+- policy='round-robin 0' prio=2 status=active
  |- 20:0:0:0 sdl 8:176 active ready running
  `- 19:0:0:0 sdk 8:160 active ready running
...

My host has two gigabit interfaces that are dedicated to storage traffic. Each connects to a switch that connects to one of the controllers of my storage. There is no cross-connect between the switches. So if any one part fails, one path becomes unavailable, but there should be no need to reconfigure anything.

Could anybody hit me with a clue stick? ;-)

Did anybody try a different dom0 system? I am tempted to try XCP as dom0 instead of Debian, but I'd love to know if anybody else has already made that switch and what their experience was. Does anybody have experience with iSCSI and multipath on XCP?

cheers
-henrik
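[Editor's note: a hedged sketch of a gentler variant of the workaround above, logging in to one target at a time and letting udev/kpartx settle between logins instead of one bulk login to all sessions. The field parsed from "iscsiadm -m node" output and the pause values are assumptions, not something confirmed in this thread.]

    #!/bin/sh
    # Sketch: per-target iSCSI login with settle pauses between logins.
    # Assumes "iscsiadm -m node" prints "<portal>,<tpgt> <target-iqn>" per line.
    /etc/init.d/multipath-tools stop
    sleep 5

    iscsiadm -m node | awk '{print $2}' | sort -u | while read target; do
        iscsiadm -m node -T "$target" --login
        udevadm settle        # wait for udev/kpartx to finish creating device nodes
        sleep 2               # extra safety margin, value is a guess
    done

    /etc/init.d/multipath-tools start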
Davide Vaghetti
2011-Apr-11 11:24 UTC
Re: [Xen-users] Re: Debian Squeeze, xen, multipath and iscsi
On 04/06/2011 10:55 AM, Henrik Langos wrote:
> I had to reboot my Xen host recently, and upon iSCSI login I repeatedly
> ran into that same problem. I don't do any automatic iSCSI logins during
> boot and I don't start any Xen guests on boot, so the system _was_
> completely up and idle. Still, every time I did the iSCSI login (which
> currently logs in to 12 volumes via 2 paths each) I ended up with CPUs
> stuck for a minute, or with something plainly wrong like this:
> [...]

Too bad! And well... I had the same problems. Lately I restarted a node of my Xen cluster (Pacemaker/Corosync), and despite having disabled iSCSI at boot, one of the CPUs got stuck on multipath. In the end I removed even multipath (both multipath-tools and multipath-tools-boot) from the startup, and now it works. To make the iSCSI/multipath combination work I have to start multipath __after__ iSCSI (and, obviously, after Xen is fully started up). The good news is that it seems stable.

Let's make a deal: the first one to come up with a better solution will let the other know!

bye and thanks for sharing
davide

--
Dott. Davide Vaghetti
Centro Servizi Informatici, Facoltà di Ingegneria, Università di Pisa
PGP: http://keys.keysigning.org:11371/pks/lookup?op=get&search=0x7A1B3BA18C4E0A4D
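[Editor's note: to make Davide's final ordering concrete, a hedged sketch of what such a late, manually ordered start could look like on Squeeze, e.g. run from /etc/rc.local. The init script names are the usual Debian ones; the sleep values and the assumption that xend is already up by the time this runs are illustrative, not taken from the thread.]

    #!/bin/sh
    # Sketch of the "Xen first, then iSCSI, then multipath" ordering.
    # open-iscsi and multipath-tools must both have been removed from the
    # normal runlevels first (update-rc.d -f <script> remove).

    /etc/init.d/open-iscsi start        # sessions come up, sd* devices appear
    sleep 10                            # arbitrary settle time

    /etc/init.d/multipath-tools start   # multipathd builds the dm maps on top
    sleep 5
    multipath -ll                       # sanity check: are all paths present?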