Roger Pau Monné
2012-Dec-21 08:29 UTC
Create an iSCSI DomU with disks in another DomU running on the same Dom0
Hello, I'm trying to use a strange setup that consists of having a DomU serving iSCSI targets to the Dom0, which will use these targets as disks for other DomUs. I've tried to set up this iSCSI target DomU using both Debian Squeeze/Wheezy (with kernels 2.6.32 and 3.2) and iSCSI Enterprise Target (IET), and when launching the DomU I get these messages from Xen:

(XEN) mm.c:1925:d0 Error pfn 157e68: rd=ffff83019e60c000, od=ffff830141405000, caf=8000000000000003, taf=7400000000000001
(XEN) Xen WARN at mm.c:1926
(XEN) ----[ Xen-4.3-unstable x86_64 debug=y Not tainted ]----
(XEN) CPU: 0
(XEN) RIP: e008:[<ffff82c48016ea17>] get_page+0xd5/0x101
(XEN) RFLAGS: 0000000000010286 CONTEXT: hypervisor
(XEN) rax: 0000000000000000 rbx: ffff830141405000 rcx: 0000000000000000
(XEN) rdx: ffff82c480300920 rsi: 000000000000000a rdi: ffff82c4802766e8
(XEN) rbp: ffff82c4802bfbf8 rsp: ffff82c4802bfba8 r8: 0000000000000004
(XEN) r9: 0000000000000004 r10: 0000000000000004 r11: 0000000000000001
(XEN) r12: 0000000000157e68 r13: ffff83019e60c000 r14: 7400000000000001
(XEN) r15: 8000000000000003 cr0: 000000008005003b cr4: 00000000000026f0
(XEN) cr3: 000000011c180000 cr2: 00007f668d1eb000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
(XEN) Xen stack trace from rsp=ffff82c4802bfba8:
(XEN) ffff830141405000 8000000000000003 7400000000000001 0000000000145028
(XEN) ffff82f6028a0510 ffff83019e60c000 ffff82f602afcd00 ffff82c4802bfd28
(XEN) ffff82c4802bfd18 0000000000157e68 ffff82c4802bfc58 ffff82c480109ba3
(XEN) ffffffffffffffff 0000000000000000 ffff83011c977fb8 0000000061dfc3f0
(XEN) 0000000000000001 ffffffffffff8000 0000000000000002 ffff83011d555000
(XEN) ffff83019e60c000 0000000000000000 ffff82c4802bfd98 ffff82c48010c607
(XEN) ffff82c4802bfd34 ffff82c4802bfd30 ffff82c400000001 000000000011cf90
(XEN) 0000000000000000 ffff82c4802b8000 ffff82c4802b8000 ffff82c4802b8000
(XEN) ffff82c4802b8000 ffff82c4802bfd5c 000000029e60c000 ffff82c480300920
(XEN) ffff82c4802b8000 ffff82c4802bfd38 00000005802bfd38 ffff82c4802b8000
(XEN) ffff82c400000000 0000000000000001 ffffc90000028b10 ffffc90000028b10
(XEN) ffff8300dfb03000 0000000000000000 0000000000000000 0000000000145028
(XEN) 000000000011cf7c 0000000000001000 0000000000157e68 0000000000007ff0
(XEN) 000000000000027e 000000000042000d 0000000000020b50 ffff8300dfdf0000
(XEN) ffff82c4802bfd78 ffffc90000028ac0 ffffc90000028ac0 ffff880185f6fd58
(XEN) ffff880185f6fd78 0000000000000005 ffff82c4802bfef8 ffff82c48010eb65
(XEN) ffff82c4802bfdc8 ffff82c480300960 ffff82c4802bfe18 ffff82c480181831
(XEN) 000000000006df66 000032cfdc175ce6 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000005 ffff82c4802bfe28 ffff8300dfb03000
(XEN) ffff8300dfdf0000 0000150e11a417f8 0000000000000002 ffff82c480300948
(XEN) Xen call trace:
(XEN) [<ffff82c48016ea17>] get_page+0xd5/0x101
(XEN) [<ffff82c480109ba3>] __get_paged_frame+0xbf/0x162
(XEN) [<ffff82c48010c607>] gnttab_copy+0x4c6/0x91a
(XEN) [<ffff82c48010eb65>] do_grant_table_op+0x12ad/0x1b23
(XEN) [<ffff82c48022280b>] syscall_enter+0xeb/0x145
(XEN)
(XEN) grant_table.c:2076:d0 source frame ffffffffffffffff invalid.
(XEN) mm.c:1925:d0 Error pfn 157e68: rd=ffff83019e60c000, od=ffff830141405000, caf=8000000000000003, taf=7400000000000001
(XEN) Xen WARN at mm.c:1926
(XEN) ----[ Xen-4.3-unstable x86_64 debug=y Not tainted ]----
(XEN) CPU: 0
(XEN) RIP: e008:[<ffff82c48016ea17>] get_page+0xd5/0x101
(XEN) RFLAGS: 0000000000010286 CONTEXT: hypervisor
(XEN) rax: 0000000000000000 rbx: ffff830141405000 rcx: 0000000000000000
(XEN) rdx: ffff82c480300920 rsi: 000000000000000a rdi: ffff82c4802766e8
(XEN) rbp: ffff82c4802bfbf8 rsp: ffff82c4802bfba8 r8: 0000000000000004
(XEN) r9: 0000000000000004 r10: 0000000000000004 r11: 0000000000000001
(XEN) r12: 0000000000157e68 r13: ffff83019e60c000 r14: 7400000000000001
(XEN) r15: 8000000000000003 cr0: 000000008005003b cr4: 00000000000026f0
(XEN) cr3: 000000011c180000 cr2: 00007f668d1eb000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
(XEN) Xen stack trace from rsp=ffff82c4802bfba8:
(XEN) ffff830141405000 8000000000000003 7400000000000001 000000000014581d
(XEN) ffff82f6028b03b0 ffff83019e60c000 ffff82f602afcd00 ffff82c4802bfd28
(XEN) ffff82c4802bfd18 0000000000157e68 ffff82c4802bfc58 ffff82c480109ba3
(XEN) ffffffffffffffff 0000000000000000 ffff83011c977fb8 0000000061dfc308
(XEN) 0000000000000000 ffffffffffff8000 0000000000000001 ffff83011d555000
(XEN) ffff83019e60c000 0000000000000000 ffff82c4802bfd98 ffff82c48010c607
(XEN) ffff82c4802bfd34 ffff82c4802bfd30 ffff82c400000001 000000000011cf90
(XEN) 0000000000000000 ffff82c4802b8000 ffff82c4802b8000 ffff82c4802b8000
(XEN) ffff82c4802b8000 ffff82c4802bfd5c 000000029e60c000 ffff82c480300920
(XEN) ffff82c4802b8000 ffff82c4802bfd38 00000002802bfd38 ffff82c4802b8000
(XEN) ffffffff00000000 0000000000000001 ffffc90000028b60 ffffc90000028b60
(XEN) ffff8300dfb03000 0000000000000000 0000000000000000 000000000014581d
(XEN) 00000000000deb3e 0000000000001000 0000000000157e68 000000000b507ff0
(XEN) 0000000000000261 000000000042000d 00000000000204b0 ffffc90000028b38
(XEN) 0000000000000002 ffffc90000028b38 ffffc90000028b38 ffff880185f6fd58
(XEN) ffff880185f6fd78 0000000000000005 ffff82c4802bfef8 ffff82c48010eb65
(XEN) ffff82c4802bfdc8 ffff82c480300960 ffff82c4802bfe18 ffff82c480181831
(XEN) 000000000006df66 000032cfdc175ce6 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000005 ffff82c4802bfe28 0000000000000086
(XEN) ffff82c4802bfe28 ffff82c480125eae ffff83019e60c000 0000000000000286
(XEN) Xen call trace:
(XEN) [<ffff82c48016ea17>] get_page+0xd5/0x101
(XEN) [<ffff82c480109ba3>] __get_paged_frame+0xbf/0x162
(XEN) [<ffff82c48010c607>] gnttab_copy+0x4c6/0x91a
(XEN) [<ffff82c48010eb65>] do_grant_table_op+0x12ad/0x1b23
(XEN) [<ffff82c48022280b>] syscall_enter+0xeb/0x145
(XEN)
(XEN) grant_table.c:2076:d0 source frame ffffffffffffffff invalid.

(Note that I've added a WARN() to mm.c:1925 to see where the get_page call was coming from.)

Connecting the iSCSI disks to another Dom0 works fine, so this problem only happens when trying to connect the disks to the Dom0 where the DomU is running.

I've replaced the Linux DomU serving iSCSI targets with a NetBSD DomU and the problem disappears: I'm able to attach the targets shared by the DomU to the Dom0 without issues.

The problem seems to come from netfront/netback. Does anyone have a clue about what might cause this bad interaction between IET and netfront/netback?

Thanks, Roger.
Konrad Rzeszutek Wilk
2012-Dec-21 14:03 UTC
Re: Create an iSCSI DomU with disks in another DomU running on the same Dom0
On Fri, Dec 21, 2012 at 09:29:39AM +0100, Roger Pau Monné wrote:

> Hello,
>
> I'm trying to use a strange setup that consists of having a DomU
> serving iSCSI targets to the Dom0, which will use these targets as disks
> for other DomUs. I've tried to set up this iSCSI target DomU using both
> Debian Squeeze/Wheezy (with kernels 2.6.32 and 3.2) and iSCSI
> Enterprise Target (IET), and when launching the DomU I get these messages
> from Xen:
>
> [Xen WARNs and stack traces quoted in full in the first message snipped]
>
> (Note that I've added a WARN() to mm.c:1925 to see where the
> get_page call was coming from.)
>
> Connecting the iSCSI disks to another Dom0 works fine, so this
> problem only happens when trying to connect the disks to the
> Dom0 where the DomU is running.

Is this happening when the 'disks' are exported to the domUs?
Are they exported via QEMU or xen-blkback?

> I've replaced the Linux DomU serving iSCSI targets with a
> NetBSD DomU and the problem disappears: I'm able to
> attach the targets shared by the DomU to the Dom0 without
> issues.
>
> The problem seems to come from netfront/netback. Does anyone
> have a clue about what might cause this bad interaction
> between IET and netfront/netback?

Or it might be that we are re-using the PFN for blkback/blkfront
and using the m2p overrides and overwriting the netfront/netback
m2p overrides?

Is this with an HVM domU or PV domU?

> Thanks, Roger.
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
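To make the collision hinted at here concrete: roughly speaking, when a backend maps a foreign grant the kernel records an m2p override so that reverse (MFN-to-PFN) lookups resolve to the local page, and a second user installing an override for the same MFN replaces the earlier entry. The following is a stand-alone toy model of that bookkeeping, not the real arch/x86/xen/p2m.c code; the names, the table layout and the numeric values are purely illustrative.

#include <stdio.h>

/* Toy model of m2p override bookkeeping: each mapped foreign MFN gets
 * an entry recording which local PFN it currently lives at.  A second
 * user installing an entry for the same MFN silently replaces the
 * first one, so the first user's reverse lookups go stale. */
#define TABLE_SIZE 16

struct override {
        unsigned long mfn;      /* machine frame the grant points at */
        unsigned long pfn;      /* local frame it is currently mapped to */
        const char *owner;      /* which backend installed the entry */
};

static struct override table[TABLE_SIZE];

static void add_override(unsigned long mfn, unsigned long pfn,
                         const char *owner)
{
        struct override *o = &table[mfn % TABLE_SIZE];

        if (o->owner && o->mfn == mfn && o->owner != owner)
                printf("WARN: %s overwrites override for mfn %#lx held by %s\n",
                       owner, mfn, o->owner);
        o->mfn = mfn;
        o->pfn = pfn;
        o->owner = owner;
}

int main(void)
{
        /* e.g. one backend records where a granted frame is mapped... */
        add_override(0x157e68, 0x11cf90, "netback");
        /* ...and another later installs its own entry for the same MFN,
         * clobbering the first. */
        add_override(0x157e68, 0x145028, "blkback");
        return 0;
}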
Roger Pau Monné
2012-Dec-21 14:47 UTC
Re: Create an iSCSI DomU with disks in another DomU running on the same Dom0
On 21/12/12 15:03, Konrad Rzeszutek Wilk wrote:

> On Fri, Dec 21, 2012 at 09:29:39AM +0100, Roger Pau Monné wrote:
>> [problem description and Xen stack traces quoted in full above snipped]
>>
>> Connecting the iSCSI disks to another Dom0 works fine, so this
>> problem only happens when trying to connect the disks to the
>> Dom0 where the DomU is running.
>
> Is this happening when the 'disks' are exported to the domUs?
> Are they exported via QEMU or xen-blkback?

The iSCSI disks are connected to the DomUs using blkback, and this is happening when the DomU tries to access its disks.

>>
>> I've replaced the Linux DomU serving iSCSI targets with a
>> NetBSD DomU and the problem disappears: I'm able to
>> attach the targets shared by the DomU to the Dom0 without
>> issues.
>>
>> The problem seems to come from netfront/netback. Does anyone
>> have a clue about what might cause this bad interaction
>> between IET and netfront/netback?
>
> Or it might be that we are re-using the PFN for blkback/blkfront
> and using the m2p overrides and overwriting the netfront/netback
> m2p overrides?

What's strange is that this doesn't happen when the domain that has the targets is a NetBSD PV. There are also problems when blkback is not used (see below), so I guess the problem is between netfront/netback and IET.

> Is this with an HVM domU or PV domU?

Both domains (the domain holding the iSCSI targets, and the created guests) are PV.

Also, I forgot to mention this in the previous email: if I just connect the iSCSI disks to the Dom0, I don't see any errors from Xen, but the Dom0 kernel starts complaining:

[70272.569607] sd 14:0:0:0: [sdc]
[70272.569611] Sense Key : Medium Error [current]
[70272.569619] Info fld=0x0
[70272.569623] sd 14:0:0:0: [sdc]
[70272.569627] Add. Sense: Unrecovered read error
[70272.569633] sd 14:0:0:0: [sdc] CDB:
[70272.569637] Read(10): 28 00 00 00 00 00 00 00 08 00
[70272.569662] end_request: critical target error, dev sdc, sector 0
[70277.571208] sd 14:0:0:0: [sdc] Unhandled sense code
[70277.571220] sd 14:0:0:0: [sdc]
[70277.571224] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[70277.571229] sd 14:0:0:0: [sdc]
[70277.571233] Sense Key : Medium Error [current]
[70277.571241] Info fld=0x0
[70277.571245] sd 14:0:0:0: [sdc]
[70277.571249] Add. Sense: Unrecovered read error
[70277.571255] sd 14:0:0:0: [sdc] CDB:
[70277.571259] Read(10): 28 00 00 00 00 00 00 00 08 00
[70277.571284] end_request: critical target error, dev sdc, sector 0
[70282.572768] sd 14:0:0:0: [sdc] Unhandled sense code
[70282.572781] sd 14:0:0:0: [sdc]
[70282.572785] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[70282.572790] sd 14:0:0:0: [sdc]
[70282.572794] Sense Key : Medium Error [current]
[70282.572802] Info fld=0x0
[70282.572806] sd 14:0:0:0: [sdc]
[70282.572810] Add. Sense: Unrecovered read error
[70282.572816] sd 14:0:0:0: [sdc] CDB:
[70282.572820] Read(10): 28 00 00 00 00 00 00 00 08 00
[70282.572846] end_request: critical target error, dev sdc, sector 0
[70287.574397] sd 14:0:0:0: [sdc] Unhandled sense code
[70287.574409] sd 14:0:0:0: [sdc]
[70287.574413] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[70287.574418] sd 14:0:0:0: [sdc]
[70287.574422] Sense Key : Medium Error [current]
[70287.574430] Info fld=0x0
[70287.574434] sd 14:0:0:0: [sdc]
[70287.574438] Add. Sense: Unrecovered read error
[70287.574445] sd 14:0:0:0: [sdc] CDB:
[70287.574448] Read(10): 28 00 00 00 00 00 00 00 08 00
[70287.574474] end_request: critical target error, dev sdc, sector 0

When I try to attach the targets to another Dom0 everything works fine; the problem only happens when the iSCSI target is a DomU and you attach the disks from the Dom0 on the same machine.

Thanks, Roger.
Konrad Rzeszutek Wilk
2012-Dec-21 17:35 UTC
Re: Create an iSCSI DomU with disks in another DomU running on the same Dom0
On Fri, Dec 21, 2012 at 03:47:20PM +0100, Roger Pau Monné wrote:

> On 21/12/12 15:03, Konrad Rzeszutek Wilk wrote:
> > Is this happening when the 'disks' are exported to the domUs?
> > Are they exported via QEMU or xen-blkback?
>
> The iSCSI disks are connected to the DomUs using blkback, and this is
> happening when the DomU tries to access its disks.
>
> Both domains (the domain holding the iSCSI targets, and the created
> guests) are PV.
>
> [rest of the quoting and the Dom0 "Medium Error" kernel log snipped]
>
> When I try to attach the targets to another Dom0 everything works fine;
> the problem only happens when the iSCSI target is a DomU and you attach
> the disks from the Dom0 on the same machine.

I think we are just swizzling the PFNs with a different MFN when you do the domU -> domX, using two ring protocols. Weird thought as the m2p code has checks WARN_ON(PagePrivate(..)) to catch this sort of thing.

What happens if the dom0/domU are all 3.8 with the persistent grant patches?

> Thanks, Roger.
Roger Pau Monné
2013-Jan-02 13:05 UTC
Re: Create an iSCSI DomU with disks in another DomU running on the same Dom0
On 21/12/12 18:35, Konrad Rzeszutek Wilk wrote:

> On Fri, Dec 21, 2012 at 03:47:20PM +0100, Roger Pau Monné wrote:
>> [earlier quoting, the Xen stack traces and the Dom0 kernel log snipped]
>>
>> When I try to attach the targets to another Dom0 everything works fine;
>> the problem only happens when the iSCSI target is a DomU and you attach
>> the disks from the Dom0 on the same machine.
>
> I think we are just swizzling the PFNs with a different MFN when you
> do the domU -> domX, using two ring protocols. Weird thought as the
> m2p code has checks WARN_ON(PagePrivate(..)) to catch this sort of
> thing.
>
> What happens if the dom0/domU are all 3.8 with the persistent grant
> patches?

Sorry for the delay, the same error happens when Dom0/DomU is using a persistent grants enabled kernel, although I had to backport the persistent grants patch to 3.2, because I was unable to get iSCSI Enterprise Target dkms working with 3.8. I'm also seeing these messages in the DomU that's running the iSCSI target:

[ 511.338845] net_ratelimit: 36 callbacks suppressed
[ 511.338851] net eth0: rx->offset: 0, size: 4294967295
[ 512.288282] net eth0: rx->offset: 0, size: 4294967295
[ 512.525639] net eth0: rx->offset: 0, size: 4294967295
[ 512.800729] net eth0: rx->offset: 0, size: 4294967295
[ 512.800732] net eth0: rx->offset: 0, size: 4294967295
[ 513.049447] net eth0: rx->offset: 0, size: 4294967295
[ 513.050125] net eth0: rx->offset: 0, size: 4294967295
[ 513.313493] net eth0: rx->offset: 0, size: 4294967295
[ 513.313497] net eth0: rx->offset: 0, size: 4294967295
[ 513.557233] net eth0: rx->offset: 0, size: 4294967295
[ 517.422772] net_ratelimit: 61 callbacks suppressed
[ 517.422777] net eth0: rx->offset: 0, size: 4294967295
[ 517.422780] net eth0: rx->offset: 0, size: 4294967295
[ 517.667053] net eth0: rx->offset: 0, size: 4294967295
[ 517.667640] net eth0: rx->offset: 0, size: 4294967295
[ 517.879690] net eth0: rx->offset: 0, size: 4294967295
[ 517.879693] net eth0: rx->offset: 0, size: 4294967295
[ 518.125314] net eth0: rx->offset: 0, size: 4294967295
[ 518.125907] net eth0: rx->offset: 0, size: 4294967295
[ 518.477026] net eth0: rx->offset: 0, size: 4294967295
[ 518.477029] net eth0: rx->offset: 0, size: 4294967295
[ 553.400129] net_ratelimit: 84 callbacks suppressed
[ 553.400134] net eth0: rx->offset: 0, size: 4294967295
[ 553.400615] net eth0: rx->offset: 0, size: 4294967295
[ 553.400618] net eth0: rx->offset: 0, size: 4294967295
[ 553.400620] net eth0: rx->offset: 0, size: 4294967295
[ 553.603476] net eth0: rx->offset: 0, size: 4294967295
[ 553.604103] net eth0: rx->offset: 0, size: 4294967295
[ 553.807444] net eth0: rx->offset: 0, size: 4294967295
[ 553.807447] net eth0: rx->offset: 0, size: 4294967295
[ 554.049223] net eth0: rx->offset: 0, size: 4294967295
[ 554.049902] net eth0: rx->offset: 0, size: 4294967295
[ 581.912017] net_ratelimit: 8 callbacks suppressed
[ 581.912022] net eth0: rx->offset: 0, size: 4294967295
[ 581.912496] net eth0: rx->offset: 0, size: 4294967295
[ 582.118996] net eth0: rx->offset: 0, size: 4294967295
[ 582.357495] net eth0: rx->offset: 0, size: 4294967295
[ 615.983921] net eth0: rx->offset: 0, size: 4294967295
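One detail worth decoding in that log: 4294967295 is (u32)-1. In the netfront RX path the status field of an RX response doubles as the received size, so a backend whose grant copy failed with a negative status (e.g. GNTST_general_error, which is -1) gets logged this way. A minimal stand-alone illustration, not the netfront code itself:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
        /* The RX response carries a signed 16-bit status; grant-copy
         * failures such as GNTST_general_error are negative.  Printed
         * as an unsigned 32-bit "size" that becomes 4294967295. */
        int16_t status = -1;

        printf("net eth0: rx->offset: 0, size: %u\n",
               (uint32_t)(int32_t)status);
        return 0;
}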
Konrad Rzeszutek Wilk
2013-Jan-02 21:36 UTC
Re: Create an iSCSI DomU with disks in another DomU running on the same Dom0
> > I think we are just swizzling the PFNs with a different MFN when you
> > do the domU -> domX, using two ring protocols. Weird thought as the
> > m2p code has checks WARN_ON(PagePrivate(..)) to catch this sort of
> > thing.
> >
> > What happens if the dom0/domU are all 3.8 with the persistent grant
> > patches?
>
> Sorry for the delay, the same error happens when Dom0/DomU is using a
> persistent grants enabled kernel, although I had to backport the
> persistent grants patch to 3.2, because I was unable to get iSCSI
> Enterprise Target dkms working with 3.8. I'm also seeing these messages
> in the DomU that's running the iSCSI target:
>
> [ 511.338845] net_ratelimit: 36 callbacks suppressed
> [ 511.338851] net eth0: rx->offset: 0, size: 4294967295

-1 ?! I saw this somewhere with 9000 MTUs.

> [ 512.288282] net eth0: rx->offset: 0, size: 4294967295
> [ 512.525639] net eth0: rx->offset: 0, size: 4294967295
> [ 512.800729] net eth0: rx->offset: 0, size: 4294967295

But wow. It is just all over.

Could you instrument the M2P code to print out the PFN/MFN values as they are being altered (along with __builtin_func(1) to get an idea of who is doing it)? Perhaps that will shed light on whether my theory (that we are overwriting the MFNs) is truly happening. It does not help that it ends up using multicalls - so it might be that they are being done in batches - so there are multiple MFN updates. Perhaps the multicalls have two or more changes to the same MFN?
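A minimal sketch of the sort of instrumentation being requested, assuming it is dropped into the M2P override update path of the dom0/domU kernel; the helper name is made up, the exact functions to hook differ between kernel versions, and __builtin_func(1) above is taken to mean the GCC caller-address builtin, so __builtin_return_address(0) is used here:

#include <linux/kernel.h>
#include <linux/mm.h>

/* Hypothetical debug helper (not an existing kernel function): call it
 * from the places that install or replace an M2P override, so that two
 * backends touching the same MFN show up next to each other in dmesg
 * together with their callers. */
static inline void m2p_debug_trace(const char *what, unsigned long mfn,
                                   struct page *page)
{
        pr_warn("m2p %s: mfn %#lx <-> pfn %#lx (caller %pS)\n",
                what, mfn, page_to_pfn(page), __builtin_return_address(0));
}

Calling it with e.g. "add" and "remove" from the corresponding paths and grepping dmesg for the pfn reported by the hypervisor (157e68 in the traces above) would show which callers touched that frame and in which order.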
Roger Pau Monné
2013-Jan-09 19:23 UTC
Re: Create an iSCSI DomU with disks in another DomU running on the same Dom0
On 02/01/13 22:36, Konrad Rzeszutek Wilk wrote:

>>> I think we are just swizzling the PFNs with a different MFN when you
>>> do the domU -> domX, using two ring protocols. Weird thought as the
>>> m2p code has checks WARN_ON(PagePrivate(..)) to catch this sort of
>>> thing.
>>>
>>> What happens if the dom0/domU are all 3.8 with the persistent grant
>>> patches?
>>
>> Sorry for the delay, the same error happens when Dom0/DomU is using a
>> persistent grants enabled kernel, although I had to backport the
>> persistent grants patch to 3.2, because I was unable to get iSCSI
>> Enterprise Target dkms working with 3.8. I'm also seeing these messages
>> in the DomU that's running the iSCSI target:
>>
>> [ 511.338845] net_ratelimit: 36 callbacks suppressed
>> [ 511.338851] net eth0: rx->offset: 0, size: 4294967295
>
> -1 ?! I saw this somewhere with 9000 MTUs.
>
>> [ 512.288282] net eth0: rx->offset: 0, size: 4294967295
>> [ 512.525639] net eth0: rx->offset: 0, size: 4294967295
>> [ 512.800729] net eth0: rx->offset: 0, size: 4294967295
>
> But wow. It is just all over.
>
> Could you instrument the M2P code to print out the PFN/MFN
> values as they are being altered (along with __builtin_func(1) to
> get an idea of who is doing it)? Perhaps that will shed light on whether
> my theory (that we are overwriting the MFNs) is truly happening.
> It does not help that it ends up using multicalls - so it might be
> that they are being done in batches - so there are multiple
> MFN updates. Perhaps the multicalls have two or more changes to the
> same MFN?

A little more info: I've found that we are passing FOREIGN_FRAMES in the skb fragments on netback. When we try to perform the grant copy operation using a foreign mfn as source, we hit the error. Here is the stack trace of the addition of the bogus skb:

[ 107.094109] Pid: 64, comm: kworker/u:5 Not tainted 3.7.0-rc3+ #8
[ 107.094114] Call Trace:
[ 107.094126] [<ffffffff813ff099>] xen_netbk_queue_tx_skb+0x16b/0x1aa
[ 107.094135] [<ffffffff81400088>] xenvif_start_xmit+0x7b/0x9e
[ 107.094143] [<ffffffff81460fe7>] dev_hard_start_xmit+0x25e/0x3db
[ 107.094151] [<ffffffff81478917>] sch_direct_xmit+0x6e/0x150
[ 107.094159] [<ffffffff814612ce>] dev_queue_xmit+0x16a/0x360
[ 107.094168] [<ffffffff814c2426>] br_dev_queue_push_xmit+0x5c/0x62
[ 107.094175] [<ffffffff814c2536>] br_deliver+0x35/0x3f
[ 107.094182] [<ffffffff814c12b8>] br_dev_xmit+0xd7/0xef
[ 107.094189] [<ffffffff81460fe7>] dev_hard_start_xmit+0x25e/0x3db
[ 107.094197] [<ffffffff814549b6>] ? __alloc_skb+0x8d/0x187
[ 107.094204] [<ffffffff81461409>] dev_queue_xmit+0x2a5/0x360
[ 107.094212] [<ffffffff81485991>] ip_finish_output2+0x25c/0x2b7
[ 107.094219] [<ffffffff81485a62>] ip_finish_output+0x76/0x7b
[ 107.094226] [<ffffffff81485aa1>] ip_output+0x3a/0x3c
[ 107.094235] [<ffffffff81483367>] dst_output+0xf/0x11
[ 107.094242] [<ffffffff81483558>] ip_local_out+0x5c/0x5e
[ 107.094249] [<ffffffff81485560>] ip_queue_xmit+0x2ce/0x2fc
[ 107.094256] [<ffffffff814969b4>] tcp_transmit_skb+0x746/0x787
[ 107.094264] [<ffffffff81498e73>] tcp_write_xmit+0x837/0x949
[ 107.094273] [<ffffffff810f1fea>] ? virt_to_head_page+0x9/0x2c
[ 107.094281] [<ffffffff810f21aa>] ? ksize+0x1a/0x24
[ 107.094288] [<ffffffff814549ca>] ? __alloc_skb+0xa1/0x187
[ 107.094295] [<ffffffff81499212>] __tcp_push_pending_frames+0x2c/0x59
[ 107.094302] [<ffffffff8148af1a>] tcp_push+0x87/0x89
[ 107.094309] [<ffffffff8148cd26>] tcp_sendpage+0x448/0x480
[ 107.094317] [<ffffffff814aacc8>] inet_sendpage+0xa0/0xb5
[ 107.094327] [<ffffffff81329aac>] iscsi_sw_tcp_pdu_xmit+0xa2/0x236
[ 107.094335] [<ffffffff81328188>] iscsi_tcp_task_xmit+0x34/0x236
[ 107.094345] [<ffffffff8100d8b5>] ? __spin_time_accum+0x17/0x2e
[ 107.094352] [<ffffffff8100daf3>] ? __xen_spin_lock+0xb7/0xcd
[ 107.094360] [<ffffffff8132434a>] iscsi_xmit_task+0x52/0x94
[ 107.094367] [<ffffffff81324ea2>] iscsi_xmitworker+0x1c2/0x2b9
[ 107.094375] [<ffffffff81324ce0>] ? iscsi_prep_scsi_cmd_pdu+0x604/0x604
[ 107.094384] [<ffffffff81052bbe>] process_one_work+0x20b/0x2f9
[ 107.094391] [<ffffffff81052e17>] worker_thread+0x16b/0x272
[ 107.094398] [<ffffffff81052cac>] ? process_one_work+0x2f9/0x2f9
[ 107.094406] [<ffffffff81056c60>] kthread+0xb0/0xb8
[ 107.094414] [<ffffffff81056bb0>] ? kthread_freezable_should_stop+0x5b/0x5b
[ 107.094422] [<ffffffff8151f4bc>] ret_from_fork+0x7c/0xb0
[ 107.094430] [<ffffffff81056bb0>] ? kthread_freezable_should_stop+0x5b/0x5b

I will try to find out who is setting those skb frags. Do you have any idea, Konrad?
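For context on why a foreign MFN in an skb fragment is fatal at this point: netback's RX path describes each fragment to Xen as a grant copy whose source is supposed to be a machine frame owned by the sending domain. The sketch below is a heavily simplified version of how such a copy descriptor gets filled in; the helper name and the offset handling are illustrative, not the actual netback code. If the fragment page is really blkback's mapping of a grant from another guest, the MFN passed as a DOMID_SELF frame is owned by that other domain, Xen's get_page() in gnttab_copy refuses it, and the "source frame ... invalid" message from the first mail is the result.

#include <linux/mm.h>
#include <xen/page.h>
#include <xen/interface/grant_table.h>

/* Heavily simplified sketch (illustrative names, not the real netback
 * code): describe one skb fragment as the source of a grant copy into
 * a buffer granted by the receiving guest. */
static void fill_rx_copy_op(struct gnttab_copy *op, struct page *page,
                            unsigned int offset, unsigned int len,
                            domid_t dst_domid, grant_ref_t dst_gref)
{
        op->flags = GNTCOPY_dest_gref;

        /* Source: claimed to be one of our own machine frames... */
        op->source.domid = DOMID_SELF;
        op->source.u.gmfn = pfn_to_mfn(page_to_pfn(page));
        op->source.offset = offset;

        /* ...but if 'page' is actually a mapped foreign grant (such as
         * one of blkback's granted pages), this MFN is owned by another
         * domain and Xen rejects the copy. */

        /* Destination: the grant the guest put on its RX ring. */
        op->dest.domid = dst_domid;
        op->dest.u.ref = dst_gref;
        op->dest.offset = 0;

        op->len = len;
}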
Konrad Rzeszutek Wilk
2013-Jan-11 15:06 UTC
Re: Create an iSCSI DomU with disks in another DomU running on the same Dom0
On Wed, Jan 09, 2013 at 08:23:45PM +0100, Roger Pau Monné wrote:
> On 02/01/13 22:36, Konrad Rzeszutek Wilk wrote:
> >>> I think we are just swizzling the PFNs with a different MFN when you
> >>> do the domU -> domX, using two ring protocols. Weird thought as the
> >>> m2p code has checks WARN_ON(PagePrivate(..)) to catch this sort of
> >>> thing.
> >>>
> >>> What happens if the dom0/domU are all 3.8 with the persistent grant
> >>> patches?
> >>
> >> Sorry for the delay, the same error happens when Dom0/DomU is using a
> >> persistent grants enabled kernel, although I had to backport the
> >> persistent grants patch to 3.2, because I was unable to get iSCSI
> >> Enterprise Target dkms working with 3.8. I'm also seeing these messages
> >> in the DomU that's running the iSCSI target:
> >>
> >> [ 511.338845] net_ratelimit: 36 callbacks suppressed
> >> [ 511.338851] net eth0: rx->offset: 0, size: 4294967295
> >
> > -1 ?! I saw this somewhere with 9000 MTUs.
> >
> >> [ 512.288282] net eth0: rx->offset: 0, size: 4294967295
> >> [ 512.525639] net eth0: rx->offset: 0, size: 4294967295
> >> [ 512.800729] net eth0: rx->offset: 0, size: 4294967295
> >
> > But wow. It is just all over.
> >
> > Could you instrument the M2P code to print out the PFN/MFN
> > values as they are being altered (along with __builtin_func(1) to
> > get an idea who is doing it). Perhaps that will shed light on whether
> > my theory (that we are overwriting the MFNs) is truly happening.
> > It does not help that it ends up using multicalls - so it might be
> > that they are being done in batches - so there are multiple
> > MFN updates. Perhaps the multicalls have two or more changes to the
> > same MFN?
>
> A little more info, I've found that we are passing FOREIGN_FRAMES in
> the skb fragments on netback. When we try to perform the grant copy
> operation using a foreign mfn as source, we hit the error. Here is
> the stack trace of the addition of the bogus skb:
>
> [ 107.094109] Pid: 64, comm: kworker/u:5 Not tainted 3.7.0-rc3+ #8
> [ 107.094114] Call Trace:
> [ 107.094126]  [<ffffffff813ff099>] xen_netbk_queue_tx_skb+0x16b/0x1aa
> [ 107.094135]  [<ffffffff81400088>] xenvif_start_xmit+0x7b/0x9e
> [ 107.094143]  [<ffffffff81460fe7>] dev_hard_start_xmit+0x25e/0x3db
> [ 107.094151]  [<ffffffff81478917>] sch_direct_xmit+0x6e/0x150
> [ 107.094159]  [<ffffffff814612ce>] dev_queue_xmit+0x16a/0x360
> [ 107.094168]  [<ffffffff814c2426>] br_dev_queue_push_xmit+0x5c/0x62
> [ 107.094175]  [<ffffffff814c2536>] br_deliver+0x35/0x3f
> [ 107.094182]  [<ffffffff814c12b8>] br_dev_xmit+0xd7/0xef
> [ 107.094189]  [<ffffffff81460fe7>] dev_hard_start_xmit+0x25e/0x3db
> [ 107.094197]  [<ffffffff814549b6>] ? __alloc_skb+0x8d/0x187
> [ 107.094204]  [<ffffffff81461409>] dev_queue_xmit+0x2a5/0x360
> [ 107.094212]  [<ffffffff81485991>] ip_finish_output2+0x25c/0x2b7
> [ 107.094219]  [<ffffffff81485a62>] ip_finish_output+0x76/0x7b
> [ 107.094226]  [<ffffffff81485aa1>] ip_output+0x3a/0x3c
> [ 107.094235]  [<ffffffff81483367>] dst_output+0xf/0x11
> [ 107.094242]  [<ffffffff81483558>] ip_local_out+0x5c/0x5e
> [ 107.094249]  [<ffffffff81485560>] ip_queue_xmit+0x2ce/0x2fc
> [ 107.094256]  [<ffffffff814969b4>] tcp_transmit_skb+0x746/0x787
> [ 107.094264]  [<ffffffff81498e73>] tcp_write_xmit+0x837/0x949
> [ 107.094273]  [<ffffffff810f1fea>] ? virt_to_head_page+0x9/0x2c
> [ 107.094281]  [<ffffffff810f21aa>] ? ksize+0x1a/0x24
> [ 107.094288]  [<ffffffff814549ca>] ? __alloc_skb+0xa1/0x187
> [ 107.094295]  [<ffffffff81499212>] __tcp_push_pending_frames+0x2c/0x59
> [ 107.094302]  [<ffffffff8148af1a>] tcp_push+0x87/0x89
> [ 107.094309]  [<ffffffff8148cd26>] tcp_sendpage+0x448/0x480
> [ 107.094317]  [<ffffffff814aacc8>] inet_sendpage+0xa0/0xb5
> [ 107.094327]  [<ffffffff81329aac>] iscsi_sw_tcp_pdu_xmit+0xa2/0x236
> [ 107.094335]  [<ffffffff81328188>] iscsi_tcp_task_xmit+0x34/0x236
> [ 107.094345]  [<ffffffff8100d8b5>] ? __spin_time_accum+0x17/0x2e
> [ 107.094352]  [<ffffffff8100daf3>] ? __xen_spin_lock+0xb7/0xcd
> [ 107.094360]  [<ffffffff8132434a>] iscsi_xmit_task+0x52/0x94
> [ 107.094367]  [<ffffffff81324ea2>] iscsi_xmitworker+0x1c2/0x2b9
> [ 107.094375]  [<ffffffff81324ce0>] ? iscsi_prep_scsi_cmd_pdu+0x604/0x604
> [ 107.094384]  [<ffffffff81052bbe>] process_one_work+0x20b/0x2f9
> [ 107.094391]  [<ffffffff81052e17>] worker_thread+0x16b/0x272
> [ 107.094398]  [<ffffffff81052cac>] ? process_one_work+0x2f9/0x2f9
> [ 107.094406]  [<ffffffff81056c60>] kthread+0xb0/0xb8
> [ 107.094414]  [<ffffffff81056bb0>] ? kthread_freezable_should_stop+0x5b/0x5b
> [ 107.094422]  [<ffffffff8151f4bc>] ret_from_fork+0x7c/0xb0
> [ 107.094430]  [<ffffffff81056bb0>] ? kthread_freezable_should_stop+0x5b/0x5b
>
> I will try to find out who is setting those skb frags. Do you have any
> idea Konrad?

m2p_add_override.
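As an aside for readers following the debugging suggestion above, here is a minimal sketch of the kind of M2P instrumentation being asked for. It assumes a 3.x-era m2p_add_override() in arch/x86/xen/p2m.c and that __builtin_return_address() is what "__builtin_func(1)" refers to; the helper name is made up and this is not code from the thread.

#include <linux/mm.h>
#include <linux/printk.h>
#include <asm/xen/page.h>

/* Log one PFN<->MFN override together with the function that requested it. */
static void trace_m2p_override(unsigned long mfn, struct page *page,
                               void *caller)
{
        unsigned long pfn = page_to_pfn(page);

        pr_debug("m2p override: pfn 0x%lx -> mfn 0x%lx (was 0x%lx), caller %pS\n",
                 pfn, mfn, get_phys_to_machine(pfn), caller);
}

/*
 * Hypothetical call site, e.g. near the top of m2p_add_override():
 *
 *      trace_m2p_override(mfn, page, __builtin_return_address(0));
 */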
Roger Pau Monné
2013-Jan-11 15:57 UTC
Re: Create a iSCSI DomU with disks in another DomU running on the same Dom0
Hello Konrad,

I've found the problem: blkback is adding granted pages to the bios that
are then passed to the underlying block device. When using an iSCSI
target running on another DomU on the same h/w, these bios end up in
netback, and when netback performs the gnttab copy operation it
complains because the passed mfn belongs to a different domain.

I've checked this by applying the appended patch to blkback, which
allocates a buffer to pass to the bio instead of using the granted
page. Of course this should not be applied, since it implies additional
memcpys.

I think the right way to solve this would be to change netback to
use gnttab_map and memcpy instead of gnttab_copy, but I guess this
will imply a performance degradation (I haven't benchmarked it, but I
assume gnttab_copy is used in netback because it is faster than
gnttab_map + memcpy + gnttab_unmap).

---

diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
index 8808028..9740cbb 100644
--- a/drivers/block/xen-blkback/blkback.c
+++ b/drivers/block/xen-blkback/blkback.c
@@ -80,6 +80,8 @@ struct pending_req {
 	unsigned short operation;
 	int status;
 	struct list_head free_list;
+	struct page *grant_pages[BLKIF_MAX_SEGMENTS_PER_REQUEST];
+	void *bio_pages[BLKIF_MAX_SEGMENTS_PER_REQUEST];
 	DECLARE_BITMAP(unmap_seg, BLKIF_MAX_SEGMENTS_PER_REQUEST);
 };

@@ -701,6 +703,7 @@ static void xen_blk_drain_io(struct xen_blkif *blkif)

 static void __end_block_io_op(struct pending_req *pending_req, int error)
 {
+	int i;
 	/* An error fails the entire request. */
 	if ((pending_req->operation == BLKIF_OP_FLUSH_DISKCACHE) &&
 	    (error == -EOPNOTSUPP)) {
@@ -724,6 +727,16 @@ static void __end_block_io_op(struct pending_req *pending_req, int error)
 	 * the proper response on the ring.
 	 */
 	if (atomic_dec_and_test(&pending_req->pendcnt)) {
+		for (i = 0; i < pending_req->nr_pages; i++) {
+			BUG_ON(pending_req->bio_pages[i] == NULL);
+			if (pending_req->operation == BLKIF_OP_READ) {
+				void *grant = kmap_atomic(pending_req->grant_pages[i]);
+				memcpy(grant, pending_req->bio_pages[i],
+				       PAGE_SIZE);
+				kunmap_atomic(grant);
+			}
+			kfree(pending_req->bio_pages[i]);
+		}
 		xen_blkbk_unmap(pending_req);
 		make_response(pending_req->blkif, pending_req->id,
 			      pending_req->operation, pending_req->status);
@@ -846,7 +859,6 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif,
 	int operation;
 	struct blk_plug plug;
 	bool drain = false;
-	struct page *pages[BLKIF_MAX_SEGMENTS_PER_REQUEST];

 	switch (req->operation) {
 	case BLKIF_OP_READ:
@@ -889,6 +901,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif,
 	pending_req->operation = req->operation;
 	pending_req->status = BLKIF_RSP_OKAY;
 	pending_req->nr_pages = nseg;
+	memset(pending_req->bio_pages, 0, sizeof(pending_req->bio_pages));

 	for (i = 0; i < nseg; i++) {
 		seg[i].nsec = req->u.rw.seg[i].last_sect -
@@ -933,7 +946,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif,
 	 * the hypercall to unmap the grants - that is all done in
 	 * xen_blkbk_unmap.
 	 */
-	if (xen_blkbk_map(req, pending_req, seg, pages))
+	if (xen_blkbk_map(req, pending_req, seg, pending_req->grant_pages))
 		goto fail_flush;

 	/*
@@ -943,9 +956,17 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif,
 	xen_blkif_get(blkif);

 	for (i = 0; i < nseg; i++) {
+		void *grant;
+		pending_req->bio_pages[i] = kmalloc(PAGE_SIZE, GFP_KERNEL);
+		if (req->operation == BLKIF_OP_WRITE) {
+			grant = kmap_atomic(pending_req->grant_pages[i]);
+			memcpy(pending_req->bio_pages[i], grant,
+			       PAGE_SIZE);
+			kunmap_atomic(grant);
+		}
 		while ((bio == NULL) ||
 		       (bio_add_page(bio,
-				     pages[i],
+				     virt_to_page(pending_req->bio_pages[i]),
 				     seg[i].nsec << 9,
 				     seg[i].buf & ~PAGE_MASK) == 0)) {
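For context on the failure described above, here is a rough sketch (not the actual netback code) of the GNTTABOP_copy operation netback issues when copying packet data into a receiving guest, with made-up domid/grant-ref values. The struct layout and flags come from Xen's public grant_table.h. The source is named by a gmfn that Xen requires to belong to the calling domain; when that frame is really one blkback grant-mapped from a different guest, the ownership check rejects it, which is the "source frame ... invalid" error quoted earlier in this thread.

#include <linux/errno.h>
#include <xen/interface/xen.h>
#include <xen/interface/grant_table.h>
#include <asm/xen/hypercall.h>

/* Copy 'len' bytes from a local frame into a grant offered by the frontend. */
static int example_copy_to_guest(domid_t frontend_domid, grant_ref_t rx_ref,
                                 unsigned long src_gmfn, unsigned int len)
{
        struct gnttab_copy op = {
                .flags         = GNTCOPY_dest_gref,
                .len           = len,
                .source.domid  = DOMID_SELF,
                .source.u.gmfn = src_gmfn,   /* must be a frame this domain owns */
                .source.offset = 0,
                .dest.domid    = frontend_domid,
                .dest.u.ref    = rx_ref,
                .dest.offset   = 0,
        };

        if (HYPERVISOR_grant_table_op(GNTTABOP_copy, &op, 1))
                return -EFAULT;

        return op.status == GNTST_okay ? 0 : -EIO;
}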
Konrad Rzeszutek Wilk
2013-Jan-11 18:51 UTC
Re: Create a iSCSI DomU with disks in another DomU running on the same Dom0
On Fri, Jan 11, 2013 at 04:57:52PM +0100, Roger Pau Monné wrote:
> Hello Konrad,
>
> I've found the problem: blkback is adding granted pages to the bios that
> are then passed to the underlying block device. When using an iSCSI
> target running on another DomU on the same h/w, these bios end up in
> netback, and when netback performs the gnttab copy operation it
> complains because the passed mfn belongs to a different domain.

OK, so my original theory was sound. The m2p override "sticks".

> I've checked this by applying the appended patch to blkback, which
> allocates a buffer to pass to the bio instead of using the granted
> page. Of course this should not be applied, since it implies additional
> memcpys.
>
> I think the right way to solve this would be to change netback to
> use gnttab_map and memcpy instead of gnttab_copy, but I guess this
> will imply a performance degradation (I haven't benchmarked it, but I
> assume gnttab_copy is used in netback because it is faster than
> gnttab_map + memcpy + gnttab_unmap).

Or blkback is altered to use grant_copy. Or perhaps m2p_override
can do multiple PAGE_FOREIGN? (So if it detects a collision it will
do something smart.. like allocate a new page or update the
kmap_op with extra information).

And yes, grant_map in netback is much much slower than grant_copy
(I tested 2.6.32 vs 3.7 using a Xen 4.1.3 with the grant_copy fixes
that Jan came up with). See attached.

> ---
>
> diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
> index 8808028..9740cbb 100644
> --- a/drivers/block/xen-blkback/blkback.c
> +++ b/drivers/block/xen-blkback/blkback.c
> @@ -80,6 +80,8 @@ struct pending_req {
>  	unsigned short operation;
>  	int status;
>  	struct list_head free_list;
> +	struct page *grant_pages[BLKIF_MAX_SEGMENTS_PER_REQUEST];
> +	void *bio_pages[BLKIF_MAX_SEGMENTS_PER_REQUEST];
>  	DECLARE_BITMAP(unmap_seg, BLKIF_MAX_SEGMENTS_PER_REQUEST);
>  };
>
> @@ -701,6 +703,7 @@ static void xen_blk_drain_io(struct xen_blkif *blkif)
>
>  static void __end_block_io_op(struct pending_req *pending_req, int error)
>  {
> +	int i;
>  	/* An error fails the entire request. */
>  	if ((pending_req->operation == BLKIF_OP_FLUSH_DISKCACHE) &&
>  	    (error == -EOPNOTSUPP)) {
> @@ -724,6 +727,16 @@ static void __end_block_io_op(struct pending_req *pending_req, int error)
>  	 * the proper response on the ring.
>  	 */
>  	if (atomic_dec_and_test(&pending_req->pendcnt)) {
> +		for (i = 0; i < pending_req->nr_pages; i++) {
> +			BUG_ON(pending_req->bio_pages[i] == NULL);
> +			if (pending_req->operation == BLKIF_OP_READ) {
> +				void *grant = kmap_atomic(pending_req->grant_pages[i]);
> +				memcpy(grant, pending_req->bio_pages[i],
> +				       PAGE_SIZE);
> +				kunmap_atomic(grant);
> +			}
> +			kfree(pending_req->bio_pages[i]);
> +		}
>  		xen_blkbk_unmap(pending_req);
>  		make_response(pending_req->blkif, pending_req->id,
>  			      pending_req->operation, pending_req->status);
> @@ -846,7 +859,6 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif,
>  	int operation;
>  	struct blk_plug plug;
>  	bool drain = false;
> -	struct page *pages[BLKIF_MAX_SEGMENTS_PER_REQUEST];
>
>  	switch (req->operation) {
>  	case BLKIF_OP_READ:
> @@ -889,6 +901,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif,
>  	pending_req->operation = req->operation;
>  	pending_req->status = BLKIF_RSP_OKAY;
>  	pending_req->nr_pages = nseg;
> +	memset(pending_req->bio_pages, 0, sizeof(pending_req->bio_pages));
>
>  	for (i = 0; i < nseg; i++) {
>  		seg[i].nsec = req->u.rw.seg[i].last_sect -
> @@ -933,7 +946,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif,
>  	 * the hypercall to unmap the grants - that is all done in
>  	 * xen_blkbk_unmap.
>  	 */
> -	if (xen_blkbk_map(req, pending_req, seg, pages))
> +	if (xen_blkbk_map(req, pending_req, seg, pending_req->grant_pages))
>  		goto fail_flush;
>
>  	/*
> @@ -943,9 +956,17 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif,
>  	xen_blkif_get(blkif);
>
>  	for (i = 0; i < nseg; i++) {
> +		void *grant;
> +		pending_req->bio_pages[i] = kmalloc(PAGE_SIZE, GFP_KERNEL);
> +		if (req->operation == BLKIF_OP_WRITE) {
> +			grant = kmap_atomic(pending_req->grant_pages[i]);
> +			memcpy(pending_req->bio_pages[i], grant,
> +			       PAGE_SIZE);
> +			kunmap_atomic(grant);
> +		}
>  		while ((bio == NULL) ||
>  		       (bio_add_page(bio,
> -				     pages[i],
> +				     virt_to_page(pending_req->bio_pages[i]),
>  				     seg[i].nsec << 9,
>  				     seg[i].buf & ~PAGE_MASK) == 0)) {

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
Roger Pau Monné
2013-Jan-11 19:29 UTC
Re: Create a iSCSI DomU with disks in another DomU running on the same Dom0
On 11/01/13 19:51, Konrad Rzeszutek Wilk wrote:
> On Fri, Jan 11, 2013 at 04:57:52PM +0100, Roger Pau Monné wrote:
>> Hello Konrad,
>>
>> I've found the problem: blkback is adding granted pages to the bios that
>> are then passed to the underlying block device. When using an iSCSI
>> target running on another DomU on the same h/w, these bios end up in
>> netback, and when netback performs the gnttab copy operation it
>> complains because the passed mfn belongs to a different domain.
>
> OK, so my original theory was sound. The m2p override "sticks".
>
>> I've checked this by applying the appended patch to blkback, which
>> allocates a buffer to pass to the bio instead of using the granted
>> page. Of course this should not be applied, since it implies additional
>> memcpys.
>>
>> I think the right way to solve this would be to change netback to
>> use gnttab_map and memcpy instead of gnttab_copy, but I guess this
>> will imply a performance degradation (I haven't benchmarked it, but I
>> assume gnttab_copy is used in netback because it is faster than
>> gnttab_map + memcpy + gnttab_unmap).
>
> Or blkback is altered to use grant_copy.

This would not work with the persistent-grants extension, and it will
probably also show degraded performance when scaling to a large number
of guests, due to the grant table lock (compared to using persistent
grants).

> Or perhaps m2p_override
> can do multiple PAGE_FOREIGN? (So if it detects a collision it will
> do something smart.. like allocate a new page or update the
> kmap_op with extra information).

What we could do is add extra information to m2p_override, containing
the grant_ref_t and domid, so that when a FOREIGN_FRAME is detected in
grant_copy (or netback) the grant_ref_t and domid of the passed mfn are
used instead of the mfn (provided that grant_copy can perform a copy
between two grant references of different domains).

> And yes, grant_map in netback is much much slower than grant_copy
> (I tested 2.6.32 vs 3.7 using a Xen 4.1.3 with the grant_copy fixes
> that Jan came up with).

Yes, I see there's no way we are going to use grant_map instead of
grant_copy. I guess this will no longer be true once netback/front
starts using the persistent grants extension.
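A rough illustration of the bookkeeping Roger proposes here; none of these names exist in the kernel and this is not the real m2p_override interface, just a sketch of keeping the grant reference and owning domid alongside the foreign mfn.

#include <linux/mm.h>
#include <xen/interface/grant_table.h>

/*
 * Hypothetical record for a grant-mapped (foreign) page: besides the
 * original mfn that m2p_override already tracks, remember which grant
 * reference and which domain it came from, so a consumer that finds a
 * FOREIGN_FRAME could redo the copy in terms of the grant instead of
 * the raw mfn.
 */
struct foreign_frame_info {
        unsigned long mfn;      /* foreign machine frame backing the page */
        grant_ref_t   gref;     /* grant reference it was mapped with */
        domid_t       domid;    /* domain that owns the frame */
};

static void example_tag_foreign_page(struct page *page,
                                     struct foreign_frame_info *info)
{
        /* Mirror how the current code stashes the mfn in page->private. */
        SetPagePrivate(page);
        set_page_private(page, (unsigned long)info);
}

static struct foreign_frame_info *example_foreign_info(struct page *page)
{
        return PagePrivate(page) ?
                (struct foreign_frame_info *)page_private(page) : NULL;
}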
Konrad Rzeszutek Wilk
2013-Jan-11 21:09 UTC
Re: Create a iSCSI DomU with disks in another DomU running on the same Dom0
On Fri, Jan 11, 2013 at 08:29:12PM +0100, Roger Pau Monné wrote:
> On 11/01/13 19:51, Konrad Rzeszutek Wilk wrote:
> > On Fri, Jan 11, 2013 at 04:57:52PM +0100, Roger Pau Monné wrote:
> >> Hello Konrad,
> >>
> >> I've found the problem: blkback is adding granted pages to the bios that
> >> are then passed to the underlying block device. When using an iSCSI
> >> target running on another DomU on the same h/w, these bios end up in
> >> netback, and when netback performs the gnttab copy operation it
> >> complains because the passed mfn belongs to a different domain.
> >
> > OK, so my original theory was sound. The m2p override "sticks".
> >
> >> I've checked this by applying the appended patch to blkback, which
> >> allocates a buffer to pass to the bio instead of using the granted
> >> page. Of course this should not be applied, since it implies additional
> >> memcpys.
> >>
> >> I think the right way to solve this would be to change netback to
> >> use gnttab_map and memcpy instead of gnttab_copy, but I guess this
> >> will imply a performance degradation (I haven't benchmarked it, but I
> >> assume gnttab_copy is used in netback because it is faster than
> >> gnttab_map + memcpy + gnttab_unmap).
> >
> > Or blkback is altered to use grant_copy.
>
> This would not work with the persistent-grants extension, and it will
> probably also show degraded performance when scaling to a large number
> of guests, due to the grant table lock (compared to using persistent
> grants).
>
> > Or perhaps m2p_override
> > can do multiple PAGE_FOREIGN? (So if it detects a collision it will
> > do something smart.. like allocate a new page or update the
> > kmap_op with extra information).
>
> What we could do is add extra information to m2p_override, containing
> the grant_ref_t and domid, so that when a FOREIGN_FRAME is detected in
> grant_copy (or netback) the grant_ref_t and domid of the passed mfn are
> used instead of the mfn (provided that grant_copy can perform a copy
> between two grant references of different domains).
>
> > And yes, grant_map in netback is much much slower than grant_copy
> > (I tested 2.6.32 vs 3.7 using a Xen 4.1.3 with the grant_copy fixes
> > that Jan came up with).
>
> Yes, I see there's no way we are going to use grant_map instead of
> grant_copy. I guess this will no longer be true once netback/front
> starts using the persistent grants extension.

Hm? Annie posted patches for the persistent grants on netback/netfront
and they did not show much improvement (as you are already doing
grant_copy).

Or are you saying change netback to use grant_map and utilize the
skb->destructor to keep track of it? And then do persistent grant
extensions on it?
Roger Pau Monné
2013-Jan-12 12:11 UTC
Re: Create a iSCSI DomU with disks in another DomU running on the same Dom0
On 11/01/13 22:09, Konrad Rzeszutek Wilk wrote:
> On Fri, Jan 11, 2013 at 08:29:12PM +0100, Roger Pau Monné wrote:
>> On 11/01/13 19:51, Konrad Rzeszutek Wilk wrote:
>>> On Fri, Jan 11, 2013 at 04:57:52PM +0100, Roger Pau Monné wrote:
>>>> Hello Konrad,
>>>>
>>>> I've found the problem: blkback is adding granted pages to the bios that
>>>> are then passed to the underlying block device. When using an iSCSI
>>>> target running on another DomU on the same h/w, these bios end up in
>>>> netback, and when netback performs the gnttab copy operation it
>>>> complains because the passed mfn belongs to a different domain.
>>>
>>> OK, so my original theory was sound. The m2p override "sticks".
>>>
>>>> I've checked this by applying the appended patch to blkback, which
>>>> allocates a buffer to pass to the bio instead of using the granted
>>>> page. Of course this should not be applied, since it implies additional
>>>> memcpys.
>>>>
>>>> I think the right way to solve this would be to change netback to
>>>> use gnttab_map and memcpy instead of gnttab_copy, but I guess this
>>>> will imply a performance degradation (I haven't benchmarked it, but I
>>>> assume gnttab_copy is used in netback because it is faster than
>>>> gnttab_map + memcpy + gnttab_unmap).
>>>
>>> Or blkback is altered to use grant_copy.
>>
>> This would not work with the persistent-grants extension, and it will
>> probably also show degraded performance when scaling to a large number
>> of guests, due to the grant table lock (compared to using persistent
>> grants).
>>
>>> Or perhaps m2p_override
>>> can do multiple PAGE_FOREIGN? (So if it detects a collision it will
>>> do something smart.. like allocate a new page or update the
>>> kmap_op with extra information).
>>
>> What we could do is add extra information to m2p_override, containing
>> the grant_ref_t and domid, so that when a FOREIGN_FRAME is detected in
>> grant_copy (or netback) the grant_ref_t and domid of the passed mfn are
>> used instead of the mfn (provided that grant_copy can perform a copy
>> between two grant references of different domains).

Since the issue I'm having is not common, I'm not sure this solution is
worth it: it would imply storing a pointer to a struct in the page
private data that holds the mfn, grant reference and domid (right now we
only store the mfn in the page private data).

>>> And yes, grant_map in netback is much much slower than grant_copy
>>> (I tested 2.6.32 vs 3.7 using a Xen 4.1.3 with the grant_copy fixes
>>> that Jan came up with).
>>
>> Yes, I see there's no way we are going to use grant_map instead of
>> grant_copy. I guess this will no longer be true once netback/front
>> starts using the persistent grants extension.
>
> Hm? Annie posted patches for the persistent grants on netback/netfront
> and they did not show much improvement (as you are already doing
> grant_copy).
>
> Or are you saying change netback to use grant_map and utilize the
> skb->destructor to keep track of it? And then do persistent grant
> extensions on it?

I'm not really familiar with the net code, but I've had a quick look at
the grant_copy operation in Xen and it is indeed using the grant lock to
protect some parts of the code.

Using persistent grants should provide better performance in the long
run because once the grant is mapped we don't have to issue any more
grant operations, thus avoiding grant lock contention (maybe I'm missing
something here). The skb destructor should probably be used in netfront,
to return the persistently mapped grant to the list of free grants, but
I'm not sure if we will need to use it in netback.
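A sketch of the netfront-side recycling Roger mentions here, under the assumption that a persistently granted RX buffer is stashed behind the skb and handed back by the destructor; all names are hypothetical and this is not netfront code.

#include <linux/skbuff.h>
#include <linux/list.h>
#include <linux/spinlock.h>
#include <xen/interface/grant_table.h>

/* A buffer whose grant stays valid for the lifetime of the interface. */
struct persistent_gnt_buf {
        struct list_head node;
        grant_ref_t      gref;   /* persistently granted to the backend */
        struct page      *page;  /* page shared through that grant */
};

static LIST_HEAD(example_free_bufs);
static DEFINE_SPINLOCK(example_free_lock);

/*
 * Hypothetical skb->destructor: when the stack is done with the skb,
 * put the buffer back on the free list instead of ending the grant.
 * Assumes the producer stored the buffer pointer in skb->cb when it
 * attached the page and set skb->destructor to this function.
 */
static void example_recycle_persistent_buf(struct sk_buff *skb)
{
        struct persistent_gnt_buf *buf = *(struct persistent_gnt_buf **)skb->cb;
        unsigned long flags;

        spin_lock_irqsave(&example_free_lock, flags);
        list_add_tail(&buf->node, &example_free_bufs);
        spin_unlock_irqrestore(&example_free_lock, flags);
}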
Konrad Rzeszutek Wilk
2013-Jan-14 15:24 UTC
Re: Create a iSCSI DomU with disks in another DomU running on the same Dom0
On Sat, Jan 12, 2013 at 01:11:32PM +0100, Roger Pau Monné wrote:
> On 11/01/13 22:09, Konrad Rzeszutek Wilk wrote:
> > On Fri, Jan 11, 2013 at 08:29:12PM +0100, Roger Pau Monné wrote:
> >> On 11/01/13 19:51, Konrad Rzeszutek Wilk wrote:
> >>> On Fri, Jan 11, 2013 at 04:57:52PM +0100, Roger Pau Monné wrote:
> >>>> Hello Konrad,
> >>>>
> >>>> I've found the problem: blkback is adding granted pages to the bios that
> >>>> are then passed to the underlying block device. When using an iSCSI
> >>>> target running on another DomU on the same h/w, these bios end up in
> >>>> netback, and when netback performs the gnttab copy operation it
> >>>> complains because the passed mfn belongs to a different domain.
> >>>
> >>> OK, so my original theory was sound. The m2p override "sticks".
> >>>
> >>>> I've checked this by applying the appended patch to blkback, which
> >>>> allocates a buffer to pass to the bio instead of using the granted
> >>>> page. Of course this should not be applied, since it implies additional
> >>>> memcpys.
> >>>>
> >>>> I think the right way to solve this would be to change netback to
> >>>> use gnttab_map and memcpy instead of gnttab_copy, but I guess this
> >>>> will imply a performance degradation (I haven't benchmarked it, but I
> >>>> assume gnttab_copy is used in netback because it is faster than
> >>>> gnttab_map + memcpy + gnttab_unmap).
> >>>
> >>> Or blkback is altered to use grant_copy.
> >>
> >> This would not work with the persistent-grants extension, and it will
> >> probably also show degraded performance when scaling to a large number
> >> of guests, due to the grant table lock (compared to using persistent
> >> grants).
> >>
> >>> Or perhaps m2p_override
> >>> can do multiple PAGE_FOREIGN? (So if it detects a collision it will
> >>> do something smart.. like allocate a new page or update the
> >>> kmap_op with extra information).
> >>
> >> What we could do is add extra information to m2p_override, containing
> >> the grant_ref_t and domid, so that when a FOREIGN_FRAME is detected in
> >> grant_copy (or netback) the grant_ref_t and domid of the passed mfn are
> >> used instead of the mfn (provided that grant_copy can perform a copy
> >> between two grant references of different domains).
>
> Since the issue I'm having is not common, I'm not sure this solution is
> worth it: it would imply storing a pointer to a struct in the page
> private data that holds the mfn, grant reference and domid (right now we
> only store the mfn in the page private data).
>
> >>> And yes, grant_map in netback is much much slower than grant_copy
> >>> (I tested 2.6.32 vs 3.7 using a Xen 4.1.3 with the grant_copy fixes
> >>> that Jan came up with).
> >>
> >> Yes, I see there's no way we are going to use grant_map instead of
> >> grant_copy. I guess this will no longer be true once netback/front
> >> starts using the persistent grants extension.
> >
> > Hm? Annie posted patches for the persistent grants on netback/netfront
> > and they did not show much improvement (as you are already doing
> > grant_copy).
> >
> > Or are you saying change netback to use grant_map and utilize the
> > skb->destructor to keep track of it? And then do persistent grant
> > extensions on it?
>
> I'm not really familiar with the net code, but I've had a quick look at
> the grant_copy operation in Xen and it is indeed using the grant lock to
> protect some parts of the code.

Sure, but at the cost of doing memory copy.
And if the guests are on separate sockets there are no cache benefits.

> Using persistent grants should provide better performance in the long
> run because once the grant is mapped we don't have to issue any more
> grant operations, thus avoiding grant lock contention (maybe I'm missing
> something here). The skb destructor should probably be used in netfront,

They did not improve it. As a matter of fact they made it worse.
(So this is taking the idea that Andrew had that he shared with Oliver
and you about doing persistent grants, and using the same type of
hypercalls and copy - but do it in the network subsystem).

> to return the persistently mapped grant to the list of free grants, but
> I'm not sure if we will need to use it in netback.

Why not? We don't want to blow away any of the cache data if we can
avoid it.

The problem is how to deal with the RX path from the NICs. Each NIC on
the RX path does something like this:

 1). process its descriptors
 2). unmap the page
 3). allocate a new set of skbs (and pages), update the descriptors
     with the new bus address
 4). pass off the unmapped pages (with skbs) to the network stack.
 5). forget about the skbs.

There is no persistency - so the NIC ends up getting a "fresh" set of
pages all the time that percolate up to netback.

The TX (so netback -> NIC) could be solved by having a pool of
pages/skbs that are mapped and are owned by netback (via the
skb->destructor) so that they are retained.
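To make the TX pool idea concrete, here is a minimal sketch of a page pool a backend could own, with pages handed out for skbs and returned (for example from an skb destructor) when the stack is done with them; the names are made up and this is not netback code.

#include <linux/list.h>
#include <linux/mm.h>
#include <linux/spinlock.h>

struct example_page_pool {
        struct list_head free;  /* pages currently not in flight */
        spinlock_t       lock;
};

/* Borrow a page from the pool; NULL means the pool is exhausted. */
static struct page *example_pool_get(struct example_page_pool *pool)
{
        struct page *page = NULL;
        unsigned long flags;

        spin_lock_irqsave(&pool->lock, flags);
        if (!list_empty(&pool->free)) {
                page = list_first_entry(&pool->free, struct page, lru);
                list_del(&page->lru);
        }
        spin_unlock_irqrestore(&pool->lock, flags);
        return page;
}

/* Return a page so its mapping is retained rather than torn down. */
static void example_pool_put(struct example_page_pool *pool, struct page *page)
{
        unsigned long flags;

        spin_lock_irqsave(&pool->lock, flags);
        list_add_tail(&page->lru, &pool->free);
        spin_unlock_irqrestore(&pool->lock, flags);
}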