Roger Pau Monné
2012-Dec-21 08:29 UTC
Create an iSCSI DomU with disks in another DomU running on the same Dom0
Hello, I'm trying to use a strange setup that consists of having a DomU serving iSCSI targets to the Dom0, which will use these targets as disks for other DomUs. I've tried to set up this iSCSI target DomU using both Debian Squeeze/Wheezy (with kernels 2.6.32 and 3.2) and iSCSI Enterprise Target (IET), and when launching the DomU I get these messages from Xen:

(XEN) mm.c:1925:d0 Error pfn 157e68: rd=ffff83019e60c000, od=ffff830141405000, caf=8000000000000003, taf=7400000000000001
(XEN) Xen WARN at mm.c:1926
(XEN) ----[ Xen-4.3-unstable x86_64 debug=y Not tainted ]----
(XEN) CPU: 0
(XEN) RIP: e008:[<ffff82c48016ea17>] get_page+0xd5/0x101
(XEN) RFLAGS: 0000000000010286 CONTEXT: hypervisor
(XEN) rax: 0000000000000000 rbx: ffff830141405000 rcx: 0000000000000000
(XEN) rdx: ffff82c480300920 rsi: 000000000000000a rdi: ffff82c4802766e8
(XEN) rbp: ffff82c4802bfbf8 rsp: ffff82c4802bfba8 r8: 0000000000000004
(XEN) r9: 0000000000000004 r10: 0000000000000004 r11: 0000000000000001
(XEN) r12: 0000000000157e68 r13: ffff83019e60c000 r14: 7400000000000001
(XEN) r15: 8000000000000003 cr0: 000000008005003b cr4: 00000000000026f0
(XEN) cr3: 000000011c180000 cr2: 00007f668d1eb000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
(XEN) Xen stack trace from rsp=ffff82c4802bfba8:
(XEN) ffff830141405000 8000000000000003 7400000000000001 0000000000145028
(XEN) ffff82f6028a0510 ffff83019e60c000 ffff82f602afcd00 ffff82c4802bfd28
(XEN) ffff82c4802bfd18 0000000000157e68 ffff82c4802bfc58 ffff82c480109ba3
(XEN) ffffffffffffffff 0000000000000000 ffff83011c977fb8 0000000061dfc3f0
(XEN) 0000000000000001 ffffffffffff8000 0000000000000002 ffff83011d555000
(XEN) ffff83019e60c000 0000000000000000 ffff82c4802bfd98 ffff82c48010c607
(XEN) ffff82c4802bfd34 ffff82c4802bfd30 ffff82c400000001 000000000011cf90
(XEN) 0000000000000000 ffff82c4802b8000 ffff82c4802b8000 ffff82c4802b8000
(XEN) ffff82c4802b8000 ffff82c4802bfd5c 000000029e60c000 ffff82c480300920
(XEN) ffff82c4802b8000 ffff82c4802bfd38 00000005802bfd38 ffff82c4802b8000
(XEN) ffff82c400000000 0000000000000001 ffffc90000028b10 ffffc90000028b10
(XEN) ffff8300dfb03000 0000000000000000 0000000000000000 0000000000145028
(XEN) 000000000011cf7c 0000000000001000 0000000000157e68 0000000000007ff0
(XEN) 000000000000027e 000000000042000d 0000000000020b50 ffff8300dfdf0000
(XEN) ffff82c4802bfd78 ffffc90000028ac0 ffffc90000028ac0 ffff880185f6fd58
(XEN) ffff880185f6fd78 0000000000000005 ffff82c4802bfef8 ffff82c48010eb65
(XEN) ffff82c4802bfdc8 ffff82c480300960 ffff82c4802bfe18 ffff82c480181831
(XEN) 000000000006df66 000032cfdc175ce6 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000005 ffff82c4802bfe28 ffff8300dfb03000
(XEN) ffff8300dfdf0000 0000150e11a417f8 0000000000000002 ffff82c480300948
(XEN) Xen call trace:
(XEN) [<ffff82c48016ea17>] get_page+0xd5/0x101
(XEN) [<ffff82c480109ba3>] __get_paged_frame+0xbf/0x162
(XEN) [<ffff82c48010c607>] gnttab_copy+0x4c6/0x91a
(XEN) [<ffff82c48010eb65>] do_grant_table_op+0x12ad/0x1b23
(XEN) [<ffff82c48022280b>] syscall_enter+0xeb/0x145
(XEN)
(XEN) grant_table.c:2076:d0 source frame ffffffffffffffff invalid.
(XEN) mm.c:1925:d0 Error pfn 157e68: rd=ffff83019e60c000, od=ffff830141405000, caf=8000000000000003, taf=7400000000000001
(XEN) Xen WARN at mm.c:1926
(XEN) ----[ Xen-4.3-unstable x86_64 debug=y Not tainted ]----
(XEN) CPU: 0
(XEN) RIP: e008:[<ffff82c48016ea17>] get_page+0xd5/0x101
(XEN) RFLAGS: 0000000000010286 CONTEXT: hypervisor
(XEN) rax: 0000000000000000 rbx: ffff830141405000 rcx: 0000000000000000
(XEN) rdx: ffff82c480300920 rsi: 000000000000000a rdi: ffff82c4802766e8
(XEN) rbp: ffff82c4802bfbf8 rsp: ffff82c4802bfba8 r8: 0000000000000004
(XEN) r9: 0000000000000004 r10: 0000000000000004 r11: 0000000000000001
(XEN) r12: 0000000000157e68 r13: ffff83019e60c000 r14: 7400000000000001
(XEN) r15: 8000000000000003 cr0: 000000008005003b cr4: 00000000000026f0
(XEN) cr3: 000000011c180000 cr2: 00007f668d1eb000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
(XEN) Xen stack trace from rsp=ffff82c4802bfba8:
(XEN) ffff830141405000 8000000000000003 7400000000000001 000000000014581d
(XEN) ffff82f6028b03b0 ffff83019e60c000 ffff82f602afcd00 ffff82c4802bfd28
(XEN) ffff82c4802bfd18 0000000000157e68 ffff82c4802bfc58 ffff82c480109ba3
(XEN) ffffffffffffffff 0000000000000000 ffff83011c977fb8 0000000061dfc308
(XEN) 0000000000000000 ffffffffffff8000 0000000000000001 ffff83011d555000
(XEN) ffff83019e60c000 0000000000000000 ffff82c4802bfd98 ffff82c48010c607
(XEN) ffff82c4802bfd34 ffff82c4802bfd30 ffff82c400000001 000000000011cf90
(XEN) 0000000000000000 ffff82c4802b8000 ffff82c4802b8000 ffff82c4802b8000
(XEN) ffff82c4802b8000 ffff82c4802bfd5c 000000029e60c000 ffff82c480300920
(XEN) ffff82c4802b8000 ffff82c4802bfd38 00000002802bfd38 ffff82c4802b8000
(XEN) ffffffff00000000 0000000000000001 ffffc90000028b60 ffffc90000028b60
(XEN) ffff8300dfb03000 0000000000000000 0000000000000000 000000000014581d
(XEN) 00000000000deb3e 0000000000001000 0000000000157e68 000000000b507ff0
(XEN) 0000000000000261 000000000042000d 00000000000204b0 ffffc90000028b38
(XEN) 0000000000000002 ffffc90000028b38 ffffc90000028b38 ffff880185f6fd58
(XEN) ffff880185f6fd78 0000000000000005 ffff82c4802bfef8 ffff82c48010eb65
(XEN) ffff82c4802bfdc8 ffff82c480300960 ffff82c4802bfe18 ffff82c480181831
(XEN) 000000000006df66 000032cfdc175ce6 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000005 ffff82c4802bfe28 0000000000000086
(XEN) ffff82c4802bfe28 ffff82c480125eae ffff83019e60c000 0000000000000286
(XEN) Xen call trace:
(XEN) [<ffff82c48016ea17>] get_page+0xd5/0x101
(XEN) [<ffff82c480109ba3>] __get_paged_frame+0xbf/0x162
(XEN) [<ffff82c48010c607>] gnttab_copy+0x4c6/0x91a
(XEN) [<ffff82c48010eb65>] do_grant_table_op+0x12ad/0x1b23
(XEN) [<ffff82c48022280b>] syscall_enter+0xeb/0x145
(XEN)
(XEN) grant_table.c:2076:d0 source frame ffffffffffffffff invalid.

(Note that I've added a WARN() to mm.c:1925 to see where the get_page call was coming from.)

Connecting the iSCSI disks to another Dom0 works fine, so this problem only happens when trying to connect the disks to the Dom0 where the DomU is running.

I've replaced the Linux DomU serving iSCSI targets with a NetBSD DomU and the problem disappears: I'm able to attach the targets shared by the DomU to the Dom0 without issues.

The problem seems to come from netfront/netback. Does anyone have a clue about what might cause this bad interaction between IET and netfront/netback?

Thanks, Roger.
Konrad Rzeszutek Wilk
2012-Dec-21 14:03 UTC
Re: Create an iSCSI DomU with disks in another DomU running on the same Dom0
On Fri, Dec 21, 2012 at 09:29:39AM +0100, Roger Pau Monné wrote:

> Hello,
>
> I'm trying to use a strange setup that consists of having a DomU
> serving iSCSI targets to the Dom0, which will use these targets as disks
> for other DomUs. I've tried to set up this iSCSI target DomU using both
> Debian Squeeze/Wheezy (with kernels 2.6.32 and 3.2) and iSCSI
> Enterprise Target (IET), and when launching the DomU I get these messages
> from Xen:
>
> [Xen WARNs and stack traces quoted in full in the first message snipped]
>
> (Note that I've added a WARN() to mm.c:1925 to see where the
> get_page call was coming from.)
>
> Connecting the iSCSI disks to another Dom0 works fine, so this
> problem only happens when trying to connect the disks to the
> Dom0 where the DomU is running.

Is this happening when the 'disks' are exported to the domUs?
Are they exported via QEMU or xen-blkback?

> I've replaced the Linux DomU serving iSCSI targets with a
> NetBSD DomU and the problem disappears: I'm able to
> attach the targets shared by the DomU to the Dom0 without
> issues.
>
> The problem seems to come from netfront/netback. Does anyone
> have a clue about what might cause this bad interaction
> between IET and netfront/netback?

Or it might be that we are re-using the PFN for blkback/blkfront
and using the m2p overrides and overwriting the netfront/netback
m2p overrides?

Is this with an HVM domU or PV domU?

> Thanks, Roger.
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
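To make the collision hinted at here concrete: roughly speaking, when a backend maps a foreign grant the kernel records an m2p override so that reverse (MFN-to-PFN) lookups resolve to the local page, and a second user installing an override for the same MFN replaces the earlier entry. The following is a stand-alone toy model of that bookkeeping, not the real arch/x86/xen/p2m.c code; the names, the table layout and the numeric values are purely illustrative.

#include <stdio.h>

/* Toy model of m2p override bookkeeping: each mapped foreign MFN gets
 * an entry recording which local PFN it currently lives at.  A second
 * user installing an entry for the same MFN silently replaces the
 * first one, so the first user's reverse lookups go stale. */
#define TABLE_SIZE 16

struct override {
        unsigned long mfn;      /* machine frame the grant points at */
        unsigned long pfn;      /* local frame it is currently mapped to */
        const char *owner;      /* which backend installed the entry */
};

static struct override table[TABLE_SIZE];

static void add_override(unsigned long mfn, unsigned long pfn,
                         const char *owner)
{
        struct override *o = &table[mfn % TABLE_SIZE];

        if (o->owner && o->mfn == mfn && o->owner != owner)
                printf("WARN: %s overwrites override for mfn %#lx held by %s\n",
                       owner, mfn, o->owner);
        o->mfn = mfn;
        o->pfn = pfn;
        o->owner = owner;
}

int main(void)
{
        /* e.g. one backend records where a granted frame is mapped... */
        add_override(0x157e68, 0x11cf90, "netback");
        /* ...and another later installs its own entry for the same MFN,
         * clobbering the first. */
        add_override(0x157e68, 0x145028, "blkback");
        return 0;
}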
Roger Pau Monné
2012-Dec-21 14:47 UTC
Re: Create an iSCSI DomU with disks in another DomU running on the same Dom0
On 21/12/12 15:03, Konrad Rzeszutek Wilk wrote:

> On Fri, Dec 21, 2012 at 09:29:39AM +0100, Roger Pau Monné wrote:
>> [problem description and Xen stack traces quoted in full above snipped]
>>
>> Connecting the iSCSI disks to another Dom0 works fine, so this
>> problem only happens when trying to connect the disks to the
>> Dom0 where the DomU is running.
>
> Is this happening when the 'disks' are exported to the domUs?
> Are they exported via QEMU or xen-blkback?

The iSCSI disks are connected to the DomUs using blkback, and this is happening when the DomU tries to access its disks.

>>
>> I've replaced the Linux DomU serving iSCSI targets with a
>> NetBSD DomU and the problem disappears: I'm able to
>> attach the targets shared by the DomU to the Dom0 without
>> issues.
>>
>> The problem seems to come from netfront/netback. Does anyone
>> have a clue about what might cause this bad interaction
>> between IET and netfront/netback?
>
> Or it might be that we are re-using the PFN for blkback/blkfront
> and using the m2p overrides and overwriting the netfront/netback
> m2p overrides?

What's strange is that this doesn't happen when the domain that has the targets is a NetBSD PV. There are also problems when blkback is not used (see below), so I guess the problem is between netfront/netback and IET.

> Is this with an HVM domU or PV domU?

Both domains (the domain holding the iSCSI targets, and the created guests) are PV.

Also, I forgot to mention this in the previous email: if I just connect the iSCSI disks to the Dom0, I don't see any errors from Xen, but the Dom0 kernel starts complaining:

[70272.569607] sd 14:0:0:0: [sdc]
[70272.569611] Sense Key : Medium Error [current]
[70272.569619] Info fld=0x0
[70272.569623] sd 14:0:0:0: [sdc]
[70272.569627] Add. Sense: Unrecovered read error
[70272.569633] sd 14:0:0:0: [sdc] CDB:
[70272.569637] Read(10): 28 00 00 00 00 00 00 00 08 00
[70272.569662] end_request: critical target error, dev sdc, sector 0
[70277.571208] sd 14:0:0:0: [sdc] Unhandled sense code
[70277.571220] sd 14:0:0:0: [sdc]
[70277.571224] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[70277.571229] sd 14:0:0:0: [sdc]
[70277.571233] Sense Key : Medium Error [current]
[70277.571241] Info fld=0x0
[70277.571245] sd 14:0:0:0: [sdc]
[70277.571249] Add. Sense: Unrecovered read error
[70277.571255] sd 14:0:0:0: [sdc] CDB:
[70277.571259] Read(10): 28 00 00 00 00 00 00 00 08 00
[70277.571284] end_request: critical target error, dev sdc, sector 0
[70282.572768] sd 14:0:0:0: [sdc] Unhandled sense code
[70282.572781] sd 14:0:0:0: [sdc]
[70282.572785] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[70282.572790] sd 14:0:0:0: [sdc]
[70282.572794] Sense Key : Medium Error [current]
[70282.572802] Info fld=0x0
[70282.572806] sd 14:0:0:0: [sdc]
[70282.572810] Add. Sense: Unrecovered read error
[70282.572816] sd 14:0:0:0: [sdc] CDB:
[70282.572820] Read(10): 28 00 00 00 00 00 00 00 08 00
[70282.572846] end_request: critical target error, dev sdc, sector 0
[70287.574397] sd 14:0:0:0: [sdc] Unhandled sense code
[70287.574409] sd 14:0:0:0: [sdc]
[70287.574413] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[70287.574418] sd 14:0:0:0: [sdc]
[70287.574422] Sense Key : Medium Error [current]
[70287.574430] Info fld=0x0
[70287.574434] sd 14:0:0:0: [sdc]
[70287.574438] Add. Sense: Unrecovered read error
[70287.574445] sd 14:0:0:0: [sdc] CDB:
[70287.574448] Read(10): 28 00 00 00 00 00 00 00 08 00
[70287.574474] end_request: critical target error, dev sdc, sector 0

When I try to attach the targets to another Dom0 everything works fine; the problem only happens when the iSCSI target is a DomU and you attach the disks from the Dom0 on the same machine.

Thanks, Roger.
Konrad Rzeszutek Wilk
2012-Dec-21 17:35 UTC
Re: Create an iSCSI DomU with disks in another DomU running on the same Dom0
On Fri, Dec 21, 2012 at 03:47:20PM +0100, Roger Pau Monné wrote:

> On 21/12/12 15:03, Konrad Rzeszutek Wilk wrote:
> > Is this happening when the 'disks' are exported to the domUs?
> > Are they exported via QEMU or xen-blkback?
>
> The iSCSI disks are connected to the DomUs using blkback, and this is
> happening when the DomU tries to access its disks.
>
> Both domains (the domain holding the iSCSI targets, and the created
> guests) are PV.
>
> [rest of the quoting and the Dom0 "Medium Error" kernel log snipped]
>
> When I try to attach the targets to another Dom0 everything works fine;
> the problem only happens when the iSCSI target is a DomU and you attach
> the disks from the Dom0 on the same machine.

I think we are just swizzling the PFNs with a different MFN when you do the domU -> domX, using two ring protocols. Weird thought as the m2p code has checks WARN_ON(PagePrivate(..)) to catch this sort of thing.

What happens if the dom0/domU are all 3.8 with the persistent grant patches?

> Thanks, Roger.
Roger Pau Monné
2013-Jan-02 13:05 UTC
Re: Create an iSCSI DomU with disks in another DomU running on the same Dom0
On 21/12/12 18:35, Konrad Rzeszutek Wilk wrote:

> On Fri, Dec 21, 2012 at 03:47:20PM +0100, Roger Pau Monné wrote:
>> [earlier quoting, the Xen stack traces and the Dom0 kernel log snipped]
>>
>> When I try to attach the targets to another Dom0 everything works fine;
>> the problem only happens when the iSCSI target is a DomU and you attach
>> the disks from the Dom0 on the same machine.
>
> I think we are just swizzling the PFNs with a different MFN when you
> do the domU -> domX, using two ring protocols. Weird thought as the
> m2p code has checks WARN_ON(PagePrivate(..)) to catch this sort of
> thing.
>
> What happens if the dom0/domU are all 3.8 with the persistent grant
> patches?

Sorry for the delay, the same error happens when Dom0/DomU is using a persistent grants enabled kernel, although I had to backport the persistent grants patch to 3.2, because I was unable to get iSCSI Enterprise Target dkms working with 3.8. I'm also seeing these messages in the DomU that's running the iSCSI target:

[ 511.338845] net_ratelimit: 36 callbacks suppressed
[ 511.338851] net eth0: rx->offset: 0, size: 4294967295
[ 512.288282] net eth0: rx->offset: 0, size: 4294967295
[ 512.525639] net eth0: rx->offset: 0, size: 4294967295
[ 512.800729] net eth0: rx->offset: 0, size: 4294967295
[ 512.800732] net eth0: rx->offset: 0, size: 4294967295
[ 513.049447] net eth0: rx->offset: 0, size: 4294967295
[ 513.050125] net eth0: rx->offset: 0, size: 4294967295
[ 513.313493] net eth0: rx->offset: 0, size: 4294967295
[ 513.313497] net eth0: rx->offset: 0, size: 4294967295
[ 513.557233] net eth0: rx->offset: 0, size: 4294967295
[ 517.422772] net_ratelimit: 61 callbacks suppressed
[ 517.422777] net eth0: rx->offset: 0, size: 4294967295
[ 517.422780] net eth0: rx->offset: 0, size: 4294967295
[ 517.667053] net eth0: rx->offset: 0, size: 4294967295
[ 517.667640] net eth0: rx->offset: 0, size: 4294967295
[ 517.879690] net eth0: rx->offset: 0, size: 4294967295
[ 517.879693] net eth0: rx->offset: 0, size: 4294967295
[ 518.125314] net eth0: rx->offset: 0, size: 4294967295
[ 518.125907] net eth0: rx->offset: 0, size: 4294967295
[ 518.477026] net eth0: rx->offset: 0, size: 4294967295
[ 518.477029] net eth0: rx->offset: 0, size: 4294967295
[ 553.400129] net_ratelimit: 84 callbacks suppressed
[ 553.400134] net eth0: rx->offset: 0, size: 4294967295
[ 553.400615] net eth0: rx->offset: 0, size: 4294967295
[ 553.400618] net eth0: rx->offset: 0, size: 4294967295
[ 553.400620] net eth0: rx->offset: 0, size: 4294967295
[ 553.603476] net eth0: rx->offset: 0, size: 4294967295
[ 553.604103] net eth0: rx->offset: 0, size: 4294967295
[ 553.807444] net eth0: rx->offset: 0, size: 4294967295
[ 553.807447] net eth0: rx->offset: 0, size: 4294967295
[ 554.049223] net eth0: rx->offset: 0, size: 4294967295
[ 554.049902] net eth0: rx->offset: 0, size: 4294967295
[ 581.912017] net_ratelimit: 8 callbacks suppressed
[ 581.912022] net eth0: rx->offset: 0, size: 4294967295
[ 581.912496] net eth0: rx->offset: 0, size: 4294967295
[ 582.118996] net eth0: rx->offset: 0, size: 4294967295
[ 582.357495] net eth0: rx->offset: 0, size: 4294967295
[ 615.983921] net eth0: rx->offset: 0, size: 4294967295
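One detail worth decoding in that log: 4294967295 is (u32)-1. In the netfront RX path the status field of an RX response doubles as the received size, so a backend whose grant copy failed with a negative status (e.g. GNTST_general_error, which is -1) gets logged this way. A minimal stand-alone illustration, not the netfront code itself:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
        /* The RX response carries a signed 16-bit status; grant-copy
         * failures such as GNTST_general_error are negative.  Printed
         * as an unsigned 32-bit "size" that becomes 4294967295. */
        int16_t status = -1;

        printf("net eth0: rx->offset: 0, size: %u\n",
               (uint32_t)(int32_t)status);
        return 0;
}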
Konrad Rzeszutek Wilk
2013-Jan-02 21:36 UTC
Re: Create an iSCSI DomU with disks in another DomU running on the same Dom0
> > I think we are just swizzling the PFNs with a different MFN when you
> > do the domU -> domX, using two ring protocols. Weird thought as the
> > m2p code has checks WARN_ON(PagePrivate(..)) to catch this sort of
> > thing.
> >
> > What happens if the dom0/domU are all 3.8 with the persistent grant
> > patches?
>
> Sorry for the delay, the same error happens when Dom0/DomU is using a
> persistent grants enabled kernel, although I had to backport the
> persistent grants patch to 3.2, because I was unable to get iSCSI
> Enterprise Target dkms working with 3.8. I'm also seeing these messages
> in the DomU that's running the iSCSI target:
>
> [ 511.338845] net_ratelimit: 36 callbacks suppressed
> [ 511.338851] net eth0: rx->offset: 0, size: 4294967295

-1 ?! I saw this somewhere with 9000 MTUs.

> [ 512.288282] net eth0: rx->offset: 0, size: 4294967295
> [ 512.525639] net eth0: rx->offset: 0, size: 4294967295
> [ 512.800729] net eth0: rx->offset: 0, size: 4294967295

But wow. It is just all over.

Could you instrument the M2P code to print out the PFN/MFN values as they are being altered (along with __builtin_func(1) to get an idea of who is doing it)? Perhaps that will shed light on whether my theory (that we are overwriting the MFNs) is truly happening. It does not help that it ends up using multicalls - so it might be that they are being done in batches - so there are multiple MFN updates. Perhaps the multicalls have two or more changes to the same MFN?
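A minimal sketch of the sort of instrumentation being requested, assuming it is dropped into the M2P override update path of the dom0/domU kernel; the helper name is made up, the exact functions to hook differ between kernel versions, and __builtin_func(1) above is taken to mean the GCC caller-address builtin, so __builtin_return_address(0) is used here:

#include <linux/kernel.h>
#include <linux/mm.h>

/* Hypothetical debug helper (not an existing kernel function): call it
 * from the places that install or replace an M2P override, so that two
 * backends touching the same MFN show up next to each other in dmesg
 * together with their callers. */
static inline void m2p_debug_trace(const char *what, unsigned long mfn,
                                   struct page *page)
{
        pr_warn("m2p %s: mfn %#lx <-> pfn %#lx (caller %pS)\n",
                what, mfn, page_to_pfn(page), __builtin_return_address(0));
}

Calling it with e.g. "add" and "remove" from the corresponding paths and grepping dmesg for the pfn reported by the hypervisor (157e68 in the traces above) would show which callers touched that frame and in which order.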
Roger Pau Monné
2013-Jan-09 19:23 UTC
Re: Create an iSCSI DomU with disks in another DomU running on the same Dom0
On 02/01/13 22:36, Konrad Rzeszutek Wilk wrote:

>>> I think we are just swizzling the PFNs with a different MFN when you
>>> do the domU -> domX, using two ring protocols. Weird thought as the
>>> m2p code has checks WARN_ON(PagePrivate(..)) to catch this sort of
>>> thing.
>>>
>>> What happens if the dom0/domU are all 3.8 with the persistent grant
>>> patches?
>>
>> Sorry for the delay, the same error happens when Dom0/DomU is using a
>> persistent grants enabled kernel, although I had to backport the
>> persistent grants patch to 3.2, because I was unable to get iSCSI
>> Enterprise Target dkms working with 3.8. I'm also seeing these messages
>> in the DomU that's running the iSCSI target:
>>
>> [ 511.338845] net_ratelimit: 36 callbacks suppressed
>> [ 511.338851] net eth0: rx->offset: 0, size: 4294967295
>
> -1 ?! I saw this somewhere with 9000 MTUs.
>
>> [ 512.288282] net eth0: rx->offset: 0, size: 4294967295
>> [ 512.525639] net eth0: rx->offset: 0, size: 4294967295
>> [ 512.800729] net eth0: rx->offset: 0, size: 4294967295
>
> But wow. It is just all over.
>
> Could you instrument the M2P code to print out the PFN/MFN
> values as they are being altered (along with __builtin_func(1) to
> get an idea of who is doing it)? Perhaps that will shed light on whether
> my theory (that we are overwriting the MFNs) is truly happening.
> It does not help that it ends up using multicalls - so it might be
> that they are being done in batches - so there are multiple
> MFN updates. Perhaps the multicalls have two or more changes to the
> same MFN?

A little more info: I've found that we are passing FOREIGN_FRAMES in the skb fragments on netback. When we try to perform the grant copy operation using a foreign mfn as source, we hit the error. Here is the stack trace of the addition of the bogus skb:

[ 107.094109] Pid: 64, comm: kworker/u:5 Not tainted 3.7.0-rc3+ #8
[ 107.094114] Call Trace:
[ 107.094126] [<ffffffff813ff099>] xen_netbk_queue_tx_skb+0x16b/0x1aa
[ 107.094135] [<ffffffff81400088>] xenvif_start_xmit+0x7b/0x9e
[ 107.094143] [<ffffffff81460fe7>] dev_hard_start_xmit+0x25e/0x3db
[ 107.094151] [<ffffffff81478917>] sch_direct_xmit+0x6e/0x150
[ 107.094159] [<ffffffff814612ce>] dev_queue_xmit+0x16a/0x360
[ 107.094168] [<ffffffff814c2426>] br_dev_queue_push_xmit+0x5c/0x62
[ 107.094175] [<ffffffff814c2536>] br_deliver+0x35/0x3f
[ 107.094182] [<ffffffff814c12b8>] br_dev_xmit+0xd7/0xef
[ 107.094189] [<ffffffff81460fe7>] dev_hard_start_xmit+0x25e/0x3db
[ 107.094197] [<ffffffff814549b6>] ? __alloc_skb+0x8d/0x187
[ 107.094204] [<ffffffff81461409>] dev_queue_xmit+0x2a5/0x360
[ 107.094212] [<ffffffff81485991>] ip_finish_output2+0x25c/0x2b7
[ 107.094219] [<ffffffff81485a62>] ip_finish_output+0x76/0x7b
[ 107.094226] [<ffffffff81485aa1>] ip_output+0x3a/0x3c
[ 107.094235] [<ffffffff81483367>] dst_output+0xf/0x11
[ 107.094242] [<ffffffff81483558>] ip_local_out+0x5c/0x5e
[ 107.094249] [<ffffffff81485560>] ip_queue_xmit+0x2ce/0x2fc
[ 107.094256] [<ffffffff814969b4>] tcp_transmit_skb+0x746/0x787
[ 107.094264] [<ffffffff81498e73>] tcp_write_xmit+0x837/0x949
[ 107.094273] [<ffffffff810f1fea>] ? virt_to_head_page+0x9/0x2c
[ 107.094281] [<ffffffff810f21aa>] ? ksize+0x1a/0x24
[ 107.094288] [<ffffffff814549ca>] ? __alloc_skb+0xa1/0x187
[ 107.094295] [<ffffffff81499212>] __tcp_push_pending_frames+0x2c/0x59
[ 107.094302] [<ffffffff8148af1a>] tcp_push+0x87/0x89
[ 107.094309] [<ffffffff8148cd26>] tcp_sendpage+0x448/0x480
[ 107.094317] [<ffffffff814aacc8>] inet_sendpage+0xa0/0xb5
[ 107.094327] [<ffffffff81329aac>] iscsi_sw_tcp_pdu_xmit+0xa2/0x236
[ 107.094335] [<ffffffff81328188>] iscsi_tcp_task_xmit+0x34/0x236
[ 107.094345] [<ffffffff8100d8b5>] ? __spin_time_accum+0x17/0x2e
[ 107.094352] [<ffffffff8100daf3>] ? __xen_spin_lock+0xb7/0xcd
[ 107.094360] [<ffffffff8132434a>] iscsi_xmit_task+0x52/0x94
[ 107.094367] [<ffffffff81324ea2>] iscsi_xmitworker+0x1c2/0x2b9
[ 107.094375] [<ffffffff81324ce0>] ? iscsi_prep_scsi_cmd_pdu+0x604/0x604
[ 107.094384] [<ffffffff81052bbe>] process_one_work+0x20b/0x2f9
[ 107.094391] [<ffffffff81052e17>] worker_thread+0x16b/0x272
[ 107.094398] [<ffffffff81052cac>] ? process_one_work+0x2f9/0x2f9
[ 107.094406] [<ffffffff81056c60>] kthread+0xb0/0xb8
[ 107.094414] [<ffffffff81056bb0>] ? kthread_freezable_should_stop+0x5b/0x5b
[ 107.094422] [<ffffffff8151f4bc>] ret_from_fork+0x7c/0xb0
[ 107.094430] [<ffffffff81056bb0>] ? kthread_freezable_should_stop+0x5b/0x5b

I will try to find out who is setting those skb frags. Do you have any idea, Konrad?
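For context on why a foreign MFN in an skb fragment is fatal at this point: netback's RX path describes each fragment to Xen as a grant copy whose source is supposed to be a machine frame owned by the sending domain. The sketch below is a heavily simplified version of how such a copy descriptor gets filled in; the helper name and the offset handling are illustrative, not the actual netback code. If the fragment page is really blkback's mapping of a grant from another guest, the MFN passed as a DOMID_SELF frame is owned by that other domain, Xen's get_page() in gnttab_copy refuses it, and the "source frame ... invalid" message from the first mail is the result.

#include <linux/mm.h>
#include <xen/page.h>
#include <xen/interface/grant_table.h>

/* Heavily simplified sketch (illustrative names, not the real netback
 * code): describe one skb fragment as the source of a grant copy into
 * a buffer granted by the receiving guest. */
static void fill_rx_copy_op(struct gnttab_copy *op, struct page *page,
                            unsigned int offset, unsigned int len,
                            domid_t dst_domid, grant_ref_t dst_gref)
{
        op->flags = GNTCOPY_dest_gref;

        /* Source: claimed to be one of our own machine frames... */
        op->source.domid = DOMID_SELF;
        op->source.u.gmfn = pfn_to_mfn(page_to_pfn(page));
        op->source.offset = offset;

        /* ...but if 'page' is actually a mapped foreign grant (such as
         * one of blkback's granted pages), this MFN is owned by another
         * domain and Xen rejects the copy. */

        /* Destination: the grant the guest put on its RX ring. */
        op->dest.domid = dst_domid;
        op->dest.u.ref = dst_gref;
        op->dest.offset = 0;

        op->len = len;
}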
Konrad Rzeszutek Wilk
2013-Jan-11 15:06 UTC
Re: Create an iSCSI DomU with disks in another DomU running on the same Dom0
On Wed, Jan 09, 2013 at 08:23:45PM +0100, Roger Pau Monné wrote:
> On 02/01/13 22:36, Konrad Rzeszutek Wilk wrote:
> >>> I think we are just swizzling the PFNs with a different MFN when you
> >>> do the domU -> domX, using two ring protocols. Weird thought as the
> >>> m2p code has checks WARN_ON(PagePrivate(..)) to catch this sort of
> >>> thing.
> >>>
> >>> What happens if the dom0/domU are all 3.8 with the persistent grant
> >>> patches?
> >>
> >> Sorry for the delay, the same error happens when Dom0/DomU is using a
> >> persistent grants enabled kernel, although I had to backport the
> >> persistent grants patch to 3.2, because I was unable to get iSCSI
> >> Enterprise Target dkms working with 3.8. I'm also seeing these messages
> >> in the DomU that's running the iSCSI target:
> >>
> >> [ 511.338845] net_ratelimit: 36 callbacks suppressed
> >> [ 511.338851] net eth0: rx->offset: 0, size: 4294967295
> >
> > -1 ?! I saw this somewhere with 9000 MTUs.
> >
> >> [ 512.288282] net eth0: rx->offset: 0, size: 4294967295
> >> [ 512.525639] net eth0: rx->offset: 0, size: 4294967295
> >> [ 512.800729] net eth0: rx->offset: 0, size: 4294967295
> >
> > But wow. It is just all over.
> >
> > Could you instrument the M2P code to print out the PFN/MFN
> > values as they are being altered (along with __builtin_func(1) to
> > get an idea who is doing it). Perhaps that will shed light on whether
> > my theory (that we are overwriting the MFNs) is truly happening.
> > It does not help that it ends up using multicalls - so it might be
> > that they are being done in batches - so there are multiple
> > MFN updates. Perhaps the multicalls have two or more changes to the
> > same MFN?
>
> A little more info, I've found that we are passing FOREIGN_FRAMES in
> the skb fragments on netback. When we try to perform the grant copy
> operation using a foreign mfn as source, we hit the error. Here is
> the stack trace of the addition of the bogus skb:
>
> [ 107.094109] Pid: 64, comm: kworker/u:5 Not tainted 3.7.0-rc3+ #8
> [ 107.094114] Call Trace:
> [ 107.094126]  [<ffffffff813ff099>] xen_netbk_queue_tx_skb+0x16b/0x1aa
> [ 107.094135]  [<ffffffff81400088>] xenvif_start_xmit+0x7b/0x9e
> [ 107.094143]  [<ffffffff81460fe7>] dev_hard_start_xmit+0x25e/0x3db
> [ 107.094151]  [<ffffffff81478917>] sch_direct_xmit+0x6e/0x150
> [ 107.094159]  [<ffffffff814612ce>] dev_queue_xmit+0x16a/0x360
> [ 107.094168]  [<ffffffff814c2426>] br_dev_queue_push_xmit+0x5c/0x62
> [ 107.094175]  [<ffffffff814c2536>] br_deliver+0x35/0x3f
> [ 107.094182]  [<ffffffff814c12b8>] br_dev_xmit+0xd7/0xef
> [ 107.094189]  [<ffffffff81460fe7>] dev_hard_start_xmit+0x25e/0x3db
> [ 107.094197]  [<ffffffff814549b6>] ? __alloc_skb+0x8d/0x187
> [ 107.094204]  [<ffffffff81461409>] dev_queue_xmit+0x2a5/0x360
> [ 107.094212]  [<ffffffff81485991>] ip_finish_output2+0x25c/0x2b7
> [ 107.094219]  [<ffffffff81485a62>] ip_finish_output+0x76/0x7b
> [ 107.094226]  [<ffffffff81485aa1>] ip_output+0x3a/0x3c
> [ 107.094235]  [<ffffffff81483367>] dst_output+0xf/0x11
> [ 107.094242]  [<ffffffff81483558>] ip_local_out+0x5c/0x5e
> [ 107.094249]  [<ffffffff81485560>] ip_queue_xmit+0x2ce/0x2fc
> [ 107.094256]  [<ffffffff814969b4>] tcp_transmit_skb+0x746/0x787
> [ 107.094264]  [<ffffffff81498e73>] tcp_write_xmit+0x837/0x949
> [ 107.094273]  [<ffffffff810f1fea>] ? virt_to_head_page+0x9/0x2c
> [ 107.094281]  [<ffffffff810f21aa>] ? ksize+0x1a/0x24
> [ 107.094288]  [<ffffffff814549ca>] ? __alloc_skb+0xa1/0x187
> [ 107.094295]  [<ffffffff81499212>] __tcp_push_pending_frames+0x2c/0x59
> [ 107.094302]  [<ffffffff8148af1a>] tcp_push+0x87/0x89
> [ 107.094309]  [<ffffffff8148cd26>] tcp_sendpage+0x448/0x480
> [ 107.094317]  [<ffffffff814aacc8>] inet_sendpage+0xa0/0xb5
> [ 107.094327]  [<ffffffff81329aac>] iscsi_sw_tcp_pdu_xmit+0xa2/0x236
> [ 107.094335]  [<ffffffff81328188>] iscsi_tcp_task_xmit+0x34/0x236
> [ 107.094345]  [<ffffffff8100d8b5>] ? __spin_time_accum+0x17/0x2e
> [ 107.094352]  [<ffffffff8100daf3>] ? __xen_spin_lock+0xb7/0xcd
> [ 107.094360]  [<ffffffff8132434a>] iscsi_xmit_task+0x52/0x94
> [ 107.094367]  [<ffffffff81324ea2>] iscsi_xmitworker+0x1c2/0x2b9
> [ 107.094375]  [<ffffffff81324ce0>] ? iscsi_prep_scsi_cmd_pdu+0x604/0x604
> [ 107.094384]  [<ffffffff81052bbe>] process_one_work+0x20b/0x2f9
> [ 107.094391]  [<ffffffff81052e17>] worker_thread+0x16b/0x272
> [ 107.094398]  [<ffffffff81052cac>] ? process_one_work+0x2f9/0x2f9
> [ 107.094406]  [<ffffffff81056c60>] kthread+0xb0/0xb8
> [ 107.094414]  [<ffffffff81056bb0>] ? kthread_freezable_should_stop+0x5b/0x5b
> [ 107.094422]  [<ffffffff8151f4bc>] ret_from_fork+0x7c/0xb0
> [ 107.094430]  [<ffffffff81056bb0>] ? kthread_freezable_should_stop+0x5b/0x5b
>
> I will try to find out who is setting those skb frags. Do you have any
> idea Konrad?

m2p_add_override.
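As an aside for readers following the debugging suggestion above, here is a minimal sketch of the kind of M2P instrumentation being asked for. It assumes a 3.x-era m2p_add_override() in arch/x86/xen/p2m.c and that __builtin_return_address() is what "__builtin_func(1)" refers to; the helper name is made up and this is not code from the thread.

#include <linux/mm.h>
#include <linux/printk.h>
#include <asm/xen/page.h>

/* Log one PFN<->MFN override together with the function that requested it. */
static void trace_m2p_override(unsigned long mfn, struct page *page,
                               void *caller)
{
        unsigned long pfn = page_to_pfn(page);

        pr_debug("m2p override: pfn 0x%lx -> mfn 0x%lx (was 0x%lx), caller %pS\n",
                 pfn, mfn, get_phys_to_machine(pfn), caller);
}

/*
 * Hypothetical call site, e.g. near the top of m2p_add_override():
 *
 *      trace_m2p_override(mfn, page, __builtin_return_address(0));
 */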
Roger Pau Monné
2013-Jan-11 15:57 UTC
Re: Create a iSCSI DomU with disks in another DomU running on the same Dom0
Hello Konrad,

I've found the problem: blkback is adding granted pages to the bios that
are then passed to the underlying block device. When using an iSCSI
target running on another DomU on the same h/w, these bios end up in
netback, and when netback performs the gnttab copy operation it
complains because the passed mfn belongs to a different domain.

I've checked this by applying the appended patch to blkback, which
allocates a buffer to pass to the bio instead of using the granted
page. Of course this should not be applied, since it implies additional
memcpys.

I think the right way to solve this would be to change netback to
use gnttab_map and memcpy instead of gnttab_copy, but I guess this
will imply a performance degradation (I haven't benchmarked it, but I
assume gnttab_copy is used in netback because it is faster than
gnttab_map + memcpy + gnttab_unmap).

---

diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
index 8808028..9740cbb 100644
--- a/drivers/block/xen-blkback/blkback.c
+++ b/drivers/block/xen-blkback/blkback.c
@@ -80,6 +80,8 @@ struct pending_req {
 	unsigned short operation;
 	int status;
 	struct list_head free_list;
+	struct page *grant_pages[BLKIF_MAX_SEGMENTS_PER_REQUEST];
+	void *bio_pages[BLKIF_MAX_SEGMENTS_PER_REQUEST];
 	DECLARE_BITMAP(unmap_seg, BLKIF_MAX_SEGMENTS_PER_REQUEST);
 };

@@ -701,6 +703,7 @@ static void xen_blk_drain_io(struct xen_blkif *blkif)

 static void __end_block_io_op(struct pending_req *pending_req, int error)
 {
+	int i;
 	/* An error fails the entire request. */
 	if ((pending_req->operation == BLKIF_OP_FLUSH_DISKCACHE) &&
 	    (error == -EOPNOTSUPP)) {
@@ -724,6 +727,16 @@ static void __end_block_io_op(struct pending_req *pending_req, int error)
 	 * the proper response on the ring.
 	 */
 	if (atomic_dec_and_test(&pending_req->pendcnt)) {
+		for (i = 0; i < pending_req->nr_pages; i++) {
+			BUG_ON(pending_req->bio_pages[i] == NULL);
+			if (pending_req->operation == BLKIF_OP_READ) {
+				void *grant = kmap_atomic(pending_req->grant_pages[i]);
+				memcpy(grant, pending_req->bio_pages[i],
+				       PAGE_SIZE);
+				kunmap_atomic(grant);
+			}
+			kfree(pending_req->bio_pages[i]);
+		}
 		xen_blkbk_unmap(pending_req);
 		make_response(pending_req->blkif, pending_req->id,
 			      pending_req->operation, pending_req->status);
@@ -846,7 +859,6 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif,
 	int operation;
 	struct blk_plug plug;
 	bool drain = false;
-	struct page *pages[BLKIF_MAX_SEGMENTS_PER_REQUEST];

 	switch (req->operation) {
 	case BLKIF_OP_READ:
@@ -889,6 +901,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif,
 	pending_req->operation = req->operation;
 	pending_req->status = BLKIF_RSP_OKAY;
 	pending_req->nr_pages = nseg;
+	memset(pending_req->bio_pages, 0, sizeof(pending_req->bio_pages));

 	for (i = 0; i < nseg; i++) {
 		seg[i].nsec = req->u.rw.seg[i].last_sect -
@@ -933,7 +946,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif,
 	 * the hypercall to unmap the grants - that is all done in
 	 * xen_blkbk_unmap.
 	 */
-	if (xen_blkbk_map(req, pending_req, seg, pages))
+	if (xen_blkbk_map(req, pending_req, seg, pending_req->grant_pages))
 		goto fail_flush;

 	/*
@@ -943,9 +956,17 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif,
 	xen_blkif_get(blkif);

 	for (i = 0; i < nseg; i++) {
+		void *grant;
+		pending_req->bio_pages[i] = kmalloc(PAGE_SIZE, GFP_KERNEL);
+		if (req->operation == BLKIF_OP_WRITE) {
+			grant = kmap_atomic(pending_req->grant_pages[i]);
+			memcpy(pending_req->bio_pages[i], grant,
+			       PAGE_SIZE);
+			kunmap_atomic(grant);
+		}
 		while ((bio == NULL) ||
 		       (bio_add_page(bio,
-				     pages[i],
+				     virt_to_page(pending_req->bio_pages[i]),
 				     seg[i].nsec << 9,
 				     seg[i].buf & ~PAGE_MASK) == 0)) {
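For context on the failure described above, here is a rough sketch (not the actual netback code) of the GNTTABOP_copy operation netback issues when copying packet data into a receiving guest, with made-up domid/grant-ref values. The struct layout and flags come from Xen's public grant_table.h. The source is named by a gmfn that Xen requires to belong to the calling domain; when that frame is really one blkback grant-mapped from a different guest, the ownership check rejects it, which is the "source frame ... invalid" error quoted earlier in this thread.

#include <linux/errno.h>
#include <xen/interface/xen.h>
#include <xen/interface/grant_table.h>
#include <asm/xen/hypercall.h>

/* Copy 'len' bytes from a local frame into a grant offered by the frontend. */
static int example_copy_to_guest(domid_t frontend_domid, grant_ref_t rx_ref,
                                 unsigned long src_gmfn, unsigned int len)
{
        struct gnttab_copy op = {
                .flags         = GNTCOPY_dest_gref,
                .len           = len,
                .source.domid  = DOMID_SELF,
                .source.u.gmfn = src_gmfn,   /* must be a frame this domain owns */
                .source.offset = 0,
                .dest.domid    = frontend_domid,
                .dest.u.ref    = rx_ref,
                .dest.offset   = 0,
        };

        if (HYPERVISOR_grant_table_op(GNTTABOP_copy, &op, 1))
                return -EFAULT;

        return op.status == GNTST_okay ? 0 : -EIO;
}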
Konrad Rzeszutek Wilk
2013-Jan-11 18:51 UTC
Re: Create a iSCSI DomU with disks in another DomU running on the same Dom0
On Fri, Jan 11, 2013 at 04:57:52PM +0100, Roger Pau Monné wrote:
> Hello Konrad,
>
> I've found the problem: blkback is adding granted pages to the bios that
> are then passed to the underlying block device. When using an iSCSI
> target running on another DomU on the same h/w, these bios end up in
> netback, and when netback performs the gnttab copy operation it
> complains because the passed mfn belongs to a different domain.

OK, so my original theory was sound. The m2p override "sticks".

> I've checked this by applying the appended patch to blkback, which
> allocates a buffer to pass to the bio instead of using the granted
> page. Of course this should not be applied, since it implies additional
> memcpys.
>
> I think the right way to solve this would be to change netback to
> use gnttab_map and memcpy instead of gnttab_copy, but I guess this
> will imply a performance degradation (I haven't benchmarked it, but I
> assume gnttab_copy is used in netback because it is faster than
> gnttab_map + memcpy + gnttab_unmap).

Or blkback is altered to use grant_copy. Or perhaps m2p_override
can do multiple PAGE_FOREIGN? (So if it detects a collision it will
do something smart.. like allocate a new page or update the
kmap_op with extra information).

And yes, grant_map in netback is much much slower than grant_copy
(I tested 2.6.32 vs 3.7 using a Xen 4.1.3 with the grant_copy fixes
that Jan came up with). See attached.

> ---
>
> diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
> index 8808028..9740cbb 100644
> --- a/drivers/block/xen-blkback/blkback.c
> +++ b/drivers/block/xen-blkback/blkback.c
> @@ -80,6 +80,8 @@ struct pending_req {
>  	unsigned short operation;
>  	int status;
>  	struct list_head free_list;
> +	struct page *grant_pages[BLKIF_MAX_SEGMENTS_PER_REQUEST];
> +	void *bio_pages[BLKIF_MAX_SEGMENTS_PER_REQUEST];
>  	DECLARE_BITMAP(unmap_seg, BLKIF_MAX_SEGMENTS_PER_REQUEST);
>  };
>
> @@ -701,6 +703,7 @@ static void xen_blk_drain_io(struct xen_blkif *blkif)
>
>  static void __end_block_io_op(struct pending_req *pending_req, int error)
>  {
> +	int i;
>  	/* An error fails the entire request. */
>  	if ((pending_req->operation == BLKIF_OP_FLUSH_DISKCACHE) &&
>  	    (error == -EOPNOTSUPP)) {
> @@ -724,6 +727,16 @@ static void __end_block_io_op(struct pending_req *pending_req, int error)
>  	 * the proper response on the ring.
>  	 */
>  	if (atomic_dec_and_test(&pending_req->pendcnt)) {
> +		for (i = 0; i < pending_req->nr_pages; i++) {
> +			BUG_ON(pending_req->bio_pages[i] == NULL);
> +			if (pending_req->operation == BLKIF_OP_READ) {
> +				void *grant = kmap_atomic(pending_req->grant_pages[i]);
> +				memcpy(grant, pending_req->bio_pages[i],
> +				       PAGE_SIZE);
> +				kunmap_atomic(grant);
> +			}
> +			kfree(pending_req->bio_pages[i]);
> +		}
>  		xen_blkbk_unmap(pending_req);
>  		make_response(pending_req->blkif, pending_req->id,
>  			      pending_req->operation, pending_req->status);
> @@ -846,7 +859,6 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif,
>  	int operation;
>  	struct blk_plug plug;
>  	bool drain = false;
> -	struct page *pages[BLKIF_MAX_SEGMENTS_PER_REQUEST];
>
>  	switch (req->operation) {
>  	case BLKIF_OP_READ:
> @@ -889,6 +901,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif,
>  	pending_req->operation = req->operation;
>  	pending_req->status = BLKIF_RSP_OKAY;
>  	pending_req->nr_pages = nseg;
> +	memset(pending_req->bio_pages, 0, sizeof(pending_req->bio_pages));
>
>  	for (i = 0; i < nseg; i++) {
>  		seg[i].nsec = req->u.rw.seg[i].last_sect -
> @@ -933,7 +946,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif,
>  	 * the hypercall to unmap the grants - that is all done in
>  	 * xen_blkbk_unmap.
>  	 */
> -	if (xen_blkbk_map(req, pending_req, seg, pages))
> +	if (xen_blkbk_map(req, pending_req, seg, pending_req->grant_pages))
>  		goto fail_flush;
>
>  	/*
> @@ -943,9 +956,17 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif,
>  	xen_blkif_get(blkif);
>
>  	for (i = 0; i < nseg; i++) {
> +		void *grant;
> +		pending_req->bio_pages[i] = kmalloc(PAGE_SIZE, GFP_KERNEL);
> +		if (req->operation == BLKIF_OP_WRITE) {
> +			grant = kmap_atomic(pending_req->grant_pages[i]);
> +			memcpy(pending_req->bio_pages[i], grant,
> +			       PAGE_SIZE);
> +			kunmap_atomic(grant);
> +		}
>  		while ((bio == NULL) ||
>  		       (bio_add_page(bio,
> -				     pages[i],
> +				     virt_to_page(pending_req->bio_pages[i]),
>  				     seg[i].nsec << 9,
>  				     seg[i].buf & ~PAGE_MASK) == 0)) {

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
Roger Pau Monné
2013-Jan-11 19:29 UTC
Re: Create a iSCSI DomU with disks in another DomU running on the same Dom0
On 11/01/13 19:51, Konrad Rzeszutek Wilk wrote:
> On Fri, Jan 11, 2013 at 04:57:52PM +0100, Roger Pau Monné wrote:
>> Hello Konrad,
>>
>> I've found the problem: blkback is adding granted pages to the bios that
>> are then passed to the underlying block device. When using an iSCSI
>> target running on another DomU on the same h/w, these bios end up in
>> netback, and when netback performs the gnttab copy operation it
>> complains because the passed mfn belongs to a different domain.
>
> OK, so my original theory was sound. The m2p override "sticks".
>
>> I've checked this by applying the appended patch to blkback, which
>> allocates a buffer to pass to the bio instead of using the granted
>> page. Of course this should not be applied, since it implies additional
>> memcpys.
>>
>> I think the right way to solve this would be to change netback to
>> use gnttab_map and memcpy instead of gnttab_copy, but I guess this
>> will imply a performance degradation (I haven't benchmarked it, but I
>> assume gnttab_copy is used in netback because it is faster than
>> gnttab_map + memcpy + gnttab_unmap).
>
> Or blkback is altered to use grant_copy.

This would not work with the persistent-grants extension, and it will
probably also show degraded performance when scaling to a large number
of guests, due to the grant table lock (compared to using persistent
grants).

> Or perhaps m2p_override
> can do multiple PAGE_FOREIGN? (So if it detects a collision it will
> do something smart.. like allocate a new page or update the
> kmap_op with extra information).

What we could do is add extra information to m2p_override, containing
the grant_ref_t and domid, so that when a FOREIGN_FRAME is detected in
grant_copy (or netback) the grant_ref_t and domid of the passed mfn are
used instead of the mfn (provided that grant_copy can perform a copy
between two grant references of different domains).

> And yes, grant_map in netback is much much slower than grant_copy
> (I tested 2.6.32 vs 3.7 using a Xen 4.1.3 with the grant_copy fixes
> that Jan came up with).

Yes, I see there's no way we are going to use grant_map instead of
grant_copy. I guess this will no longer be true once netback/front
starts using the persistent grants extension.
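A rough illustration of the bookkeeping Roger proposes here; none of these names exist in the kernel and this is not the real m2p_override interface, just a sketch of keeping the grant reference and owning domid alongside the foreign mfn.

#include <linux/mm.h>
#include <xen/interface/grant_table.h>

/*
 * Hypothetical record for a grant-mapped (foreign) page: besides the
 * original mfn that m2p_override already tracks, remember which grant
 * reference and which domain it came from, so a consumer that finds a
 * FOREIGN_FRAME could redo the copy in terms of the grant instead of
 * the raw mfn.
 */
struct foreign_frame_info {
        unsigned long mfn;      /* foreign machine frame backing the page */
        grant_ref_t   gref;     /* grant reference it was mapped with */
        domid_t       domid;    /* domain that owns the frame */
};

static void example_tag_foreign_page(struct page *page,
                                     struct foreign_frame_info *info)
{
        /* Mirror how the current code stashes the mfn in page->private. */
        SetPagePrivate(page);
        set_page_private(page, (unsigned long)info);
}

static struct foreign_frame_info *example_foreign_info(struct page *page)
{
        return PagePrivate(page) ?
                (struct foreign_frame_info *)page_private(page) : NULL;
}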
Konrad Rzeszutek Wilk
2013-Jan-11 21:09 UTC
Re: Create a iSCSI DomU with disks in another DomU running on the same Dom0
On Fri, Jan 11, 2013 at 08:29:12PM +0100, Roger Pau Monné wrote:
> On 11/01/13 19:51, Konrad Rzeszutek Wilk wrote:
> > On Fri, Jan 11, 2013 at 04:57:52PM +0100, Roger Pau Monné wrote:
> >> Hello Konrad,
> >>
> >> I've found the problem: blkback is adding granted pages to the bios that
> >> are then passed to the underlying block device. When using an iSCSI
> >> target running on another DomU on the same h/w, these bios end up in
> >> netback, and when netback performs the gnttab copy operation it
> >> complains because the passed mfn belongs to a different domain.
> >
> > OK, so my original theory was sound. The m2p override "sticks".
> >
> >> I've checked this by applying the appended patch to blkback, which
> >> allocates a buffer to pass to the bio instead of using the granted
> >> page. Of course this should not be applied, since it implies additional
> >> memcpys.
> >>
> >> I think the right way to solve this would be to change netback to
> >> use gnttab_map and memcpy instead of gnttab_copy, but I guess this
> >> will imply a performance degradation (I haven't benchmarked it, but I
> >> assume gnttab_copy is used in netback because it is faster than
> >> gnttab_map + memcpy + gnttab_unmap).
> >
> > Or blkback is altered to use grant_copy.
>
> This would not work with the persistent-grants extension, and it will
> probably also show degraded performance when scaling to a large number
> of guests, due to the grant table lock (compared to using persistent
> grants).
>
> > Or perhaps m2p_override
> > can do multiple PAGE_FOREIGN? (So if it detects a collision it will
> > do something smart.. like allocate a new page or update the
> > kmap_op with extra information).
>
> What we could do is add extra information to m2p_override, containing
> the grant_ref_t and domid, so that when a FOREIGN_FRAME is detected in
> grant_copy (or netback) the grant_ref_t and domid of the passed mfn are
> used instead of the mfn (provided that grant_copy can perform a copy
> between two grant references of different domains).
>
> > And yes, grant_map in netback is much much slower than grant_copy
> > (I tested 2.6.32 vs 3.7 using a Xen 4.1.3 with the grant_copy fixes
> > that Jan came up with).
>
> Yes, I see there's no way we are going to use grant_map instead of
> grant_copy. I guess this will no longer be true once netback/front
> starts using the persistent grants extension.

Hm? Annie posted patches for the persistent grants on netback/netfront
and they did not show much improvement (as you are already doing
grant_copy).

Or are you saying change netback to use grant_map and utilize the
skb->destructor to keep track of it? And then do persistent grant
extensions on it?
Roger Pau Monné
2013-Jan-12 12:11 UTC
Re: Create a iSCSI DomU with disks in another DomU running on the same Dom0
On 11/01/13 22:09, Konrad Rzeszutek Wilk wrote:
> On Fri, Jan 11, 2013 at 08:29:12PM +0100, Roger Pau Monné wrote:
>> On 11/01/13 19:51, Konrad Rzeszutek Wilk wrote:
>>> On Fri, Jan 11, 2013 at 04:57:52PM +0100, Roger Pau Monné wrote:
>>>> Hello Konrad,
>>>>
>>>> I've found the problem: blkback is adding granted pages to the bios that
>>>> are then passed to the underlying block device. When using an iSCSI
>>>> target running on another DomU on the same h/w, these bios end up in
>>>> netback, and when netback performs the gnttab copy operation it
>>>> complains because the passed mfn belongs to a different domain.
>>>
>>> OK, so my original theory was sound. The m2p override "sticks".
>>>
>>>> I've checked this by applying the appended patch to blkback, which
>>>> allocates a buffer to pass to the bio instead of using the granted
>>>> page. Of course this should not be applied, since it implies additional
>>>> memcpys.
>>>>
>>>> I think the right way to solve this would be to change netback to
>>>> use gnttab_map and memcpy instead of gnttab_copy, but I guess this
>>>> will imply a performance degradation (I haven't benchmarked it, but I
>>>> assume gnttab_copy is used in netback because it is faster than
>>>> gnttab_map + memcpy + gnttab_unmap).
>>>
>>> Or blkback is altered to use grant_copy.
>>
>> This would not work with the persistent-grants extension, and it will
>> probably also show degraded performance when scaling to a large number
>> of guests, due to the grant table lock (compared to using persistent
>> grants).
>>
>>> Or perhaps m2p_override
>>> can do multiple PAGE_FOREIGN? (So if it detects a collision it will
>>> do something smart.. like allocate a new page or update the
>>> kmap_op with extra information).
>>
>> What we could do is add extra information to m2p_override, containing
>> the grant_ref_t and domid, so that when a FOREIGN_FRAME is detected in
>> grant_copy (or netback) the grant_ref_t and domid of the passed mfn are
>> used instead of the mfn (provided that grant_copy can perform a copy
>> between two grant references of different domains).

Since the issue I'm having is not common, I'm not sure this solution is
worth it: it would imply storing a pointer to a struct in the page
private data that holds the mfn, grant reference and domid (right now we
only store the mfn in the page private data).

>>> And yes, grant_map in netback is much much slower than grant_copy
>>> (I tested 2.6.32 vs 3.7 using a Xen 4.1.3 with the grant_copy fixes
>>> that Jan came up with).
>>
>> Yes, I see there's no way we are going to use grant_map instead of
>> grant_copy. I guess this will no longer be true once netback/front
>> starts using the persistent grants extension.
>
> Hm? Annie posted patches for the persistent grants on netback/netfront
> and they did not show much improvement (as you are already doing
> grant_copy).
>
> Or are you saying change netback to use grant_map and utilize the
> skb->destructor to keep track of it? And then do persistent grant
> extensions on it?

I'm not really familiar with the net code, but I've had a quick look at
the grant_copy operation in Xen and it is indeed using the grant lock to
protect some parts of the code.

Using persistent grants should provide better performance in the long
run because once the grant is mapped we don't have to issue any more
grant operations, thus avoiding grant lock contention (maybe I'm missing
something here). The skb destructor should probably be used in netfront,
to return the persistently mapped grant to the list of free grants, but
I'm not sure if we will need to use it in netback.
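A sketch of the netfront-side recycling Roger mentions here, under the assumption that a persistently granted RX buffer is stashed behind the skb and handed back by the destructor; all names are hypothetical and this is not netfront code.

#include <linux/skbuff.h>
#include <linux/list.h>
#include <linux/spinlock.h>
#include <xen/interface/grant_table.h>

/* A buffer whose grant stays valid for the lifetime of the interface. */
struct persistent_gnt_buf {
        struct list_head node;
        grant_ref_t      gref;   /* persistently granted to the backend */
        struct page      *page;  /* page shared through that grant */
};

static LIST_HEAD(example_free_bufs);
static DEFINE_SPINLOCK(example_free_lock);

/*
 * Hypothetical skb->destructor: when the stack is done with the skb,
 * put the buffer back on the free list instead of ending the grant.
 * Assumes the producer stored the buffer pointer in skb->cb when it
 * attached the page and set skb->destructor to this function.
 */
static void example_recycle_persistent_buf(struct sk_buff *skb)
{
        struct persistent_gnt_buf *buf = *(struct persistent_gnt_buf **)skb->cb;
        unsigned long flags;

        spin_lock_irqsave(&example_free_lock, flags);
        list_add_tail(&buf->node, &example_free_bufs);
        spin_unlock_irqrestore(&example_free_lock, flags);
}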
Konrad Rzeszutek Wilk
2013-Jan-14 15:24 UTC
Re: Create a iSCSI DomU with disks in another DomU running on the same Dom0
On Sat, Jan 12, 2013 at 01:11:32PM +0100, Roger Pau Monné wrote:
> On 11/01/13 22:09, Konrad Rzeszutek Wilk wrote:
> > On Fri, Jan 11, 2013 at 08:29:12PM +0100, Roger Pau Monné wrote:
> >> On 11/01/13 19:51, Konrad Rzeszutek Wilk wrote:
> >>> On Fri, Jan 11, 2013 at 04:57:52PM +0100, Roger Pau Monné wrote:
> >>>> Hello Konrad,
> >>>>
> >>>> I've found the problem: blkback is adding granted pages to the bios that
> >>>> are then passed to the underlying block device. When using an iSCSI
> >>>> target running on another DomU on the same h/w, these bios end up in
> >>>> netback, and when netback performs the gnttab copy operation it
> >>>> complains because the passed mfn belongs to a different domain.
> >>>
> >>> OK, so my original theory was sound. The m2p override "sticks".
> >>>
> >>>> I've checked this by applying the appended patch to blkback, which
> >>>> allocates a buffer to pass to the bio instead of using the granted
> >>>> page. Of course this should not be applied, since it implies additional
> >>>> memcpys.
> >>>>
> >>>> I think the right way to solve this would be to change netback to
> >>>> use gnttab_map and memcpy instead of gnttab_copy, but I guess this
> >>>> will imply a performance degradation (I haven't benchmarked it, but I
> >>>> assume gnttab_copy is used in netback because it is faster than
> >>>> gnttab_map + memcpy + gnttab_unmap).
> >>>
> >>> Or blkback is altered to use grant_copy.
> >>
> >> This would not work with the persistent-grants extension, and it will
> >> probably also show degraded performance when scaling to a large number
> >> of guests, due to the grant table lock (compared to using persistent
> >> grants).
> >>
> >>> Or perhaps m2p_override
> >>> can do multiple PAGE_FOREIGN? (So if it detects a collision it will
> >>> do something smart.. like allocate a new page or update the
> >>> kmap_op with extra information).
> >>
> >> What we could do is add extra information to m2p_override, containing
> >> the grant_ref_t and domid, so that when a FOREIGN_FRAME is detected in
> >> grant_copy (or netback) the grant_ref_t and domid of the passed mfn are
> >> used instead of the mfn (provided that grant_copy can perform a copy
> >> between two grant references of different domains).
>
> Since the issue I'm having is not common, I'm not sure this solution is
> worth it: it would imply storing a pointer to a struct in the page
> private data that holds the mfn, grant reference and domid (right now we
> only store the mfn in the page private data).
>
> >>> And yes, grant_map in netback is much much slower than grant_copy
> >>> (I tested 2.6.32 vs 3.7 using a Xen 4.1.3 with the grant_copy fixes
> >>> that Jan came up with).
> >>
> >> Yes, I see there's no way we are going to use grant_map instead of
> >> grant_copy. I guess this will no longer be true once netback/front
> >> starts using the persistent grants extension.
> >
> > Hm? Annie posted patches for the persistent grants on netback/netfront
> > and they did not show much improvement (as you are already doing
> > grant_copy).
> >
> > Or are you saying change netback to use grant_map and utilize the
> > skb->destructor to keep track of it? And then do persistent grant
> > extensions on it?
>
> I'm not really familiar with the net code, but I've had a quick look at
> the grant_copy operation in Xen and it is indeed using the grant lock to
> protect some parts of the code.

Sure, but at the cost of doing memory copy.
And if the guests are on separate sockets there are no cache benefits.

> Using persistent grants should provide better performance in the long
> run because once the grant is mapped we don't have to issue any more
> grant operations, thus avoiding grant lock contention (maybe I'm missing
> something here). The skb destructor should probably be used in netfront,

They did not improve it. As a matter of fact they made it worse.
(So this is taking the idea that Andrew had that he shared with Oliver
and you about doing persistent grants, and using the same type of
hypercalls and copy - but do it in the network subsystem).

> to return the persistently mapped grant to the list of free grants, but
> I'm not sure if we will need to use it in netback.

Why not? We don't want to blow away any of the cache data if we can
avoid it.

The problem is how to deal with the RX path from the NICs. Each NIC on
the RX path does something like this:

 1). process its descriptors
 2). unmap the page
 3). allocate a new set of skbs (and pages), update the descriptors
     with the new bus address
 4). pass off the unmapped pages (with skbs) to the network stack.
 5). forget about the skbs.

There is no persistency - so the NIC ends up getting a "fresh" set of
pages all the time that percolate up to netback.

The TX (so netback -> NIC) could be solved by having a pool of
pages/skbs that are mapped and are owned by netback (via the
skb->destructor) so that they are retained.
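To make the TX pool idea concrete, here is a minimal sketch of a page pool a backend could own, with pages handed out for skbs and returned (for example from an skb destructor) when the stack is done with them; the names are made up and this is not netback code.

#include <linux/list.h>
#include <linux/mm.h>
#include <linux/spinlock.h>

struct example_page_pool {
        struct list_head free;  /* pages currently not in flight */
        spinlock_t       lock;
};

/* Borrow a page from the pool; NULL means the pool is exhausted. */
static struct page *example_pool_get(struct example_page_pool *pool)
{
        struct page *page = NULL;
        unsigned long flags;

        spin_lock_irqsave(&pool->lock, flags);
        if (!list_empty(&pool->free)) {
                page = list_first_entry(&pool->free, struct page, lru);
                list_del(&page->lru);
        }
        spin_unlock_irqrestore(&pool->lock, flags);
        return page;
}

/* Return a page so its mapping is retained rather than torn down. */
static void example_pool_put(struct example_page_pool *pool, struct page *page)
{
        unsigned long flags;

        spin_lock_irqsave(&pool->lock, flags);
        list_add_tail(&page->lru, &pool->free);
        spin_unlock_irqrestore(&pool->lock, flags);
}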