Is xenpaging supposed to work?
For me it's crashing on a 4x3.6GHz Xeon, 4G system running SLES11 SP1.

There is one guest running.
A simple-minded test of xenpaging crashes the host with xen 4.0 and
xen-unstable the same way:

# xenpaging 1 1
xc: error: Error flushing ioemu cache: Internal error
(XEN) mm.c:768:d0 mfn 431f page ffff82f6000863e0 ci 0180000000000000/0000000000000000 32
(XEN) mm.c:769:d0 ffff82f6000863e0: 0000000000000000 0180000000000000 0000000000000000 0000000000000000
(XEN) Assertion '(page->count_info & ((1UL<<(64 - (9)))-1)) != 0' failed at mm.c:772
(XEN) Debugging connection not set up.
(XEN) ----[ Xen-4.1.21836-20100726.111138 x86_64 debug=y Not tainted ]----
(XEN) CPU: 3
(XEN) RIP: e008:[<ffff82c480161d92>] is_iomem_page+0x11e/0x16c
(XEN) RFLAGS: 0000000000010246 CONTEXT: hypervisor
(XEN) rax: 007fffffffffffff rbx: ffffffffffffffff rcx: 00000000c8c7d8ae
(XEN) rdx: 00000000000000d5 rsi: 000000000036cb58 rdi: 00000000000003e8
(XEN) rbp: ffff83012fecfbd8 rsp: ffff83012fecfb98 r8: 0000000000000003
(XEN) r9: 0000000000000004 r10: 0000000000000004 r11: 0000000000000001
(XEN) r12: 0000000000000000 r13: 0180000000000000 r14: 0000000000000000
(XEN) r15: ffff82f6000863e0 cr0: 0000000080050033 cr4: 00000000000006f0
(XEN) cr3: 0000000127c2e000 cr2: 00007f2aeea5d270
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
(XEN) Xen stack trace from rsp=ffff83012fecfb98:
(XEN) 0180000000000000 0000000000000000 0000000000000000 ffff8301192c0000
(XEN) 000000000000431f ffff8301192c0000 000000000000431f 000000000060c000
(XEN) ffff83012fecfc18 ffff82c4801ca912 0000000000000000 0000000100000000
(XEN) ffff8301192c0000 000000000060c000 ffff8800cd749e58 0000000000000000
(XEN) ffff83012fecfc28 ffff82c4801cee3d ffff83012fecfc78 ffff82c4801ceddd
(XEN) 000000000010e540 7400000000000002 ffff82f6021ca810 fffffffffffffff3
(XEN) 000000000060c000 ffff8800cd749e58 0000000000000000 000000000060c000
(XEN) ffff83012fecfda8 ffff82c48014fc53 ffff83012fecff18 4c00000000000002
(XEN) ffff82f6024f85d0 000000002ffe4c60 4000000000000000 ffff82f602510fa0
(XEN) 000000000012887d 0000000000000067 ffff83012fea0000 ffff83012fea0000
(XEN) 000000000010e540 ffff83012fecfe38 ffff83012fecfcf8 ffff82c480167c59
(XEN) ffff83012fecfd88 ffff82c480167e96 ffff83012fecfd48 000000000012887d
(XEN) 0000000000000000 0000000000000000 ffff83012fecfd58 0000000000249b00
(XEN) 00000000001288a0 00000000124d8125 ffff8301288a02e8 0000000000000000
(XEN) 0000000000000000 0000000000000001 00000000124d8125 ffff83012fea0000
(XEN) 0000000000000202 fffffffffffffff3 000000000060c000 ffff8800cd749e58
(XEN) 0000000000000000 00007fffe5f712b0 ffff83012fecfef8 ffff82c480103568
(XEN) ffff83012fea0000 0000000000000000 ffff83012fecfe08 ffff82c480163b65
(XEN) 0000000100000082 ffff82f602511400 ffff83012fea0000 ffff83012fea0000
(XEN) 0000000000000000 ffff82f602511400 ffff83012fecfee8 ffff82c48016e8d5
(XEN) Xen call trace:
(XEN) [<ffff82c480161d92>] is_iomem_page+0x11e/0x16c
(XEN) [<ffff82c4801ca912>] p2m_mem_paging_nominate+0xc3/0x252
(XEN) [<ffff82c4801cee3d>] mem_paging_domctl+0x2d/0x70
(XEN) [<ffff82c4801ceddd>] mem_event_domctl+0x36f/0x3a2
(XEN) [<ffff82c48014fc53>] arch_do_domctl+0x2573/0x2740
(XEN) [<ffff82c480103568>] do_domctl+0x1222/0x12aa
(XEN) [<ffff82c4801fd392>] syscall_enter+0xf2/0x14c
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 3:
(XEN) Assertion '(page->count_info & ((1UL<<(64 - (9)))-1)) != 0' failed at mm.c:772
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...
(XEN) Debugging connection not set up.
---
 xen/arch/x86/mm.c |    7 +++++++
 1 file changed, 7 insertions(+)

--- xen-unstable.hg-4.1.21836.orig/xen/arch/x86/mm.c
+++ xen-unstable.hg-4.1.21836/xen/arch/x86/mm.c
@@ -99,6 +99,7 @@
 #include <xen/event.h>
 #include <xen/iocap.h>
 #include <xen/guest_access.h>
+#include <xen/delay.h>
 #include <asm/paging.h>
 #include <asm/shadow.h>
 #include <asm/page.h>
@@ -762,6 +763,12 @@ int is_iomem_page(unsigned long mfn)
 
     /* Caller must know that it is an iomem page, or a reference is held. */
     page = mfn_to_page(mfn);
+    if ((page->count_info & PGC_count_mask) == 0) {
+        unsigned long **p = (unsigned long **)page;
+        MEM_LOG("mfn %lx page %p ci %p/%p %d", mfn, page, (void *)page->count_info, (void *)(page->count_info & PGC_count_mask), (int)sizeof(*page));
+        MEM_LOG("%p: %p %p %p %p",page,p[0],p[1],p[2],p[3]);
+        mdelay(1234);
+    }
     ASSERT((page->count_info & PGC_count_mask) != 0);
 
     return (page_get_owner(page) == dom_io);

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
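The failed assertion can be checked by hand against the logged values. The following is a minimal sketch, assuming the count mask is exactly the expansion printed in the assertion text, `(1UL << (64 - 9)) - 1`. The names `COUNT_MASK` and `page_refcount()` are made up for this illustration and are not Xen identifiers.

```c
#include <stdint.h>

/* Expansion of the mask as printed in the assertion message:
 * the low 55 bits of count_info hold the reference count. */
#define COUNT_MASK ((UINT64_C(1) << (64 - 9)) - 1)

/* Hypothetical helper: extract the reference count from count_info. */
static uint64_t page_refcount(uint64_t count_info)
{
    return count_info & COUNT_MASK;
}
```

Feeding in the logged `ci` value `0x0180000000000000` gives a count of zero, because only bits above the mask are set. That agrees with the `0180000000000000/0000000000000000` pair in the MEM_LOG line, and the mask itself is the `007fffffffffffff` seen in `rax`.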
Patrick Colp
2010-Jul-26 13:15 UTC
Re: [Xen-devel] xenpaging crashes xen in is_iomem_page()
I've not seen this error before. Does your computer have EPT support?

What's the patch at the end? Is that a fix or a recent change that you
think might be causing the problem?

Patrick
Olaf Hering
2010-Jul-26 14:08 UTC
Re: [Xen-devel] xenpaging crashes xen in is_iomem_page()
On Mon, Jul 26, Patrick Colp wrote:

> I've not seen this error before. Does your computer have EPT support?

You are right, it's a slightly dated system. x86info shows:

...
CPU #4
EFamily: 0 EModel: 0 Family: 15 Model: 4 Stepping: 1
CPU Model: Pentium 4 (Prescott) [E0]
Processor name string: Intel(R) Xeon(TM) CPU 3.60GHz
...

I need to move to another host, all of my systems lack EPT support.

Is there a portable way to check for the EPT flag in xenpaging?

> What's the patch at the end? Is that a fix or a recent change that you
> think might be causing the problem?

It's just naive debug output.

Olaf
Patrick Colp
2010-Jul-26 14:39 UTC
Re: [Xen-devel] xenpaging crashes xen in is_iomem_page()
On 26 July 2010 10:08, Olaf Hering <olaf@aepfle.de> wrote:
> I need to move to another host, all of my systems lack EPT support.
>
> Is there a portable way to check for the EPT flag in xenpaging?

Yeah, this can be added. Probably not a bad idea, really. I actually
have some code to test if a machine has EPT enabled or not (by
borrowing some of the Xen start-up code which checks for things like
EPT), so it should be easy enough to roll that into functions that
xenpaging can use. I'll get on that right away.

Patrick
Hi,

At 15:39 +0100 on 26 Jul (1280158794), Patrick Colp wrote:
> On 26 July 2010 10:08, Olaf Hering <olaf@aepfle.de> wrote:
> > Is there a portable way to check for the EPT flag in xenpaging?
>
> Yeah, this can be added. Probably not a bad idea, really. I actually
> have some code to test if a machine has EPT enabled or not (by
> borrowing some of the Xen start-up code which checks for things like
> EPT)

Good idea. From inside Xen you should be checking hap_enabled(d) (since
even on an EPT machine the tools can request shadow pagetables for
individual domains). Outside Xen I'm not sure there's a reliable way of
detecting HAP; maybe the xenpaging hypercalls should return an error?

Cheers,

Tim.

--
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, XenServer Engineering
Citrix Systems UK Ltd. (Company #02937203, SL9 0BG)
Patrick Colp
2010-Jul-26 15:23 UTC
Re: [Xen-devel] xenpaging crashes xen in is_iomem_page()
On 26 July 2010 10:58, Tim Deegan <Tim.Deegan@citrix.com> wrote:
> Good idea. From inside Xen you should be checking hap_enabled(d) (since
> even on an EPT machine the tools can request shadow pagetables for
> individual domains). Outside Xen I'm not sure there's a reliable way of
> detecting HAP; maybe the xenpaging hypercalls should return an error?

Right, checks against hap_enabled make sense too. From outside Xen, I
can detect if EPT is enabled for the machine. This doesn't guarantee
it's enabled for a given guest, nor does an "EPT disabled" result mean
the machine doesn't have EPT (it just means that EPT isn't enabled at
the time). However, it might be better to just put the checks in Xen and
return an error through the domctls if HAP isn't enabled, as you
suggest.

Patrick
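The in-hypervisor check discussed here could look roughly like the following. This is a self-contained sketch and not Xen code: `struct mock_domain`, its fields, and `paging_domctl_check()` are invented for the example, and the choice of `-ENODEV` as the error is an assumption. Inside Xen the test would be made with `hap_enabled(d)`, as Tim suggests.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the hypervisor's per-domain state (hypothetical). */
struct mock_domain {
    bool is_hvm;      /* HVM guest rather than PV */
    bool hap_enabled; /* hardware-assisted paging in use for this domain */
};

/* Hypothetical gate at the top of a paging domctl handler: refuse the
 * operation with an error instead of tripping an assertion later. */
static int paging_domctl_check(const struct mock_domain *d)
{
    if (d == NULL)
        return -EINVAL;
    if (!d->is_hvm || !d->hap_enabled)
        return -ENODEV; /* assumed error code */
    return 0;
}
```

With a gate like this the tool would see a failed domctl for a PV guest or a shadow-paged HVM guest, which it can report to the user.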
Olaf Hering
2010-Jul-27 15:20 UTC
Re: [Xen-devel] xenpaging crashes xen in is_iomem_page()
On Mon, Jul 26, Patrick Colp wrote:

> I've not seen this error before. Does your computer have EPT support?

Now I got another system with X5550 cpu, i7 based.
x86info reports:
...
CPU #8
EFamily: 0 EModel: 1 Family: 6 Model: 26 Stepping: 5
CPU Model: Core i7 (Nehalem)
Processor name string: Intel(R) Xeon(R) CPU X5550 @ 2.67GHz

And it still crashes this way.

XEN reports this during boot:

...
(XEN) VMX: Supported advanced features:
(XEN) - APIC MMIO access virtualisation
(XEN) - APIC TPR shadow
(XEN) - Extended Page Tables (EPT)
(XEN) - Virtual-Processor Identifiers (VPID)
(XEN) - Virtual NMI
(XEN) - MSR direct-access bitmap
(XEN) EPT supports 2MB super page.
(XEN) HVM: ASIDs enabled.
(XEN) HVM: VMX enabled
(XEN) HVM: Hardware Assisted Paging detected.
(XEN) Brought up 8 CPUs
...

So, is xenpaging really working for you?

Olaf
Patrick Colp
2010-Jul-27 15:46 UTC
Re: [Xen-devel] xenpaging crashes xen in is_iomem_page()
I pulled the latest version of Xen (revision 21859) and ran it on my
EPT box and then ran xenpaging and it's working fine for me.

Have you tried paging more than just 1 page? Not sure why that should
make a difference, but it's worth trying. Also, do you have any
modifications/patches applied to your tree that aren't part of
xen-unstable? Is the guest you're attempting to run xenpaging on an
HVM guest?

Patrick
Olaf Hering
2010-Jul-27 18:25 UTC
Re: [Xen-devel] xenpaging crashes xen in is_iomem_page()
On Tue, Jul 27, Patrick Colp wrote:

> I pulled the latest version of Xen (revision 21859) and ran it on my
> EPT box and then ran xenpaging and it's working fine for me.

Good to hear.

> Have you tried paging more than just 1 page? Not sure why that should
> make a difference, but it's worth trying. Also, do you have any
> modifications/patches applied to your tree that aren't part of
> xen-unstable? Is the guest you're attempting to run xenpaging on an
> HVM guest?

Other values lead to different crashes. I have no more modifications,
besides the submitted patches to runlevel scripts.

It's not a HVM guest, I will try that right now.

xenpaging has some pitfalls it seems.

Olaf
Patrick Colp
2010-Jul-27 18:39 UTC
Re: [Xen-devel] xenpaging crashes xen in is_iomem_page()
On 27 July 2010 14:25, Olaf Hering <olaf@aepfle.de> wrote:
> It's not a HVM guest, I will try that right now.

Ah, that would probably be why. Just to clarify, it only works for
64-bit Xen on HVM guests which are using EPT.

> xenpaging has some pitfalls it seems.

The lack of warning when trying to use xenpaging on non-EPT guests is
an issue. I'm currently working on the fix.

Patrick
Olaf Hering
2010-Jul-28 15:26 UTC
Re: [Xen-devel] xenpaging crashes xen in is_iomem_page()
On Tue, Jul 27, Patrick Colp wrote:

> On 27 July 2010 14:25, Olaf Hering <olaf@aepfle.de> wrote:
> > It's not a HVM guest, I will try that right now.
> Ah, that would probably be why. Just to clarify, it only works for
> 64-bit Xen on HVM guests which are using EPT.

Patrick, it's now working. Thanks for your work. I will play with it.

One question:
Is xenpaging supposed to exit and unlink the paging file if the guest is
shut down?

Olaf
Patrick Colp
2010-Jul-28 16:22 UTC
Re: [Xen-devel] xenpaging crashes xen in is_iomem_page()
> One question:
> Is xenpaging supposed to exit and unlink the paging file if the guest is
> shut down?

Currently it doesn't do this, no. It will just keep running (but doing
nothing). It would probably be useful to have some sort of notification
from Xen to the xenpaging tool (or have the xenpaging tool periodically
check if the guest is still running?) to do as you suggest.

Patrick
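The periodic-check variant mentioned above could be sketched as follows. Everything here is hypothetical: `guest_alive_fn`, `countdown_alive()` and `watch_guest_and_cleanup()` are names invented for this example, and the liveness callback is left abstract because the thread does not say how the tool would query the guest's state.

```c
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/* Callback reporting whether the guest still exists (hypothetical; a real
 * tool would have to ask the toolstack or the hypervisor). */
typedef bool (*guest_alive_fn)(void *ctx);

/* Demo stand-in: reports the guest alive for *ctx more polls. */
static bool countdown_alive(void *ctx)
{
    int *remaining = ctx;
    if (*remaining <= 0)
        return false;
    (*remaining)--;
    return true;
}

/* Poll until the guest is gone or max_polls is reached, then remove the
 * paging file. Returns 0 if the file was unlinked, 1 if the guest was
 * still alive when polling stopped, -1 if unlink() failed. */
static int watch_guest_and_cleanup(guest_alive_fn alive, void *ctx,
                                   const char *pagefile,
                                   unsigned int poll_secs,
                                   unsigned int max_polls)
{
    for (unsigned int i = 0; i < max_polls; i++) {
        if (!alive(ctx)) {
            if (unlink(pagefile) != 0) {
                perror("unlink");
                return -1;
            }
            return 0;
        }
        sleep(poll_secs);
    }
    return 1;
}
```

Polling is the simpler of the two options raised here. A notification from Xen would avoid the delay between guest shutdown and cleanup, at the cost of a new event path.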
Olaf Hering
2010-Aug-06 11:16 UTC
Re: [Xen-devel] xenpaging crashes xen in is_iomem_page()
On Tue, Jul 27, Patrick Colp wrote:

> I pulled the latest version of Xen (revision 21859) and ran it on my
> EPT box and then ran xenpaging and it's working fine for me.

Patrick,

after playing a bit more with xenpaging, it's not working for me.

What's your environment?

I have a plain SLES11 SP1 x86_64 installation on a Xeon X5550 box.
The client is also a plain SLES11 SP1 x86_64.
Once I boot the client and start xenpaging with 256mb, the client gets
lots of SIGBUS or SIGSEGV.
Xen prints 'Iomem mapping not permitted ffffffffff (domain 1)' in
grant_table.c:__gnttab_map_grant_ref()

Olaf
Patrick Colp
2010-Aug-09 13:30 UTC
Re: [Xen-devel] xenpaging crashes xen in is_iomem_page()
> after playing a bit more with xenpaging, it's not working for me.
>
> What's your environment?
>
> I have a plain SLES11 SP1 x86_64 installation on a Xeon X5550 box.
> The client is also a plain SLES11 SP1 x86_64.
> Once I boot the client and start xenpaging with 256mb, the client gets
> lots of SIGBUS or SIGSEGV.
> Xen prints 'Iomem mapping not permitted ffffffffff (domain 1)' in
> grant_table.c:__gnttab_map_grant_ref()

Hi Olaf,

Thanks for the info. This sounds like an issue with PV drivers.
Unfortunately I haven't been able to properly vet the PV driver stuff
yet, but I'll try to get on it as soon as I can.

Patrick
Olaf Hering
2010-Aug-09 17:39 UTC
Re: [Xen-devel] xenpaging crashes xen in is_iomem_page()
On Mon, Aug 09, Patrick Colp wrote:

> Hi Olaf,
>
> Thanks for the info. This sounds like an issue with PV drivers.
> Unfortunately I haven't been able to properly vet the PV driver stuff
> yet, but I'll try to get on it as soon as I can.

Patrick,

in xenpaging.c:main() there is that loop which evicts pages before it
enters the while(1) loop. Since that loop will take some time before it
proceeds and receives events from xen, what will happen with page-in
requests during the initial evict loop? Are they queued up, or could
they cause all the errors during client bootup?

I tried to move the initial evict_victim() calls into the while(1) loop.
If there is no event from xc_wait_for_event_or_timeout(), fill &victims
one by one.

My attempt looks basically like shown below.
Unfortunately, it crashes xen itself in odd ways. I will look at this
route further tomorrow.

Also, what is the best time to run xenpaging? If it's called very early,
right after xm start <domainname>, it will cause a crash in the client
when the kernel is still initializing itself. Should there be some kind
of event from the guest when it's ready? Early boot scripts could create
such an event.

Olaf

--- xen-unstable.hg-4.1.21925.orig/tools/xenpaging/xenpaging.c
+++ xen-unstable.hg-4.1.21925/tools/xenpaging/xenpaging.c
@@ -474,6 +476,7 @@ int main(int argc, char *argv[])
 {
     domid_t domain_id;
     int num_pages;
+    int evicted_pages;
     xenpaging_t *paging;
     xenpaging_victim_t *victims;
     mem_event_request_t req;
@@ -496,6 +499,7 @@ int main(int argc, char *argv[])
 
     domain_id = atoi(argv[1]);
     num_pages = atoi(argv[2]);
+    evicted_pages = 0;
 
     victims = calloc(num_pages, sizeof(xenpaging_victim_t));
 
@@ -521,20 +525,24 @@ int main(int argc, char *argv[])
 
     /* Evict pages */
     memset(victims, 0, sizeof(xenpaging_victim_t) * num_pages);
+#if 0
     for ( i = 0; i < num_pages; i++ )
     {
+        fprintf(stderr, "%s(%u) page %d\n",__func__,__LINE__, i);
         evict_victim(xch, paging, domain_id, &victims[i], fd, i);
         if ( i % 100 == 0 )
             DPRINTF("%d pages evicted\n", i);
     }
     DPRINTF("pages evicted\n");
+#endif
 
     /* Swap pages in and out */
     while ( 1 )
     {
         /* Wait for Xen to signal that a page needs paged in */
         rc = xc_wait_for_event_or_timeout(xch, paging->mem_event.xce_handle, 100);
+        fprintf(stderr, "%s(%u) rc %d\n",__func__,__LINE__, rc);
         if ( rc < -1 )
         {
             ERROR("Error getting event");
@@ -621,6 +634,11 @@ int main(int argc, char *argv[])
                 }
             }
         }
+        if (evicted_pages < num_pages) {
+            evict_victim(xch, paging, domain_id, &victims[evicted_pages], fd, evicted_pages);
+            evicted_pages++;
+            fprintf(stderr, "%s(%u) evicted_pages %d\n",__func__,__LINE__, evicted_pages);
+        }
     }
 
  out:
Patrick Colp
2010-Aug-09 18:23 UTC
Re: [Xen-devel] xenpaging crashes xen in is_iomem_page()
On 9 August 2010 13:39, Olaf Hering <olaf@aepfle.de> wrote:
> Patrick,
>
> in xenpaging.c:main() there is that loop which evicts pages before it
> enters the while(1) loop. Since that loop will take some time before it
> proceeds and receives events from xen, what will happen with page-in
> requests during the initial evict loop? Are they queued up, or could
> they cause all the errors during client bootup?

There is a shared ring for xenpaging between the tool and Xen. Paging
requests are placed in this ring and if it's full, the guest is paused.
However, the VCPU that accesses a paged out page is paused as well until
the request is satisfied. Therefore, there should be no issue with the
evict loop being first. If the guest accesses a paged out page while the
eviction process is taking place, it is paused until all the pages are
evicted and the tool handles the request and pages it back in.

> I tried to move the initial evict_victim() calls into the while(1) loop.
> If there is no event from xc_wait_for_event_or_timeout(), fill &victims
> one by one.
>
> My attempt looks basically like shown below.
> Unfortunately, it crashes xen itself in odd ways. I will look at this
> route further tomorrow.

It's not immediately clear to me why your change wouldn't work.

> Also, what is the best time to run xenpaging? If it's called very early,
> right after xm start <domainname>, it will cause a crash in the client
> when the kernel is still initializing itself. Should there be some kind
> of event from the guest when it's ready? Early boot scripts could create
> such an event.

It should be possible to run xenpaging pretty well right from starting
up a guest. I believe there is no realmode support, so if the guest
starts in realmode then xenpaging couldn't be initiated until after
that.

Patrick
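The ring behaviour Patrick describes can be pictured as a small bounded queue. The sketch below is an illustration only: the struct layout, `RING_SLOTS`, `ring_put()` and `ring_get()` are invented here and do not reflect the actual shared-ring ABI between Xen and the tool.

```c
#include <stdbool.h>
#include <stddef.h>

#define RING_SLOTS 8 /* arbitrary size for the sketch */

struct page_request {
    unsigned long gfn; /* guest frame that was touched while paged out */
    unsigned int vcpu; /* VCPU that stays paused until the page is back */
};

struct request_ring {
    struct page_request slot[RING_SLOTS];
    size_t head;  /* next slot to consume */
    size_t count; /* requests currently queued */
};

/* Producer side (Xen in the real design). Returns false when the ring is
 * full; the caller would then pause the guest and retry later. */
static bool ring_put(struct request_ring *r, struct page_request req)
{
    if (r->count == RING_SLOTS)
        return false;
    r->slot[(r->head + r->count) % RING_SLOTS] = req;
    r->count++;
    return true;
}

/* Consumer side (the paging tool). Returns false when nothing is queued. */
static bool ring_get(struct request_ring *r, struct page_request *out)
{
    if (r->count == 0)
        return false;
    *out = r->slot[r->head];
    r->head = (r->head + 1) % RING_SLOTS;
    r->count--;
    return true;
}
```

Requests made while the tool is still busy evicting simply sit in the queue in arrival order, which is why the position of the evict loop should not matter as long as every faulting VCPU is paused.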
Olaf Hering
2010-Aug-10 14:19 UTC
Re: [Xen-devel] xenpaging crashes xen in is_iomem_page()
On Mon, Aug 09, Patrick Colp wrote:

> > I tried to move the initial evict_victim() calls into the while(1) loop.
> > If there is no event from xc_wait_for_event_or_timeout(), fill &victims
> > one by one.
> >
> > My attempt looks basically like shown below.
> > Unfortunately, it crashes xen itself in odd ways. I will look at this
> > route further tomorrow.
>
> It's not immediately clear to me why your change wouldn't work.

Patrick,

there is something weird going on.
Today I was able to boot the client successfully with my change. Still I
got a few 'grant_table.c:583:d0 Iomem mapping not permitted ffffffffff
(domain 1)' lines.
After some tries I found that /usr/bin/free in the client gives an IO
Error when I try to run it. The same happened with
cat /usr/bin/free > /dev/null
While doing that, I saw the Iomem error above. The gfn happened to be
3aba9. I searched for that in my xenpaging debug output. There was a
page-out of gfn 3aba9, but no page-in request.

So it seems that gfn lost its "state" somehow.

Another thing:
Now that xenpaging does the page-out process in a slow way, it will take
a lot more time to finish 65K pages. I did an 'init 0' while it was still
in the middle of the process of filling &victims. This shutdown killed
xen itself. (ept_get_entry lines come from my own dbg printk, just there
to check where the 0xffffffffff is coming from.)
--- xen-unstable.hg-4.1.21925.orig/xen/arch/x86/mm/hap/p2m-ept.c
+++ xen-unstable.hg-4.1.21925/xen/arch/x86/mm/hap/p2m-ept.c
@@ -488,8 +488,11 @@ static mfn_t ept_get_entry(struct domain
 
     if ( ept_entry->avail1 != p2m_invalid )
     {
+        ept_entry_t **__p = (ept_entry_t **)ept_entry;
         *t = ept_entry->avail1;
         mfn = _mfn(ept_entry->mfn);
+        if ((mfn_x(mfn) & 0xffffffffffUL) == 0xffffffffffUL)
+            printk("%s:%s(%u) %lx %p mp %lx gfn %lx\n",__FILE__,__func__,__LINE__,mfn_x(mfn), *__p, max_page, gfn);
         if ( i )
         {
             /*

(XEN) p2m-ept.c:ept_get_entry(495) ffffffffff 000ffffffffffc00 mp 140000 gfn 135a
(XEN) mem_event.c:195:d0 Ignoring memory paging op on dying domain 1
(XEN) p2m-ept.c:ept_get_entry(495) ffffffffff 000ffffffffffa00 mp 140000 gfn a7c2
(XEN) p2m-ept.c:ept_get_entry(495) ffffffffff 000ffffffffffa00 mp 140000 gfn a7c2
(XEN) Assertion '(((lport) >= 0) && ((lport) < ((((ld)->arch.has_32bit_shinfo) ? 32 : 64) * (((ld)->arch.has_32bit_shinfo) ? 32 : 64))) && (((ld)->evtchn[(lport)/128]) != ((void*)0)))' failed at event_channel.c:1033
(XEN) Debugging connection not set up.
(XEN) ----[ Xen-4.1.21925-20100810.075543 x86_64 debug=y Not tainted ]----
(XEN) CPU: 3
(XEN) RIP: e008:[<ffff82c480105fed>] notify_via_xen_event_channel+0x43/0xfb
(XEN) RFLAGS: 0000000000010246 CONTEXT: hypervisor
(XEN) rax: 0000000000000000 rbx: 0000000000000007 rcx: 0000000000000000
(XEN) rdx: 0000000000000040 rsi: 0000000000000007 rdi: ffff830138370194
(XEN) rbp: ffff83013febfc88 rsp: ffff83013febfc68 r8: 0000000000000000
(XEN) r9: ffff82c48020aee0 r10: 00000000fffffff9 r11: 0000000000000004
(XEN) r12: ffff830138370000 r13: ffff830138370190 r14: 000000000000a7c2
(XEN) r15: 000000000012f977 cr0: 0000000080050033 cr4: 00000000000026f0
(XEN) cr3: 000000012fb44000 cr2: ffff8800e948fe98
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
(XEN) Xen stack trace from rsp=ffff83013febfc68:
(XEN) 0000000000000282 ffff830138370000 ffff83013febfcd8 ffff830138371548
(XEN) ffff83013febfcb8 ffff82c4801cef11 ffff830138370000 ffff83013febff18
(XEN) ffff830138370000 ffff8300bf752000 ffff83013febfd18 ffff82c4801cd070
(XEN) 000000000000a7c2 0000000a00000003 000000000000a7c2 0000000000000000
(XEN) 000000030000000a 0000000000000000 000000000000a7c2 0000000000000000
(XEN) 0000000000000000 0000000000000000 ffff83013febfef8 ffff82c48016c18b
(XEN) ffff82c480153f82 ffff83013febfd70 ffff82c480151176 ffff83013febff18
(XEN) ffff83013febff18 ffff83013febff18 ffff83013febff18 ffff83013febff18
(XEN) ffff83013febff18 ffff83013febff18 ffff83013febff18 ffff83013febff18
(XEN) ffff83013febff18 ffff83013febfde0 0000000000000286 ffff83013febfe00
(XEN) 00000195a8185d6b 0000000000000286 ffff8300bf752030 0000000000000000
(XEN) 0000000000000000 0000000100000001 0000000000000000 ffff83013febfe10
(XEN) 00000000bf752000 ffff83012f977e98 ffff82f6025f2ee0 ffff83013cf50000
(XEN) ffff830138370000 ffff8300bf752000 ffff8800f271d000 ffff83013febfe40
(XEN) 00000195a7fb7184 ffff82c480122617 0000000000000000 800000000a7c2627
(XEN) ffff83013febfe68 ffff82c48014bcc4 ffff83013febfe68 ffff82c4801615d2
(XEN) ffff83013febff18 ffff8300bf752000 0000000000000001 0000000000000000
(XEN) ffff83013febfef8 ffff82c4802033c0 00007f20d9bd3000 0000000000000206
(XEN) 0000000a800073f0 0000000000000001 000000012f977e98 800000000a7c2627
(XEN) ffff83013febfed8 ffff8300bf752000 8000000000000427 0000000000000000
(XEN) Xen call trace:
(XEN) [<ffff82c480105fed>] notify_via_xen_event_channel+0x43/0xfb
(XEN) [<ffff82c4801cef11>] mem_event_put_request+0x99/0xa7
(XEN) [<ffff82c4801cd070>] p2m_mem_paging_populate+0x230/0x242
(XEN) [<ffff82c48016c18b>] do_mmu_update+0x696/0x1839
(XEN) [<ffff82c4801fe1e2>] syscall_enter+0xf2/0x14c
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 3:
(XEN) Assertion '(((lport) >= 0) && ((lport) < ((((ld)->arch.has_32bit_shinfo) ? 32 : 64) * (((ld)->arch.has_32bit_shinfo) ? 32 : 64)****************************************
(XEN)
(XEN) Reboot in five seconds...
(XEN) Debugging connection not set up.
(XEN) Resetting with ACPI MEMORY or I/O RESET_REG.

Olaf
Patrick Colp
2010-Aug-10 15:02 UTC
Re: [Xen-devel] xenpaging crashes xen in is_iomem_page()
On 10 August 2010 10:19, Olaf Hering <olaf@aepfle.de> wrote:
> On Mon, Aug 09, Patrick Colp wrote:
>
>> > I tried to move the initial evict_victim() calls into the while(1) loop.
>> > If there is no event from xc_wait_for_event_or_timeout(), fill &victims
>> > one by one.
>> >
>> > My attempt looks basically like shown below.
>> > Unfortunately, it crashes xen itself in odd ways. I will look at this
>> > route further tomorrow.
>>
>> It's not immediately clear to me why your change wouldn't work.
>
> Patrick,
>
> there is something weird going on.
> Today I was able to boot the client successfully with my change. Still I
> got a few 'grant_table.c:583:d0 Iomem mapping not permitted ffffffffff
> (domain 1)' lines.

This sounds like it's trying to grant pages which have been paged out
(since paged out pages change their p2m mapping to MFN_INVALID which
is 0xffffffff).

> After some tries I found that /usr/bin/free in the client gives IO Error
> when I tried to run it. The same happened with cat /usr/bin/free > /dev/null
> While doing that, I saw that Iomem error above. The gfn happened to be
> 3aba9. I searched that in my xenpaging debug output. There was a
> page-out of gfn 3aba9, but no page-in request.
>
> So it seems that gfn lost its "state" somehow.

I think this means there's a fault path that isn't caught by xenpaging
(again, my guess here would be with the grant table stuff).

> Another thing:
> Now that xenpaging does the page-out process in a slow way, it will take
> a lot more time to finish 65K pages. I did a 'init 0' while it was still
> in the middle of the process of filling &victims. This shutdown killed
> xen itself. (ept_get_entry lines come from my own dbg printk, just there
> to check where the 0xffffffffff is coming from.)
>
> --- xen-unstable.hg-4.1.21925.orig/xen/arch/x86/mm/hap/p2m-ept.c
> +++ xen-unstable.hg-4.1.21925/xen/arch/x86/mm/hap/p2m-ept.c
> @@ -488,8 +488,11 @@ static mfn_t ept_get_entry(struct domain
>
>      if ( ept_entry->avail1 != p2m_invalid )
>      {
> +        ept_entry_t **__p = (ept_entry_t **)ept_entry;
>          *t = ept_entry->avail1;
>          mfn = _mfn(ept_entry->mfn);
> +        if ((mfn_x(mfn) & 0xffffffffffUL) == 0xffffffffffUL)
> +            printk("%s:%s(%u) %lx %p mp %lx gfn %lx\n",__FILE__,__func__,__LINE__,mfn_x(mfn), *__p, max_page, gfn);
>          if ( i )
>          {
>              /*
>
>
> (XEN) p2m-ept.c:ept_get_entry(495) ffffffffff 000ffffffffffc00 mp 140000 gfn 135a
> (XEN) mem_event.c:195:d0 Ignoring memory paging op on dying domain 1
> (XEN) p2m-ept.c:ept_get_entry(495) ffffffffff 000ffffffffffa00 mp 140000 gfn a7c2
> (XEN) p2m-ept.c:ept_get_entry(495) ffffffffff 000ffffffffffa00 mp 140000 gfn a7c2
> (XEN) Assertion '(((lport) >= 0) && ((lport) < ((((ld)->arch.has_32bit_shinfo) ? 32 : 64) * (((ld)->arch.has_32bit_shinfo) ? 32 : 64))) && (((ld)->evtchn[(lport)/128]) != ((void*)0)))' failed at event_channel.c:1033
> (XEN) Debugging connection not set up.
> (XEN) ----[ Xen-4.1.21925-20100810.075543 x86_64 debug=y Not tainted ]----
> (XEN) CPU: 3
> (XEN) RIP: e008:[<ffff82c480105fed>] notify_via_xen_event_channel+0x43/0xfb
> (XEN) RFLAGS: 0000000000010246 CONTEXT: hypervisor
> (XEN) rax: 0000000000000000 rbx: 0000000000000007 rcx: 0000000000000000
> (XEN) rdx: 0000000000000040 rsi: 0000000000000007 rdi: ffff830138370194
> (XEN) rbp: ffff83013febfc88 rsp: ffff83013febfc68 r8: 0000000000000000
> (XEN) r9: ffff82c48020aee0 r10: 00000000fffffff9 r11: 0000000000000004
> (XEN) r12: ffff830138370000 r13: ffff830138370190 r14: 000000000000a7c2
> (XEN) r15: 000000000012f977 cr0: 0000000080050033 cr4: 00000000000026f0
> (XEN) cr3: 000000012fb44000 cr2: ffff8800e948fe98
> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
> (XEN) Xen stack trace from rsp=ffff83013febfc68:
> (XEN) 0000000000000282 ffff830138370000 ffff83013febfcd8 ffff830138371548
> (XEN) ffff83013febfcb8 ffff82c4801cef11 ffff830138370000 ffff83013febff18
> (XEN) ffff830138370000 ffff8300bf752000 ffff83013febfd18 ffff82c4801cd070
> (XEN) 000000000000a7c2 0000000a00000003 000000000000a7c2 0000000000000000
> (XEN) 000000030000000a 0000000000000000 000000000000a7c2 0000000000000000
> (XEN) 0000000000000000 0000000000000000 ffff83013febfef8 ffff82c48016c18b
> (XEN) ffff82c480153f82 ffff83013febfd70 ffff82c480151176 ffff83013febff18
> (XEN) ffff83013febff18 ffff83013febff18 ffff83013febff18 ffff83013febff18
> (XEN) ffff83013febff18 ffff83013febff18 ffff83013febff18 ffff83013febff18
> (XEN) ffff83013febff18 ffff83013febfde0 0000000000000286 ffff83013febfe00
> (XEN) 00000195a8185d6b 0000000000000286 ffff8300bf752030 0000000000000000
> (XEN) 0000000000000000 0000000100000001 0000000000000000 ffff83013febfe10
> (XEN) 00000000bf752000 ffff83012f977e98 ffff82f6025f2ee0 ffff83013cf50000
> (XEN) ffff830138370000 ffff8300bf752000 ffff8800f271d000 ffff83013febfe40
> (XEN) 00000195a7fb7184 ffff82c480122617 0000000000000000 800000000a7c2627
> (XEN) ffff83013febfe68 ffff82c48014bcc4 ffff83013febfe68 ffff82c4801615d2
> (XEN) ffff83013febff18 ffff8300bf752000 0000000000000001 0000000000000000
> (XEN) ffff83013febfef8 ffff82c4802033c0 00007f20d9bd3000 0000000000000206
> (XEN) 0000000a800073f0 0000000000000001 000000012f977e98 800000000a7c2627
> (XEN) ffff83013febfed8 ffff8300bf752000 8000000000000427 0000000000000000
> (XEN) Xen call trace:
> (XEN) [<ffff82c480105fed>] notify_via_xen_event_channel+0x43/0xfb
> (XEN) [<ffff82c4801cef11>] mem_event_put_request+0x99/0xa7
> (XEN) [<ffff82c4801cd070>] p2m_mem_paging_populate+0x230/0x242
> (XEN) [<ffff82c48016c18b>] do_mmu_update+0x696/0x1839
> (XEN) [<ffff82c4801fe1e2>] syscall_enter+0xf2/0x14c
> (XEN)
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 3:
> (XEN) Assertion '(((lport) >= 0) && ((lport) < ((((ld)->arch.has_32bit_shinfo) ? 32 : 64) * (((ld)->arch.has_32bit_shinfo) ? 32 : 64)****************************************
> (XEN)
> (XEN) Reboot in five seconds...
> (XEN) Debugging connection not set up.
> (XEN) Resetting with ACPI MEMORY or I/O RESET_REG.

This crash is caused by something in dom0 playing around with the
guest's memory. My guess here is that the guest has shut down enough to
destroy its event channels. I'm not entirely sure who the culprit is
here. It seems like the xenpaging daemon tried to page something in at
some point, but was denied by Xen since the guest was shutting down. So
I would hazard that the PV drivers are again the culprit (as I've not
encountered this error before either). I suppose it could be a result
of evicting slowly instead of up-front.

I'll need to get my hands on SLES or PV drivers so I can fix the grant
table stuff (I had it working before, but that was before the new grant
table v2 stuff).

Patrick
Olaf Hering
2010-Aug-10 15:06 UTC
Re: [Xen-devel] xenpaging crashes xen in is_iomem_page()
On Tue, Aug 10, Patrick Colp wrote:
> On 10 August 2010 10:19, Olaf Hering <olaf@aepfle.de> wrote:
> > On Mon, Aug 09, Patrick Colp wrote:
> >
> >> > I tried to move the initial evict_victim() calls into the while(1) loop.
> >> > If there is no event from xc_wait_for_event_or_timeout(), fill &victims
> >> > one by one.
> >> >
> >> > My attempt looks basically like shown below.
> >> > Unfortunately, it crashes xen itself in odd ways. I will look at this
> >> > route further tomorrow.
> >>
> >> It's not immediately clear to me why your change wouldn't work.
> >
> > Patrick,
> >
> > there is something weird going on.
> > Today I was able to boot the client successfully with my change. Still I
> > got a few 'grant_table.c:583:d0 Iomem mapping not permitted ffffffffff
> > (domain 1)' lines.
>
> This sounds like it's trying to grant pages which have been paged out
> (since paged out pages change their p2m mapping to MFN_INVALID which
> is 0xffffffff).

The value is ffffffffff, which is 40 bits, not 32 bits. It comes from
ept_entry_t->mfn.

Olaf