Hi, Jan,

There is a bug when booting 3 guests in no-EPT (shadow paging) mode: when Xen
handles a guest page fault it walks the guest's page tables for the shadow
code, fails to get the l4e from the top-level table, and then triggers a
fatal page fault and panics Xen.

I looked at this issue and found the bug was introduced by this patch:

changeset 26523:fd997a96d448
x86: debugging code for testing 16Tb support on smaller memory systems

I'm not entirely clear what the reason is for the modification (#ifdef
NDEBUG) in xen/arch/x86/domain_page.c in this patch. Removing the macro
guard does fix the fatal page fault in shadow paging mode - can it simply be
removed? Or could you take a look, since you are most familiar with this
code?

--- a/xen/arch/x86/domain_page.c	Fri Feb 08 11:06:04 2013 +0100
+++ b/xen/arch/x86/domain_page.c	Fri Mar 29 14:21:02 2013 +0800
@@ -422,10 +422,8 @@ void *map_domain_page_global(unsigned lo
 
     ASSERT(!in_irq() && local_irq_is_enabled());
 
-#ifdef NDEBUG
     if ( mfn <= PFN_DOWN(__pa(HYPERVISOR_VIRT_END - 1)) )
         return mfn_to_virt(mfn);
-#endif
 
     spin_lock(&globalmap_lock);

Best Regards,
Xudong Hao
>>> On 29.03.13 at 07:39, "Hao, Xudong" <xudong.hao@intel.com> wrote:
> There is a bug when booting 3 guests in no-EPT (shadow paging) mode: when
> Xen handles a guest page fault it walks the guest's page tables for the
> shadow code, fails to get the l4e from the top-level table, and then
> triggers a fatal page fault and panics Xen.
>
> I looked at this issue and found the bug was introduced by this patch:
> changeset 26523:fd997a96d448
> x86: debugging code for testing 16Tb support on smaller memory systems
>
> I'm not entirely clear what the reason is for the modification (#ifdef
> NDEBUG) in xen/arch/x86/domain_page.c in this patch.

The point is to make sure the domain page mapping code actually gets
tested. The shortcut is a performance optimization.

> Removing the macro guard does fix the fatal page fault in shadow paging
> mode - can it simply be removed? Or could you take a look, since you are
> most familiar with this code?

While it could be removed, this would be a fix only for the problem you're
seeing - you'd just later see someone else run into it on a system with
more than 5Tb.

Hence we need you to provide technical details of the crash you observed
(register and stack dump as well as call trace, plus any details on the
specific guest you're running that distinguishes it from other guests that
don't exhibit this problem).

Jan
> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Tuesday, April 02, 2013 3:38 PM
> To: Hao, Xudong
> Cc: xen-devel (xen-devel@lists.xen.org)
> Subject: Re: Bug on shadow page mode
>
> >>> On 29.03.13 at 07:39, "Hao, Xudong" <xudong.hao@intel.com> wrote:
> > There is a bug when booting 3 guests in no-EPT (shadow paging) mode:
> > when Xen handles a guest page fault it walks the guest's page tables
> > for the shadow code, fails to get the l4e from the top-level table,
> > and then triggers a fatal page fault and panics Xen.
> >
> > I looked at this issue and found the bug was introduced by this patch:
> > changeset 26523:fd997a96d448
> > x86: debugging code for testing 16Tb support on smaller memory systems
> >
> > I'm not entirely clear what the reason is for the modification (#ifdef
> > NDEBUG) in xen/arch/x86/domain_page.c in this patch.
>
> The point is to make sure the domain page mapping code actually
> gets tested. The shortcut is a performance optimization.

So on a smaller memory system the original code would take the shortcut.
Now, with NDEBUG not defined, it skips the early mfn_to_virt(mfn) return
and runs the code below it; our case panics Xen under this condition.

We have a system with 2G of physical memory, running 3 RHEL6u3 guests in
shadow paging mode. Each guest is allocated 400MB of memory and dom0 is
given 512M. While the 3rd guest is booting, Xen panics while walking the
shadow page tables.

> > Removing the macro guard does fix the fatal page fault in shadow
> > paging mode - can it simply be removed? Or could you take a look,
> > since you are most familiar with this code?
>
> While it could be removed, this would be a fix only for the problem
> you're seeing - you'd just later see someone else run into it on a
> system with more than 5Tb.
>
> Hence we need you to provide technical details of the crash you
> observed (register and stack dump as well as call trace, plus any
> details on the specific guest you're running that distinguishes it
> from other guests that don't exhibit this problem).
(XEN) ----[ Xen-4.3-unstable  x86_64  debug=y  Tainted:    C ]----
(XEN) CPU:    4
(XEN) RIP:    e008:[<ffff82c4c01e637f>] guest_walk_tables_4_levels+0x135/0x6a6
(XEN) RFLAGS: 0000000000010286   CONTEXT: hypervisor
(XEN) rax: 0000000000000001   rbx: ffff83007f087c98   rcx: 0000000000000005
(XEN) rdx: 0000000000000002   rsi: 0000000000800000   rdi: ffff83007f087ce0
(XEN) rbp: ffff83007f087ab8   rsp: ffff83007f087a38   r8:  0000000000000004
(XEN) r9:  000000000002b650   r10: 0000000000000022   r11: 0000000000000206
(XEN) r12: ffff830047374000   r13: 0000003a0f388718   r14: ffff82c4c025f90c
(XEN) r15: ffff82c406a00000   cr0: 000000008005003b   cr4: 00000000000426f0
(XEN) cr3: 000000004767e000   cr2: ffff82c406a00000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
(XEN) Xen stack trace from rsp=ffff83007f087a38:
(XEN)    000000007f087a88 ffff8300477df000 800000003e604025 000000007f087b10
(XEN)    ffff83007f087ac8 ffff8300477c4820 0000000528dd40f8 0080000000000004
(XEN)    ffff830028dd4ff8 ffff830000000000 ffff83007f087b48 ffff830047374000
(XEN)    ffff8300477df000 ffff83007f087f18 0000000000000000 000000000000000e
(XEN)    ffff83007f087d18 ffff82c4c020d8cc ffff82c406a00000 ffff83007f087b18
(XEN)    ffff83007f080000 ffff82c4c0311cc8 0000000000000c40 ffff83007f080000
(XEN)    ffff82c4c0311cc8 00000000000003c8 0000000000000740 0000000000000000
(XEN)    0000000000000001 0000000000028dd4 ffff8300477dfb58 0000000000000108
(XEN)    ffff8300477dfb58 ffff83007f080000 ffff82c4c0311cc8 ffff82c4c0311cd8
(XEN)    0000000003a0f388 ffff82c4c011759c ffff8300477dfae8 ffff830047374980
(XEN)    ffff8300477df000 000000000000652f 0000003a0f388718 000000000002b650
(XEN)    ffff83007f087c18 ffff82c4c01ef4d7 ffff83007f087be0 ffff83007f080000
(XEN)    ffff83007f087c18 0000000100000100 ffff83007f087c18 ffff830047374000
(XEN)    0000000000006a00 000000370002b650 000000000002b650 ffff830047374000
(XEN)    ffff830047374980 0000000000000000 ffff83007f087c18 ffff82c4c0125e3d
(XEN)    ffff83007f087c78 ffff82c4c021017d 0c00000000000000 ffff8300477df000
(XEN)    ffff82e0008e8640 ffff830047374000 0000000013650000 0000000013650000
(XEN)    0000000000000000 0000000000000000 ffff83007f087c78 ffff82c4c01ae218
(XEN)    ffff83007f087cc8 ffff82c4c01b4abc 0000000028dd4027 ffff82c4c0207f86
(XEN)    0000003a0f388718 0000000000000000 0000000000000000 0000000000000000
(XEN) Xen call trace:
(XEN)    [<ffff82c4c01e637f>] guest_walk_tables_4_levels+0x135/0x6a6
(XEN)    [<ffff82c4c020d8cc>] sh_page_fault__guest_4+0x505/0x2015
(XEN)    [<ffff82c4c01d2135>] vmx_vmexit_handler+0x86c/0x1748
(XEN)
(XEN) Pagetable walk from ffff82c406a00000:
(XEN)  L4[0x105] = 000000007f26e063 ffffffffffffffff
(XEN)  L3[0x110] = 000000005ce30063 ffffffffffffffff
(XEN)  L2[0x035] = 0000000014aab063 ffffffffffffffff
(XEN)  L1[0x000] = 0000000000000000 ffffffffffffffff
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 4:
(XEN) FATAL PAGE FAULT
(XEN) [error_code=0000]
(XEN) Faulting linear address: ffff82c406a00000
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...
(XEN) Resetting with ACPI MEMORY or I/O RESET_REG
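For context, a guest configuration matching the repro described above might look roughly like this (an illustrative xl/xm config sketch - the name, disk path, and builder line are assumptions; the 400MB memory size and disabled HAP reflect the setup Xudong describes, with three such guests started on a 2G host booted with dom0_mem=512M):

```
# Illustrative repro config (values marked above as assumptions):
name    = "rhel6u3-shadow-1"   # also -2 and -3
builder = "hvm"
memory  = 400                  # MB per guest, as in the repro
vcpus   = 1
hap     = 0                    # no EPT/NPT -> guest runs on shadow page tables
disk    = [ 'file:/path/to/rhel6u3.img,hda,w' ]   # path is illustrative
```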
>>> On 02.04.13 at 10:40, "Hao, Xudong" <xudong.hao@intel.com> wrote:
> (XEN) ----[ Xen-4.3-unstable  x86_64  debug=y  Tainted:    C ]----
> [...]
> (XEN) Xen call trace:
> (XEN)    [<ffff82c4c01e637f>] guest_walk_tables_4_levels+0x135/0x6a6
> (XEN)    [<ffff82c4c020d8cc>] sh_page_fault__guest_4+0x505/0x2015
> (XEN)    [<ffff82c4c01d2135>] vmx_vmexit_handler+0x86c/0x1748
> (XEN)
> (XEN) Pagetable walk from ffff82c406a00000:
> (XEN)  L4[0x105] = 000000007f26e063 ffffffffffffffff
> (XEN)  L3[0x110] = 000000005ce30063 ffffffffffffffff
> (XEN)  L2[0x035] = 0000000014aab063 ffffffffffffffff
> (XEN)  L1[0x000] = 0000000000000000 ffffffffffffffff

Tim,

I'm afraid this is something for you. From what I can tell, despite
sh_walk_guest_tables() being called from sh_page_fault() without
the paging lock held, there doesn't appear to be a way for this to
race sh_update_cr3(). And with the way the latter updates
guest_vtable, the only way for a page fault to happen upon use
of that cached mapping would be between the call to
sh_unmap_domain_page_global() and the immediately following
one to sh_map_domain_page_global() (i.e. while the pointer is
stale).

What I do note is

    /* PAGING_LEVELS==4 implies 64-bit, which means that
     * map_domain_page_global can't fail */
    BUG_ON(v->arch.paging.shadow.guest_vtable == NULL);

which is no longer true. Sadly the 2-level paging case also
doesn't really handle the similar error there, so it's not really
clear to me how this would best be fixed. And that's not the
reason for the problem here anyway.
At 12:50 +0100 on 02 Apr (1364907054), Jan Beulich wrote:
> [...]
>
> Tim,
>
> I'm afraid this is something for you. From what I can tell, despite
> sh_walk_guest_tables() being called from sh_page_fault() without
> the paging lock held, there doesn't appear to be a way for this to
> race sh_update_cr3(). And with the way the latter updates
> guest_vtable, the only way for a page fault to happen upon use
> of that cached mapping would be between the call to
> sh_unmap_domain_page_global() and the immediately following
> one to sh_map_domain_page_global() (i.e. while the pointer is
> stale).

I'll have a look at it on Thursday; swapping the map and the unmap
should be trivial, anyway. Is this bug easily reproducible, or was it
only hit once? I'd expect a race like this to be nigh impossible,
especially considering that 32-bit Xen had the same code for years.

> What I do note is
>
>     /* PAGING_LEVELS==4 implies 64-bit, which means that
>      * map_domain_page_global can't fail */
>     BUG_ON(v->arch.paging.shadow.guest_vtable == NULL);
>
> which is no longer true. Sadly the 2-level paging case also
> doesn't really handle the similar error there, so it's not really
> clear to me how this would best be fixed. And that's not the
> reason for the problem here anyway.

I'll look at that too -- it may be that we can avoid the _global() map
altogether. HAP seems to manage without it, but it has far fewer
lookups. Maybe I could add a per-vcpu fixmap for it, which would cover
most cases (i.e. local lookups).

Cheers,

Tim.
>>> On 02.04.13 at 18:45, Tim Deegan <tim@xen.org> wrote:
> Is this bug easily reproducible, or was it only hit once? I'd expect a
> race like this to be nigh impossible, especially considering that
> 32-bit Xen had the same code for years.

I understood that this is reproducible - Xudong?

Jan
> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@suse.com]
> Sent: Wednesday, April 03, 2013 3:34 PM
> To: Hao, Xudong; Tim Deegan
> Cc: xen-devel (xen-devel@lists.xen.org)
> Subject: Re: [Xen-devel] Bug on shadow page mode
>
> >>> On 02.04.13 at 18:45, Tim Deegan <tim@xen.org> wrote:
> > Is this bug easily reproducible, or was it only hit once? I'd expect a
> > race like this to be nigh impossible, especially considering that
> > 32-bit Xen had the same code for years.
>
> I understood that this is reproducible - Xudong?

Yes, it's easy to reproduce with a debug (debug=y) build of Xen.
Hi,

At 12:50 +0100 on 02 Apr (1364907054), Jan Beulich wrote:
> > (XEN) Xen call trace:
> > (XEN)    [<ffff82c4c01e637f>] guest_walk_tables_4_levels+0x135/0x6a6
> > (XEN)    [<ffff82c4c020d8cc>] sh_page_fault__guest_4+0x505/0x2015
> > (XEN)    [<ffff82c4c01d2135>] vmx_vmexit_handler+0x86c/0x1748
> > (XEN)
> > (XEN) Pagetable walk from ffff82c406a00000:
> > (XEN)  L4[0x105] = 000000007f26e063 ffffffffffffffff
> > (XEN)  L3[0x110] = 000000005ce30063 ffffffffffffffff
> > (XEN)  L2[0x035] = 0000000014aab063 ffffffffffffffff
> > (XEN)  L1[0x000] = 0000000000000000 ffffffffffffffff
>
> Tim,
>
> I'm afraid this is something for you. From what I can tell, despite
> sh_walk_guest_tables() being called from sh_page_fault() without
> the paging lock held, there doesn't appear to be a way for this to
> race sh_update_cr3(). And with the way the latter updates
> guest_vtable, the only way for a page fault to happen upon use
> of that cached mapping would be between the call to
> sh_unmap_domain_page_global() and the immediately following
> one to sh_map_domain_page_global() (i.e. while the pointer is
> stale).

Hmmm. So the only way I can see that happening is if some foreign agent
resets the vcpu's state while it's actually running, which AFAICT
shouldn't happen. I looked at reversing the unmap and map, but realised
that won't solve the problem -- in such a race sh_walk_guest_tables()
could cache the old value.

For testing, here's a patch that gets rid of guest_vtable altogether.
I'm not entirely happy with it, partly because it will have a perf hit
on large-RAM machines, and partly because I'm worried that whatever path
caused this crash might cause other problems. But it would be good to
confirm that it stops the crashes.

Cheers,

Tim.

commit 5469971c30eca85f270cbd9bc46f63ec0fd543c0
Author: Tim Deegan <tim@xen.org>
Date:   Thu Apr 4 11:16:28 2013 +0100

    x86/mm/shadow: Don't keep a permanent mapping of the top-level
    guest table.

    This addresses two issues.  First, a crash seen where
    sh_walk_guest_tables() found that guest_vtable was not a valid
    mapping.  Second, the lack of any error handling if
    map_domain_page_global() were to fail.

    Signed-off-by: Tim Deegan <tim@xen.org>

diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
index a593f76..8ca4b9b 100644
--- a/xen/arch/x86/mm/shadow/multi.c
+++ b/xen/arch/x86/mm/shadow/multi.c
@@ -174,15 +174,26 @@ static inline uint32_t
 sh_walk_guest_tables(struct vcpu *v, unsigned long va, walk_t *gw,
                      uint32_t pfec)
 {
-    return guest_walk_tables(v, p2m_get_hostp2m(v->domain), va, gw, pfec,
+    uint32_t rc;
+    mfn_t top_mfn;
+    void *top_map;
+
 #if GUEST_PAGING_LEVELS == 3 /* PAE */
-                             _mfn(INVALID_MFN),
-                             v->arch.paging.shadow.gl3e
+    top_mfn = _mfn(INVALID_MFN);
+    top_map = v->arch.paging.shadow.gl3e;
 #else /* 32 or 64 */
-                             pagetable_get_mfn(v->arch.guest_table),
-                             v->arch.paging.shadow.guest_vtable
+    top_mfn = pagetable_get_mfn(v->arch.guest_table);
+    top_map = sh_map_domain_page(top_mfn);
+#endif
+
+    rc = guest_walk_tables(v, p2m_get_hostp2m(v->domain), va, gw, pfec,
+                           top_mfn, top_map);
+
+#if GUEST_PAGING_LEVELS != 3
+    sh_unmap_domain_page(top_map);
 #endif
-                             );
+
+    return rc;
 }
 
 /* This validation is called with lock held, and after write permission
@@ -219,24 +230,20 @@ shadow_check_gwalk(struct vcpu *v, unsigned long va, walk_t *gw, int version)
      * logic simple.
      */
     perfc_incr(shadow_check_gwalk);
-#if GUEST_PAGING_LEVELS >= 3 /* PAE or 64... */
-#if GUEST_PAGING_LEVELS >= 4 /* 64-bit only... */
-    l4p = (guest_l4e_t *)v->arch.paging.shadow.guest_vtable;
+#if GUEST_PAGING_LEVELS >= 4 /* 64-bit... */
+    l4p = sh_map_domain_page(gw->l4mfn);
     mismatch |= (gw->l4e.l4 != l4p[guest_l4_table_offset(va)].l4);
+    sh_unmap_domain_page(l4p);
     l3p = sh_map_domain_page(gw->l3mfn);
     mismatch |= (gw->l3e.l3 != l3p[guest_l3_table_offset(va)].l3);
     sh_unmap_domain_page(l3p);
-#else
+#elif GUEST_PAGING_LEVELS >= 3 /* PAE... */
     mismatch |= (gw->l3e.l3 !=
                  v->arch.paging.shadow.gl3e[guest_l3_table_offset(va)].l3);
 #endif
     l2p = sh_map_domain_page(gw->l2mfn);
     mismatch |= (gw->l2e.l2 != l2p[guest_l2_table_offset(va)].l2);
     sh_unmap_domain_page(l2p);
-#else
-    l2p = (guest_l2e_t *)v->arch.paging.shadow.guest_vtable;
-    mismatch |= (gw->l2e.l2 != l2p[guest_l2_table_offset(va)].l2);
-#endif
     if ( !(guest_supports_superpages(v) &&
            (guest_l2e_get_flags(gw->l2e) & _PAGE_PSE)) )
     {
@@ -3784,7 +3791,7 @@ sh_update_linear_entries(struct vcpu *v)
 }
 
 
-/* Removes vcpu->arch.paging.shadow.guest_vtable and vcpu->arch.shadow_table[].
+/* Removes vcpu->arch.shadow_table[].
  * Does all appropriate management/bookkeeping/refcounting/etc...
  */
 static void
@@ -3794,24 +3801,6 @@ sh_detach_old_tables(struct vcpu *v)
     int i = 0;
 
     ////
-    //// vcpu->arch.paging.shadow.guest_vtable
-    ////
-
-#if GUEST_PAGING_LEVELS == 3
-    /* PAE guests don't have a mapping of the guest top-level table */
-    ASSERT(v->arch.paging.shadow.guest_vtable == NULL);
-#else
-    if ( v->arch.paging.shadow.guest_vtable )
-    {
-        struct domain *d = v->domain;
-        if ( shadow_mode_external(d) || shadow_mode_translate(d) )
-            sh_unmap_domain_page_global(v->arch.paging.shadow.guest_vtable);
-        v->arch.paging.shadow.guest_vtable = NULL;
-    }
-#endif // !NDEBUG
-
-
-    ////
     //// vcpu->arch.shadow_table[]
     ////
 
@@ -3970,26 +3959,12 @@ sh_update_cr3(struct vcpu *v, int do_locking)
 
 
     ////
-    //// vcpu->arch.paging.shadow.guest_vtable
+    //// vcpu->arch.paging.shadow.gl3e[]
     ////
-#if GUEST_PAGING_LEVELS == 4
-    if ( shadow_mode_external(d) || shadow_mode_translate(d) )
-    {
-        if ( v->arch.paging.shadow.guest_vtable )
-            sh_unmap_domain_page_global(v->arch.paging.shadow.guest_vtable);
-        v->arch.paging.shadow.guest_vtable = sh_map_domain_page_global(gmfn);
-        /* PAGING_LEVELS==4 implies 64-bit, which means that
-         * map_domain_page_global can't fail */
-        BUG_ON(v->arch.paging.shadow.guest_vtable == NULL);
-    }
-    else
-        v->arch.paging.shadow.guest_vtable = __linear_l4_table;
-#elif GUEST_PAGING_LEVELS == 3
+#if GUEST_PAGING_LEVELS == 3
     /* On PAE guests we don't use a mapping of the guest's own top-level
      * table.  We cache the current state of that table and shadow that,
      * until the next CR3 write makes us refresh our cache. */
-    ASSERT(v->arch.paging.shadow.guest_vtable == NULL);
-
     if ( shadow_mode_external(d) )
         /* Find where in the page the l3 table is */
         guest_idx = guest_index((void *)v->arch.hvm_vcpu.guest_cr[3]);
@@ -4005,20 +3980,6 @@ sh_update_cr3(struct vcpu *v, int do_locking)
     for ( i = 0; i < 4 ; i++ )
         v->arch.paging.shadow.gl3e[i] = gl3e[i];
     sh_unmap_domain_page(gl3e);
-#elif GUEST_PAGING_LEVELS == 2
-    if ( shadow_mode_external(d) || shadow_mode_translate(d) )
-    {
-        if ( v->arch.paging.shadow.guest_vtable )
-            sh_unmap_domain_page_global(v->arch.paging.shadow.guest_vtable);
-        v->arch.paging.shadow.guest_vtable = sh_map_domain_page_global(gmfn);
-        /* Does this really need map_domain_page_global?  Handle the
-         * error properly if so. */
-        BUG_ON(v->arch.paging.shadow.guest_vtable == NULL); /* XXX */
-    }
-    else
-        v->arch.paging.shadow.guest_vtable = __linear_l2_table;
-#else
-#error this should never happen
 #endif

diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index 6f9744a..4921c99 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -122,8 +122,6 @@ struct shadow_vcpu {
     l3_pgentry_t l3table[4] __attribute__((__aligned__(32)));
     /* PAE guests: per-vcpu cache of the top-level *guest* entries */
     l3_pgentry_t gl3e[4] __attribute__((__aligned__(32)));
-    /* Non-PAE guests: pointer to guest top-level pagetable */
-    void *guest_vtable;
     /* Last MFN that we emulated a write to as unshadow heuristics. */
     unsigned long last_emulated_mfn_for_unshadow;
     /* MFN of the last shadow that we shot a writeable mapping in */
At 11:18 +0100 on 04 Apr (1365074288), Tim Deegan wrote:
> Hi,
>
> At 12:50 +0100 on 02 Apr (1364907054), Jan Beulich wrote:
> > [...]
> >
> > I'm afraid this is something for you. From what I can tell, despite
> > sh_walk_guest_tables() being called from sh_page_fault() without
> > the paging lock held, there doesn't appear to be a way for this to
> > race sh_update_cr3(). And with the way the latter updates
> > guest_vtable, the only way for a page fault to happen upon use
> > of that cached mapping would be between the call to
> > sh_unmap_domain_page_global() and the immediately following
> > one to sh_map_domain_page_global() (i.e. while the pointer is
> > stale).
>
> Hmmm. So the only way I can see that happening is if some foreign agent
> resets the vcpu's state while it's actually running, which AFAICT
> shouldn't happen.

OTOH, looking at map_domain_page_global, there doesn't seem to be any
locking preventing two CPUs from populating a page of global-map l1es at
the same time. So, here's a different patch to test -- it would be good
to know if this patch by itself fixes the crash.

Tim.

diff --git a/xen/arch/x86/domain_page.c b/xen/arch/x86/domain_page.c
index 7421e03..efda6af 100644
--- a/xen/arch/x86/domain_page.c
+++ b/xen/arch/x86/domain_page.c
@@ -354,9 +354,10 @@ void *map_domain_page_global(unsigned long mfn)
         set_bit(idx, inuse);
         inuse_cursor = idx + 1;
 
+    pl1e = virt_to_xen_l1e(va);
+
     spin_unlock(&globalmap_lock);
 
-    pl1e = virt_to_xen_l1e(va);
     if ( !pl1e )
         return NULL;
     l1e_write(pl1e, l1e_from_pfn(mfn, __PAGE_HYPERVISOR));
>>> On 04.04.13 at 12:34, Tim Deegan <tim@xen.org> wrote:
> At 11:18 +0100 on 04 Apr (1365074288), Tim Deegan wrote:
> > [...]
> >
> > Hmmm. So the only way I can see that happening is if some foreign agent
> > resets the vcpu's state while it's actually running, which AFAICT
> > shouldn't happen.
>
> OTOH, looking at map_domain_page_global, there doesn't seem to be any
> locking preventing two CPUs from populating a page of global-map l1es at
> the same time. So, here's a different patch to test -- it would be good
> to know if this patch by itself fixes the crash.

That's a very good point. The code flow was taken from the old 32-bit
implementation, and I didn't pay attention to this aspect (which is a
non-issue for the 32-bit code) when porting it over.

So feel free to put my ack on this patch once you put it into proper
shape.

Thanks, and sorry for bothering you with a bug I introduced,
Jan

> --- a/xen/arch/x86/domain_page.c
> +++ b/xen/arch/x86/domain_page.c
> @@ -354,9 +354,10 @@ void *map_domain_page_global(unsigned long mfn)
>          set_bit(idx, inuse);
>          inuse_cursor = idx + 1;
>
> +    pl1e = virt_to_xen_l1e(va);
> +
>      spin_unlock(&globalmap_lock);
>
> -    pl1e = virt_to_xen_l1e(va);
>      if ( !pl1e )
>          return NULL;
>      l1e_write(pl1e, l1e_from_pfn(mfn, __PAGE_HYPERVISOR));
> -----Original Message-----
> From: Tim Deegan [mailto:tim@xen.org]
> Sent: Thursday, April 04, 2013 6:35 PM
> To: Jan Beulich
> Cc: Hao, Xudong; xen-devel (xen-devel@lists.xen.org)
> Subject: Re: [Xen-devel] Bug on shadow page mode
>
> [...]
>
> OTOH, looking at map_domain_page_global, there doesn't seem to be any
> locking preventing two CPUs from populating a page of global-map l1es at
> the same time. So, here's a different patch to test -- it would be good
> to know if this patch by itself fixes the crash.

Holding the lock during the l1e populate fixes the crash on my side.

Thanks,
-Xudong

> Tim.
>
> diff --git a/xen/arch/x86/domain_page.c b/xen/arch/x86/domain_page.c
> index 7421e03..efda6af 100644
> --- a/xen/arch/x86/domain_page.c
> +++ b/xen/arch/x86/domain_page.c
> @@ -354,9 +354,10 @@ void *map_domain_page_global(unsigned long mfn)
>          set_bit(idx, inuse);
>          inuse_cursor = idx + 1;
>
> +    pl1e = virt_to_xen_l1e(va);
> +
>      spin_unlock(&globalmap_lock);
>
> -    pl1e = virt_to_xen_l1e(va);
>      if ( !pl1e )
>          return NULL;
>      l1e_write(pl1e, l1e_from_pfn(mfn, __PAGE_HYPERVISOR));
> -----Original Message----- > From: Tim Deegan [mailto:tim@xen.org] > Sent: Thursday, April 04, 2013 6:18 PM > To: Jan Beulich > Cc: Hao, Xudong; xen-devel (xen-devel@lists.xen.org) > Subject: Re: [Xen-devel] Bug on shadow page mode > > Hi, > > At 12:50 +0100 on 02 Apr (1364907054), Jan Beulich wrote: > > > (XEN) Xen call trace: > > > (XEN) [<ffff82c4c01e637f>] guest_walk_tables_4_levels+0x135/0x6a6 > > > (XEN) [<ffff82c4c020d8cc>] sh_page_fault__guest_4+0x505/0x2015 > > > (XEN) [<ffff82c4c01d2135>] vmx_vmexit_handler+0x86c/0x1748 > > > (XEN) > > > (XEN) Pagetable walk from ffff82c406a00000: > > > (XEN) L4[0x105] = 000000007f26e063 ffffffffffffffff > > > (XEN) L3[0x110] = 000000005ce30063 ffffffffffffffff > > > (XEN) L2[0x035] = 0000000014aab063 ffffffffffffffff > > > (XEN) L1[0x000] = 0000000000000000 ffffffffffffffff > > > > Tim, > > > > I''m afraid this is something for you. From what I can tell, despite > > sh_walk_guest_tables() being called from sh_page_fault() without > > the paging lock held, there doesn''t appear to be a way for this to > > race sh_update_cr3(). And with the way the latter updates > > guest_vtable, the only way for a page fault to happen upon use > > of that cached mapping would be between the call to > > sh_unmap_domain_page_global() and the immediately following > > one to sh_map_domain_page_global() (i.e. while the pointer is > > stale). > > Hmmm. So the only way I can see that happening is if some foreign agent > resets the vcpu''s state while it''s actually running, which AFAICT > shouldn''t happen. I looked at reversing the unmap and map, but realised > that won''t solve the problem -- in such a race sh_walk_guest_tables() > could cache the old value. > > For testing, here''s a patch that gets rid of guest_vtable altogether. > I''m not entirely happy with it, partly because it will have a perf hit > on large-RAM machines, and partly because I''m worried that whatever path > caused this crash might cause other problems. 
> But it would be good to
> confirm that it stops the crashes.

This fixed the crash too, but I think you will use the other fix. :)

> Cheers,
>
> Tim.
>
> commit 5469971c30eca85f270cbd9bc46f63ec0fd543c0
> Author: Tim Deegan <tim@xen.org>
> Date:   Thu Apr 4 11:16:28 2013 +0100
>
>     x86/mm/shadow: Don't keep a permanent mapping of the top-level guest
>     table.
>
>     This addresses two issues.  First, a crash seen where
>     sh_walk_guest_tables() found that guest_vtable was not a valid
>     mapping.  Second, the lack of any error handling if
>     map_domain_page_global() were to fail.
>
>     Signed-off-by: Tim Deegan <tim@xen.org>
>
> diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
> index a593f76..8ca4b9b 100644
> --- a/xen/arch/x86/mm/shadow/multi.c
> +++ b/xen/arch/x86/mm/shadow/multi.c
> @@ -174,15 +174,26 @@ static inline uint32_t
>  sh_walk_guest_tables(struct vcpu *v, unsigned long va, walk_t *gw,
>                       uint32_t pfec)
>  {
> -    return guest_walk_tables(v, p2m_get_hostp2m(v->domain), va, gw, pfec,
> +    uint32_t rc;
> +    mfn_t top_mfn;
> +    void *top_map;
> +
>  #if GUEST_PAGING_LEVELS == 3 /* PAE */
> -                             _mfn(INVALID_MFN),
> -                             v->arch.paging.shadow.gl3e
> +    top_mfn = _mfn(INVALID_MFN);
> +    top_map = v->arch.paging.shadow.gl3e;
>  #else /* 32 or 64 */
> -                             pagetable_get_mfn(v->arch.guest_table),
> -                             v->arch.paging.shadow.guest_vtable
> +    top_mfn = pagetable_get_mfn(v->arch.guest_table);
> +    top_map = sh_map_domain_page(top_mfn);
> +#endif
> +
> +    rc = guest_walk_tables(v, p2m_get_hostp2m(v->domain), va, gw, pfec,
> +                           top_mfn, top_map);
> +
> +#if GUEST_PAGING_LEVELS != 3
> +    sh_unmap_domain_page(top_map);
>  #endif
> -                             );
> +
> +    return rc;
>  }
>
>  /* This validation is called with lock held, and after write permission
> @@ -219,24 +230,20 @@ shadow_check_gwalk(struct vcpu *v, unsigned long va, walk_t *gw, int version)
>       * logic simple.
>       */
>      perfc_incr(shadow_check_gwalk);
> -#if GUEST_PAGING_LEVELS >= 3 /* PAE or 64... */
> -#if GUEST_PAGING_LEVELS >= 4 /* 64-bit only... */
> -    l4p = (guest_l4e_t *)v->arch.paging.shadow.guest_vtable;
> +#if GUEST_PAGING_LEVELS >= 4 /* 64-bit... */
> +    l4p = sh_map_domain_page(gw->l4mfn);
>      mismatch |= (gw->l4e.l4 != l4p[guest_l4_table_offset(va)].l4);
> +    sh_unmap_domain_page(l4p);
>      l3p = sh_map_domain_page(gw->l3mfn);
>      mismatch |= (gw->l3e.l3 != l3p[guest_l3_table_offset(va)].l3);
>      sh_unmap_domain_page(l3p);
> -#else
> +#elif GUEST_PAGING_LEVELS >= 3 /* PAE... */
>      mismatch |= (gw->l3e.l3 !=
>                   v->arch.paging.shadow.gl3e[guest_l3_table_offset(va)].l3);
>  #endif
>      l2p = sh_map_domain_page(gw->l2mfn);
>      mismatch |= (gw->l2e.l2 != l2p[guest_l2_table_offset(va)].l2);
>      sh_unmap_domain_page(l2p);
> -#else
> -    l2p = (guest_l2e_t *)v->arch.paging.shadow.guest_vtable;
> -    mismatch |= (gw->l2e.l2 != l2p[guest_l2_table_offset(va)].l2);
> -#endif
>      if ( !(guest_supports_superpages(v) &&
>             (guest_l2e_get_flags(gw->l2e) & _PAGE_PSE)) )
>      {
> @@ -3784,7 +3791,7 @@ sh_update_linear_entries(struct vcpu *v)
>  }
>
>
> -/* Removes vcpu->arch.paging.shadow.guest_vtable and vcpu->arch.shadow_table[].
> +/* Removes vcpu->arch.shadow_table[].
>   * Does all appropriate management/bookkeeping/refcounting/etc...
>   */
>  static void
> @@ -3794,24 +3801,6 @@ sh_detach_old_tables(struct vcpu *v)
>      int i = 0;
>
>      ////
> -    //// vcpu->arch.paging.shadow.guest_vtable
> -    ////
> -
> -#if GUEST_PAGING_LEVELS == 3
> -    /* PAE guests don't have a mapping of the guest top-level table */
> -    ASSERT(v->arch.paging.shadow.guest_vtable == NULL);
> -#else
> -    if ( v->arch.paging.shadow.guest_vtable )
> -    {
> -        struct domain *d = v->domain;
> -        if ( shadow_mode_external(d) || shadow_mode_translate(d) )
> -            sh_unmap_domain_page_global(v->arch.paging.shadow.guest_vtable);
> -        v->arch.paging.shadow.guest_vtable = NULL;
> -    }
> -#endif // !NDEBUG
> -
> -
> -    ////
>      //// vcpu->arch.shadow_table[]
>      ////
>
> @@ -3970,26 +3959,12 @@ sh_update_cr3(struct vcpu *v, int do_locking)
>
>
>      ////
> -    //// vcpu->arch.paging.shadow.guest_vtable
> +    //// vcpu->arch.paging.shadow.gl3e[]
>      ////
> -#if GUEST_PAGING_LEVELS == 4
> -    if ( shadow_mode_external(d) || shadow_mode_translate(d) )
> -    {
> -        if ( v->arch.paging.shadow.guest_vtable )
> -            sh_unmap_domain_page_global(v->arch.paging.shadow.guest_vtable);
> -        v->arch.paging.shadow.guest_vtable = sh_map_domain_page_global(gmfn);
> -        /* PAGING_LEVELS==4 implies 64-bit, which means that
> -         * map_domain_page_global can't fail */
> -        BUG_ON(v->arch.paging.shadow.guest_vtable == NULL);
> -    }
> -    else
> -        v->arch.paging.shadow.guest_vtable = __linear_l4_table;
> -#elif GUEST_PAGING_LEVELS == 3
> +#if GUEST_PAGING_LEVELS == 3
>      /* On PAE guests we don't use a mapping of the guest's own top-level
>       * table.  We cache the current state of that table and shadow that,
>       * until the next CR3 write makes us refresh our cache.
>       */
> -    ASSERT(v->arch.paging.shadow.guest_vtable == NULL);
> -
>      if ( shadow_mode_external(d) )
>          /* Find where in the page the l3 table is */
>          guest_idx = guest_index((void *)v->arch.hvm_vcpu.guest_cr[3]);
> @@ -4005,20 +3980,6 @@ sh_update_cr3(struct vcpu *v, int do_locking)
>      for ( i = 0; i < 4 ; i++ )
>          v->arch.paging.shadow.gl3e[i] = gl3e[i];
>      sh_unmap_domain_page(gl3e);
> -#elif GUEST_PAGING_LEVELS == 2
> -    if ( shadow_mode_external(d) || shadow_mode_translate(d) )
> -    {
> -        if ( v->arch.paging.shadow.guest_vtable )
> -            sh_unmap_domain_page_global(v->arch.paging.shadow.guest_vtable);
> -        v->arch.paging.shadow.guest_vtable = sh_map_domain_page_global(gmfn);
> -        /* Does this really need map_domain_page_global?  Handle the
> -         * error properly if so. */
> -        BUG_ON(v->arch.paging.shadow.guest_vtable == NULL); /* XXX */
> -    }
> -    else
> -        v->arch.paging.shadow.guest_vtable = __linear_l2_table;
> -#else
> -#error this should never happen
>  #endif
>
>
> diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
> index 6f9744a..4921c99 100644
> --- a/xen/include/asm-x86/domain.h
> +++ b/xen/include/asm-x86/domain.h
> @@ -122,8 +122,6 @@ struct shadow_vcpu {
>      l3_pgentry_t l3table[4] __attribute__((__aligned__(32)));
>      /* PAE guests: per-vcpu cache of the top-level *guest* entries */
>      l3_pgentry_t gl3e[4] __attribute__((__aligned__(32)));
> -    /* Non-PAE guests: pointer to guest top-level pagetable */
> -    void *guest_vtable;
>      /* Last MFN that we emulated a write to as unshadow heuristics. */
>      unsigned long last_emulated_mfn_for_unshadow;
>      /* MFN of the last shadow that we shot a writeable mapping in */
>>> On 04.04.13 at 12:34, Tim Deegan <tim@xen.org> wrote:
> OTOH, looking at map_domain_page_global, there doesn't seem to be any
> locking preventing two CPUs from populating a page of global-map l1es at
> the same time.  So, here's a different patch to test -- it would be good
> to know if this patch by itself fixes the crash.

With Xudong having verified the fix, I would convert it into full
patch form and commit it, but I'd need your confirmation that I
can put your S-o-b on it.

Jan

> --- a/xen/arch/x86/domain_page.c
> +++ b/xen/arch/x86/domain_page.c
> @@ -354,9 +354,10 @@ void *map_domain_page_global(unsigned long mfn)
>      set_bit(idx, inuse);
>      inuse_cursor = idx + 1;
>
> +    pl1e = virt_to_xen_l1e(va);
> +
>      spin_unlock(&globalmap_lock);
>
> -    pl1e = virt_to_xen_l1e(va);
>      if ( !pl1e )
>          return NULL;
>      l1e_write(pl1e, l1e_from_pfn(mfn, __PAGE_HYPERVISOR));
At 09:12 +0100 on 08 Apr (1365412350), Jan Beulich wrote:
> >>> On 04.04.13 at 12:34, Tim Deegan <tim@xen.org> wrote:
> > OTOH, looking at map_domain_page_global, there doesn't seem to be any
> > locking preventing two CPUs from populating a page of global-map l1es at
> > the same time.  So, here's a different patch to test -- it would be good
> > to know if this patch by itself fixes the crash.
>
> With Xudong having verified the fix, I would convert it into full
> patch form and commit it, but I'd need your confirmation that I
> can put your S-o-b on it.

Sure - thanks!

Signed-off-by: Tim Deegan <tim@xen.org>

I might commit the other patch as well, just to turn a possible BUG()
into a perf regression.  Or maybe I'll see about some middle ground.

Tim.