Lars Kurth
2013-Nov-04 19:54 UTC
Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
See http://xenproject.org/help/questions-and-answers/hypervisor-fatal-page-fault-xen-4-3-1.html
---
I have a 32 core system running XEN 4.3.1 with 30 Windows XP VMs. DOM0 is Centos 6.3 based with linux kernel 3.10.16. In my configuration all of the windows HVMs are running having been restored from xl save. VMs are destroyed or restored in an on-demand fashion. After some time XEN will experience a fatal page fault while restoring one of the windows HVM subjects. This does not happen very often, perhaps once in a 16 to 48 hour period. The stack trace from xen follows. Thanks in advance for any help.

(XEN) ----[ Xen-4.3.1 x86_64 debug=n Tainted: C ]----
(XEN) CPU: 52
(XEN) RIP: e008:[] domain_page_map_to_mfn+0x86/0xc0
(XEN) RFLAGS: 0000000000010246 CONTEXT: hypervisor
(XEN) rax: 000ffffffffff000 rbx: ffff8300bb163760 rcx: 0000000000000000
(XEN) rdx: ffff810000000000 rsi: 0000000000000000 rdi: 0000000000000000
(XEN) rbp: ffff8300bb163000 rsp: ffff8310333e7cd8 r8: 0000000000000000
(XEN) r9: 0000000000000000 r10: 0000000000000000 r11: 0000000000000000
(XEN) r12: ffff8310333e7f18 r13: 0000000000000000 r14: 0000000000000000
(XEN) r15: 0000000000000000 cr0: 0000000080050033 cr4: 00000000000426f0
(XEN) cr3: 000000211bee5000 cr2: ffff810000000000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: 0000 cs: e008
(XEN) Xen stack trace from rsp=ffff8310333e7cd8:
(XEN) 0000000000000001 ffff82c4c01de869 ffff82c4c0182c70 ffff8300bb163000
(XEN) 0000000000000014 ffff8310333e7f18 0000000000000000 ffff82c4c01d7548
(XEN) ffff8300bb163490 ffff8300bb163000 ffff82c4c01c65b8 ffff8310333e7e60
(XEN) ffff82c4c01badef ffff8300bb163000 0000000000000003 ffff833144d8e000
(XEN) ffff82c4c01b4885 ffff8300bb163000 ffff8300bb163000 ffff8300bdff1000
(XEN) 0000000000000001 ffff82c4c02f2880 ffff82c4c02f2880 ffff82c4c0308440
(XEN) ffff82c4c01d0ea8 ffff8300bb163000 ffff82c4c015ad6c ffff82c4c02f2880
(XEN) ffff82c4c02cf800 00000000ffffffff ffff8310333f5060 ffff82c4c02f2880
(XEN) 0000000000000282 0010000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 ffff82c4c02f2880 ffff8300bdff1000 ffff8300bb163000
(XEN) 000031a10f2b16ca 0000000000000001 ffff82c4c02f2880 ffff82c4c0308440
(XEN) ffff82c4c0124444 0000000000000034 ffff8310333f5060 0000000001c9c380
(XEN) 00000000c0155965 ffff82c4c01c6146 0000000001c9c380 ffffffffffffff00
(XEN) ffff82c4c0128fa8 ffff8300bb163000 ffff8327d50e9000 ffff82c4c01bc490
(XEN) 0000000000000000 ffff82c4c01dd254 0000000080549ae0 ffff82c4c01cfc3c
(XEN) ffff8300bb163000 ffff82c4c01d6128 ffff82c4c0125db9 ffff82c4c0125db9
(XEN) ffff8310333e0000 ffff8300bb163000 000000000012ffc0 0000000000000000
(XEN) 0000000000000000 0000000000000000 0000000000000000 ffff82c4c01deaa3
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 000000000012ffc0 000000007ffdf000 0000000000000000 0000000000000000
(XEN) Xen call trace:
(XEN) [] domain_page_map_to_mfn+0x86/0xc0
(XEN) [] nvmx_handle_vmlaunch+0x49/0x160
(XEN) [] __update_vcpu_system_time+0x240/0x310
(XEN) [] vmx_vmexit_handler+0xb58/0x18c0
(XEN) [] pt_restore_timer+0xa8/0xc0
(XEN) [] hvm_io_assist+0xef/0x120
(XEN) [] hvm_do_resume+0x195/0x1c0
(XEN) [] vmx_do_resume+0x148/0x210
(XEN) [] context_switch+0x1bc/0xfc0
(XEN) [] schedule+0x254/0x5f0
(XEN) [] pt_update_irq+0x256/0x2b0
(XEN) [] timer_softirq_action+0x168/0x210
(XEN) [] hvm_vcpu_has_pending_irq+0x50/0xb0
(XEN) [] nvmx_switch_guest+0x54/0x1560
(XEN) [] vmx_intr_assist+0x6c/0x490
(XEN) [] vmx_vmenter_helper+0x88/0x160
(XEN) [] __do_softirq+0x69/0xa0
(XEN) [] __do_softirq+0x69/0xa0
(XEN) [] vmx_asm_do_vmentry+0/0xed
(XEN)
(XEN) Pagetable walk from ffff810000000000:
(XEN) L4[0x102] = 000000211bee5063 ffffffffffffffff
(XEN) L3[0x000] = 0000000000000000 ffffffffffffffff
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 52:
(XEN) FATAL PAGE FAULT
(XEN) [error_code=0000]
(XEN) Faulting linear address: ffff810000000000
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...
(XEN) Resetting with ACPI MEMORY or I/O RESET_REG.
Andrew Cooper
2013-Nov-04 20:00 UTC
Re: Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
On 04/11/13 19:54, Lars Kurth wrote:
> See http://xenproject.org/help/questions-and-answers/hypervisor-fatal-page-fault-xen-4-3-1.html
> ---
> I have a 32 core system running XEN 4.3.1 with 30 Windows XP VMs.
> DOM0 is Centos 6.3 based with linux kernel 3.10.16.
> In my configuration all of the windows HVMs are running having been
> restored from xl save.
> VMs are destroyed or restored in an on-demand fashion. After some
> time XEN will experience a fatal page fault while restoring one of the
> windows HVM subjects. This does not happen very often, perhaps once in
> a 16 to 48 hour period.
> The stack trace from xen follows. Thanks in advance for any help.

Which version of Xen were these images saved on?

Are you expecting to be using nested-virt? (It is still very definitely experimental)

~Andrew
Ian Campbell
2013-Nov-05 09:53 UTC
Re: Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
On Mon, 2013-11-04 at 19:54 +0000, Lars Kurth wrote:
> See
> http://xenproject.org/help/questions-and-answers/hypervisor-fatal-page-fault-xen-4-3-1.html

TBH I think for this kind of thing (i.e. a bug not a user question) the
most appropriate thing to do would be to redirect them to xen-devel
themselves (with a reminder that they do not need to subscribe to post).

This is going to take some back and forth to get to the bottom of and
having you sit in the middle is just silly.

Ian.
Jan Beulich
2013-Nov-05 10:04 UTC
Re: Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
>>> On 04.11.13 at 20:54, Lars Kurth <lars.kurth.xen@gmail.com> wrote:
> See
> http://xenproject.org/help/questions-and-answers/hypervisor-fatal-page-fault-xen-4-3-1.html
> ---
> I have a 32 core system running XEN 4.3.1 with 30 Windows XP VMs.
> DOM0 is Centos 6.3 based with linux kernel 3.10.16.
> In my configuration all of the windows HVMs are running having been
> restored from xl save.
> VMs are destroyed or restored in an on-demand fashion. After some time XEN
> will experience a fatal page fault while restoring one of the windows HVM
> subjects. This does not happen very often, perhaps once in a 16 to 48 hour
> period.
> The stack trace from xen follows. Thanks in advance for any help.
>
> (XEN) ----[ Xen-4.3.1 x86_64 debug=n Tainted: C ]----
> (XEN) CPU: 52
> (XEN) RIP: e008:[] domain_page_map_to_mfn+0x86/0xc0

Zapping addresses (here and below in the stack trace) is never
helpful when someone asks for help with a crash. Also, in order
to not just guess, the matching xen-syms or xen.efi should be
made available or pointed to.

> (XEN) Pagetable walk from ffff810000000000:
> (XEN) L4[0x102] = 000000211bee5063 ffffffffffffffff
> (XEN) L3[0x000] = 0000000000000000 ffffffffffffffff

This makes me suspect that domain_page_map_to_mfn() gets a
NULL pointer passed here. As said above, this is only guesswork
at this point, and as Ian already pointed out, directing the
reporter to xen-devel would seem to be the right thing to do
here anyway.

Jan
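To make the suspicion concrete: if the nested-VMX emulation path hands a never-initialised virtual-VMCS pointer to domain_page_map_to_mfn(), the lookup can end up dereferencing an unmapped linear address and Xen takes exactly this kind of fatal page fault. A minimal sketch of the sort of guard being discussed follows; this is an illustration only, not the actual fix, and the nvcpu/nv_vvmcx names are assumed from the nested-vCPU state:

    /* Hypothetical guard, for illustration only - not the real patch. */
    if ( nvcpu->nv_vvmcx == NULL )
        /* No virtual VMCS is mapped for this vCPU, so fail the emulated
         * VMX instruction instead of dereferencing a NULL pointer. */
        return X86EMUL_UNHANDLEABLE;

    mfn = domain_page_map_to_mfn(nvcpu->nv_vvmcx);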
Lars Kurth
2013-Nov-05 15:46 UTC
Re: Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
Jan, Andrew, Ian,

pulling in Jeff who raised the question. Snippets from misc replies attached. Jeff, please look through these (in particular Jan's answer) and answer any further questions on this thread.

On 05/11/2013 09:53, Ian Campbell wrote:
> TBH I think for this kind of thing (i.e. a bug not a user question) the most appropriate thing to
> do would be to redirect them to xen-devel themselves (with a reminder that they do not need
> to subscribe to post).
Agreed. Another option is for me to start the thread and pull in the raiser of the thread into it, if it is a bug. Was not sure this was a real bug at first, but it seems it is.

On 04/11/2013 20:00, Andrew Cooper wrote:
> Which version of Xen were these images saved on?
[Jeff] We were careful to regenerate all the images after upgrading to 4.3.1. Also saw the same problem on 4.3.0.

> Are you expecting to be using nested-virt? (It is still very definitely experimental)
[Jeff] Not using nested-virt.

On 05/11/2013 10:04, Jan Beulich wrote:
> Zapping addresses (here and below in the stack trace) is never
> helpful when someone asks for help with a crash. Also, in order
> to not just guess, the matching xen-syms or xen.efi should be
> made available or pointed to.
>
> This makes me suspect that domain_page_map_to_mfn() gets a
> NULL pointer passed here. As said above, this is only guesswork
> at this point, and as Ian already pointed out, directing the
> reporter to xen-devel would seem to be the right thing to do
> here anyway.
<Jeff_Zimmerman@McAfee.com>
2013-Nov-05 21:55 UTC
Re: Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
Lars,
I understand the mailing list limits attachment size to 512K. Where can I post the xen binary and symbols file?
Jeff

On Nov 5, 2013, at 7:46 AM, Lars Kurth <lars.kurth@xen.org> wrote:

> Jan, Andrew, Ian,
>
> pulling in Jeff who raised the question. Snippets from misc replies attached. Jeff, please look
> through these (in particular Jan's answer) and answer any further questions on this thread.
<Jeff_Zimmerman@McAfee.com>
2013-Nov-05 22:46 UTC
Re: Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
Asit,
I've attached two files: one is from dmesg | grep microcode, the second is the first processor from /proc/cpuinfo.
Jeff

On Nov 5, 2013, at 2:29 PM, "Mallick, Asit K" <asit.k.mallick@intel.com> wrote:

> Jeff,
> Could you check if you have the latest microcode updates installed on this system? Or, could
> you send me the microcode rev and I can check.
>
> Thanks,
> Asit
Mallick, Asit K
2013-Nov-05 23:17 UTC
Re: Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
It is running with the latest microcode revision 0x710.

Thanks,
Asit

From: Jeff_Zimmerman@McAfee.com
Date: Tuesday, November 5, 2013 3:46 PM
To: "Mallick, Asit K" <asit.k.mallick@intel.com>
Cc: xen-devel@lists.xenproject.org
Subject: Re: [Xen-devel] Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)

Asit,
I've attached two files: one is from dmesg | grep microcode, the second is the first processor from /proc/cpuinfo.
Jeff
Andrew Cooper
2013-Nov-06 00:23 UTC
Re: Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
On 05/11/2013 22:46, Jeff_Zimmerman@McAfee.com wrote:
> Asit,
> I've attached two files: one is from dmesg | grep microcode, the second is the first processor
> from /proc/cpuinfo.
> Jeff
>
> > (XEN) Xen call trace:
> > (XEN) [] domain_page_map_to_mfn+0x86/0xc0
> > (XEN) [] nvmx_handle_vmlaunch+0x49/0x160

As Jan said, the above censoring is almost completely defeating the purpose of trying to help you.

However, while you are not expecting to be using nested-virt, you clearly appear to be from the stack trace, so something is clearly up.

Which toolstack are you using for VMs? What is the configuration for the affected VM?

~Andrew
Ian Campbell
2013-Nov-06 10:05 UTC
Re: Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
On Wed, 2013-11-06 at 00:23 +0000, Andrew Cooper wrote:
> Which toolstack are you using for VMs? What is the configuration for
> the affected VM?

And what exact Windows OS? It's not entirely out of the question that a
modern one might try and use VMX for various things if it saw it.

And doesn't McAfee have a Windows product which does things along those
lines? :-)

Ian.
Jan Beulich
2013-Nov-06 14:09 UTC
Re: Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
>>> On 05.11.13 at 22:36, <Jeff_Zimmerman@McAfee.com> wrote:
> Attaching the xen binary and symbols file.
> Hopefully they will come through.

Please give the attached patch a try - afaict it should eliminate
the host crash, but I'm pretty certain you'll then see the guest
misbehave. Depending on what other load you place on the
system as a whole, you're either overloading it (i.e. we're
running out of mapping space in the hypervisor) or there's a
mapping leak that - so far at least - I can't spot.

In any event I'd suggest you try running a debug build of the
hypervisor, so that eventual problems can be spotted earlier.

Jan
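For anyone wanting to follow the debug-build suggestion, one common way (assuming the usual "debug ?=" knob in the tree's Config.mk; paths and file names here are just an example) is:

    # Rebuild the hypervisor with debug checks enabled
    make -C xen clean
    make debug=y xen
    # Install the new hypervisor alongside the existing one
    cp xen/xen.gz /boot/xen-4.3.1-debug.gz

A debug=y hypervisor keeps its assertions, so problems such as a mapping leak tend to be caught closer to their origin than the eventual page fault seen above.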
<Jeff_Zimmerman@McAfee.com>
2013-Nov-06 16:05 UTC
Re: Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
Jan,

I will give your patch a try.
I have to recant my previous statement regarding not using nested-virt.
It seems some of the code that is being executed on the VM contains VMX instructions, and by virtue of running this code in an HVM subject we are effectively using nested-virt.

This raises a question: if this functionality is undesired, can we just disable nested virt by adding
nestedhvm=false to the configuration file? Should the cpuid and cpuid_check settings be changed as well?

Thanks,
Jeff

On Nov 6, 2013, at 6:09 AM, Jan Beulich <JBeulich@suse.com> wrote:

>>>> On 05.11.13 at 22:36, <Jeff_Zimmerman@McAfee.com> wrote:
>> Attaching the xen binary and symbols file.
>> Hopefully they will come through.
>
> Please give the attached patch a try - afaict it should eliminate
> the host crash, but I'm pretty certain you'll then see the guest
> misbehave. Depending on what other load you place on the
> system as a whole, you're either overloading it (i.e. we're
> running out of mapping space in the hypervisor) or there's a
> mapping leak that - so far at least - I can't spot.
>
> In any event I'd suggest you try running a debug build of the
> hypervisor, so that eventual problems can be spotted earlier.
>
> Jan
>
> <nVMX-map-errors.patch>
Jan Beulich
2013-Nov-06 16:16 UTC
Re: Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
>>> On 06.11.13 at 17:05, <Jeff_Zimmerman@McAfee.com> wrote:
> This raises a question: if this functionality is undesired, can we just
> disable nested virt by adding
> nestedhvm=false to the configuration file?

Sure. And as that's supposedly the default, just deleting the line
should be fine too.

> Should the cpuid and cpuid_check settings be changed as well?

I don't think so, unless you manually override it to look like VMX
was available.

That said - it would still be nice if you could help us figure out
the bug's origin (and I assume you realize that it would be even
more helpful for us if you did all this on 4.4-unstable).

Jan
Ian Campbell
2013-Nov-06 16:18 UTC
Re: Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
On Wed, 2013-11-06 at 16:05 +0000, Jeff_Zimmerman@McAfee.com wrote:
> Jan,
>
> I will give your patch a try.
> I have to recant my previous statement regarding not using nested-virt.
> It seems some of the code that is being executed on the VM contains VMX instructions, and by
> virtue of running this code in an HVM subject we are effectively using nested-virt.
>
> This raises a question: if this functionality is undesired, can we just disable nested virt by adding
> nestedhvm=false to the configuration file? Should the cpuid and
> cpuid_check settings be changed as well?

I'm reasonably certain that nestedhvm=false will clear the relevant
flags in the guest visible cpuid. I'd say it was a bug if this doesn't
happen.

nestedhvm should be disabled by default, did you explicitly enable it?
Removing the line altogether ought to disable it too. Please let us know
if not.

Ian.
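For reference, the setting being discussed lives in the xl domain configuration file. A minimal illustrative HVM config (names and values are made up for the example) with nested HVM explicitly disabled might look like:

    # winxp-example.cfg - illustrative only
    builder   = "hvm"
    name      = "winxp-example"
    memory    = 1024
    vcpus     = 1
    disk      = [ "phy:/dev/vg0/winxp-example,hda,w" ]
    # Nested HVM is expected to be off by default; stating it explicitly
    # documents the intent and should keep VMX from being advertised:
    nestedhvm = 0

With nestedhvm disabled, a VMLAUNCH attempted by software inside the guest should be rejected within the guest rather than reaching the nested-VMX paths in the hypervisor.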
<Jeff_Zimmerman@McAfee.com>
2013-Nov-06 16:48 UTC
Re: Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
On Nov 6, 2013, at 8:18 AM, Ian Campbell <Ian.Campbell@citrix.com> wrote:

> I'm reasonably certain that nestedhvm=false will clear the relevant
> flags in the guest visible cpuid. I'd say it was a bug if this doesn't
> happen.
>
> nestedhvm should be disabled by default, did you explicitly enable it?
> Removing the line altogether ought to disable it too. Please let us know
> if not.
>
> Ian.

I did not enable nestedhvm, and when I run xl list -l the output shows nestedhvm=<default>.
I was not sure what the default was supposed to be. I will try setting it and re-run our test.

Jeff
Andrew Cooper
2013-Nov-06 16:54 UTC
Re: Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
On 06/11/13 16:48, Jeff_Zimmerman@McAfee.com wrote:
> I did not enable nestedhvm, and when I run xl list -l the output shows nestedhvm=<default>.
> I was not sure what the default was supposed to be. I will try setting it and re-run our test.
> Jeff

nested-virt is strictly experimental, and still has known bugs (and
clearly some unknown ones).

I looked over the xl code and thought that nestedhvm should default to
false, but I would prefer someone more familiar with libxl and the idl to
confirm what the default should be.

~Andrew
Ian Campbell
2013-Nov-06 17:06 UTC
Re: Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
On Wed, 2013-11-06 at 16:54 +0000, Andrew Cooper wrote:
> I looked over the xl code and thought that nestedhvm should default to
> false, but I would prefer someone more familiar with libxl and the idl to
> confirm what the default should be.

libxl thinks the default is false and will set HVM_PARAM_NESTEDHVM to 0
in that case. Is there some way to query the hypervisor for what it
thinks the setting is?

Ian.
Andrew Cooper
2013-Nov-06 17:07 UTC
Re: Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
On 06/11/13 17:06, Ian Campbell wrote:
> On Wed, 2013-11-06 at 16:54 +0000, Andrew Cooper wrote:
>> I looked over the xl code and thought that nestedhvm should default to
>> false, but I would prefer someone more familiar with libxl and the idl to
>> confirm what the default should be.
> libxl thinks the default is false and will set HVM_PARAM_NESTEDHVM to 0
> in that case. Is there some way to query the hypervisor for what it
> thinks the setting is?
>
> Ian.
>

A get hvmparam hypercall will retrieve the value, but it is initialised
to 0 and only ever set by a set hvmparam hypercall.

~Andrew
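For reference, the query Ian asks about can be approximated from dom0 with a small libxc program. This is only a rough sketch against the Xen 4.3-era libxc API (xc_get_hvm_param and HVM_PARAM_NESTEDHVM); the domid used here is just a placeholder:

/* Rough sketch only, assuming the Xen 4.3-era libxc interface; domid 1 is a
 * placeholder.  Reads HVM_PARAM_NESTEDHVM for one HVM guest from dom0. */
#include <stdio.h>
#include <xenctrl.h>
#include <xen/hvm/params.h>

int main(void)
{
    xc_interface *xch = xc_interface_open(NULL, NULL, 0);
    unsigned long val = 0;
    int domid = 1;                      /* placeholder guest domid */

    if ( !xch )
        return 1;

    if ( xc_get_hvm_param(xch, domid, HVM_PARAM_NESTEDHVM, &val) == 0 )
        printf("HVM_PARAM_NESTEDHVM for dom%d = %lu\n", domid, val);

    xc_interface_close(xch);
    return 0;
}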
Jan Beulich
2013-Nov-07 09:10 UTC
Re: Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
>>> On 06.11.13 at 18:07, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> On 06/11/13 17:06, Ian Campbell wrote:
>> On Wed, 2013-11-06 at 16:54 +0000, Andrew Cooper wrote:
>>> I looked over the xl code and thought that nestedhvm should default to
>>> false, but I would prefer someone more familiar with libxl and the idl to
>>> confirm what the default should be.
>> libxl thinks the default is false and will set HVM_PARAM_NESTEDHVM to 0
>> in that case. Is there some way to query the hypervisor for what it
>> thinks the setting is?
>
> A get hvmparam hypercall will retrieve the value, but it is initialised
> to 0 and only ever set by a set hvmparam hypercall.

Which makes me start suspecting that the guest might be deriving
its information on VMX being available from something other than
CPUID. Of course we ought to confirm that we don't unintentionally
return the VMX flag set (and that the config file doesn't override it
in this way - I think we shouldn't be suppressing user overrides
here, but I didn't go check whether we do).

Jan
Ian Campbell
2013-Nov-07 09:30 UTC
Re: Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
On Thu, 2013-11-07 at 09:10 +0000, Jan Beulich wrote:
> >>> On 06.11.13 at 18:07, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> > On 06/11/13 17:06, Ian Campbell wrote:
> >> On Wed, 2013-11-06 at 16:54 +0000, Andrew Cooper wrote:
> >>> I looked over the xl code and thought that nestedhvm should default to
> >>> false, but I would prefer someone more familiar with libxl and the idl to
> >>> confirm what the default should be.
> >> libxl thinks the default is false and will set HVM_PARAM_NESTEDHVM to 0
> >> in that case. Is there some way to query the hypervisor for what it
> >> thinks the setting is?
> >
> > A get hvmparam hypercall will retrieve the value, but it is initialised
> > to 0 and only ever set by a set hvmparam hypercall.
>
> Which makes me start suspecting that the guest might be deriving
> its information on VMX being available from something other than
> CPUID. Of course we ought to confirm that we don't unintentionally
> return the VMX flag set (and that the config file doesn't override it
> in this way - I think we shouldn't be suppressing user overrides
> here, but I didn't go check whether we do).

I was also wondering about the behaviour of using vmx instructions in a
guest despite vmx not being visible in cpuid...

Ian.
<Jeff_Zimmerman@McAfee.com>
2013-Nov-07 15:41 UTC
Re: Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
On Nov 7, 2013, at 1:30 AM, Ian Campbell <Ian.Campbell@citrix.com> wrote:
> On Thu, 2013-11-07 at 09:10 +0000, Jan Beulich wrote:
>>>>> On 06.11.13 at 18:07, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>> On 06/11/13 17:06, Ian Campbell wrote:
>>>> On Wed, 2013-11-06 at 16:54 +0000, Andrew Cooper wrote:
>>>>> I looked over the xl code and thought that nestedhvm should default to
>>>>> false, but I would prefer someone more familiar with libxl and the idl to
>>>>> confirm what the default should be.
>>>> libxl thinks the default is false and will set HVM_PARAM_NESTEDHVM to 0
>>>> in that case. Is there some way to query the hypervisor for what it
>>>> thinks the setting is?
>>>
>>> A get hvmparam hypercall will retrieve the value, but it is initialised
>>> to 0 and only ever set by a set hvmparam hypercall.
>>
>> Which makes me start suspecting that the guest might be deriving
>> its information on VMX being available from something other than
>> CPUID. Of course we ought to confirm that we don't unintentionally
>> return the VMX flag set (and that the config file doesn't override it
>> in this way - I think we shouldn't be suppressing user overrides
>> here, but I didn't go check whether we do).
>
> I was also wondering about the behaviour of using vmx instructions in a
> guest despite vmx not being visible in cpuid...
>
> Ian.
>

We have found in our situation this is exactly the case. To verify, we wrote some
test code that makes vmx calls without checking cpuid. On bare hardware the program
executes as expected. In a VM on Xen it causes the hypervisor to panic.

From a security standpoint this is very, very bad. It might be a good idea to provide either
a run-time or build-time option to disable nestedhvm. Just turning off the vmx bit is not enough,
as malicious or badly written code can cause a system crash.

For us it looks like we can disable these instructions and avoid the crash.

Jeff.
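For context, the feature test that well-behaved guest code would perform before issuing any VMX instruction looks roughly like the sketch below. This is only an illustration of the CPUID check (leaf 1, ECX bit 5 is the VMX flag), not the test program described above:

/* Illustrative only: the CPUID test a guest ought to perform before touching
 * any VMX instruction.  CPUID leaf 1, ECX bit 5 is the VMX feature flag. */
#include <stdio.h>
#include <cpuid.h>

int main(void)
{
    unsigned int eax, ebx, ecx, edx;

    if ( !__get_cpuid(1, &eax, &ebx, &ecx, &edx) )
        return 1;

    if ( ecx & (1u << 5) )
        printf("CPUID reports VMX - VMXON etc. may be attempted\n");
    else
        printf("CPUID reports no VMX - VMX instructions should raise #UD\n");

    return 0;
}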
Andrew Cooper
2013-Nov-07 15:54 UTC
Re: Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
On 07/11/13 15:41, Jeff_Zimmerman@McAfee.com wrote:
> On Nov 7, 2013, at 1:30 AM, Ian Campbell <Ian.Campbell@citrix.com>
> wrote:
>
>> On Thu, 2013-11-07 at 09:10 +0000, Jan Beulich wrote:
>>>>>> On 06.11.13 at 18:07, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>>> On 06/11/13 17:06, Ian Campbell wrote:
>>>>> On Wed, 2013-11-06 at 16:54 +0000, Andrew Cooper wrote:
>>>>>> I looked over the xl code and thought that nestedhvm should default to
>>>>>> false, but I would prefer someone more familiar with libxl and the idl to
>>>>>> confirm what the default should be.
>>>>> libxl thinks the default is false and will set HVM_PARAM_NESTEDHVM to 0
>>>>> in that case. Is there some way to query the hypervisor for what it
>>>>> thinks the setting is?
>>>> A get hvmparam hypercall will retrieve the value, but it is initialised
>>>> to 0 and only ever set by a set hvmparam hypercall.
>>> Which makes me start suspecting that the guest might be deriving
>>> its information on VMX being available from something other than
>>> CPUID. Of course we ought to confirm that we don't unintentionally
>>> return the VMX flag set (and that the config file doesn't override it
>>> in this way - I think we shouldn't be suppressing user overrides
>>> here, but I didn't go check whether we do).
>> I was also wondering about the behaviour of using vmx instructions in a
>> guest despite vmx not being visible in cpuid...
>>
>> Ian.
>>
> We have found in our situation this is exactly the case. To verify, we wrote some
> test code that makes vmx calls without checking cpuid. On bare hardware the program
> executes as expected. In a VM on Xen it causes the hypervisor to panic.
>
> From a security standpoint this is very, very bad. It might be a good idea to provide either
> a run-time or build-time option to disable nestedhvm. Just turning off the vmx bit is not enough,
> as malicious or badly written code can cause a system crash.
>
> For us it looks like we can disable these instructions and avoid the crash.
>
> Jeff.

Hmm - that is very concerning. And there does look to be a bug.

Can you try the following patch and see whether it helps?

diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index c9afb56..7b1a349 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -359,7 +359,7 @@ static inline int hvm_event_pending(struct vcpu *v)
 /* These bits in CR4 cannot be set by the guest. */
 #define HVM_CR4_GUEST_RESERVED_BITS(_v) \
     (~((unsigned long) \
-       (X86_CR4_VME | X86_CR4_PVI | X86_CR4_TSD | \
+       (X86_CR4_PVI | X86_CR4_TSD | \
         X86_CR4_DE | X86_CR4_PSE | X86_CR4_PAE | \
         X86_CR4_MCE | X86_CR4_PGE | X86_CR4_PCE | \
         X86_CR4_OSFXSR | X86_CR4_OSXMMEXCPT | \

~Andrew
Jan Beulich
2013-Nov-07 15:57 UTC
Re: Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
>>> On 07.11.13 at 16:41, <Jeff_Zimmerman@McAfee.com> wrote:
> On Nov 7, 2013, at 1:30 AM, Ian Campbell <Ian.Campbell@citrix.com> wrote:
>> I was also wondering about the behaviour of using vmx instructions in a
>> guest despite vmx not being visible in cpuid...
>>
> We have found in our situation this is exactly the case. To verify, we wrote some
> test code that makes vmx calls without checking cpuid. On bare hardware the program
> executes as expected. In a VM on Xen it causes the hypervisor to panic.

You trying it doesn't yet imply that Windows also does so.

Also, you say "program" - are you using these from user mode code?

> From a security standpoint this is very, very bad. It might be a good idea to provide either
> a run-time or build-time option to disable nestedhvm. Just turning off the vmx bit is not enough,
> as malicious or badly written code can cause a system crash.

Yes, we will absolutely need to do that.

Jan
Jan Beulich
2013-Nov-07 16:00 UTC
Re: Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
>>> On 07.11.13 at 16:54, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> Can you try the following patch and see whether it helps?
>
> diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
> index c9afb56..7b1a349 100644
> --- a/xen/include/asm-x86/hvm/hvm.h
> +++ b/xen/include/asm-x86/hvm/hvm.h
> @@ -359,7 +359,7 @@ static inline int hvm_event_pending(struct vcpu *v)
>  /* These bits in CR4 cannot be set by the guest. */
>  #define HVM_CR4_GUEST_RESERVED_BITS(_v) \
>      (~((unsigned long) \
> -       (X86_CR4_VME | X86_CR4_PVI | X86_CR4_TSD | \
> +       (X86_CR4_PVI | X86_CR4_TSD | \

Are you mixing up VME and VMXE perhaps?

Jan
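For readers following the mix-up, the two similarly named bits sit far apart in CR4. The values below are taken from the architecture manuals and are shown only to make the VME/VMXE confusion concrete:

/* CR4 bit positions (from the Intel SDM), for reference. */
#define X86_CR4_VME   0x00000001   /* bit 0:  virtual-8086 mode extensions */
#define X86_CR4_VMXE  0x00002000   /* bit 13: enable VMX operation         */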
<Jeff_Zimmerman@McAfee.com>
2013-Nov-07 16:02 UTC
Re: Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
On Nov 7, 2013, at 7:57 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 07.11.13 at 16:41, <Jeff_Zimmerman@McAfee.com> wrote:
>> On Nov 7, 2013, at 1:30 AM, Ian Campbell <Ian.Campbell@citrix.com> wrote:
>>> I was also wondering about the behaviour of using vmx instructions in a
>>> guest despite vmx not being visible in cpuid...
>>>
>> We have found in our situation this is exactly the case. To verify, we wrote some
>> test code that makes vmx calls without checking cpuid. On bare hardware the program
>> executes as expected. In a VM on Xen it causes the hypervisor to panic.
>
> You trying it doesn't yet imply that Windows also does so.
>
> Also, you say "program" - are you using these from user mode code?

Yes, from windows run as a privileged user. Windows XP SP3 can cause the crash.
It seems Windows 7 has better security; we cannot crash the system from a Win7 guest.

>
>> From a security standpoint this is very, very bad. It might be a good idea to provide either
>> a run-time or build-time option to disable nestedhvm. Just turning off the vmx bit is not enough,
>> as malicious or badly written code can cause a system crash.
>
> Yes, we will absolutely need to do that.
>
> Jan
>
Andrew Cooper
2013-Nov-07 16:06 UTC
Re: Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
On 07/11/13 16:00, Jan Beulich wrote:
>>>> On 07.11.13 at 16:54, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> Can you try the following patch and see whether it helps?
>>
>> diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
>> index c9afb56..7b1a349 100644
>> --- a/xen/include/asm-x86/hvm/hvm.h
>> +++ b/xen/include/asm-x86/hvm/hvm.h
>> @@ -359,7 +359,7 @@ static inline int hvm_event_pending(struct vcpu *v)
>>  /* These bits in CR4 cannot be set by the guest. */
>>  #define HVM_CR4_GUEST_RESERVED_BITS(_v) \
>>      (~((unsigned long) \
>> -       (X86_CR4_VME | X86_CR4_PVI | X86_CR4_TSD | \
>> +       (X86_CR4_PVI | X86_CR4_TSD | \
> Are you mixing up VME and VMXE perhaps?
>
> Jan
>

I am indeed. Apologies for the noise, but I am still quite concerned.

I shall attempt to repro this on a XenRT machine.

Jeff: What system is this on (so I can pick a similar server to try with)?

~Andrew
<Jeff_Zimmerman@McAfee.com>
2013-Nov-07 16:12 UTC
Re: Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
On Nov 7, 2013, at 8:06 AM, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> On 07/11/13 16:00, Jan Beulich wrote:
>>>>> On 07.11.13 at 16:54, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>> Can you try the following patch and see whether it helps?
>>>
>>> diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
>>> index c9afb56..7b1a349 100644
>>> --- a/xen/include/asm-x86/hvm/hvm.h
>>> +++ b/xen/include/asm-x86/hvm/hvm.h
>>> @@ -359,7 +359,7 @@ static inline int hvm_event_pending(struct vcpu *v)
>>>  /* These bits in CR4 cannot be set by the guest. */
>>>  #define HVM_CR4_GUEST_RESERVED_BITS(_v) \
>>>      (~((unsigned long) \
>>> -       (X86_CR4_VME | X86_CR4_PVI | X86_CR4_TSD | \
>>> +       (X86_CR4_PVI | X86_CR4_TSD | \
>> Are you mixing up VME and VMXE perhaps?
>>
>> Jan
>>
>
> I am indeed. Apologies for the noise, but I am still quite concerned.
>
> I shall attempt to repro this on a XenRT machine.
>
> Jeff: What system is this on (so I can pick a similar server to try with)?

It is an Intel S4600LH board.

>
> ~Andrew
Jan Beulich
2013-Nov-07 16:53 UTC
Re: Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
>>> On 07.11.13 at 17:02, <Jeff_Zimmerman@McAfee.com> wrote:
> On Nov 7, 2013, at 7:57 AM, Jan Beulich <JBeulich@suse.com>
> wrote:
>
>>>>> On 07.11.13 at 16:41, <Jeff_Zimmerman@McAfee.com> wrote:
>>> On Nov 7, 2013, at 1:30 AM, Ian Campbell <Ian.Campbell@citrix.com> wrote:
>>>> I was also wondering about the behaviour of using vmx instructions in a
>>>> guest despite vmx not being visible in cpuid...
>>>>
>>> We have found in our situation this is exactly the case. To verify, we wrote some
>>> test code that makes vmx calls without checking cpuid. On bare hardware the program
>>> executes as expected. In a VM on Xen it causes the hypervisor to panic.
>>
>> You trying it doesn't yet imply that Windows also does so.
>>
>> Also, you say "program" - are you using these from user mode code?
>
> Yes, from windows run as a privileged user. Windows XP SP3 can cause the crash.
> It seems Windows 7 has better security; we cannot crash the system from a Win7 guest.

Which is sort of odd. Anyway - care to try the attached patch?

Jan
Andrew Cooper
2013-Nov-07 17:02 UTC
Re: Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
On 07/11/13 16:53, Jan Beulich wrote:
>>>> On 07.11.13 at 17:02, <Jeff_Zimmerman@McAfee.com> wrote:
>> On Nov 7, 2013, at 7:57 AM, Jan Beulich <JBeulich@suse.com>
>> wrote:
>>
>>>>>> On 07.11.13 at 16:41, <Jeff_Zimmerman@McAfee.com> wrote:
>>>> On Nov 7, 2013, at 1:30 AM, Ian Campbell <Ian.Campbell@citrix.com> wrote:
>>>>> I was also wondering about the behaviour of using vmx instructions in a
>>>>> guest despite vmx not being visible in cpuid...
>>>>>
>>>> We have found in our situation this is exactly the case. To verify, we wrote some
>>>> test code that makes vmx calls without checking cpuid. On bare hardware the program
>>>> executes as expected. In a VM on Xen it causes the hypervisor to panic.
>>> You trying it doesn't yet imply that Windows also does so.
>>>
>>> Also, you say "program" - are you using these from user mode code?
>> Yes, from windows run as a privileged user. Windows XP SP3 can cause the crash.
>> It seems Windows 7 has better security; we cannot crash the system from a Win7 guest.
> Which is sort of odd. Anyway - care to try the attached patch?
>
> Jan
>

While the patch does look plausible, there is still clearly an issue
that an HVM guest with nested_virt disabled can even use the VMX
instructions, rather than getting flat out #UD exceptions.

~Andrew
Andrew Cooper
2013-Nov-07 18:13 UTC
Re: Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
On 07/11/13 16:53, Jan Beulich wrote:
>>>> On 07.11.13 at 17:02, <Jeff_Zimmerman@McAfee.com> wrote:
>> On Nov 7, 2013, at 7:57 AM, Jan Beulich <JBeulich@suse.com>
>> wrote:
>>
>>>>>> On 07.11.13 at 16:41, <Jeff_Zimmerman@McAfee.com> wrote:
>>>> On Nov 7, 2013, at 1:30 AM, Ian Campbell <Ian.Campbell@citrix.com> wrote:
>>>>> I was also wondering about the behaviour of using vmx instructions in a
>>>>> guest despite vmx not being visible in cpuid...
>>>>>
>>>> We have found in our situation this is exactly the case. To verify, we wrote some
>>>> test code that makes vmx calls without checking cpuid. On bare hardware the program
>>>> executes as expected. In a VM on Xen it causes the hypervisor to panic.
>>> You trying it doesn't yet imply that Windows also does so.
>>>
>>> Also, you say "program" - are you using these from user mode code?
>> Yes, from windows run as a privileged user. Windows XP SP3 can cause the crash.
>> It seems Windows 7 has better security; we cannot crash the system from a Win7 guest.
> Which is sort of odd. Anyway - care to try the attached patch?
>
> Jan
>

I have managed to reproduce the issue, and the patch appears to fix things.

I have to admit to being very surprised that the VMX hardware doesn't
check CR4.VMXE before causing a vmexit.

Reviewed-and-tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
<Jeff_Zimmerman@McAfee.com>
2013-Nov-07 18:33 UTC
Re: Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
On Nov 7, 2013, at 8:53 AM, Jan Beulich <JBeulich@suse.com> wrote:
>>>> On 07.11.13 at 17:02, <Jeff_Zimmerman@McAfee.com> wrote:
>
>> On Nov 7, 2013, at 7:57 AM, Jan Beulich <JBeulich@suse.com>
>> wrote:
>>
>>>>>> On 07.11.13 at 16:41, <Jeff_Zimmerman@McAfee.com> wrote:
>>>> On Nov 7, 2013, at 1:30 AM, Ian Campbell <Ian.Campbell@citrix.com> wrote:
>>>>> I was also wondering about the behaviour of using vmx instructions in a
>>>>> guest despite vmx not being visible in cpuid...
>>>>>
>>>> We have found in our situation this is exactly the case. To verify, we wrote some
>>>> test code that makes vmx calls without checking cpuid. On bare hardware the program
>>>> executes as expected. In a VM on Xen it causes the hypervisor to panic.
>>>
>>> You trying it doesn't yet imply that Windows also does so.
>>>
>>> Also, you say "program" - are you using these from user mode code?
>>
>> Yes, from windows run as a privileged user. Windows XP SP3 can cause the crash.
>> It seems Windows 7 has better security; we cannot crash the system from a Win7 guest.
>
> Which is sort of odd. Anyway - care to try the attached patch?
>
> Jan
>
> <xsa75.patch>

Just tried your patch. It seems to mitigate the problem. Thanks!

-jeff
Jan Beulich
2013-Nov-08 07:50 UTC
Re: Intermittent fatal page fault with XEN 4.3.1 (Centos 6.3 DOM0 with linux kernel 3.10.16.)
>>> On 07.11.13 at 18:02, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
> While the patch does look plausible, there is still clearly an issue
> that an HVM guest with nested_virt disabled can even use the VMX
> instructions, rather than getting flat out #UD exceptions.

The real CR4.VMXE is (of course) set, and basing a decision on the
read shadow would clearly be wrong from an architectural pov (as
then this would no longer be just a read shadow). And this isn't the
problem here anyway - one problem is that the privilege level check
is done _after_ the VMX non-root mode one. I guess they do it that
way in order to allow the VMM maximum flexibility.

Jan
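Put differently, since the hardware delivers the VMX-instruction vmexit to Xen regardless of what the guest believes about CR4.VMXE, the check has to be made in Xen's own VMX-instruction exit path. The fragment below is only a sketch of that kind of guard, using identifiers as recalled from the Xen 4.3 tree; it is not the literal XSA-75 patch:

/* Illustrative guard only - not the literal XSA-75 fix.  When a plain HVM
 * guest (no nested virt, CR4.VMXE clear from its point of view) executes a
 * VMX instruction, Xen still receives the vmexit and must synthesise the
 * #UD the guest would have seen on bare metal. */
if ( !nestedhvm_enabled(current->domain) ||
     !(current->arch.hvm_vcpu.guest_cr[4] & X86_CR4_VMXE) )
{
    hvm_inject_hw_exception(TRAP_invalid_op, HVM_DELIVER_NO_ERROR_CODE);
    return X86EMUL_EXCEPTION;
}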