David Stone
2007-Dec-12 22:13 UTC
[Xen-devel] Guest-vs-Host MTRR/PAT conflict and a crash?
Hi,

I'm doing an experiment in which I try to pass through a PCI Express graphics card to a Windows DomU via VT-d. There is another thread in which the overall feasibility of this is discussed, but here I'd at least like to better understand the crash I'm seeing when I do this. (I can pass through a PCI NIC via VT-d with no problem.)

I hide the graphics card from Dom0 (and I have Xen and GRUB using the serial port rather than the graphics card for output) and pass it through to my Windows DomU. About 20 seconds after I create my Windows DomU, the (physical) machine reboots. When I'm lucky I get this message from Xen right before the reboot, which I'm assuming is related:

    [root@localhost xen]# (XEN) mtrr.c:552:d1 Conflict occurs for a given guest l1e flags:63 at 10000000 (the effective mm type:6), because the host mtrr type is:0
    (XEN) CPU 1: Machine Check Exception: 0000000000000005
    (XEN) Bank 0: b200004000000800
    (XEN) Bank 5: b200121020080400
    (XEN)
    (XEN) ****************************************
    (XEN) Panic on CPU 1:
    (XEN) CPU context corrupt****************************************
    (XEN)
    (XEN) Reboot in five seconds..

I'd like to better understand how Xen handles MTRR/PAT in its page table shadowing. I do generally understand page table shadowing, and I have just read the relevant section of the Intel System Programming manual, so I have a good idea of what MTRR and PAT are about.

I looked at the code that generates the "Conflict" message above. It is contained in get_pat_flags() in .../xen/arch/x86/hvm/mtrr.c (reproduced below) and is called from _sh_propagate(). I can more or less follow the code. I guess it's figuring out what the guest thinks the MTRR/PAT memory type should be, "combining" this with what the host says the memory type actually is according to the real MTRR, and using that to create the real ("shadow") PTE.

The error message shows that the guest wants the memory to be write-back cacheable, which doesn't match the real MTRR for that system memory address, which is uncacheable.

I guess my question is: can someone give a higher-level explanation of how MTRR/PAT is supposed to work between Xen (the real MTRRs and real PATs in the real ("shadow") PTEs) and an HVM DomU (virtual MTRRs and fake PATs in the fake PTEs)?

Thanks,
Dave

    uint32_t get_pat_flags(struct vcpu *v,
                           uint32_t gl1e_flags,
                           paddr_t gpaddr,
                           paddr_t spaddr)
    {
        uint8_t guest_eff_mm_type;
        uint8_t shadow_mtrr_type;
        uint8_t pat_entry_value;
        uint64_t pat = v->arch.hvm_vcpu.pat_cr;
        struct mtrr_state *g = &v->arch.hvm_vcpu.mtrr;

        /* 1. Get the effective memory type of guest physical address,
         * with the pair of guest MTRR and PAT
         */
        guest_eff_mm_type = effective_mm_type(g, pat, gpaddr, gl1e_flags);

        /* 2. Get the memory type of host physical address, with MTRR */
        shadow_mtrr_type = get_mtrr_type(&mtrr_state, spaddr);

        /* 3. Find the memory type in PAT, with host MTRR memory type
         * and guest effective memory type.
         */
        pat_entry_value = mtrr_epat_tbl[shadow_mtrr_type][guest_eff_mm_type];

        /* If a conflict occurs (e.g. host MTRR is UC, guest memory type is
         * WB), set UC as effective memory. Here, returning PAT_TYPE_UNCACHABLE
         * will always set effective memory as UC.
         */
        if ( pat_entry_value == INVALID_MEM_TYPE )
        {
            gdprintk(XENLOG_WARNING,
                     "Conflict occurs for a given guest l1e flags:%x "
                     "at %"PRIx64" (the effective mm type:%d), "
                     "because the host mtrr type is:%d\n",
                     gl1e_flags, (uint64_t)gpaddr, guest_eff_mm_type,
                     shadow_mtrr_type);
            pat_entry_value = PAT_TYPE_UNCACHABLE;
        }

        /* 4. Get the pte flags */
        return pat_type_2_pte_flags(pat_entry_value);
    }

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
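The lookup in step 3 of get_pat_flags() can be pictured as a search for a PAT type that, combined with the host MTRR type, yields the type the guest asked for. The following is a minimal sketch of that idea, not Xen's actual mtrr_epat_tbl: the names combine() and pick_shadow_pat() are hypothetical, and only the UC and WB host MTRR rows of the Intel SDM's MTRR/PAT combination table are modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Memory type encodings used by MTRRs and PAT (UC- exists only in PAT). */
enum { MT_UC = 0, MT_WC = 1, MT_WT = 4, MT_WP = 5, MT_WB = 6, MT_UC_MINUS = 7 };
#define MT_NONE 0xff

/* Effective type for one (host MTRR type, PAT type) pair.
 * Only the UC and WB MTRR rows are modelled in this sketch. */
uint8_t combine(uint8_t mtrr, uint8_t pat)
{
    if (mtrr == MT_WB)                 /* PAT decides; UC- behaves as UC */
        return pat == MT_UC_MINUS ? MT_UC : pat;
    if (mtrr == MT_UC)                 /* only WC survives a UC MTRR */
        return pat == MT_WC ? MT_WC : MT_UC;
    return MT_NONE;                    /* other rows are not modelled here */
}

/* Look for a PAT type that gives the guest's wanted effective type on top
 * of the host MTRR type. On a conflict, fall back to UC and set *conflict. */
uint8_t pick_shadow_pat(uint8_t host_mtrr, uint8_t wanted, int *conflict)
{
    static const uint8_t pat_types[] = {
        MT_WB, MT_WT, MT_WP, MT_WC, MT_UC_MINUS, MT_UC
    };

    assert(conflict != NULL);
    for (size_t i = 0; i < sizeof pat_types / sizeof pat_types[0]; i++) {
        if (combine(host_mtrr, pat_types[i]) == wanted) {
            *conflict = 0;
            return pat_types[i];
        }
    }
    *conflict = 1;                     /* e.g. host UC, guest wants WB */
    return MT_UC;
}
```

With the values from the warning (host MTRR type 0 = UC, guest effective type 6 = WB), no PAT type can produce WB, so the sketch reports a conflict and falls back to UC, which mirrors the behaviour described in the code comment above.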
Keir Fraser
2007-Dec-12 23:02 UTC
Re: [Xen-devel] Guest-vs-Host MTRR/PAT conflict and a crash?
On 12/12/07 22:13, "David Stone" <unclestoner@gmail.com> wrote:

> [root@localhost xen]# (XEN) mtrr.c:552:d1 Conflict occurs for a given
> guest l1e flags:63 at 10000000 (the effective mm type:6), because the
> host mtrr type is:0
> (XEN) CPU 1: Machine Check Exception: 0000000000000005
> (XEN) Bank 0: b200004000000800
> (XEN) Bank 5: b200121020080400
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 1:
> (XEN) CPU context corrupt****************************************
> (XEN)
> (XEN) Reboot in five seconds..

That looks like the CPU toasted itself. Bits 0-15 == 0x0400 in a machine-check status register means 'CPU internal timer error'. Perhaps this #MC means something else in the context of VT-d though? We probably need someone from Intel to help decode what happened here.

> I guess my question is can someone give a higher-level explanation of
> how MTRR/PAT is supposed to work between Xen (the real MTRRs and real
> PATs in the real ("shadow") PTEs) and an HVM DomU (virtual MTRRs and
> fake PATs in the fake PTEs)?

You pretty much described it yourself. It tries to pick a shadow-pte PAT value which, coupled with the physical MTRR value for that physical address, will give the same cache attribute as the guest virtual PAT/MTRR combination. If it can't find a match (not entirely unlikely, I suspect) then we print a warning and fall back to UC.

This code is quite 'fresh' and definitely open for changes based on real-world testing.

 -- Keir
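The two bank values in the panic can be split into the architectural IA32_MCi_STATUS fields. A minimal sketch, assuming the standard layout from the Intel SDM (VAL bit 63, OVER 62, UC 61, EN 60, MISCV 59, ADDRV 58, PCC 57, model-specific error code in bits 31:16, MCA error code in bits 15:0); the struct and function names are made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Decoded view of one IA32_MCi_STATUS value (illustrative struct). */
struct mc_status {
    int valid, overflow, uncorrected, enabled, miscv, addrv, pcc;
    uint16_t model_code;   /* bits 31:16, model-specific error code */
    uint16_t mca_code;     /* bits 15:0, architectural MCA error code */
};

struct mc_status decode_mc_status(uint64_t s)
{
    struct mc_status d;

    d.valid       = (int)((s >> 63) & 1);
    d.overflow    = (int)((s >> 62) & 1);
    d.uncorrected = (int)((s >> 61) & 1);
    d.enabled     = (int)((s >> 60) & 1);
    d.miscv       = (int)((s >> 59) & 1);
    d.addrv       = (int)((s >> 58) & 1);
    d.pcc         = (int)((s >> 57) & 1);
    d.model_code  = (uint16_t)((s >> 16) & 0xffff);
    d.mca_code    = (uint16_t)(s & 0xffff);
    return d;
}

/* Usage: the two bank values reported in this thread. */
void example_from_thread(void)
{
    struct mc_status b0 = decode_mc_status(0xb200004000000800ULL);
    struct mc_status b5 = decode_mc_status(0xb200121020080400ULL);

    assert(b0.valid && b0.uncorrected && b0.enabled && b0.pcc);
    assert(b0.mca_code == 0x0800);
    assert(b5.valid && b5.uncorrected && b5.enabled && b5.pcc);
    assert(b5.mca_code == 0x0400 && b5.model_code == 0x2008);
}
```

Both banks have PCC (processor context corrupt) set, which is consistent with the "CPU context corrupt" panic. Bank 0 carries MCA error code 0x0800 and bank 5 carries 0x0400.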
David Stone
2007-Dec-14 22:43 UTC
Re: [Xen-devel] Guest-vs-Host MTRR/PAT conflict and a crash?
> > [root@localhost xen]# (XEN) mtrr.c:552:d1 Conflict occurs for a given
> > guest l1e flags:63 at 10000000 (the effective mm type:6), because the
> > host mtrr type is:0
> > (XEN) CPU 1: Machine Check Exception: 0000000000000005
> > (XEN) Bank 0: b200004000000800
> > (XEN) Bank 5: b200121020080400
> > (XEN)
> > (XEN) ****************************************
> > (XEN) Panic on CPU 1:
> > (XEN) CPU context corrupt****************************************
> > (XEN)
> > (XEN) Reboot in five seconds..
>
> That looks like the CPU toasted itself. Bits 0-15 == 0x0400 in a
> machine-check status register means 'CPU internal timer error'. Perhaps this
> #MC means something else in the context of VT-d though? We probably need
> someone from Intel to help decode what happened here.

Hmm, thanks. I'll concentrate on the #MC for now.

Regarding which, how are you resolving 0x0400 in the status register to 'CPU internal timer error'? I'm looking at the "System Programming" Intel manual and it seems to indicate that an Error Code with bits 0000 01xx xxxx xxxx (like 0x0400) is an "Internal Unclassified" error.

For machine checks, is there the notion of protecting the hypervisor from problems encountered in the HVM guest? I.e., if a #MC happens when a guest is executing (non-root mode), is the host equally screwed? I'm guessing so, if the nature of a #MC is such that it is the processor itself that is screwed, not any particular level of software running on it?

Finally, one thing I'm still not sure about is exactly what PCI devices (as identified by B:D:F) I should hide from Dom0 and pass through to the guest. For my machine, the PCI topology as seen from Dom0 is:

    #lspci
    00:00.0 Host bridge [0600]: Intel Corporation DRAM Controller [8086:29b0] (rev 02)
    00:01.0 PCI bridge [0604]: Intel Corporation PCI Express Root Port [8086:29b1] (rev 02)
    00:1c.0 PCI bridge [0604]: Intel Corporation PCI Express Port 1 [8086:2940] (rev 02)
    00:1e.0 PCI bridge [0604]: Intel Corporation 82801 PCI Bridge [8086:244e] (rev 92)
    01:00.0 VGA compatible controller [0300]: ATI Technologies Inc Unknown device [1002:94c3]

01:00.0 is the 16-lane PCI Express graphics card I'm trying to pass through to my Windows DomU. 00:01.0 is the root complex to which it is attached (I'm pretty sure based on the below). I think 00:1c.0 is a switch to a one-lane PCI Express slot on the motherboard.

So I'm hiding/passing through both the root complex (00:01.0) and the graphics card (01:00.0). Interestingly, if I explicitly hide the root complex only, pciback seems to automagically grab the graphics card.

Dave
Keir Fraser
2007-Dec-14 23:55 UTC
Re: [Xen-devel] Guest-vs-Host MTRR/PAT conflict and a crash?
On 14/12/07 22:43, "David Stone" <unclestoner@gmail.com> wrote:

> Regarding which, how are you resolving 0x0400 in the status register to
> 'CPU internal timer error'? I'm looking at the "System Programming"
> Intel manual and it seems to indicate that an Error Code with bits
> 0000 01xx xxxx xxxx (like 0x0400) is an "Internal Unclassified" error.

I was looking at the very latest manual (Vol 3A, Nov 2007). It provides a more specific decoding for 0000 0100 0000 0000.

> For machine-checks, is there the notion of protecting the hypervisor
> from problems encountered in the HVM guest? I.e., if a #MC happens
> when a guest is executing (non-root mode), is the host equally
> screwed? I'm guessing not if it is the nature of a #MC is such that
> it is the processor itself that is screwed, not any particular level
> of hardware?

If a #MC happens, it's bad news!

> Finally, one thing I'm still not sure about is exactly what PCI
> devices (as identified by B:D:F) I should hide from Dom0 and pass
> through to the guest. For my machine, the PCI topology as seen from
> Dom0 is:

I'm not an expert on PCI topologies, I'm afraid. :-(

 -- Keir
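The two readings of 0x0400 are compatible if the exact value is checked before the wildcard pattern. A small sketch, assuming the simple-error-code table of the Intel SDM in which 0000 0100 0000 0000 is "internal timer error" and the remaining 0000 01xx xxxx xxxx values are "internal unclassified"; the function name and the returned strings are illustrative only:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Classify the two simple MCA error codes discussed in this thread.
 * Anything else is reported as "other" in this sketch. */
const char *classify_simple_mca(uint16_t code)
{
    if (code == 0x0400)                 /* 0000 0100 0000 0000 exactly */
        return "internal timer error";
    if ((code & 0xfc00) == 0x0400)      /* 0000 01xx xxxx xxxx */
        return "internal unclassified";
    return "other";
}

/* Usage: bank 5 in the panic carries MCA error code 0x0400. */
void example_from_thread(void)
{
    assert(strcmp(classify_simple_mca(0x0400), "internal timer error") == 0);
}
```

The order of the two checks matters: tested the other way round, 0x0400 would be swallowed by the broader pattern, which is the reading an older copy of the manual suggests.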
Han, Weidong
2007-Dec-17 02:21 UTC
RE: [Xen-devel] Guest-vs-Host MTRR/PAT conflict and a crash?
I think you only need to hide the graphics card (01:00.0).

Randy (Weidong)

David Stone wrote:
> #lspci
> 00:00.0 Host bridge [0600]: Intel Corporation DRAM Controller
> [8086:29b0] (rev 02)
> 00:01.0 PCI bridge [0604]: Intel Corporation PCI Express Root Port
> [8086:29b1] (rev 02)
> 00:1c.0 PCI bridge [0604]: Intel Corporation PCI Express Port 1
> [8086:2940] (rev 02)
> 00:1e.0 PCI bridge [0604]: Intel Corporation 82801 PCI Bridge
> [8086:244e] (rev 92)
> 01:00.0 VGA compatible controller [0300]: ATI Technologies Inc Unknown
> device [1002:94c3]
>
> 01:00.0 is the 16-lane PCI-Express graphics card I'm trying to pass
> through to my Windows DomU. 00:01.0 is the root complex to which it is
> attached (I'm pretty sure based on the below). I think 00:1c.0 is a
> switch to a one-lane PCI Express slot on the motherboard.
>
> So I'm hiding/passing through both the root complex (00:01.0) and the
> graphics card (01:00.0). Interestingly if I explicitly hide the root
> complex only, pciback seems to automagically grab the graphics card.
>
> Dave
Su, Disheng
2007-Dec-17 02:51 UTC
RE: [Xen-devel] Guest-vs-Host MTRR/PAT conflict and a crash?
David Stone wrote:
>>> [root@localhost xen]# (XEN) mtrr.c:552:d1 Conflict occurs for a given
>>> guest l1e flags:63 at 10000000 (the effective mm type:6), because
>>> the host mtrr type is:0
>>> (XEN) CPU 1: Machine Check Exception: 0000000000000005
>>> (XEN) Bank 0: b200004000000800
>>> (XEN) Bank 5: b200121020080400
>>> (XEN)
>>> (XEN) ****************************************
>>> (XEN) Panic on CPU 1:
>>> (XEN) CPU context corrupt****************************************
>>> (XEN)
>>> (XEN) Reboot in five seconds..
>>
>> That looks like the CPU toasted itself. Bits 0-15 == 0x0400 in a
>> machine-check status register means 'CPU internal timer error'.
>> Perhaps this #MC means something else in the context of VT-d though?
>> We probably need someone from Intel to help decode what happened
>> here.
>
> Hmm, thanks. I'll concentrate on the #MC for now.
>
> Regarding which, how are you resolving 0x0400 in the status register to
> 'CPU internal timer error'? I'm looking at the "System Programming"
> Intel manual and it seems to indicate that an Error Code with bits
> 0000 01xx xxxx xxxx (like 0x0400) is an "Internal Unclassified" error.
>

For the #MC, I read section 14.7.2 of the spec: bits 0-15 = 0x0800 means BUSL0_SRC_ERR_M_NOTIMEOUT_ERR. It seems this one may relate to a memory operation.

The MTRR conflict warning should not result in a #MC... I am a little curious about the cache type of the spaddr. Usually, the conflict occurs when the guest wants a strong cache type but the spaddr has a weaker cache type. Can you check the cache type of spaddr/gpaddr manually? Or provide more information about the guest/host MTRR/PAT and the corresponding PTE?

> For machine-checks, is there the notion of protecting the hypervisor
> from problems encountered in the HVM guest? I.e., if a #MC happens
> when a guest is executing (non-root mode), is the host equally
> screwed? I'm guessing not if it is the nature of a #MC is such that
> it is the processor itself that is screwed, not any particular level
> of hardware?
>
> Finally, one thing I'm still not sure about is exactly what PCI
> devices (as identified by B:D:F) I should hide from Dom0 and pass
> through to the guest. For my machine, the PCI topology as seen from
> Dom0 is:
>
> #lspci
> 00:00.0 Host bridge [0600]: Intel Corporation DRAM Controller
> [8086:29b0] (rev 02)
> 00:01.0 PCI bridge [0604]: Intel Corporation PCI Express Root Port
> [8086:29b1] (rev 02)
> 00:1c.0 PCI bridge [0604]: Intel Corporation PCI Express Port 1
> [8086:2940] (rev 02)
> 00:1e.0 PCI bridge [0604]: Intel Corporation 82801 PCI Bridge
> [8086:244e] (rev 92)
> 01:00.0 VGA compatible controller [0300]: ATI Technologies Inc Unknown
> device [1002:94c3]
>
> 01:00.0 is the 16-lane PCI-Express graphics card I'm trying to pass
> through to my Windows DomU. 00:01.0 is the root complex to which it is
> attached (I'm pretty sure based on the below). I think 00:1c.0 is a
> switch to a one-lane PCI Express slot on the motherboard.
>
> So I'm hiding/passing through both the root complex (00:01.0) and the
> graphics card (01:00.0). Interestingly if I explicitly hide the root
> complex only, pciback seems to automagically grab the graphics card.
>
> Dave

Best Regards,
Disheng, Su
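The mnemonic quoted for 0x0800 follows from the compound "bus and interconnect error" format 0000 1PPT RRRR IILL. The sketch below decodes that format under the field encodings of the Intel SDM's machine-check chapter; decode_bus_error() is a hypothetical helper, and the strings "GEN", "OTHER" and "?" are placeholders chosen here for encodings that have no short mnemonic or are reserved.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Decode a bus/interconnect MCA error code (0000 1PPT RRRR IILL) into the
 * BUS{LL}_{PP}_{RRRR}_{II}_{T}_ERR mnemonic. Returns 0 on success, or -1
 * if the code does not have the bus/interconnect form. */
int decode_bus_error(uint16_t code, char *buf, size_t len)
{
    static const char *ll[4]   = { "L0", "L1", "L2", "LG" };
    static const char *pp[4]   = { "SRC", "RES", "OBS", "GEN" };
    static const char *ii[4]   = { "M", "?", "IO", "OTHER" };
    static const char *rrrr[9] = { "ERR", "RD", "WR", "DRD", "DWR",
                                   "IRD", "PREFETCH", "EVICT", "SNOOP" };
    unsigned r = (code >> 4) & 0xf;

    assert(buf != NULL && len > 0);
    if ((code & 0xf800) != 0x0800)      /* bit 11 set, bits 15:12 clear */
        return -1;

    snprintf(buf, len, "BUS%s_%s_%s_%s_%s_ERR",
             ll[code & 3],                              /* LL: bits 1:0  */
             pp[(code >> 9) & 3],                       /* PP: bits 10:9 */
             r < 9 ? rrrr[r] : "?",                     /* RRRR: bits 7:4 */
             ii[(code >> 2) & 3],                       /* II: bits 3:2  */
             ((code >> 8) & 1) ? "TIMEOUT" : "NOTIMEOUT"); /* T: bit 8   */
    return 0;
}
```

For bank 0 in this thread (low 16 bits 0x0800) every field is zero, which gives level L0, participation SRC, generic request ERR, a memory access M, and no timeout.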
David Stone
2008-Jan-02 21:41 UTC
Re: [Xen-devel] Guest-vs-Host MTRR/PAT conflict and a crash?
Thanks for your response. I've done a bit more troubleshooting on this. Below is the error message again:

> >>> [root@localhost xen]# (XEN) mtrr.c:552:d1 Conflict occurs for a given
> >>> guest l1e flags:63 at 10000000 (the effective mm type:6), because
> >>> the host mtrr type is:0
> >>> (XEN) CPU 1: Machine Check Exception: 0000000000000005
> >>> (XEN) Bank 0: b200004000000800
> >>> (XEN) Bank 5: b200121020080400
> >>> (XEN)
> >>> (XEN) ****************************************
> >>> (XEN) Panic on CPU 1:
> >>> (XEN) CPU context corrupt****************************************
> >>> (XEN)
> >>> (XEN) Reboot in five seconds..

I know that theoretically the memory cache-type mismatch shouldn't directly cause a Machine Check, but I can't help but think it's related. I see the machine check if and only if I see the cache-type mismatch, and they happen in quick succession.

The guest physical address is 0x10000000 as shown above. I added more tracing and found that it corresponds to host address 0x80020000. From the qemu logs I also found that this is a PCI BAR for my PCI Express graphics card that I am trying to pass through via IOMMU (see below):

    pt_register_regions: IO region registered (size=0x00010000 base_addr=0x80020000).

With lspci on Dom0 I confirmed that 0x80020000 is a 64KB region of address space assigned to the PCI-XP graphics card. It is marked non-prefetchable. (The card also has a 256MB region assigned to it as prefetchable.) I also found that both the guest PAT and the guest MTRR for 0x10000000 classify that address as type 6 (MTRR_TYPE_WRBACK), making the guest effective type also 6.

So my first question is, does anyone have a guess as to what this 64KB region assigned to the graphics card is for? I assume the 256MB region is the general-purpose video memory for textures, vertices, etc.

The mtrr warning message happens when the shadow page table is getting updated as the guest is trying to update its page tables. But why would the guest only update the PTE at the beginning of the 64KB region, and not all 64KB/4KB=16 PTEs in the region? I assume the guest isn't updating them all, because then I would get 16 of the mtrr warning messages. I wonder if the guest is updating the page table (causing the MTRR warning but succeeding), and then trying to read/write from that page, and this is timing out, causing the machine check?

Any help is much appreciated!
Dave
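The arithmetic in the message above (a 64KB BAR covering 16 pages of 4KB, with guest address 0x10000000 landing on host address 0x80020000) can be written down as a short sketch. It assumes the whole BAR is mapped linearly from the guest base to the host base, which the thread does not confirm beyond the first page; struct bar_map and both helpers are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_BYTES 4096u

/* One passed-through BAR: where the guest sees it and where it really is. */
struct bar_map {
    uint64_t guest_base;
    uint64_t host_base;
    uint64_t size;
};

/* Number of 4KB pages (and therefore leaf PTEs) needed to map the BAR. */
uint64_t bar_pages(const struct bar_map *m)
{
    assert(m != NULL);
    return (m->size + PAGE_BYTES - 1) / PAGE_BYTES;
}

/* Translate a guest-physical address inside the BAR to a host-physical
 * address, assuming a linear mapping. Returns 0 on success, -1 if the
 * address is outside the BAR. */
int bar_translate(const struct bar_map *m, uint64_t gpa, uint64_t *hpa)
{
    assert(m != NULL && hpa != NULL);
    if (gpa < m->guest_base || gpa - m->guest_base >= m->size)
        return -1;
    *hpa = m->host_base + (gpa - m->guest_base);
    return 0;
}
```

Under that assumption a guest that touched every page of the region would cause 16 shadow PTE updates, one per page, so a single warning suggests only the first page had been mapped by the time of the crash.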
Su, Disheng
2008-Jan-03 04:55 UTC
RE: [Xen-devel] Guest-vs-Host MTRR/PAT conflict and a crash?
David Stone wrote:
> Thanks for your response. I've done a bit more troubleshooting on
> this. Below is the error message again:
>
>>>>> [root@localhost xen]# (XEN) mtrr.c:552:d1 Conflict occurs for a
>>>>> given guest l1e flags:63 at 10000000 (the effective mm type:6),
>>>>> because the host mtrr type is:0
>>>>> (XEN) CPU 1: Machine Check Exception: 0000000000000005
>>>>> (XEN) Bank 0: b200004000000800
>>>>> (XEN) Bank 5: b200121020080400
>>>>> (XEN)
>>>>> (XEN) ****************************************
>>>>> (XEN) Panic on CPU 1:
>>>>> (XEN) CPU context corrupt****************************************
>>>>> (XEN)
>>>>> (XEN) Reboot in five seconds..
>
> I know that theoretically the memory cache-type mismatch shouldn't
> directly cause a Machine Check, but I can't help but think it's
> related...I see the machine check if and only if I see the cache-type
> mismatch and they happen in quick succession.
>
> The guest physical address is 0x10000000 as shown above. I added more
> tracing and found that it corresponds to host address 0x80020000.
> From the qemu logs I also found that this is a PCI BAR for my PCI
> Express graphics card that I am trying to pass through via IOMMU (see
> below)
> pt_register_regions: IO region registered (size=0x00010000
> base_addr=0x80020000).
> With lspci on Dom0 I confirmed that 0x80020000 is a 64KB region of
> address space assigned to the PCI-XP graphics card. It is marked
> non-prefetchable. (The card also has a 256MB region assigned to it as
> prefetchable.) I also found that both the guest PAT and the guest
> MTRR for 0x10000000 classify that address as type 6
> (MTRR_TYPE_WRBACK), making the guest effective type also 6.
>
> So my first question is, does anyone have a guess as to what this 64KB
> region assigned to the graphics card is for? I assume the 256MB
> region is the general-purpose video memory for textures, vertices,
> etc.

Is the 64KB region the GART? (Just a guess :)

>
> The mtrr warning message happens when the shadow page table is getting
> updated as the guest is trying to update his page tables. But why
> would the guest only update the PTE at the beginning of the 64KB
> region, and not all 64KB/4KB=16 PTEs in the region? I assume the
> guest isn't updating them all, because then I would get 16 of the mtrr
> warning messages? I wonder if the guest is updating the page table
> (causing the MTRR warning but succeeding), and then trying to
> read/write from that page, and this is timing out causing the machine
> check?

Why does the guest think this region is WB? Is it already wrong there?

>
> Any help is much appreciated!
> Dave

Best Regards,
Disheng, Su
David Stone
2008-Jan-03 22:04 UTC
Re: [Xen-devel] Guest-vs-Host MTRR/PAT conflict and a crash?
On Jan 2, 2008 11:55 PM, Su, Disheng <disheng.su@intel.com> wrote:
> > So my first question is, does anyone have a guess as to what this 64KB
> > region assigned to the graphics card is for? I assume the 256MB
> > region is the general-purpose video memory for textures, vertices,
> > etc.
> Is the 64KB region the GART? (Just a guess :)

That would make sense; I had not thought of that.

> > The mtrr warning message happens when the shadow page table is getting
> > updated as the guest is trying to update his page tables. But why
> > would the guest only update the PTE at the beginning of the 64KB
> > region, and not all 64KB/4KB=16 PTEs in the region? I assume the
> > guest isn't updating them all, because then I would get 16 of the mtrr
> > warning messages? I wonder if the guest is updating the page table
> > (causing the MTRR warning but succeeding), and then trying to
> > read/write from that page, and this is timing out causing the machine
> > check?
> Why does the guest think this region is WB? Is it already wrong there?

OK, I did some more tracing based on your question and now I think I _may_ see a problem. Anyway:

The guest thinks the memory is WB because the guest has a single (virtual) variable-range MTRR that specifies that the addresses 0x00000000-0x40000000 are WB. 0x10000000 falls in this range. This MTRR is coming from the guest's (emulated?) E820. This in turn is coming from the "memory" directive in the domain's configuration file. I am specifying memory='1024', which is 1GB, which is 0x00000000-0x40000000. So, the emulated E820 is reporting the bottom 1GB as available, and an MTRR is getting created to represent this area as WB.

On the other hand, the hypervisor says the memory is uncacheable because the host has three variable-range MTRRs:
- The first specifies the entire 32-bit address range as WB.
- A second overrides the first for 0x7df00000-0x7dffffff as uncacheable.
- A third overrides the first for 0x80000000-0xffffffff as uncacheable.

This set of MTRRs seems to be standard for bare-metal Linux, Dom0 Linux, and the Xen hypervisor.

So, the guest address of 0x10000000 is WB according to the guest's MTRRs. The corresponding host address 0x80020000 is uncacheable according to the hypervisor's MTRRs.

But hold on...this exercise made me realize that it appears the guest is programming the graphics card's BAR with an address (guest-physical address of 0x10000000) that is within the guest's system memory (lower 1GB of guest-physical address space, 0x00000000-0x40000000). Can someone out there throw me a bone and tell me that I'm right in thinking this should never happen?

Dave
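The way the overlapping variable-range MTRRs above resolve can be sketched in a few lines. This is a simplified model, not how Xen or the hardware stores MTRRs: ranges are given as inclusive start/end pairs instead of PHYSBASE/PHYSMASK registers, only the documented overlap rules (UC wins over everything, WT wins over WB) are applied, and the guest range is written as ending at 0x3fffffff on the assumption that "1GB" means exactly 2^30 bytes. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { MT_UC = 0, MT_WT = 4, MT_WB = 6 };

/* One variable-range MTRR, simplified to an inclusive address range. */
struct var_range {
    uint64_t start, end;
    uint8_t type;
};

/* Host and guest ranges as described in this thread. */
const struct var_range host_ranges[3] = {
    { 0x00000000ULL, 0xffffffffULL, MT_WB },
    { 0x7df00000ULL, 0x7dffffffULL, MT_UC },
    { 0x80000000ULL, 0xffffffffULL, MT_UC },
};
const struct var_range guest_ranges[1] = {
    { 0x00000000ULL, 0x3fffffffULL, MT_WB },
};

/* Resolve the memory type of addr: UC in any matching range wins, WT
 * beats WB, and default_type applies when no range matches. Other
 * overlaps are undefined by the architecture; this sketch keeps the
 * first match in that case. */
uint8_t resolve_type(const struct var_range *r, int n, uint64_t addr,
                     uint8_t default_type)
{
    uint8_t type = default_type;
    int found = 0;

    assert(r != NULL || n == 0);
    for (int i = 0; i < n; i++) {
        if (addr < r[i].start || addr > r[i].end)
            continue;
        if (r[i].type == MT_UC)
            return MT_UC;
        if (!found) {
            type = r[i].type;
            found = 1;
        } else if ((type == MT_WB && r[i].type == MT_WT) ||
                   (type == MT_WT && r[i].type == MT_WB)) {
            type = MT_WT;
        }
    }
    return type;
}
```

Resolving host address 0x80020000 against the host ranges gives UC (type 0), while resolving guest address 0x10000000 against the guest range gives WB (type 6), which is exactly the pair reported in the conflict warning.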
Keir Fraser
2008-Jan-06 23:00 UTC
Re: [Xen-devel] Guest-vs-Host MTRR/PAT conflict and a crash?
On 3/1/08 22:04, "David Stone" <unclestoner@gmail.com> wrote:

> But hold on...this exercise made me realize that it appears the guest
> is programming the graphics card's BAR with an address (guest-physical
> address of 0x10000000) that is within the guest's system memory (lower
> 1GB of guest-physical address space, 0x00000000-0x40000000). Can
> someone out there throw me a bone and tell me that I'm right in
> thinking this should never happen?

That should never happen. I wonder where it gets that address from. Perhaps it is stupidly hardcoded somewhere?

 -- Keir
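The condition that "should never happen" is a plain range overlap between a BAR and guest RAM. A minimal sketch of such a sanity check follows; it is an illustration with a hypothetical helper name, not an existing check in Xen or qemu, and it ignores 64-bit wrap-around for brevity.

```c
#include <assert.h>
#include <stdint.h>

/* Return non-zero if the half-open ranges [a_base, a_base + a_size) and
 * [b_base, b_base + b_size) share at least one byte. */
int ranges_overlap(uint64_t a_base, uint64_t a_size,
                   uint64_t b_base, uint64_t b_size)
{
    assert(a_size > 0 && b_size > 0);
    return a_base < b_base + b_size && b_base < a_base + a_size;
}
```

With the numbers from this thread, a 64KB BAR at guest-physical 0x10000000 overlaps 1GB of guest RAM starting at 0, whereas the same BAR at its host address 0x80020000 would not.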