After rebasing my dom0 on latest, it didn''t boot. After debugging couple days, it turned out to be : + if ( is_pvh_domain(d) ) + { + if ( direct_mmio ) + return MTRR_TYPE_UNCACHABLE; + return MTRR_TYPE_WRBACK; + } + I had in my patches, missing in epte_get_entry_emt() in latest. So, since I don''t know much about this, is an HVM guest setting MTRR range types? Looking for suggestions on best way to do this for PVH. I thought from EPT standpoint, either RAM or IO region, so either MTRR_TYPE_UNCACHABLE or MTRR_TYPE_WRBACK. In case of MSR_IA32_CR_PAT, the guest just writes to it natively. I''m confused between PAT and MTRRs, reading now that I can just focus on this fixme. Should I look where guest is parsing e820 and setting MTRRs on baremetal? thanks Mukesh
> -----Original Message----- > From: xen-devel-bounces@lists.xen.org > [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Mukesh Rathor > Sent: Wednesday, November 20, 2013 10:12 AM > To: Xen-devel@lists.xensource.com; Jan Beulich > Subject: [Xen-devel] PVH and mtrr/PAT......... > > After rebasing my dom0 on latest, it didn''t boot. After debugging > couple days, it turned out to be : > > + if ( is_pvh_domain(d) ) > + { > + if ( direct_mmio ) > + return MTRR_TYPE_UNCACHABLE; > + return MTRR_TYPE_WRBACK; > + } > + > > I had in my patches, missing in epte_get_entry_emt() in latest. > > So, since I don''t know much about this, is an HVM guest setting MTRR > range types? Looking for suggestions on best way to do this for PVH.When using EPT for HVM guest, the MTRRs have no effect on the memory type used for an access to an guest physical address. Instead, the EMT (EPT memory type) in EPT entry acts as the role. epte_get_entry_emt() demonstrates how to determine the EMT for a certain gfn/mfn.> > I thought from EPT standpoint, either RAM or IO region, so either > MTRR_TYPE_UNCACHABLE or MTRR_TYPE_WRBACK. In case of > MSR_IA32_CR_PAT, > the guest just writes to it natively. I''m confused between PAT and > MTRRs, reading now that I can just focus on this fixme. Should I look > where guest is parsing e820 and setting MTRRs on baremetal?If "IPAT" bit in EPT entry is set to 1, PAT will not take effect and only EMT controls the memory type. Otherwise if PAT and MTRR/EMT are both enabled, you can refer to a detailed table in SDM 11.5.2.2 to see which memory type really takes effect. Thanks, Dongxiao> > thanks > Mukesh > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel
>>> On 20.11.13 at 03:11, Mukesh Rathor <mukesh.rathor@oracle.com> wrote: > After rebasing my dom0 on latest, it didn''t boot. After debugging > couple days, it turned out to be : > > + if ( is_pvh_domain(d) ) > + { > + if ( direct_mmio ) > + return MTRR_TYPE_UNCACHABLE; > + return MTRR_TYPE_WRBACK; > + } > + > > I had in my patches, missing in epte_get_entry_emt() in latest. > > So, since I don''t know much about this, is an HVM guest setting MTRR > range types? Looking for suggestions on best way to do this for PVH.A HVM guest is permitted to write to (virtual) MTRRs, whereas a PV guest isn''t. I''m inclined to prefer PV behavior here to be used for PVH (since, as explained by Dongxiao, MTRRs don''t really matter for VMX guests anyway, i.e. the setting of (virtual) MTRRs needs to get translated to EPT memory types anyway, hence a PVH guest ought to be fine ignoring the MTRRs altogether and handling memory types exclusively via PAT mechanisms). Jan
On Wed, Nov 20, 2013 at 8:42 AM, Jan Beulich <JBeulich@suse.com> wrote:>>>> On 20.11.13 at 03:11, Mukesh Rathor <mukesh.rathor@oracle.com> wrote: >> After rebasing my dom0 on latest, it didn''t boot. After debugging >> couple days, it turned out to be : >> >> + if ( is_pvh_domain(d) ) >> + { >> + if ( direct_mmio ) >> + return MTRR_TYPE_UNCACHABLE; >> + return MTRR_TYPE_WRBACK; >> + } >> + >> >> I had in my patches, missing in epte_get_entry_emt() in latest. >> >> So, since I don''t know much about this, is an HVM guest setting MTRR >> range types? Looking for suggestions on best way to do this for PVH. > > A HVM guest is permitted to write to (virtual) MTRRs, whereas a PV > guest isn''t. I''m inclined to prefer PV behavior here to be used for > PVH (since, as explained by Dongxiao, MTRRs don''t really matter > for VMX guests anyway, i.e. the setting of (virtual) MTRRs needs to > get translated to EPT memory types anyway, hence a PVH guest > ought to be fine ignoring the MTRRs altogether and handling memory > types exclusively via PAT mechanisms).Mukesh, Do you know why this line is having this effect? For one, is it a matter of direct_mmio pages being given something other than UNCACHEABLE, or a matter of non-direct_mmio pages given something other than WRBACK? And is the problem that the guest is *not* setting MTRRs, or that the guest *is* setting MTRRs? I''d prefer to avoid having a special case for PVH in this path if possible. -George
On Wed, 20 Nov 2013 18:12:13 +0000 George Dunlap <dunlapg@umich.edu> wrote:> On Wed, Nov 20, 2013 at 8:42 AM, Jan Beulich <JBeulich@suse.com> > wrote: > >>>> On 20.11.13 at 03:11, Mukesh Rathor <mukesh.rathor@oracle.com> > >>>> wrote: > >> After rebasing my dom0 on latest, it didn''t boot. After debugging > >> couple days, it turned out to be : > >> > >> + if ( is_pvh_domain(d) ) > >> + { > >> + if ( direct_mmio ) > >> + return MTRR_TYPE_UNCACHABLE; > >> + return MTRR_TYPE_WRBACK; > >> + } > >> + > >> > >> I had in my patches, missing in epte_get_entry_emt() in latest. > >> > >> So, since I don''t know much about this, is an HVM guest setting > >> MTRR range types? Looking for suggestions on best way to do this > >> for PVH. > > > > A HVM guest is permitted to write to (virtual) MTRRs, whereas a PV > > guest isn''t. I''m inclined to prefer PV behavior here to be used for > > PVH (since, as explained by Dongxiao, MTRRs don''t really matter > > for VMX guests anyway, i.e. the setting of (virtual) MTRRs needs to > > get translated to EPT memory types anyway, hence a PVH guest > > ought to be fine ignoring the MTRRs altogether and handling memory > > types exclusively via PAT mechanisms). > > Mukesh, > > Do you know why this line is having this effect? For one, is it a > matter of direct_mmio pages being given something other than > UNCACHEABLE, or a matter of non-direct_mmio pages given something > other than WRBACK? > > And is the problem that the guest is *not* setting MTRRs, or that the > guest *is* setting MTRRs? > > I''d prefer to avoid having a special case for PVH in this path if > possible.Without any changes to epte_get_entry_emt(), all types are being returned as MTRR_TYPE_WRBACK for PVH because of: if ( !v->domain->arch.hvm_domain.params[HVM_PARAM_IDENT_PT] ) return MTRR_TYPE_WRBACK; This is problem for low level non-ram pages being accessed in dom0, (which interesting gave MCE errors). non-ram IO pages have to be MTRR_TYPE_UNCACHABLE. After changing this to, if ( !is_pvh_vcpu(v) && !v->domain->arch.hvm_domain.params[HVM_PARAM_IDENT_PT] ) I started hitting if ( direct_mmio ), and getting proper UNCACHABLE for io pages, but RAM pages started being returned as UNCACHABLE also thru get_mtrr_type() which I''ve not investigated. For domU, it''s incorrect, but happens to work because of: if ( !v->domain->arch.hvm_domain.params[HVM_PARAM_IDENT_PT] ) return MTRR_TYPE_WRBACK; as domU only has RAM pages, and thus WRBACK is correct for all. My quick fix while we come up with better solution was: ----------- + /* PVH fixme: Add support for more memory types. */ + if ( is_pvh_domain(d) ) + { + if ( direct_mmio ) + return MTRR_TYPE_UNCACHABLE; + return MTRR_TYPE_WRBACK; + } + if ( !v->domain->arch.hvm_domain.params[HVM_PARAM_IDENT_PT] ) --------------- It appears you didn''t check all places where params was being used before adding it for PVH. thanks Mukesh
On Wed, 20 Nov 2013 08:42:08 +0000 "Jan Beulich" <JBeulich@suse.com> wrote:> >>> On 20.11.13 at 03:11, Mukesh Rathor <mukesh.rathor@oracle.com> > >>> wrote: > > After rebasing my dom0 on latest, it didn''t boot. After debugging > > couple days, it turned out to be : > > > > + if ( is_pvh_domain(d) ) > > + { > > + if ( direct_mmio ) > > + return MTRR_TYPE_UNCACHABLE; > > + return MTRR_TYPE_WRBACK; > > + } > > + > > > > I had in my patches, missing in epte_get_entry_emt() in latest. > > > > So, since I don''t know much about this, is an HVM guest setting > > MTRR range types? Looking for suggestions on best way to do this > > for PVH. > > A HVM guest is permitted to write to (virtual) MTRRs, whereas a PV > guest isn''t. I''m inclined to prefer PV behavior here to be used for > PVH (since, as explained by Dongxiao, MTRRs don''t really matter > for VMX guests anyway, i.e. the setting of (virtual) MTRRs needs to > get translated to EPT memory types anyway, hence a PVH guest > ought to be fine ignoring the MTRRs altogether and handling memory > types exclusively via PAT mechanisms).Ok. So, it appears that for PV, we store the cacheattr in page_info and use it during pte update. But in case of PVH, the page tables are native, the pte update is native, so we don''t really have access to PCD/PWT/PAT bits in the pte entry! It says PAT+PWT+PCD selects a PAT entry from the IA32_PAT msr. In case of PVH, the msr is guest managed, and intercept is disabled. I assume the EPT should mirror the pte PAT entries? Or, can we just set WB for all RAM, and UC for all non-ram for PVH and keep it simple? Thanks a lot for the help. Mukesh
On 11/20/2013 9:42 PM, Mukesh Rathor wrote:> On Wed, 20 Nov 2013 08:42:08 +0000 > "Jan Beulich" <JBeulich@suse.com> wrote: > >>>>> On 20.11.13 at 03:11, Mukesh Rathor <mukesh.rathor@oracle.com> >>>>> wrote: >>> After rebasing my dom0 on latest, it didn''t boot. After debugging >>> couple days, it turned out to be : >>> >>> + if ( is_pvh_domain(d) ) >>> + { >>> + if ( direct_mmio ) >>> + return MTRR_TYPE_UNCACHABLE; >>> + return MTRR_TYPE_WRBACK; >>> + } >>> + >>> >>> I had in my patches, missing in epte_get_entry_emt() in latest. >>> >>> So, since I don''t know much about this, is an HVM guest setting >>> MTRR range types? Looking for suggestions on best way to do this >>> for PVH. >> >> A HVM guest is permitted to write to (virtual) MTRRs, whereas a PV >> guest isn''t. I''m inclined to prefer PV behavior here to be used for >> PVH (since, as explained by Dongxiao, MTRRs don''t really matter >> for VMX guests anyway, i.e. the setting of (virtual) MTRRs needs to >> get translated to EPT memory types anyway, hence a PVH guest >> ought to be fine ignoring the MTRRs altogether and handling memory >> types exclusively via PAT mechanisms). > > Ok. So, it appears that for PV, we store the cacheattr > in page_info and use it during pte update. But in case of PVH, > the page tables are native, the pte update is native, so we > don''t really have access to PCD/PWT/PAT bits in the pte entry! Right, which is OK - b/c the mechanism to set a WC page and back is at odds with how the Linux sets its bits. This is a problem for PV guests (pvops based) as they cannot do PAT right now. HVM guest can as they omit all of this. > > It says PAT+PWT+PCD selects a PAT entry from the IA32_PAT msr. Right. > In case of PVH, the msr is guest managed, and intercept is disabled. I thought the IA32_PAT MSR was intercepted? > I assume the EPT should mirror the pte PAT entries? > > Or, can we just set WB for all RAM, and UC for all non-ram for > PVH and keep it simple? If you are using an i915 it will want to map its MMIO bars as WC. Ditto for InfiniBand cards, radeon and nouveau cards. > > Thanks a lot for the help. > Mukesh > > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel >
>>> Mukesh Rathor <mukesh.rathor@oracle.com> 11/21/13 3:42 AM >>> >On Wed, 20 Nov 2013 08:42:08 +0000 "Jan Beulich" <JBeulich@suse.com> wrote: >> >>> On 20.11.13 at 03:11, Mukesh Rathor <mukesh.rathor@oracle.com> >> >>> wrote: >> > So, since I don''t know much about this, is an HVM guest setting >> > MTRR range types? Looking for suggestions on best way to do this >> > for PVH. >> >> A HVM guest is permitted to write to (virtual) MTRRs, whereas a PV >> guest isn''t. I''m inclined to prefer PV behavior here to be used for >> PVH (since, as explained by Dongxiao, MTRRs don''t really matter >> for VMX guests anyway, i.e. the setting of (virtual) MTRRs needs to >> get translated to EPT memory types anyway, hence a PVH guest >> ought to be fine ignoring the MTRRs altogether and handling memory >> types exclusively via PAT mechanisms). > >Ok. So, it appears that for PV, we store the cacheattr >in page_info and use it during pte update. But in case of PVH, >the page tables are native, the pte update is native, so we >don''t really have access to PCD/PWT/PAT bits in the pte entry! > >It says PAT+PWT+PCD selects a PAT entry from the IA32_PAT msr. >In case of PVH, the msr is guest managed, and intercept is disabled. >I assume the EPT should mirror the pte PAT entries?Mirroring would be the simplest possible solution, since then EMT and PAT are always in agreement. But EMT is really there to allow modifying the (guest controlled) PAT selection. I.e. (not for PVH, but as a general example) while a guest would map the LAPIC page UC, the hypervisor would generally be happy with it being WB.>Or, can we just set WB for all RAM, and UC for all non-ram for >PVH and keep it simple?As a first step it could be that simple, but as soon as possible at least WC should be taken into consideration. Jan
On Wed, Nov 20, 2013 at 10:24 PM, Mukesh Rathor <mukesh.rathor@oracle.com> wrote:> On Wed, 20 Nov 2013 18:12:13 +0000 > George Dunlap <dunlapg@umich.edu> wrote: > >> On Wed, Nov 20, 2013 at 8:42 AM, Jan Beulich <JBeulich@suse.com> >> wrote: >> >>>> On 20.11.13 at 03:11, Mukesh Rathor <mukesh.rathor@oracle.com> >> >>>> wrote: >> >> After rebasing my dom0 on latest, it didn''t boot. After debugging >> >> couple days, it turned out to be : >> >> >> >> + if ( is_pvh_domain(d) ) >> >> + { >> >> + if ( direct_mmio ) >> >> + return MTRR_TYPE_UNCACHABLE; >> >> + return MTRR_TYPE_WRBACK; >> >> + } >> >> + >> >> >> >> I had in my patches, missing in epte_get_entry_emt() in latest. >> >> >> >> So, since I don''t know much about this, is an HVM guest setting >> >> MTRR range types? Looking for suggestions on best way to do this >> >> for PVH. >> > >> > A HVM guest is permitted to write to (virtual) MTRRs, whereas a PV >> > guest isn''t. I''m inclined to prefer PV behavior here to be used for >> > PVH (since, as explained by Dongxiao, MTRRs don''t really matter >> > for VMX guests anyway, i.e. the setting of (virtual) MTRRs needs to >> > get translated to EPT memory types anyway, hence a PVH guest >> > ought to be fine ignoring the MTRRs altogether and handling memory >> > types exclusively via PAT mechanisms). >> >> Mukesh, >> >> Do you know why this line is having this effect? For one, is it a >> matter of direct_mmio pages being given something other than >> UNCACHEABLE, or a matter of non-direct_mmio pages given something >> other than WRBACK? >> >> And is the problem that the guest is *not* setting MTRRs, or that the >> guest *is* setting MTRRs? >> >> I''d prefer to avoid having a special case for PVH in this path if >> possible. > > Without any changes to epte_get_entry_emt(), all types are being returned > as MTRR_TYPE_WRBACK for PVH because of: > > if ( !v->domain->arch.hvm_domain.params[HVM_PARAM_IDENT_PT] ) > return MTRR_TYPE_WRBACK; > > This is problem for low level non-ram pages being accessed in dom0, > (which interesting gave MCE errors). non-ram IO pages have to be > MTRR_TYPE_UNCACHABLE.Hmm -- that looks like a bug in the logic there. AFAICT, there''s no reason for the lack of IDENT_PT to change the memory type like that. Unfortunately the changeset that introduced this (77283249) has neither comments nor a proper description of what''s going on, so it''s hard to tell where this came from. We should check to make sure that this conditional is actually necessary; since the HVM builder unconditionally sets IDENT_PT, I don''t think this conditional has ever actually been tested. (Obviously too late to change this for 4.4.) In the mean time, adding "is_pvh_vcpu()" here should be fine. If the conditional turns out to be necessary, we can either change this to a "can_disable_paging" flag at some point in the future, or, if we want to allow the guest to disable paging, actually set up an IDENT_PT for PVH guests.> > After changing this to, > if ( !is_pvh_vcpu(v) && > !v->domain->arch.hvm_domain.params[HVM_PARAM_IDENT_PT] ) > > I started hitting if ( direct_mmio ), and getting proper UNCACHABLE > for io pages, but RAM pages started being returned as UNCACHABLE also > thru get_mtrr_type() which I''ve not investigated.Presumably because the PVH guests aren''t actually setting up their MTRRs, whereas HVM guests do? One solution of course would be to add MTRRs to the PVH interface. A more flexible option might be to set up default MTRRs in the domain builder (which the guest can change if they want). Or, we can change this line: gmtrr_mtype = get_mtrr_type(&v->arch.hvm_vcpu.mtrr, (gfn << PAGE_SHIFT)); to this: gmtrr_mtype = is_hvm_domain(v) ? get_mtrr_type(&v->arch.hvm_vcpu.mtrr, (gfn << PAGE_SHIFT)) : MTRR_TYPE_WRBACK; (Line-wrapped obviously.)> > For domU, it''s incorrect, but happens to work because of: > > if ( !v->domain->arch.hvm_domain.params[HVM_PARAM_IDENT_PT] ) > return MTRR_TYPE_WRBACK; > > as domU only has RAM pages, and thus WRBACK is correct for all. > > My quick fix while we come up with better solution was: > ----------- > + /* PVH fixme: Add support for more memory types. */ > + if ( is_pvh_domain(d) ) > + { > + if ( direct_mmio ) > + return MTRR_TYPE_UNCACHABLE; > + return MTRR_TYPE_WRBACK; > + } > + > if ( !v->domain->arch.hvm_domain.params[HVM_PARAM_IDENT_PT] ) > --------------- > > It appears you didn''t check all places where params was being used > before adding it for PVH.Given that there was no comment explaining why you were adding in the special case above, and that it was not actually necessary for the domU case (and so it passed my testing), it''s not too surprising that I missed it. I figured this was another example of the "disable everything and enable it as you need it" philosophy. I am sorry that it took you so long to track down -- I thought I had mentioned it in my cover letter, but apparently only indirectly. -George
On Thu, 21 Nov 2013 15:47:43 +0000 George Dunlap <George.Dunlap@eu.citrix.com> wrote:> On Wed, Nov 20, 2013 at 10:24 PM, Mukesh Rathor > <mukesh.rathor@oracle.com> wrote: > > On Wed, 20 Nov 2013 18:12:13 +0000 > > George Dunlap <dunlapg@umich.edu> wrote: > > > >> On Wed, Nov 20, 2013 at 8:42 AM, Jan Beulich <JBeulich@suse.com> > >> wrote: > >> >>>> On 20.11.13 at 03:11, Mukesh Rathor <mukesh.rathor@oracle.com> > >> >>>> wrote:......> > Without any changes to epte_get_entry_emt(), all types are being > > returned as MTRR_TYPE_WRBACK for PVH because of: > > > > if ( !v->domain->arch.hvm_domain.params[HVM_PARAM_IDENT_PT] ) > > return MTRR_TYPE_WRBACK; > > > > This is problem for low level non-ram pages being accessed in dom0, > > (which interesting gave MCE errors). non-ram IO pages have to be > > MTRR_TYPE_UNCACHABLE. > > Hmm -- that looks like a bug in the logic there. AFAICT, there''s no > reason for the lack of IDENT_PT to change the memory type like that. > > Unfortunately the changeset that introduced this (77283249) has > neither comments nor a proper description of what''s going on, so it''s > hard to tell where this came from.Yeah, I was baffled why that was there, gave up, and took the easy way out for PVH :). ...> > > > After changing this to, > > if ( !is_pvh_vcpu(v) && > > !v->domain->arch.hvm_domain.params[HVM_PARAM_IDENT_PT] ) > > > > I started hitting if ( direct_mmio ), and getting proper UNCACHABLE > > for io pages, but RAM pages started being returned as UNCACHABLE > > also thru get_mtrr_type() which I''ve not investigated. > > Presumably because the PVH guests aren''t actually setting up their > MTRRs, whereas HVM guests do?Correct.> One solution of course would be to add MTRRs to the PVH interface. A > more flexible option might be to set up default MTRRs in the domain > builder (which the guest can change if they want).That was my first thought too, but Jan/Dongxiao pointed out guest is permitted to write virtual MTRRs and they get translated to EPT types anyways, the PV approach of just doing the PAT would be better. I''m trying to figure that out....> Or, we can change this line: > gmtrr_mtype = get_mtrr_type(&v->arch.hvm_vcpu.mtrr, (gfn << > PAGE_SHIFT)); > > to this: > gmtrr_mtype = is_hvm_domain(v) ? > get_mtrr_type(&v->arch.hvm_vcpu.mtrr, (gfn << PAGE_SHIFT)) : > MTRR_TYPE_WRBACK; > > (Line-wrapped obviously.)Yup, that would work, and Jan is OK with it as first step. Sending patch for it.... thanks, Mukesh
On Thu, 21 Nov 2013 11:40:01 +0000 "Jan Beulich" <jbeulich@suse.com> wrote:> >>> Mukesh Rathor <mukesh.rathor@oracle.com> 11/21/13 3:42 AM >>> > >On Wed, 20 Nov 2013 08:42:08 +0000 "Jan Beulich" <JBeulich@suse.com> > >wrote: > >> >>> On 20.11.13 at 03:11, Mukesh Rathor <mukesh.rathor@oracle.com> > >> >>> wrote:.....> >Or, can we just set WB for all RAM, and UC for all non-ram for > >PVH and keep it simple? > > As a first step it could be that simple, but as soon as possible at > least WC should be taken into consideration.Thanks, that helps a lot. Sending patch. Mukesh
>>> On 21.11.13 at 16:47, George Dunlap <George.Dunlap@eu.citrix.com> wrote: > On Wed, Nov 20, 2013 at 10:24 PM, Mukesh Rathor > <mukesh.rathor@oracle.com> wrote: >> On Wed, 20 Nov 2013 18:12:13 +0000 >> George Dunlap <dunlapg@umich.edu> wrote: >> >>> On Wed, Nov 20, 2013 at 8:42 AM, Jan Beulich <JBeulich@suse.com> >>> wrote: >>> >>>> On 20.11.13 at 03:11, Mukesh Rathor <mukesh.rathor@oracle.com> >>> >>>> wrote: >>> >> After rebasing my dom0 on latest, it didn''t boot. After debugging >>> >> couple days, it turned out to be : >>> >> >>> >> + if ( is_pvh_domain(d) ) >>> >> + { >>> >> + if ( direct_mmio ) >>> >> + return MTRR_TYPE_UNCACHABLE; >>> >> + return MTRR_TYPE_WRBACK; >>> >> + } >>> >> + >>> >> >>> >> I had in my patches, missing in epte_get_entry_emt() in latest. >>> >> >>> >> So, since I don''t know much about this, is an HVM guest setting >>> >> MTRR range types? Looking for suggestions on best way to do this >>> >> for PVH. >>> > >>> > A HVM guest is permitted to write to (virtual) MTRRs, whereas a PV >>> > guest isn''t. I''m inclined to prefer PV behavior here to be used for >>> > PVH (since, as explained by Dongxiao, MTRRs don''t really matter >>> > for VMX guests anyway, i.e. the setting of (virtual) MTRRs needs to >>> > get translated to EPT memory types anyway, hence a PVH guest >>> > ought to be fine ignoring the MTRRs altogether and handling memory >>> > types exclusively via PAT mechanisms). >>> >>> Mukesh, >>> >>> Do you know why this line is having this effect? For one, is it a >>> matter of direct_mmio pages being given something other than >>> UNCACHEABLE, or a matter of non-direct_mmio pages given something >>> other than WRBACK? >>> >>> And is the problem that the guest is *not* setting MTRRs, or that the >>> guest *is* setting MTRRs? >>> >>> I''d prefer to avoid having a special case for PVH in this path if >>> possible. >> >> Without any changes to epte_get_entry_emt(), all types are being returned >> as MTRR_TYPE_WRBACK for PVH because of: >> >> if ( !v->domain->arch.hvm_domain.params[HVM_PARAM_IDENT_PT] ) >> return MTRR_TYPE_WRBACK; >> >> This is problem for low level non-ram pages being accessed in dom0, >> (which interesting gave MCE errors). non-ram IO pages have to be >> MTRR_TYPE_UNCACHABLE. > > Hmm -- that looks like a bug in the logic there. AFAICT, there''s no > reason for the lack of IDENT_PT to change the memory type like that. > > Unfortunately the changeset that introduced this (77283249) has > neither comments nor a proper description of what''s going on, so it''s > hard to tell where this came from.While a commit without any description at all is clearly bogus (even more so in the light that this is the very change that also caused XSA-60), in the case here it introduces the function as a whole, and hence would not have been likely to comment on why the function was written the way it is. I take it that this goes alongside the other check immediately previous to it: When not set yet, assume WB (for the sake of the tool stack). But I think this tool stack requirement should be expressed without looking at this HVM param. Sadly the person having written that code doesn''t appear to be working on Xen anymore (and my not be at Intel anymore either), so I''m afraid we''ll have to sort this out ourselves. Nevertheless - Intel folks, can you comment on this please? Jan
On 21/11/13 23:41, Mukesh Rathor wrote:> On Thu, 21 Nov 2013 15:47:43 +0000 > George Dunlap <George.Dunlap@eu.citrix.com> wrote: > >> On Wed, Nov 20, 2013 at 10:24 PM, Mukesh Rathor >> <mukesh.rathor@oracle.com> wrote: >>> On Wed, 20 Nov 2013 18:12:13 +0000 >>> George Dunlap <dunlapg@umich.edu> wrote: >>> >>>> On Wed, Nov 20, 2013 at 8:42 AM, Jan Beulich <JBeulich@suse.com> >>>> wrote: >>>>>>>> On 20.11.13 at 03:11, Mukesh Rathor <mukesh.rathor@oracle.com> >>>>>>>> wrote: > ...... >>> Without any changes to epte_get_entry_emt(), all types are being >>> returned as MTRR_TYPE_WRBACK for PVH because of: >>> >>> if ( !v->domain->arch.hvm_domain.params[HVM_PARAM_IDENT_PT] ) >>> return MTRR_TYPE_WRBACK; >>> >>> This is problem for low level non-ram pages being accessed in dom0, >>> (which interesting gave MCE errors). non-ram IO pages have to be >>> MTRR_TYPE_UNCACHABLE. >> Hmm -- that looks like a bug in the logic there. AFAICT, there''s no >> reason for the lack of IDENT_PT to change the memory type like that. >> >> Unfortunately the changeset that introduced this (77283249) has >> neither comments nor a proper description of what''s going on, so it''s >> hard to tell where this came from. > Yeah, I was baffled why that was there, gave up, and took the easy > way out for PVH :). > > ... >>> After changing this to, >>> if ( !is_pvh_vcpu(v) && >>> !v->domain->arch.hvm_domain.params[HVM_PARAM_IDENT_PT] ) >>> >>> I started hitting if ( direct_mmio ), and getting proper UNCACHABLE >>> for io pages, but RAM pages started being returned as UNCACHABLE >>> also thru get_mtrr_type() which I''ve not investigated. >> Presumably because the PVH guests aren''t actually setting up their >> MTRRs, whereas HVM guests do? > Correct. > >> One solution of course would be to add MTRRs to the PVH interface. A >> more flexible option might be to set up default MTRRs in the domain >> builder (which the guest can change if they want). > That was my first thought too, but Jan/Dongxiao pointed out guest is > permitted to write virtual MTRRs and they get translated > to EPT types anyways, the PV approach of just doing the PAT would > be better. I''m trying to figure that out....I''m not incredibly familiar with the PAT / MTRR stuff, either from a hardware level or a Xen level, so sorry if this is a dumb question. It sounds like you''re saying, because we have virtual MTRRs that are already translated into EPT types, we should disable virtual MTRRs and use PAT instead. That doesn''t make any kind of sense to me. (I didn''t understand it when Jan said it either.) If there is already a mechanism to allow HVM guests to set memory types, and it even works for passed-through devices for HVM guests, why would we disable it and use something else instead? -George
>>> On 22.11.13 at 11:43, George Dunlap <george.dunlap@eu.citrix.com> wrote: > I''m not incredibly familiar with the PAT / MTRR stuff, either from a > hardware level or a Xen level, so sorry if this is a dumb question. It > sounds like you''re saying, because we have virtual MTRRs that are > already translated into EPT types, we should disable virtual MTRRs and > use PAT instead. That doesn''t make any kind of sense to me. (I didn''t > understand it when Jan said it either.)The underlying observation is that MTRRs aren''t really needed - all they can do can be done with PAT. They pre-date PAT though, hence hardware vendors can''t easily drop them. But in a model like PVH I just don''t see the value of allowing their use, considering that this adds unnecessary complexity. Jan
On 22/11/13 11:09, Jan Beulich wrote:>>>> On 22.11.13 at 11:43, George Dunlap <george.dunlap@eu.citrix.com> wrote: >> I''m not incredibly familiar with the PAT / MTRR stuff, either from a >> hardware level or a Xen level, so sorry if this is a dumb question. It >> sounds like you''re saying, because we have virtual MTRRs that are >> already translated into EPT types, we should disable virtual MTRRs and >> use PAT instead. That doesn''t make any kind of sense to me. (I didn''t >> understand it when Jan said it either.) > The underlying observation is that MTRRs aren''t really needed - > all they can do can be done with PAT. They pre-date PAT though, > hence hardware vendors can''t easily drop them. But in a model > like PVH I just don''t see the value of allowing their use, considering > that this adds unnecessary complexity.OK -- so when we move forward with the plan of "PVH mode is HVM mode with a couple of tweaks", you think that we should have an "enable virtual MTRR" flag, and disable this for PVH mode? -George
>>> On 22.11.13 at 13:16, George Dunlap <george.dunlap@eu.citrix.com> wrote: > On 22/11/13 11:09, Jan Beulich wrote: >>>>> On 22.11.13 at 11:43, George Dunlap <george.dunlap@eu.citrix.com> wrote: >>> I''m not incredibly familiar with the PAT / MTRR stuff, either from a >>> hardware level or a Xen level, so sorry if this is a dumb question. It >>> sounds like you''re saying, because we have virtual MTRRs that are >>> already translated into EPT types, we should disable virtual MTRRs and >>> use PAT instead. That doesn''t make any kind of sense to me. (I didn''t >>> understand it when Jan said it either.) >> The underlying observation is that MTRRs aren''t really needed - >> all they can do can be done with PAT. They pre-date PAT though, >> hence hardware vendors can''t easily drop them. But in a model >> like PVH I just don''t see the value of allowing their use, considering >> that this adds unnecessary complexity. > > OK -- so when we move forward with the plan of "PVH mode is HVM mode > with a couple of tweaks", you think that we should have an "enable > virtual MTRR" flag, and disable this for PVH mode?Yes (provided the guest kernel code can be made cope with this). Jan
> -----Original Message----- > From: xen-devel-bounces@lists.xen.org > [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Jan Beulich > Sent: Friday, November 22, 2013 6:29 PM > To: George Dunlap; Dong, Eddie; Nakajima, Jun > Cc: xen-devel; Tim Deegan > Subject: Re: [Xen-devel] PVH and mtrr/PAT......... > > >>> On 21.11.13 at 16:47, George Dunlap <George.Dunlap@eu.citrix.com> wrote: > > On Wed, Nov 20, 2013 at 10:24 PM, Mukesh Rathor > > <mukesh.rathor@oracle.com> wrote: > >> On Wed, 20 Nov 2013 18:12:13 +0000 > >> George Dunlap <dunlapg@umich.edu> wrote: > >> > >>> On Wed, Nov 20, 2013 at 8:42 AM, Jan Beulich <JBeulich@suse.com> > >>> wrote: > >>> >>>> On 20.11.13 at 03:11, Mukesh Rathor <mukesh.rathor@oracle.com> > >>> >>>> wrote: > >>> >> After rebasing my dom0 on latest, it didn''t boot. After debugging > >>> >> couple days, it turned out to be : > >>> >> > >>> >> + if ( is_pvh_domain(d) ) > >>> >> + { > >>> >> + if ( direct_mmio ) > >>> >> + return MTRR_TYPE_UNCACHABLE; > >>> >> + return MTRR_TYPE_WRBACK; > >>> >> + } > >>> >> + > >>> >> > >>> >> I had in my patches, missing in epte_get_entry_emt() in latest. > >>> >> > >>> >> So, since I don''t know much about this, is an HVM guest setting > >>> >> MTRR range types? Looking for suggestions on best way to do this > >>> >> for PVH. > >>> > > >>> > A HVM guest is permitted to write to (virtual) MTRRs, whereas a PV > >>> > guest isn''t. I''m inclined to prefer PV behavior here to be used for > >>> > PVH (since, as explained by Dongxiao, MTRRs don''t really matter > >>> > for VMX guests anyway, i.e. the setting of (virtual) MTRRs needs to > >>> > get translated to EPT memory types anyway, hence a PVH guest > >>> > ought to be fine ignoring the MTRRs altogether and handling memory > >>> > types exclusively via PAT mechanisms). > >>> > >>> Mukesh, > >>> > >>> Do you know why this line is having this effect? For one, is it a > >>> matter of direct_mmio pages being given something other than > >>> UNCACHEABLE, or a matter of non-direct_mmio pages given something > >>> other than WRBACK? > >>> > >>> And is the problem that the guest is *not* setting MTRRs, or that the > >>> guest *is* setting MTRRs? > >>> > >>> I''d prefer to avoid having a special case for PVH in this path if > >>> possible. > >> > >> Without any changes to epte_get_entry_emt(), all types are being returned > >> as MTRR_TYPE_WRBACK for PVH because of: > >> > >> if ( !v->domain->arch.hvm_domain.params[HVM_PARAM_IDENT_PT] ) > >> return MTRR_TYPE_WRBACK; > >> > >> This is problem for low level non-ram pages being accessed in dom0, > >> (which interesting gave MCE errors). non-ram IO pages have to be > >> MTRR_TYPE_UNCACHABLE. > > > > Hmm -- that looks like a bug in the logic there. AFAICT, there''s no > > reason for the lack of IDENT_PT to change the memory type like that. > > > > Unfortunately the changeset that introduced this (77283249) has > > neither comments nor a proper description of what''s going on, so it''s > > hard to tell where this came from. > > While a commit without any description at all is clearly bogus (even > more so in the light that this is the very change that also caused > XSA-60), in the case here it introduces the function as a whole, and > hence would not have been likely to comment on why the function > was written the way it is. > > I take it that this goes alongside the other check immediately previous > to it: When not set yet, assume WB (for the sake of the tool stack). > But I think this tool stack requirement should be expressed without > looking at this HVM param. > > Sadly the person having written that code doesn''t appear to be > working on Xen anymore (and my not be at Intel anymore either), > so I''m afraid we''ll have to sort this out ourselves. Nevertheless - > Intel folks, can you comment on this please?Hi Jan, This is really a piece of old code, and we can''t recall why this judgment (IDENT_PT) is added in epte_get_entry_emt(). From current view, these two lines are buggy and not necessary, and I will make a patch to remove them. Thanks, Dongxiao> > Jan > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel
On Tue, Dec 3, 2013 at 7:20 AM, Xu, Dongxiao <dongxiao.xu@intel.com> wrote:>> -----Original Message----- >> From: xen-devel-bounces@lists.xen.org >> [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Jan Beulich >> Sent: Friday, November 22, 2013 6:29 PM >> To: George Dunlap; Dong, Eddie; Nakajima, Jun >> Cc: xen-devel; Tim Deegan >> Subject: Re: [Xen-devel] PVH and mtrr/PAT......... >> >> >>> On 21.11.13 at 16:47, George Dunlap <George.Dunlap@eu.citrix.com> wrote: >> > On Wed, Nov 20, 2013 at 10:24 PM, Mukesh Rathor >> > <mukesh.rathor@oracle.com> wrote: >> >> On Wed, 20 Nov 2013 18:12:13 +0000 >> >> George Dunlap <dunlapg@umich.edu> wrote: >> >> >> >>> On Wed, Nov 20, 2013 at 8:42 AM, Jan Beulich <JBeulich@suse.com> >> >>> wrote: >> >>> >>>> On 20.11.13 at 03:11, Mukesh Rathor <mukesh.rathor@oracle.com> >> >>> >>>> wrote: >> >>> >> After rebasing my dom0 on latest, it didn''t boot. After debugging >> >>> >> couple days, it turned out to be : >> >>> >> >> >>> >> + if ( is_pvh_domain(d) ) >> >>> >> + { >> >>> >> + if ( direct_mmio ) >> >>> >> + return MTRR_TYPE_UNCACHABLE; >> >>> >> + return MTRR_TYPE_WRBACK; >> >>> >> + } >> >>> >> + >> >>> >> >> >>> >> I had in my patches, missing in epte_get_entry_emt() in latest. >> >>> >> >> >>> >> So, since I don''t know much about this, is an HVM guest setting >> >>> >> MTRR range types? Looking for suggestions on best way to do this >> >>> >> for PVH. >> >>> > >> >>> > A HVM guest is permitted to write to (virtual) MTRRs, whereas a PV >> >>> > guest isn''t. I''m inclined to prefer PV behavior here to be used for >> >>> > PVH (since, as explained by Dongxiao, MTRRs don''t really matter >> >>> > for VMX guests anyway, i.e. the setting of (virtual) MTRRs needs to >> >>> > get translated to EPT memory types anyway, hence a PVH guest >> >>> > ought to be fine ignoring the MTRRs altogether and handling memory >> >>> > types exclusively via PAT mechanisms). >> >>> >> >>> Mukesh, >> >>> >> >>> Do you know why this line is having this effect? For one, is it a >> >>> matter of direct_mmio pages being given something other than >> >>> UNCACHEABLE, or a matter of non-direct_mmio pages given something >> >>> other than WRBACK? >> >>> >> >>> And is the problem that the guest is *not* setting MTRRs, or that the >> >>> guest *is* setting MTRRs? >> >>> >> >>> I''d prefer to avoid having a special case for PVH in this path if >> >>> possible. >> >> >> >> Without any changes to epte_get_entry_emt(), all types are being returned >> >> as MTRR_TYPE_WRBACK for PVH because of: >> >> >> >> if ( !v->domain->arch.hvm_domain.params[HVM_PARAM_IDENT_PT] ) >> >> return MTRR_TYPE_WRBACK; >> >> >> >> This is problem for low level non-ram pages being accessed in dom0, >> >> (which interesting gave MCE errors). non-ram IO pages have to be >> >> MTRR_TYPE_UNCACHABLE. >> > >> > Hmm -- that looks like a bug in the logic there. AFAICT, there''s no >> > reason for the lack of IDENT_PT to change the memory type like that. >> > >> > Unfortunately the changeset that introduced this (77283249) has >> > neither comments nor a proper description of what''s going on, so it''s >> > hard to tell where this came from. >> >> While a commit without any description at all is clearly bogus (even >> more so in the light that this is the very change that also caused >> XSA-60), in the case here it introduces the function as a whole, and >> hence would not have been likely to comment on why the function >> was written the way it is. >> >> I take it that this goes alongside the other check immediately previous >> to it: When not set yet, assume WB (for the sake of the tool stack). >> But I think this tool stack requirement should be expressed without >> looking at this HVM param. >> >> Sadly the person having written that code doesn''t appear to be >> working on Xen anymore (and my not be at Intel anymore either), >> so I''m afraid we''ll have to sort this out ourselves. Nevertheless - >> Intel folks, can you comment on this please? > > Hi Jan, > > This is really a piece of old code, and we can''t recall why this judgment (IDENT_PT) is added in epte_get_entry_emt(). > > From current view, these two lines are buggy and not necessary, and I will make a patch to remove them.Thanks, Dongxiao. No rush with the patch -- it isn''t causing any functional issues at the moment (just code a bit uglier than necessary), so I think I''d probably advise holding off applying it until the 4.5 development window opens (hopefully sometime end of January / beginning of February). (You can of course get reviews and acks in the mean time.) -George