Mukesh Rathor
2013-Mar-16 00:46 UTC
[PATCH 11/18 V2]: PVH xen: some misc changes like mtrr, intr, msi.
Changes in irq.c as PVH doesn''t use vlapic emulation. In mtrr we add assert and set MTRR_TYPEs for PVH. Changes in V2: - Some cleanup of redundant code. - time.c: Honor no rdtsc exiting for PVH by setting vtsc to 0 in time.c Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com> --- xen/arch/x86/hvm/irq.c | 3 +++ xen/arch/x86/hvm/mtrr.c | 10 ++++++++++ xen/arch/x86/hvm/vmx/intr.c | 7 ++++--- xen/arch/x86/msi.c | 9 +++++++-- xen/arch/x86/time.c | 2 +- 5 files changed, 25 insertions(+), 6 deletions(-) diff --git a/xen/arch/x86/hvm/irq.c b/xen/arch/x86/hvm/irq.c index 9eae5de..92fb245 100644 --- a/xen/arch/x86/hvm/irq.c +++ b/xen/arch/x86/hvm/irq.c @@ -405,6 +405,9 @@ struct hvm_intack hvm_vcpu_has_pending_irq(struct vcpu *v) && vcpu_info(v, evtchn_upcall_pending) ) return hvm_intack_vector(plat->irq.callback_via.vector); + if ( is_pvh_vcpu(v) ) + return hvm_intack_none; + if ( vlapic_accept_pic_intr(v) && plat->vpic[0].int_output ) return hvm_intack_pic(0); diff --git a/xen/arch/x86/hvm/mtrr.c b/xen/arch/x86/hvm/mtrr.c index ef51a8d..8057e88 100644 --- a/xen/arch/x86/hvm/mtrr.c +++ b/xen/arch/x86/hvm/mtrr.c @@ -578,6 +578,9 @@ int32_t hvm_set_mem_pinned_cacheattr( { struct hvm_mem_pinned_cacheattr_range *range; + /* PVH note: The guest writes to MSR_IA32_CR_PAT natively */ + ASSERT( !is_pvh_domain(d) ); + if ( !((type == PAT_TYPE_UNCACHABLE) || (type == PAT_TYPE_WRCOMB) || (type == PAT_TYPE_WRTHROUGH) || @@ -693,6 +696,13 @@ uint8_t epte_get_entry_emt(struct domain *d, unsigned long gfn, mfn_t mfn, ((d->vcpu == NULL) || ((v = d->vcpu[0]) == NULL)) ) return MTRR_TYPE_WRBACK; + /* PVH: fixme/help: do I have this correct? */ + if ( is_pvh_domain(d) ) { + if (direct_mmio) + return MTRR_TYPE_UNCACHABLE; + return MTRR_TYPE_WRBACK; + } + if ( !v->domain->arch.hvm_domain.params[HVM_PARAM_IDENT_PT] ) return MTRR_TYPE_WRBACK; diff --git a/xen/arch/x86/hvm/vmx/intr.c b/xen/arch/x86/hvm/vmx/intr.c index e376f3c..b94f9d5 100644 --- a/xen/arch/x86/hvm/vmx/intr.c +++ b/xen/arch/x86/hvm/vmx/intr.c @@ -219,15 +219,16 @@ void vmx_intr_assist(void) return; } - /* Crank the handle on interrupt state. */ - pt_vector = pt_update_irq(v); + if ( !is_pvh_vcpu(v) ) + /* Crank the handle on interrupt state. */ + pt_vector = pt_update_irq(v); do { intack = hvm_vcpu_has_pending_irq(v); if ( likely(intack.source == hvm_intsrc_none) ) goto out; - if ( unlikely(nvmx_intr_intercept(v, intack)) ) + if ( !is_pvh_vcpu(v) && unlikely(nvmx_intr_intercept(v, intack)) ) goto out; intblk = hvm_interrupt_blocked(v, intack); diff --git a/xen/arch/x86/msi.c b/xen/arch/x86/msi.c index 8804306..0fe3108 100644 --- a/xen/arch/x86/msi.c +++ b/xen/arch/x86/msi.c @@ -787,10 +787,15 @@ static int msix_capability_init(struct pci_dev *dev, if ( !dev->msix_used_entries ) { - if ( rangeset_add_range(mmio_ro_ranges, dev->msix_table.first, + /* PVH: this is temporary only until linux msi.c is fixed. See xen-devel + * thread: "[PVH]: Help: msi.c". + */ + if ( !is_pvh_domain(dev->domain) && + rangeset_add_range(mmio_ro_ranges, dev->msix_table.first, dev->msix_table.last) ) WARN(); - if ( rangeset_add_range(mmio_ro_ranges, dev->msix_pba.first, + if ( !is_pvh_domain(dev->domain) && + rangeset_add_range(mmio_ro_ranges, dev->msix_pba.first, dev->msix_pba.last) ) WARN(); diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c index 56bffdb..eaa1989 100644 --- a/xen/arch/x86/time.c +++ b/xen/arch/x86/time.c @@ -1879,7 +1879,7 @@ void tsc_set_info(struct domain *d, uint32_t tsc_mode, uint64_t elapsed_nsec, uint32_t gtsc_khz, uint32_t incarnation) { - if ( is_idle_domain(d) || (d->domain_id == 0) ) + if ( is_idle_domain(d) || (d->domain_id == 0) || is_pvh_domain(d) ) { d->arch.vtsc = 0; return; -- 1.7.2.3
Jan Beulich
2013-Mar-18 12:21 UTC
Re: [PATCH 11/18 V2]: PVH xen: some misc changes like mtrr, intr, msi.
>>> On 16.03.13 at 01:46, Mukesh Rathor <mukesh.rathor@oracle.com> wrote: > @@ -693,6 +696,13 @@ uint8_t epte_get_entry_emt(struct domain *d, unsigned long gfn, mfn_t mfn, > ((d->vcpu == NULL) || ((v = d->vcpu[0]) == NULL)) ) > return MTRR_TYPE_WRBACK; > > + /* PVH: fixme/help: do I have this correct? */ > + if ( is_pvh_domain(d) ) { > + if (direct_mmio) > + return MTRR_TYPE_UNCACHABLE; > + return MTRR_TYPE_WRBACK;I don''t think you can reasonably reduce a guest to just 2 memory types, particularly not Dom0.> --- a/xen/arch/x86/msi.c > +++ b/xen/arch/x86/msi.c > @@ -787,10 +787,15 @@ static int msix_capability_init(struct pci_dev *dev, > > if ( !dev->msix_used_entries ) > { > - if ( rangeset_add_range(mmio_ro_ranges, dev->msix_table.first, > + /* PVH: this is temporary only until linux msi.c is fixed. See xen-devel > + * thread: "[PVH]: Help: msi.c". > + */ > + if ( !is_pvh_domain(dev->domain) && > + rangeset_add_range(mmio_ro_ranges, dev->msix_table.first, > dev->msix_table.last) ) > WARN(); > - if ( rangeset_add_range(mmio_ro_ranges, dev->msix_pba.first, > + if ( !is_pvh_domain(dev->domain) && > + rangeset_add_range(mmio_ro_ranges, dev->msix_pba.first, > dev->msix_pba.last) ) > WARN(); >As already said before, the whole change above has my explicit NAK. Jan
Konrad Rzeszutek Wilk
2013-Mar-18 17:03 UTC
Re: [PATCH 11/18 V2]: PVH xen: some misc changes like mtrr, intr, msi.
On Fri, Mar 15, 2013 at 05:46:05PM -0700, Mukesh Rathor wrote:> Changes in irq.c as PVH doesn''t use vlapic emulation. In mtrr we add > assert and set MTRR_TYPEs for PVH. > > Changes in V2: > - Some cleanup of redundant code. > - time.c: Honor no rdtsc exiting for PVH by setting vtsc to 0 in time.c > > Signed-off-by: Mukesh Rathor <mukesh.rathor@oracle.com> > --- > xen/arch/x86/hvm/irq.c | 3 +++ > xen/arch/x86/hvm/mtrr.c | 10 ++++++++++ > xen/arch/x86/hvm/vmx/intr.c | 7 ++++--- > xen/arch/x86/msi.c | 9 +++++++-- > xen/arch/x86/time.c | 2 +- > 5 files changed, 25 insertions(+), 6 deletions(-) > > diff --git a/xen/arch/x86/hvm/irq.c b/xen/arch/x86/hvm/irq.c > index 9eae5de..92fb245 100644 > --- a/xen/arch/x86/hvm/irq.c > +++ b/xen/arch/x86/hvm/irq.c > @@ -405,6 +405,9 @@ struct hvm_intack hvm_vcpu_has_pending_irq(struct vcpu *v) > && vcpu_info(v, evtchn_upcall_pending) ) > return hvm_intack_vector(plat->irq.callback_via.vector); > > + if ( is_pvh_vcpu(v) ) > + return hvm_intack_none; > + > if ( vlapic_accept_pic_intr(v) && plat->vpic[0].int_output ) > return hvm_intack_pic(0); > > diff --git a/xen/arch/x86/hvm/mtrr.c b/xen/arch/x86/hvm/mtrr.c > index ef51a8d..8057e88 100644 > --- a/xen/arch/x86/hvm/mtrr.c > +++ b/xen/arch/x86/hvm/mtrr.c > @@ -578,6 +578,9 @@ int32_t hvm_set_mem_pinned_cacheattr( > { > struct hvm_mem_pinned_cacheattr_range *range; > > + /* PVH note: The guest writes to MSR_IA32_CR_PAT natively */ > + ASSERT( !is_pvh_domain(d) ); > + > if ( !((type == PAT_TYPE_UNCACHABLE) || > (type == PAT_TYPE_WRCOMB) || > (type == PAT_TYPE_WRTHROUGH) || > @@ -693,6 +696,13 @@ uint8_t epte_get_entry_emt(struct domain *d, unsigned long gfn, mfn_t mfn, > ((d->vcpu == NULL) || ((v = d->vcpu[0]) == NULL)) ) > return MTRR_TYPE_WRBACK; > > + /* PVH: fixme/help: do I have this correct? */ > + if ( is_pvh_domain(d) ) { > + if (direct_mmio) > + return MTRR_TYPE_UNCACHABLE; > + return MTRR_TYPE_WRBACK; > + } > +Could we use the other code? The get_mtrr_type for example? Or just let this code run through?> if ( !v->domain->arch.hvm_domain.params[HVM_PARAM_IDENT_PT] ) > return MTRR_TYPE_WRBACK; > > diff --git a/xen/arch/x86/hvm/vmx/intr.c b/xen/arch/x86/hvm/vmx/intr.c > index e376f3c..b94f9d5 100644 > --- a/xen/arch/x86/hvm/vmx/intr.c > +++ b/xen/arch/x86/hvm/vmx/intr.c > @@ -219,15 +219,16 @@ void vmx_intr_assist(void) > return; > } > > - /* Crank the handle on interrupt state. */ > - pt_vector = pt_update_irq(v); > + if ( !is_pvh_vcpu(v) ) > + /* Crank the handle on interrupt state. */ > + pt_vector = pt_update_irq(v); > > do { > intack = hvm_vcpu_has_pending_irq(v); > if ( likely(intack.source == hvm_intsrc_none) ) > goto out; > > - if ( unlikely(nvmx_intr_intercept(v, intack)) ) > + if ( !is_pvh_vcpu(v) && unlikely(nvmx_intr_intercept(v, intack)) ) > goto out; > > intblk = hvm_interrupt_blocked(v, intack); > diff --git a/xen/arch/x86/msi.c b/xen/arch/x86/msi.c > index 8804306..0fe3108 100644 > --- a/xen/arch/x86/msi.c > +++ b/xen/arch/x86/msi.c > @@ -787,10 +787,15 @@ static int msix_capability_init(struct pci_dev *dev, > > if ( !dev->msix_used_entries ) > { > - if ( rangeset_add_range(mmio_ro_ranges, dev->msix_table.first, > + /* PVH: this is temporary only until linux msi.c is fixed. See xen-devel > + * thread: "[PVH]: Help: msi.c". > + */ > + if ( !is_pvh_domain(dev->domain) && > + rangeset_add_range(mmio_ro_ranges, dev->msix_table.first, > dev->msix_table.last) ) > WARN(); > - if ( rangeset_add_range(mmio_ro_ranges, dev->msix_pba.first, > + if ( !is_pvh_domain(dev->domain) && > + rangeset_add_range(mmio_ro_ranges, dev->msix_pba.first, > dev->msix_pba.last) ) > WARN(); > > diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c > index 56bffdb..eaa1989 100644 > --- a/xen/arch/x86/time.c > +++ b/xen/arch/x86/time.c > @@ -1879,7 +1879,7 @@ void tsc_set_info(struct domain *d, > uint32_t tsc_mode, uint64_t elapsed_nsec, > uint32_t gtsc_khz, uint32_t incarnation) > { > - if ( is_idle_domain(d) || (d->domain_id == 0) ) > + if ( is_idle_domain(d) || (d->domain_id == 0) || is_pvh_domain(d) )Ah, so the ''timer_mode'' option in the guest config if completly ignored if ''pvh'' is set then! You need to document that in the docs/man/xl.conf.5 file.> { > d->arch.vtsc = 0; > return; > -- > 1.7.2.3 > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel >
Mukesh Rathor
2013-Mar-19 01:20 UTC
Re: [PATCH 11/18 V2]: PVH xen: some misc changes like mtrr, intr, msi.
On Mon, 18 Mar 2013 12:21:11 +0000 "Jan Beulich" <JBeulich@suse.com> wrote:> >>> On 16.03.13 at 01:46, Mukesh Rathor <mukesh.rathor@oracle.com> > >>> wrote: > > +++ b/xen/arch/x86/msi.c > > @@ -787,10 +787,15 @@ static int msix_capability_init(struct > > pci_dev *dev, > > if ( !dev->msix_used_entries ) > > { > > - if ( rangeset_add_range(mmio_ro_ranges, > > dev->msix_table.first, > > + /* PVH: this is temporary only until linux msi.c is fixed. > > See xen-devel > > + * thread: "[PVH]: Help: msi.c". > > + */ > > + if ( !is_pvh_domain(dev->domain) && > > + rangeset_add_range(mmio_ro_ranges, > > dev->msix_table.first, dev->msix_table.last) ) > > WARN(); > > - if ( rangeset_add_range(mmio_ro_ranges, > > dev->msix_pba.first, > > + if ( !is_pvh_domain(dev->domain) && > > + rangeset_add_range(mmio_ro_ranges, > > dev->msix_pba.first, dev->msix_pba.last) ) > > WARN(); > > > > As already said before, the whole change above has my explicit > NAK. > JanDo you have any suggestions? Do you want to hold off on entire xen patch until we go fix linux for this? Or can we just omit this change in the next V 3 and come back to this later. Would you be OK with that? thanks MR
Jan Beulich
2013-Mar-19 08:55 UTC
Re: [PATCH 11/18 V2]: PVH xen: some misc changes like mtrr, intr, msi.
>>> On 19.03.13 at 02:20, Mukesh Rathor <mukesh.rathor@oracle.com> wrote: > On Mon, 18 Mar 2013 12:21:11 +0000 > "Jan Beulich" <JBeulich@suse.com> wrote: > >> >>> On 16.03.13 at 01:46, Mukesh Rathor <mukesh.rathor@oracle.com> >> >>> wrote: >> > +++ b/xen/arch/x86/msi.c >> > @@ -787,10 +787,15 @@ static int msix_capability_init(struct >> > pci_dev *dev, >> > if ( !dev->msix_used_entries ) >> > { >> > - if ( rangeset_add_range(mmio_ro_ranges, >> > dev->msix_table.first, >> > + /* PVH: this is temporary only until linux msi.c is fixed. >> > See xen-devel >> > + * thread: "[PVH]: Help: msi.c". >> > + */ >> > + if ( !is_pvh_domain(dev->domain) && >> > + rangeset_add_range(mmio_ro_ranges, >> > dev->msix_table.first, dev->msix_table.last) ) >> > WARN(); >> > - if ( rangeset_add_range(mmio_ro_ranges, >> > dev->msix_pba.first, >> > + if ( !is_pvh_domain(dev->domain) && >> > + rangeset_add_range(mmio_ro_ranges, >> > dev->msix_pba.first, dev->msix_pba.last) ) >> > WARN(); >> > >> >> As already said before, the whole change above has my explicit >> NAK. >> Jan > > Do you have any suggestions? Do you want to hold off on entire xen patch > until we go fix linux for this? Or can we just omit this change in the next > V 3 and come back to this later. Would you be OK with that?If you can''t get MSI (or perhaps just MSI-X) to work properly for PVH, failing the respective setup operations just for PVH guests would be the right approach - the guest has to be prepared to run without MSI generally anyway (minus devices that have no interrupt pin at all of course). Opening security holes certainly isn''t acceptable, even for code that is considered experimental only (mostly because it way too easily happens that the experimental status then gets dropped without remembering all the pieces that need fixing). Jan
Konrad Rzeszutek Wilk
2013-Mar-19 14:11 UTC
Re: [PATCH 11/18 V2]: PVH xen: some misc changes like mtrr, intr, msi.
On Tue, Mar 19, 2013 at 08:55:20AM +0000, Jan Beulich wrote:> >>> On 19.03.13 at 02:20, Mukesh Rathor <mukesh.rathor@oracle.com> wrote: > > On Mon, 18 Mar 2013 12:21:11 +0000 > > "Jan Beulich" <JBeulich@suse.com> wrote: > > > >> >>> On 16.03.13 at 01:46, Mukesh Rathor <mukesh.rathor@oracle.com> > >> >>> wrote: > >> > +++ b/xen/arch/x86/msi.c > >> > @@ -787,10 +787,15 @@ static int msix_capability_init(struct > >> > pci_dev *dev, > >> > if ( !dev->msix_used_entries ) > >> > { > >> > - if ( rangeset_add_range(mmio_ro_ranges, > >> > dev->msix_table.first, > >> > + /* PVH: this is temporary only until linux msi.c is fixed. > >> > See xen-devel > >> > + * thread: "[PVH]: Help: msi.c". > >> > + */ > >> > + if ( !is_pvh_domain(dev->domain) && > >> > + rangeset_add_range(mmio_ro_ranges, > >> > dev->msix_table.first, dev->msix_table.last) ) > >> > WARN(); > >> > - if ( rangeset_add_range(mmio_ro_ranges, > >> > dev->msix_pba.first, > >> > + if ( !is_pvh_domain(dev->domain) && > >> > + rangeset_add_range(mmio_ro_ranges, > >> > dev->msix_pba.first, dev->msix_pba.last) ) > >> > WARN(); > >> > > >> > >> As already said before, the whole change above has my explicit > >> NAK. > >> Jan > > > > Do you have any suggestions? Do you want to hold off on entire xen patch > > until we go fix linux for this? Or can we just omit this change in the next > > V 3 and come back to this later. Would you be OK with that? > > If you can''t get MSI (or perhaps just MSI-X) to work properly > for PVH, failing the respective setup operations just for PVH > guests would be the right approach - the guest has to be > prepared to run without MSI generally anyway (minus devices > that have no interrupt pin at all of course). Opening security > holes certainly isn''t acceptable, even for code that is considered > experimental only (mostly because it way too easily happens > that the experimental status then gets dropped without > remembering all the pieces that need fixing).And it looks like it can actually work [ 19.931793] atl1c 0000:03:00.0: xen map irq failed -22 for 32752 domain [ 19.938464] atl1c 0000:03:00.0: Unable to allocate MSI interrupt Error: -22 [ 19.945989] atl1c 0000:03:00.0: atl1c: eth0 NIC Link is Up<1000 Mbps Full Duplex> Albeit the Linux code looks to have a bug. (This is with a really silly code): diff --git a/xen/arch/x86/physdev.c b/xen/arch/x86/physdev.c index 876ac9d..2571e6b 100644 --- a/xen/arch/x86/physdev.c +++ b/xen/arch/x86/physdev.c @@ -19,6 +19,9 @@ #include <xsm/xsm.h> #include <asm/p2m.h> +static bool_t msi_enabled = 1; +boolean_param("msi_enabled", msi_enabled); + int physdev_map_pirq(domid_t, int type, int *index, int *pirq_p, struct msi_info *); int physdev_unmap_pirq(domid_t, int pirq); @@ -140,6 +143,9 @@ int physdev_map_pirq(domid_t domid, int type, int *index, int *pirq_p, break; case MAP_PIRQ_TYPE_MSI: + if ( !msi_enabled ) + goto out; + irq = *index; if ( irq == -1 ) irq = create_irq(NUMA_NO_NODE); @@ -157,6 +163,7 @@ int physdev_map_pirq(domid_t domid, int type, int *index, int *pirq_p, break; default: +out: dprintk(XENLOG_G_ERR, "dom%d: wrong map_pirq type %x\n", d->domain_id, type); ret = -EINVAL;