Wu, Feng
2013-Nov-11 12:33 UTC
[PATCH] x86: properly handle MSI-X unmask operation from guests
Patch #1: For Xen itself From 5e89c4a3f16d3556294f276039683dcf8abaeb84 Mon Sep 17 00:00:00 2001 From: Feng Wu <feng.wu@intel.com> Date: Fri, 1 Nov 2013 00:51:46 -0400 Subject: [PATCH] x86: properly handle MSI-X unmask operation from guests For a pass-through device with MSI-x capability, when guest tries to unmask the MSI-x interrupt for the passed through device, xen doesn''t clear the mask bit for MSI-x in hardware in the following scenario, which will cause network disconnection: 1. Guest masks the MSI-x interrupt 2. Guest updates the address and data for it 3. Guest unmasks the MSI-x interrupt (This is the problematic step) In the step #3 above, Xen doesn''t handle it well. When guest tries to unmask MSI-X interrupt, it traps to Xen, Xen just returns to Qemu if it notices that address or data has been modified by guest before, then Qemu will update Xen with the latest value of address/data by hypercall. However, in this whole process, the MSI-X interrupt unmask operation is missing, which means Xen doesn''t clear the mask bit in hardware for the MSI-X interrupt, so it remains disabled, that is why it loses the network connection. This patch fixes this issue. Signed-off-by: Feng Wu <feng.wu@intel.com> --- tools/libxc/xc_domain.c | 4 ++++ tools/libxc/xenctrl.h | 2 ++ xen/arch/x86/hvm/vmsi.c | 14 ++++++++++++++ xen/drivers/passthrough/io.c | 7 +++++++ xen/include/public/domctl.h | 2 ++ xen/include/xen/pci.h | 2 ++ 6 files changed, 31 insertions(+), 0 deletions(-) diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c index 2cea6e3..d8e583e 100644 --- a/tools/libxc/xc_domain.c +++ b/tools/libxc/xc_domain.c @@ -1342,6 +1342,8 @@ int xc_domain_update_msi_irq( uint32_t gvec, uint32_t pirq, uint32_t gflags, + uint32_t vector_ctrl, + int nr_entry, uint64_t gtable) { int rc; @@ -1359,6 +1361,8 @@ int xc_domain_update_msi_irq( bind->u.msi.gvec = gvec; bind->u.msi.gflags = gflags; bind->u.msi.gtable = gtable; + bind->u.msi.vector_ctrl = vector_ctrl; + bind->u.msi.nr_entry = nr_entry; rc = do_domctl(xch, &domctl); return rc; diff --git a/tools/libxc/xenctrl.h b/tools/libxc/xenctrl.h index 8cf3f3b..9b972b3 100644 --- a/tools/libxc/xenctrl.h +++ b/tools/libxc/xenctrl.h @@ -1781,6 +1781,8 @@ int xc_domain_update_msi_irq( uint32_t gvec, uint32_t pirq, uint32_t gflags, + uint32_t vector_ctrl, + int nr_entry, uint64_t gtable); int xc_domain_unbind_msi_irq(xc_interface *xch, diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c index 4826b4a..5cee099 100644 --- a/xen/arch/x86/hvm/vmsi.c +++ b/xen/arch/x86/hvm/vmsi.c @@ -528,3 +528,17 @@ void msixtbl_pt_cleanup(struct domain *d) spin_unlock(&d->arch.hvm_domain.msixtbl_list_lock); local_irq_restore(flags); } + +int msixtbl_unmask(struct vcpu *v, unsigned long table_base, + unsigned int nr_entry) +{ + unsigned long ctrl_address; + + ctrl_address = table_base + nr_entry * PCI_MSIX_ENTRY_SIZE + + PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET; + + if (msixtbl_write(v, ctrl_address, 4, 0) != X86EMUL_OKAY ) + return -1; + + return 0; +} diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c index f64e4ac..8039b72 100644 --- a/xen/drivers/passthrough/io.c +++ b/xen/drivers/passthrough/io.c @@ -193,6 +193,13 @@ int pt_irq_create_bind( spin_unlock(&d->event_lock); if ( dest_vcpu_id >= 0 ) hvm_migrate_pirqs(d->vcpu[dest_vcpu_id]); + + if ((pt_irq_bind->u.msi.vector_ctrl & PCI_MSIX_VECTOR_BITMASK) == 0) { + rc = msixtbl_unmask(d->vcpu[0], pt_irq_bind->u.msi.gtable, + pt_irq_bind->u.msi.nr_entry); + if (rc) + return -EBUSY; + } } else { diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h index d4e479f..7a31c28 100644 --- a/xen/include/public/domctl.h +++ b/xen/include/public/domctl.h @@ -503,6 +503,8 @@ struct xen_domctl_bind_pt_irq { struct { uint8_t gvec; uint32_t gflags; + uint32_t vector_ctrl; + int nr_entry; uint64_aligned_t gtable; } msi; } u; diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h index cadb525..1eec625 100644 --- a/xen/include/xen/pci.h +++ b/xen/include/xen/pci.h @@ -147,5 +147,7 @@ struct pirq; int msixtbl_pt_register(struct domain *, struct pirq *, uint64_t gtable); void msixtbl_pt_unregister(struct domain *, struct pirq *); void msixtbl_pt_cleanup(struct domain *d); +int msixtbl_unmask(struct vcpu *v, unsigned long table_base, + unsigned int nr_entry); #endif /* __XEN_PCI_H__ */ -- 1.7.1 Patch #2: For Qemu-xen: From 0c3d2e73b06b7ccf357163c75dedf537897c6a21 Mon Sep 17 00:00:00 2001 From: Feng Wu <feng.wu@intel.com> Date: Wed, 6 Nov 2013 01:26:58 -0500 Subject: [PATCH] x86: properly handle MSI-X unmask operation from guests For a pass-through device with MSI-x capability, when guest tries to unmask the MSI-x interrupt for the passed through device, xen doesn''t clear the mask bit for MSI-x in hardware in the following scenario, which will cause network disconnection: 1. Guest masks the MSI-x interrupt 2. Guest updates the address and data for it 3. Guest unmasks the MSI-x interrupt (This is the problematic step) In the step #3 above, Xen doesn''t handle it well. When guest tries to unmask MSI-X interrupt, it traps to Xen, Xen just returns to Qemu if it notices that address or data has been modified by guest before, then Qemu will update Xen with the latest value of address/data by hypercall. However, in this whole process, the MSI-X interrupt unmask operation is missing, which means Xen doesn''t clear the mask bit in hardware for the MSI-X interrupt, so it remains disabled, that is why it loses the network connection. This patch fixes this issue. Signed-off-by: Feng Wu <feng.wu@intel.com> --- hw/xen_pt_msi.c | 37 +++++++++++++++++++++++-------------- 1 files changed, 23 insertions(+), 14 deletions(-) diff --git a/hw/xen_pt_msi.c b/hw/xen_pt_msi.c index db757cd..bacf2e9 100644 --- a/hw/xen_pt_msi.c +++ b/hw/xen_pt_msi.c @@ -143,6 +143,7 @@ static int msi_msix_setup(XenPCIPassthroughState *s, static int msi_msix_update(XenPCIPassthroughState *s, uint64_t addr, uint32_t data, + uint32_t vector_ctrl, int pirq, bool is_msix, int msix_entry, @@ -162,8 +163,8 @@ static int msi_msix_update(XenPCIPassthroughState *s, table_addr = s->msix->mmio_base_addr; } - rc = xc_domain_update_msi_irq(xen_xc, xen_domid, gvec, - pirq, gflags, table_addr); + rc = xc_domain_update_msi_irq(xen_xc, xen_domid, gvec, pirq, gflags, + vector_ctrl, msix_entry, table_addr); if (rc) { XEN_PT_ERR(d, "Updating of MSI%s failed. (rc: %d)\n", @@ -264,7 +265,7 @@ int xen_pt_msi_setup(XenPCIPassthroughState *s) int xen_pt_msi_update(XenPCIPassthroughState *s) { XenPTMSI *msi = s->msi; - return msi_msix_update(s, msi_addr64(msi), msi->data, msi->pirq, + return msi_msix_update(s, msi_addr64(msi), msi->data, 1, msi->pirq, false, 0, &msi->pirq); } @@ -330,8 +331,8 @@ static int xen_pt_msix_update_one(XenPCIPassthroughState *s, int entry_nr) entry->pirq = pirq; } - rc = msi_msix_update(s, entry->addr, entry->data, pirq, true, - entry_nr, &entry->pirq); + rc = msi_msix_update(s, entry->addr, entry->data, entry->vector_ctrl, pirq, + true, entry_nr, &entry->pirq); if (!rc) { entry->updated = false; @@ -434,6 +435,7 @@ static void pci_msix_write(void *opaque, hwaddr addr, XenPTMSIX *msix = s->msix; XenPTMSIXEntry *entry; int entry_nr, offset; + const volatile uint32_t *vec_ctrl; entry_nr = addr / PCI_MSIX_ENTRY_SIZE; if (entry_nr < 0 || entry_nr >= msix->total_entries) { @@ -443,20 +445,18 @@ static void pci_msix_write(void *opaque, hwaddr addr, entry = &msix->msix_entry[entry_nr]; offset = addr % PCI_MSIX_ENTRY_SIZE; - if (offset != PCI_MSIX_ENTRY_VECTOR_CTRL) { - const volatile uint32_t *vec_ctrl; + /* + * If Xen intercepts the mask bit access, entry->vec_ctrl may not be + * up-to-date. Read from hardware directly. + */ + vec_ctrl = s->msix->phys_iomem_base + entry_nr * PCI_MSIX_ENTRY_SIZE + + PCI_MSIX_ENTRY_VECTOR_CTRL; + if (offset != PCI_MSIX_ENTRY_VECTOR_CTRL) { if (get_entry_value(entry, offset) == val) { return; } - /* - * If Xen intercepts the mask bit access, entry->vec_ctrl may not be - * up-to-date. Read from hardware directly. - */ - vec_ctrl = s->msix->phys_iomem_base + entry_nr * PCI_MSIX_ENTRY_SIZE - + PCI_MSIX_ENTRY_VECTOR_CTRL; - if (msix->enabled && !(*vec_ctrl & PCI_MSIX_ENTRY_CTRL_MASKBIT)) { XEN_PT_ERR(&s->dev, "Can''t update msix entry %d since MSI-X is" " already enabled.\n", entry_nr); @@ -469,6 +469,15 @@ static void pci_msix_write(void *opaque, hwaddr addr, set_entry_value(entry, offset, val); if (offset == PCI_MSIX_ENTRY_VECTOR_CTRL) { + + /* + * If guest unmasks the MSI-X interrupt, we also need to + * notify Xen to update it. + */ + if ((val & PCI_MSIX_ENTRY_CTRL_MASKBIT) !+ (*vec_ctrl & PCI_MSIX_ENTRY_CTRL_MASKBIT)) + entry->updated = true; + if (msix->enabled && !(val & PCI_MSIX_ENTRY_CTRL_MASKBIT)) { xen_pt_msix_update_one(s, entry_nr); } -- 1.7.1 Patch #3: For Qemu-traditional: From 9c0cb506a10ce4588b5fc048f4be2b9997cb0c75 Mon Sep 17 00:00:00 2001 From: Feng Wu <feng.wu@intel.com> Date: Fri, 1 Nov 2013 00:52:48 -0400 Subject: [PATCH] x86: properly handle MSI-X unmask operation from guests For a pass-through device with MSI-x capability, when guest tries to unmask the MSI-x interrupt for the passed through device, xen doesn''t clear the mask bit for MSI-x in hardware in the following scenario, which will cause network disconnection: 1. Guest masks the MSI-x interrupt 2. Guest updates the address and data for it 3. Guest unmasks the MSI-x interrupt (This is the problematic step) In the step #3 above, Xen doesn''t handle it well. When guest tries to unmask MSI-X interrupt, it traps to Xen, Xen just returns to Qemu if it notices that address or data has been modified by guest before, then Qemu will update Xen with the latest value of address/data by hypercall. However, in this whole process, the MSI-X interrupt unmask operation is missing, which means Xen doesn''t clear the mask bit in hardware for the MSI-X interrupt, so it remains disabled, that is why it loses the network connection. This patch fixes this issue. Signed-off-by: Feng Wu <feng.wu@intel.com> --- hw/pt-msi.c | 27 ++++++++++++++++++--------- 1 files changed, 18 insertions(+), 9 deletions(-) diff --git a/hw/pt-msi.c b/hw/pt-msi.c index b03b989..64a2531 100644 --- a/hw/pt-msi.c +++ b/hw/pt-msi.c @@ -142,7 +142,7 @@ int pt_msi_update(struct pt_dev *d) d->msi->pirq, gvec, gflags); ret = xc_domain_update_msi_irq(xc_handle, domid, gvec, - d->msi->pirq, gflags, 0); + d->msi->pirq, gflags, 1, 0, 0); if (ret) { @@ -281,6 +281,7 @@ static int pt_msix_update_one(struct pt_dev *dev, int entry_nr) int gvec = entry->io_mem[2] & 0xff; uint64_t gaddr = *(uint64_t *)&entry->io_mem[0]; uint32_t gflags = __get_msi_gflags(entry->io_mem[2], gaddr); + uint32_t vector_ctrl = entry->io_mem[3]; int ret; if ( !entry->flags ) @@ -319,6 +320,7 @@ static int pt_msix_update_one(struct pt_dev *dev, int entry_nr) entry_nr, pirq, gvec); ret = xc_domain_update_msi_irq(xc_handle, domid, gvec, pirq, gflags, + vector_ctrl, entry_nr, dev->msix->mmio_base_addr); if ( ret ) { @@ -431,6 +433,7 @@ static void pci_msix_writel(void *opaque, target_phys_addr_t addr, uint32_t val) struct pt_msix_info *msix = dev->msix; struct msix_entry_info *entry; int entry_nr, offset; + const volatile uint32_t *vec_ctrl; if ( addr % 4 ) { @@ -443,19 +446,17 @@ static void pci_msix_writel(void *opaque, target_phys_addr_t addr, uint32_t val) entry = &msix->msix_entry[entry_nr]; offset = ((addr - msix->mmio_base_addr) % 16) / 4; + /* + * If Xen intercepts the mask bit access, io_mem[3] may not be + * up-to-date. Read from hardware directly. + */ + vec_ctrl = msix->phys_iomem_base + 16 * entry_nr + 12; + if ( offset != 3 ) { - const volatile uint32_t *vec_ctrl; - if ( entry->io_mem[offset] == val ) return; - /* - * If Xen intercepts the mask bit access, io_mem[3] may not be - * up-to-date. Read from hardware directly. - */ - vec_ctrl = msix->phys_iomem_base + 16 * entry_nr + 12; - if ( msix->enabled && !(*vec_ctrl & 0x1) ) { PT_LOG("Can''t update entry %d since MSI-X is already enabled" @@ -471,6 +472,14 @@ static void pci_msix_writel(void *opaque, target_phys_addr_t addr, uint32_t val) if ( offset == 3 ) { + + /* + * If guest unmasks the MSI-X interrupt, we also need to + * notify Xen to update it. + */ + if ((val & 0x1) != (*vec_ctrl & 0x1)) + entry->flags = 1; + if ( msix->enabled && !(val & 0x1) ) pt_msix_update_one(dev, entry_nr); } -- 1.7.1 Thanks, Feng _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Andrew Cooper
2013-Nov-11 12:53 UTC
Re: [PATCH] x86: properly handle MSI-X unmask operation from guests
On 11/11/13 12:33, Wu, Feng wrote:> Patch #1: For Xen itself > > From 5e89c4a3f16d3556294f276039683dcf8abaeb84 Mon Sep 17 00:00:00 2001 > From: Feng Wu <feng.wu@intel.com> > Date: Fri, 1 Nov 2013 00:51:46 -0400 > Subject: [PATCH] x86: properly handle MSI-X unmask operation from guests > > For a pass-through device with MSI-x capability, when guest tries > to unmask the MSI-x interrupt for the passed through device, xen > doesn''t clear the mask bit for MSI-x in hardware in the following > scenario, which will cause network disconnection: > > 1. Guest masks the MSI-x interrupt > 2. Guest updates the address and data for it > 3. Guest unmasks the MSI-x interrupt (This is the problematic step) > > In the step #3 above, Xen doesn''t handle it well. When guest tries > to unmask MSI-X interrupt, it traps to Xen, Xen just returns to Qemu > if it notices that address or data has been modified by guest before, > then Qemu will update Xen with the latest value of address/data by > hypercall. However, in this whole process, the MSI-X interrupt unmask > operation is missing, which means Xen doesn''t clear the mask bit in > hardware for the MSI-X interrupt, so it remains disabled, that is why > it loses the network connection. > > This patch fixes this issue. > > Signed-off-by: Feng Wu <feng.wu@intel.com> > --- > tools/libxc/xc_domain.c | 4 ++++ > tools/libxc/xenctrl.h | 2 ++ > xen/arch/x86/hvm/vmsi.c | 14 ++++++++++++++ > xen/drivers/passthrough/io.c | 7 +++++++ > xen/include/public/domctl.h | 2 ++ > xen/include/xen/pci.h | 2 ++ > 6 files changed, 31 insertions(+), 0 deletions(-) > > diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c > index 2cea6e3..d8e583e 100644 > --- a/tools/libxc/xc_domain.c > +++ b/tools/libxc/xc_domain.c > @@ -1342,6 +1342,8 @@ int xc_domain_update_msi_irq( > uint32_t gvec, > uint32_t pirq, > uint32_t gflags, > + uint32_t vector_ctrl, > + int nr_entry, > uint64_t gtable) > { > int rc; > @@ -1359,6 +1361,8 @@ int xc_domain_update_msi_irq( > bind->u.msi.gvec = gvec; > bind->u.msi.gflags = gflags; > bind->u.msi.gtable = gtable; > + bind->u.msi.vector_ctrl = vector_ctrl; > + bind->u.msi.nr_entry = nr_entry; > > rc = do_domctl(xch, &domctl); > return rc; > diff --git a/tools/libxc/xenctrl.h b/tools/libxc/xenctrl.h > index 8cf3f3b..9b972b3 100644 > --- a/tools/libxc/xenctrl.h > +++ b/tools/libxc/xenctrl.h > @@ -1781,6 +1781,8 @@ int xc_domain_update_msi_irq( > uint32_t gvec, > uint32_t pirq, > uint32_t gflags, > + uint32_t vector_ctrl, > + int nr_entry, > uint64_t gtable); > > int xc_domain_unbind_msi_irq(xc_interface *xch, > diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c > index 4826b4a..5cee099 100644 > --- a/xen/arch/x86/hvm/vmsi.c > +++ b/xen/arch/x86/hvm/vmsi.c > @@ -528,3 +528,17 @@ void msixtbl_pt_cleanup(struct domain *d) > spin_unlock(&d->arch.hvm_domain.msixtbl_list_lock); > local_irq_restore(flags); > } > + > +int msixtbl_unmask(struct vcpu *v, unsigned long table_base, > + unsigned int nr_entry) > +{ > + unsigned long ctrl_address; > + > + ctrl_address = table_base + nr_entry * PCI_MSIX_ENTRY_SIZE + > + PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET; > + > + if (msixtbl_write(v, ctrl_address, 4, 0) != X86EMUL_OKAY ) > + return -1; > + > + return 0; > +} > diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c > index f64e4ac..8039b72 100644 > --- a/xen/drivers/passthrough/io.c > +++ b/xen/drivers/passthrough/io.c > @@ -193,6 +193,13 @@ int pt_irq_create_bind( > spin_unlock(&d->event_lock); > if ( dest_vcpu_id >= 0 ) > hvm_migrate_pirqs(d->vcpu[dest_vcpu_id]); > + > + if ((pt_irq_bind->u.msi.vector_ctrl & PCI_MSIX_VECTOR_BITMASK) == 0) { > + rc = msixtbl_unmask(d->vcpu[0], pt_irq_bind->u.msi.gtable, > + pt_irq_bind->u.msi.nr_entry); > + if (rc) > + return -EBUSY; > + } > } > else > { > diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h > index d4e479f..7a31c28 100644 > --- a/xen/include/public/domctl.h > +++ b/xen/include/public/domctl.h > @@ -503,6 +503,8 @@ struct xen_domctl_bind_pt_irq { > struct { > uint8_t gvec; > uint32_t gflags; > + uint32_t vector_ctrl; > + int nr_entry; > uint64_aligned_t gtable; > } msi; > } u;NACK - you absolutely cannot change the ABI like this. At the very least, you must bump XEN_DOMCTL_INTERFACE_VERSION. ~Andrew> diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h > index cadb525..1eec625 100644 > --- a/xen/include/xen/pci.h > +++ b/xen/include/xen/pci.h > @@ -147,5 +147,7 @@ struct pirq; > int msixtbl_pt_register(struct domain *, struct pirq *, uint64_t gtable); > void msixtbl_pt_unregister(struct domain *, struct pirq *); > void msixtbl_pt_cleanup(struct domain *d); > +int msixtbl_unmask(struct vcpu *v, unsigned long table_base, > + unsigned int nr_entry); > > #endif /* __XEN_PCI_H__ */ > -- > 1.7.1 > > Patch #2: For Qemu-xen: > > From 0c3d2e73b06b7ccf357163c75dedf537897c6a21 Mon Sep 17 00:00:00 2001 > From: Feng Wu <feng.wu@intel.com> > Date: Wed, 6 Nov 2013 01:26:58 -0500 > Subject: [PATCH] x86: properly handle MSI-X unmask operation from guests > > For a pass-through device with MSI-x capability, when guest tries > to unmask the MSI-x interrupt for the passed through device, xen > doesn''t clear the mask bit for MSI-x in hardware in the following > scenario, which will cause network disconnection: > > 1. Guest masks the MSI-x interrupt > 2. Guest updates the address and data for it > 3. Guest unmasks the MSI-x interrupt (This is the problematic step) > > In the step #3 above, Xen doesn''t handle it well. When guest tries > to unmask MSI-X interrupt, it traps to Xen, Xen just returns to Qemu > if it notices that address or data has been modified by guest before, > then Qemu will update Xen with the latest value of address/data by > hypercall. However, in this whole process, the MSI-X interrupt unmask > operation is missing, which means Xen doesn''t clear the mask bit in > hardware for the MSI-X interrupt, so it remains disabled, that is why > it loses the network connection. > > This patch fixes this issue. > > Signed-off-by: Feng Wu <feng.wu@intel.com> > --- > hw/xen_pt_msi.c | 37 +++++++++++++++++++++++-------------- > 1 files changed, 23 insertions(+), 14 deletions(-) > > diff --git a/hw/xen_pt_msi.c b/hw/xen_pt_msi.c > index db757cd..bacf2e9 100644 > --- a/hw/xen_pt_msi.c > +++ b/hw/xen_pt_msi.c > @@ -143,6 +143,7 @@ static int msi_msix_setup(XenPCIPassthroughState *s, > static int msi_msix_update(XenPCIPassthroughState *s, > uint64_t addr, > uint32_t data, > + uint32_t vector_ctrl, > int pirq, > bool is_msix, > int msix_entry, > @@ -162,8 +163,8 @@ static int msi_msix_update(XenPCIPassthroughState *s, > table_addr = s->msix->mmio_base_addr; > } > > - rc = xc_domain_update_msi_irq(xen_xc, xen_domid, gvec, > - pirq, gflags, table_addr); > + rc = xc_domain_update_msi_irq(xen_xc, xen_domid, gvec, pirq, gflags, > + vector_ctrl, msix_entry, table_addr); > > if (rc) { > XEN_PT_ERR(d, "Updating of MSI%s failed. (rc: %d)\n", > @@ -264,7 +265,7 @@ int xen_pt_msi_setup(XenPCIPassthroughState *s) > int xen_pt_msi_update(XenPCIPassthroughState *s) > { > XenPTMSI *msi = s->msi; > - return msi_msix_update(s, msi_addr64(msi), msi->data, msi->pirq, > + return msi_msix_update(s, msi_addr64(msi), msi->data, 1, msi->pirq, > false, 0, &msi->pirq); > } > > @@ -330,8 +331,8 @@ static int xen_pt_msix_update_one(XenPCIPassthroughState *s, int entry_nr) > entry->pirq = pirq; > } > > - rc = msi_msix_update(s, entry->addr, entry->data, pirq, true, > - entry_nr, &entry->pirq); > + rc = msi_msix_update(s, entry->addr, entry->data, entry->vector_ctrl, pirq, > + true, entry_nr, &entry->pirq); > > if (!rc) { > entry->updated = false; > @@ -434,6 +435,7 @@ static void pci_msix_write(void *opaque, hwaddr addr, > XenPTMSIX *msix = s->msix; > XenPTMSIXEntry *entry; > int entry_nr, offset; > + const volatile uint32_t *vec_ctrl; > > entry_nr = addr / PCI_MSIX_ENTRY_SIZE; > if (entry_nr < 0 || entry_nr >= msix->total_entries) { > @@ -443,20 +445,18 @@ static void pci_msix_write(void *opaque, hwaddr addr, > entry = &msix->msix_entry[entry_nr]; > offset = addr % PCI_MSIX_ENTRY_SIZE; > > - if (offset != PCI_MSIX_ENTRY_VECTOR_CTRL) { > - const volatile uint32_t *vec_ctrl; > + /* > + * If Xen intercepts the mask bit access, entry->vec_ctrl may not be > + * up-to-date. Read from hardware directly. > + */ > + vec_ctrl = s->msix->phys_iomem_base + entry_nr * PCI_MSIX_ENTRY_SIZE > + + PCI_MSIX_ENTRY_VECTOR_CTRL; > > + if (offset != PCI_MSIX_ENTRY_VECTOR_CTRL) { > if (get_entry_value(entry, offset) == val) { > return; > } > > - /* > - * If Xen intercepts the mask bit access, entry->vec_ctrl may not be > - * up-to-date. Read from hardware directly. > - */ > - vec_ctrl = s->msix->phys_iomem_base + entry_nr * PCI_MSIX_ENTRY_SIZE > - + PCI_MSIX_ENTRY_VECTOR_CTRL; > - > if (msix->enabled && !(*vec_ctrl & PCI_MSIX_ENTRY_CTRL_MASKBIT)) { > XEN_PT_ERR(&s->dev, "Can''t update msix entry %d since MSI-X is" > " already enabled.\n", entry_nr); > @@ -469,6 +469,15 @@ static void pci_msix_write(void *opaque, hwaddr addr, > set_entry_value(entry, offset, val); > > if (offset == PCI_MSIX_ENTRY_VECTOR_CTRL) { > + > + /* > + * If guest unmasks the MSI-X interrupt, we also need to > + * notify Xen to update it. > + */ > + if ((val & PCI_MSIX_ENTRY_CTRL_MASKBIT) !> + (*vec_ctrl & PCI_MSIX_ENTRY_CTRL_MASKBIT)) > + entry->updated = true; > + > if (msix->enabled && !(val & PCI_MSIX_ENTRY_CTRL_MASKBIT)) { > xen_pt_msix_update_one(s, entry_nr); > } > -- > 1.7.1 > > > Patch #3: For Qemu-traditional: > > From 9c0cb506a10ce4588b5fc048f4be2b9997cb0c75 Mon Sep 17 00:00:00 2001 > From: Feng Wu <feng.wu@intel.com> > Date: Fri, 1 Nov 2013 00:52:48 -0400 > Subject: [PATCH] x86: properly handle MSI-X unmask operation from guests > > For a pass-through device with MSI-x capability, when guest tries > to unmask the MSI-x interrupt for the passed through device, xen > doesn''t clear the mask bit for MSI-x in hardware in the following > scenario, which will cause network disconnection: > > 1. Guest masks the MSI-x interrupt > 2. Guest updates the address and data for it > 3. Guest unmasks the MSI-x interrupt (This is the problematic step) > > In the step #3 above, Xen doesn''t handle it well. When guest tries > to unmask MSI-X interrupt, it traps to Xen, Xen just returns to Qemu > if it notices that address or data has been modified by guest before, > then Qemu will update Xen with the latest value of address/data by > hypercall. However, in this whole process, the MSI-X interrupt unmask > operation is missing, which means Xen doesn''t clear the mask bit in > hardware for the MSI-X interrupt, so it remains disabled, that is why > it loses the network connection. > > This patch fixes this issue. > > Signed-off-by: Feng Wu <feng.wu@intel.com> > --- > hw/pt-msi.c | 27 ++++++++++++++++++--------- > 1 files changed, 18 insertions(+), 9 deletions(-) > > diff --git a/hw/pt-msi.c b/hw/pt-msi.c > index b03b989..64a2531 100644 > --- a/hw/pt-msi.c > +++ b/hw/pt-msi.c > @@ -142,7 +142,7 @@ int pt_msi_update(struct pt_dev *d) > d->msi->pirq, gvec, gflags); > > ret = xc_domain_update_msi_irq(xc_handle, domid, gvec, > - d->msi->pirq, gflags, 0); > + d->msi->pirq, gflags, 1, 0, 0); > > if (ret) > { > @@ -281,6 +281,7 @@ static int pt_msix_update_one(struct pt_dev *dev, int entry_nr) > int gvec = entry->io_mem[2] & 0xff; > uint64_t gaddr = *(uint64_t *)&entry->io_mem[0]; > uint32_t gflags = __get_msi_gflags(entry->io_mem[2], gaddr); > + uint32_t vector_ctrl = entry->io_mem[3]; > int ret; > > if ( !entry->flags ) > @@ -319,6 +320,7 @@ static int pt_msix_update_one(struct pt_dev *dev, int entry_nr) > entry_nr, pirq, gvec); > > ret = xc_domain_update_msi_irq(xc_handle, domid, gvec, pirq, gflags, > + vector_ctrl, entry_nr, > dev->msix->mmio_base_addr); > if ( ret ) > { > @@ -431,6 +433,7 @@ static void pci_msix_writel(void *opaque, target_phys_addr_t addr, uint32_t val) > struct pt_msix_info *msix = dev->msix; > struct msix_entry_info *entry; > int entry_nr, offset; > + const volatile uint32_t *vec_ctrl; > > if ( addr % 4 ) > { > @@ -443,19 +446,17 @@ static void pci_msix_writel(void *opaque, target_phys_addr_t addr, uint32_t val) > entry = &msix->msix_entry[entry_nr]; > offset = ((addr - msix->mmio_base_addr) % 16) / 4; > > + /* > + * If Xen intercepts the mask bit access, io_mem[3] may not be > + * up-to-date. Read from hardware directly. > + */ > + vec_ctrl = msix->phys_iomem_base + 16 * entry_nr + 12; > + > if ( offset != 3 ) > { > - const volatile uint32_t *vec_ctrl; > - > if ( entry->io_mem[offset] == val ) > return; > > - /* > - * If Xen intercepts the mask bit access, io_mem[3] may not be > - * up-to-date. Read from hardware directly. > - */ > - vec_ctrl = msix->phys_iomem_base + 16 * entry_nr + 12; > - > if ( msix->enabled && !(*vec_ctrl & 0x1) ) > { > PT_LOG("Can''t update entry %d since MSI-X is already enabled" > @@ -471,6 +472,14 @@ static void pci_msix_writel(void *opaque, target_phys_addr_t addr, uint32_t val) > > if ( offset == 3 ) > { > + > + /* > + * If guest unmasks the MSI-X interrupt, we also need to > + * notify Xen to update it. > + */ > + if ((val & 0x1) != (*vec_ctrl & 0x1)) > + entry->flags = 1; > + > if ( msix->enabled && !(val & 0x1) ) > pt_msix_update_one(dev, entry_nr); > } > -- > 1.7.1 > > Thanks, > Feng > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Jan Beulich
2013-Nov-11 13:18 UTC
Re: [PATCH] x86: properly handle MSI-X unmask operation from guests
>>> On 11.11.13 at 13:33, "Wu, Feng" <feng.wu@intel.com> wrote: > --- a/tools/libxc/xc_domain.c > +++ b/tools/libxc/xc_domain.c > @@ -1342,6 +1342,8 @@ int xc_domain_update_msi_irq( > uint32_t gvec, > uint32_t pirq, > uint32_t gflags, > + uint32_t vector_ctrl, > + int nr_entry, > uint64_t gtable) > {With there being out of tree consumers (like upstream qemu), you can''t just add two parameters here.> +int msixtbl_unmask(struct vcpu *v, unsigned long table_base, > + unsigned int nr_entry) > +{ > + unsigned long ctrl_address; > + > + ctrl_address = table_base + nr_entry * PCI_MSIX_ENTRY_SIZE + > + PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET; > + > + if (msixtbl_write(v, ctrl_address, 4, 0) != X86EMUL_OKAY )So once again you clear the mask bit with no consideration whatsoever as to the state Xen want the mask bit to be in. Did you not read through the history of all these MSI-X related changes? Otherwise you should have known that it is precisely this which we must not allow.> --- a/xen/drivers/passthrough/io.c > +++ b/xen/drivers/passthrough/io.c > @@ -193,6 +193,13 @@ int pt_irq_create_bind( > spin_unlock(&d->event_lock); > if ( dest_vcpu_id >= 0 ) > hvm_migrate_pirqs(d->vcpu[dest_vcpu_id]); > + > + if ((pt_irq_bind->u.msi.vector_ctrl & PCI_MSIX_VECTOR_BITMASK) == 0) { > + rc = msixtbl_unmask(d->vcpu[0], pt_irq_bind->u.msi.gtable, > + pt_irq_bind->u.msi.nr_entry); > + if (rc) > + return -EBUSY; > + }Without explanation in the commit message I don''t see why this adjustment is needed.> --- a/xen/include/public/domctl.h > +++ b/xen/include/public/domctl.h > @@ -503,6 +503,8 @@ struct xen_domctl_bind_pt_irq { > struct { > uint8_t gvec; > uint32_t gflags; > + uint32_t vector_ctrl; > + int nr_entry; > uint64_aligned_t gtable; > } msi; > } u;Andrew already told you: _If_ you need to change this, it has to be accompanied by an interface version change. But I this should be done entirely inside the hypervisor - after all it is the hypervisor that forwards the request to qemu, so upon completion of the request it could as well do the necessary unmasking (as long as it doesn''t need the interrupt masked for internal purposes). Jan
Xu, Dongxiao
2013-Nov-11 16:16 UTC
Re: [PATCH] x86: properly handle MSI-X unmask operation from guests
> -----Original Message----- > From: xen-devel-bounces@lists.xen.org > [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Jan Beulich > Sent: Monday, November 11, 2013 9:19 PM > To: Wu, Feng > Cc: xen-devel; Zhang, Xiantao > Subject: Re: [Xen-devel] [PATCH] x86: properly handle MSI-X unmask operation > from guests > > >>> On 11.11.13 at 13:33, "Wu, Feng" <feng.wu@intel.com> wrote: > > --- a/tools/libxc/xc_domain.c > > +++ b/tools/libxc/xc_domain.c > > @@ -1342,6 +1342,8 @@ int xc_domain_update_msi_irq( > > uint32_t gvec, > > uint32_t pirq, > > uint32_t gflags, > > + uint32_t vector_ctrl, > > + int nr_entry, > > uint64_t gtable) > > { > > With there being out of tree consumers (like upstream qemu), you > can''t just add two parameters here. > > > +int msixtbl_unmask(struct vcpu *v, unsigned long table_base, > > + unsigned int nr_entry) > > +{ > > + unsigned long ctrl_address; > > + > > + ctrl_address = table_base + nr_entry * PCI_MSIX_ENTRY_SIZE + > > + PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET; > > + > > + if (msixtbl_write(v, ctrl_address, 4, 0) != X86EMUL_OKAY ) > > So once again you clear the mask bit with no consideration > whatsoever as to the state Xen want the mask bit to be in. Did > you not read through the history of all these MSI-X related > changes? Otherwise you should have known that it is precisely > this which we must not allow. > > > --- a/xen/drivers/passthrough/io.c > > +++ b/xen/drivers/passthrough/io.c > > @@ -193,6 +193,13 @@ int pt_irq_create_bind( > > spin_unlock(&d->event_lock); > > if ( dest_vcpu_id >= 0 ) > > hvm_migrate_pirqs(d->vcpu[dest_vcpu_id]); > > + > > + if ((pt_irq_bind->u.msi.vector_ctrl & PCI_MSIX_VECTOR_BITMASK) > == 0) { > > + rc = msixtbl_unmask(d->vcpu[0], pt_irq_bind->u.msi.gtable, > > + pt_irq_bind->u.msi.nr_entry); > > + if (rc) > > + return -EBUSY; > > + } > > Without explanation in the commit message I don''t see why this > adjustment is needed. > > > --- a/xen/include/public/domctl.h > > +++ b/xen/include/public/domctl.h > > @@ -503,6 +503,8 @@ struct xen_domctl_bind_pt_irq { > > struct { > > uint8_t gvec; > > uint32_t gflags; > > + uint32_t vector_ctrl; > > + int nr_entry; > > uint64_aligned_t gtable; > > } msi; > > } u; > > Andrew already told you: _If_ you need to change this, it has to > be accompanied by an interface version change. > > But I this should be done entirely inside the hypervisor - after all > it is the hypervisor that forwards the request to qemu, so upon > completion of the request it could as well do the necessary > unmasking (as long as it doesn''t need the interrupt masked for > internal purposes).Currently, Feng''s patch handles the unmask operation when QEMU issues the MSI-X update hypercall (XEN_DOMCTL_bind_pt_irq). However, if add the unmask operation in Xen''s generic I/O emulation code path, we may need to introduce a new MSI-X specific callback function, which is supposed to be executed after QEMU I/O operation is done. Is this what you mean? Thanks, Dongxiao> > Jan > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel
Jan Beulich
2013-Nov-11 16:25 UTC
Re: [PATCH] x86: properly handle MSI-X unmask operation from guests
>>> On 11.11.13 at 17:16, "Xu, Dongxiao" <dongxiao.xu@intel.com> wrote: >> -----Original Message----- >> From: xen-devel-bounces@lists.xen.org >> [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Jan Beulich >> But I this should be done entirely inside the hypervisor - after all >> it is the hypervisor that forwards the request to qemu, so upon >> completion of the request it could as well do the necessary >> unmasking (as long as it doesn''t need the interrupt masked for >> internal purposes). > > Currently, Feng''s patch handles the unmask operation when QEMU issues the > MSI-X update hypercall (XEN_DOMCTL_bind_pt_irq). > However, if add the unmask operation in Xen''s generic I/O emulation code > path, we may need to introduce a new MSI-X specific callback function, which > is supposed to be executed after QEMU I/O operation is done. Is this what you > mean?Yes. Jan
Jan Beulich
2013-Nov-12 08:52 UTC
Re: [PATCH] x86: properly handle MSI-X unmask operation from guests
>>> On 11.11.13 at 17:25, "Jan Beulich" <JBeulich@suse.com> wrote: >>>> On 11.11.13 at 17:16, "Xu, Dongxiao" <dongxiao.xu@intel.com> wrote: >>> -----Original Message----- >>> From: xen-devel-bounces@lists.xen.org >>> [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Jan Beulich >>> But I this should be done entirely inside the hypervisor - after all >>> it is the hypervisor that forwards the request to qemu, so upon >>> completion of the request it could as well do the necessary >>> unmasking (as long as it doesn''t need the interrupt masked for >>> internal purposes). >> >> Currently, Feng''s patch handles the unmask operation when QEMU issues the >> MSI-X update hypercall (XEN_DOMCTL_bind_pt_irq). >> However, if add the unmask operation in Xen''s generic I/O emulation code >> path, we may need to introduce a new MSI-X specific callback function, which >> is supposed to be executed after QEMU I/O operation is done. Is this what > you >> mean? > > Yes.But of course that''s the more difficult to implement operation. If there''s a way to forward the necessary information from the write path to the path handling whatever qemu issues as operations to the hypervisor in response to the writes (_without_ altering the qemu interface), that might still be the easier route implementation wise. Jan
Wu, Feng
2013-Nov-12 08:55 UTC
Re: [PATCH] x86: properly handle MSI-X unmask operation from guests
> -----Original Message----- > From: Xu, Dongxiao > Sent: Tuesday, November 12, 2013 12:16 AM > To: Jan Beulich; Wu, Feng > Cc: xen-devel; Zhang, Xiantao > Subject: RE: [Xen-devel] [PATCH] x86: properly handle MSI-X unmask operation > from guests > > > -----Original Message----- > > From: xen-devel-bounces@lists.xen.org > > [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Jan Beulich > > Sent: Monday, November 11, 2013 9:19 PM > > To: Wu, Feng > > Cc: xen-devel; Zhang, Xiantao > > Subject: Re: [Xen-devel] [PATCH] x86: properly handle MSI-X unmask > operation > > from guests > > > > >>> On 11.11.13 at 13:33, "Wu, Feng" <feng.wu@intel.com> wrote: > > > --- a/tools/libxc/xc_domain.c > > > +++ b/tools/libxc/xc_domain.c > > > @@ -1342,6 +1342,8 @@ int xc_domain_update_msi_irq( > > > uint32_t gvec, > > > uint32_t pirq, > > > uint32_t gflags, > > > + uint32_t vector_ctrl, > > > + int nr_entry, > > > uint64_t gtable) > > > { > > > > With there being out of tree consumers (like upstream qemu), you > > can''t just add two parameters here.So I will add the needed changes in upstream qemu as what I did for qemu-xen and qemu-traditional> > > > > +int msixtbl_unmask(struct vcpu *v, unsigned long table_base, > > > + unsigned int nr_entry) > > > +{ > > > + unsigned long ctrl_address; > > > + > > > + ctrl_address = table_base + nr_entry * PCI_MSIX_ENTRY_SIZE + > > > + PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET; > > > + > > > + if (msixtbl_write(v, ctrl_address, 4, 0) != X86EMUL_OKAY ) > > > > So once again you clear the mask bit with no consideration > > whatsoever as to the state Xen want the mask bit to be in. Did > > you not read through the history of all these MSI-X related > > changes? Otherwise you should have known that it is precisely > > this which we must not allow.There already have been some handlings in misxtbl_write() as follows: /* * Do not allow guest to modify MSI-X control bit if it is masked * by Xen. We''ll only handle the case where Xen thinks that * bit is unmasked, but hardware has silently masked the bit * (in case of SR-IOV VF reset, etc). On the other hand, if Xen * thinks that the bit is masked, but it''s really not, * we log a warning. */ if ( msi_desc->msi_attrib.masked ) { if ( !(orig & PCI_MSIX_VECTOR_BITMASK) ) printk(XENLOG_WARNING "MSI-X control bit is unmasked when" " it is expected to be masked [%04x:%02x:%02x.%u]\n", entry->pdev->seg, entry->pdev->bus, PCI_SLOT(entry->pdev->devfn), PCI_FUNC(entry->pdev->devfn)); goto unlock; } So here the calling to misxtbl_write() to unmaks msi-x will follow the above check. I am not sure if this is what you concerned. Please correct me if my understanding is not what you expected.> > > > > --- a/xen/drivers/passthrough/io.c > > > +++ b/xen/drivers/passthrough/io.c > > > @@ -193,6 +193,13 @@ int pt_irq_create_bind( > > > spin_unlock(&d->event_lock); > > > if ( dest_vcpu_id >= 0 ) > > > hvm_migrate_pirqs(d->vcpu[dest_vcpu_id]); > > > + > > > + if ((pt_irq_bind->u.msi.vector_ctrl & > PCI_MSIX_VECTOR_BITMASK) > > == 0) { > > > + rc = msixtbl_unmask(d->vcpu[0], pt_irq_bind->u.msi.gtable, > > > + pt_irq_bind->u.msi.nr_entry); > > > + if (rc) > > > + return -EBUSY; > > > + } > > > > Without explanation in the commit message I don''t see why this > > adjustment is needed.It writes the mask bit to hardware directly by Xen instead of going to Qemu for performance consideration when guests try to mask msi-x, so we don''t need to do anything here if pt_irq_bind->u.msi.vector_ctrl & PCI_MSIX_VECTOR_BITMASK equals 1. However, if pt_irq_bind->u.msi.vector_ctrl & PCI_MSIX_VECTOR_BITMASK equals 0, it means that we just got a msi-x unmask operation from guests and we need to do it in real hardware.> > > > > --- a/xen/include/public/domctl.h > > > +++ b/xen/include/public/domctl.h > > > @@ -503,6 +503,8 @@ struct xen_domctl_bind_pt_irq { > > > struct { > > > uint8_t gvec; > > > uint32_t gflags; > > > + uint32_t vector_ctrl; > > > + int nr_entry; > > > uint64_aligned_t gtable; > > > } msi; > > > } u; > > > > Andrew already told you: _If_ you need to change this, it has to > > be accompanied by an interface version change. > > > > But I this should be done entirely inside the hypervisor - after all > > it is the hypervisor that forwards the request to qemu, so upon > > completion of the request it could as well do the necessary > > unmasking (as long as it doesn''t need the interrupt masked for > > internal purposes). > > Currently, Feng''s patch handles the unmask operation when QEMU issues the > MSI-X update hypercall (XEN_DOMCTL_bind_pt_irq). > However, if add the unmask operation in Xen''s generic I/O emulation code path, > we may need to introduce a new MSI-X specific callback function, which is > supposed to be executed after QEMU I/O operation is done. Is this what you > mean? > > Thanks, > DongxiaoJan, If you prefer handling this issue totally inside hypervisor, I will try to find a proper solution soon. In that case, we should not need to modify code of QEMU.> > > > > Jan > > > > > > _______________________________________________ > > Xen-devel mailing list > > Xen-devel@lists.xen.org > > http://lists.xen.org/xen-devel
Jan Beulich
2013-Nov-12 09:32 UTC
Re: [PATCH] x86: properly handle MSI-X unmask operation from guests
>>> On 12.11.13 at 09:55, "Wu, Feng" <feng.wu@intel.com> wrote: >> -----Original Message----- >> From: Xu, Dongxiao >> Sent: Tuesday, November 12, 2013 12:16 AM >> To: Jan Beulich; Wu, Feng >> Cc: xen-devel; Zhang, Xiantao >> Subject: RE: [Xen-devel] [PATCH] x86: properly handle MSI-X unmask operation >> from guests >> >> > -----Original Message----- >> > From: xen-devel-bounces@lists.xen.org >> > [mailto:xen-devel-bounces@lists.xen.org] On Behalf Of Jan Beulich >> > Sent: Monday, November 11, 2013 9:19 PM >> > To: Wu, Feng >> > Cc: xen-devel; Zhang, Xiantao >> > Subject: Re: [Xen-devel] [PATCH] x86: properly handle MSI-X unmask >> operation >> > from guests >> > >> > >>> On 11.11.13 at 13:33, "Wu, Feng" <feng.wu@intel.com> wrote: >> > > --- a/tools/libxc/xc_domain.c >> > > +++ b/tools/libxc/xc_domain.c >> > > @@ -1342,6 +1342,8 @@ int xc_domain_update_msi_irq( >> > > uint32_t gvec, >> > > uint32_t pirq, >> > > uint32_t gflags, >> > > + uint32_t vector_ctrl, >> > > + int nr_entry, >> > > uint64_t gtable) >> > > { >> > >> > With there being out of tree consumers (like upstream qemu), you >> > can''t just add two parameters here. > > So I will add the needed changes in upstream qemu as what I did for qemu-xen > and qemu-traditionalThat wouldn''t help - you''ll be unable to modify what''s already released. Changes like this _must_ be done such that existing consumers remain unaffected. But anyway - the primary goal here ought to be to find a solution not needing changes to the interfaces in the first place.>> > > +int msixtbl_unmask(struct vcpu *v, unsigned long table_base, >> > > + unsigned int nr_entry) >> > > +{ >> > > + unsigned long ctrl_address; >> > > + >> > > + ctrl_address = table_base + nr_entry * PCI_MSIX_ENTRY_SIZE + >> > > + PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET; >> > > + >> > > + if (msixtbl_write(v, ctrl_address, 4, 0) != X86EMUL_OKAY ) >> > >> > So once again you clear the mask bit with no consideration >> > whatsoever as to the state Xen want the mask bit to be in. Did >> > you not read through the history of all these MSI-X related >> > changes? Otherwise you should have known that it is precisely >> > this which we must not allow. > > There already have been some handlings in misxtbl_write() as follows: > > /* > * Do not allow guest to modify MSI-X control bit if it is masked > * by Xen. We''ll only handle the case where Xen thinks that > * bit is unmasked, but hardware has silently masked the bit > * (in case of SR-IOV VF reset, etc). On the other hand, if Xen > * thinks that the bit is masked, but it''s really not, > * we log a warning. > */ > if ( msi_desc->msi_attrib.masked ) > { > if ( !(orig & PCI_MSIX_VECTOR_BITMASK) ) > printk(XENLOG_WARNING "MSI-X control bit is unmasked when" > " it is expected to be masked [%04x:%02x:%02x.%u]\n", > entry->pdev->seg, entry->pdev->bus, > PCI_SLOT(entry->pdev->devfn), > PCI_FUNC(entry->pdev->devfn)); > > goto unlock; > } > > So here the calling to misxtbl_write() to unmaks msi-x will follow the above > check. > I am not sure if this is what you concerned. Please correct me if my > understanding > is not what you expected.You quote but continue to appear to neglect the condition: The unmasking is _not_ being done when the hypervisor wants the interrupt masked.> Jan, If you prefer handling this issue totally inside hypervisor, I will try > to find a proper solution soon. > In that case, we should not need to modify code of QEMU.As said - yes, please, if at all possible. Jan