Joby Poriyath
2013-Jul-18 17:44 UTC
[PATCH] interrupts: allow guest to set and clear MSI-X mask bit
Guest needs the ability to enable and disable MSI-X interrupts
by setting the MSI-X control bit. Currently, a write to MSI-X
mask bit by the guest is silently ignored.

A likely scenario is where we have an 82599 SR-IOV NIC passed
through to a guest. From the guest if you do

  ifconfig <ETH_DEV> down
  ifconfig <ETH_DEV> up

the interrupts remain masked. The mask bit for the VF is
being set by the PF performing a reset (at the request of the VF).
However, interrupts are enabled by VF driver by clearing the mask
bit by writing directly to BAR3 region containing the MSI-X table.

From dom0, we can verify that
interrupts are being masked using 'xl debug-keys M'.

Initially, guest was allowed to modify MSI-X bit.
Later this behaviour was changed.
See changeset 74c213c506afcd74a8556dd092995fd4dc38b225.

Signed-off-by: Joby Poriyath <joby.poriyath@citrix.com>
---
 xen/arch/x86/hvm/vmsi.c |   26 ++++++++++++--------------
 1 file changed, 12 insertions(+), 14 deletions(-)

diff --git a/xen/arch/x86/hvm/vmsi.c b/xen/arch/x86/hvm/vmsi.c
index 36de312..6d9892a 100644
--- a/xen/arch/x86/hvm/vmsi.c
+++ b/xen/arch/x86/hvm/vmsi.c
@@ -169,6 +169,7 @@ struct msixtbl_entry
         uint32_t msi_ad[3]; /* Shadow of address low, high and data */
     } gentries[MAX_MSIX_ACC_ENTRIES];
     struct rcu_head rcu;
+    struct pirq *pirq;
 };

 static DEFINE_RCU_READ_LOCK(msixtbl_rcu_lock);
@@ -254,6 +255,8 @@ static int msixtbl_write(struct vcpu *v, unsigned long address,
     void *virt;
     unsigned int nr_entry, index;
     int r = X86EMUL_UNHANDLEABLE;
+    unsigned long flags;
+    struct irq_desc *desc;

     if ( len != 4 || (address & 3) )
         return r;
@@ -283,20 +286,13 @@ static int msixtbl_write(struct vcpu *v, unsigned long address,
     if ( !virt )
         goto out;

-    /* Do not allow the mask bit to be changed. */
-#if 0 /* XXX
-       * As the mask bit is the only defined bit in the word, and as the
-       * host MSI-X code doesn't preserve the other bits anyway, doing
-       * this is pointless. So for now just discard the write (also
-       * saving us from having to determine the matching irq_desc).
-       */
-    spin_lock_irqsave(&desc->lock, flags);
-    orig = readl(virt);
-    val &= ~PCI_MSIX_VECTOR_BITMASK;
-    val |= orig & PCI_MSIX_VECTOR_BITMASK;
+    desc = pirq_spin_lock_irq_desc(entry->pirq, &flags);
+    if ( !desc )
+        goto out;
+
+    val &= PCI_MSIX_VECTOR_BITMASK;
     writel(val, virt);
     spin_unlock_irqrestore(&desc->lock, flags);
-#endif

     r = X86EMUL_OKAY;
 out:
@@ -328,7 +324,8 @@ const struct hvm_mmio_handler msixtbl_mmio_handler = {
 static void add_msixtbl_entry(struct domain *d,
                               struct pci_dev *pdev,
                               uint64_t gtable,
-                              struct msixtbl_entry *entry)
+                              struct msixtbl_entry *entry,
+                              struct pirq *pirq)
 {
     u32 len;

@@ -342,6 +339,7 @@ static void add_msixtbl_entry(struct domain *d,
     entry->table_len = len;
     entry->pdev = pdev;
     entry->gtable = (unsigned long) gtable;
+    entry->pirq = pirq;

     list_add_rcu(&entry->list, &d->arch.hvm_domain.msixtbl_list);
 }
@@ -404,7 +402,7 @@ int msixtbl_pt_register(struct domain *d, struct pirq *pirq, uint64_t gtable)

     entry = new_entry;
     new_entry = NULL;
-    add_msixtbl_entry(d, pdev, gtable, entry);
+    add_msixtbl_entry(d, pdev, gtable, entry, pirq);

 found:
     atomic_inc(&entry->refcnt);
-- 
1.7.10.4
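The handler touched by this patch only accepts aligned 4-byte writes, and the mask bit discussed here sits in the last dword of a table entry. As a rough sketch of that layout, assuming the standard PCI MSI-X table format (16-byte entries, vector control word at byte offset 12), a write could be classified as below. The macro and function names are illustrative only and are not taken from Xen.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative constants (hypothetical names); the values follow the
 * standard PCI MSI-X table entry layout. */
#define MSIX_ENTRY_SIZE       16u /* addr low, addr high, data, vector ctrl */
#define MSIX_VECTOR_CTRL_OFF  12u /* byte offset of the vector control word */

/* Index of the table entry that a byte offset into the table falls in. */
static unsigned int msix_entry_index(uint64_t table_offset)
{
    return (unsigned int)(table_offset / MSIX_ENTRY_SIZE);
}

/* True if an access of `len` bytes at `table_offset` is an aligned
 * 4-byte access to some entry's vector control word. */
static bool msix_hits_vector_ctrl(uint64_t table_offset, unsigned int len)
{
    if ( len != 4 || (table_offset & 3) )
        return false;
    return (table_offset % MSIX_ENTRY_SIZE) == MSIX_VECTOR_CTRL_OFF;
}

/* Small self-check of the helpers above. */
static void msix_layout_self_check(void)
{
    assert(msix_hits_vector_ctrl(12, 4));       /* entry 0, vector ctrl */
    assert(!msix_hits_vector_ctrl(8, 4));       /* entry 0, data word */
    assert(msix_entry_index(2 * 16 + 12) == 2); /* entry 2, vector ctrl */
}
```

A guest driver that unmasks a vector by writing to the table in its BAR would, under these assumptions, produce exactly the kind of access for which msix_hits_vector_ctrl() returns true.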
Ian Campbell
2013-Jul-19 08:58 UTC
Re: [PATCH] interrupts: allow guest to set and clear MSI-X mask bit
On Thu, 2013-07-18 at 18:44 +0100, Joby Poriyath wrote:
> Guest needs the ability to enable and disable MSI-X interrupts
> by setting the MSI-X control bit. Currently, a write to MSI-X
> mask bit by the guest is silently ignored.
>
> A likely scenario is where we have an 82599 SR-IOV NIC passed
> through to a guest. From the guest if you do
>
>   ifconfig <ETH_DEV> down
>   ifconfig <ETH_DEV> up
>
> the interrupts remain masked. The mask bit for the VF is
> being set by the PF performing a reset (at the request of the VF).
> However, interrupts are enabled by VF driver by clearing the mask
> bit by writing directly to BAR3 region containing the MSI-X table.
>
> From dom0, we can verify that
> interrupts are being masked using 'xl debug-keys M'.
>
> Initially, guest was allowed to modify MSI-X bit.
> Later this behaviour was changed.
> See changeset 74c213c506afcd74a8556dd092995fd4dc38b225.

That commit message says:
        - the interrupt mask bit was permitted to be written by the guest
          (while Xen's interrupt flow control routines need to control it)
I guess it's not entirely clear that simply reversing this is
sufficient. The above doesn't give much to go on but I would have
naïvely thought that any change to allow the guest to control this bit
would be accompanied by some sort of call to Xen's interrupt flow
control routines.

> -       * As the mask bit is the only defined bit in the word, and as the
> -       * host MSI-X code doesn't preserve the other bits anyway, doing
> -       * this is pointless. So for now just discard the write (also
> -       * saving us from having to determine the matching irq_desc).
> -       */
> -    spin_lock_irqsave(&desc->lock, flags);
> -    orig = readl(virt);
> -    val &= ~PCI_MSIX_VECTOR_BITMASK;
> -    val |= orig & PCI_MSIX_VECTOR_BITMASK;
> +    desc = pirq_spin_lock_irq_desc(entry->pirq, &flags);
> +    if ( !desc )
> +        goto out;
> +
> +    val &= PCI_MSIX_VECTOR_BITMASK;

I think you need to at least retain the bits of the comment which
explain why we don't preserve the other bits, or actually preserve them.

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
Joby Poriyath
2013-Jul-19 13:05 UTC
Re: [PATCH] interrupts: allow guest to set and clear MSI-X mask bit
Thanks Ian.

On Fri, Jul 19, 2013 at 09:58:48AM +0100, Ian Campbell wrote:
> On Thu, 2013-07-18 at 18:44 +0100, Joby Poriyath wrote:
> > Guest needs the ability to enable and disable MSI-X interrupts
> > by setting the MSI-X control bit. Currently, a write to MSI-X
> > mask bit by the guest is silently ignored.
> >
> > A likely scenario is where we have an 82599 SR-IOV NIC passed
> > through to a guest. From the guest if you do
> >
> >   ifconfig <ETH_DEV> down
> >   ifconfig <ETH_DEV> up
> >
> > the interrupts remain masked. The mask bit for the VF is
> > being set by the PF performing a reset (at the request of the VF).
> > However, interrupts are enabled by VF driver by clearing the mask
> > bit by writing directly to BAR3 region containing the MSI-X table.
> >
> > From dom0, we can verify that
> > interrupts are being masked using 'xl debug-keys M'.
> >
> > Initially, guest was allowed to modify MSI-X bit.
> > Later this behaviour was changed.
> > See changeset 74c213c506afcd74a8556dd092995fd4dc38b225.
>
> That commit message says:
>         - the interrupt mask bit was permitted to be written by the guest
>           (while Xen's interrupt flow control routines need to control it)
> I guess it's not entirely clear that simply reversing this is
> sufficient. The above doesn't give much to go on but I would have
> naïvely thought that any change to allow the guest to control this bit
> would be accompanied by some sort of call to Xen's interrupt flow
> control routines.
>

I would have thought so. Maybe there is an interrupt flow control
routine that I ought to have called, but looking through the code it
wasn't obvious.

Writes to the mask bit were originally allowed in order to reduce the
load on qemu-dm when the guest updates the mask bit frequently. See
changeset 34097f0d30802ecdc6da79658090fab9479a0c1c, which introduced
this.

However, I wasn't sure why this was disabled. Perhaps Jan could comment
on this.

> > -       * As the mask bit is the only defined bit in the word, and as the
> > -       * host MSI-X code doesn't preserve the other bits anyway, doing
> > -       * this is pointless. So for now just discard the write (also
> > -       * saving us from having to determine the matching irq_desc).
> > -       */
> > -    spin_lock_irqsave(&desc->lock, flags);
> > -    orig = readl(virt);
> > -    val &= ~PCI_MSIX_VECTOR_BITMASK;
> > -    val |= orig & PCI_MSIX_VECTOR_BITMASK;
> > +    desc = pirq_spin_lock_irq_desc(entry->pirq, &flags);
> > +    if ( !desc )
> > +        goto out;
> > +
> > +    val &= PCI_MSIX_VECTOR_BITMASK;
>
> I think you need to at least retain the bits of the comment which
> explain why we don't preserve the other bits, or actually preserve them.

I should have preserved the comments. Moreover, I should only change
the 0th bit. The remaining 31 bits are reserved. The data sheet for the
82599 clearly says that the reserved bits should be preserved (as a
general rule).

I'll send the updated patch for more comments.

>
> Ian.
>

Thanks,
Joby
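The point about changing only bit 0 and preserving the reserved bits amounts to a read-modify-write of the vector control word. The following is a minimal sketch of that idea only, not the updated patch mentioned above. It assumes the mask is bit 0 of a 32-bit word, uses hypothetical names, and replaces readl()/writel() and the irq_desc locking with a plain volatile access.

```c
#include <assert.h>
#include <stdint.h>

/* Bit 0 of the vector control word; the macro name is illustrative. */
#define VECTOR_CTRL_MASKBIT 0x1u

/* Combine the guest's requested value with the current contents:
 * take bit 0 from the guest, keep bits 31:1 from the existing value. */
static uint32_t merge_vector_ctrl(uint32_t orig, uint32_t guest_val)
{
    return (orig & ~VECTOR_CTRL_MASKBIT) | (guest_val & VECTOR_CTRL_MASKBIT);
}

/* Stand-in for a readl()/writel() pair on a mapped table entry.
 * A real handler would hold the relevant lock around this update. */
static void update_vector_ctrl(volatile uint32_t *vector_ctrl,
                               uint32_t guest_val)
{
    assert(vector_ctrl);
    *vector_ctrl = merge_vector_ctrl(*vector_ctrl, guest_val);
}
```

With this shape, a guest write of 0 clears only the mask bit and a guest write of all ones sets only the mask bit; whatever the device holds in the upper 31 bits is written back unchanged.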