George Dunlap
2011-Jul-26 16:33 UTC
[Xen-devel] [PATCH 0 of 3] Avoid sharing vectors within a device when using an AMD IOMMU
The interrupt remapping tables on AMD IOMMUs index by vector only.
This means that if two MSIs go through the table that are destined for
different cpus, but share the same vector, they will be redirected to
the same place.  (E.g., one interrupt on p5 vector 67, another
interrupt on p7 vector 67; both will be redirected to the same place.)

Introducing per-device interrupt mappings reduces the problem, but
does not solve it completely if the same device can have multiple IRQs
assigned to it, because you can get the same issue -- two different
IRQs from the same device can be assigned the same vector on different
cpus.  This causes one of the IRQs to be activated when either
interrupt is triggered, and the other IRQ to never receive any
interrupts.

This series consists of three patches:

1: Introduce infrastructure to allow irqs to share vector maps.  Any
IRQs sharing the same vector map will never have vector collisions.

2: Introduce an option to have per-device vector maps for MSI IRQs.

3: Automatically enable per-device vector maps when running on an AMD
system with the IOMMU enabled, unless otherwise specified.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
George Dunlap
2011-Jul-26 16:33 UTC
[Xen-devel] [PATCH 1 of 3] xen: Infrastructure to allow irqs to share vector maps
Laying the groundwork for per-device vector maps.  This generic code
allows any irq to point to a vector map; all irqs sharing the same
vector map will avoid sharing vectors.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>

diff -r e8d1c8f074ba -r 4c8f3cae5007 xen/arch/x86/io_apic.c
--- a/xen/arch/x86/io_apic.c	Mon Jul 25 16:43:26 2011 +0100
+++ b/xen/arch/x86/io_apic.c	Tue Jul 26 17:21:23 2011 +0100
@@ -449,6 +449,11 @@ fastcall void smp_irq_move_cleanup_inter
                 irq, vector, smp_processor_id());
         __get_cpu_var(vector_irq)[vector] = -1;
+        if ( cfg->used_vectors )
+        {
+            ASSERT(test_bit(vector, cfg->used_vectors));
+            clear_bit(vector, cfg->used_vectors);
+        }
         cfg->move_cleanup_count--;
 unlock:
         spin_unlock(&desc->lock);
diff -r e8d1c8f074ba -r 4c8f3cae5007 xen/arch/x86/irq.c
--- a/xen/arch/x86/irq.c	Mon Jul 25 16:43:26 2011 +0100
+++ b/xen/arch/x86/irq.c	Tue Jul 26 17:21:23 2011 +0100
@@ -108,6 +108,8 @@ static int __init __bind_irq_vector(int
         per_cpu(vector_irq, cpu)[vector] = irq;
     cfg->vector = vector;
     cfg->cpu_mask = online_mask;
+    if ( cfg->used_vectors )
+        set_bit(vector, cfg->used_vectors);
     irq_status[irq] = IRQ_USED;
     if (IO_APIC_IRQ(irq))
         irq_vector[irq] = vector;
@@ -172,6 +174,7 @@ static void dynamic_irq_cleanup(unsigned
     desc->depth = 1;
     desc->msi_desc = NULL;
     desc->handler = &no_irq_type;
+    desc->chip_data->used_vectors=NULL;
     cpus_setall(desc->affinity);
     spin_unlock_irqrestore(&desc->lock, flags);
@@ -200,6 +203,9 @@ static void __clear_irq_vector(int irq)
     for_each_cpu_mask(cpu, tmp_mask)
         per_cpu(vector_irq, cpu)[vector] = -1;
+    if ( cfg->used_vectors )
+        clear_bit(vector, cfg->used_vectors);
+
     cfg->vector = IRQ_VECTOR_UNASSIGNED;
     cpus_clear(cfg->cpu_mask);
     init_one_irq_status(irq);
@@ -277,6 +283,7 @@ static void __init init_one_irq_cfg(stru
     cfg->vector = IRQ_VECTOR_UNASSIGNED;
     cpus_clear(cfg->cpu_mask);
     cpus_clear(cfg->old_cpu_mask);
+    cfg->used_vectors = NULL;
 }
 int __init init_irq_data(void)
@@ -402,6 +409,10 @@ next:
         if (test_bit(vector, used_vectors))
             goto next;
+        if (cfg->used_vectors
+            && test_bit(vector, cfg->used_vectors) )
+            goto next;
+
         for_each_cpu_mask(new_cpu, tmp_mask)
             if (per_cpu(vector_irq, new_cpu)[vector] != -1)
                 goto next;
@@ -417,6 +428,11 @@ next:
             per_cpu(vector_irq, new_cpu)[vector] = irq;
         cfg->vector = vector;
         cpus_copy(cfg->cpu_mask, tmp_mask);
+        if ( cfg->used_vectors )
+        {
+            ASSERT(!test_bit(vector, cfg->used_vectors));
+            set_bit(vector, cfg->used_vectors);
+        }
         irq_status[irq] = IRQ_USED;
         if (IO_APIC_IRQ(irq))
diff -r e8d1c8f074ba -r 4c8f3cae5007 xen/include/asm-x86/irq.h
--- a/xen/include/asm-x86/irq.h	Mon Jul 25 16:43:26 2011 +0100
+++ b/xen/include/asm-x86/irq.h	Tue Jul 26 17:21:23 2011 +0100
@@ -24,11 +24,16 @@
 #define irq_to_desc(irq)    (&irq_desc[irq])
 #define irq_cfg(irq)        (&irq_cfg[irq])
+typedef struct {
+    DECLARE_BITMAP(_bits,NR_VECTORS);
+} vmask_t;
+
 struct irq_cfg {
         int vector;
         cpumask_t cpu_mask;
         cpumask_t old_cpu_mask;
         unsigned move_cleanup_count;
+        vmask_t *used_vectors;
         u8 move_in_progress : 1;
 };
George Dunlap
2011-Jul-26 16:33 UTC
[Xen-devel] [PATCH 2 of 3] xen: Option to allow per-device vector maps for MSI IRQs
Add a vector-map to pci_dev, and add an option to point MSI-related
IRQs to the vector-map of the device.  This prevents irqs from the
same device from being assigned the same vector on different pcpus.
This is required for systems using an AMD IOMMU, since the intremap
tables on AMD only look at vector, and not destination ID.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>

diff -r 4c8f3cae5007 -r 362a881a79a4 xen/arch/x86/irq.c
--- a/xen/arch/x86/irq.c	Tue Jul 26 17:21:23 2011 +0100
+++ b/xen/arch/x86/irq.c	Tue Jul 26 17:21:23 2011 +0100
@@ -32,6 +32,9 @@
 unsigned int __read_mostly nr_irqs_gsi;
 unsigned int __read_mostly nr_irqs;
 integer_param("nr_irqs", nr_irqs);
+bool_t __read_mostly opt_irq_perdev_vector_map = 0;
+boolean_param("irq-perdev-vector-map", opt_irq_perdev_vector_map);
+
 u8 __read_mostly *irq_vector;
 struct irq_desc __read_mostly *irq_desc = NULL;
@@ -1654,6 +1657,9 @@ int map_domain_pirq(
                 dprintk(XENLOG_G_ERR, "dom%d: irq %d in use\n",
                         d->domain_id, irq);
             desc->handler = &pci_msi_type;
+            if ( opt_irq_perdev_vector_map
+                 && !desc->chip_data->used_vectors )
+                desc->chip_data->used_vectors = &pdev->info.used_vectors;
             set_domain_irq_pirq(d, irq, info);
             setup_msi_irq(pdev, msi_desc, irq);
             spin_unlock_irqrestore(&desc->lock, flags);
diff -r 4c8f3cae5007 -r 362a881a79a4 xen/include/xen/pci.h
--- a/xen/include/xen/pci.h	Tue Jul 26 17:21:23 2011 +0100
+++ b/xen/include/xen/pci.h	Tue Jul 26 17:21:23 2011 +0100
@@ -11,6 +11,7 @@
 #include <xen/types.h>
 #include <xen/list.h>
 #include <xen/spinlock.h>
+#include <xen/irq.h>
 /*
  * The PCI interface treats multi-function devices as independent
@@ -38,6 +39,7 @@ struct pci_dev_info {
         u8 bus;
         u8 devfn;
     } physfn;
+    vmask_t used_vectors;
 };
 struct pci_dev {
George Dunlap
2011-Jul-26 16:33 UTC
[Xen-devel] [PATCH 3 of 3] xen: AMD IOMMU: Automatically enable per-device vector maps
Automatically enable per-device vector maps when using IOMMU, unless
disabled specifically by an IOMMU parameter.

Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>

diff -r 362a881a79a4 -r fffeeea7afbb xen/arch/x86/irq.c
--- a/xen/arch/x86/irq.c	Tue Jul 26 17:21:23 2011 +0100
+++ b/xen/arch/x86/irq.c	Tue Jul 26 17:21:23 2011 +0100
@@ -32,6 +32,7 @@
 unsigned int __read_mostly nr_irqs_gsi;
 unsigned int __read_mostly nr_irqs;
 integer_param("nr_irqs", nr_irqs);
+/* This default may be changed by the AMD IOMMU code */
 bool_t __read_mostly opt_irq_perdev_vector_map = 0;
 boolean_param("irq-perdev-vector-map", opt_irq_perdev_vector_map);
diff -r 362a881a79a4 -r fffeeea7afbb xen/drivers/passthrough/amd/pci_amd_iommu.c
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c	Tue Jul 26 17:21:23 2011 +0100
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c	Tue Jul 26 17:21:23 2011 +0100
@@ -25,6 +25,9 @@
 #include <asm/amd-iommu.h>
 #include <asm/hvm/svm/amd-iommu-proto.h>
+extern bool_t __read_mostly opt_irq_perdev_vector_map;
+extern bool_t __read_mostly iommu_amd_perdev_vector_map;
+
 struct amd_iommu *find_iommu_for_device(int bdf)
 {
     BUG_ON ( bdf >= ivrs_bdf_entries );
@@ -148,6 +151,18 @@ int __init amd_iov_detect(void)
         return -ENODEV;
     }
+    /* Enable use of per-device vector map unless otherwise
+     * specified */
+    if ( iommu_amd_perdev_vector_map )
+    {
+        printk("AMD-Vi: Enabling per-device vector maps\n");
+        opt_irq_perdev_vector_map=1;
+    }
+    else
+    {
+        printk("AMD-Vi: WARNING - not enabling per-device vector maps\n");
+    }
+
     return scan_pci_devices();
 }
diff -r 362a881a79a4 -r fffeeea7afbb xen/drivers/passthrough/iommu.c
--- a/xen/drivers/passthrough/iommu.c	Tue Jul 26 17:21:23 2011 +0100
+++ b/xen/drivers/passthrough/iommu.c	Tue Jul 26 17:21:23 2011 +0100
@@ -49,6 +49,7 @@
 bool_t __read_mostly iommu_qinval = 1;
 bool_t __read_mostly iommu_intremap = 1;
 bool_t __read_mostly iommu_hap_pt_share;
 bool_t __read_mostly iommu_debug;
+bool_t __read_mostly iommu_amd_perdev_vector_map = 1;
 static void __init parse_iommu_param(char *s)
 {
@@ -81,6 +82,8 @@ static void __init parse_iommu_param(cha
             iommu_dom0_strict = 1;
         else if ( !strcmp(s, "sharept") )
             iommu_hap_pt_share = 1;
+        else if ( !strcmp(s, "no-perdev-vector-map") )
+            iommu_amd_perdev_vector_map = 0;
         s = ss + 1;
     } while ( ss );
Keir Fraser
2011-Jul-26 17:01 UTC
Re: [Xen-devel] [PATCH 0 of 3] Avoid sharing vectors within a device when using an AMD IOMMU
On 26/07/2011 17:33, "George Dunlap" <george.dunlap@eu.citrix.com> wrote:

> The interrupt remapping tables on AMD IOMMUs index by vector only.
> This means that if two MSIs go through the table that are destined for
> different cpus, but they share the same vector, they will be
> redirected to the same place. (E.g., one interrupt on p5 vector 67,
> another interrupt on p7 vector 67; both will be redirected to the same
> place.)
>
> Introducing per-device interrupt mappings reduces the problem, but
> does not solve it completely if the same device can have multiple IRQs
> assigned to it, because you can get the same issue -- two different
> IRQs from the same device can be assigned the same vector on different
> cpus. This causes one of the IRQs to be activated when either interrupt
> is triggered, and the other IRQ to never receive any interrupts.

The patches look fine, but I don't see a reason to add yet more
command-line options.  AMD systems using irq remapping will need to
avoid vector sharing; other systems do not need to do so.  We detect
that automatically and DTRT, and there's no reason for a user to
override that.  It's just extending the set of arcane IOMMU command
line settings that no one will understand the implications of
specifying.

 -- Keir

> This series consists of three patches:
>
> 1: Introduce infrastructure to allow irqs to share vector maps. Any
> IRQs sharing the same vector map will never have vector collisions.
>
> 2: Introduce option to have per-device vector maps for MSI IRQs.
>
> 3: Automatically enable per-device vector maps when running on an AMD
> system with the IOMMU enabled, unless otherwise specified.
George Dunlap
2011-Jul-26 17:17 UTC
Re: [Xen-devel] [PATCH 0 of 3] Avoid sharing vectors within a device when using an AMD IOMMU
On Tue, 2011-07-26 at 18:01 +0100, Keir Fraser wrote:
> On 26/07/2011 17:33, "George Dunlap" <george.dunlap@eu.citrix.com> wrote:
>
> > The interrupt remapping tables on AMD IOMMUs index by vector only.
> > This means that if two MSIs go through the table that are destined for
> > different cpus, but they share the same vector, they will be
> > redirected to the same place. (E.g., one interrupt on p5 vector 67,
> > another interrupt on p7 vector 67; both will be redirected to the same
> > place.)
> >
> > Introducing per-device interrupt mappings reduces the problem, but
> > does not solve it completely if the same device can have multiple IRQs
> > assigned to it, because you can get the same issue -- two different
> > IRQs from the same device can be assigned the same vector on different
> > cpus. This causes one of the IRQs to be activated when either interrupt
> > is triggered, and the other IRQ to never receive any interrupts.
>
> The patches look fine, but I don't see a reason to add yet more command-line
> options. AMD systems using irq remapping will need to avoid vector sharing;
> other systems do not need to do so. We detect that automatically and DTRT
> and there's no reason for a user to override that. It's just extending the
> set of arcane IOMMU command line settings that no one will understand the
> implications of specifying.

I was mainly thinking of the support issue.  We don't do extensive
testing of AMD passthrough here at Citrix (or we probably would have
tripped over this problem sooner).  If there are subtle, unexpected
side-effects of this patch, being able to tell people, "Specify
iommu=no-perdev-vector-map" is a lot easier than telling them to
install a custom build with the per-device stuff off.  (And it's
something that power users can share with other users without needing
to be a developer.)

I'm not sure how common it is to have devices register more than one
interrupt.  If it's just specialist multi-queue cards, then it may be
the case that having per-device intremap tables will suffice in many
cases.  If nearly all devices have multiple interrupts, then you're
right, the option to disable it on AMD systems with the IOMMU enabled
is pretty pointless.

 -George