Wei Wang2
2011-Oct-27 14:07 UTC
[Xen-devel] Backport per-device vector map patches to xen 4.1.3
Recently we found an issue in xen 4.1. Under heavy I/O stress such as running
bonnie++, Dom0 would lose its hard disk with lots of I/O errors. We found
that some PCI-E devices were using the same vector as the SMBus on AMD
platforms, and George's patch set that enables per-device vector maps can fix
this problem.

The following patches have been back-ported and tested by Jacob and Wei H.
Please apply them to Xen 4.1.3.

Thanks,
Wei

23752 xen: Infrastructure to allow irqs to share vector maps
23753 xen: Option to allow per-device vector maps for MSI IRQs
23754 xen: AMD IOMMU: Automatically enable per-device vector maps
23786 x86: Fix up irq vector map logic
23812 xen: Add global irq_vector_map option
23899 AMD-IOMMU: remove dead variable references

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Nov-17 16:02 UTC
Re: [Xen-devel] Backport per-device vector map patches to xen 4.1.3
On Thu, Oct 27, 2011 at 04:07:51PM +0200, Wei Wang2 wrote:
> Recently we found an issue in xen 4.1. Under heavy I/O stress such as running
> bonnie++, Dom0 would lose its hard disk with lots of I/O errors. We found
> that some PCI-E devices were using the same vector as the SMBus on AMD
> platforms, and George's patch set that enables per-device vector maps can
> fix this problem.
>
> Following patches have been back ported and tested by Jacob and Wei H.
> Please apply them to Xen 4.1.3

Can they be applied please? I've been running these without hiccups since,
well, they were posted. On both AMD and Intel boxes.

So, Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

>
> Thanks,
> Wei
>
> 23752 xen: Infrastructure to allow irqs to share vector maps
> 23753 xen: Option to allow per-device vector maps for MSI IRQs
> 23754 xen: AMD IOMMU: Automatically enable per-device vector maps
> 23786 x86: Fix up irq vector map logic
> 23812 xen: Add global irq_vector_map option
> 23899 AMD-IOMMU: remove dead variable references

> # HG changeset patch
> # User George Dunlap <george.dunlap@eu.citrix.com>
> # Date 1311701852 -3600
> # Node ID fa4e2ca9ecffbc432b451f495ad0a403644a6be8
> # Parent 2e0cf9428554da666616982cd0074024ff85b221
> xen: AMD IOMMU: Automatically enable per-device vector maps
>
> Automatically enable per-device vector maps when using IOMMU,
> unless disabled specifically by an IOMMU parameter.
>
> Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
>
> diff -r e017a9a2f27c xen/arch/x86/irq.c
> --- a/xen/arch/x86/irq.c	Tue Jul 26 18:37:16 2011 +0100
> +++ b/xen/arch/x86/irq.c	Fri Oct 21 15:15:13 2011 -0400
> @@ -32,6 +32,7 @@
>  unsigned int __read_mostly nr_irqs;
>  integer_param("nr_irqs", nr_irqs);
>  
> +/* This default may be changed by the AMD IOMMU code */
>  bool_t __read_mostly opt_irq_perdev_vector_map = 0;
>  boolean_param("irq-perdev-vector-map", opt_irq_perdev_vector_map);
>  
> diff -r e017a9a2f27c xen/drivers/passthrough/amd/pci_amd_iommu.c
> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c	Tue Jul 26 18:37:16 2011 +0100
> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c	Fri Oct 21 15:15:13 2011 -0400
> @@ -24,6 +24,9 @@
>  #include <asm/hvm/iommu.h>
>  #include <asm/amd-iommu.h>
>  #include <asm/hvm/svm/amd-iommu-proto.h>
> +
> +extern bool_t __read_mostly opt_irq_perdev_vector_map;
> +extern bool_t __read_mostly iommu_amd_perdev_vector_map;
>  
>  extern unsigned short ivrs_bdf_entries;
>  extern struct ivrs_mappings *ivrs_mappings;
> @@ -164,6 +167,18 @@
>      {
>          printk("AMD-Vi: Error initialization\n");
>          return -ENODEV;
> +    }
> +
> +    /* Enable use of per-device vector map unless otherwise
> +     * specified */
> +    if ( iommu_amd_perdev_vector_map )
> +    {
> +        printk("AMD-Vi: Enabling per-device vector maps\n");
> +        opt_irq_perdev_vector_map=1;
> +    }
> +    else
> +    {
> +        printk("AMD-Vi: WARNING - not enabling per-device vector maps\n");
>      }
>  
>      return scan_pci_devices();
> diff -r e017a9a2f27c xen/drivers/passthrough/iommu.c
> --- a/xen/drivers/passthrough/iommu.c	Tue Jul 26 18:37:16 2011 +0100
> +++ b/xen/drivers/passthrough/iommu.c	Fri Oct 21 15:15:13 2011 -0400
> @@ -49,6 +49,7 @@
>  bool_t __read_mostly iommu_intremap = 1;
>  bool_t __read_mostly iommu_hap_pt_share;
>  bool_t __read_mostly amd_iommu_debug;
> +bool_t __read_mostly iommu_amd_perdev_vector_map = 1;
>  bool_t __read_mostly amd_iommu_perdev_intremap;
>  
>  static void __init
>  parse_iommu_param(char *s)
> @@ -84,6 +85,8 @@
>          iommu_dom0_strict = 1;
>      else if ( !strcmp(s, "sharept") )
>          iommu_hap_pt_share = 1;
> +    else if ( !strcmp(s, "no-perdev-vector-map") )
> +        iommu_amd_perdev_vector_map = 0;
>  
>          s = ss + 1;
>      } while ( ss );

> # HG changeset patch
> # User George Dunlap <george.dunlap@eu.citrix.com>
> # Date 1311701836 -3600
> # Node ID 2e0cf9428554da666616982cd0074024ff85b221
> # Parent ef9ed3d2aa870a37ed5e611be9c524d526a2d604
> xen: Option to allow per-device vector maps for MSI IRQs
>
> Add a vector-map to pci_dev, and add an option to point MSI-related
> IRQs to the vector-map of the device.
>
> This prevents irqs from the same device from being assigned
> the same vector on different pcpus. This is required for systems
> using an AMD IOMMU, since the intremap tables on AMD only look at
> vector, and not destination ID.
>
> Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
>
> diff -r 84b8504a6125 xen/arch/x86/irq.c
> --- a/xen/arch/x86/irq.c	Tue Jul 26 18:36:58 2011 +0100
> +++ b/xen/arch/x86/irq.c	Fri Oct 21 15:12:46 2011 -0400
> @@ -31,6 +31,9 @@
>  unsigned int __read_mostly nr_irqs_gsi = 16;
>  unsigned int __read_mostly nr_irqs;
>  integer_param("nr_irqs", nr_irqs);
> +
> +bool_t __read_mostly opt_irq_perdev_vector_map = 0;
> +boolean_param("irq-perdev-vector-map", opt_irq_perdev_vector_map);
>  
>  u8 __read_mostly *irq_vector;
>  struct irq_desc __read_mostly *irq_desc = NULL;
> @@ -1560,6 +1563,9 @@
>          dprintk(XENLOG_G_ERR, "dom%d: irq %d in use\n",
>                  d->domain_id, irq);
>      desc->handler = &pci_msi_type;
> +    if ( opt_irq_perdev_vector_map
> +         && !desc->chip_data->used_vectors )
> +        desc->chip_data->used_vectors = &pdev->info.used_vectors;
>      d->arch.pirq_irq[pirq] = irq;
>      d->arch.irq_pirq[irq] = pirq;
>      setup_msi_irq(pdev, msi_desc, irq);
> diff -r 84b8504a6125 xen/include/xen/pci.h
> --- a/xen/include/xen/pci.h	Tue Jul 26 18:36:58 2011 +0100
> +++ b/xen/include/xen/pci.h	Fri Oct 21 15:12:46 2011 -0400
> @@ -11,6 +11,7 @@
>  #include <xen/types.h>
>  #include <xen/list.h>
>  #include <xen/spinlock.h>
> +#include <xen/irq.h>
>  
>  /*
>   * The PCI interface treats multi-function devices as independent
> @@ -38,6 +39,7 @@
>          u8 bus;
>          u8 devfn;
>      } physfn;
> +    vmask_t used_vectors;
>  };
>  
>  struct pci_dev {

> # HG changeset patch
> # User George Dunlap <george.dunlap@eu.citrix.com>
> # Date 1311701818 -3600
> # Node ID ef9ed3d2aa870a37ed5e611be9c524d526a2d604
> # Parent 590aadf7c46ae979da3552332f592f9492ce6d8b
> xen: Infrastructure to allow irqs to share vector maps
>
> Laying the groundwork for per-device vector maps. This generic
> code allows any irq to point to a vector map; all irqs sharing the
> same vector map will avoid sharing vectors.
>
> Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
>
> diff -r 590aadf7c46a -r ef9ed3d2aa87 xen/arch/x86/io_apic.c
> --- a/xen/arch/x86/io_apic.c	Tue Jul 26 17:00:25 2011 +0100
> +++ b/xen/arch/x86/io_apic.c	Tue Jul 26 18:36:58 2011 +0100
> @@ -449,6 +449,11 @@
>              irq, vector, smp_processor_id());
>  
>          __get_cpu_var(vector_irq)[vector] = -1;
> +        if ( cfg->used_vectors )
> +        {
> +            ASSERT(test_bit(vector, cfg->used_vectors));
> +            clear_bit(vector, cfg->used_vectors);
> +        }
>          cfg->move_cleanup_count--;
>  unlock:
>          spin_unlock(&desc->lock);
> diff -r 590aadf7c46a -r ef9ed3d2aa87 xen/arch/x86/irq.c
> --- a/xen/arch/x86/irq.c	Tue Jul 26 17:00:25 2011 +0100
> +++ b/xen/arch/x86/irq.c	Tue Jul 26 18:36:58 2011 +0100
> @@ -108,6 +108,8 @@
>          per_cpu(vector_irq, cpu)[vector] = irq;
>      cfg->vector = vector;
>      cfg->cpu_mask = online_mask;
> +    if ( cfg->used_vectors )
> +        set_bit(vector, cfg->used_vectors);
>      irq_status[irq] = IRQ_USED;
>      if (IO_APIC_IRQ(irq))
>          irq_vector[irq] = vector;
> @@ -172,6 +174,7 @@
>      desc->depth = 1;
>      desc->msi_desc = NULL;
>      desc->handler = &no_irq_type;
> +    desc->chip_data->used_vectors=NULL;
>      cpus_setall(desc->affinity);
>      spin_unlock_irqrestore(&desc->lock, flags);
>  
> @@ -199,6 +202,9 @@
>  
>      for_each_cpu_mask(cpu, tmp_mask)
>          per_cpu(vector_irq, cpu)[vector] = -1;
> +
> +    if ( cfg->used_vectors )
> +        clear_bit(vector, cfg->used_vectors);
>  
>      cfg->vector = IRQ_VECTOR_UNASSIGNED;
>      cpus_clear(cfg->cpu_mask);
> @@ -277,6 +283,7 @@
>      cfg->vector = IRQ_VECTOR_UNASSIGNED;
>      cpus_clear(cfg->cpu_mask);
>      cpus_clear(cfg->old_cpu_mask);
> +    cfg->used_vectors = NULL;
>  }
>  
>  int __init init_irq_data(void)
> @@ -402,6 +409,10 @@
>          if (test_bit(vector, used_vectors))
>              goto next;
>  
> +        if (cfg->used_vectors
> +            && test_bit(vector, cfg->used_vectors) )
> +            goto next;
> +
>          for_each_cpu_mask(new_cpu, tmp_mask)
>              if (per_cpu(vector_irq, new_cpu)[vector] != -1)
>                  goto next;
> @@ -417,6 +428,11 @@
>              per_cpu(vector_irq, new_cpu)[vector] = irq;
>          cfg->vector = vector;
>          cpus_copy(cfg->cpu_mask, tmp_mask);
> +        if ( cfg->used_vectors )
> +        {
> +            ASSERT(!test_bit(vector, cfg->used_vectors));
> +            set_bit(vector, cfg->used_vectors);
> +        }
>  
>          irq_status[irq] = IRQ_USED;
>          if (IO_APIC_IRQ(irq))
> diff -r 590aadf7c46a -r ef9ed3d2aa87 xen/include/asm-x86/irq.h
> --- a/xen/include/asm-x86/irq.h	Tue Jul 26 17:00:25 2011 +0100
> +++ b/xen/include/asm-x86/irq.h	Tue Jul 26 18:36:58 2011 +0100
> @@ -24,11 +24,16 @@
>  #define irq_to_desc(irq) (&irq_desc[irq])
>  #define irq_cfg(irq) (&irq_cfg[irq])
>  
> +typedef struct {
> +    DECLARE_BITMAP(_bits,NR_VECTORS);
> +} vmask_t;
> +
>  struct irq_cfg {
>      int vector;
>      cpumask_t cpu_mask;
>      cpumask_t old_cpu_mask;
>      unsigned move_cleanup_count;
> +    vmask_t *used_vectors;
>      u8 move_in_progress : 1;
>  };

> # HG changeset patch
> # User George Dunlap <george.dunlap@eu.citrix.com>
> # Date 1314026133 -3600
> # Node ID 3a05da2dc7c0a5fc0fcfc40c535d1fcb71203625
> # Parent d1cd78a73a79e0e648937322cdb8d92a7f86327a
> x86: Fix up irq vector map logic
>
> We need to make sure that cfg->used_vectors is only cleared once;
> otherwise there may be a race condition that allows the same vector to
> be assigned twice, defeating the whole purpose of the map.
>
> This makes two changes:
> * __clear_irq_vector() only clears the vector if the irq is not being
>   moved
> * smp_irq_move_cleanup_interrupt() only clears used_vectors if this
>   is the last place it's being used (move_cleanup_count==0 after
>   decrement).
>
> Also make use of asserts more consistent, to catch this kind of logic
> bug in the future.
>
> Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
>
> diff -r d1cd78a73a79 -r 3a05da2dc7c0 xen/arch/x86/io_apic.c
> --- a/xen/arch/x86/io_apic.c	Mon Aug 22 16:15:19 2011 +0100
> +++ b/xen/arch/x86/io_apic.c	Mon Aug 22 16:15:33 2011 +0100
> @@ -485,12 +485,14 @@
>              irq, vector, smp_processor_id());
>  
>          __get_cpu_var(vector_irq)[vector] = -1;
> -        if ( cfg->used_vectors )
> +        cfg->move_cleanup_count--;
> +
> +        if ( cfg->move_cleanup_count == 0
> +             && cfg->used_vectors )
>          {
>              ASSERT(test_bit(vector, cfg->used_vectors));
>              clear_bit(vector, cfg->used_vectors);
>          }
> -        cfg->move_cleanup_count--;
>  unlock:
>          spin_unlock(&desc->lock);
>  }
> diff -r d1cd78a73a79 -r 3a05da2dc7c0 xen/arch/x86/irq.c
> --- a/xen/arch/x86/irq.c	Mon Aug 22 16:15:19 2011 +0100
> +++ b/xen/arch/x86/irq.c	Mon Aug 22 16:15:33 2011 +0100
> @@ -113,7 +113,10 @@
>      cfg->vector = vector;
>      cfg->cpu_mask = online_mask;
>      if ( cfg->used_vectors )
> +    {
> +        ASSERT(!test_bit(vector, cfg->used_vectors));
>          set_bit(vector, cfg->used_vectors);
> +    }
>      irq_status[irq] = IRQ_USED;
>      if (IO_APIC_IRQ(irq))
>          irq_vector[irq] = vector;
> @@ -207,15 +210,13 @@
>      for_each_cpu_mask(cpu, tmp_mask)
>          per_cpu(vector_irq, cpu)[vector] = -1;
>  
> -    if ( cfg->used_vectors )
> -        clear_bit(vector, cfg->used_vectors);
> -
>      cfg->vector = IRQ_VECTOR_UNASSIGNED;
>      cpus_clear(cfg->cpu_mask);
>      init_one_irq_status(irq);
>  
>      if (likely(!cfg->move_in_progress))
>          return;
> +
>      cpus_and(tmp_mask, cfg->old_cpu_mask, cpu_online_map);
>      for_each_cpu_mask(cpu, tmp_mask) {
>          for (vector = FIRST_DYNAMIC_VECTOR; vector <= LAST_DYNAMIC_VECTOR;
> @@ -228,6 +229,12 @@
>              break;
>          }
>      }
> +
> +    if ( cfg->used_vectors )
> +    {
> +        ASSERT(test_bit(vector, cfg->used_vectors));
> +        clear_bit(vector, cfg->used_vectors);
> +    }
>  
>      cfg->move_in_progress = 0;
>  }

> # HG changeset patch
> # User Jan Beulich <jbeulich@suse.com>
> # Date 1317730316 -7200
> # Node ID a99d75671a911f9c0d5d11e0fe88a0a65863cb44
> # Parent 3d1664cc9e458809e399320204aca8536e401ee1
> AMD-IOMMU: remove dead variable references
>
> These got orphaned by recent changes.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> Acked-by: Keir Fraser <keir@xen.org>
>
> diff -r 599cf097900b xen/drivers/passthrough/amd/pci_amd_iommu.c
> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c	Mon Sep 05 15:00:15 2011 +0100
> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c	Fri Oct 21 15:39:30 2011 -0400
> @@ -24,9 +24,6 @@
>  #include <asm/hvm/iommu.h>
>  #include <asm/amd-iommu.h>
>  #include <asm/hvm/svm/amd-iommu-proto.h>
> -
> -extern bool_t __read_mostly opt_irq_perdev_vector_map;
> -extern bool_t __read_mostly iommu_amd_perdev_vector_map;
>  
>  extern unsigned short ivrs_bdf_entries;
>  extern struct ivrs_mappings *ivrs_mappings;
> diff -r 599cf097900b xen/drivers/passthrough/iommu.c
> --- a/xen/drivers/passthrough/iommu.c	Mon Sep 05 15:00:15 2011 +0100
> +++ b/xen/drivers/passthrough/iommu.c	Fri Oct 21 15:39:30 2011 -0400
> @@ -49,7 +49,6 @@
>  bool_t __read_mostly iommu_intremap = 1;
>  bool_t __read_mostly iommu_hap_pt_share;
>  bool_t __read_mostly amd_iommu_debug;
> -bool_t __read_mostly iommu_amd_perdev_vector_map = 1;
>  bool_t __read_mostly amd_iommu_perdev_intremap;
>  
>  static void __init parse_iommu_param(char *s)
> @@ -85,8 +84,6 @@
>          iommu_dom0_strict = 1;
>      else if ( !strcmp(s, "sharept") )
>          iommu_hap_pt_share = 1;
> -    else if ( !strcmp(s, "no-perdev-vector-map") )
> -        iommu_amd_perdev_vector_map = 0;
>  
>          s = ss + 1;
>      } while ( ss );

> # HG changeset patch
> # User George Dunlap <george.dunlap@eu.citrix.com>
> # Date 1315231215 -3600
> # Node ID 32814ad7458dc842a7c588eee13e5c4ee11709a3
> # Parent f1349a968a5ac5577d67ad4a3f3490c580dbe264
> xen: Add global irq_vector_map option, set if using AMD global intremap tables
>
> As mentioned in previous changesets, AMD IOMMU interrupt
> remapping tables only look at the vector, not the destination
> id of an interrupt. This means that all IRQs going through
> the same interrupt remapping table need to *not* share vectors.
>
> The irq "vector map" functionality was originally introduced
> after a patch which disabled global AMD IOMMUs entirely. That
> patch has since been reverted, meaning that AMD intremap tables
> can either be per-device or global.
>
> This patch therefore introduces a global irq vector map option,
> and enables it if we're using an AMD IOMMU with a global
> interrupt remapping table.
>
> This patch removes the "irq-perdev-vector-map" boolean
> command-line option and replaces it with "irq_vector_map",
> which can have one of three values: none, global, or per-device.
>
> Setting the irq_vector_map to any value will override the
> default that the AMD code sets.
>
> Signed-off-by: George Dunlap <george.dunlap@eu.citrix.com>
>
> diff -r 6fc00d52c179 docs/src/user.tex
> --- a/docs/src/user.tex	Mon Aug 22 16:15:33 2011 +0100
> +++ b/docs/src/user.tex	Fri Oct 21 15:21:23 2011 -0400
> @@ -4197,6 +4197,10 @@
>  \item [ vcpu\_migration\_delay=$<$minimum\_time$>$] Set minimum time of
>  vcpu migration in microseconds (default 0). This parameter avoids agressive
>  vcpu migration. For example, the linux kernel uses 0.5ms by default.
> +\item [ irq_vector_map=xxx ] Enable irq vector non-sharing maps. Setting 'global'
> +  will ensure that no IRQs will share vectors. Setting 'per-device' will ensure
> +  that no IRQs from the same device will share vectors. Setting to 'none' will
> +  disable it entirely, overriding any defaults the IOMMU code may set.
>  \end{description}
>  
>  In addition, the following options may be specified on the Xen command
> diff -r 6fc00d52c179 xen/arch/x86/irq.c
> --- a/xen/arch/x86/irq.c	Mon Aug 22 16:15:33 2011 +0100
> +++ b/xen/arch/x86/irq.c	Fri Oct 21 15:21:23 2011 -0400
> @@ -24,6 +24,8 @@
>  #include <asm/mach-generic/mach_apic.h>
>  #include <public/physdev.h>
>  
> +static void parse_irq_vector_map_param(char *s);
> +
>  /* opt_noirqbalance: If true, software IRQ balancing/affinity is disabled. */
>  bool_t __read_mostly opt_noirqbalance = 0;
>  boolean_param("noirqbalance", opt_noirqbalance);
> @@ -33,8 +35,10 @@
>  integer_param("nr_irqs", nr_irqs);
>  
>  /* This default may be changed by the AMD IOMMU code */
> -bool_t __read_mostly opt_irq_perdev_vector_map = 0;
> -boolean_param("irq-perdev-vector-map", opt_irq_perdev_vector_map);
> +int __read_mostly opt_irq_vector_map = OPT_IRQ_VECTOR_MAP_DEFAULT;
> +custom_param("irq_vector_map", parse_irq_vector_map_param);
> +
> +vmask_t global_used_vector_map;
>  
>  u8 __read_mostly *irq_vector;
>  struct irq_desc __read_mostly *irq_desc = NULL;
> @@ -63,6 +67,26 @@
>  /* irq_ratelimit: the max irq rate allowed in every 10ms, set 0 to disable */
>  static unsigned int __read_mostly irq_ratelimit_threshold = 10000;
>  integer_param("irq_ratelimit", irq_ratelimit_threshold);
> +
> +static void __init parse_irq_vector_map_param(char *s)
> +{
> +    char *ss;
> +
> +    do {
> +        ss = strchr(s, ',');
> +        if ( ss )
> +            *ss = '\0';
> +
> +        if ( !strcmp(s, "none"))
> +            opt_irq_vector_map=OPT_IRQ_VECTOR_MAP_NONE;
> +        else if ( !strcmp(s, "global"))
> +            opt_irq_vector_map=OPT_IRQ_VECTOR_MAP_GLOBAL;
> +        else if ( !strcmp(s, "per-device"))
> +            opt_irq_vector_map=OPT_IRQ_VECTOR_MAP_PERDEV;
> +
> +        s = ss + 1;
> +    } while ( ss );
> +}
>  
>  /* Must be called when irq disabled */
>  void lock_vector_lock(void)
> @@ -348,6 +372,41 @@
>      end_none
>  };
>  
> +static vmask_t *irq_get_used_vector_mask(int irq)
> +{
> +    vmask_t *ret = NULL;
> +
> +    if ( opt_irq_vector_map ==
> +         OPT_IRQ_VECTOR_MAP_GLOBAL )
> +    {
> +        struct irq_desc *desc = irq_to_desc(irq);
> +
> +        ret = &global_used_vector_map;
> +
> +        if ( desc->chip_data->used_vectors )
> +        {
> +            printk(XENLOG_INFO "%s: Strange, unassigned irq %d already has used_vectors!\n",
> +                   __func__, irq);
> +        }
> +        else
> +        {
> +            int vector;
> +
> +            vector = irq_to_vector(irq);
> +            if ( vector > 0 )
> +            {
> +                printk(XENLOG_INFO "%s: Strange, irq %d already assigned vector %d!\n",
> +                       __func__, irq, vector);
> +
> +                ASSERT(!test_bit(vector, ret));
> +
> +                set_bit(vector, ret);
> +            }
> +        }
> +    }
> +
> +    return ret;
> +}
> +
>  int __assign_irq_vector(int irq, struct irq_cfg *cfg, const cpumask_t *mask)
>  {
>      /*
> @@ -366,6 +425,7 @@
>      int cpu, err;
>      unsigned long flags;
>      cpumask_t tmp_mask;
> +    vmask_t *irq_used_vectors = NULL;
>  
>      old_vector = irq_to_vector(irq);
>      if (old_vector) {
> @@ -380,6 +440,17 @@
>          return -EAGAIN;
>  
>      err = -ENOSPC;
> +
> +    /* This is the only place normal IRQs are ever marked
> +     * as "in use". If they're not in use yet, check to see
> +     * if we need to assign a global vector mask.
> +     */
> +    if ( irq_status[irq] == IRQ_USED )
> +    {
> +        irq_used_vectors = cfg->used_vectors;
> +    }
> +    else
> +        irq_used_vectors = irq_get_used_vector_mask(irq);
> +
>      for_each_cpu_mask(cpu, *mask) {
>          int new_cpu;
>          int vector, offset;
> @@ -405,8 +476,8 @@
>          if (test_bit(vector, used_vectors))
>              goto next;
>  
> -        if (cfg->used_vectors
> -            && test_bit(vector, cfg->used_vectors) )
> +        if (irq_used_vectors
> +            && test_bit(vector, irq_used_vectors) )
>              goto next;
>  
>          for_each_cpu_mask(new_cpu, tmp_mask)
> @@ -424,15 +495,22 @@
>              per_cpu(vector_irq, new_cpu)[vector] = irq;
>          cfg->vector = vector;
>          cpus_copy(cfg->cpu_mask, tmp_mask);
> +
> +        irq_status[irq] = IRQ_USED;
> +        ASSERT((cfg->used_vectors == NULL)
> +               || (cfg->used_vectors == irq_used_vectors));
> +        cfg->used_vectors = irq_used_vectors;
> +
> +        if (IO_APIC_IRQ(irq))
> +            irq_vector[irq] = vector;
> +
>          if ( cfg->used_vectors )
>          {
>              ASSERT(!test_bit(vector, cfg->used_vectors));
> +
>              set_bit(vector, cfg->used_vectors);
>          }
>  
> -        irq_status[irq] = IRQ_USED;
> -        if (IO_APIC_IRQ(irq))
> -            irq_vector[irq] = vector;
>          err = 0;
>          local_irq_restore(flags);
>          break;
> @@ -1521,7 +1599,7 @@
>  
>      if ( !IS_PRIV(current->domain) &&
>           !(IS_PRIV_FOR(current->domain, d) &&
> -          irq_access_permitted(current->domain, pirq)))
> +           irq_access_permitted(current->domain, pirq)))
>          return -EPERM;
>  
>      if ( pirq < 0 || pirq >= d->nr_pirqs || irq < 0 || irq >= nr_irqs )
> @@ -1569,11 +1647,21 @@
>  
>      if ( desc->handler != &no_irq_type )
>          dprintk(XENLOG_G_ERR, "dom%d: irq %d in use\n",
> -               d->domain_id, irq);
> +                d->domain_id, irq);
>      desc->handler = &pci_msi_type;
> -    if ( opt_irq_perdev_vector_map
> +    if ( opt_irq_vector_map == OPT_IRQ_VECTOR_MAP_PERDEV
>           && !desc->chip_data->used_vectors )
> +    {
>          desc->chip_data->used_vectors = &pdev->info.used_vectors;
> +        if ( desc->chip_data->vector != IRQ_VECTOR_UNASSIGNED )
> +        {
> +            int vector = desc->chip_data->vector;
> +            ASSERT(!test_bit(vector, desc->chip_data->used_vectors));
> +
> +            set_bit(vector, desc->chip_data->used_vectors);
> +        }
> +    }
> +
>      d->arch.pirq_irq[pirq] = irq;
>      d->arch.irq_pirq[irq] = pirq;
>      setup_msi_irq(pdev, msi_desc, irq);
> @@ -1584,9 +1672,12 @@
>      d->arch.pirq_irq[pirq] = irq;
>      d->arch.irq_pirq[irq] = pirq;
>      spin_unlock_irqrestore(&desc->lock, flags);
> +
> +    if ( opt_irq_vector_map == OPT_IRQ_VECTOR_MAP_PERDEV )
> +        printk(XENLOG_INFO "Per-device vector maps for GSIs not implemented yet.\n");
>  }
>  
> - done:
> +done:
>      return ret;
>  }
>  
> diff -r 6fc00d52c179 xen/drivers/passthrough/amd/pci_amd_iommu.c
> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c	Mon Aug 22 16:15:33 2011 +0100
> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c	Fri Oct 21 15:21:23 2011 -0400
> @@ -169,18 +169,35 @@
>          return -ENODEV;
>      }
>  
> -    /* Enable use of per-device vector map unless otherwise
> -     * specified */
> -    if ( iommu_amd_perdev_vector_map )
> +    /*
> +     * AMD IOMMUs don't distinguish between vectors destined for
> +     * different cpus when doing interrupt remapping. This means
> +     * that interrupts going through the same intremap table
> +     * can't share the same vector.
> +     *
> +     * If irq_vector_map isn't specified, choose a sensible default:
> +     *  - If we're using per-device intremap tables, per-device
> +     *    vector non-sharing maps
> +     *  - If we're using a global intremap table, global vector
> +     *    non-sharing map
> +     */
> +    if ( opt_irq_vector_map == OPT_IRQ_VECTOR_MAP_DEFAULT )
>      {
> -        printk("AMD-Vi: Enabling per-device vector maps\n");
> -        opt_irq_perdev_vector_map=1;
> +        if ( amd_iommu_perdev_intremap )
> +        {
> +            printk("AMD-Vi: Enabling per-device vector maps\n");
> +            opt_irq_vector_map = OPT_IRQ_VECTOR_MAP_PERDEV;
> +        }
> +        else
> +        {
> +            printk("AMD-Vi: Enabling global vector map\n");
> +            opt_irq_vector_map = OPT_IRQ_VECTOR_MAP_GLOBAL;
> +        }
>      }
>      else
>      {
> -        printk("AMD-Vi: WARNING - not enabling per-device vector maps\n");
> +        printk("AMD-Vi: Not overriding irq_vector_map setting\n");
>      }
> -
>      return scan_pci_devices();
>  }
>  
> diff -r 6fc00d52c179 xen/include/asm-x86/irq.h
> --- a/xen/include/asm-x86/irq.h	Mon Aug 22 16:15:33 2011 +0100
> +++ b/xen/include/asm-x86/irq.h	Fri Oct 21 15:21:23 2011 -0400
> @@ -44,6 +44,13 @@
>  extern u8 *irq_vector;
>  
>  extern bool_t opt_noirqbalance;
> +
> +#define OPT_IRQ_VECTOR_MAP_DEFAULT 0 /* Do the default thing */
> +#define OPT_IRQ_VECTOR_MAP_NONE    1 /* None */
> +#define OPT_IRQ_VECTOR_MAP_GLOBAL  2 /* One global vector map (no vector sharing) */
> +#define OPT_IRQ_VECTOR_MAP_PERDEV  3 /* Per-device vector map (no vector sharing w/in a device) */
> +
> +extern int opt_irq_vector_map;
>  
>  /*
>   * Per-cpu current frame pointer - the location of the last exception frame on
Jan Beulich
2011-Nov-17 16:49 UTC
Re: [Xen-devel] Backport per-device vector map patches to xen 4.1.3
>>> On 17.11.11 at 17:02, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> On Thu, Oct 27, 2011 at 04:07:51PM +0200, Wei Wang2 wrote:
>> Recently we found an issue in xen 4.1. Under heavy I/O stress such as running
>> bonnie++, Dom0 would lose its hard disk with lots of I/O errors. We found
>> that some PCI-E devices were using the same vector as the SMBus on AMD
>> platforms, and George's patch set that enables per-device vector maps can
>> fix this problem.
>>
>> Following patches have been back ported and tested by Jacob and Wei H.
>> Please apply them to Xen 4.1.3
>
> Can they be applied please? I've been running these without hiccups since,
> well, they were posted. On both AMD and Intel boxes.
>
> So, Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

Thanks to Keir, they're already in - c/s 23177:9a38e30e5459.

Jan
Konrad Rzeszutek Wilk
2011-Nov-17 17:05 UTC
Re: [Xen-devel] Backport per-device vector map patches to xen 4.1.3
On Thu, Nov 17, 2011 at 04:49:28PM +0000, Jan Beulich wrote:
> >>> On 17.11.11 at 17:02, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> > On Thu, Oct 27, 2011 at 04:07:51PM +0200, Wei Wang2 wrote:
> >> Recently we found an issue in xen 4.1. Under heavy I/O stress such as running
> >> bonnie++, Dom0 would lose its hard disk with lots of I/O errors. We found
> >> that some PCI-E devices were using the same vector as the SMBus on AMD
> >> platforms, and George's patch set that enables per-device vector maps can
> >> fix this problem.
> >>
> >> Following patches have been back ported and tested by Jacob and Wei H.
> >> Please apply them to Xen 4.1.3
> >
> > Can they be applied please? I've been running these without hiccups since,
> > well, they were posted. On both AMD and Intel boxes.
> >
> > So, Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>
> Thanks to Keir, they're already in - c/s 23177:9a38e30e5459.

Oh, they are. Hmm, somehow when I looked at xenbits.xen.org I couldn't find
them. Now I see them :-)