This set of patches are partly bugfixed to problems I have noticed while working on this area of the codebase, and targeted fixes to get the kexec path working again on Xen 4.x The first three patches are small bugfixes which are not directly related to the kexec problems, but are on the codepath. The next four patches are directly related to fixing the kexec path. These patches cause xen to track the BSP local APIC boot state and return to it before kexec''ing to a new kernel. This prevents the kdump kernel falling over itself when booting in x2apic mode while expecting to be in xapic mode. It also makes sure to disable IO virtualisation, along with fixing the current codepath related to disabling Interrup Remapping on Intel VTD boxes. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-Jun-13 17:02 UTC
[Xen-devel] [PATCH 1 of 7] APIC BUG: fix potential Protection Fault during shutdown
This is a rare case, but if the BIOS is set to uniprocessor, and Xen is booted with ''lapic x2apic'', Xen will switch into x2apic mode, which will cause a protection fault when disabling the local APIC. This leads to a general protection fault as this code is also in the fault handler. When x2apic mode is enabled, the only tranlsation which does not result in a protection fault is to clear both the EN and EXTD bits, which is safe to do in all cases, even if you are in xapic mode rather than x2apic mode. The linux code from which this is derrived is protected by an if ( ! x2apic_mode ...) clause which is how they get away with it. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> diff -r 37c77bacb52a -r 076c3034c8c7 xen/arch/x86/apic.c --- a/xen/arch/x86/apic.c Mon May 23 17:38:28 2011 +0100 +++ b/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100 @@ -340,7 +340,8 @@ void disable_local_APIC(void) if (enabled_via_apicbase) { uint64_t msr_content; rdmsrl(MSR_IA32_APICBASE, msr_content); - wrmsrl(MSR_IA32_APICBASE, msr_content & ~MSR_IA32_APICBASE_ENABLE); + wrmsrl(MSR_IA32_APICBASE, msr_content & + ~(MSR_IA32_APICBASE_ENABLE|MSR_IA32_APICBASE_EXTD)); } } _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-Jun-13 17:02 UTC
[Xen-devel] [PATCH 2 of 7] KEXEC BUG: nmi_shootdown_cpus doesn''t look after the interrupt flag
nmi_shootdown_cpus is part of the kexec path, coming from a panic, and as such can be called both with interrupts enabled or disabled. We really dont want to accidentally set IF. Therefore, use save/restore in preference to disable/enable. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> diff -r 076c3034c8c7 -r 1c3d2e4d06fe xen/arch/x86/crash.c --- a/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100 +++ b/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100 @@ -55,9 +55,9 @@ static int crash_nmi_callback(struct cpu static void nmi_shootdown_cpus(void) { - unsigned long msecs; + unsigned long msecs, flags; - local_irq_disable(); + local_irq_save(flags); crashing_cpu = smp_processor_id(); local_irq_count(crashing_cpu) = 0; @@ -80,7 +80,7 @@ static void nmi_shootdown_cpus(void) __stop_this_cpu(); disable_IO_APIC(); - local_irq_enable(); + local_irq_restore(flags); } void machine_crash_shutdown(void) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-Jun-13 17:02 UTC
[Xen-devel] [PATCH 3 of 7] IOMMU: Sanitise pointer work
Check for null pointers before calling function pointers. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> diff -r 1c3d2e4d06fe -r 68efd418b6f1 xen/drivers/passthrough/iommu.c --- a/xen/drivers/passthrough/iommu.c Mon Jun 13 17:45:43 2011 +0100 +++ b/xen/drivers/passthrough/iommu.c Mon Jun 13 17:45:43 2011 +0100 @@ -407,17 +407,17 @@ unsigned int iommu_read_apic_from_ire(un return ops->read_apic_from_ire(apic, reg); } -void iommu_resume() +void iommu_resume(void) { const struct iommu_ops *ops = iommu_get_ops(); - if ( iommu_enabled ) + if ( iommu_enabled && ops && ops->resume ) ops->resume(); } -void iommu_suspend() +void iommu_suspend(void) { const struct iommu_ops *ops = iommu_get_ops(); - if ( iommu_enabled ) + if ( iommu_enabled && ops && ops->suspend ) ops->suspend(); } _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-Jun-13 17:02 UTC
[Xen-devel] [PATCH 4 of 7] APIC: record local APIC state on boot
Xen does not store the boot local APIC state which leads to problems when shutting down for a kexec jump. This patch records the boot state so we can return to the boot state when kexecing. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> diff -r 68efd418b6f1 -r 615db56671fa xen/arch/x86/apic.c --- a/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100 +++ b/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100 @@ -74,6 +74,8 @@ u8 __read_mostly apic_verbosity; static bool_t __initdata opt_x2apic = 1; boolean_param("x2apic", opt_x2apic); +enum apic_mode apic_boot_mode = APIC_MODE_INVALID; + bool_t __read_mostly x2apic_enabled = 0; bool_t __read_mostly directed_eoi_enabled = 0; @@ -1438,6 +1440,62 @@ int __init APIC_init_uniprocessor (void) return 0; } +/* Needs to be called during startup. It records the state the BIOS + * leaves the local APIC so we can undo upon kexec. + */ +void __init record_boot_APIC_mode(void) +{ + /* Sanity check - we should only ever run once, but could possibly + * be called several times */ + if ( APIC_MODE_INVALID != apic_boot_mode ) + return; + + apic_boot_mode = current_local_apic_mode(); + + apic_printk(APIC_DEBUG, "APIC boot state is ''%s''\n", + apic_mode_to_str(apic_boot_mode)); +} + +/* Look at the bits in MSR_IA32_APICBASE and work out which + * APIC mode we are in */ +enum apic_mode current_local_apic_mode(void) +{ + u64 msr_contents; + + rdmsrl(MSR_IA32_APICBASE, msr_contents); + + /* Reading EXTD bit from the MSR is only valid if CPUID + * says so, else reserved */ + if ( cpu_has(¤t_cpu_data, X86_FEATURE_X2APIC) + && (msr_contents & MSR_IA32_APICBASE_EXTD) ) + return APIC_MODE_X2APIC; + + /* EN bit should always be valid as long as we can read the MSR + */ + if ( msr_contents & MSR_IA32_APICBASE_ENABLE ) + return APIC_MODE_XAPIC; + + return APIC_MODE_DISABLED; +} + + +const char * apic_mode_to_str(const enum apic_mode mode) +{ + switch ( mode ) + { + case APIC_MODE_INVALID: + return "invalid"; + case APIC_MODE_DISABLED: + return "disabled"; + case APIC_MODE_XAPIC: + return "xapic"; + case APIC_MODE_X2APIC: + return "x2apic"; + default: + return "unrecognised"; + } +} + void check_for_unexpected_msi(unsigned int vector) { unsigned long v = apic_read(APIC_ISR + ((vector & ~0x1f) >> 1)); diff -r 68efd418b6f1 -r 615db56671fa xen/arch/x86/genapic/probe.c --- a/xen/arch/x86/genapic/probe.c Mon Jun 13 17:45:43 2011 +0100 +++ b/xen/arch/x86/genapic/probe.c Mon Jun 13 17:45:43 2011 +0100 @@ -60,6 +60,8 @@ void __init generic_apic_probe(void) { int i, changed; + record_boot_APIC_mode(); + check_x2apic_preenabled(); cmdline_apic = changed = (genapic != NULL); diff -r 68efd418b6f1 -r 615db56671fa xen/include/asm-x86/apic.h --- a/xen/include/asm-x86/apic.h Mon Jun 13 17:45:43 2011 +0100 +++ b/xen/include/asm-x86/apic.h Mon Jun 13 17:45:43 2011 +0100 @@ -21,6 +21,23 @@ #define IO_APIC_REDIR_DEST_LOGICAL 0x00800 #define IO_APIC_REDIR_DEST_PHYSICAL 0x00000 +/* Possible APIC states */ +enum apic_mode { APIC_MODE_INVALID, /* Not set yet */ + APIC_MODE_DISABLED, /* If uniprocessor, or smp in + * uniprocessor mode */ + APIC_MODE_XAPIC, /* xAPIC mode - default upon + * chipset reset */ + APIC_MODE_X2APIC /* x2APIC mode - common for large + * smp machines */ +}; + +/* Bootstrap processor local APIC boot mode - so we can undo our changes + * to the APIC state */ +extern enum apic_mode apic_boot_mode; + +/* enum apic_mode -> str function for logging/debug */ +const char * apic_mode_to_str(const enum apic_mode); + extern u8 apic_verbosity; extern bool_t x2apic_enabled; extern bool_t directed_eoi_enabled; @@ -203,6 +220,8 @@ extern void disable_APIC_timer(void); extern void enable_APIC_timer(void); extern int lapic_suspend(void); extern int lapic_resume(void); +extern void record_boot_APIC_mode(void); +extern enum apic_mode current_local_apic_mode(void); extern int check_nmi_watchdog (void); _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-Jun-13 17:02 UTC
[Xen-devel] [PATCH 5 of 7] IOMMU VTD BUG: disable Extended Interrupt Mode when disabling Interupt Remapping
Experimental evidence shows that Extended Interrupt Mode remains in effect even after Interrupt Remapping is disabled in each DMAR Global Command Register. A consiquence of this is that when we switch from x2apic mode back to xapic mode, and disable interrupt remapping for the kdump kernel, interrupts passing through the IO APICs are in x2apic format as opposed xapic. This causes a triple fault in the kexec kernel. As EIM is explicitly set up each time Interrup Remapping is enabled, it is safe for us to clobber this when taring down. Also, change the header definition of IRTA_REG_EIME_SHIFT. It caused verbose and error-prone code, and was only used in 1 place before. We now have IRTA_EIME which is the specific bit in the register. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> diff -r 615db56671fa -r 2635774346c4 xen/drivers/passthrough/vtd/intremap.c --- a/xen/drivers/passthrough/vtd/intremap.c Mon Jun 13 17:45:43 2011 +0100 +++ b/xen/drivers/passthrough/vtd/intremap.c Mon Jun 13 17:45:43 2011 +0100 @@ -776,8 +776,7 @@ int enable_intremap(struct iommu *iommu, #ifdef CONFIG_X86 /* set extended interrupt mode bit */ - ir_ctrl->iremap_maddr |- eim ? (1 << IRTA_REG_EIME_SHIFT) : 0; + ir_ctrl->iremap_maddr |= eim ? IRTA_EIME : 0; #endif spin_lock_irqsave(&iommu->register_lock, flags); @@ -817,6 +816,22 @@ void disable_intremap(struct iommu *iomm if ( !ecap_intr_remap(iommu->ecap) ) return; + /* If we are disabling Interrupt Remapping, make sure we dont stay in + * Extended Interrupt Mode, as this is unaffected by the Interrupt + * Remapping flag in each DMAR Global Control Register. + * Specifically, local apics in xapic mode do not like interrupts delivered + * in x2apic mode. Any code turning interrupt remapping back on will set + * EIME back correctly. + */ + if ( iommu_supports_eim() ) + { + u64 irta; + irta = dmar_readl(iommu->reg, DMAR_IRTA_REG); + dmar_writel(iommu->reg, DMAR_IRTA_REG, irta & ~IRTA_EIME); + IOMMU_WAIT_OP(iommu, DMAR_IRTA_REG, dmar_readl, + !(irta & IRTA_EIME), irta); + } + spin_lock_irqsave(&iommu->register_lock, flags); sts = dmar_readl(iommu->reg, DMAR_GSTS_REG); if ( !(sts & DMA_GSTS_IRES) ) diff -r 615db56671fa -r 2635774346c4 xen/drivers/passthrough/vtd/iommu.h --- a/xen/drivers/passthrough/vtd/iommu.h Mon Jun 13 17:45:43 2011 +0100 +++ b/xen/drivers/passthrough/vtd/iommu.h Mon Jun 13 17:45:43 2011 +0100 @@ -471,7 +471,7 @@ struct qinval_entry { #define IEC_GLOBAL_INVL 0 #define IEC_INDEX_INVL 1 -#define IRTA_REG_EIME_SHIFT 11 +#define IRTA_EIME (1 << 11) /* 2^(IRTA_REG_TABLE_SIZE + 1) = IREMAP_ENTRY_NR */ #define IRTA_REG_TABLE_SIZE ( IREMAP_PAGE_ORDER + 7 ) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-Jun-13 17:02 UTC
[Xen-devel] [PATCH 6 of 7] IOMMU: add crash_shutdown iommu_op
The kdump kernel has problems booting with interrupt/dma remapping enabled, so we need a new iommu_ops called crash_shutdown which is basically suspend but doesn''t need to bother saving state. Make sure that crash_shutdown is called on the kexec path. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> diff -r 2635774346c4 -r 3ad737eb0a8d xen/arch/x86/crash.c --- a/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100 +++ b/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100 @@ -77,6 +77,10 @@ static void nmi_shootdown_cpus(void) msecs--; } + /* Crash shutdown any IOMMU functionality as the crashdump kernel is not + * happy when booting if interrupt/dma remapping is still enabled */ + iommu_crash_shutdown(); + __stop_this_cpu(); disable_IO_APIC(); diff -r 2635774346c4 -r 3ad737eb0a8d xen/drivers/passthrough/amd/iommu_init.c --- a/xen/drivers/passthrough/amd/iommu_init.c Mon Jun 13 17:45:43 2011 +0100 +++ b/xen/drivers/passthrough/amd/iommu_init.c Mon Jun 13 17:45:43 2011 +0100 @@ -921,6 +921,14 @@ void amd_iommu_suspend(void) disable_iommu(iommu); } +void amd_iommu_crash_shutdown(void) +{ + struct amd_iommu *iommu; + + for_each_amd_iommu ( iommu ) + disable_iommu(iommu); +} + void amd_iommu_resume(void) { struct amd_iommu *iommu; diff -r 2635774346c4 -r 3ad737eb0a8d xen/drivers/passthrough/amd/pci_amd_iommu.c --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c Mon Jun 13 17:45:43 2011 +0100 +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c Mon Jun 13 17:45:43 2011 +0100 @@ -459,4 +459,5 @@ const struct iommu_ops amd_iommu_ops = { .suspend = amd_iommu_suspend, .resume = amd_iommu_resume, .share_p2m = amd_iommu_share_p2m, + .crash_shutdown = amd_iommu_crash_shutdown, }; diff -r 2635774346c4 -r 3ad737eb0a8d xen/drivers/passthrough/iommu.c --- a/xen/drivers/passthrough/iommu.c Mon Jun 13 17:45:43 2011 +0100 +++ b/xen/drivers/passthrough/iommu.c Mon Jun 13 17:45:43 2011 +0100 @@ -429,6 +429,14 @@ void iommu_share_p2m_table(struct domain ops->share_p2m(d); } +void iommu_crash_shutdown(void) +{ + const struct iommu_ops *ops = iommu_get_ops(); + if ( ops && ops->crash_shutdown ) + ops->crash_shutdown(); + iommu_enabled = 0; +} + /* * Local variables: * mode: C diff -r 2635774346c4 -r 3ad737eb0a8d xen/drivers/passthrough/vtd/iommu.c --- a/xen/drivers/passthrough/vtd/iommu.c Mon Jun 13 17:45:43 2011 +0100 +++ b/xen/drivers/passthrough/vtd/iommu.c Mon Jun 13 17:45:43 2011 +0100 @@ -2238,6 +2238,25 @@ static void vtd_suspend(void) } } +static void vtd_crash_shutdown(void) +{ + struct acpi_drhd_unit *drhd; + struct iommu *iommu; + + if ( !iommu_enabled ) + return; + + iommu_flush_all(); + + for_each_drhd_unit ( drhd ) + { + iommu = drhd->iommu; + iommu_disable_translation(iommu); + } + + iommu_disable_x2apic_IR(); +} + static void vtd_resume(void) { struct acpi_drhd_unit *drhd; @@ -2289,6 +2308,7 @@ const struct iommu_ops intel_iommu_ops .suspend = vtd_suspend, .resume = vtd_resume, .share_p2m = iommu_set_pgd, + .crash_shutdown = vtd_crash_shutdown, }; /* diff -r 2635774346c4 -r 3ad737eb0a8d xen/include/asm-x86/hvm/svm/amd-iommu-proto.h --- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h Mon Jun 13 17:45:43 2011 +0100 +++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h Mon Jun 13 17:45:43 2011 +0100 @@ -98,6 +98,7 @@ unsigned int amd_iommu_read_ioapic_from_ /* power management support */ void amd_iommu_resume(void); void amd_iommu_suspend(void); +void amd_iommu_crash_shutdown(void); static inline u32 get_field_from_reg_u32(u32 reg_value, u32 mask, u32 shift) { diff -r 2635774346c4 -r 3ad737eb0a8d xen/include/xen/iommu.h --- a/xen/include/xen/iommu.h Mon Jun 13 17:45:43 2011 +0100 +++ b/xen/include/xen/iommu.h Mon Jun 13 17:45:43 2011 +0100 @@ -133,6 +133,7 @@ struct iommu_ops { void (*suspend)(void); void (*resume)(void); void (*share_p2m)(struct domain *d); + void (*crash_shutdown)(void); }; void iommu_update_ire_from_apic(unsigned int apic, unsigned int reg, unsigned int value); @@ -142,6 +143,7 @@ unsigned int iommu_read_apic_from_ire(un void iommu_suspend(void); void iommu_resume(void); +void iommu_crash_shutdown(void); void iommu_set_dom0_mapping(struct domain *d); void iommu_share_p2m_table(struct domain *d); _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-Jun-13 17:02 UTC
[Xen-devel] [PATCH 7 of 7] KEXEC: correctly revert x2apic state when kexecing
Introduce the boolean variable ''kexecing'' which indicates to functions whether we are on the kexec path or not. This is used by disable_local_APIC() to try and revert the APIC mode back to how it was found on boot. We also need some fudging of the x2apic_enabled variable. It is used in multiple places over the codebase to mean multiple things, including: What did the user specifify on the command line? Did the BIOS boot me in x2apic mode? Is the BSP Local APIC in x2apic mode? What mode is my Local APIC in? Therefore, set it up to prevent a protection fault when disabling the IOAPICs. (In this case, it is used in the "What mode is my Local APIC in?" case, so the processor doesnt suffer a protection fault because of trying to use x2apic MSRs when it should be using xapic MMIO) Finally, make sure that interrupts are disabled when jumping into the purgatory code. It would be bad to service interrupts in the Xen context when the next kernel is booting. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> diff -r 3ad737eb0a8d -r b6666fbafe33 xen/arch/x86/apic.c --- a/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100 +++ b/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100 @@ -37,6 +37,7 @@ #include <asm/asm_defns.h> /* for BUILD_SMP_INTERRUPT */ #include <mach_apic.h> #include <io_ports.h> +#include <xen/kexec.h> static bool_t tdt_enabled __read_mostly; static bool_t tdt_enable __initdata = 1; @@ -345,6 +346,33 @@ void disable_local_APIC(void) wrmsrl(MSR_IA32_APICBASE, msr_content & ~(MSR_IA32_APICBASE_ENABLE|MSR_IA32_APICBASE_EXTD)); } + + if ( kexecing ) + { + uint64_t msr_content; + rdmsrl(MSR_IA32_APICBASE, msr_content); + msr_content &= ~(MSR_IA32_APICBASE_ENABLE|MSR_IA32_APICBASE_EXTD); + wrmsrl(MSR_IA32_APICBASE, msr_content); + + switch ( apic_boot_mode ) + { + case APIC_MODE_DISABLED: + break; /* Nothing to do - we did this above */ + case APIC_MODE_XAPIC: + msr_content |= MSR_IA32_APICBASE_ENABLE; + wrmsrl(MSR_IA32_APICBASE, msr_content); + break; + case APIC_MODE_X2APIC: + msr_content |= (MSR_IA32_APICBASE_ENABLE|MSR_IA32_APICBASE_EXTD); + wrmsrl(MSR_IA32_APICBASE, msr_content); + break; + default: + printk("Default case when reverting #%d lapic to boot state\n", + smp_processor_id()); + break; + } + } + } /* diff -r 3ad737eb0a8d -r b6666fbafe33 xen/arch/x86/crash.c --- a/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100 +++ b/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100 @@ -27,6 +27,7 @@ #include <asm/hvm/support.h> #include <asm/apic.h> #include <asm/io_apic.h> +#include <xen/iommu.h> static atomic_t waiting_for_crash_ipi; static unsigned int crashing_cpu; @@ -82,6 +83,15 @@ static void nmi_shootdown_cpus(void) iommu_crash_shutdown(); __stop_this_cpu(); + + /* This is a bit of a hack due to the problems with the x2apic_enabled + * variable, but we can''t do any better without a significant refactoring + * of the APIC code */ + if ( current_local_apic_mode() == APIC_MODE_X2APIC ) + x2apic_enabled = 1; + else + x2apic_enabled = 0; + disable_IO_APIC(); local_irq_restore(flags); diff -r 3ad737eb0a8d -r b6666fbafe33 xen/arch/x86/machine_kexec.c --- a/xen/arch/x86/machine_kexec.c Mon Jun 13 17:45:43 2011 +0100 +++ b/xen/arch/x86/machine_kexec.c Mon Jun 13 17:45:43 2011 +0100 @@ -91,6 +91,11 @@ void machine_kexec(xen_kexec_image_t *im if ( hpet_broadcast_is_available() ) hpet_disable_legacy_broadcast(); + /* We are about to permenantly jump out of the Xen context into the kexec + * purgatory code. We really dont want to be still servicing interupts. + */ + local_irq_disable(); + /* * compat_machine_kexec() returns to idle pagetables, which requires us * to be running on a static GDT mapping (idle pagetables have no GDT diff -r 3ad737eb0a8d -r b6666fbafe33 xen/common/kexec.c --- a/xen/common/kexec.c Mon Jun 13 17:45:43 2011 +0100 +++ b/xen/common/kexec.c Mon Jun 13 17:45:43 2011 +0100 @@ -29,6 +29,8 @@ #include <compat/kexec.h> #endif +bool_t kexecing = FALSE; + static DEFINE_PER_CPU_READ_MOSTLY(void *, crash_notes); static Elf_Note *xen_crash_note; @@ -220,6 +222,8 @@ void kexec_crash(void) if ( !test_bit(KEXEC_IMAGE_CRASH_BASE + pos, &kexec_flags) ) return; + kexecing = TRUE; + kexec_common_shutdown(); kexec_crash_save_cpu(); machine_crash_shutdown(); @@ -232,6 +236,8 @@ static long kexec_reboot(void *_image) { xen_kexec_image_t *image = _image; + kexecing = TRUE; + kexec_common_shutdown(); machine_reboot_kexec(image); diff -r 3ad737eb0a8d -r b6666fbafe33 xen/include/xen/kexec.h --- a/xen/include/xen/kexec.h Mon Jun 13 17:45:43 2011 +0100 +++ b/xen/include/xen/kexec.h Mon Jun 13 17:45:43 2011 +0100 @@ -12,6 +12,8 @@ typedef struct xen_kexec_reserve { extern xen_kexec_reserve_t kexec_crash_area; +extern bool_t kexecing; + void set_kexec_crash_area_size(u64 system_ram); /* We have space for 4 images to support atomic update _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2011-Jun-13 18:13 UTC
Re: [Xen-devel] [PATCH 3 of 7] IOMMU: Sanitise pointer work
On 13/06/2011 18:02, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:> Check for null pointers before calling function pointers.But they''re never NULL? This one''s a bit pointless. -- Keir> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > > diff -r 1c3d2e4d06fe -r 68efd418b6f1 xen/drivers/passthrough/iommu.c > --- a/xen/drivers/passthrough/iommu.c Mon Jun 13 17:45:43 2011 +0100 > +++ b/xen/drivers/passthrough/iommu.c Mon Jun 13 17:45:43 2011 +0100 > @@ -407,17 +407,17 @@ unsigned int iommu_read_apic_from_ire(un > return ops->read_apic_from_ire(apic, reg); > } > > -void iommu_resume() > +void iommu_resume(void) > { > const struct iommu_ops *ops = iommu_get_ops(); > - if ( iommu_enabled ) > + if ( iommu_enabled && ops && ops->resume ) > ops->resume(); > } > > -void iommu_suspend() > +void iommu_suspend(void) > { > const struct iommu_ops *ops = iommu_get_ops(); > - if ( iommu_enabled ) > + if ( iommu_enabled && ops && ops->suspend ) > ops->suspend(); > } > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2011-Jun-13 18:15 UTC
Re: [Xen-devel] [PATCH 0 of 7] Fix kexec in Xen (take 4)
On 13/06/2011 18:02, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:> This set of patches are partly bugfixed to problems I have noticed while > working on this area of the codebase, and targeted fixes to get the kexec path > working again on Xen 4.x > > The first three patches are small bugfixes which are not directly related to > the kexec problems, but are on the codepath. > > The next four patches are directly related to fixing the kexec path. > > These patches cause xen to track the BSP local APIC boot state and return to > it before kexec''ing to a new kernel. This prevents the kdump kernel falling > over itself when booting in x2apic mode while expecting to be in xapic mode. > > It also makes sure to disable IO virtualisation, along with fixing the current > codepath related to disabling Interrup Remapping on Intel VTD boxes.Overall looking much better! I will take a look at the patches individually, and I expect Jan will be looking too. I think we may have a largely check-in-able series here. :-) -- Keir> > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2011-Jun-14 08:44 UTC
Re: [Xen-devel] [PATCH 1 of 7] APIC BUG: fix potential Protection Fault during shutdown
>>> On 13.06.11 at 19:02, Andrew Cooper <andrew.cooper3@citrix.com> wrote: > This is a rare case, but if the BIOS is set to uniprocessor, and Xen > is booted with ''lapic x2apic'', Xen will switch into x2apic mode, which > will cause a protection fault when disabling the local APIC. This > leads to a general protection fault as this code is also in the fault > handler. > > When x2apic mode is enabled, the only tranlsation which does > not result in a protection fault is to clear both the EN and EXTD > bits, which is safe to do in all cases, even if you are in xapic > mode rather than x2apic mode. > > The linux code from which this is derrived is protected by an > if ( ! x2apic_mode ...) clause which is how they get away with it. > > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>Acked-by: Jan Beulich <jbeulich@novell.com> You may want to submit a similar patch to Linux (which is what this code got derived from), so that in the future no-one will get surprised that this is different in Xen and Linux. Otoh, interestingly this is being done only for x86-32 in Linux, and I highly doubt any X2APIC capable machine would boot with APIC disabled. Jan> diff -r 37c77bacb52a -r 076c3034c8c7 xen/arch/x86/apic.c > --- a/xen/arch/x86/apic.c Mon May 23 17:38:28 2011 +0100 > +++ b/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100 > @@ -340,7 +340,8 @@ void disable_local_APIC(void) > if (enabled_via_apicbase) { > uint64_t msr_content; > rdmsrl(MSR_IA32_APICBASE, msr_content); > - wrmsrl(MSR_IA32_APICBASE, msr_content & ~MSR_IA32_APICBASE_ENABLE); > + wrmsrl(MSR_IA32_APICBASE, msr_content & > + ~(MSR_IA32_APICBASE_ENABLE|MSR_IA32_APICBASE_EXTD)); > } > } > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2011-Jun-14 08:46 UTC
Re: [Xen-devel] [PATCH 2 of 7] KEXEC BUG: nmi_shootdown_cpus doesn''t look after the interrupt flag
>>> On 13.06.11 at 19:02, Andrew Cooper <andrew.cooper3@citrix.com> wrote: > nmi_shootdown_cpus is part of the kexec path, coming from a panic, and > as such can be called both with interrupts enabled or disabled. We > really dont want to accidentally set IF.Can interrupts really be enabled when entering this function?> Therefore, use save/restore in preference to disable/enable.I.e. wouldn''t just removing the stray local_irq_enable() suffice? Jan> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > > diff -r 076c3034c8c7 -r 1c3d2e4d06fe xen/arch/x86/crash.c > --- a/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100 > +++ b/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100 > @@ -55,9 +55,9 @@ static int crash_nmi_callback(struct cpu > > static void nmi_shootdown_cpus(void) > { > - unsigned long msecs; > + unsigned long msecs, flags; > > - local_irq_disable(); > + local_irq_save(flags); > > crashing_cpu = smp_processor_id(); > local_irq_count(crashing_cpu) = 0; > @@ -80,7 +80,7 @@ static void nmi_shootdown_cpus(void) > __stop_this_cpu(); > disable_IO_APIC(); > > - local_irq_enable(); > + local_irq_restore(flags); > } > > void machine_crash_shutdown(void) > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2011-Jun-14 08:57 UTC
Re: [Xen-devel] [PATCH 4 of 7] APIC: record local APIC state on boot
>>> On 13.06.11 at 19:02, Andrew Cooper <andrew.cooper3@citrix.com> wrote: > Xen does not store the boot local APIC state which leads to problems > when shutting down for a kexec jump. This patch records the boot > state so we can return to the boot state when kexecing. > > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>With a couple of minor adjustments (see inline comments), Acked-by: Jan Beulich <jbeulich@novell.com>> diff -r 68efd418b6f1 -r 615db56671fa xen/arch/x86/apic.c > --- a/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100 > +++ b/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100 > @@ -74,6 +74,8 @@ u8 __read_mostly apic_verbosity; > static bool_t __initdata opt_x2apic = 1; > boolean_param("x2apic", opt_x2apic); > > +enum apic_mode apic_boot_mode = APIC_MODE_INVALID;static ... __read_mostly ...> + > bool_t __read_mostly x2apic_enabled = 0; > bool_t __read_mostly directed_eoi_enabled = 0; > > @@ -1438,6 +1440,62 @@ int __init APIC_init_uniprocessor (void) > return 0; > } > > +/* Needs to be called during startup. It records the state the BIOS > + * leaves the local APIC so we can undo upon kexec. > + */ > +void __init record_boot_APIC_mode(void) > +{ > + /* Sanity check - we should only ever run once, but could possibly > + * be called several times */ > + if ( APIC_MODE_INVALID != apic_boot_mode ) > + return; > + > + apic_boot_mode = current_local_apic_mode(); > + > + apic_printk(APIC_DEBUG, "APIC boot state is ''%s''\n", > + apic_mode_to_str(apic_boot_mode)); > +} > + > +/* Look at the bits in MSR_IA32_APICBASE and work out which > + * APIC mode we are in */ > +enum apic_mode current_local_apic_mode(void)Logically (within this patch) this function should also be static, but I see it is being referenced by a later patch. Read on there.> +{ > + u64 msr_contents; > + > + rdmsrl(MSR_IA32_APICBASE, msr_contents); > + > + /* Reading EXTD bit from the MSR is only valid if CPUID > + * says so, else reserved */ > + if ( cpu_has(¤t_cpu_data, X86_FEATURE_X2APIC) > + && (msr_contents & MSR_IA32_APICBASE_EXTD) ) > + return APIC_MODE_X2APIC; > + > + /* EN bit should always be valid as long as we can read the MSR > + */ > + if ( msr_contents & MSR_IA32_APICBASE_ENABLE ) > + return APIC_MODE_XAPIC; > + > + return APIC_MODE_DISABLED; > +} > + > + > +const char * apic_mode_to_str(const enum apic_mode mode)static ... __init ... Jan> +{ > + switch ( mode ) > + { > + case APIC_MODE_INVALID: > + return "invalid"; > + case APIC_MODE_DISABLED: > + return "disabled"; > + case APIC_MODE_XAPIC: > + return "xapic"; > + case APIC_MODE_X2APIC: > + return "x2apic"; > + default: > + return "unrecognised"; > + } > +} > + > void check_for_unexpected_msi(unsigned int vector) > { > unsigned long v = apic_read(APIC_ISR + ((vector & ~0x1f) >> 1)); > diff -r 68efd418b6f1 -r 615db56671fa xen/arch/x86/genapic/probe.c > --- a/xen/arch/x86/genapic/probe.c Mon Jun 13 17:45:43 2011 +0100 > +++ b/xen/arch/x86/genapic/probe.c Mon Jun 13 17:45:43 2011 +0100 > @@ -60,6 +60,8 @@ void __init generic_apic_probe(void) > { > int i, changed; > > + record_boot_APIC_mode(); > + > check_x2apic_preenabled(); > cmdline_apic = changed = (genapic != NULL); > > diff -r 68efd418b6f1 -r 615db56671fa xen/include/asm-x86/apic.h > --- a/xen/include/asm-x86/apic.h Mon Jun 13 17:45:43 2011 +0100 > +++ b/xen/include/asm-x86/apic.h Mon Jun 13 17:45:43 2011 +0100 > @@ -21,6 +21,23 @@ > #define IO_APIC_REDIR_DEST_LOGICAL 0x00800 > #define IO_APIC_REDIR_DEST_PHYSICAL 0x00000 > > +/* Possible APIC states */ > +enum apic_mode { APIC_MODE_INVALID, /* Not set yet */ > + APIC_MODE_DISABLED, /* If uniprocessor, or smp in > + * uniprocessor mode */ > + APIC_MODE_XAPIC, /* xAPIC mode - default upon > + * chipset reset */ > + APIC_MODE_X2APIC /* x2APIC mode - common for large > + * smp machines */ > +}; > + > +/* Bootstrap processor local APIC boot mode - so we can undo our changes > + * to the APIC state */ > +extern enum apic_mode apic_boot_mode; > + > +/* enum apic_mode -> str function for logging/debug */ > +const char * apic_mode_to_str(const enum apic_mode); > + > extern u8 apic_verbosity; > extern bool_t x2apic_enabled; > extern bool_t directed_eoi_enabled; > @@ -203,6 +220,8 @@ extern void disable_APIC_timer(void); > extern void enable_APIC_timer(void); > extern int lapic_suspend(void); > extern int lapic_resume(void); > +extern void record_boot_APIC_mode(void); > +extern enum apic_mode current_local_apic_mode(void); > > extern int check_nmi_watchdog (void); > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2011-Jun-14 09:02 UTC
Re: [Xen-devel] [PATCH 5 of 7] IOMMU VTD BUG: disable Extended Interrupt Mode when disabling Interupt Remapping
>>> On 13.06.11 at 19:02, Andrew Cooper <andrew.cooper3@citrix.com> wrote: > Experimental evidence shows that Extended Interrupt Mode remains in > effect even after Interrupt Remapping is disabled in each DMAR Global > Command Register. A consiquence of this is that when we switch from > x2apic mode back to xapic mode, and disable interrupt remapping for > the kdump kernel, interrupts passing through the IO APICs are in > x2apic format as opposed xapic. This causes a triple fault in the > kexec kernel. > > As EIM is explicitly set up each time Interrup Remapping is enabled, > it is safe for us to clobber this when taring down. > > Also, change the header definition of IRTA_REG_EIME_SHIFT. It caused > verbose and error-prone code, and was only used in 1 place before. We > now have IRTA_EIME which is the specific bit in the register. > > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > > diff -r 615db56671fa -r 2635774346c4 xen/drivers/passthrough/vtd/intremap.c > --- a/xen/drivers/passthrough/vtd/intremap.c Mon Jun 13 17:45:43 2011 +0100 > +++ b/xen/drivers/passthrough/vtd/intremap.c Mon Jun 13 17:45:43 2011 +0100 > @@ -776,8 +776,7 @@ int enable_intremap(struct iommu *iommu, > > #ifdef CONFIG_X86 > /* set extended interrupt mode bit */ > - ir_ctrl->iremap_maddr |> - eim ? (1 << IRTA_REG_EIME_SHIFT) : 0; > + ir_ctrl->iremap_maddr |= eim ? IRTA_EIME : 0; > #endif > spin_lock_irqsave(&iommu->register_lock, flags); > > @@ -817,6 +816,22 @@ void disable_intremap(struct iommu *iomm > if ( !ecap_intr_remap(iommu->ecap) ) > return; > > + /* If we are disabling Interrupt Remapping, make sure we dont stay in > + * Extended Interrupt Mode, as this is unaffected by the Interrupt > + * Remapping flag in each DMAR Global Control Register. > + * Specifically, local apics in xapic mode do not like interrupts > delivered > + * in x2apic mode. Any code turning interrupt remapping back on will > set > + * EIME back correctly. > + */ > + if ( iommu_supports_eim() ) > + { > + u64 irta; > + irta = dmar_readl(iommu->reg, DMAR_IRTA_REG); > + dmar_writel(iommu->reg, DMAR_IRTA_REG, irta & ~IRTA_EIME); > + IOMMU_WAIT_OP(iommu, DMAR_IRTA_REG, dmar_readl, > + !(irta & IRTA_EIME), irta);Hmm, this of course can be problematic in the context of a crash: I realize that you want it cleared, but does system state guarantee that this WAIT_OP will always be able to complete? You''d get a hung system instead if it won''t. Admittedly it''s not clear which one is better, but a best effort approach would probably skip the wait loop. Let''s see what our friends at Intel think... Jan> + } > + > spin_lock_irqsave(&iommu->register_lock, flags); > sts = dmar_readl(iommu->reg, DMAR_GSTS_REG); > if ( !(sts & DMA_GSTS_IRES) ) > diff -r 615db56671fa -r 2635774346c4 xen/drivers/passthrough/vtd/iommu.h > --- a/xen/drivers/passthrough/vtd/iommu.h Mon Jun 13 17:45:43 2011 +0100 > +++ b/xen/drivers/passthrough/vtd/iommu.h Mon Jun 13 17:45:43 2011 +0100 > @@ -471,7 +471,7 @@ struct qinval_entry { > > #define IEC_GLOBAL_INVL 0 > #define IEC_INDEX_INVL 1 > -#define IRTA_REG_EIME_SHIFT 11 > +#define IRTA_EIME (1 << 11) > > /* 2^(IRTA_REG_TABLE_SIZE + 1) = IREMAP_ENTRY_NR */ > #define IRTA_REG_TABLE_SIZE ( IREMAP_PAGE_ORDER + 7 ) > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-Jun-14 09:44 UTC
Re: [Xen-devel] [PATCH 1 of 7] APIC BUG: fix potential Protection Fault during shutdown
On 14/06/11 09:44, Jan Beulich wrote:>>>> On 13.06.11 at 19:02, Andrew Cooper <andrew.cooper3@citrix.com> wrote: >> This is a rare case, but if the BIOS is set to uniprocessor, and Xen >> is booted with ''lapic x2apic'', Xen will switch into x2apic mode, which >> will cause a protection fault when disabling the local APIC. This >> leads to a general protection fault as this code is also in the fault >> handler. >> >> When x2apic mode is enabled, the only tranlsation which does >> not result in a protection fault is to clear both the EN and EXTD >> bits, which is safe to do in all cases, even if you are in xapic >> mode rather than x2apic mode. >> >> The linux code from which this is derrived is protected by an >> if ( ! x2apic_mode ...) clause which is how they get away with it. >> >> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > Acked-by: Jan Beulich <jbeulich@novell.com> > > You may want to submit a similar patch to Linux (which is what > this code got derived from), so that in the future no-one will get > surprised that this is different in Xen and Linux. > > Otoh, interestingly this is being done only for x86-32 in Linux, and > I highly doubt any X2APIC capable machine would boot with APIC > disabled. > > Jan >As I said, it is an edge case and shouldn''t occur under any normal circumstances, but given the nature of the fix, we might as well help the odd setups. I considered upstreaming it to Linux but I doubt It will be taken because there is no way to force their code to have a protection fault. ~Andrew>> diff -r 37c77bacb52a -r 076c3034c8c7 xen/arch/x86/apic.c >> --- a/xen/arch/x86/apic.c Mon May 23 17:38:28 2011 +0100 >> +++ b/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100 >> @@ -340,7 +340,8 @@ void disable_local_APIC(void) >> if (enabled_via_apicbase) { >> uint64_t msr_content; >> rdmsrl(MSR_IA32_APICBASE, msr_content); >> - wrmsrl(MSR_IA32_APICBASE, msr_content & ~MSR_IA32_APICBASE_ENABLE); >> + wrmsrl(MSR_IA32_APICBASE, msr_content & >> + ~(MSR_IA32_APICBASE_ENABLE|MSR_IA32_APICBASE_EXTD)); >> } >> } >> >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xensource.com >> http://lists.xensource.com/xen-devel > >-- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2011-Jun-14 09:46 UTC
Re: [Xen-devel] [PATCH 2 of 7] KEXEC BUG: nmi_shootdown_cpus doesn''t look after the interrupt flag
On 14/06/2011 09:46, "Jan Beulich" <JBeulich@novell.com> wrote:>>>> On 13.06.11 at 19:02, Andrew Cooper <andrew.cooper3@citrix.com> wrote: >> nmi_shootdown_cpus is part of the kexec path, coming from a panic, and >> as such can be called both with interrupts enabled or disabled. We >> really dont want to accidentally set IF. > > Can interrupts really be enabled when entering this function? > >> Therefore, use save/restore in preference to disable/enable. > > I.e. wouldn''t just removing the stray local_irq_enable() suffice?I''d say that the caller should local_irq_disable(). The function itself should at most BUG_ON(local_irq_is_enabled()). -- Keir> Jan > >> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> >> >> diff -r 076c3034c8c7 -r 1c3d2e4d06fe xen/arch/x86/crash.c >> --- a/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100 >> +++ b/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100 >> @@ -55,9 +55,9 @@ static int crash_nmi_callback(struct cpu >> >> static void nmi_shootdown_cpus(void) >> { >> - unsigned long msecs; >> + unsigned long msecs, flags; >> >> - local_irq_disable(); >> + local_irq_save(flags); >> >> crashing_cpu = smp_processor_id(); >> local_irq_count(crashing_cpu) = 0; >> @@ -80,7 +80,7 @@ static void nmi_shootdown_cpus(void) >> __stop_this_cpu(); >> disable_IO_APIC(); >> >> - local_irq_enable(); >> + local_irq_restore(flags); >> } >> >> void machine_crash_shutdown(void) >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xensource.com >> http://lists.xensource.com/xen-devel > > > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-Jun-14 09:51 UTC
Re: [Xen-devel] [PATCH 2 of 7] KEXEC BUG: nmi_shootdown_cpus doesn''t look after the interrupt flag
On 14/06/11 09:46, Jan Beulich wrote:>>>> On 13.06.11 at 19:02, Andrew Cooper <andrew.cooper3@citrix.com> wrote: >> nmi_shootdown_cpus is part of the kexec path, coming from a panic, and >> as such can be called both with interrupts enabled or disabled. We >> really dont want to accidentally set IF. > Can interrupts really be enabled when entering this function?It is unlikely, but I believe there are a few codepaths which could enter this function with interrupts enabled. Either way, the last thing we need is the usual occurrence of leaving this function with interrupts enabled>> Therefore, use save/restore in preference to disable/enable. > I.e. wouldn''t just removing the stray local_irq_enable() suffice? > > JanI was trying to go for minimal changes to the Linux code. I would agree that a single disable right at the top of the crash path might be a better solution. ~Andrew>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> >> >> diff -r 076c3034c8c7 -r 1c3d2e4d06fe xen/arch/x86/crash.c >> --- a/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100 >> +++ b/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100 >> @@ -55,9 +55,9 @@ static int crash_nmi_callback(struct cpu >> >> static void nmi_shootdown_cpus(void) >> { >> - unsigned long msecs; >> + unsigned long msecs, flags; >> >> - local_irq_disable(); >> + local_irq_save(flags); >> >> crashing_cpu = smp_processor_id(); >> local_irq_count(crashing_cpu) = 0; >> @@ -80,7 +80,7 @@ static void nmi_shootdown_cpus(void) >> __stop_this_cpu(); >> disable_IO_APIC(); >> >> - local_irq_enable(); >> + local_irq_restore(flags); >> } >> >> void machine_crash_shutdown(void) >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xensource.com >> http://lists.xensource.com/xen-devel > >-- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-Jun-14 09:53 UTC
Re: [Xen-devel] [PATCH 3 of 7] IOMMU: Sanitise pointer work
On 13/06/11 19:13, Keir Fraser wrote:> On 13/06/2011 18:02, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote: > >> Check for null pointers before calling function pointers. > But they''re never NULL? This one''s a bit pointless. > > -- Keir >This patch was a direct result of comments regarding my previous attempt to make a crash_shutdown iommu_op. Are we happy to assume that noone in the future is going to partially implement an iommu_ops structure? ~Andrew>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> >> >> diff -r 1c3d2e4d06fe -r 68efd418b6f1 xen/drivers/passthrough/iommu.c >> --- a/xen/drivers/passthrough/iommu.c Mon Jun 13 17:45:43 2011 +0100 >> +++ b/xen/drivers/passthrough/iommu.c Mon Jun 13 17:45:43 2011 +0100 >> @@ -407,17 +407,17 @@ unsigned int iommu_read_apic_from_ire(un >> return ops->read_apic_from_ire(apic, reg); >> } >> >> -void iommu_resume() >> +void iommu_resume(void) >> { >> const struct iommu_ops *ops = iommu_get_ops(); >> - if ( iommu_enabled ) >> + if ( iommu_enabled && ops && ops->resume ) >> ops->resume(); >> } >> >> -void iommu_suspend() >> +void iommu_suspend(void) >> { >> const struct iommu_ops *ops = iommu_get_ops(); >> - if ( iommu_enabled ) >> + if ( iommu_enabled && ops && ops->suspend ) >> ops->suspend(); >> } >> >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xensource.com >> http://lists.xensource.com/xen-devel >-- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-Jun-14 09:59 UTC
Re: [Xen-devel] [PATCH 5 of 7] IOMMU VTD BUG: disable Extended Interrupt Mode when disabling Interupt Remapping
On 14/06/11 10:02, Jan Beulich wrote:>>>> On 13.06.11 at 19:02, Andrew Cooper <andrew.cooper3@citrix.com> wrote: >> Experimental evidence shows that Extended Interrupt Mode remains in >> effect even after Interrupt Remapping is disabled in each DMAR Global >> Command Register. A consiquence of this is that when we switch from >> x2apic mode back to xapic mode, and disable interrupt remapping for >> the kdump kernel, interrupts passing through the IO APICs are in >> x2apic format as opposed xapic. This causes a triple fault in the >> kexec kernel. >> >> As EIM is explicitly set up each time Interrup Remapping is enabled, >> it is safe for us to clobber this when taring down. >> >> Also, change the header definition of IRTA_REG_EIME_SHIFT. It caused >> verbose and error-prone code, and was only used in 1 place before. We >> now have IRTA_EIME which is the specific bit in the register. >> >> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> >> >> diff -r 615db56671fa -r 2635774346c4 xen/drivers/passthrough/vtd/intremap.c >> --- a/xen/drivers/passthrough/vtd/intremap.c Mon Jun 13 17:45:43 2011 +0100 >> +++ b/xen/drivers/passthrough/vtd/intremap.c Mon Jun 13 17:45:43 2011 +0100 >> @@ -776,8 +776,7 @@ int enable_intremap(struct iommu *iommu, >> >> #ifdef CONFIG_X86 >> /* set extended interrupt mode bit */ >> - ir_ctrl->iremap_maddr |>> - eim ? (1 << IRTA_REG_EIME_SHIFT) : 0; >> + ir_ctrl->iremap_maddr |= eim ? IRTA_EIME : 0; >> #endif >> spin_lock_irqsave(&iommu->register_lock, flags); >> >> @@ -817,6 +816,22 @@ void disable_intremap(struct iommu *iomm >> if ( !ecap_intr_remap(iommu->ecap) ) >> return; >> >> + /* If we are disabling Interrupt Remapping, make sure we dont stay in >> + * Extended Interrupt Mode, as this is unaffected by the Interrupt >> + * Remapping flag in each DMAR Global Control Register. >> + * Specifically, local apics in xapic mode do not like interrupts >> delivered >> + * in x2apic mode. Any code turning interrupt remapping back on will >> set >> + * EIME back correctly. >> + */ >> + if ( iommu_supports_eim() ) >> + { >> + u64 irta; >> + irta = dmar_readl(iommu->reg, DMAR_IRTA_REG); >> + dmar_writel(iommu->reg, DMAR_IRTA_REG, irta & ~IRTA_EIME); >> + IOMMU_WAIT_OP(iommu, DMAR_IRTA_REG, dmar_readl, >> + !(irta & IRTA_EIME), irta); > Hmm, this of course can be problematic in the context of a crash: > I realize that you want it cleared, but does system state guarantee > that this WAIT_OP will always be able to complete? You''d get a > hung system instead if it won''t. Admittedly it''s not clear which one > is better, but a best effort approach would probably skip the wait > loop. Let''s see what our friends at Intel think... > > Jan >From what I can see, IOMMU_WAIT_OP just polls memory mapped registersuntil they change. As we are doing multiple dmar_read/writes and the spec warns that we must wait. I cant see anything inherently dangerous doing this in the crash context, because if the wait times out, then the chances are that the hardware is broken, at which point you have bigger problems that a kexec not happening. ~Andrew>> + } >> + >> spin_lock_irqsave(&iommu->register_lock, flags); >> sts = dmar_readl(iommu->reg, DMAR_GSTS_REG); >> if ( !(sts & DMA_GSTS_IRES) ) >> diff -r 615db56671fa -r 2635774346c4 xen/drivers/passthrough/vtd/iommu.h >> --- a/xen/drivers/passthrough/vtd/iommu.h Mon Jun 13 17:45:43 2011 +0100 >> +++ b/xen/drivers/passthrough/vtd/iommu.h Mon Jun 13 17:45:43 2011 +0100 >> @@ -471,7 +471,7 @@ struct qinval_entry { >> >> #define IEC_GLOBAL_INVL 0 >> #define IEC_INDEX_INVL 1 >> -#define IRTA_REG_EIME_SHIFT 11 >> +#define IRTA_EIME (1 << 11) >> >> /* 2^(IRTA_REG_TABLE_SIZE + 1) = IREMAP_ENTRY_NR */ >> #define IRTA_REG_TABLE_SIZE ( IREMAP_PAGE_ORDER + 7 ) >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xensource.com >> http://lists.xensource.com/xen-devel > >-- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2011-Jun-14 10:48 UTC
Re: [Xen-devel] [PATCH 4 of 7] APIC: record local APIC state on boot
On Tue, 2011-06-14 at 09:57 +0100, Jan Beulich wrote:> >>> On 13.06.11 at 19:02, Andrew Cooper <andrew.cooper3@citrix.com> wrote: > > Xen does not store the boot local APIC state which leads to problems > > when shutting down for a kexec jump. This patch records the boot > > state so we can return to the boot state when kexecing. > > > > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > > With a couple of minor adjustments (see inline comments), > > Acked-by: Jan Beulich <jbeulich@novell.com> > > > diff -r 68efd418b6f1 -r 615db56671fa xen/arch/x86/apic.c > > --- a/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100 > > +++ b/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100 > > @@ -74,6 +74,8 @@ u8 __read_mostly apic_verbosity; > > static bool_t __initdata opt_x2apic = 1; > > boolean_param("x2apic", opt_x2apic); > > > > +enum apic_mode apic_boot_mode = APIC_MODE_INVALID; > > static ... __read_mostly ...I think I suggested that Andrew remove the read mostly annotiation since this variable isn''t really read mostly -- more of an access almost never variable (it''s written on boot and only read on the kexec path). Does optimising for this variable really help anything? I suspect (without evidence to backup my feeling) that over using this annotation will actually push actual frequently read __read_mostly variables into taking up more cache lines and actually hurt... Ian.> > + > > bool_t __read_mostly x2apic_enabled = 0; > > bool_t __read_mostly directed_eoi_enabled = 0; > > > > @@ -1438,6 +1440,62 @@ int __init APIC_init_uniprocessor (void) > > return 0; > > } > > > > +/* Needs to be called during startup. It records the state the BIOS > > + * leaves the local APIC so we can undo upon kexec. > > + */ > > +void __init record_boot_APIC_mode(void) > > +{ > > + /* Sanity check - we should only ever run once, but could possibly > > + * be called several times */ > > + if ( APIC_MODE_INVALID != apic_boot_mode ) > > + return; > > + > > + apic_boot_mode = current_local_apic_mode(); > > + > > + apic_printk(APIC_DEBUG, "APIC boot state is ''%s''\n", > > + apic_mode_to_str(apic_boot_mode)); > > +} > > + > > +/* Look at the bits in MSR_IA32_APICBASE and work out which > > + * APIC mode we are in */ > > +enum apic_mode current_local_apic_mode(void) > > Logically (within this patch) this function should also be static, but > I see it is being referenced by a later patch. Read on there. > > > +{ > > + u64 msr_contents; > > + > > + rdmsrl(MSR_IA32_APICBASE, msr_contents); > > + > > + /* Reading EXTD bit from the MSR is only valid if CPUID > > + * says so, else reserved */ > > + if ( cpu_has(¤t_cpu_data, X86_FEATURE_X2APIC) > > + && (msr_contents & MSR_IA32_APICBASE_EXTD) ) > > + return APIC_MODE_X2APIC; > > + > > + /* EN bit should always be valid as long as we can read the MSR > > + */ > > + if ( msr_contents & MSR_IA32_APICBASE_ENABLE ) > > + return APIC_MODE_XAPIC; > > + > > + return APIC_MODE_DISABLED; > > +} > > + > > + > > +const char * apic_mode_to_str(const enum apic_mode mode) > > static ... __init ... > > Jan > > > +{ > > + switch ( mode ) > > + { > > + case APIC_MODE_INVALID: > > + return "invalid"; > > + case APIC_MODE_DISABLED: > > + return "disabled"; > > + case APIC_MODE_XAPIC: > > + return "xapic"; > > + case APIC_MODE_X2APIC: > > + return "x2apic"; > > + default: > > + return "unrecognised"; > > + } > > +} > > + > > void check_for_unexpected_msi(unsigned int vector) > > { > > unsigned long v = apic_read(APIC_ISR + ((vector & ~0x1f) >> 1)); > > diff -r 68efd418b6f1 -r 615db56671fa xen/arch/x86/genapic/probe.c > > --- a/xen/arch/x86/genapic/probe.c Mon Jun 13 17:45:43 2011 +0100 > > +++ b/xen/arch/x86/genapic/probe.c Mon Jun 13 17:45:43 2011 +0100 > > @@ -60,6 +60,8 @@ void __init generic_apic_probe(void) > > { > > int i, changed; > > > > + record_boot_APIC_mode(); > > + > > check_x2apic_preenabled(); > > cmdline_apic = changed = (genapic != NULL); > > > > diff -r 68efd418b6f1 -r 615db56671fa xen/include/asm-x86/apic.h > > --- a/xen/include/asm-x86/apic.h Mon Jun 13 17:45:43 2011 +0100 > > +++ b/xen/include/asm-x86/apic.h Mon Jun 13 17:45:43 2011 +0100 > > @@ -21,6 +21,23 @@ > > #define IO_APIC_REDIR_DEST_LOGICAL 0x00800 > > #define IO_APIC_REDIR_DEST_PHYSICAL 0x00000 > > > > +/* Possible APIC states */ > > +enum apic_mode { APIC_MODE_INVALID, /* Not set yet */ > > + APIC_MODE_DISABLED, /* If uniprocessor, or smp in > > + * uniprocessor mode */ > > + APIC_MODE_XAPIC, /* xAPIC mode - default upon > > + * chipset reset */ > > + APIC_MODE_X2APIC /* x2APIC mode - common for large > > + * smp machines */ > > +}; > > + > > +/* Bootstrap processor local APIC boot mode - so we can undo our changes > > + * to the APIC state */ > > +extern enum apic_mode apic_boot_mode; > > + > > +/* enum apic_mode -> str function for logging/debug */ > > +const char * apic_mode_to_str(const enum apic_mode); > > + > > extern u8 apic_verbosity; > > extern bool_t x2apic_enabled; > > extern bool_t directed_eoi_enabled; > > @@ -203,6 +220,8 @@ extern void disable_APIC_timer(void); > > extern void enable_APIC_timer(void); > > extern int lapic_suspend(void); > > extern int lapic_resume(void); > > +extern void record_boot_APIC_mode(void); > > +extern enum apic_mode current_local_apic_mode(void); > > > > extern int check_nmi_watchdog (void); > > > > > > _______________________________________________ > > Xen-devel mailing list > > Xen-devel@lists.xensource.com > > http://lists.xensource.com/xen-devel > > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2011-Jun-14 11:21 UTC
Re: [Xen-devel] [PATCH 4 of 7] APIC: record local APIC state on boot
>>> On 14.06.11 at 12:48, Ian Campbell <Ian.Campbell@citrix.com> wrote: > On Tue, 2011-06-14 at 09:57 +0100, Jan Beulich wrote: >> >>> On 13.06.11 at 19:02, Andrew Cooper <andrew.cooper3@citrix.com> wrote: >> > Xen does not store the boot local APIC state which leads to problems >> > when shutting down for a kexec jump. This patch records the boot >> > state so we can return to the boot state when kexecing. >> > >> > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> >> >> With a couple of minor adjustments (see inline comments), >> >> Acked-by: Jan Beulich <jbeulich@novell.com> >> >> > diff -r 68efd418b6f1 -r 615db56671fa xen/arch/x86/apic.c >> > --- a/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100 >> > +++ b/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100 >> > @@ -74,6 +74,8 @@ u8 __read_mostly apic_verbosity; >> > static bool_t __initdata opt_x2apic = 1; >> > boolean_param("x2apic", opt_x2apic); >> > >> > +enum apic_mode apic_boot_mode = APIC_MODE_INVALID; >> >> static ... __read_mostly ... > > I think I suggested that Andrew remove the read mostly annotiation since > this variable isn''t really read mostly -- more of an access almost never > variable (it''s written on boot and only read on the kexec path).Indeed, the "static" part is the more important one.> Does optimising for this variable really help anything? I suspect > (without evidence to backup my feeling) that over using this annotation > will actually push actual frequently read __read_mostly variables into > taking up more cache lines and actually hurt...Agreed. Jan> Ian. > >> > + >> > bool_t __read_mostly x2apic_enabled = 0; >> > bool_t __read_mostly directed_eoi_enabled = 0; >> > >> > @@ -1438,6 +1440,62 @@ int __init APIC_init_uniprocessor (void) >> > return 0; >> > } >> > >> > +/* Needs to be called during startup. It records the state the BIOS >> > + * leaves the local APIC so we can undo upon kexec. >> > + */ >> > +void __init record_boot_APIC_mode(void) >> > +{ >> > + /* Sanity check - we should only ever run once, but could possibly >> > + * be called several times */ >> > + if ( APIC_MODE_INVALID != apic_boot_mode ) >> > + return; >> > + >> > + apic_boot_mode = current_local_apic_mode(); >> > + >> > + apic_printk(APIC_DEBUG, "APIC boot state is ''%s''\n", >> > + apic_mode_to_str(apic_boot_mode)); >> > +} >> > + >> > +/* Look at the bits in MSR_IA32_APICBASE and work out which >> > + * APIC mode we are in */ >> > +enum apic_mode current_local_apic_mode(void) >> >> Logically (within this patch) this function should also be static, but >> I see it is being referenced by a later patch. Read on there. >> >> > +{ >> > + u64 msr_contents; >> > + >> > + rdmsrl(MSR_IA32_APICBASE, msr_contents); >> > + >> > + /* Reading EXTD bit from the MSR is only valid if CPUID >> > + * says so, else reserved */ >> > + if ( cpu_has(¤t_cpu_data, X86_FEATURE_X2APIC) >> > + && (msr_contents & MSR_IA32_APICBASE_EXTD) ) >> > + return APIC_MODE_X2APIC; >> > + >> > + /* EN bit should always be valid as long as we can read the MSR >> > + */ >> > + if ( msr_contents & MSR_IA32_APICBASE_ENABLE ) >> > + return APIC_MODE_XAPIC; >> > + >> > + return APIC_MODE_DISABLED; >> > +} >> > + >> > + >> > +const char * apic_mode_to_str(const enum apic_mode mode) >> >> static ... __init ... >> >> Jan >> >> > +{ >> > + switch ( mode ) >> > + { >> > + case APIC_MODE_INVALID: >> > + return "invalid"; >> > + case APIC_MODE_DISABLED: >> > + return "disabled"; >> > + case APIC_MODE_XAPIC: >> > + return "xapic"; >> > + case APIC_MODE_X2APIC: >> > + return "x2apic"; >> > + default: >> > + return "unrecognised"; >> > + } >> > +} >> > + >> > void check_for_unexpected_msi(unsigned int vector) >> > { >> > unsigned long v = apic_read(APIC_ISR + ((vector & ~0x1f) >> 1)); >> > diff -r 68efd418b6f1 -r 615db56671fa xen/arch/x86/genapic/probe.c >> > --- a/xen/arch/x86/genapic/probe.c Mon Jun 13 17:45:43 2011 +0100 >> > +++ b/xen/arch/x86/genapic/probe.c Mon Jun 13 17:45:43 2011 +0100 >> > @@ -60,6 +60,8 @@ void __init generic_apic_probe(void) >> > { >> > int i, changed; >> > >> > + record_boot_APIC_mode(); >> > + >> > check_x2apic_preenabled(); >> > cmdline_apic = changed = (genapic != NULL); >> > >> > diff -r 68efd418b6f1 -r 615db56671fa xen/include/asm-x86/apic.h >> > --- a/xen/include/asm-x86/apic.h Mon Jun 13 17:45:43 2011 +0100 >> > +++ b/xen/include/asm-x86/apic.h Mon Jun 13 17:45:43 2011 +0100 >> > @@ -21,6 +21,23 @@ >> > #define IO_APIC_REDIR_DEST_LOGICAL 0x00800 >> > #define IO_APIC_REDIR_DEST_PHYSICAL 0x00000 >> > >> > +/* Possible APIC states */ >> > +enum apic_mode { APIC_MODE_INVALID, /* Not set yet */ >> > + APIC_MODE_DISABLED, /* If uniprocessor, or smp in >> > + * uniprocessor mode */ >> > + APIC_MODE_XAPIC, /* xAPIC mode - default upon >> > + * chipset reset */ >> > + APIC_MODE_X2APIC /* x2APIC mode - common for large >> > + * smp machines */ >> > +}; >> > + >> > +/* Bootstrap processor local APIC boot mode - so we can undo our changes >> > + * to the APIC state */ >> > +extern enum apic_mode apic_boot_mode; >> > + >> > +/* enum apic_mode -> str function for logging/debug */ >> > +const char * apic_mode_to_str(const enum apic_mode); >> > + >> > extern u8 apic_verbosity; >> > extern bool_t x2apic_enabled; >> > extern bool_t directed_eoi_enabled; >> > @@ -203,6 +220,8 @@ extern void disable_APIC_timer(void); >> > extern void enable_APIC_timer(void); >> > extern int lapic_suspend(void); >> > extern int lapic_resume(void); >> > +extern void record_boot_APIC_mode(void); >> > +extern enum apic_mode current_local_apic_mode(void); >> > >> > extern int check_nmi_watchdog (void); >> > >> > >> > _______________________________________________ >> > Xen-devel mailing list >> > Xen-devel@lists.xensource.com >> > http://lists.xensource.com/xen-devel >> >> >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xensource.com >> http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2011-Jun-14 11:51 UTC
Re: [Xen-devel] [PATCH 3 of 7] IOMMU: Sanitise pointer work
On 14/06/2011 10:53, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:> On 13/06/11 19:13, Keir Fraser wrote: >> On 13/06/2011 18:02, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote: >> >>> Check for null pointers before calling function pointers. >> But they''re never NULL? This one''s a bit pointless. >> >> -- Keir >> > This patch was a direct result of comments regarding my previous attempt > to make a crash_shutdown iommu_op. Are we happy to assume that noone in > the future is going to partially implement an iommu_ops structure?We''ll cross that bridge if we come to it. I can''t see anyone implementing iommu_ops that don''t need a call back on suspend and resume. -- Keir> ~Andrew >>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> >>> >>> diff -r 1c3d2e4d06fe -r 68efd418b6f1 xen/drivers/passthrough/iommu.c >>> --- a/xen/drivers/passthrough/iommu.c Mon Jun 13 17:45:43 2011 +0100 >>> +++ b/xen/drivers/passthrough/iommu.c Mon Jun 13 17:45:43 2011 +0100 >>> @@ -407,17 +407,17 @@ unsigned int iommu_read_apic_from_ire(un >>> return ops->read_apic_from_ire(apic, reg); >>> } >>> >>> -void iommu_resume() >>> +void iommu_resume(void) >>> { >>> const struct iommu_ops *ops = iommu_get_ops(); >>> - if ( iommu_enabled ) >>> + if ( iommu_enabled && ops && ops->resume ) >>> ops->resume(); >>> } >>> >>> -void iommu_suspend() >>> +void iommu_suspend(void) >>> { >>> const struct iommu_ops *ops = iommu_get_ops(); >>> - if ( iommu_enabled ) >>> + if ( iommu_enabled && ops && ops->suspend ) >>> ops->suspend(); >>> } >>> >>> >>> _______________________________________________ >>> Xen-devel mailing list >>> Xen-devel@lists.xensource.com >>> http://lists.xensource.com/xen-devel >>_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2011-Jun-14 12:10 UTC
Re: [Xen-devel] [PATCH 6 of 7] IOMMU: add crash_shutdown iommu_op
On 13/06/2011 18:02, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:> The kdump kernel has problems booting with interrupt/dma > remapping enabled, so we need a new iommu_ops called > crash_shutdown which is basically suspend but doesn''t > need to bother saving state. > > Make sure that crash_shutdown is called on the kexec > path.What''s wrong with the existing .suspend iommu callbacks? They look very similar (identical in the AMD case in fact). The extra iommu_disable_x2apic_IR() could be avoided in favour of the eisting one in the lapic_suspend() path, perhaps? -- Keir> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > > diff -r 2635774346c4 -r 3ad737eb0a8d xen/arch/x86/crash.c > --- a/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100 > +++ b/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100 > @@ -77,6 +77,10 @@ static void nmi_shootdown_cpus(void) > msecs--; > } > > + /* Crash shutdown any IOMMU functionality as the crashdump kernel is not > + * happy when booting if interrupt/dma remapping is still enabled */ > + iommu_crash_shutdown(); > + > __stop_this_cpu(); > disable_IO_APIC(); > > diff -r 2635774346c4 -r 3ad737eb0a8d xen/drivers/passthrough/amd/iommu_init.c > --- a/xen/drivers/passthrough/amd/iommu_init.c Mon Jun 13 17:45:43 2011 +0100 > +++ b/xen/drivers/passthrough/amd/iommu_init.c Mon Jun 13 17:45:43 2011 +0100 > @@ -921,6 +921,14 @@ void amd_iommu_suspend(void) > disable_iommu(iommu); > } > > +void amd_iommu_crash_shutdown(void) > +{ > + struct amd_iommu *iommu; > + > + for_each_amd_iommu ( iommu ) > + disable_iommu(iommu); > +} > + > void amd_iommu_resume(void) > { > struct amd_iommu *iommu; > diff -r 2635774346c4 -r 3ad737eb0a8d > xen/drivers/passthrough/amd/pci_amd_iommu.c > --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c Mon Jun 13 17:45:43 2011 > +0100 > +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c Mon Jun 13 17:45:43 2011 > +0100 > @@ -459,4 +459,5 @@ const struct iommu_ops amd_iommu_ops = { > .suspend = amd_iommu_suspend, > .resume = amd_iommu_resume, > .share_p2m = amd_iommu_share_p2m, > + .crash_shutdown = amd_iommu_crash_shutdown, > }; > diff -r 2635774346c4 -r 3ad737eb0a8d xen/drivers/passthrough/iommu.c > --- a/xen/drivers/passthrough/iommu.c Mon Jun 13 17:45:43 2011 +0100 > +++ b/xen/drivers/passthrough/iommu.c Mon Jun 13 17:45:43 2011 +0100 > @@ -429,6 +429,14 @@ void iommu_share_p2m_table(struct domain > ops->share_p2m(d); > } > > +void iommu_crash_shutdown(void) > +{ > + const struct iommu_ops *ops = iommu_get_ops(); > + if ( ops && ops->crash_shutdown ) > + ops->crash_shutdown(); > + iommu_enabled = 0; > +} > + > /* > * Local variables: > * mode: C > diff -r 2635774346c4 -r 3ad737eb0a8d xen/drivers/passthrough/vtd/iommu.c > --- a/xen/drivers/passthrough/vtd/iommu.c Mon Jun 13 17:45:43 2011 +0100 > +++ b/xen/drivers/passthrough/vtd/iommu.c Mon Jun 13 17:45:43 2011 +0100 > @@ -2238,6 +2238,25 @@ static void vtd_suspend(void) > } > } > > +static void vtd_crash_shutdown(void) > +{ > + struct acpi_drhd_unit *drhd; > + struct iommu *iommu; > + > + if ( !iommu_enabled ) > + return; > + > + iommu_flush_all(); > + > + for_each_drhd_unit ( drhd ) > + { > + iommu = drhd->iommu; > + iommu_disable_translation(iommu); > + } > + > + iommu_disable_x2apic_IR(); > +} > + > static void vtd_resume(void) > { > struct acpi_drhd_unit *drhd; > @@ -2289,6 +2308,7 @@ const struct iommu_ops intel_iommu_ops > .suspend = vtd_suspend, > .resume = vtd_resume, > .share_p2m = iommu_set_pgd, > + .crash_shutdown = vtd_crash_shutdown, > }; > > /* > diff -r 2635774346c4 -r 3ad737eb0a8d > xen/include/asm-x86/hvm/svm/amd-iommu-proto.h > --- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h Mon Jun 13 17:45:43 2011 > +0100 > +++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h Mon Jun 13 17:45:43 2011 > +0100 > @@ -98,6 +98,7 @@ unsigned int amd_iommu_read_ioapic_from_ > /* power management support */ > void amd_iommu_resume(void); > void amd_iommu_suspend(void); > +void amd_iommu_crash_shutdown(void); > > static inline u32 get_field_from_reg_u32(u32 reg_value, u32 mask, u32 shift) > { > diff -r 2635774346c4 -r 3ad737eb0a8d xen/include/xen/iommu.h > --- a/xen/include/xen/iommu.h Mon Jun 13 17:45:43 2011 +0100 > +++ b/xen/include/xen/iommu.h Mon Jun 13 17:45:43 2011 +0100 > @@ -133,6 +133,7 @@ struct iommu_ops { > void (*suspend)(void); > void (*resume)(void); > void (*share_p2m)(struct domain *d); > + void (*crash_shutdown)(void); > }; > > void iommu_update_ire_from_apic(unsigned int apic, unsigned int reg, unsigned > int value); > @@ -142,6 +143,7 @@ unsigned int iommu_read_apic_from_ire(un > > void iommu_suspend(void); > void iommu_resume(void); > +void iommu_crash_shutdown(void); > > void iommu_set_dom0_mapping(struct domain *d); > void iommu_share_p2m_table(struct domain *d); > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2011-Jun-14 12:11 UTC
Re: [Xen-devel] [PATCH 7 of 7] KEXEC: correctly revert x2apic state when kexecing
On 13/06/2011 18:02, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:> Introduce the boolean variable ''kexecing'' which indicates to functions > whether we are on the kexec path or not. This is used by > disable_local_APIC() to try and revert the APIC mode back to how it > was found on boot.Does this need to be done only on the kexec path, rather than unconditionally in disable_local_APIC()? -- Keir> We also need some fudging of the x2apic_enabled variable. It is used > in multiple places over the codebase to mean multiple things, > including: > What did the user specifify on the command line? > Did the BIOS boot me in x2apic mode? > Is the BSP Local APIC in x2apic mode? > What mode is my Local APIC in? > > Therefore, set it up to prevent a protection fault when disabling the > IOAPICs. (In this case, it is used in the "What mode is my Local APIC > in?" case, so the processor doesnt suffer a protection fault because > of trying to use x2apic MSRs when it should be using xapic MMIO) > > Finally, make sure that interrupts are disabled when jumping into the > purgatory code. It would be bad to service interrupts in the Xen > context when the next kernel is booting. > > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > > diff -r 3ad737eb0a8d -r b6666fbafe33 xen/arch/x86/apic.c > --- a/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100 > +++ b/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100 > @@ -37,6 +37,7 @@ > #include <asm/asm_defns.h> /* for BUILD_SMP_INTERRUPT */ > #include <mach_apic.h> > #include <io_ports.h> > +#include <xen/kexec.h> > > static bool_t tdt_enabled __read_mostly; > static bool_t tdt_enable __initdata = 1; > @@ -345,6 +346,33 @@ void disable_local_APIC(void) > wrmsrl(MSR_IA32_APICBASE, msr_content & > ~(MSR_IA32_APICBASE_ENABLE|MSR_IA32_APICBASE_EXTD)); > } > + > + if ( kexecing ) > + { > + uint64_t msr_content; > + rdmsrl(MSR_IA32_APICBASE, msr_content); > + msr_content &= ~(MSR_IA32_APICBASE_ENABLE|MSR_IA32_APICBASE_EXTD); > + wrmsrl(MSR_IA32_APICBASE, msr_content); > + > + switch ( apic_boot_mode ) > + { > + case APIC_MODE_DISABLED: > + break; /* Nothing to do - we did this above */ > + case APIC_MODE_XAPIC: > + msr_content |= MSR_IA32_APICBASE_ENABLE; > + wrmsrl(MSR_IA32_APICBASE, msr_content); > + break; > + case APIC_MODE_X2APIC: > + msr_content |= (MSR_IA32_APICBASE_ENABLE|MSR_IA32_APICBASE_EXTD); > + wrmsrl(MSR_IA32_APICBASE, msr_content); > + break; > + default: > + printk("Default case when reverting #%d lapic to boot state\n", > + smp_processor_id()); > + break; > + } > + } > + > } > > /* > diff -r 3ad737eb0a8d -r b6666fbafe33 xen/arch/x86/crash.c > --- a/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100 > +++ b/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100 > @@ -27,6 +27,7 @@ > #include <asm/hvm/support.h> > #include <asm/apic.h> > #include <asm/io_apic.h> > +#include <xen/iommu.h> > > static atomic_t waiting_for_crash_ipi; > static unsigned int crashing_cpu; > @@ -82,6 +83,15 @@ static void nmi_shootdown_cpus(void) > iommu_crash_shutdown(); > > __stop_this_cpu(); > + > + /* This is a bit of a hack due to the problems with the x2apic_enabled > + * variable, but we can''t do any better without a significant refactoring > + * of the APIC code */ > + if ( current_local_apic_mode() == APIC_MODE_X2APIC ) > + x2apic_enabled = 1; > + else > + x2apic_enabled = 0; > + > disable_IO_APIC(); > > local_irq_restore(flags); > diff -r 3ad737eb0a8d -r b6666fbafe33 xen/arch/x86/machine_kexec.c > --- a/xen/arch/x86/machine_kexec.c Mon Jun 13 17:45:43 2011 +0100 > +++ b/xen/arch/x86/machine_kexec.c Mon Jun 13 17:45:43 2011 +0100 > @@ -91,6 +91,11 @@ void machine_kexec(xen_kexec_image_t *im > if ( hpet_broadcast_is_available() ) > hpet_disable_legacy_broadcast(); > > + /* We are about to permenantly jump out of the Xen context into the kexec > + * purgatory code. We really dont want to be still servicing interupts. > + */ > + local_irq_disable(); > + > /* > * compat_machine_kexec() returns to idle pagetables, which requires us > * to be running on a static GDT mapping (idle pagetables have no GDT > diff -r 3ad737eb0a8d -r b6666fbafe33 xen/common/kexec.c > --- a/xen/common/kexec.c Mon Jun 13 17:45:43 2011 +0100 > +++ b/xen/common/kexec.c Mon Jun 13 17:45:43 2011 +0100 > @@ -29,6 +29,8 @@ > #include <compat/kexec.h> > #endif > > +bool_t kexecing = FALSE; > + > static DEFINE_PER_CPU_READ_MOSTLY(void *, crash_notes); > > static Elf_Note *xen_crash_note; > @@ -220,6 +222,8 @@ void kexec_crash(void) > if ( !test_bit(KEXEC_IMAGE_CRASH_BASE + pos, &kexec_flags) ) > return; > > + kexecing = TRUE; > + > kexec_common_shutdown(); > kexec_crash_save_cpu(); > machine_crash_shutdown(); > @@ -232,6 +236,8 @@ static long kexec_reboot(void *_image) > { > xen_kexec_image_t *image = _image; > > + kexecing = TRUE; > + > kexec_common_shutdown(); > machine_reboot_kexec(image); > > diff -r 3ad737eb0a8d -r b6666fbafe33 xen/include/xen/kexec.h > --- a/xen/include/xen/kexec.h Mon Jun 13 17:45:43 2011 +0100 > +++ b/xen/include/xen/kexec.h Mon Jun 13 17:45:43 2011 +0100 > @@ -12,6 +12,8 @@ typedef struct xen_kexec_reserve { > > extern xen_kexec_reserve_t kexec_crash_area; > > +extern bool_t kexecing; > + > void set_kexec_crash_area_size(u64 system_ram); > > /* We have space for 4 images to support atomic update > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-Jun-14 13:05 UTC
Re: [Xen-devel] [PATCH 7 of 7] KEXEC: correctly revert x2apic state when kexecing
On 14/06/11 13:11, Keir Fraser wrote:> On 13/06/2011 18:02, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote: > >> Introduce the boolean variable ''kexecing'' which indicates to functions >> whether we are on the kexec path or not. This is used by >> disable_local_APIC() to try and revert the APIC mode back to how it >> was found on boot. > Does this need to be done only on the kexec path, rather than > unconditionally in disable_local_APIC()? > > -- KeirI tried doing it unconditionally, but because of the interaction of the local apic mode and the x2apic_enabled variable, it was causing protection faults all over the place in the regular reboot code. Specifically: the apic code switches between MSRs and MMIO based on the single global x2apic_enabled variable. Whether you should use MSRs or MMIO depends on whether your local apic is in xapic or x2apic mode, and using MSRs when you should be using MMIO results in a protection fault, which cascades as disable_local_APIC() is on the panic codepath. The function is also caused as part of __stop_this_cpu() which is on several codepaths. While none of those codepaths immediately jump out as being problematic, I have a feeling that the interaction with x2apic_enabled will cause problems there as well. I can''t think of any architectural reason for it not to happen, but it wont work without the original refactoring of the apic code which was rejected on terms of straying too far from Linux. ~Andrew>> We also need some fudging of the x2apic_enabled variable. It is used >> in multiple places over the codebase to mean multiple things, >> including: >> What did the user specifify on the command line? >> Did the BIOS boot me in x2apic mode? >> Is the BSP Local APIC in x2apic mode? >> What mode is my Local APIC in? >> >> Therefore, set it up to prevent a protection fault when disabling the >> IOAPICs. (In this case, it is used in the "What mode is my Local APIC >> in?" case, so the processor doesnt suffer a protection fault because >> of trying to use x2apic MSRs when it should be using xapic MMIO) >> >> Finally, make sure that interrupts are disabled when jumping into the >> purgatory code. It would be bad to service interrupts in the Xen >> context when the next kernel is booting. >> >> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> >> >> diff -r 3ad737eb0a8d -r b6666fbafe33 xen/arch/x86/apic.c >> --- a/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100 >> +++ b/xen/arch/x86/apic.c Mon Jun 13 17:45:43 2011 +0100 >> @@ -37,6 +37,7 @@ >> #include <asm/asm_defns.h> /* for BUILD_SMP_INTERRUPT */ >> #include <mach_apic.h> >> #include <io_ports.h> >> +#include <xen/kexec.h> >> >> static bool_t tdt_enabled __read_mostly; >> static bool_t tdt_enable __initdata = 1; >> @@ -345,6 +346,33 @@ void disable_local_APIC(void) >> wrmsrl(MSR_IA32_APICBASE, msr_content & >> ~(MSR_IA32_APICBASE_ENABLE|MSR_IA32_APICBASE_EXTD)); >> } >> + >> + if ( kexecing ) >> + { >> + uint64_t msr_content; >> + rdmsrl(MSR_IA32_APICBASE, msr_content); >> + msr_content &= ~(MSR_IA32_APICBASE_ENABLE|MSR_IA32_APICBASE_EXTD); >> + wrmsrl(MSR_IA32_APICBASE, msr_content); >> + >> + switch ( apic_boot_mode ) >> + { >> + case APIC_MODE_DISABLED: >> + break; /* Nothing to do - we did this above */ >> + case APIC_MODE_XAPIC: >> + msr_content |= MSR_IA32_APICBASE_ENABLE; >> + wrmsrl(MSR_IA32_APICBASE, msr_content); >> + break; >> + case APIC_MODE_X2APIC: >> + msr_content |= (MSR_IA32_APICBASE_ENABLE|MSR_IA32_APICBASE_EXTD); >> + wrmsrl(MSR_IA32_APICBASE, msr_content); >> + break; >> + default: >> + printk("Default case when reverting #%d lapic to boot state\n", >> + smp_processor_id()); >> + break; >> + } >> + } >> + >> } >> >> /* >> diff -r 3ad737eb0a8d -r b6666fbafe33 xen/arch/x86/crash.c >> --- a/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100 >> +++ b/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100 >> @@ -27,6 +27,7 @@ >> #include <asm/hvm/support.h> >> #include <asm/apic.h> >> #include <asm/io_apic.h> >> +#include <xen/iommu.h> >> >> static atomic_t waiting_for_crash_ipi; >> static unsigned int crashing_cpu; >> @@ -82,6 +83,15 @@ static void nmi_shootdown_cpus(void) >> iommu_crash_shutdown(); >> >> __stop_this_cpu(); >> + >> + /* This is a bit of a hack due to the problems with the x2apic_enabled >> + * variable, but we can''t do any better without a significant refactoring >> + * of the APIC code */ >> + if ( current_local_apic_mode() == APIC_MODE_X2APIC ) >> + x2apic_enabled = 1; >> + else >> + x2apic_enabled = 0; >> + >> disable_IO_APIC(); >> >> local_irq_restore(flags); >> diff -r 3ad737eb0a8d -r b6666fbafe33 xen/arch/x86/machine_kexec.c >> --- a/xen/arch/x86/machine_kexec.c Mon Jun 13 17:45:43 2011 +0100 >> +++ b/xen/arch/x86/machine_kexec.c Mon Jun 13 17:45:43 2011 +0100 >> @@ -91,6 +91,11 @@ void machine_kexec(xen_kexec_image_t *im >> if ( hpet_broadcast_is_available() ) >> hpet_disable_legacy_broadcast(); >> >> + /* We are about to permenantly jump out of the Xen context into the kexec >> + * purgatory code. We really dont want to be still servicing interupts. >> + */ >> + local_irq_disable(); >> + >> /* >> * compat_machine_kexec() returns to idle pagetables, which requires us >> * to be running on a static GDT mapping (idle pagetables have no GDT >> diff -r 3ad737eb0a8d -r b6666fbafe33 xen/common/kexec.c >> --- a/xen/common/kexec.c Mon Jun 13 17:45:43 2011 +0100 >> +++ b/xen/common/kexec.c Mon Jun 13 17:45:43 2011 +0100 >> @@ -29,6 +29,8 @@ >> #include <compat/kexec.h> >> #endif >> >> +bool_t kexecing = FALSE; >> + >> static DEFINE_PER_CPU_READ_MOSTLY(void *, crash_notes); >> >> static Elf_Note *xen_crash_note; >> @@ -220,6 +222,8 @@ void kexec_crash(void) >> if ( !test_bit(KEXEC_IMAGE_CRASH_BASE + pos, &kexec_flags) ) >> return; >> >> + kexecing = TRUE; >> + >> kexec_common_shutdown(); >> kexec_crash_save_cpu(); >> machine_crash_shutdown(); >> @@ -232,6 +236,8 @@ static long kexec_reboot(void *_image) >> { >> xen_kexec_image_t *image = _image; >> >> + kexecing = TRUE; >> + >> kexec_common_shutdown(); >> machine_reboot_kexec(image); >> >> diff -r 3ad737eb0a8d -r b6666fbafe33 xen/include/xen/kexec.h >> --- a/xen/include/xen/kexec.h Mon Jun 13 17:45:43 2011 +0100 >> +++ b/xen/include/xen/kexec.h Mon Jun 13 17:45:43 2011 +0100 >> @@ -12,6 +12,8 @@ typedef struct xen_kexec_reserve { >> >> extern xen_kexec_reserve_t kexec_crash_area; >> >> +extern bool_t kexecing; >> + >> void set_kexec_crash_area_size(u64 system_ram); >> >> /* We have space for 4 images to support atomic update >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xensource.com >> http://lists.xensource.com/xen-devel >-- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Kay, Allen M
2011-Jun-14 21:20 UTC
RE: [Xen-devel] [PATCH 5 of 7] IOMMU VTD BUG: disable Extended Interrupt Mode when disabling Interupt Remapping
> Let''s see what our friends at Intel think...IOMMU_WAIT_OP() has a timeout value of 1 second. When it times out, it will cause a system panic. -----Original Message----- From: Jan Beulich [mailto:JBeulich@novell.com] Sent: Tuesday, June 14, 2011 2:02 AM To: Andrew Cooper Cc: Kay, Allen M; xen-devel@lists.xensource.com Subject: Re: [Xen-devel] [PATCH 5 of 7] IOMMU VTD BUG: disable Extended Interrupt Mode when disabling Interupt Remapping>>> On 13.06.11 at 19:02, Andrew Cooper <andrew.cooper3@citrix.com> wrote: > Experimental evidence shows that Extended Interrupt Mode remains in > effect even after Interrupt Remapping is disabled in each DMAR Global > Command Register. A consiquence of this is that when we switch from > x2apic mode back to xapic mode, and disable interrupt remapping for > the kdump kernel, interrupts passing through the IO APICs are in > x2apic format as opposed xapic. This causes a triple fault in the > kexec kernel. > > As EIM is explicitly set up each time Interrup Remapping is enabled, > it is safe for us to clobber this when taring down. > > Also, change the header definition of IRTA_REG_EIME_SHIFT. It caused > verbose and error-prone code, and was only used in 1 place before. We > now have IRTA_EIME which is the specific bit in the register. > > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > > diff -r 615db56671fa -r 2635774346c4 xen/drivers/passthrough/vtd/intremap.c > --- a/xen/drivers/passthrough/vtd/intremap.c Mon Jun 13 17:45:43 2011 +0100 > +++ b/xen/drivers/passthrough/vtd/intremap.c Mon Jun 13 17:45:43 2011 +0100 > @@ -776,8 +776,7 @@ int enable_intremap(struct iommu *iommu, > > #ifdef CONFIG_X86 > /* set extended interrupt mode bit */ > - ir_ctrl->iremap_maddr |> - eim ? (1 << IRTA_REG_EIME_SHIFT) : 0; > + ir_ctrl->iremap_maddr |= eim ? IRTA_EIME : 0; > #endif > spin_lock_irqsave(&iommu->register_lock, flags); > > @@ -817,6 +816,22 @@ void disable_intremap(struct iommu *iomm > if ( !ecap_intr_remap(iommu->ecap) ) > return; > > + /* If we are disabling Interrupt Remapping, make sure we dont stay in > + * Extended Interrupt Mode, as this is unaffected by the Interrupt > + * Remapping flag in each DMAR Global Control Register. > + * Specifically, local apics in xapic mode do not like interrupts > delivered > + * in x2apic mode. Any code turning interrupt remapping back on will > set > + * EIME back correctly. > + */ > + if ( iommu_supports_eim() ) > + { > + u64 irta; > + irta = dmar_readl(iommu->reg, DMAR_IRTA_REG); > + dmar_writel(iommu->reg, DMAR_IRTA_REG, irta & ~IRTA_EIME); > + IOMMU_WAIT_OP(iommu, DMAR_IRTA_REG, dmar_readl, > + !(irta & IRTA_EIME), irta);Hmm, this of course can be problematic in the context of a crash: I realize that you want it cleared, but does system state guarantee that this WAIT_OP will always be able to complete? You''d get a hung system instead if it won''t. Admittedly it''s not clear which one is better, but a best effort approach would probably skip the wait loop. Let''s see what our friends at Intel think... Jan> + } > + > spin_lock_irqsave(&iommu->register_lock, flags); > sts = dmar_readl(iommu->reg, DMAR_GSTS_REG); > if ( !(sts & DMA_GSTS_IRES) ) > diff -r 615db56671fa -r 2635774346c4 xen/drivers/passthrough/vtd/iommu.h > --- a/xen/drivers/passthrough/vtd/iommu.h Mon Jun 13 17:45:43 2011 +0100 > +++ b/xen/drivers/passthrough/vtd/iommu.h Mon Jun 13 17:45:43 2011 +0100 > @@ -471,7 +471,7 @@ struct qinval_entry { > > #define IEC_GLOBAL_INVL 0 > #define IEC_INDEX_INVL 1 > -#define IRTA_REG_EIME_SHIFT 11 > +#define IRTA_EIME (1 << 11) > > /* 2^(IRTA_REG_TABLE_SIZE + 1) = IREMAP_ENTRY_NR */ > #define IRTA_REG_TABLE_SIZE ( IREMAP_PAGE_ORDER + 7 ) > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Kay, Allen M
2011-Jun-14 21:45 UTC
RE: [Xen-devel] [PATCH 5 of 7] IOMMU VTD BUG: disable Extended Interrupt Mode when disabling Interupt Remapping
> + /* If we are disabling Interrupt Remapping, make sure we dont stay in > + * Extended Interrupt Mode, as this is unaffected by the Interrupt > + * Remapping flag in each DMAR Global Control Register. > + * Specifically, local apics in xapic mode do not like interrupts delivered > + * in x2apic mode. Any code turning interrupt remapping back on will set > + * EIME back correctly. > + */ > + if ( iommu_supports_eim() ) > + { > + u64 irta; > + irta = dmar_readl(iommu->reg, DMAR_IRTA_REG); > + dmar_writel(iommu->reg, DMAR_IRTA_REG, irta & ~IRTA_EIME); > + IOMMU_WAIT_OP(iommu, DMAR_IRTA_REG, dmar_readl, > + !(irta & IRTA_EIME), irta); > + } > + > spin_lock_irqsave(&iommu->register_lock, flags); > sts = dmar_readl(iommu->reg, DMAR_GSTS_REG); > if ( !(sts & DMA_GSTS_IRES) )This new code should go after spin_lock_irqsave() call. You should also model your code after existing code that disables interrupt remapping in the same function, by checking whether EIME is enabled after the read before proceed to disable it. It''s good to keep the software consistent in case we encounter any hardware bugs in the future.> #define IEC_GLOBAL_INVL 0 > #define IEC_INDEX_INVL 1 > -#define IRTA_REG_EIME_SHIFT 11 > +#define IRTA_EIME (1 << 11)Cast with u64 as follow like other defines in iommu.h. #define IRTA_EIME (((u64)1) << 11) Allen _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Kay, Allen M
2011-Jun-14 22:15 UTC
RE: [Xen-devel] [PATCH 6 of 7] IOMMU: add crash_shutdown iommu_op
+static void vtd_crash_shutdown(void) +{ + struct acpi_drhd_unit *drhd; + struct iommu *iommu; + + if ( !iommu_enabled ) + return; + + iommu_flush_all(); + + for_each_drhd_unit ( drhd ) + { + iommu = drhd->iommu; + iommu_disable_translation(iommu); + } + + iommu_disable_x2apic_IR(); +} + Iommu_disable_x2apic_IR() check for iommu_supports_eim() before entering. What happens when x2apic is not enabled but interrupt remapping is enabled? Maybe you should just create disable_intremap() and disable_qi() functions and call from vtd_crash_shutdown() and iommu_disable_x2apic_IR(). Allen _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2011-Jun-15 06:48 UTC
RE: [Xen-devel] [PATCH 5 of 7] IOMMU VTD BUG: disable Extended Interrupt Mode when disabling Interupt Remapping
>>> On 14.06.11 at 23:20, "Kay, Allen M" <allen.m.kay@intel.com> wrote: >> Let''s see what our friends at Intel think... > > IOMMU_WAIT_OP() has a timeout value of 1 second. When it times out, it will > cause a system panic.Sort of bad when we''re already bringing down the system possibly because of a panic. Jan> -----Original Message----- > From: Jan Beulich [mailto:JBeulich@novell.com] > Sent: Tuesday, June 14, 2011 2:02 AM > To: Andrew Cooper > Cc: Kay, Allen M; xen-devel@lists.xensource.com > Subject: Re: [Xen-devel] [PATCH 5 of 7] IOMMU VTD BUG: disable Extended > Interrupt Mode when disabling Interupt Remapping > >>>> On 13.06.11 at 19:02, Andrew Cooper <andrew.cooper3@citrix.com> wrote: >> Experimental evidence shows that Extended Interrupt Mode remains in >> effect even after Interrupt Remapping is disabled in each DMAR Global >> Command Register. A consiquence of this is that when we switch from >> x2apic mode back to xapic mode, and disable interrupt remapping for >> the kdump kernel, interrupts passing through the IO APICs are in >> x2apic format as opposed xapic. This causes a triple fault in the >> kexec kernel. >> >> As EIM is explicitly set up each time Interrup Remapping is enabled, >> it is safe for us to clobber this when taring down. >> >> Also, change the header definition of IRTA_REG_EIME_SHIFT. It caused >> verbose and error-prone code, and was only used in 1 place before. We >> now have IRTA_EIME which is the specific bit in the register. >> >> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> >> >> diff -r 615db56671fa -r 2635774346c4 xen/drivers/passthrough/vtd/intremap.c >> --- a/xen/drivers/passthrough/vtd/intremap.c Mon Jun 13 17:45:43 2011 +0100 >> +++ b/xen/drivers/passthrough/vtd/intremap.c Mon Jun 13 17:45:43 2011 +0100 >> @@ -776,8 +776,7 @@ int enable_intremap(struct iommu *iommu, >> >> #ifdef CONFIG_X86 >> /* set extended interrupt mode bit */ >> - ir_ctrl->iremap_maddr |>> - eim ? (1 << IRTA_REG_EIME_SHIFT) : 0; >> + ir_ctrl->iremap_maddr |= eim ? IRTA_EIME : 0; >> #endif >> spin_lock_irqsave(&iommu->register_lock, flags); >> >> @@ -817,6 +816,22 @@ void disable_intremap(struct iommu *iomm >> if ( !ecap_intr_remap(iommu->ecap) ) >> return; >> >> + /* If we are disabling Interrupt Remapping, make sure we dont stay in >> + * Extended Interrupt Mode, as this is unaffected by the Interrupt >> + * Remapping flag in each DMAR Global Control Register. >> + * Specifically, local apics in xapic mode do not like interrupts >> delivered >> + * in x2apic mode. Any code turning interrupt remapping back on will >> set >> + * EIME back correctly. >> + */ >> + if ( iommu_supports_eim() ) >> + { >> + u64 irta; >> + irta = dmar_readl(iommu->reg, DMAR_IRTA_REG); >> + dmar_writel(iommu->reg, DMAR_IRTA_REG, irta & ~IRTA_EIME); >> + IOMMU_WAIT_OP(iommu, DMAR_IRTA_REG, dmar_readl, >> + !(irta & IRTA_EIME), irta); > > Hmm, this of course can be problematic in the context of a crash: > I realize that you want it cleared, but does system state guarantee > that this WAIT_OP will always be able to complete? You''d get a > hung system instead if it won''t. Admittedly it''s not clear which one > is better, but a best effort approach would probably skip the wait > loop. Let''s see what our friends at Intel think... > > Jan > >> + } >> + >> spin_lock_irqsave(&iommu->register_lock, flags); >> sts = dmar_readl(iommu->reg, DMAR_GSTS_REG); >> if ( !(sts & DMA_GSTS_IRES) ) >> diff -r 615db56671fa -r 2635774346c4 xen/drivers/passthrough/vtd/iommu.h >> --- a/xen/drivers/passthrough/vtd/iommu.h Mon Jun 13 17:45:43 2011 +0100 >> +++ b/xen/drivers/passthrough/vtd/iommu.h Mon Jun 13 17:45:43 2011 +0100 >> @@ -471,7 +471,7 @@ struct qinval_entry { >> >> #define IEC_GLOBAL_INVL 0 >> #define IEC_INDEX_INVL 1 >> -#define IRTA_REG_EIME_SHIFT 11 >> +#define IRTA_EIME (1 << 11) >> >> /* 2^(IRTA_REG_TABLE_SIZE + 1) = IREMAP_ENTRY_NR */ >> #define IRTA_REG_TABLE_SIZE ( IREMAP_PAGE_ORDER + 7 ) >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xensource.com >> http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2011-Jun-15 07:45 UTC
RE: [Xen-devel] [PATCH 5 of 7] IOMMU VTD BUG: disable Extended Interrupt Mode when disabling Interupt Remapping
On Wed, 2011-06-15 at 07:48 +0100, Jan Beulich wrote:> >>> On 14.06.11 at 23:20, "Kay, Allen M" <allen.m.kay@intel.com> wrote: > >> Let''s see what our friends at Intel think... > > > > IOMMU_WAIT_OP() has a timeout value of 1 second. When it times out, it will > > cause a system panic. > > Sort of bad when we''re already bringing down the system possibly > because of a panic.Right. I think we could do with an __IOMMU_WAIT_OP (if one doesn''t already exist) which just waits for the timeout and returns. There''s no harm in trying to continue with the kexec if it fails IMHO. Ian.> > Jan > > > -----Original Message----- > > From: Jan Beulich [mailto:JBeulich@novell.com] > > Sent: Tuesday, June 14, 2011 2:02 AM > > To: Andrew Cooper > > Cc: Kay, Allen M; xen-devel@lists.xensource.com > > Subject: Re: [Xen-devel] [PATCH 5 of 7] IOMMU VTD BUG: disable Extended > > Interrupt Mode when disabling Interupt Remapping > > > >>>> On 13.06.11 at 19:02, Andrew Cooper <andrew.cooper3@citrix.com> wrote: > >> Experimental evidence shows that Extended Interrupt Mode remains in > >> effect even after Interrupt Remapping is disabled in each DMAR Global > >> Command Register. A consiquence of this is that when we switch from > >> x2apic mode back to xapic mode, and disable interrupt remapping for > >> the kdump kernel, interrupts passing through the IO APICs are in > >> x2apic format as opposed xapic. This causes a triple fault in the > >> kexec kernel. > >> > >> As EIM is explicitly set up each time Interrup Remapping is enabled, > >> it is safe for us to clobber this when taring down. > >> > >> Also, change the header definition of IRTA_REG_EIME_SHIFT. It caused > >> verbose and error-prone code, and was only used in 1 place before. We > >> now have IRTA_EIME which is the specific bit in the register. > >> > >> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > >> > >> diff -r 615db56671fa -r 2635774346c4 xen/drivers/passthrough/vtd/intremap.c > >> --- a/xen/drivers/passthrough/vtd/intremap.c Mon Jun 13 17:45:43 2011 +0100 > >> +++ b/xen/drivers/passthrough/vtd/intremap.c Mon Jun 13 17:45:43 2011 +0100 > >> @@ -776,8 +776,7 @@ int enable_intremap(struct iommu *iommu, > >> > >> #ifdef CONFIG_X86 > >> /* set extended interrupt mode bit */ > >> - ir_ctrl->iremap_maddr |> >> - eim ? (1 << IRTA_REG_EIME_SHIFT) : 0; > >> + ir_ctrl->iremap_maddr |= eim ? IRTA_EIME : 0; > >> #endif > >> spin_lock_irqsave(&iommu->register_lock, flags); > >> > >> @@ -817,6 +816,22 @@ void disable_intremap(struct iommu *iomm > >> if ( !ecap_intr_remap(iommu->ecap) ) > >> return; > >> > >> + /* If we are disabling Interrupt Remapping, make sure we dont stay in > >> + * Extended Interrupt Mode, as this is unaffected by the Interrupt > >> + * Remapping flag in each DMAR Global Control Register. > >> + * Specifically, local apics in xapic mode do not like interrupts > >> delivered > >> + * in x2apic mode. Any code turning interrupt remapping back on will > >> set > >> + * EIME back correctly. > >> + */ > >> + if ( iommu_supports_eim() ) > >> + { > >> + u64 irta; > >> + irta = dmar_readl(iommu->reg, DMAR_IRTA_REG); > >> + dmar_writel(iommu->reg, DMAR_IRTA_REG, irta & ~IRTA_EIME); > >> + IOMMU_WAIT_OP(iommu, DMAR_IRTA_REG, dmar_readl, > >> + !(irta & IRTA_EIME), irta); > > > > Hmm, this of course can be problematic in the context of a crash: > > I realize that you want it cleared, but does system state guarantee > > that this WAIT_OP will always be able to complete? You''d get a > > hung system instead if it won''t. Admittedly it''s not clear which one > > is better, but a best effort approach would probably skip the wait > > loop. Let''s see what our friends at Intel think... > > > > Jan > > > >> + } > >> + > >> spin_lock_irqsave(&iommu->register_lock, flags); > >> sts = dmar_readl(iommu->reg, DMAR_GSTS_REG); > >> if ( !(sts & DMA_GSTS_IRES) ) > >> diff -r 615db56671fa -r 2635774346c4 xen/drivers/passthrough/vtd/iommu.h > >> --- a/xen/drivers/passthrough/vtd/iommu.h Mon Jun 13 17:45:43 2011 +0100 > >> +++ b/xen/drivers/passthrough/vtd/iommu.h Mon Jun 13 17:45:43 2011 +0100 > >> @@ -471,7 +471,7 @@ struct qinval_entry { > >> > >> #define IEC_GLOBAL_INVL 0 > >> #define IEC_INDEX_INVL 1 > >> -#define IRTA_REG_EIME_SHIFT 11 > >> +#define IRTA_EIME (1 << 11) > >> > >> /* 2^(IRTA_REG_TABLE_SIZE + 1) = IREMAP_ENTRY_NR */ > >> #define IRTA_REG_TABLE_SIZE ( IREMAP_PAGE_ORDER + 7 ) > >> > >> _______________________________________________ > >> Xen-devel mailing list > >> Xen-devel@lists.xensource.com > >> http://lists.xensource.com/xen-devel > > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-Jun-15 11:01 UTC
Re: [Xen-devel] [PATCH 2 of 7] KEXEC BUG: nmi_shootdown_cpus doesn''t look after the interrupt flag [Reformatted]
Reformatted as per Keir''s suggestion. ~Andrew _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-Jun-15 12:33 UTC
Re: [Xen-devel] [PATCH 4 of 7] APIC: record local APIC state on boot [Reformatted]
It turns out that we do require apic_boot_mode to be accessible outside in a later reformatted patch so that cant be static any more. Suggested changes for acpi_mode_to_str have been taken, along with removing its declaration from apic.h and putting it at the top of apic.c -- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2011-Jun-15 12:42 UTC
Re: [Xen-devel] [PATCH 4 of 7] APIC: record local APIC state on boot [Reformatted]
On 15/06/2011 13:33, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:> It turns out that we do require apic_boot_mode to be accessible outside > in a later reformatted patch so that cant be static any more. > > Suggested changes for acpi_mode_to_str have been taken, along with > removing its declaration from apic.h and putting it at the top of apic.cI applied some of you series already to xen-unstable tip. Please re-send the remaining bits you need as a new series against tip. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2011-Jun-15 12:50 UTC
Re: [Xen-devel] [PATCH 4 of 7] APIC: record local APIC state on boot [Reformatted]
>>> On 15.06.11 at 14:33, Andrew Cooper <andrew.cooper3@citrix.com> wrote: > It turns out that we do require apic_boot_mode to be accessible outside > in a later reformatted patch so that cant be static any more.As said earlier, I think the code that accesses this variable really belongs into apic.c (possibly into a new function).> Suggested changes for acpi_mode_to_str have been taken, along with > removing its declaration from apic.h and putting it at the top of apic.cThanks. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-Jun-15 12:50 UTC
Re: [Xen-devel] [PATCH 6 of 7] IOMMU: add crash_shutdown iommu_op
On 14/06/11 13:10, Keir Fraser wrote:> On 13/06/2011 18:02, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote: > >> The kdump kernel has problems booting with interrupt/dma >> remapping enabled, so we need a new iommu_ops called >> crash_shutdown which is basically suspend but doesn''t >> need to bother saving state. >> >> Make sure that crash_shutdown is called on the kexec >> path. > What''s wrong with the existing .suspend iommu callbacks? They look very > similar (identical in the AMD case in fact). The extra > iommu_disable_x2apic_IR() could be avoided in favour of the eisting one in > the lapic_suspend() path, perhaps? > > -- Keir >For the Intel case, we don''t care whether the user has forced iommu on the comandline - we still need to disable it. Also, saving state is pointless in the case of a crash. It also gives us a clean way to set iommu_enabled = 0; which prevents unhelpful panics later on when disable_IO_APICs times out because it is polling the wrong register. For AMD, that is simply because AMD doesn''t actually save state. (I assume that the state is retained in the hardware, but I don''t have any AMD boxes I can test with)>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> >> >> diff -r 2635774346c4 -r 3ad737eb0a8d xen/arch/x86/crash.c >> --- a/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100 >> +++ b/xen/arch/x86/crash.c Mon Jun 13 17:45:43 2011 +0100 >> @@ -77,6 +77,10 @@ static void nmi_shootdown_cpus(void) >> msecs--; >> } >> >> + /* Crash shutdown any IOMMU functionality as the crashdump kernel is not >> + * happy when booting if interrupt/dma remapping is still enabled */ >> + iommu_crash_shutdown(); >> + >> __stop_this_cpu(); >> disable_IO_APIC(); >> >> diff -r 2635774346c4 -r 3ad737eb0a8d xen/drivers/passthrough/amd/iommu_init.c >> --- a/xen/drivers/passthrough/amd/iommu_init.c Mon Jun 13 17:45:43 2011 +0100 >> +++ b/xen/drivers/passthrough/amd/iommu_init.c Mon Jun 13 17:45:43 2011 +0100 >> @@ -921,6 +921,14 @@ void amd_iommu_suspend(void) >> disable_iommu(iommu); >> } >> >> +void amd_iommu_crash_shutdown(void) >> +{ >> + struct amd_iommu *iommu; >> + >> + for_each_amd_iommu ( iommu ) >> + disable_iommu(iommu); >> +} >> + >> void amd_iommu_resume(void) >> { >> struct amd_iommu *iommu; >> diff -r 2635774346c4 -r 3ad737eb0a8d >> xen/drivers/passthrough/amd/pci_amd_iommu.c >> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c Mon Jun 13 17:45:43 2011 >> +0100 >> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c Mon Jun 13 17:45:43 2011 >> +0100 >> @@ -459,4 +459,5 @@ const struct iommu_ops amd_iommu_ops = { >> .suspend = amd_iommu_suspend, >> .resume = amd_iommu_resume, >> .share_p2m = amd_iommu_share_p2m, >> + .crash_shutdown = amd_iommu_crash_shutdown, >> }; >> diff -r 2635774346c4 -r 3ad737eb0a8d xen/drivers/passthrough/iommu.c >> --- a/xen/drivers/passthrough/iommu.c Mon Jun 13 17:45:43 2011 +0100 >> +++ b/xen/drivers/passthrough/iommu.c Mon Jun 13 17:45:43 2011 +0100 >> @@ -429,6 +429,14 @@ void iommu_share_p2m_table(struct domain >> ops->share_p2m(d); >> } >> >> +void iommu_crash_shutdown(void) >> +{ >> + const struct iommu_ops *ops = iommu_get_ops(); >> + if ( ops && ops->crash_shutdown ) >> + ops->crash_shutdown(); >> + iommu_enabled = 0; >> +} >> + >> /* >> * Local variables: >> * mode: C >> diff -r 2635774346c4 -r 3ad737eb0a8d xen/drivers/passthrough/vtd/iommu.c >> --- a/xen/drivers/passthrough/vtd/iommu.c Mon Jun 13 17:45:43 2011 +0100 >> +++ b/xen/drivers/passthrough/vtd/iommu.c Mon Jun 13 17:45:43 2011 +0100 >> @@ -2238,6 +2238,25 @@ static void vtd_suspend(void) >> } >> } >> >> +static void vtd_crash_shutdown(void) >> +{ >> + struct acpi_drhd_unit *drhd; >> + struct iommu *iommu; >> + >> + if ( !iommu_enabled ) >> + return; >> + >> + iommu_flush_all(); >> + >> + for_each_drhd_unit ( drhd ) >> + { >> + iommu = drhd->iommu; >> + iommu_disable_translation(iommu); >> + } >> + >> + iommu_disable_x2apic_IR(); >> +} >> + >> static void vtd_resume(void) >> { >> struct acpi_drhd_unit *drhd; >> @@ -2289,6 +2308,7 @@ const struct iommu_ops intel_iommu_ops >> .suspend = vtd_suspend, >> .resume = vtd_resume, >> .share_p2m = iommu_set_pgd, >> + .crash_shutdown = vtd_crash_shutdown, >> }; >> >> /* >> diff -r 2635774346c4 -r 3ad737eb0a8d >> xen/include/asm-x86/hvm/svm/amd-iommu-proto.h >> --- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h Mon Jun 13 17:45:43 2011 >> +0100 >> +++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h Mon Jun 13 17:45:43 2011 >> +0100 >> @@ -98,6 +98,7 @@ unsigned int amd_iommu_read_ioapic_from_ >> /* power management support */ >> void amd_iommu_resume(void); >> void amd_iommu_suspend(void); >> +void amd_iommu_crash_shutdown(void); >> >> static inline u32 get_field_from_reg_u32(u32 reg_value, u32 mask, u32 shift) >> { >> diff -r 2635774346c4 -r 3ad737eb0a8d xen/include/xen/iommu.h >> --- a/xen/include/xen/iommu.h Mon Jun 13 17:45:43 2011 +0100 >> +++ b/xen/include/xen/iommu.h Mon Jun 13 17:45:43 2011 +0100 >> @@ -133,6 +133,7 @@ struct iommu_ops { >> void (*suspend)(void); >> void (*resume)(void); >> void (*share_p2m)(struct domain *d); >> + void (*crash_shutdown)(void); >> }; >> >> void iommu_update_ire_from_apic(unsigned int apic, unsigned int reg, unsigned >> int value); >> @@ -142,6 +143,7 @@ unsigned int iommu_read_apic_from_ire(un >> >> void iommu_suspend(void); >> void iommu_resume(void); >> +void iommu_crash_shutdown(void); >> >> void iommu_set_dom0_mapping(struct domain *d); >> void iommu_share_p2m_table(struct domain *d); >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xensource.com >> http://lists.xensource.com/xen-devel >-- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-Jun-15 13:06 UTC
Re: [Xen-devel] [PATCH 6 of 7] IOMMU: add crash_shutdown iommu_op
On 14/06/11 23:15, Kay, Allen M wrote:> +static void vtd_crash_shutdown(void) > +{ > + struct acpi_drhd_unit *drhd; > + struct iommu *iommu; > + > + if ( !iommu_enabled ) > + return; > + > + iommu_flush_all(); > + > + for_each_drhd_unit ( drhd ) > + { > + iommu = drhd->iommu; > + iommu_disable_translation(iommu); > + } > + > + iommu_disable_x2apic_IR(); > +} > + > > Iommu_disable_x2apic_IR() check for iommu_supports_eim() before entering. What happens when x2apic is not enabled but interrupt remapping is enabled? > > Maybe you should just create disable_intremap() and disable_qi() functions and call from vtd_crash_shutdown() and iommu_disable_x2apic_IR(). > > AllenWell spotted - I missed that. My suggestion would be to remove the check for eim and deal with it in the relevant disable_intremap and disable_qi functions. My feeling is that a call to "iommu_disable_IR" should be able to deal whether or not you have eim. If there are no objections, I will go ahead and try this and integrate it into the patch 5 of the series which is already dealing with eim, and needs some refactoring following my chat with Ian Campbell this morning. -- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-Jun-15 13:38 UTC
Re: [Xen-devel] [PATCH 4 of 7] APIC: record local APIC state on boot [Reformatted]
On 15/06/11 13:42, Keir Fraser wrote:> On 15/06/2011 13:33, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote: > >> It turns out that we do require apic_boot_mode to be accessible outside >> in a later reformatted patch so that cant be static any more. >> >> Suggested changes for acpi_mode_to_str have been taken, along with >> removing its declaration from apic.h and putting it at the top of apic.c > I applied some of you series already to xen-unstable tip. Please re-send the > remaining bits you need as a new series against tip. > > -- KeirAh - I am working against the wrong unstable. Here is the change against staging unstable. Also, is it wise having extern enum apic_mode current_local_apic_mode(void); is apic.h if the function itself is static inside apic.c? -- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-Jun-15 14:49 UTC
Re: [Xen-devel] [PATCH 5 of 7] IOMMU VTD BUG: disable Extended Interrupt Mode when disabling Interupt Remapping [Reformatted]
Reformatted for staging unstable following Allen''s suggestions. The problem about panicking on the kexec path will be fixed in a later reformatted patch. -- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-Jun-15 14:49 UTC
Re: [Xen-devel] [PATCH 4 of 7] APIC: record local APIC state on boot [Reformatted]
On 15/06/11 14:38, Andrew Cooper wrote:> > On 15/06/11 13:42, Keir Fraser wrote: >> On 15/06/2011 13:33, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote: >> >>> It turns out that we do require apic_boot_mode to be accessible outside >>> in a later reformatted patch so that cant be static any more. >>> >>> Suggested changes for acpi_mode_to_str have been taken, along with >>> removing its declaration from apic.h and putting it at the top of apic.c >> I applied some of you series already to xen-unstable tip. Please re-send the >> remaining bits you need as a new series against tip. >> >> -- Keir > Ah - I am working against the wrong unstable. Here is the change > against staging unstable. > > Also, is it wise having extern enum apic_mode > current_local_apic_mode(void); is apic.h if the function itself is > static inside apic.c? >Sorry - previous patch was still against the wrong repo. This is correctly against staging. -- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-Jun-15 15:00 UTC
Re: [Xen-devel] [PATCH 6 of 7] IOMMU: add crash_shutdown iommu_op [Reformatted]
Reformatted patch which removes the crash_shutdown call to iommu_disable_x2apic_IR and merges disable_intremap and disable_qinval into the single for_each_dhrd_unit loop. -- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Kay, Allen M
2011-Jun-15 16:39 UTC
RE: [Xen-devel] [PATCH 6 of 7] IOMMU: add crash_shutdown iommu_op
> Well spotted - I missed that. My suggestion would be to remove the > check for eim and deal with it in the relevant disable_intremap and > disable_qi functions. My feeling is that a call to "iommu_disable_IR" > should be able to deal whether or not you have eim. >I agree with your suggestion of not checking for EIM in iommu_disable_IR(). Allen _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-Jun-16 13:05 UTC
Re: [Xen-devel] [PATCH 0 of 7] Fix kexec in Xen (take 4)
One final thing to say, These fixes are candidates for backport to 4.1 as the kexec path is still broken there. ~Andrew On 13/06/11 19:15, Keir Fraser wrote:> On 13/06/2011 18:02, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote: > >> This set of patches are partly bugfixed to problems I have noticed while >> working on this area of the codebase, and targeted fixes to get the kexec path >> working again on Xen 4.x >> >> The first three patches are small bugfixes which are not directly related to >> the kexec problems, but are on the codepath. >> >> The next four patches are directly related to fixing the kexec path. >> >> These patches cause xen to track the BSP local APIC boot state and return to >> it before kexec''ing to a new kernel. This prevents the kdump kernel falling >> over itself when booting in x2apic mode while expecting to be in xapic mode. >> >> It also makes sure to disable IO virtualisation, along with fixing the current >> codepath related to disabling Interrup Remapping on Intel VTD boxes. > Overall looking much better! I will take a look at the patches individually, > and I expect Jan will be looking too. I think we may have a largely > check-in-able series here. :-) > > -- Keir > >> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xensource.com >> http://lists.xensource.com/xen-devel >-- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2011-Jun-16 13:13 UTC
Re: [Xen-devel] [PATCH 0 of 7] Fix kexec in Xen (take 4)
On 16/06/2011 14:05, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:> One final thing to say, These fixes are candidates for backport to 4.1 > as the kexec path is still broken there.Since I think you ended up with their scope limited to the kexec paths only, that should be doable. -- Keir> ~Andrew > > On 13/06/11 19:15, Keir Fraser wrote: >> On 13/06/2011 18:02, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote: >> >>> This set of patches are partly bugfixed to problems I have noticed while >>> working on this area of the codebase, and targeted fixes to get the kexec >>> path >>> working again on Xen 4.x >>> >>> The first three patches are small bugfixes which are not directly related to >>> the kexec problems, but are on the codepath. >>> >>> The next four patches are directly related to fixing the kexec path. >>> >>> These patches cause xen to track the BSP local APIC boot state and return to >>> it before kexec''ing to a new kernel. This prevents the kdump kernel falling >>> over itself when booting in x2apic mode while expecting to be in xapic mode. >>> >>> It also makes sure to disable IO virtualisation, along with fixing the >>> current >>> codepath related to disabling Interrup Remapping on Intel VTD boxes. >> Overall looking much better! I will take a look at the patches individually, >> and I expect Jan will be looking too. I think we may have a largely >> check-in-able series here. :-) >> >> -- Keir >> >>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> >>> >>> _______________________________________________ >>> Xen-devel mailing list >>> Xen-devel@lists.xensource.com >>> http://lists.xensource.com/xen-devel >>_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel