This set of patches is designed to get the kexec path working again on Xen 4.x kdump kernels can''t boot if x2apic mode is enabled and the ACPI tables dont state this fact. They also cant boot at all with interrupt remapping enabled. These patches cause xen to track the BSP local APIC boot state and return to it before kexec''ing to a new kernel. It also makes sure to disable IO virtualisation. One area which is problematic is disabling interrupt remapping. lapic_suspend() calls iommu_disable_x2apic_IR() which in a previous thread was deemed to be Intel specific and only works by chance on AMD boxes by effectivly being a NOP. As lapic_suspend() is generic code, does this mean that we can''t/don''t ever disable interrupt remapping on AMD boxes? Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-May-25 14:32 UTC
[Xen-devel] [PATCH 1 of 6] APIC: record local APIC state on boot
Xen does not store the boot local APIC state which leads to problems when shutting down for a kexec jump. This patch records the boot state so we can return to the boot state when kexec''ing. Section 3.8 of the MultiProcessor spec states that when the bios hands over to the operating system, all AP locals APICs should be disabled, and the BSP local APIC is up to the bios with regards to compatability. Therefore, we only need to record the BSP mode. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> diff -r 37c77bacb52a -r 8d30fccc0771 xen/arch/x86/apic.c --- a/xen/arch/x86/apic.c Mon May 23 17:38:28 2011 +0100 +++ b/xen/arch/x86/apic.c Wed May 25 15:10:24 2011 +0100 @@ -74,6 +74,8 @@ u8 __read_mostly apic_verbosity; static bool_t __initdata opt_x2apic = 1; boolean_param("x2apic", opt_x2apic); +enum apic_mode __read_mostly apic_boot_mode = APIC_MODE_INVALID; + bool_t __read_mostly x2apic_enabled = 0; bool_t __read_mostly directed_eoi_enabled = 0; @@ -1437,8 +1439,64 @@ int __init APIC_init_uniprocessor (void) return 0; } +/* Needs to be called once per CPU during startup. It records the state the BIOS + * leaves the local APIC so we can tare back down upon shutdown/crash + */ +void __init record_boot_APIC_mode(void) +{ + + /* Sanity check - we should only ever run once, but could possibly be called + * several times */ + if( APIC_MODE_INVALID != apic_boot_mode ) + return; + + apic_boot_mode = current_local_apic_mode(); + + apic_printk(APIC_DEBUG, "APIC boot state is ''%s'' on core #%d\n", + apic_mode_to_str(apic_boot_mode), smp_processor_id()); +} + +/* Look at the bits in MSR_IA32_APICBASE and work out which APIC mode we are in */ +const enum apic_mode current_local_apic_mode(void) +{ + u64 msr_contents; + + rdmsrl(MSR_IA32_APICBASE, msr_contents); + + /* Reading EXTD bit from the MSR is only valid if CPUID says so, else reserved */ + if ( cpu_has(¤t_cpu_data, X86_FEATURE_X2APIC) + && (msr_contents & MSR_IA32_APICBASE_EXTD) ) + return APIC_MODE_X2APIC; + else + { + /* EN bit should always be valid as long as we can read the MSR + */ + if ( msr_contents & MSR_IA32_APICBASE_ENABLE ) + return APIC_MODE_XAPIC; + else + return APIC_MODE_DISABLED; + } +} + void check_for_unexpected_msi(unsigned int vector) { unsigned long v = apic_read(APIC_ISR + ((vector & ~0x1f) >> 1)); BUG_ON(v & (1 << (vector & 0x1f))); } + +const char * apic_mode_to_str(const enum apic_mode mode) +{ + switch(mode) + { + case APIC_MODE_INVALID: + return "invalid"; + case APIC_MODE_DISABLED: + return "disabled"; + case APIC_MODE_XAPIC: + return "xapic"; + case APIC_MODE_X2APIC: + return "x2apic"; + default: + return "unrecognised"; + } +} diff -r 37c77bacb52a -r 8d30fccc0771 xen/arch/x86/genapic/probe.c --- a/xen/arch/x86/genapic/probe.c Mon May 23 17:38:28 2011 +0100 +++ b/xen/arch/x86/genapic/probe.c Wed May 25 15:10:24 2011 +0100 @@ -60,6 +60,8 @@ void __init generic_apic_probe(void) { int i, changed; + record_boot_APIC_mode(); + check_x2apic_preenabled(); cmdline_apic = changed = (genapic != NULL); diff -r 37c77bacb52a -r 8d30fccc0771 xen/include/asm-x86/apic.h --- a/xen/include/asm-x86/apic.h Mon May 23 17:38:28 2011 +0100 +++ b/xen/include/asm-x86/apic.h Wed May 25 15:10:24 2011 +0100 @@ -21,6 +21,19 @@ #define IO_APIC_REDIR_DEST_LOGICAL 0x00800 #define IO_APIC_REDIR_DEST_PHYSICAL 0x00000 +/* Possible APIC states */ +enum apic_mode { APIC_MODE_INVALID, /* Not set yet */ + APIC_MODE_DISABLED, /* Some bioses disable by default for compatability */ + APIC_MODE_XAPIC, /* xAPIC mode - default upon chipset reset */ + APIC_MODE_X2APIC /* x2APIC mode - common for large smp machines */ +}; + +/* Bootstrap processor local APIC boot mode - so we can taredown to bios state */ +extern enum apic_mode apic_boot_mode; + +/* enum apic_mode -> str function for logging/debug */ +const char * apic_mode_to_str(const enum apic_mode); + extern u8 apic_verbosity; extern bool_t x2apic_enabled; extern bool_t directed_eoi_enabled; @@ -203,6 +216,8 @@ extern void disable_APIC_timer(void); extern void enable_APIC_timer(void); extern int lapic_suspend(void); extern int lapic_resume(void); +extern void record_boot_APIC_mode(void); +extern const enum apic_mode current_local_apic_mode(void); extern int check_nmi_watchdog (void); _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-May-25 14:32 UTC
[Xen-devel] [PATCH 2 of 6] APIC: remove ''enabled_via_apicbase'' variable
The use of this varable was only sensible when the choice for local APIC mode was between enabled or disabled. Now that x2apic is about, it is wrong, and causes a protection fault in certain cases when trying to tear down x2apic mode. The only place where its use is relevent in the code is in disable_local_APIC which is very out of date. It has now been updated to be in line with the MultiProcessor Spec, as well as being able to successfully shut down from x2apic mode without the possibility of throwing a protection fault (which leads to a general protection fault). Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> diff -r 8d30fccc0771 -r 17991cc69e88 xen/arch/x86/apic.c --- a/xen/arch/x86/apic.c Wed May 25 15:10:24 2011 +0100 +++ b/xen/arch/x86/apic.c Wed May 25 15:10:52 2011 +0100 @@ -165,8 +165,6 @@ void __init apic_intr_init(void) /* Using APIC to generate smp_local_timer_interrupt? */ static bool_t __read_mostly using_apic_timer; -static bool_t __read_mostly enabled_via_apicbase; - int get_physical_broadcast(void) { if (modern_apic()) @@ -330,6 +328,8 @@ void disconnect_bsp_APIC(int virt_wire_s void disable_local_APIC(void) { + u64 msr_contents; + clear_local_APIC(); /* @@ -339,10 +339,39 @@ void disable_local_APIC(void) apic_write_around(APIC_SPIV, apic_read(APIC_SPIV) & ~APIC_SPIV_APIC_ENABLED); - if (enabled_via_apicbase) { - uint64_t msr_content; - rdmsrl(MSR_IA32_APICBASE, msr_content); - wrmsrl(MSR_IA32_APICBASE, msr_content & ~MSR_IA32_APICBASE_ENABLE); + /* Always disable the APIC */ + rdmsrl(MSR_IA32_APICBASE, msr_contents); + msr_contents &= ~ ( MSR_IA32_APICBASE_ENABLE | MSR_IA32_APICBASE_EXTD ); + wrmsrl(MSR_IA32_APICBASE, msr_contents); + + /* If we are the boot processor, stick the local apic back to how we found + * it on boot. */ + if( smp_processor_id() != 0 ) + { + + switch(apic_boot_mode) + { + case APIC_MODE_DISABLED: + break; /* Nothing to do - we did this above */ + case APIC_MODE_XAPIC: + { + msr_contents |= MSR_IA32_APICBASE_ENABLE; + wrmsrl(MSR_IA32_APICBASE, msr_contents); + break; + } + case APIC_MODE_X2APIC: + { + msr_contents |= ( MSR_IA32_APICBASE_ENABLE | MSR_IA32_APICBASE_EXTD ); + wrmsrl(MSR_IA32_APICBASE, msr_contents); + break; + } + default: + { + printk("Hit default case when reverting lapic to boot state on core #%d\n", + smp_processor_id()); + break; + } + } } } @@ -874,7 +903,6 @@ static int __init detect_init_APIC (void wrmsrl(MSR_IA32_APICBASE, msr_content | MSR_IA32_APICBASE_ENABLE | APIC_DEFAULT_PHYS_BASE); - enabled_via_apicbase = 1; } } /* _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-May-25 14:32 UTC
[Xen-devel] [PATCH 3 of 6] APIC: add crash_disable_local_APIC
This new function is needed because on an SMP system, we are not likely to be crashing on the BootStrap Processor. As a result, we need to leave the crashing CPU in the state that the BSP was booted with, so that the kdump kernel can take over as if it were booting on the real BSP. diff -r 17991cc69e88 -r 2f91c312ade5 xen/arch/x86/apic.c --- a/xen/arch/x86/apic.c Wed May 25 15:10:52 2011 +0100 +++ b/xen/arch/x86/apic.c Wed May 25 15:11:58 2011 +0100 @@ -375,6 +375,56 @@ void disable_local_APIC(void) } } +void crash_disable_local_APIC(bool_t crashing_cpu) +{ + u64 msr_contents; + + clear_local_APIC(); + + /* + * Disable APIC (implies clearing of registers + * for 82489DX!). + */ + apic_write_around(APIC_SPIV, + apic_read(APIC_SPIV) & ~APIC_SPIV_APIC_ENABLED); + + /* Always disable the APIC */ + rdmsrl(MSR_IA32_APICBASE, msr_contents); + msr_contents &= ~ ( MSR_IA32_APICBASE_ENABLE | MSR_IA32_APICBASE_EXTD ); + wrmsrl(MSR_IA32_APICBASE, msr_contents); + + /* If we are the crashing processor, stick the local apic back to how we found + * it on boot, on the bootstrap processor. The crashdump kernel knows that it + * may not be booting on the real bootstrap processor. */ + if( crashing_cpu ) + { + switch(apic_boot_mode) + { + case APIC_MODE_DISABLED: + break; /* Nothing to do - we did this above */ + case APIC_MODE_XAPIC: + { + msr_contents |= MSR_IA32_APICBASE_ENABLE; + wrmsrl(MSR_IA32_APICBASE, msr_contents); + break; + } + case APIC_MODE_X2APIC: + { + msr_contents |= ( MSR_IA32_APICBASE_ENABLE | MSR_IA32_APICBASE_EXTD ); + wrmsrl(MSR_IA32_APICBASE, msr_contents); + break; + } + default: + { + printk("Hit default case when reverting lapic to boot state on core #%d\n", + smp_processor_id()); + break; + } + } + } +} + + /* * This is to verify that we''re looking at a real local APIC. * Check these against your board if the CPUs aren''t getting diff -r 17991cc69e88 -r 2f91c312ade5 xen/include/asm-x86/apic.h --- a/xen/include/asm-x86/apic.h Wed May 25 15:10:52 2011 +0100 +++ b/xen/include/asm-x86/apic.h Wed May 25 15:11:58 2011 +0100 @@ -195,6 +195,7 @@ extern void clear_local_APIC(void); extern void connect_bsp_APIC (void); extern void disconnect_bsp_APIC (int virt_wire_setup); extern void disable_local_APIC (void); +extern void crash_disable_local_APIC (bool_t crashing_cpu); extern int verify_local_APIC (void); extern void cache_APIC_registers (void); extern void sync_Arb_IDs (void); _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-May-25 14:32 UTC
[Xen-devel] [PATCH 4 of 6] IOMMU: Sanitise some of our pointer work
This is not related to the rest of my kdump changes, but as pointed out by Konrad in a previous thread, we really should make these checks before blindly calling them. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> diff -r 2f91c312ade5 -r 80f7a773887d xen/drivers/passthrough/iommu.c --- a/xen/drivers/passthrough/iommu.c Wed May 25 15:11:58 2011 +0100 +++ b/xen/drivers/passthrough/iommu.c Wed May 25 15:12:23 2011 +0100 @@ -407,17 +407,17 @@ unsigned int iommu_read_apic_from_ire(un return ops->read_apic_from_ire(apic, reg); } -void iommu_resume() +void iommu_resume(void) { const struct iommu_ops *ops = iommu_get_ops(); - if ( iommu_enabled ) + if ( iommu_enabled && ops && ops->resume ) ops->resume(); } -void iommu_suspend() +void iommu_suspend(void) { const struct iommu_ops *ops = iommu_get_ops(); - if ( iommu_enabled ) + if ( iommu_enabled && ops && ops->resume ) ops->suspend(); } _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-May-25 14:32 UTC
[Xen-devel] [PATCH 5 of 6] IOMMU: add crash_shutdown iommu_op
The kdump kernel has problems booting with interrupt/dma remapping enabled, so we need a new iommu_ops called crash_shutdown which is basically suspend but doesn''t need to bother saving state. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> diff -r 80f7a773887d -r d3026545e9a0 xen/drivers/passthrough/amd/iommu_init.c --- a/xen/drivers/passthrough/amd/iommu_init.c Wed May 25 15:12:23 2011 +0100 +++ b/xen/drivers/passthrough/amd/iommu_init.c Wed May 25 15:12:37 2011 +0100 @@ -921,6 +921,14 @@ void amd_iommu_suspend(void) disable_iommu(iommu); } +void amd_iommu_crash_shutdown(void) +{ + struct amd_iommu *iommu; + + for_each_amd_iommu ( iommu ) + disable_iommu(iommu); +} + void amd_iommu_resume(void) { struct amd_iommu *iommu; diff -r 80f7a773887d -r d3026545e9a0 xen/drivers/passthrough/amd/pci_amd_iommu.c --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c Wed May 25 15:12:23 2011 +0100 +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c Wed May 25 15:12:37 2011 +0100 @@ -459,4 +459,5 @@ const struct iommu_ops amd_iommu_ops = { .suspend = amd_iommu_suspend, .resume = amd_iommu_resume, .share_p2m = amd_iommu_share_p2m, + .crash_shutdown = amd_iommu_crash_shutdown, }; diff -r 80f7a773887d -r d3026545e9a0 xen/drivers/passthrough/iommu.c --- a/xen/drivers/passthrough/iommu.c Wed May 25 15:12:23 2011 +0100 +++ b/xen/drivers/passthrough/iommu.c Wed May 25 15:12:37 2011 +0100 @@ -421,6 +421,13 @@ void iommu_suspend(void) ops->suspend(); } +void iommu_crash_shutdown(void) +{ + const struct iommu_ops *ops = iommu_get_ops(); + if ( ops && ops->crash_shutdown ) + ops->crash_shutdown(); +} + void iommu_share_p2m_table(struct domain* d) { const struct iommu_ops *ops = iommu_get_ops(); diff -r 80f7a773887d -r d3026545e9a0 xen/drivers/passthrough/vtd/iommu.c --- a/xen/drivers/passthrough/vtd/iommu.c Wed May 25 15:12:23 2011 +0100 +++ b/xen/drivers/passthrough/vtd/iommu.c Wed May 25 15:12:37 2011 +0100 @@ -2238,6 +2238,25 @@ static void vtd_suspend(void) } } +static void vtd_crash_shutdown(void) +{ + struct acpi_drhd_unit *drhd; + struct iommu *iommu; + + if ( !iommu_enabled ) + return; + + iommu_flush_all(); + + for_each_drhd_unit ( drhd ) + { + iommu = drhd->iommu; + iommu_disable_translation(iommu); + } + + iommu_disable_x2apic_IR(); +} + static void vtd_resume(void) { struct acpi_drhd_unit *drhd; @@ -2289,6 +2308,7 @@ const struct iommu_ops intel_iommu_ops .suspend = vtd_suspend, .resume = vtd_resume, .share_p2m = iommu_set_pgd, + .crash_shutdown = vtd_crash_shutdown, }; /* diff -r 80f7a773887d -r d3026545e9a0 xen/include/asm-x86/hvm/svm/amd-iommu-proto.h --- a/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h Wed May 25 15:12:23 2011 +0100 +++ b/xen/include/asm-x86/hvm/svm/amd-iommu-proto.h Wed May 25 15:12:37 2011 +0100 @@ -98,6 +98,7 @@ unsigned int amd_iommu_read_ioapic_from_ /* power management support */ void amd_iommu_resume(void); void amd_iommu_suspend(void); +void amd_iommu_crash_shutdown(void); static inline u32 get_field_from_reg_u32(u32 reg_value, u32 mask, u32 shift) { diff -r 80f7a773887d -r d3026545e9a0 xen/include/xen/iommu.h --- a/xen/include/xen/iommu.h Wed May 25 15:12:23 2011 +0100 +++ b/xen/include/xen/iommu.h Wed May 25 15:12:37 2011 +0100 @@ -133,6 +133,7 @@ struct iommu_ops { void (*suspend)(void); void (*resume)(void); void (*share_p2m)(struct domain *d); + void (*crash_shutdown)(void); }; void iommu_update_ire_from_apic(unsigned int apic, unsigned int reg, unsigned int value); @@ -142,6 +143,7 @@ unsigned int iommu_read_apic_from_ire(un void iommu_suspend(void); void iommu_resume(void); +void iommu_crash_shutdown(void); void iommu_set_dom0_mapping(struct domain *d); void iommu_share_p2m_table(struct domain *d); _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-May-25 14:32 UTC
[Xen-devel] [PATCH 6 of 6] KEXEC: disable iommu jumping into the kdump kernel
kdump kernels are unable to boot with IOMMU enabled. this patch disabled IOMMU mode and removes some of the generic code from the shutdown path which doesnt work after other CPUs have been shot down. Because we need to replace the calls to disable_local_APIC, we remove the calls to __stop_this_cpu and replace them inline, with suitable modification regarding interrupts and local APICs. At the bottom of nmi_shootdown_cpus, remove the call to local_irq_enable as this causes us to jump into purgatory with the interrupt flag enabled. From a quick qrep through the current kexec-tools source, there is a fair amount of time before the interrupt flag is touched, meaning that we could possibly be servicing interrupts in the Xen context even though we have really crashed and left. hpet_disable_legacy_broadcast has been split sideways into a crash version which forgoes the locks (as only 1 cpu is still running by this point), and forgoes the IPI to all other processors. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> diff -r d3026545e9a0 -r cb005f9078d3 xen/arch/x86/crash.c --- a/xen/arch/x86/crash.c Wed May 25 15:12:37 2011 +0100 +++ b/xen/arch/x86/crash.c Wed May 25 15:30:43 2011 +0100 @@ -27,6 +27,7 @@ #include <asm/hvm/support.h> #include <asm/apic.h> #include <asm/io_apic.h> +#include <xen/iommu.h> static atomic_t waiting_for_crash_ipi; static unsigned int crashing_cpu; @@ -43,7 +44,10 @@ static int crash_nmi_callback(struct cpu kexec_crash_save_cpu(); - __stop_this_cpu(); + crash_disable_local_APIC(FALSE); + hvm_cpu_down(); + clts(); + asm volatile ( "fninit" ); atomic_dec(&waiting_for_crash_ipi); @@ -77,10 +81,20 @@ static void nmi_shootdown_cpus(void) msecs--; } - __stop_this_cpu(); + crash_disable_local_APIC(TRUE); + hvm_cpu_down(); + clts(); + asm volatile ( "fninit" ); + + /* This is a bit of a hack due to the problems with the x2apic_enabled + * variable, but we can''t do any better without a significant refactoring + * of the APIC code */ + if ( current_local_apic_mode() == APIC_MODE_X2APIC ) + x2apic_enabled = 1; + else + x2apic_enabled = 0; + disable_IO_APIC(); - - local_irq_enable(); } void machine_crash_shutdown(void) @@ -89,6 +103,10 @@ void machine_crash_shutdown(void) nmi_shootdown_cpus(); + /* Crash shutdown any IOMMU functionality as the crashdump kernel is not + * happy when booting if interrupt/dma remapping is still enabled */ + iommu_crash_shutdown(); + info = kexec_crash_save_info(); info->xen_phys_start = xen_phys_start; info->dom0_pfn_to_mfn_frame_list_list diff -r d3026545e9a0 -r cb005f9078d3 xen/arch/x86/hpet.c --- a/xen/arch/x86/hpet.c Wed May 25 15:12:37 2011 +0100 +++ b/xen/arch/x86/hpet.c Wed May 25 15:30:43 2011 +0100 @@ -670,6 +670,32 @@ void hpet_disable_legacy_broadcast(void) smp_send_event_check_mask(&cpu_online_map); } +/* This function is similar to the regular + * hpet_disable_legacy_broadcast function, except it is called + * on the crash path with only the current processor up, so we + * can forget the locks and really cant send an event check IPI + * to the other processors */ +void crash_hpet_disable_legacy_broadcast(void) +{ + u32 cfg; + + if ( !hpet_events || !(hpet_events->flags & HPET_EVT_LEGACY) ) + return; + + hpet_events->flags |= HPET_EVT_DISABLE; + + /* disable HPET T0 */ + cfg = hpet_read32(HPET_Tn_CFG(0)); + cfg &= ~HPET_TN_ENABLE; + hpet_write32(cfg, HPET_Tn_CFG(0)); + + /* Stop HPET legacy interrupts */ + cfg = hpet_read32(HPET_CFG); + cfg &= ~HPET_CFG_LEGACY; + hpet_write32(cfg, HPET_CFG); + +} + void hpet_broadcast_enter(void) { unsigned int cpu = smp_processor_id(); diff -r d3026545e9a0 -r cb005f9078d3 xen/arch/x86/machine_kexec.c --- a/xen/arch/x86/machine_kexec.c Wed May 25 15:12:37 2011 +0100 +++ b/xen/arch/x86/machine_kexec.c Wed May 25 15:30:43 2011 +0100 @@ -89,7 +89,7 @@ void machine_kexec(xen_kexec_image_t *im }; if ( hpet_broadcast_is_available() ) - hpet_disable_legacy_broadcast(); + crash_hpet_disable_legacy_broadcast(); /* * compat_machine_kexec() returns to idle pagetables, which requires us diff -r d3026545e9a0 -r cb005f9078d3 xen/include/asm-x86/hpet.h --- a/xen/include/asm-x86/hpet.h Wed May 25 15:12:37 2011 +0100 +++ b/xen/include/asm-x86/hpet.h Wed May 25 15:30:43 2011 +0100 @@ -73,5 +73,6 @@ void hpet_broadcast_enter(void); void hpet_broadcast_exit(void); int hpet_broadcast_is_available(void); void hpet_disable_legacy_broadcast(void); +void crash_hpet_disable_legacy_broadcast(void); #endif /* __X86_HPET_H__ */ _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On Wednesday 25 May 2011 16:32:02 Andrew Cooper wrote:> This set of patches is designed to get the kexec path working again on Xen > 4.x > > kdump kernels can''t boot if x2apic mode is enabled and the ACPI tables dont > state this fact. They also cant boot at all with interrupt remapping > enabled. > > These patches cause xen to track the BSP local APIC boot state and return > to it before kexec''ing to a new kernel. It also makes sure to disable IO > virtualisation. > > One area which is problematic is disabling interrupt remapping. > lapic_suspend() calls iommu_disable_x2apic_IR() which in a previous thread > was deemed to be Intel specific and only works by chance on AMD boxes by > effectivly being a NOP. As lapic_suspend() is generic code, does this mean > that we can''t/don''t ever disable interrupt remapping on AMD boxes?Yes, there is no explicit way to disable amd iommu like disable_intremap() for vtd. The reason is that interrupt remapping is enabled per device and there is no global flag to disable it. We need to visit every device entry and then disable per device, but it still sounds doable... Maybe we could make disable_intremap() generic for both Intel and AMD? Thanks, Wei> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2011-May-25 16:14 UTC
Re: [Xen-devel] [PATCH 0 of 6] Fix kexec in Xen (take 3)
>>> On 25.05.11 at 16:32, Andrew Cooper <andrew.cooper3@citrix.com> wrote: > This set of patches is designed to get the kexec path working again on Xen > 4.x > > kdump kernels can''t boot if x2apic mode is enabled and the ACPI tables dont > state this fact. They also cant boot at all with interrupt remapping > enabled. > > These patches cause xen to track the BSP local APIC boot state and return to > it before kexec''ing to a new kernel. It also makes sure to disable IO > virtualisation.I was about to reply to the individual patches, but they just seem too inconsistent to me (comments not matching code, without it being clear whether code or comment is wrong; functions introduced that have no callers). Can you work on getting them into a state suitable for reviewing? Further I don''t buy your pseudo-quoting of the MP spec saying that secondary CPUs'' local APICs have to be disabled. Keir already pointed out on your previous submission that in order for them to receive the INIT and Startup IPIs they must be enabled. Jan> One area which is problematic is disabling interrupt remapping. > lapic_suspend() calls iommu_disable_x2apic_IR() which in a previous thread > was deemed to be Intel specific and only works by chance on AMD boxes by > effectivly being a NOP. As lapic_suspend() is generic code, does this mean > that we can''t/don''t ever disable interrupt remapping on AMD boxes? > > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-May-25 16:30 UTC
Re: [Xen-devel] [PATCH 0 of 6] Fix kexec in Xen (take 3)
On 25/05/11 17:14, Jan Beulich wrote:>>>> On 25.05.11 at 16:32, Andrew Cooper<andrew.cooper3@citrix.com> wrote: >> This set of patches is designed to get the kexec path working again on Xen >> 4.x >> >> kdump kernels can''t boot if x2apic mode is enabled and the ACPI tables dont >> state this fact. They also cant boot at all with interrupt remapping >> enabled. >> >> These patches cause xen to track the BSP local APIC boot state and return to >> it before kexec''ing to a new kernel. It also makes sure to disable IO >> virtualisation. > I was about to reply to the individual patches, but they just seem > too inconsistent to me (comments not matching code, without it > being clear whether code or comment is wrong; functions introduced > that have no callers). Can you work on getting them into a > state suitable for reviewing?I was splitting the patches up to make them smaller and modular. With the patches as a full series, there are no functions without callers. Which comments don''t match the code?> Further I don''t buy your pseudo-quoting of the MP spec saying > that secondary CPUs'' local APICs have to be disabled. Keir already > pointed out on your previous submission that in order for them to > receive the INIT and Startup IPIs they must be enabled.What Keir said and what the MP spec states are in direct contraction. Please do correct me if I have misread/misinterpreted the spec, but: Section 3.8 states that all local APICs are disabled when the BIOS hands over to the OS. and Section 3.7.3 states that the INIT IPI twiddles the APIC reset lines, which enabled them when they come out of reset, thus receiving and handling the IPI. ~Andrew> Jan > >> One area which is problematic is disabling interrupt remapping. >> lapic_suspend() calls iommu_disable_x2apic_IR() which in a previous thread >> was deemed to be Intel specific and only works by chance on AMD boxes by >> effectivly being a NOP. As lapic_suspend() is generic code, does this mean >> that we can''t/don''t ever disable interrupt remapping on AMD boxes? >> >> Signed-off-by: Andrew Cooper<andrew.cooper3@citrix.com> >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xensource.com >> http://lists.xensource.com/xen-devel > >-- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-May-25 17:28 UTC
Re: [Xen-devel] [PATCH 4 of 6] IOMMU: Sanitise some of our pointer work
On Wed, May 25, 2011 at 03:32:06PM +0100, Andrew Cooper wrote:> This is not related to the rest of my kdump changes, but as pointed > out by Konrad in a previous thread, we really should make these checks > before blindly calling them. >You might want to change the title.> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > > diff -r 2f91c312ade5 -r 80f7a773887d xen/drivers/passthrough/iommu.c > --- a/xen/drivers/passthrough/iommu.c Wed May 25 15:11:58 2011 +0100 > +++ b/xen/drivers/passthrough/iommu.c Wed May 25 15:12:23 2011 +0100 > @@ -407,17 +407,17 @@ unsigned int iommu_read_apic_from_ire(un > return ops->read_apic_from_ire(apic, reg); > } > > -void iommu_resume() > +void iommu_resume(void) > { > const struct iommu_ops *ops = iommu_get_ops(); > - if ( iommu_enabled ) > + if ( iommu_enabled && ops && ops->resume ) > ops->resume(); > } > > -void iommu_suspend() > +void iommu_suspend(void) > { > const struct iommu_ops *ops = iommu_get_ops(); > - if ( iommu_enabled ) > + if ( iommu_enabled && ops && ops->resume )resume? Not suspend?> ops->suspend(); > } > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2011-May-25 21:35 UTC
Re: [Xen-devel] [PATCH 0 of 6] Fix kexec in Xen (take 3)
On 25/05/2011 17:30, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:>> I was about to reply to the individual patches, but they just seem >> too inconsistent to me (comments not matching code, without it >> being clear whether code or comment is wrong; functions introduced >> that have no callers). Can you work on getting them into a >> state suitable for reviewing? > I was splitting the patches up to make them smaller and modular. With > the patches as a full series, there are no functions without callers. > > Which comments don''t match the code?+ /* If we are the boot processor, stick the local apic back to how we found + * it on boot. */ + if( smp_processor_id() != 0 ) It''s a pretty fundamental one.>> Further I don''t buy your pseudo-quoting of the MP spec saying >> that secondary CPUs'' local APICs have to be disabled. Keir already >> pointed out on your previous submission that in order for them to >> receive the INIT and Startup IPIs they must be enabled. > What Keir said and what the MP spec states are in direct contraction. > Please do correct me if I have misread/misinterpreted the spec, but: > > Section 3.8 states that all local APICs are disabled when the BIOS hands > over to the OS. > > and > > Section 3.7.3 states that the INIT IPI twiddles the APIC reset lines, > which enabled them when they come out of reset, thus receiving and > handling the IPI.You are quoting from a spec that is nearly 15 years old, and particularly addressing 486 and older systems with discrete APICs. Xen has never run on such systems. A better reference for APIC behaviour is Chapter 10 of Volume 3A of the Intel Software Developer Manual. See 10.4.7.1 particularly. The APIC is software disabled on startup -- meaning that the enable bit in the SPIV register is clear. That is quite different from *hardware* disable (via the APICBASE MSR) which your patch attempts to deal with. In this latter case the APIC would be totally shut down and it would not be possible to INIT-SIPI the secondary processor. The software disable (via SPIV) is very much a semi-disabled state (and disable_local_APIC() already returns an APIC to that state). -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-May-26 09:12 UTC
Re: [Xen-devel] [PATCH 0 of 6] Fix kexec in Xen (take 3)
On 25/05/11 22:35, Keir Fraser wrote:> On 25/05/2011 17:30, "Andrew Cooper"<andrew.cooper3@citrix.com> wrote: > >>> I was about to reply to the individual patches, but they just seem >>> too inconsistent to me (comments not matching code, without it >>> being clear whether code or comment is wrong; functions introduced >>> that have no callers). Can you work on getting them into a >>> state suitable for reviewing? >> I was splitting the patches up to make them smaller and modular. With >> the patches as a full series, there are no functions without callers. >> >> Which comments don''t match the code? > + /* If we are the boot processor, stick the local apic back to how we > found > + * it on boot. */ > + if( smp_processor_id() != 0 ) > > It''s a pretty fundamental one.Oops - point taken. I should never attempt to do a trivial cleanup of code.>>> Further I don''t buy your pseudo-quoting of the MP spec saying >>> that secondary CPUs'' local APICs have to be disabled. Keir already >>> pointed out on your previous submission that in order for them to >>> receive the INIT and Startup IPIs they must be enabled. >> What Keir said and what the MP spec states are in direct contraction. >> Please do correct me if I have misread/misinterpreted the spec, but: >> >> Section 3.8 states that all local APICs are disabled when the BIOS hands >> over to the OS. >> >> and >> >> Section 3.7.3 states that the INIT IPI twiddles the APIC reset lines, >> which enabled them when they come out of reset, thus receiving and >> handling the IPI. > You are quoting from a spec that is nearly 15 years old, and particularly > addressing 486 and older systems with discrete APICs. Xen has never run on > such systems. > > A better reference for APIC behaviour is Chapter 10 of Volume 3A of the > Intel Software Developer Manual. See 10.4.7.1 particularly. The APIC is > software disabled on startup -- meaning that the enable bit in the SPIV > register is clear. That is quite different from *hardware* disable (via the > APICBASE MSR) which your patch attempts to deal with. In this latter case > the APIC would be totally shut down and it would not be possible to > INIT-SIPI the secondary processor. The software disable (via SPIV) is very > much a semi-disabled state (and disable_local_APIC() already returns an APIC > to that state). > > -- Keir >Ok - I will read up on this more, and then I guess I have some code to change. -- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2011-May-26 09:19 UTC
Re: [Xen-devel] [PATCH 0 of 6] Fix kexec in Xen (take 3)
On 26/05/2011 10:12, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:>> A better reference for APIC behaviour is Chapter 10 of Volume 3A of the >> Intel Software Developer Manual. See 10.4.7.1 particularly. The APIC is >> software disabled on startup -- meaning that the enable bit in the SPIV >> register is clear. That is quite different from *hardware* disable (via the >> APICBASE MSR) which your patch attempts to deal with. In this latter case >> the APIC would be totally shut down and it would not be possible to >> INIT-SIPI the secondary processor. The software disable (via SPIV) is very >> much a semi-disabled state (and disable_local_APIC() already returns an APIC >> to that state). >> >> -- Keir >> > Ok - I will read up on this more, and then I guess I have some code to > change.Yes, please do. Also if you can please try to avoid making crash-specific versions of cleanup/shutdown functions. I would actually rather have a global variable indicating I-am-crashing-on-cpu-x, and have at that from the existing shutdown functions to modify their behaviour. If you can fix that plus curb your impulse to do more to the APIC code than necessary then we have a starting point for further review. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel