Andrew Cooper
2011-May-18 18:08 UTC
[Xen-devel] [PATCH 0 of 3] Fix kexec path in xen (take 2)
This set of patches is the minimal subset required to get the kexec path working again on Xen 4.x kdump kernels can''t boot if x2apic mode is enabled and the ACPI tables dont state this fact. They also cant boot at all with interrupt remapping enabled. These patches cause xen to track the local apic boot state and return to it before kexec''ing to a new kernel. It also makes sure to disable IO virtualisation. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-May-18 18:08 UTC
[Xen-devel] [PATCH 1 of 3] apic: record local apic state on boot
Xen does not store the boot local apic state which leads to problems when shutting down for a kexec jump. This patch records the boot state so we can return to the boot state when kexec''ing. This is per CPU because all 3 bioses on the boxes I have tested dont enabled all local apics on boot. As a result, we have to return to the bios state so the ACPI tables match up with the hardware state for the booting kernel. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> diff -r f531ed84b066 -r 62a8ce6595ad xen/arch/x86/apic.c --- a/xen/arch/x86/apic.c Tue May 17 17:32:19 2011 +0100 +++ b/xen/arch/x86/apic.c Wed May 18 19:00:13 2011 +0100 @@ -74,6 +74,8 @@ u8 __read_mostly apic_verbosity; static bool_t __initdata opt_x2apic = 1; boolean_param("x2apic", opt_x2apic); +DEFINE_PER_CPU_READ_MOSTLY(enum apic_mode, apic_boot_mode) = APIC_MODE_INVALID; + bool_t __read_mostly x2apic_enabled = 0; bool_t __read_mostly directed_eoi_enabled = 0; @@ -1437,6 +1439,41 @@ int __init APIC_init_uniprocessor (void) return 0; } +/* Needs to be called once per CPU during startup. It records the state the BIOS + * leaves the local APIC so we can tare back down upon shutdown/crash + */ +void __init record_boot_APIC_mode(void) +{ + enum apic_mode this_apic_mode; + u64 msr_contents; + + this_apic_mode = APIC_MODE_INVALID; + + /* Sanity check - we should only ever run once */ + BUG_ON( APIC_MODE_INVALID != this_cpu(apic_boot_mode) ); + + rdmsrl(MSR_IA32_APICBASE, msr_contents); + + /* Reading EXTD bit from the MSR is only valid if CPUID says so, else reserved */ + if ( cpu_has(¤t_cpu_data, X86_FEATURE_X2APIC) + && (msr_contents & MSR_IA32_APICBASE_EXTD) ) + this_apic_mode = APIC_MODE_X2APIC; + else + { + /* EN bit should always be valid as long as we can read the MSR + * Can anyone confirm this? + */ + if ( msr_contents & MSR_IA32_APICBASE_ENABLE ) + this_apic_mode = APIC_MODE_XAPIC; + else + this_apic_mode = APIC_MODE_DISABLED; + } + + this_cpu(apic_boot_mode) = this_apic_mode; + apic_printk(APIC_DEBUG, "APIC boot state is %d on core #%d\n", + this_apic_mode, smp_processor_id()); +} + void check_for_unexpected_msi(unsigned int vector) { unsigned long v = apic_read(APIC_ISR + ((vector & ~0x1f) >> 1)); diff -r f531ed84b066 -r 62a8ce6595ad xen/arch/x86/genapic/probe.c --- a/xen/arch/x86/genapic/probe.c Tue May 17 17:32:19 2011 +0100 +++ b/xen/arch/x86/genapic/probe.c Wed May 18 19:00:13 2011 +0100 @@ -60,6 +60,8 @@ void __init generic_apic_probe(void) { int i, changed; + record_boot_APIC_mode(); + check_x2apic_preenabled(); cmdline_apic = changed = (genapic != NULL); diff -r f531ed84b066 -r 62a8ce6595ad xen/arch/x86/smpboot.c --- a/xen/arch/x86/smpboot.c Tue May 17 17:32:19 2011 +0100 +++ b/xen/arch/x86/smpboot.c Wed May 18 19:00:13 2011 +0100 @@ -388,6 +388,9 @@ void start_secondary(void *unused) microcode_resume_cpu(cpu); + /* Record boot apic state */ + record_boot_APIC_mode(); + wmb(); startup_cpu_idle_loop(); } diff -r f531ed84b066 -r 62a8ce6595ad xen/include/asm-x86/apic.h --- a/xen/include/asm-x86/apic.h Tue May 17 17:32:19 2011 +0100 +++ b/xen/include/asm-x86/apic.h Wed May 18 19:00:13 2011 +0100 @@ -21,6 +21,16 @@ #define IO_APIC_REDIR_DEST_LOGICAL 0x00800 #define IO_APIC_REDIR_DEST_PHYSICAL 0x00000 +/* Possible APIC states */ +enum apic_mode { APIC_MODE_INVALID, /* Not set yet */ + APIC_MODE_DISABLED, /* Some bioses disable by default for compatability */ + APIC_MODE_XAPIC, /* xAPIC mode - default upon chipset reset */ + APIC_MODE_X2APIC /* x2APIC mode - common for large smp machines */ +}; + +/* PerCPU local APIC boot mode - so we can taredown to bios state */ +DECLARE_PER_CPU(enum apic_mode, apic_boot_mode); + extern u8 apic_verbosity; extern bool_t x2apic_enabled; extern bool_t directed_eoi_enabled; @@ -203,6 +213,7 @@ extern void disable_APIC_timer(void); extern void enable_APIC_timer(void); extern int lapic_suspend(void); extern int lapic_resume(void); +extern void record_boot_APIC_mode(void); extern int check_nmi_watchdog (void); _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-May-18 18:08 UTC
[Xen-devel] [PATCH 2 of 3] apic: remove ''enabled_via_apicbase'' variable
The use of this varable was only sensible when the choice for lapic mode was between enabled or disabled. Now that x2apic is about, it is wrong, and causes a protection fault in certain cases when trying to tare down x2apic mode. The only place where its use is relevent in the code is in disable_local_APIC which has been changed to correctly tare down the local APIC without a protection fault (which leads to a general protection fault). Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> diff -r 62a8ce6595ad -r e80b5280fe2f xen/arch/x86/apic.c --- a/xen/arch/x86/apic.c Wed May 18 19:00:13 2011 +0100 +++ b/xen/arch/x86/apic.c Wed May 18 19:00:13 2011 +0100 @@ -165,8 +165,6 @@ void __init apic_intr_init(void) /* Using APIC to generate smp_local_timer_interrupt? */ static bool_t __read_mostly using_apic_timer; -static bool_t __read_mostly enabled_via_apicbase; - int get_physical_broadcast(void) { if (modern_apic()) @@ -330,6 +328,9 @@ void disconnect_bsp_APIC(int virt_wire_s void disable_local_APIC(void) { + u64 msr_contents; + enum apic_mode current_mode; + clear_local_APIC(); /* @@ -339,10 +340,54 @@ void disable_local_APIC(void) apic_write_around(APIC_SPIV, apic_read(APIC_SPIV) & ~APIC_SPIV_APIC_ENABLED); - if (enabled_via_apicbase) { - uint64_t msr_content; - rdmsrl(MSR_IA32_APICBASE, msr_content); - wrmsrl(MSR_IA32_APICBASE, msr_content & ~MSR_IA32_APICBASE_ENABLE); + /* Work out what apic mode we are in */ + rdmsrl(MSR_IA32_APICBASE, msr_contents); + + /* Reading EXTD bit from the MSR is only valid if CPUID says so, else reserved */ + if ( cpu_has(¤t_cpu_data, X86_FEATURE_X2APIC) + && (msr_contents & MSR_IA32_APICBASE_EXTD) ) + current_mode = APIC_MODE_X2APIC; + else + { + /* EN bit should always be valid as long as we can read the MSR + * Can anyone confirm this? + */ + if ( msr_contents & MSR_IA32_APICBASE_ENABLE ) + current_mode = APIC_MODE_XAPIC; + else + current_mode = APIC_MODE_DISABLED; + } + + /* See what (if anything) we need to do to revert to boot mode */ + if( current_mode != this_cpu(apic_boot_mode) ) + { + rdmsrl(MSR_IA32_APICBASE, msr_contents); + msr_contents &= ~ ( MSR_IA32_APICBASE_ENABLE | MSR_IA32_APICBASE_EXTD ); + wrmsrl(MSR_IA32_APICBASE, msr_contents); + + switch(this_cpu(apic_boot_mode)) + { + case APIC_MODE_DISABLED: + break; /* Nothing to do - we did this above */ + case APIC_MODE_XAPIC: + { + msr_contents |= MSR_IA32_APICBASE_ENABLE; + wrmsrl(MSR_IA32_APICBASE, msr_contents); + break; + } + case APIC_MODE_X2APIC: + { + msr_contents |= ( MSR_IA32_APICBASE_ENABLE | MSR_IA32_APICBASE_EXTD ); + wrmsrl(MSR_IA32_APICBASE, msr_contents); + break; + } + default: + { + printk("Hit default case when reverting lapic to boot state on core #%d\n", + smp_processor_id()); + break; + } + } } } @@ -874,7 +919,6 @@ static int __init detect_init_APIC (void wrmsrl(MSR_IA32_APICBASE, msr_content | MSR_IA32_APICBASE_ENABLE | APIC_DEFAULT_PHYS_BASE); - enabled_via_apicbase = 1; } } /* _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-May-18 18:08 UTC
[Xen-devel] [PATCH 3 of 3] kexec: disable iommu jumping into the kdump kernel
kdump kernels are unable to boot with IOMMU enabled, this patch disabled IOMMU mode and removes some of the generic code from the shutdown path which doesnt work after other CPUs have been shot down. Also, leave local interrupts disabled when jumping into pugatory as we have no idea whats in there and really dont want to be servicing interrupts when our entire state is invalid. Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> diff -r e80b5280fe2f -r aaf44d1a903d xen/arch/x86/crash.c --- a/xen/arch/x86/crash.c Wed May 18 19:00:13 2011 +0100 +++ b/xen/arch/x86/crash.c Wed May 18 19:00:13 2011 +0100 @@ -27,6 +27,8 @@ #include <asm/hvm/support.h> #include <asm/apic.h> #include <asm/io_apic.h> +#include <xen/iommu.h> +#include <asm/hvm/iommu.h> static atomic_t waiting_for_crash_ipi; static unsigned int crashing_cpu; @@ -43,7 +45,10 @@ static int crash_nmi_callback(struct cpu kexec_crash_save_cpu(); - __stop_this_cpu(); + disable_local_APIC(); + hvm_cpu_down(); + clts(); + asm volatile ( "fninit" ); atomic_dec(&waiting_for_crash_ipi); @@ -56,6 +61,7 @@ static int crash_nmi_callback(struct cpu static void nmi_shootdown_cpus(void) { unsigned long msecs; + u64 msr_contents; local_irq_disable(); @@ -77,18 +83,43 @@ static void nmi_shootdown_cpus(void) msecs--; } - __stop_this_cpu(); + disable_local_APIC(); + hvm_cpu_down(); + clts(); + asm volatile ( "fninit" ); + + /* This is a bit of a hack but there is no other way to shutdown correctly + * without a significant refactoring of the APIC code */ + rdmsrl(MSR_IA32_APICBASE, msr_contents); + if ( cpu_has(¤t_cpu_data, X86_FEATURE_X2APIC) + && (msr_contents & MSR_IA32_APICBASE_EXTD) ) + x2apic_enabled = 1; + else + x2apic_enabled = 0; + disable_IO_APIC(); - - local_irq_enable(); } void machine_crash_shutdown(void) { crash_xen_info_t *info; + const struct iommu_ops * ops; nmi_shootdown_cpus(); + /* Yes i know this is hacky but it is the easiest solution. I should add an iommu_ops + * function called crash() or so which just disables the iommu ''fun'' without saving state + */ + ops = iommu_get_ops(); + if(ops) + ops->suspend(); + + /* Yes i know this is from driver/passthrough/vtd/ but it appears to be architecture + * independant, and also bears little/no relation to x2apic. Needs cleaning up + */ + iommu_disable_x2apic_IR(); + + info = kexec_crash_save_info(); info->xen_phys_start = xen_phys_start; info->dom0_pfn_to_mfn_frame_list_list diff -r e80b5280fe2f -r aaf44d1a903d xen/arch/x86/hpet.c --- a/xen/arch/x86/hpet.c Wed May 18 19:00:13 2011 +0100 +++ b/xen/arch/x86/hpet.c Wed May 18 19:00:13 2011 +0100 @@ -670,6 +670,33 @@ void hpet_disable_legacy_broadcast(void) smp_send_event_check_mask(&cpu_online_map); } +/* This function is similar to the regular + * hpet_disable_legacy_broadcast function, except it is called + * on the crash path with only the current processor up, so we + * can forget the locks and really cant send an event check IPI + * to the other processors */ +void crash_hpet_disable_legacy_broadcast(void) +{ + u32 cfg; + + if ( !hpet_events || !(hpet_events->flags & HPET_EVT_LEGACY) ) + return; + + hpet_events->flags |= HPET_EVT_DISABLE; + + /* disable HPET T0 */ + cfg = hpet_read32(HPET_Tn_CFG(0)); + cfg &= ~HPET_TN_ENABLE; + hpet_write32(cfg, HPET_Tn_CFG(0)); + + /* Stop HPET legacy interrupts */ + cfg = hpet_read32(HPET_CFG); + cfg &= ~HPET_CFG_LEGACY; + hpet_write32(cfg, HPET_CFG); + +} + + void hpet_broadcast_enter(void) { unsigned int cpu = smp_processor_id(); diff -r e80b5280fe2f -r aaf44d1a903d xen/arch/x86/machine_kexec.c --- a/xen/arch/x86/machine_kexec.c Wed May 18 19:00:13 2011 +0100 +++ b/xen/arch/x86/machine_kexec.c Wed May 18 19:00:13 2011 +0100 @@ -97,7 +97,7 @@ void machine_kexec(xen_kexec_image_t *im }; if ( hpet_broadcast_is_available() ) - hpet_disable_legacy_broadcast(); + crash_hpet_disable_legacy_broadcast(); /* * compat_machine_kexec() returns to idle pagetables, which requires us diff -r e80b5280fe2f -r aaf44d1a903d xen/include/asm-x86/hpet.h --- a/xen/include/asm-x86/hpet.h Wed May 18 19:00:13 2011 +0100 +++ b/xen/include/asm-x86/hpet.h Wed May 18 19:00:13 2011 +0100 @@ -73,5 +73,6 @@ void hpet_broadcast_enter(void); void hpet_broadcast_exit(void); int hpet_broadcast_is_available(void); void hpet_disable_legacy_broadcast(void); +void crash_hpet_disable_legacy_broadcast(void); #endif /* __X86_HPET_H__ */ _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-May-18 18:39 UTC
Re: [Xen-devel] [PATCH 0 of 3] Fix kexec path in xen (take 2)
On Wed, May 18, 2011 at 07:08:13PM +0100, Andrew Cooper wrote:> This set of patches is the minimal subset required to get the kexec path working again on Xen 4.xYou might want to copy Daniel on these patches: dkiper@net-space.pl as he is working on a project alongside this.> > kdump kernels can''t boot if x2apic mode is enabled and the ACPI tables dont state this fact. They also cant boot at all with interrupt remapping enabled. > > These patches cause xen to track the local apic boot state and return to it before kexec''ing to a new kernel. It also makes sure to disable IO virtualisation. > > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-May-18 18:49 UTC
Re: [Xen-devel] [PATCH 3 of 3] kexec: disable iommu jumping into the kdump kernel
On Wed, May 18, 2011 at 07:08:16PM +0100, Andrew Cooper wrote:> kdump kernels are unable to boot with IOMMU enabled, > this patch disabled IOMMU mode and removes some of the generic > code from the shutdown path which doesnt work after other > CPUs have been shot down. > > Also, leave local interrupts disabled when jumping into pugatorypurgatory?> as we have no idea whats in there and really dont want to be > servicing interrupts when our entire state is invalid. > > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > > diff -r e80b5280fe2f -r aaf44d1a903d xen/arch/x86/crash.c > --- a/xen/arch/x86/crash.c Wed May 18 19:00:13 2011 +0100 > +++ b/xen/arch/x86/crash.c Wed May 18 19:00:13 2011 +0100 > @@ -27,6 +27,8 @@ > #include <asm/hvm/support.h> > #include <asm/apic.h> > #include <asm/io_apic.h> > +#include <xen/iommu.h> > +#include <asm/hvm/iommu.h> > > static atomic_t waiting_for_crash_ipi; > static unsigned int crashing_cpu; > @@ -43,7 +45,10 @@ static int crash_nmi_callback(struct cpu > > kexec_crash_save_cpu(); > > - __stop_this_cpu(); > + disable_local_APIC(); > + hvm_cpu_down(); > + clts(); > + asm volatile ( "fninit" );Can you provide a comment why you are using fninit and clt? Is this what the Linux kernel does too when it goes through the kexec path?> > atomic_dec(&waiting_for_crash_ipi); > > @@ -56,6 +61,7 @@ static int crash_nmi_callback(struct cpu > static void nmi_shootdown_cpus(void) > { > unsigned long msecs; > + u64 msr_contents; > > local_irq_disable(); > > @@ -77,18 +83,43 @@ static void nmi_shootdown_cpus(void) > msecs--; > } > > - __stop_this_cpu(); > + disable_local_APIC(); > + hvm_cpu_down(); > + clts(); > + asm volatile ( "fninit" ); > + > + /* This is a bit of a hack but there is no other way to shutdown correctly > + * without a significant refactoring of the APIC code */ > + rdmsrl(MSR_IA32_APICBASE, msr_contents); > + if ( cpu_has(¤t_cpu_data, X86_FEATURE_X2APIC) > + && (msr_contents & MSR_IA32_APICBASE_EXTD) ) > + x2apic_enabled = 1; > + else > + x2apic_enabled = 0; > + > disable_IO_APIC(); > - > - local_irq_enable();Why?> } > > void machine_crash_shutdown(void) > { > crash_xen_info_t *info; > + const struct iommu_ops * ops; > > nmi_shootdown_cpus(); > > + /* Yes i know this is hacky but it is the easiest solution. I should add an iommu_ops > + * function called crash() or so which just disables the iommu ''fun'' without saving state > + */ > + ops = iommu_get_ops(); > + if(ops) > + ops->suspend();Uh, no checking if ops->suspend exists?> + > + /* Yes i know this is from driver/passthrough/vtd/ but it appears to be architecture > + * independant, and also bears little/no relation to x2apic. Needs cleaning upWhat about AMD VI IOMMUs? Does it work when that IOMMU is used?> + */ > + iommu_disable_x2apic_IR();Can''t that function be done in the suspend code of the IOMMU?> + > + > info = kexec_crash_save_info(); > info->xen_phys_start = xen_phys_start; > info->dom0_pfn_to_mfn_frame_list_list > diff -r e80b5280fe2f -r aaf44d1a903d xen/arch/x86/hpet.c > --- a/xen/arch/x86/hpet.c Wed May 18 19:00:13 2011 +0100 > +++ b/xen/arch/x86/hpet.c Wed May 18 19:00:13 2011 +0100 > @@ -670,6 +670,33 @@ void hpet_disable_legacy_broadcast(void) > smp_send_event_check_mask(&cpu_online_map); > } > > +/* This function is similar to the regular > + * hpet_disable_legacy_broadcast function, except it is called > + * on the crash path with only the current processor up, so we > + * can forget the locks and really cant send an event check IPI > + * to the other processors */ > +void crash_hpet_disable_legacy_broadcast(void) > +{ > + u32 cfg; > + > + if ( !hpet_events || !(hpet_events->flags & HPET_EVT_LEGACY) ) > + return; > + > + hpet_events->flags |= HPET_EVT_DISABLE; > + > + /* disable HPET T0 */ > + cfg = hpet_read32(HPET_Tn_CFG(0)); > + cfg &= ~HPET_TN_ENABLE; > + hpet_write32(cfg, HPET_Tn_CFG(0)); > + > + /* Stop HPET legacy interrupts */ > + cfg = hpet_read32(HPET_CFG); > + cfg &= ~HPET_CFG_LEGACY; > + hpet_write32(cfg, HPET_CFG); > + > +} > + > + > void hpet_broadcast_enter(void) > { > unsigned int cpu = smp_processor_id(); > diff -r e80b5280fe2f -r aaf44d1a903d xen/arch/x86/machine_kexec.c > --- a/xen/arch/x86/machine_kexec.c Wed May 18 19:00:13 2011 +0100 > +++ b/xen/arch/x86/machine_kexec.c Wed May 18 19:00:13 2011 +0100 > @@ -97,7 +97,7 @@ void machine_kexec(xen_kexec_image_t *im > }; > > if ( hpet_broadcast_is_available() ) > - hpet_disable_legacy_broadcast(); > + crash_hpet_disable_legacy_broadcast(); > > /* > * compat_machine_kexec() returns to idle pagetables, which requires us > diff -r e80b5280fe2f -r aaf44d1a903d xen/include/asm-x86/hpet.h > --- a/xen/include/asm-x86/hpet.h Wed May 18 19:00:13 2011 +0100 > +++ b/xen/include/asm-x86/hpet.h Wed May 18 19:00:13 2011 +0100 > @@ -73,5 +73,6 @@ void hpet_broadcast_enter(void); > void hpet_broadcast_exit(void); > int hpet_broadcast_is_available(void); > void hpet_disable_legacy_broadcast(void); > +void crash_hpet_disable_legacy_broadcast(void); > > #endif /* __X86_HPET_H__ */ > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-May-18 18:49 UTC
Re: [Xen-devel] [PATCH 1 of 3] apic: record local apic state on boot
On Wed, May 18, 2011 at 07:08:14PM +0100, Andrew Cooper wrote:> Xen does not store the boot local apic state which leads to problems > when shutting down for a kexec jump. This patch records the boot > state so we can return to the boot state when kexec''ing. > > This is per CPU because all 3 bioses on the boxes I have tested dont > enabled all local apics on boot. As a result, we have to return to > the bios state so the ACPI tables match up with the hardware state > for the booting kernel.Which ACPI table requires this?> > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > > diff -r f531ed84b066 -r 62a8ce6595ad xen/arch/x86/apic.c > --- a/xen/arch/x86/apic.c Tue May 17 17:32:19 2011 +0100 > +++ b/xen/arch/x86/apic.c Wed May 18 19:00:13 2011 +0100 > @@ -74,6 +74,8 @@ u8 __read_mostly apic_verbosity; > static bool_t __initdata opt_x2apic = 1; > boolean_param("x2apic", opt_x2apic); > > +DEFINE_PER_CPU_READ_MOSTLY(enum apic_mode, apic_boot_mode) = APIC_MODE_INVALID; > + > bool_t __read_mostly x2apic_enabled = 0; > bool_t __read_mostly directed_eoi_enabled = 0; > > @@ -1437,6 +1439,41 @@ int __init APIC_init_uniprocessor (void) > return 0; > } > > +/* Needs to be called once per CPU during startup. It records the state the BIOS > + * leaves the local APIC so we can tare back down upon shutdown/crashtare?> + */ > +void __init record_boot_APIC_mode(void) > +{ > + enum apic_mode this_apic_mode; > + u64 msr_contents; > + > + this_apic_mode = APIC_MODE_INVALID; > + > + /* Sanity check - we should only ever run once */ > + BUG_ON( APIC_MODE_INVALID != this_cpu(apic_boot_mode) ); > + > + rdmsrl(MSR_IA32_APICBASE, msr_contents); > + > + /* Reading EXTD bit from the MSR is only valid if CPUID says so, else reserved */ > + if ( cpu_has(¤t_cpu_data, X86_FEATURE_X2APIC) > + && (msr_contents & MSR_IA32_APICBASE_EXTD) ) > + this_apic_mode = APIC_MODE_X2APIC; > + else > + { > + /* EN bit should always be valid as long as we can read the MSR > + * Can anyone confirm this?Email Vivek Goyal. He is the kexec/kdump maintainer.> + */ > + if ( msr_contents & MSR_IA32_APICBASE_ENABLE ) > + this_apic_mode = APIC_MODE_XAPIC; > + else > + this_apic_mode = APIC_MODE_DISABLED; > + } > + > + this_cpu(apic_boot_mode) = this_apic_mode; > + apic_printk(APIC_DEBUG, "APIC boot state is %d on core #%d\n", > + this_apic_mode, smp_processor_id());This begs of a function to convert those enums to strings..> +} > + > void check_for_unexpected_msi(unsigned int vector) > { > unsigned long v = apic_read(APIC_ISR + ((vector & ~0x1f) >> 1)); > diff -r f531ed84b066 -r 62a8ce6595ad xen/arch/x86/genapic/probe.c > --- a/xen/arch/x86/genapic/probe.c Tue May 17 17:32:19 2011 +0100 > +++ b/xen/arch/x86/genapic/probe.c Wed May 18 19:00:13 2011 +0100 > @@ -60,6 +60,8 @@ void __init generic_apic_probe(void) > { > int i, changed; > > + record_boot_APIC_mode(); > +The spacing looks odd.> check_x2apic_preenabled(); > cmdline_apic = changed = (genapic != NULL); > > diff -r f531ed84b066 -r 62a8ce6595ad xen/arch/x86/smpboot.c > --- a/xen/arch/x86/smpboot.c Tue May 17 17:32:19 2011 +0100 > +++ b/xen/arch/x86/smpboot.c Wed May 18 19:00:13 2011 +0100 > @@ -388,6 +388,9 @@ void start_secondary(void *unused) > > microcode_resume_cpu(cpu); > > + /* Record boot apic state */ > + record_boot_APIC_mode(); > + > wmb(); > startup_cpu_idle_loop(); > } > diff -r f531ed84b066 -r 62a8ce6595ad xen/include/asm-x86/apic.h > --- a/xen/include/asm-x86/apic.h Tue May 17 17:32:19 2011 +0100 > +++ b/xen/include/asm-x86/apic.h Wed May 18 19:00:13 2011 +0100 > @@ -21,6 +21,16 @@ > #define IO_APIC_REDIR_DEST_LOGICAL 0x00800 > #define IO_APIC_REDIR_DEST_PHYSICAL 0x00000 > > +/* Possible APIC states */ > +enum apic_mode { APIC_MODE_INVALID, /* Not set yet */ > + APIC_MODE_DISABLED, /* Some bioses disable by default for compatability */ > + APIC_MODE_XAPIC, /* xAPIC mode - default upon chipset reset */ > + APIC_MODE_X2APIC /* x2APIC mode - common for large smp machines */ > +}; > + > +/* PerCPU local APIC boot mode - so we can taredown to bios state */taredown?> +DECLARE_PER_CPU(enum apic_mode, apic_boot_mode); > + > extern u8 apic_verbosity; > extern bool_t x2apic_enabled; > extern bool_t directed_eoi_enabled; > @@ -203,6 +213,7 @@ extern void disable_APIC_timer(void); > extern void enable_APIC_timer(void); > extern int lapic_suspend(void); > extern int lapic_resume(void); > +extern void record_boot_APIC_mode(void); > > extern int check_nmi_watchdog (void); > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-May-18 18:53 UTC
Re: [Xen-devel] [PATCH 2 of 3] apic: remove ''enabled_via_apicbase'' variable
On Wed, May 18, 2011 at 07:08:15PM +0100, Andrew Cooper wrote:> The use of this varable was only sensible when the choice for lapic modevariable> was between enabled or disabled. Now that x2apic is about, it is wrong, > and causes a protection fault in certain cases when trying to tare downtare?> x2apic mode. > > The only place where its use is relevent in the code is in disable_local_APIC^- is ^^^ take that out.> which has been changed to correctly tare down the local APIC without ateardown?> protection fault (which leads to a general protection fault).So if you don''t have x2apic, then it is wrong to disable the LAPIC mode? What about older hardware?> > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > > diff -r 62a8ce6595ad -r e80b5280fe2f xen/arch/x86/apic.c > --- a/xen/arch/x86/apic.c Wed May 18 19:00:13 2011 +0100 > +++ b/xen/arch/x86/apic.c Wed May 18 19:00:13 2011 +0100 > @@ -165,8 +165,6 @@ void __init apic_intr_init(void) > /* Using APIC to generate smp_local_timer_interrupt? */ > static bool_t __read_mostly using_apic_timer; > > -static bool_t __read_mostly enabled_via_apicbase; > - > int get_physical_broadcast(void) > { > if (modern_apic()) > @@ -330,6 +328,9 @@ void disconnect_bsp_APIC(int virt_wire_s > > void disable_local_APIC(void) > { > + u64 msr_contents; > + enum apic_mode current_mode; > + > clear_local_APIC(); > > /* > @@ -339,10 +340,54 @@ void disable_local_APIC(void) > apic_write_around(APIC_SPIV, > apic_read(APIC_SPIV) & ~APIC_SPIV_APIC_ENABLED); > > - if (enabled_via_apicbase) { > - uint64_t msr_content; > - rdmsrl(MSR_IA32_APICBASE, msr_content); > - wrmsrl(MSR_IA32_APICBASE, msr_content & ~MSR_IA32_APICBASE_ENABLE); > + /* Work out what apic mode we are in */ > + rdmsrl(MSR_IA32_APICBASE, msr_contents); > + > + /* Reading EXTD bit from the MSR is only valid if CPUID says so, else reserved */ > + if ( cpu_has(¤t_cpu_data, X86_FEATURE_X2APIC) > + && (msr_contents & MSR_IA32_APICBASE_EXTD) ) > + current_mode = APIC_MODE_X2APIC; > + else > + { > + /* EN bit should always be valid as long as we can read the MSR > + * Can anyone confirm this?Might want to email Vivek Goyal..> + */ > + if ( msr_contents & MSR_IA32_APICBASE_ENABLE ) > + current_mode = APIC_MODE_XAPIC; > + else > + current_mode = APIC_MODE_DISABLED; > + } > + > + /* See what (if anything) we need to do to revert to boot mode */ > + if( current_mode != this_cpu(apic_boot_mode) ) > + { > + rdmsrl(MSR_IA32_APICBASE, msr_contents); > + msr_contents &= ~ ( MSR_IA32_APICBASE_ENABLE | MSR_IA32_APICBASE_EXTD ); > + wrmsrl(MSR_IA32_APICBASE, msr_contents); > + > + switch(this_cpu(apic_boot_mode)) > + { > + case APIC_MODE_DISABLED: > + break; /* Nothing to do - we did this above */ > + case APIC_MODE_XAPIC: > + { > + msr_contents |= MSR_IA32_APICBASE_ENABLE; > + wrmsrl(MSR_IA32_APICBASE, msr_contents); > + break; > + } > + case APIC_MODE_X2APIC: > + { > + msr_contents |= ( MSR_IA32_APICBASE_ENABLE | MSR_IA32_APICBASE_EXTD ); > + wrmsrl(MSR_IA32_APICBASE, msr_contents); > + break; > + } > + default: > + { > + printk("Hit default case when reverting lapic to boot state on core #%d\n", > + smp_processor_id()); > + break; > + } > + } > } > } > > @@ -874,7 +919,6 @@ static int __init detect_init_APIC (void > wrmsrl(MSR_IA32_APICBASE, > msr_content | MSR_IA32_APICBASE_ENABLE > | APIC_DEFAULT_PHYS_BASE); > - enabled_via_apicbase = 1; > } > } > /* > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-May-18 20:27 UTC
Re: [Xen-devel] [PATCH 1 of 3] apic: record local apic state on boot
On 18/05/11 19:49, Konrad Rzeszutek Wilk wrote:> On Wed, May 18, 2011 at 07:08:14PM +0100, Andrew Cooper wrote: >> Xen does not store the boot local apic state which leads to problems >> when shutting down for a kexec jump. This patch records the boot >> state so we can return to the boot state when kexec''ing. >> >> This is per CPU because all 3 bioses on the boxes I have tested dont >> enabled all local apics on boot. As a result, we have to return to >> the bios state so the ACPI tables match up with the hardware state >> for the booting kernel. > Which ACPI table requires this?Cant remember offhand but linux (2.6.32) was doing finger pointing at the multi-processor tables>> Signed-off-by: Andrew Cooper<andrew.cooper3@citrix.com> >> >> diff -r f531ed84b066 -r 62a8ce6595ad xen/arch/x86/apic.c >> --- a/xen/arch/x86/apic.c Tue May 17 17:32:19 2011 +0100 >> +++ b/xen/arch/x86/apic.c Wed May 18 19:00:13 2011 +0100 >> @@ -74,6 +74,8 @@ u8 __read_mostly apic_verbosity; >> static bool_t __initdata opt_x2apic = 1; >> boolean_param("x2apic", opt_x2apic); >> >> +DEFINE_PER_CPU_READ_MOSTLY(enum apic_mode, apic_boot_mode) = APIC_MODE_INVALID; >> + >> bool_t __read_mostly x2apic_enabled = 0; >> bool_t __read_mostly directed_eoi_enabled = 0; >> >> @@ -1437,6 +1439,41 @@ int __init APIC_init_uniprocessor (void) >> return 0; >> } >> >> +/* Needs to be called once per CPU during startup. It records the state the BIOS >> + * leaves the local APIC so we can tare back down upon shutdown/crash > tare?I fail at spelling - I meant "tear down">> + */ >> +void __init record_boot_APIC_mode(void) >> +{ >> + enum apic_mode this_apic_mode; >> + u64 msr_contents; >> + >> + this_apic_mode = APIC_MODE_INVALID; >> + >> + /* Sanity check - we should only ever run once */ >> + BUG_ON( APIC_MODE_INVALID != this_cpu(apic_boot_mode) ); >> + >> + rdmsrl(MSR_IA32_APICBASE, msr_contents); >> + >> + /* Reading EXTD bit from the MSR is only valid if CPUID says so, else reserved */ >> + if ( cpu_has(¤t_cpu_data, X86_FEATURE_X2APIC) >> +&& (msr_contents& MSR_IA32_APICBASE_EXTD) ) >> + this_apic_mode = APIC_MODE_X2APIC; >> + else >> + { >> + /* EN bit should always be valid as long as we can read the MSR >> + * Can anyone confirm this? > Email Vivek Goyal. He is the kexec/kdump maintainer.He doesn''t appear in the maintainers file, and I don''t have an email address.>> + */ >> + if ( msr_contents& MSR_IA32_APICBASE_ENABLE ) >> + this_apic_mode = APIC_MODE_XAPIC; >> + else >> + this_apic_mode = APIC_MODE_DISABLED; >> + } >> + >> + this_cpu(apic_boot_mode) = this_apic_mode; >> + apic_printk(APIC_DEBUG, "APIC boot state is %d on core #%d\n", >> + this_apic_mode, smp_processor_id()); > This begs of a function to convert those enums to strings..True - this was left over from my debugging - I shall fix that tomorrow.>> +} >> + >> void check_for_unexpected_msi(unsigned int vector) >> { >> unsigned long v = apic_read(APIC_ISR + ((vector& ~0x1f)>> 1)); >> diff -r f531ed84b066 -r 62a8ce6595ad xen/arch/x86/genapic/probe.c >> --- a/xen/arch/x86/genapic/probe.c Tue May 17 17:32:19 2011 +0100 >> +++ b/xen/arch/x86/genapic/probe.c Wed May 18 19:00:13 2011 +0100 >> @@ -60,6 +60,8 @@ void __init generic_apic_probe(void) >> { >> int i, changed; >> >> + record_boot_APIC_mode(); >> + > The spacing looks odd.Does it? Looks fine for me - I have been following the 1 tab to 4 spaces convention in half the codebase.>> check_x2apic_preenabled(); >> cmdline_apic = changed = (genapic != NULL); >> >> diff -r f531ed84b066 -r 62a8ce6595ad xen/arch/x86/smpboot.c >> --- a/xen/arch/x86/smpboot.c Tue May 17 17:32:19 2011 +0100 >> +++ b/xen/arch/x86/smpboot.c Wed May 18 19:00:13 2011 +0100 >> @@ -388,6 +388,9 @@ void start_secondary(void *unused) >> >> microcode_resume_cpu(cpu); >> >> + /* Record boot apic state */ >> + record_boot_APIC_mode(); >> + >> wmb(); >> startup_cpu_idle_loop(); >> } >> diff -r f531ed84b066 -r 62a8ce6595ad xen/include/asm-x86/apic.h >> --- a/xen/include/asm-x86/apic.h Tue May 17 17:32:19 2011 +0100 >> +++ b/xen/include/asm-x86/apic.h Wed May 18 19:00:13 2011 +0100 >> @@ -21,6 +21,16 @@ >> #define IO_APIC_REDIR_DEST_LOGICAL 0x00800 >> #define IO_APIC_REDIR_DEST_PHYSICAL 0x00000 >> >> +/* Possible APIC states */ >> +enum apic_mode { APIC_MODE_INVALID, /* Not set yet */ >> + APIC_MODE_DISABLED, /* Some bioses disable by default for compatability */ >> + APIC_MODE_XAPIC, /* xAPIC mode - default upon chipset reset */ >> + APIC_MODE_X2APIC /* x2APIC mode - common for large smp machines */ >> +}; >> + >> +/* PerCPU local APIC boot mode - so we can taredown to bios state */ > taredown?same comment as above re. spelling>> +DECLARE_PER_CPU(enum apic_mode, apic_boot_mode); >> + >> extern u8 apic_verbosity; >> extern bool_t x2apic_enabled; >> extern bool_t directed_eoi_enabled; >> @@ -203,6 +213,7 @@ extern void disable_APIC_timer(void); >> extern void enable_APIC_timer(void); >> extern int lapic_suspend(void); >> extern int lapic_resume(void); >> +extern void record_boot_APIC_mode(void); >> >> extern int check_nmi_watchdog (void); >> >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xensource.com >> http://lists.xensource.com/xen-devel-- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-May-18 20:35 UTC
Re: [Xen-devel] [PATCH 2 of 3] apic: remove ''enabled_via_apicbase'' variable
On 18/05/11 19:53, Konrad Rzeszutek Wilk wrote:> On Wed, May 18, 2011 at 07:08:15PM +0100, Andrew Cooper wrote: >> The use of this varable was only sensible when the choice for lapic mode > variable >> was between enabled or disabled. Now that x2apic is about, it is wrong, >> and causes a protection fault in certain cases when trying to tare down > tare? > >> x2apic mode. >> >> The only place where its use is relevent in the code is in disable_local_APIC > ^- is ^^^ take that out. > >> which has been changed to correctly tare down the local APIC without a > teardown? >> protection fault (which leads to a general protection fault). > So if you don''t have x2apic, then it is wrong to disable the LAPIC mode? > What about older hardware?I guess I wasn''t very clear in my description. In older hardware without x2apic, it is correct to simply twiddle the ENABLE bit in the APICBASE MSR. However, with x2apic mode enabled, setting the ENABLE bit from 1 to 0 while leaving the EXTD bit set will result in a protection fault which will propagate to a general protection fault the same codepath will be called in the fault handler. As a result, the current code in disable_local_APIC will result in a GPF if the BIOS boots with LAPICs disabled (fine as per the spec for compatibility) and xen decided to take advantage of x2apic mode.>> Signed-off-by: Andrew Cooper<andrew.cooper3@citrix.com> >> >> diff -r 62a8ce6595ad -r e80b5280fe2f xen/arch/x86/apic.c >> --- a/xen/arch/x86/apic.c Wed May 18 19:00:13 2011 +0100 >> +++ b/xen/arch/x86/apic.c Wed May 18 19:00:13 2011 +0100 >> @@ -165,8 +165,6 @@ void __init apic_intr_init(void) >> /* Using APIC to generate smp_local_timer_interrupt? */ >> static bool_t __read_mostly using_apic_timer; >> >> -static bool_t __read_mostly enabled_via_apicbase; >> - >> int get_physical_broadcast(void) >> { >> if (modern_apic()) >> @@ -330,6 +328,9 @@ void disconnect_bsp_APIC(int virt_wire_s >> >> void disable_local_APIC(void) >> { >> + u64 msr_contents; >> + enum apic_mode current_mode; >> + >> clear_local_APIC(); >> >> /* >> @@ -339,10 +340,54 @@ void disable_local_APIC(void) >> apic_write_around(APIC_SPIV, >> apic_read(APIC_SPIV)& ~APIC_SPIV_APIC_ENABLED); >> >> - if (enabled_via_apicbase) { >> - uint64_t msr_content; >> - rdmsrl(MSR_IA32_APICBASE, msr_content); >> - wrmsrl(MSR_IA32_APICBASE, msr_content& ~MSR_IA32_APICBASE_ENABLE); >> + /* Work out what apic mode we are in */ >> + rdmsrl(MSR_IA32_APICBASE, msr_contents); >> + >> + /* Reading EXTD bit from the MSR is only valid if CPUID says so, else reserved */ >> + if ( cpu_has(¤t_cpu_data, X86_FEATURE_X2APIC) >> +&& (msr_contents& MSR_IA32_APICBASE_EXTD) ) >> + current_mode = APIC_MODE_X2APIC; >> + else >> + { >> + /* EN bit should always be valid as long as we can read the MSR >> + * Can anyone confirm this? > Might want to email Vivek Goyal.. >> + */ >> + if ( msr_contents& MSR_IA32_APICBASE_ENABLE ) >> + current_mode = APIC_MODE_XAPIC; >> + else >> + current_mode = APIC_MODE_DISABLED; >> + } >> + >> + /* See what (if anything) we need to do to revert to boot mode */ >> + if( current_mode != this_cpu(apic_boot_mode) ) >> + { >> + rdmsrl(MSR_IA32_APICBASE, msr_contents); >> + msr_contents&= ~ ( MSR_IA32_APICBASE_ENABLE | MSR_IA32_APICBASE_EXTD ); >> + wrmsrl(MSR_IA32_APICBASE, msr_contents); >> + >> + switch(this_cpu(apic_boot_mode)) >> + { >> + case APIC_MODE_DISABLED: >> + break; /* Nothing to do - we did this above */ >> + case APIC_MODE_XAPIC: >> + { >> + msr_contents |= MSR_IA32_APICBASE_ENABLE; >> + wrmsrl(MSR_IA32_APICBASE, msr_contents); >> + break; >> + } >> + case APIC_MODE_X2APIC: >> + { >> + msr_contents |= ( MSR_IA32_APICBASE_ENABLE | MSR_IA32_APICBASE_EXTD ); >> + wrmsrl(MSR_IA32_APICBASE, msr_contents); >> + break; >> + } >> + default: >> + { >> + printk("Hit default case when reverting lapic to boot state on core #%d\n", >> + smp_processor_id()); >> + break; >> + } >> + } >> } >> } >> >> @@ -874,7 +919,6 @@ static int __init detect_init_APIC (void >> wrmsrl(MSR_IA32_APICBASE, >> msr_content | MSR_IA32_APICBASE_ENABLE >> | APIC_DEFAULT_PHYS_BASE); >> - enabled_via_apicbase = 1; >> } >> } >> /* >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xensource.com >> http://lists.xensource.com/xen-devel-- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-May-18 20:43 UTC
Re: [Xen-devel] [PATCH 1 of 3] apic: record local apic state on boot
On Wed, May 18, 2011 at 09:27:56PM +0100, Andrew Cooper wrote:> > > On 18/05/11 19:49, Konrad Rzeszutek Wilk wrote: > >On Wed, May 18, 2011 at 07:08:14PM +0100, Andrew Cooper wrote: > >>Xen does not store the boot local apic state which leads to problems > >>when shutting down for a kexec jump. This patch records the boot > >>state so we can return to the boot state when kexec''ing. > >> > >>This is per CPU because all 3 bioses on the boxes I have tested dont > >>enabled all local apics on boot. As a result, we have to return to > >>the bios state so the ACPI tables match up with the hardware state > >>for the booting kernel. > >Which ACPI table requires this? > Cant remember offhand but linux (2.6.32) was doing finger pointing > at the multi-processor tablesMP tables? Those don''t get used when ACPI is used..> >>Signed-off-by: Andrew Cooper<andrew.cooper3@citrix.com> > >> > >>diff -r f531ed84b066 -r 62a8ce6595ad xen/arch/x86/apic.c > >>--- a/xen/arch/x86/apic.c Tue May 17 17:32:19 2011 +0100 > >>+++ b/xen/arch/x86/apic.c Wed May 18 19:00:13 2011 +0100 > >>@@ -74,6 +74,8 @@ u8 __read_mostly apic_verbosity; > >> static bool_t __initdata opt_x2apic = 1; > >> boolean_param("x2apic", opt_x2apic); > >> > >>+DEFINE_PER_CPU_READ_MOSTLY(enum apic_mode, apic_boot_mode) = APIC_MODE_INVALID; > >>+ > >> bool_t __read_mostly x2apic_enabled = 0; > >> bool_t __read_mostly directed_eoi_enabled = 0; > >> > >>@@ -1437,6 +1439,41 @@ int __init APIC_init_uniprocessor (void) > >> return 0; > >> } > >> > >>+/* Needs to be called once per CPU during startup. It records the state the BIOS > >>+ * leaves the local APIC so we can tare back down upon shutdown/crash > >tare? > I fail at spelling - I meant "tear down" > >>+ */ > >>+void __init record_boot_APIC_mode(void) > >>+{ > >>+ enum apic_mode this_apic_mode; > >>+ u64 msr_contents; > >>+ > >>+ this_apic_mode = APIC_MODE_INVALID; > >>+ > >>+ /* Sanity check - we should only ever run once */ > >>+ BUG_ON( APIC_MODE_INVALID != this_cpu(apic_boot_mode) ); > >>+ > >>+ rdmsrl(MSR_IA32_APICBASE, msr_contents); > >>+ > >>+ /* Reading EXTD bit from the MSR is only valid if CPUID says so, else reserved */ > >>+ if ( cpu_has(¤t_cpu_data, X86_FEATURE_X2APIC) > >>+&& (msr_contents& MSR_IA32_APICBASE_EXTD) ) > >>+ this_apic_mode = APIC_MODE_X2APIC; > >>+ else > >>+ { > >>+ /* EN bit should always be valid as long as we can read the MSR > >>+ * Can anyone confirm this? > >Email Vivek Goyal. He is the kexec/kdump maintainer. > He doesn''t appear in the maintainers file, and I don''t have an emailHmm..This is what I see in the MAINTAINERS file: KDUMP M: Vivek Goyal <vgoyal@redhat.com> M: Haren Myneni <hbabu@us.ibm.com> L: kexec@lists.infradead.org W: http://lse.sourceforge.net/kdump/ S: Maintained F: Documentation/kdump/> address.http://lmgtfy.com/?q=Vivek+Goyal+kexec> >>+ */ > >>+ if ( msr_contents& MSR_IA32_APICBASE_ENABLE ) > >>+ this_apic_mode = APIC_MODE_XAPIC; > >>+ else > >>+ this_apic_mode = APIC_MODE_DISABLED; > >>+ } > >>+ > >>+ this_cpu(apic_boot_mode) = this_apic_mode; > >>+ apic_printk(APIC_DEBUG, "APIC boot state is %d on core #%d\n", > >>+ this_apic_mode, smp_processor_id()); > >This begs of a function to convert those enums to strings.. > True - this was left over from my debugging - I shall fix that tomorrow.Ok.> >>+} > >>+ > >> void check_for_unexpected_msi(unsigned int vector) > >> { > >> unsigned long v = apic_read(APIC_ISR + ((vector& ~0x1f)>> 1)); > >>diff -r f531ed84b066 -r 62a8ce6595ad xen/arch/x86/genapic/probe.c > >>--- a/xen/arch/x86/genapic/probe.c Tue May 17 17:32:19 2011 +0100 > >>+++ b/xen/arch/x86/genapic/probe.c Wed May 18 19:00:13 2011 +0100 > >>@@ -60,6 +60,8 @@ void __init generic_apic_probe(void) > >> { > >> int i, changed; > >> > >>+ record_boot_APIC_mode(); > >>+ > >The spacing looks odd. > Does it? Looks fine for me - I have been following the 1 tab to 4 > spaces convention in half the codebase.Could be my mailer.. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-May-18 20:45 UTC
Re: [Xen-devel] [PATCH 2 of 3] apic: remove ''enabled_via_apicbase'' variable
> >So if you don''t have x2apic, then it is wrong to disable the LAPIC mode? > >What about older hardware? > I guess I wasn''t very clear in my description. In older hardware > without x2apic, it is correct to simply twiddle the ENABLE bit in > the APICBASE MSR. However, with x2apic mode enabled, setting the > ENABLE bit from 1 to 0 while leaving the EXTD bit set will result in > a protection fault which will propagate to a general protection > fault the same codepath will be called in the fault handler. As a > result, the current code in disable_local_APIC will result in a GPF > if the BIOS boots with LAPICs disabled (fine as per the spec for > compatibility) and xen decided to take advantage of x2apic mode.Ok. You might also point to the appropiate section of the x2APIC spec about this. I think it is 2.7.1.2 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-May-18 20:48 UTC
Re: [Xen-devel] [PATCH 3 of 3] kexec: disable iommu jumping into the kdump kernel
On 18/05/11 19:49, Konrad Rzeszutek Wilk wrote:> On Wed, May 18, 2011 at 07:08:16PM +0100, Andrew Cooper wrote: >> kdump kernels are unable to boot with IOMMU enabled, >> this patch disabled IOMMU mode and removes some of the generic >> code from the shutdown path which doesnt work after other >> CPUs have been shot down. >> >> Also, leave local interrupts disabled when jumping into pugatory > purgatory?purgatory is the bit of code which the a crashing kernel jumps into, which pretends to do minimal bootloader things to book the kdump kernel. It is part of the kexec-tools package.>> as we have no idea whats in there and really dont want to be >> servicing interrupts when our entire state is invalid. >> >> Signed-off-by: Andrew Cooper<andrew.cooper3@citrix.com> >> >> diff -r e80b5280fe2f -r aaf44d1a903d xen/arch/x86/crash.c >> --- a/xen/arch/x86/crash.c Wed May 18 19:00:13 2011 +0100 >> +++ b/xen/arch/x86/crash.c Wed May 18 19:00:13 2011 +0100 >> @@ -27,6 +27,8 @@ >> #include<asm/hvm/support.h> >> #include<asm/apic.h> >> #include<asm/io_apic.h> >> +#include<xen/iommu.h> >> +#include<asm/hvm/iommu.h> >> >> static atomic_t waiting_for_crash_ipi; >> static unsigned int crashing_cpu; >> @@ -43,7 +45,10 @@ static int crash_nmi_callback(struct cpu >> >> kexec_crash_save_cpu(); >> >> - __stop_this_cpu(); >> + disable_local_APIC(); >> + hvm_cpu_down(); >> + clts(); >> + asm volatile ( "fninit" ); > Can you provide a comment why you are using fninit and clt? > Is this what the Linux kernel does too when it goes through the kexec path?I was replacing __stop_this_cpu() with the safe subset of its contents - it was a verbatim copy minus the SMP stuff which the regular __stop_this_cpu() does. I suppose I could have split __stop_this_cpu() to __crash_stop_this_cpu() but it didn''t seem worth making such a trivially small function.>> >> atomic_dec(&waiting_for_crash_ipi); >> >> @@ -56,6 +61,7 @@ static int crash_nmi_callback(struct cpu >> static void nmi_shootdown_cpus(void) >> { >> unsigned long msecs; >> + u64 msr_contents; >> >> local_irq_disable(); >> >> @@ -77,18 +83,43 @@ static void nmi_shootdown_cpus(void) >> msecs--; >> } >> >> - __stop_this_cpu(); >> + disable_local_APIC(); >> + hvm_cpu_down(); >> + clts(); >> + asm volatile ( "fninit" ); >> + >> + /* This is a bit of a hack but there is no other way to shutdown correctly >> + * without a significant refactoring of the APIC code */ >> + rdmsrl(MSR_IA32_APICBASE, msr_contents); >> + if ( cpu_has(¤t_cpu_data, X86_FEATURE_X2APIC) >> +&& (msr_contents& MSR_IA32_APICBASE_EXTD) ) >> + x2apic_enabled = 1; >> + else >> + x2apic_enabled = 0; >> + >> disable_IO_APIC(); >> - >> - local_irq_enable(); > Why?Because that local_irq_enable() results in the interrupt flag being set when jumping into purgatory, which (at the moment) doesn''t touch interrupts at all. The result is that interrupts from PCI devices which are unaware of the crash are (potentially) being serviced by the xen handlers even though we have left the Xen context for good.>> } >> >> void machine_crash_shutdown(void) >> { >> crash_xen_info_t *info; >> + const struct iommu_ops * ops; >> >> nmi_shootdown_cpus(); >> >> + /* Yes i know this is hacky but it is the easiest solution. I should add an iommu_ops >> + * function called crash() or so which just disables the iommu ''fun'' without saving state >> + */ >> + ops = iommu_get_ops(); >> + if(ops) >> + ops->suspend(); > Uh, no checking if ops->suspend exists? >True - at the moment both intel and amd iommu_ops are fully implemented but I will add an extra condition to the if statement.>> + >> + /* Yes i know this is from driver/passthrough/vtd/ but it appears to be architecture >> + * independant, and also bears little/no relation to x2apic. Needs cleaning up > What about AMD VI IOMMUs? Does it work when that IOMMU is used? >It worked on the AMD box I tested the code on. Like the comment says - as far as I can tell, it is architecture independent code.>> + */ >> + iommu_disable_x2apic_IR(); > Can''t that function be done in the suspend code of the IOMMU?There is a comment in iommu suspend stating that it cant and isn''t done, but rather is left for the local/ioapic_suspend functions which dont properly work in the kexec path.>> + >> + >> info = kexec_crash_save_info(); >> info->xen_phys_start = xen_phys_start; >> info->dom0_pfn_to_mfn_frame_list_list >> diff -r e80b5280fe2f -r aaf44d1a903d xen/arch/x86/hpet.c >> --- a/xen/arch/x86/hpet.c Wed May 18 19:00:13 2011 +0100 >> +++ b/xen/arch/x86/hpet.c Wed May 18 19:00:13 2011 +0100 >> @@ -670,6 +670,33 @@ void hpet_disable_legacy_broadcast(void) >> smp_send_event_check_mask(&cpu_online_map); >> } >> >> +/* This function is similar to the regular >> + * hpet_disable_legacy_broadcast function, except it is called >> + * on the crash path with only the current processor up, so we >> + * can forget the locks and really cant send an event check IPI >> + * to the other processors */ >> +void crash_hpet_disable_legacy_broadcast(void) >> +{ >> + u32 cfg; >> + >> + if ( !hpet_events || !(hpet_events->flags& HPET_EVT_LEGACY) ) >> + return; >> + >> + hpet_events->flags |= HPET_EVT_DISABLE; >> + >> + /* disable HPET T0 */ >> + cfg = hpet_read32(HPET_Tn_CFG(0)); >> + cfg&= ~HPET_TN_ENABLE; >> + hpet_write32(cfg, HPET_Tn_CFG(0)); >> + >> + /* Stop HPET legacy interrupts */ >> + cfg = hpet_read32(HPET_CFG); >> + cfg&= ~HPET_CFG_LEGACY; >> + hpet_write32(cfg, HPET_CFG); >> + >> +} >> + >> + >> void hpet_broadcast_enter(void) >> { >> unsigned int cpu = smp_processor_id(); >> diff -r e80b5280fe2f -r aaf44d1a903d xen/arch/x86/machine_kexec.c >> --- a/xen/arch/x86/machine_kexec.c Wed May 18 19:00:13 2011 +0100 >> +++ b/xen/arch/x86/machine_kexec.c Wed May 18 19:00:13 2011 +0100 >> @@ -97,7 +97,7 @@ void machine_kexec(xen_kexec_image_t *im >> }; >> >> if ( hpet_broadcast_is_available() ) >> - hpet_disable_legacy_broadcast(); >> + crash_hpet_disable_legacy_broadcast(); >> >> /* >> * compat_machine_kexec() returns to idle pagetables, which requires us >> diff -r e80b5280fe2f -r aaf44d1a903d xen/include/asm-x86/hpet.h >> --- a/xen/include/asm-x86/hpet.h Wed May 18 19:00:13 2011 +0100 >> +++ b/xen/include/asm-x86/hpet.h Wed May 18 19:00:13 2011 +0100 >> @@ -73,5 +73,6 @@ void hpet_broadcast_enter(void); >> void hpet_broadcast_exit(void); >> int hpet_broadcast_is_available(void); >> void hpet_disable_legacy_broadcast(void); >> +void crash_hpet_disable_legacy_broadcast(void); >> >> #endif /* __X86_HPET_H__ */ >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xensource.com >> http://lists.xensource.com/xen-devel-- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-May-18 20:57 UTC
Re: [Xen-devel] [PATCH 3 of 3] kexec: disable iommu jumping into the kdump kernel
On Wed, May 18, 2011 at 09:48:36PM +0100, Andrew Cooper wrote:> > > On 18/05/11 19:49, Konrad Rzeszutek Wilk wrote: > >On Wed, May 18, 2011 at 07:08:16PM +0100, Andrew Cooper wrote: > >>kdump kernels are unable to boot with IOMMU enabled, > >>this patch disabled IOMMU mode and removes some of the generic > >>code from the shutdown path which doesnt work after other > >>CPUs have been shot down. > >> > >>Also, leave local interrupts disabled when jumping into pugatory > >purgatory? > purgatory is the bit of code which the a crashing kernel jumps into, > which pretends to do minimal bootloader things to book the kdump > kernel. It is part of the kexec-tools package.Ok. Might want to include that in the description.> > >>as we have no idea whats in there and really dont want to be > >>servicing interrupts when our entire state is invalid. > >> > >>Signed-off-by: Andrew Cooper<andrew.cooper3@citrix.com> > >> > >>diff -r e80b5280fe2f -r aaf44d1a903d xen/arch/x86/crash.c > >>--- a/xen/arch/x86/crash.c Wed May 18 19:00:13 2011 +0100 > >>+++ b/xen/arch/x86/crash.c Wed May 18 19:00:13 2011 +0100 > >>@@ -27,6 +27,8 @@ > >> #include<asm/hvm/support.h> > >> #include<asm/apic.h> > >> #include<asm/io_apic.h> > >>+#include<xen/iommu.h> > >>+#include<asm/hvm/iommu.h> > >> > >> static atomic_t waiting_for_crash_ipi; > >> static unsigned int crashing_cpu; > >>@@ -43,7 +45,10 @@ static int crash_nmi_callback(struct cpu > >> > >> kexec_crash_save_cpu(); > >> > >>- __stop_this_cpu(); > >>+ disable_local_APIC(); > >>+ hvm_cpu_down(); > >>+ clts(); > >>+ asm volatile ( "fninit" ); > >Can you provide a comment why you are using fninit and clt? > >Is this what the Linux kernel does too when it goes through the kexec path? > I was replacing __stop_this_cpu() with the safe subset of its > contents - it was a verbatim copy minus the SMP stuff which the > regular __stop_this_cpu() does. I suppose I could have split > __stop_this_cpu() to __crash_stop_this_cpu() but it didn''t seem > worth making such a trivially small function.Why can''t you use the SMP version? I know you are not running in SMP mode, but does it hurt?> >> > >> atomic_dec(&waiting_for_crash_ipi); > >> > >>@@ -56,6 +61,7 @@ static int crash_nmi_callback(struct cpu > >> static void nmi_shootdown_cpus(void) > >> { > >> unsigned long msecs; > >>+ u64 msr_contents; > >> > >> local_irq_disable(); > >> > >>@@ -77,18 +83,43 @@ static void nmi_shootdown_cpus(void) > >> msecs--; > >> } > >> > >>- __stop_this_cpu(); > >>+ disable_local_APIC(); > >>+ hvm_cpu_down(); > >>+ clts(); > >>+ asm volatile ( "fninit" ); > >>+ > >>+ /* This is a bit of a hack but there is no other way to shutdown correctly > >>+ * without a significant refactoring of the APIC code */ > >>+ rdmsrl(MSR_IA32_APICBASE, msr_contents); > >>+ if ( cpu_has(¤t_cpu_data, X86_FEATURE_X2APIC) > >>+&& (msr_contents& MSR_IA32_APICBASE_EXTD) ) > >>+ x2apic_enabled = 1; > >>+ else > >>+ x2apic_enabled = 0; > >>+ > >> disable_IO_APIC(); > >>- > >>- local_irq_enable(); > >Why? > Because that local_irq_enable() results in the interrupt flag being > set when jumping into purgatory, which (at the moment) doesn''t touch > interrupts at all. The result is that interrupts from PCI devices > which are unaware of the crash are (potentially) being serviced by > the xen handlers even though we have left the Xen context for good.Yikes. Please do explain this in the code right there were you remove the local_irq_enable..> >> } > >> > >> void machine_crash_shutdown(void) > >> { > >> crash_xen_info_t *info; > >>+ const struct iommu_ops * ops; > >> > >> nmi_shootdown_cpus(); > >> > >>+ /* Yes i know this is hacky but it is the easiest solution. I should add an iommu_ops > >>+ * function called crash() or so which just disables the iommu ''fun'' without saving state > >>+ */ > >>+ ops = iommu_get_ops(); > >>+ if(ops) > >>+ ops->suspend(); > >Uh, no checking if ops->suspend exists? > > > True - at the moment both intel and amd iommu_ops are fully > implemented but I will add an extra condition to the if statement. > >>+ > >>+ /* Yes i know this is from driver/passthrough/vtd/ but it appears to be architecture > >>+ * independant, and also bears little/no relation to x2apic. Needs cleaning up > >What about AMD VI IOMMUs? Does it work when that IOMMU is used? > > > It worked on the AMD box I tested the code on. Like the comment > says - as far as I can tell, it is architecture independent code. > >>+ */ > >>+ iommu_disable_x2apic_IR(); > >Can''t that function be done in the suspend code of the IOMMU? > There is a comment in iommu suspend stating that it cant and isn''t > done, but rather is left for the local/ioapic_suspend functions > which dont properly work in the kexec path.OK, how about just moving it out of driver/passthrought/vtd then? _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-May-18 21:24 UTC
Re: [Xen-devel] [PATCH 3 of 3] kexec: disable iommu jumping into the kdump kernel
On 18/05/11 21:57, Konrad Rzeszutek Wilk wrote:> On Wed, May 18, 2011 at 09:48:36PM +0100, Andrew Cooper wrote: >> >> On 18/05/11 19:49, Konrad Rzeszutek Wilk wrote: >>> On Wed, May 18, 2011 at 07:08:16PM +0100, Andrew Cooper wrote: >>>> kdump kernels are unable to boot with IOMMU enabled, >>>> this patch disabled IOMMU mode and removes some of the generic >>>> code from the shutdown path which doesnt work after other >>>> CPUs have been shot down. >>>> >>>> Also, leave local interrupts disabled when jumping into pugatory >>> purgatory? >> purgatory is the bit of code which the a crashing kernel jumps into, >> which pretends to do minimal bootloader things to book the kdump >> kernel. It is part of the kexec-tools package. > Ok. Might want to include that in the description. > >>>> as we have no idea whats in there and really dont want to be >>>> servicing interrupts when our entire state is invalid. >>>> >>>> Signed-off-by: Andrew Cooper<andrew.cooper3@citrix.com> >>>> >>>> diff -r e80b5280fe2f -r aaf44d1a903d xen/arch/x86/crash.c >>>> --- a/xen/arch/x86/crash.c Wed May 18 19:00:13 2011 +0100 >>>> +++ b/xen/arch/x86/crash.c Wed May 18 19:00:13 2011 +0100 >>>> @@ -27,6 +27,8 @@ >>>> #include<asm/hvm/support.h> >>>> #include<asm/apic.h> >>>> #include<asm/io_apic.h> >>>> +#include<xen/iommu.h> >>>> +#include<asm/hvm/iommu.h> >>>> >>>> static atomic_t waiting_for_crash_ipi; >>>> static unsigned int crashing_cpu; >>>> @@ -43,7 +45,10 @@ static int crash_nmi_callback(struct cpu >>>> >>>> kexec_crash_save_cpu(); >>>> >>>> - __stop_this_cpu(); >>>> + disable_local_APIC(); >>>> + hvm_cpu_down(); >>>> + clts(); >>>> + asm volatile ( "fninit" ); >>> Can you provide a comment why you are using fninit and clt? >>> Is this what the Linux kernel does too when it goes through the kexec path? >> I was replacing __stop_this_cpu() with the safe subset of its >> contents - it was a verbatim copy minus the SMP stuff which the >> regular __stop_this_cpu() does. I suppose I could have split >> __stop_this_cpu() to __crash_stop_this_cpu() but it didn''t seem >> worth making such a trivially small function. > Why can''t you use the SMP version? I know you are not running > in SMP mode, but does it hurt? >It sadly does hurt. All the code relating to the x2apic_enabled variable is broken. That variable is used to mean "is the user expecting me to use x2apic mode", "did the bios boot me in x2apic mode" and "what mode is the boot processor currently in". For normal operation, it works (assuming no race conditions creep in when starting non-boot processors). However, x2apic mode really needs to be per_cpu because it depends on the lapic MSR as to whether you should be using MSRs or MMIO to talk to the lapic/ioapic registers, and you kindly get a protection fault if you use MSRs outside of x2apic mode. As a result, 1 global variable doesn''t cut it when you are doing an nmi shootdown of processors. I was in the middle of making a fix for that, but Keir made it clear that such a patch would not be accepted.>>>> atomic_dec(&waiting_for_crash_ipi); >>>> >>>> @@ -56,6 +61,7 @@ static int crash_nmi_callback(struct cpu >>>> static void nmi_shootdown_cpus(void) >>>> { >>>> unsigned long msecs; >>>> + u64 msr_contents; >>>> >>>> local_irq_disable(); >>>> >>>> @@ -77,18 +83,43 @@ static void nmi_shootdown_cpus(void) >>>> msecs--; >>>> } >>>> >>>> - __stop_this_cpu(); >>>> + disable_local_APIC(); >>>> + hvm_cpu_down(); >>>> + clts(); >>>> + asm volatile ( "fninit" ); >>>> + >>>> + /* This is a bit of a hack but there is no other way to shutdown correctly >>>> + * without a significant refactoring of the APIC code */ >>>> + rdmsrl(MSR_IA32_APICBASE, msr_contents); >>>> + if ( cpu_has(¤t_cpu_data, X86_FEATURE_X2APIC) >>>> +&& (msr_contents& MSR_IA32_APICBASE_EXTD) ) >>>> + x2apic_enabled = 1; >>>> + else >>>> + x2apic_enabled = 0; >>>> + >>>> disable_IO_APIC(); >>>> - >>>> - local_irq_enable(); >>> Why? >> Because that local_irq_enable() results in the interrupt flag being >> set when jumping into purgatory, which (at the moment) doesn''t touch >> interrupts at all. The result is that interrupts from PCI devices >> which are unaware of the crash are (potentially) being serviced by >> the xen handlers even though we have left the Xen context for good. > Yikes. Please do explain this in the code right there were you > remove the local_irq_enable.. > >>>> } >>>> >>>> void machine_crash_shutdown(void) >>>> { >>>> crash_xen_info_t *info; >>>> + const struct iommu_ops * ops; >>>> >>>> nmi_shootdown_cpus(); >>>> >>>> + /* Yes i know this is hacky but it is the easiest solution. I should add an iommu_ops >>>> + * function called crash() or so which just disables the iommu ''fun'' without saving state >>>> + */ >>>> + ops = iommu_get_ops(); >>>> + if(ops) >>>> + ops->suspend(); >>> Uh, no checking if ops->suspend exists? >>> >> True - at the moment both intel and amd iommu_ops are fully >> implemented but I will add an extra condition to the if statement. >>>> + >>>> + /* Yes i know this is from driver/passthrough/vtd/ but it appears to be architecture >>>> + * independant, and also bears little/no relation to x2apic. Needs cleaning up >>> What about AMD VI IOMMUs? Does it work when that IOMMU is used? >>> >> It worked on the AMD box I tested the code on. Like the comment >> says - as far as I can tell, it is architecture independent code. >>>> + */ >>>> + iommu_disable_x2apic_IR(); >>> Can''t that function be done in the suspend code of the IOMMU? >> There is a comment in iommu suspend stating that it cant and isn''t >> done, but rather is left for the local/ioapic_suspend functions >> which dont properly work in the kexec path. > OK, how about just moving it out of driver/passthrought/vtd then?Because that code is fragile enough without me poking about in it. I would prefer someone with more knowledge about IOMMU to make that call. -- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2011-May-18 23:40 UTC
Re: [Xen-devel] [PATCH 1 of 3] apic: record local apic state on boot
On 18/05/2011 19:08, "Andrew Cooper" <andrew.cooper3@citrix.com> wrote:> Xen does not store the boot local apic state which leads to problems > when shutting down for a kexec jump. This patch records the boot > state so we can return to the boot state when kexec''ing. > > This is per CPU because all 3 bioses on the boxes I have tested dont > enabled all local apics on boot.You mean some CPUs have APIC totally disabled? Impossible: you need the APIC enabled to be able to INIT-SIPI bootstrap the secondary processors. And Xen itself has no code to enable disabled APICs (except some very legacy code that executes only in the uniprocessor case, which I''m pretty sure you''re not touching). You call your new function so late in secondary CPU bringup that our own APIC setup has already run (via smp_callin->{x2apic_ap_setup,setup_local_APIC}). Which would indicate that the boot state of the secondary CPUs is not of importance to you, since you I''m sure tested your patch as working okay in this current form, and that would imply that you don''t need to record per-cpu boot state. The singleshot BUG_ON in record_boot_APIC_mode is broken, since secondary CPUs can be started up multiple times (we can offline/online them). You need to bail on repeated invocations. Or not be recording boot state for secondary cpus since apparently it isn''t needed... We''re going to be going through these patches very slowly and painfully, because frankly the arguments and assertions seem to be plenty full of holes. -- Keir> As a result, we have to return to > the bios state so the ACPI tables match up with the hardware state > for the booting kernel. > > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > > diff -r f531ed84b066 -r 62a8ce6595ad xen/arch/x86/apic.c > --- a/xen/arch/x86/apic.c Tue May 17 17:32:19 2011 +0100 > +++ b/xen/arch/x86/apic.c Wed May 18 19:00:13 2011 +0100 > @@ -74,6 +74,8 @@ u8 __read_mostly apic_verbosity; > static bool_t __initdata opt_x2apic = 1; > boolean_param("x2apic", opt_x2apic); > > +DEFINE_PER_CPU_READ_MOSTLY(enum apic_mode, apic_boot_mode) > APIC_MODE_INVALID; > + > bool_t __read_mostly x2apic_enabled = 0; > bool_t __read_mostly directed_eoi_enabled = 0; > > @@ -1437,6 +1439,41 @@ int __init APIC_init_uniprocessor (void) > return 0; > } > > +/* Needs to be called once per CPU during startup. It records the state the > BIOS > + * leaves the local APIC so we can tare back down upon shutdown/crash > + */ > +void __init record_boot_APIC_mode(void) > +{ > + enum apic_mode this_apic_mode; > + u64 msr_contents; > + > + this_apic_mode = APIC_MODE_INVALID; > + > + /* Sanity check - we should only ever run once */ > + BUG_ON( APIC_MODE_INVALID != this_cpu(apic_boot_mode) ); > + > + rdmsrl(MSR_IA32_APICBASE, msr_contents); > + > + /* Reading EXTD bit from the MSR is only valid if CPUID says so, else > reserved */ > + if ( cpu_has(¤t_cpu_data, X86_FEATURE_X2APIC) > + && (msr_contents & MSR_IA32_APICBASE_EXTD) ) > + this_apic_mode = APIC_MODE_X2APIC; > + else > + { > + /* EN bit should always be valid as long as we can read the MSR > + * Can anyone confirm this? > + */ > + if ( msr_contents & MSR_IA32_APICBASE_ENABLE ) > + this_apic_mode = APIC_MODE_XAPIC; > + else > + this_apic_mode = APIC_MODE_DISABLED; > + } > + > + this_cpu(apic_boot_mode) = this_apic_mode; > + apic_printk(APIC_DEBUG, "APIC boot state is %d on core #%d\n", > + this_apic_mode, smp_processor_id()); > +} > + > void check_for_unexpected_msi(unsigned int vector) > { > unsigned long v = apic_read(APIC_ISR + ((vector & ~0x1f) >> 1)); > diff -r f531ed84b066 -r 62a8ce6595ad xen/arch/x86/genapic/probe.c > --- a/xen/arch/x86/genapic/probe.c Tue May 17 17:32:19 2011 +0100 > +++ b/xen/arch/x86/genapic/probe.c Wed May 18 19:00:13 2011 +0100 > @@ -60,6 +60,8 @@ void __init generic_apic_probe(void) > { > int i, changed; > > + record_boot_APIC_mode(); > + > check_x2apic_preenabled(); > cmdline_apic = changed = (genapic != NULL); > > diff -r f531ed84b066 -r 62a8ce6595ad xen/arch/x86/smpboot.c > --- a/xen/arch/x86/smpboot.c Tue May 17 17:32:19 2011 +0100 > +++ b/xen/arch/x86/smpboot.c Wed May 18 19:00:13 2011 +0100 > @@ -388,6 +388,9 @@ void start_secondary(void *unused) > > microcode_resume_cpu(cpu); > > + /* Record boot apic state */ > + record_boot_APIC_mode(); > + > wmb(); > startup_cpu_idle_loop(); > } > diff -r f531ed84b066 -r 62a8ce6595ad xen/include/asm-x86/apic.h > --- a/xen/include/asm-x86/apic.h Tue May 17 17:32:19 2011 +0100 > +++ b/xen/include/asm-x86/apic.h Wed May 18 19:00:13 2011 +0100 > @@ -21,6 +21,16 @@ > #define IO_APIC_REDIR_DEST_LOGICAL 0x00800 > #define IO_APIC_REDIR_DEST_PHYSICAL 0x00000 > > +/* Possible APIC states */ > +enum apic_mode { APIC_MODE_INVALID, /* Not set yet */ > + APIC_MODE_DISABLED, /* Some bioses disable by default for > compatability */ > + APIC_MODE_XAPIC, /* xAPIC mode - default upon chipset > reset */ > + APIC_MODE_X2APIC /* x2APIC mode - common for large smp > machines */ > +}; > + > +/* PerCPU local APIC boot mode - so we can taredown to bios state */ > +DECLARE_PER_CPU(enum apic_mode, apic_boot_mode); > + > extern u8 apic_verbosity; > extern bool_t x2apic_enabled; > extern bool_t directed_eoi_enabled; > @@ -203,6 +213,7 @@ extern void disable_APIC_timer(void); > extern void enable_APIC_timer(void); > extern int lapic_suspend(void); > extern int lapic_resume(void); > +extern void record_boot_APIC_mode(void); > > extern int check_nmi_watchdog (void); > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Tian, Kevin
2011-May-19 00:54 UTC
RE: [Xen-devel] [PATCH 1 of 3] apic: record local apic state on boot
> From: Andrew Cooper > Sent: Thursday, May 19, 2011 4:28 AM > > On 18/05/11 19:49, Konrad Rzeszutek Wilk wrote: > > On Wed, May 18, 2011 at 07:08:14PM +0100, Andrew Cooper wrote: > >> Xen does not store the boot local apic state which leads to problems > >> when shutting down for a kexec jump. This patch records the boot > >> state so we can return to the boot state when kexec''ing. > >> > >> This is per CPU because all 3 bioses on the boxes I have tested dont > >> enabled all local apics on boot. As a result, we have to return to > >> the bios state so the ACPI tables match up with the hardware state > >> for the booting kernel. > > Which ACPI table requires this? > Cant remember offhand but linux (2.6.32) was doing finger pointing at the > multi-processor tablesit would be good to understand exactly how CPU states don''t match the ACPI table. As Keir commented, your current patch is bogus which may just work around the issue in another way. Thanks Kevin _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Tian, Kevin
2011-May-19 00:56 UTC
RE: [Xen-devel] [PATCH 1 of 3] apic: record local apic state on boot
> From: Konrad Rzeszutek Wilk > Sent: Thursday, May 19, 2011 4:43 AM > > On Wed, May 18, 2011 at 09:27:56PM +0100, Andrew Cooper wrote: > > > > > > On 18/05/11 19:49, Konrad Rzeszutek Wilk wrote: > > >On Wed, May 18, 2011 at 07:08:14PM +0100, Andrew Cooper wrote: > > >>Xen does not store the boot local apic state which leads to problems > > >>when shutting down for a kexec jump. This patch records the boot > > >>state so we can return to the boot state when kexec''ing. > > >> > > >>This is per CPU because all 3 bioses on the boxes I have tested dont > > >>enabled all local apics on boot. As a result, we have to return to > > >>the bios state so the ACPI tables match up with the hardware state > > >>for the booting kernel. > > >Which ACPI table requires this? > > Cant remember offhand but linux (2.6.32) was doing finger pointing at > > the multi-processor tables > > MP tables? Those don''t get used when ACPI is used..He may mean ACPI MADT table... Thanks Kevin _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Tian, Kevin
2011-May-19 03:31 UTC
RE: [Xen-devel] [PATCH 2 of 3] apic: remove ''enabled_via_apicbase'' variable
> From: Andrew Cooper > Sent: Thursday, May 19, 2011 4:36 AM > > > > >> which has been changed to correctly tare down the local APIC without > >> a > > teardown? > >> protection fault (which leads to a general protection fault). > > So if you don''t have x2apic, then it is wrong to disable the LAPIC mode? > > What about older hardware? > I guess I wasn''t very clear in my description. In older hardware without > x2apic, it is correct to simply twiddle the ENABLE bit in the APICBASE MSR. > However, with x2apic mode enabled, setting the ENABLE bit from 1 to 0 while > leaving the EXTD bit set will result in a protection fault which will propagate to > a general protection fault the same codepath will be called in the fault handler. > As a result, the current code in disable_local_APIC will result in a GPF if the > BIOS boots with LAPICs disabled (fine as per the spec for compatibility) and xen > decided to take advantage of x2apic mode.Though above is true a bug, I''m curious whether it''s a hard requirement to reset APIC state to the BIOS state, or whether there''s other way around to handle it in kexec path. Could you check how upstream Linux handles this? In a glimpse upstream Linux only tries to recover APIC to BIOS state for 32bit, and also lacks of special treatment on x2apic. Then how would 64bit linux work with kexec? Thanks Kevin _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2011-May-19 08:34 UTC
Re: [Xen-devel] [PATCH 1 of 3] apic: record local apic state on boot
On Wed, 2011-05-18 at 21:43 +0100, Konrad Rzeszutek Wilk wrote:> > >Email Vivek Goyal. He is the kexec/kdump maintainer. > > He doesn''t appear in the maintainers file, and I don''t have an email > > Hmm..This is what I see in the MAINTAINERS file: > KDUMP > M: Vivek Goyal <vgoyal@redhat.com> > M: Haren Myneni <hbabu@us.ibm.com> > L: kexec@lists.infradead.org > W: http://lse.sourceforge.net/kdump/ > S: Maintained > F: Documentation/kdump/That''s the Linux MAINTAINERS file, which is why Andrew probably didn''t look there since this is a Xen side patch. The question didn''t really seem like a kdump maintainer one in any case, more of a "person who knows about APICs one", perhaps those are sometimes the same person ;-) Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-May-19 11:14 UTC
Re: [Xen-devel] [PATCH 1 of 3] apic: record local apic state on boot
On 19/05/11 00:40, Keir Fraser wrote:> On 18/05/2011 19:08, "Andrew Cooper"<andrew.cooper3@citrix.com> wrote: > >> Xen does not store the boot local apic state which leads to problems >> when shutting down for a kexec jump. This patch records the boot >> state so we can return to the boot state when kexec''ing. >> >> This is per CPU because all 3 bioses on the boxes I have tested dont >> enabled all local apics on boot. > You mean some CPUs have APIC totally disabled?That is what I meant but now I have double checked the trace, I realize I was mis-interpreting the Xen APIC debug trace. As a result, I do agree that most of the rest of this patch is based on false assumptions.> Impossible: you need the APIC > enabled to be able to INIT-SIPI bootstrap the secondary processors.Nope. The MP spec section 3.8 states that all AP lapics must be disabled when the bios passes over to the OS. Section 3.7.3 states that the INIT-IPI is special and twiddles the PRST pin for external lapics, or the INIT pin for internal lapics. The result in either case is the lapic resetting, enabling themselves in the process, and reading the reset vector.> And Xen > itself has no code to enable disabled APICs (except some very legacy code > that executes only in the uniprocessor case, which I''m pretty sure you''re > not touching). > > You call your new function so late in secondary CPU bringup that our own > APIC setup has already run (via > smp_callin->{x2apic_ap_setup,setup_local_APIC}). Which would indicate that > the boot state of the secondary CPUs is not of importance to you, since you > I''m sure tested your patch as working okay in this current form, and that > would imply that you don''t need to record per-cpu boot state.Hmm I thought I had traced the codepath but I clearly got that wrong. Either way, given the new clarification from the MP spec, my suggestion would be to fully disable the lapics in the nmi_crash_handler just before looping over halt(). Experimental evidence shows however that linux 2.6.32 kdump (and presumably earlier) can not function if its BSP lapic is in x2apic mode when the bios would have left it disabled or in xapic mode.> The singleshot BUG_ON in record_boot_APIC_mode is broken, since secondary > CPUs can be started up multiple times (we can offline/online them). You need > to bail on repeated invocations. Or not be recording boot state for > secondary cpus since apparently it isn''t needed...Fair enough - I didn''t realize that it was possible to be called several times> We''re going to be going through these patches very slowly and painfully, > because frankly the arguments and assertions seem to be plenty full of > holes. > > -- Keir >> As a result, we have to return to >> the bios state so the ACPI tables match up with the hardware state >> for the booting kernel. >> >> Signed-off-by: Andrew Cooper<andrew.cooper3@citrix.com> >> >> diff -r f531ed84b066 -r 62a8ce6595ad xen/arch/x86/apic.c >> --- a/xen/arch/x86/apic.c Tue May 17 17:32:19 2011 +0100 >> +++ b/xen/arch/x86/apic.c Wed May 18 19:00:13 2011 +0100 >> @@ -74,6 +74,8 @@ u8 __read_mostly apic_verbosity; >> static bool_t __initdata opt_x2apic = 1; >> boolean_param("x2apic", opt_x2apic); >> >> +DEFINE_PER_CPU_READ_MOSTLY(enum apic_mode, apic_boot_mode) >> APIC_MODE_INVALID; >> + >> bool_t __read_mostly x2apic_enabled = 0; >> bool_t __read_mostly directed_eoi_enabled = 0; >> >> @@ -1437,6 +1439,41 @@ int __init APIC_init_uniprocessor (void) >> return 0; >> } >> >> +/* Needs to be called once per CPU during startup. It records the state the >> BIOS >> + * leaves the local APIC so we can tare back down upon shutdown/crash >> + */ >> +void __init record_boot_APIC_mode(void) >> +{ >> + enum apic_mode this_apic_mode; >> + u64 msr_contents; >> + >> + this_apic_mode = APIC_MODE_INVALID; >> + >> + /* Sanity check - we should only ever run once */ >> + BUG_ON( APIC_MODE_INVALID != this_cpu(apic_boot_mode) ); >> + >> + rdmsrl(MSR_IA32_APICBASE, msr_contents); >> + >> + /* Reading EXTD bit from the MSR is only valid if CPUID says so, else >> reserved */ >> + if ( cpu_has(¤t_cpu_data, X86_FEATURE_X2APIC) >> +&& (msr_contents& MSR_IA32_APICBASE_EXTD) ) >> + this_apic_mode = APIC_MODE_X2APIC; >> + else >> + { >> + /* EN bit should always be valid as long as we can read the MSR >> + * Can anyone confirm this? >> + */ >> + if ( msr_contents& MSR_IA32_APICBASE_ENABLE ) >> + this_apic_mode = APIC_MODE_XAPIC; >> + else >> + this_apic_mode = APIC_MODE_DISABLED; >> + } >> + >> + this_cpu(apic_boot_mode) = this_apic_mode; >> + apic_printk(APIC_DEBUG, "APIC boot state is %d on core #%d\n", >> + this_apic_mode, smp_processor_id()); >> +} >> + >> void check_for_unexpected_msi(unsigned int vector) >> { >> unsigned long v = apic_read(APIC_ISR + ((vector& ~0x1f)>> 1)); >> diff -r f531ed84b066 -r 62a8ce6595ad xen/arch/x86/genapic/probe.c >> --- a/xen/arch/x86/genapic/probe.c Tue May 17 17:32:19 2011 +0100 >> +++ b/xen/arch/x86/genapic/probe.c Wed May 18 19:00:13 2011 +0100 >> @@ -60,6 +60,8 @@ void __init generic_apic_probe(void) >> { >> int i, changed; >> >> + record_boot_APIC_mode(); >> + >> check_x2apic_preenabled(); >> cmdline_apic = changed = (genapic != NULL); >> >> diff -r f531ed84b066 -r 62a8ce6595ad xen/arch/x86/smpboot.c >> --- a/xen/arch/x86/smpboot.c Tue May 17 17:32:19 2011 +0100 >> +++ b/xen/arch/x86/smpboot.c Wed May 18 19:00:13 2011 +0100 >> @@ -388,6 +388,9 @@ void start_secondary(void *unused) >> >> microcode_resume_cpu(cpu); >> >> + /* Record boot apic state */ >> + record_boot_APIC_mode(); >> + >> wmb(); >> startup_cpu_idle_loop(); >> } >> diff -r f531ed84b066 -r 62a8ce6595ad xen/include/asm-x86/apic.h >> --- a/xen/include/asm-x86/apic.h Tue May 17 17:32:19 2011 +0100 >> +++ b/xen/include/asm-x86/apic.h Wed May 18 19:00:13 2011 +0100 >> @@ -21,6 +21,16 @@ >> #define IO_APIC_REDIR_DEST_LOGICAL 0x00800 >> #define IO_APIC_REDIR_DEST_PHYSICAL 0x00000 >> >> +/* Possible APIC states */ >> +enum apic_mode { APIC_MODE_INVALID, /* Not set yet */ >> + APIC_MODE_DISABLED, /* Some bioses disable by default for >> compatability */ >> + APIC_MODE_XAPIC, /* xAPIC mode - default upon chipset >> reset */ >> + APIC_MODE_X2APIC /* x2APIC mode - common for large smp >> machines */ >> +}; >> + >> +/* PerCPU local APIC boot mode - so we can taredown to bios state */ >> +DECLARE_PER_CPU(enum apic_mode, apic_boot_mode); >> + >> extern u8 apic_verbosity; >> extern bool_t x2apic_enabled; >> extern bool_t directed_eoi_enabled; >> @@ -203,6 +213,7 @@ extern void disable_APIC_timer(void); >> extern void enable_APIC_timer(void); >> extern int lapic_suspend(void); >> extern int lapic_resume(void); >> +extern void record_boot_APIC_mode(void); >> >> extern int check_nmi_watchdog (void); >> >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xensource.com >> http://lists.xensource.com/xen-devel >-- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-May-19 14:26 UTC
Re: [Xen-devel] [PATCH 1 of 3] apic: record local apic state on boot
> evidence shows however that linux 2.6.32 kdump (and presumablyI would suggest you use the latest linux kernel (2.6.39) to look at the kdump implementation, much fresher. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-May-19 14:32 UTC
Re: [Xen-devel] [PATCH 3 of 3] kexec: disable iommu jumping into the kdump kernel
> >>>What about AMD VI IOMMUs? Does it work when that IOMMU is used? > >>> > >>It worked on the AMD box I tested the code on. Like the comment > >>says - as far as I can tell, it is architecture independent code. > >>>>+ */ > >>>>+ iommu_disable_x2apic_IR(); > >>>Can''t that function be done in the suspend code of the IOMMU? > >>There is a comment in iommu suspend stating that it cant and isn''t > >>done, but rather is left for the local/ioapic_suspend functions > >>which dont properly work in the kexec path. > >OK, how about just moving it out of driver/passthrought/vtd then? > Because that code is fragile enough without me poking about in it. > I would prefer someone with more knowledge about IOMMU to make that > call.OK. Lets CC him here then. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Andrew Cooper
2011-May-19 14:43 UTC
Re: [Xen-devel] [PATCH 1 of 3] apic: record local apic state on boot
On 19/05/11 15:26, Konrad Rzeszutek Wilk wrote:>> evidence shows however that linux 2.6.32 kdump (and presumably > I would suggest you use the latest linux kernel (2.6.39) to > look at the kdump implementation, much fresher.True, but at the moment, my problem is to fix the fact that our next release of XenServer (which runs 2.6.32) doesn''t successfully kdump. Therefore, that was the base of my investigation of the problem. I will experiment with later kernels when I get a chance, but from comparing the similarities in the codepaths, I''m not sure the results will be much different. -- Andrew Cooper - Dom0 Kernel Engineer, Citrix XenServer T: +44 (0)1223 225 900, http://www.citrix.com _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-May-19 16:21 UTC
Re: [Xen-devel] [PATCH 1 of 3] apic: record local apic state on boot
On Thu, May 19, 2011 at 09:34:46AM +0100, Ian Campbell wrote:> On Wed, 2011-05-18 at 21:43 +0100, Konrad Rzeszutek Wilk wrote: > > > >Email Vivek Goyal. He is the kexec/kdump maintainer. > > > He doesn''t appear in the maintainers file, and I don''t have an email > > > > Hmm..This is what I see in the MAINTAINERS file: > > KDUMP > > M: Vivek Goyal <vgoyal@redhat.com> > > M: Haren Myneni <hbabu@us.ibm.com> > > L: kexec@lists.infradead.org > > W: http://lse.sourceforge.net/kdump/ > > S: Maintained > > F: Documentation/kdump/ > > That''s the Linux MAINTAINERS file, which is why Andrew probably didn''t > look there since this is a Xen side patch. The question didn''t really > seem like a kdump maintainer one in any case, more of a "person who > knows about APICs one", perhaps those are sometimes the same person ;-)Oh yeah. I sat next to Vivek when he was cursing out IO-APICs, SCSI controllers, network cards being brain-dead when the Linux kernel barfed. He is your man. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Kay, Allen M
2011-May-20 00:33 UTC
RE: [Xen-devel] [PATCH 3 of 3] kexec: disable iommu jumping into the kdump kernel
The functions iommu_enable_x2apic_IR()/iommu_disable_x2apic_IR() should really be architectural specific. They should not be called from common code without going through iommu API. The reason it worked on AMD box is because it returns if the list acpi_drhd_units is empty. On AMD box, this list is empty since it is only populated only on Intel VT-d enabled systems. We will clean this up. I don''t know why is disable_intremap() called separately. It seems to me we should be able to call disable_qinval() and disable_intremap() in vtd_suspend(). I will give it a try tomorrow and see if I can find a clue the code is written this way. Allen -----Original Message----- From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com] Sent: Thursday, May 19, 2011 7:32 AM To: Andrew Cooper; Kay, Allen M Cc: xen-devel@lists.xensource.com Subject: Re: [Xen-devel] [PATCH 3 of 3] kexec: disable iommu jumping into the kdump kernel> >>>What about AMD VI IOMMUs? Does it work when that IOMMU is used? > >>> > >>It worked on the AMD box I tested the code on. Like the comment > >>says - as far as I can tell, it is architecture independent code. > >>>>+ */ > >>>>+ iommu_disable_x2apic_IR(); > >>>Can''t that function be done in the suspend code of the IOMMU? > >>There is a comment in iommu suspend stating that it cant and isn''t > >>done, but rather is left for the local/ioapic_suspend functions > >>which dont properly work in the kexec path. > >OK, how about just moving it out of driver/passthrought/vtd then? > Because that code is fragile enough without me poking about in it. > I would prefer someone with more knowledge about IOMMU to make that > call.OK. Lets CC him here then. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Kay, Allen M
2011-May-20 21:55 UTC
RE: [Xen-devel] [PATCH 3 of 3] kexec: disable iommu jumping into the kdump kernel
Andrew, Given that there is some cleanup work needed with x2apic/suspend, you can go ahead and add calls to disable_intremap() and disable_qinval() in vtd_suspend() for now. This seems the right thing to do by looking at the code. We will look at this again when we revisit and clean up x2apic/suspend code. Once you are done with kexec work, can you write a xen wiki page so that we can follow your instructions to test kexec when we touch this part of the code? Allen -----Original Message----- From: Kay, Allen M Sent: Thursday, May 19, 2011 5:33 PM To: ''Konrad Rzeszutek Wilk''; Andrew Cooper Cc: xen-devel@lists.xensource.com Subject: RE: [Xen-devel] [PATCH 3 of 3] kexec: disable iommu jumping into the kdump kernel The functions iommu_enable_x2apic_IR()/iommu_disable_x2apic_IR() should really be architectural specific. They should not be called from common code without going through iommu API. The reason it worked on AMD box is because it returns if the list acpi_drhd_units is empty. On AMD box, this list is empty since it is only populated only on Intel VT-d enabled systems. We will clean this up. I don''t know why is disable_intremap() called separately. It seems to me we should be able to call disable_qinval() and disable_intremap() in vtd_suspend(). I will give it a try tomorrow and see if I can find a clue the code is written this way. Allen -----Original Message----- From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com] Sent: Thursday, May 19, 2011 7:32 AM To: Andrew Cooper; Kay, Allen M Cc: xen-devel@lists.xensource.com Subject: Re: [Xen-devel] [PATCH 3 of 3] kexec: disable iommu jumping into the kdump kernel> >>>What about AMD VI IOMMUs? Does it work when that IOMMU is used? > >>> > >>It worked on the AMD box I tested the code on. Like the comment > >>says - as far as I can tell, it is architecture independent code. > >>>>+ */ > >>>>+ iommu_disable_x2apic_IR(); > >>>Can''t that function be done in the suspend code of the IOMMU? > >>There is a comment in iommu suspend stating that it cant and isn''t > >>done, but rather is left for the local/ioapic_suspend functions > >>which dont properly work in the kexec path. > >OK, how about just moving it out of driver/passthrought/vtd then? > Because that code is fragile enough without me poking about in it. > I would prefer someone with more knowledge about IOMMU to make that > call.OK. Lets CC him here then. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel