Going to ask Linus to pull these patches for 3.4-rc4. arch/x86/xen/enlighten.c | 4 +++- arch/x86/xen/smp.c | 15 +++++++++++++++ drivers/xen/events.c | 2 +- drivers/xen/xen-acpi-processor.c | 5 ++++- 4 files changed, 23 insertions(+), 3 deletions(-) Konrad Rzeszutek Wilk (3): xen/enlighten: Disable MWAIT_LEAF so that acpi-pad won''t be loaded. xen/smp: Fix crash when booting with ACPI hotplug CPUs. xen/acpi: Workaround broken BIOSes exporting non-existing C-states. Stefano Stabellini (1): xen: use the pirq number to check the pirq_eoi_map
Konrad Rzeszutek Wilk
2012-Apr-26 18:29 UTC
[PATCH 1/4] xen/enlighten: Disable MWAIT_LEAF so that acpi-pad won''t be loaded.
There are exactly four users of __monitor and __mwait: - cstate.c (which allows acpi_processor_ffh_cstate_enter to be called when the cpuidle API drivers are used. However patch "cpuidle: replace xen access to x86 pm_idle and default_idle" provides a mechanism to disable the cpuidle and use safe_halt. - smpboot (which allows mwait_play_dead to be called). However safe_halt is always used so we skip that. - intel_idle (same deal as above). - acpi_pad.c. This the one that we do not want to run as we will hit the below crash. Why do we want to expose MWAIT_LEAF in the first place? We want it for the xen-acpi-processor driver - which uploads C-states to the hypervisor. If MWAIT_LEAF is set, the cstate.c sets the proper address in the C-states so that the hypervisor can benefit from using the MWAIT functionality. And that is the sole reason for using it. Without this patch, if a module performs mwait or monitor we get this: invalid opcode: 0000 [#1] SMP CPU 2 .. snip.. Pid: 5036, comm: insmod Tainted: G O 3.4.0-rc2upstream-dirty #2 Intel Corporation S2600CP/S2600CP RIP: e030:[<ffffffffa000a017>] [<ffffffffa000a017>] mwait_check_init+0x17/0x1000 [mwait_check] RSP: e02b:ffff8801c298bf18 EFLAGS: 00010282 RAX: ffff8801c298a010 RBX: ffffffffa03b2000 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff8801c29800d8 RDI: ffff8801ff097200 RBP: ffff8801c298bf18 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 R13: ffffffffa000a000 R14: 0000005148db7294 R15: 0000000000000003 FS: 00007fbb364f2700(0000) GS:ffff8801ff08c000(0000) knlGS:0000000000000000 CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 000000000179f038 CR3: 00000001c9469000 CR4: 0000000000002660 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process insmod (pid: 5036, threadinfo ffff8801c298a000, task ffff8801c29cd7e0) Stack: ffff8801c298bf48 ffffffff81002124 ffffffffa03b2000 00000000000081fd 000000000178f010 000000000178f030 ffff8801c298bf78 ffffffff810c41e6 00007fff3fb30db9 00007fff3fb30db9 00000000000081fd 0000000000010000 Call Trace: [<ffffffff81002124>] do_one_initcall+0x124/0x170 [<ffffffff810c41e6>] sys_init_module+0xc6/0x220 [<ffffffff815b15b9>] system_call_fastpath+0x16/0x1b Code: <0f> 01 c8 31 c0 0f 01 c9 c9 c3 00 00 00 00 00 00 00 00 00 00 00 00 RIP [<ffffffffa000a017>] mwait_check_init+0x17/0x1000 [mwait_check] RSP <ffff8801c298bf18> ---[ end trace 16582fc8a3d1e29a ]--- Kernel panic - not syncing: Fatal exception With this module (which is what acpi_pad.c would hit): MODULE_AUTHOR("Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>"); MODULE_DESCRIPTION("mwait_check_and_back"); MODULE_LICENSE("GPL"); MODULE_VERSION(); static int __init mwait_check_init(void) { __monitor((void *)¤t_thread_info()->flags, 0, 0); __mwait(0, 0); return 0; } static void __exit mwait_check_exit(void) { } module_init(mwait_check_init); module_exit(mwait_check_exit); Reported-by: Liu, Jinsong <jinsong.liu@intel.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- arch/x86/xen/enlighten.c | 4 +++- 1 files changed, 3 insertions(+), 1 deletions(-) diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c index 4f51beb..407008d 100644 --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -349,9 +349,11 @@ static void __init xen_init_cpuid_mask(void) /* Xen will set CR4.OSXSAVE if supported and not disabled by force */ if ((cx & xsave_mask) != xsave_mask) cpuid_leaf1_ecx_mask &= ~xsave_mask; /* disable XSAVE & OSXSAVE */ - +#if !defined(CONFIG_ACPI_PROCESSOR_AGGREGATOR) && \ + !defined(CONFIG_ACPI_PROCESSOR_AGGREGATOR_MODULE) if (xen_check_mwait()) cpuid_leaf1_ecx_set_mask = (1 << (X86_FEATURE_MWAIT % 32)); +#endif } static void xen_set_debugreg(int reg, unsigned long val) -- 1.7.7.5
Konrad Rzeszutek Wilk
2012-Apr-26 18:29 UTC
[PATCH 2/4] xen: use the pirq number to check the pirq_eoi_map
From: Stefano Stabellini <stefano.stabellini@eu.citrix.com> In pirq_check_eoi_map use the pirq number rather than the Linux irq number to check whether an eoi is needed in the pirq_eoi_map. The reason is that the irq number is not always identical to the pirq number so if we wrongly use the irq number to check the pirq_eoi_map we are going to check for the wrong pirq to EOI. As a consequence some interrupts might not be EOI''ed by the guest correctly. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Tested-by: Tobias Geiger <tobias.geiger@vido.info> [v1: Added some extra wording to git commit] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- drivers/xen/events.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/xen/events.c b/drivers/xen/events.c index 4b33acd..0a8a17c 100644 --- a/drivers/xen/events.c +++ b/drivers/xen/events.c @@ -274,7 +274,7 @@ static unsigned int cpu_from_evtchn(unsigned int evtchn) static bool pirq_check_eoi_map(unsigned irq) { - return test_bit(irq, pirq_eoi_map); + return test_bit(pirq_from_irq(irq), pirq_eoi_map); } static bool pirq_needs_eoi_flag(unsigned irq) -- 1.7.7.5
Konrad Rzeszutek Wilk
2012-Apr-26 18:29 UTC
[PATCH 3/4] xen/smp: Fix crash when booting with ACPI hotplug CPUs.
When we boot on a machine that can hotplug CPUs and we are using ''dom0_max_vcpus=X'' on the Xen hypervisor line to clip the amount of CPUs available to the initial domain, we get this: (XEN) Command line: com1=115200,8n1 dom0_mem=8G noreboot dom0_max_vcpus=8 sync_console mce_verbosity=verbose console=com1,vga loglvl=all guest_loglvl=all .. snip.. DMI: Intel Corporation S2600CP/S2600CP, BIOS SE5C600.86B.99.99.x032.072520111118 07/25/2011 .. snip. SMP: Allowing 64 CPUs, 32 hotplug CPUs installing Xen timer for CPU 7 cpu 7 spinlock event irq 361 NMI watchdog: disabled (cpu7): hardware events not enabled Brought up 8 CPUs .. snip.. [acpi processor finds the CPUs are not initialized and starts calling arch_register_cpu, which creates /sys/devices/system/cpu/cpu8/online] CPU 8 got hotplugged CPU 9 got hotplugged CPU 10 got hotplugged .. snip.. initcall 1_acpi_battery_init_async+0x0/0x1b returned 0 after 406 usecs calling erst_init+0x0/0x2bb @ 1 [and the scheduler sticks newly started tasks on the new CPUs, but said CPUs cannot be initialized b/c the hypervisor has limited the amount of vCPUS to 8 - as per the dom0_max_vcpus=8 flag. The spinlock tries to kick the other CPU, but the structure for that is not initialized and we crash.] BUG: unable to handle kernel paging request at fffffffffffffed8 IP: [<ffffffff81035289>] xen_spin_lock+0x29/0x60 PGD 180d067 PUD 180e067 PMD 0 Oops: 0002 [#1] SMP CPU 7 Modules linked in: Pid: 1, comm: swapper/0 Not tainted 3.4.0-rc2upstream-00001-gf5154e8 #1 Intel Corporation S2600CP/S2600CP RIP: e030:[<ffffffff81035289>] [<ffffffff81035289>] xen_spin_lock+0x29/0x60 RSP: e02b:ffff8801fb9b3a70 EFLAGS: 00010282 With this patch, we cap the amount of vCPUS that the initial domain can run, to exactly what dom0_max_vcpus=X has specified. In the future, if there is a hypercall that will allow a running domain to expand past its initial set of vCPUS, this patch should be re-evaluated. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- arch/x86/xen/smp.c | 15 +++++++++++++++ 1 files changed, 15 insertions(+), 0 deletions(-) diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c index 5fac691..05da787 100644 --- a/arch/x86/xen/smp.c +++ b/arch/x86/xen/smp.c @@ -178,6 +178,7 @@ static void __init xen_fill_possible_map(void) static void __init xen_filter_cpu_maps(void) { int i, rc; + unsigned int subtract; if (!xen_initial_domain()) return; @@ -192,8 +193,22 @@ static void __init xen_filter_cpu_maps(void) } else { set_cpu_possible(i, false); set_cpu_present(i, false); + subtract++; } } +#ifdef CONFIG_HOTPLUG_CPU + /* This is akin to using ''nr_cpus'' on the Linux command line. + * Which is OK as when we use ''dom0_max_vcpus=X'' we can only + * have up to X, while nr_cpu_ids is greater than X. This + * normally is not a problem, except when CPU hotplugging + * is involved and then there might be more than X CPUs + * in the guest - which will not work as there is no + * hypercall to expand the max number of VCPUs an already + * running guest has. So cap it up to X. */ + if (subtract) + nr_cpu_ids = nr_cpu_ids - subtract; +#endif + } static void __init xen_smp_prepare_boot_cpu(void) -- 1.7.7.5
Konrad Rzeszutek Wilk
2012-Apr-26 18:29 UTC
[PATCH 4/4] xen/acpi: Workaround broken BIOSes exporting non-existing C-states.
We did a similar check for the P-states but did not do it for the C-states. What we want to do is ignore cases where the DSDT has definition for sixteen CPUs, but the machine only has eight CPUs and we get: xen-acpi-processor: (CX): Hypervisor error (-22) for ACPI CPU14 Reported-by: Tobias Geiger <tobias.geiger@vido.info> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- drivers/xen/xen-acpi-processor.c | 5 ++++- 1 files changed, 4 insertions(+), 1 deletions(-) diff --git a/drivers/xen/xen-acpi-processor.c b/drivers/xen/xen-acpi-processor.c index 174b565..0b48579 100644 --- a/drivers/xen/xen-acpi-processor.c +++ b/drivers/xen/xen-acpi-processor.c @@ -128,7 +128,10 @@ static int push_cxx_to_hypervisor(struct acpi_processor *_pr) pr_debug(" C%d: %s %d uS\n", cx->type, cx->desc, (u32)cx->latency); } - } else + } else if (ret != -EINVAL) + /* EINVAL means the ACPI ID is incorrect - meaning the ACPI + * table is referencing a non-existing CPU - which can happen + * with broken ACPI tables. */ pr_err(DRV_NAME "(CX): Hypervisor error (%d) for ACPI CPU%u\n", ret, _pr->acpi_id); -- 1.7.7.5