Konrad Rzeszutek Wilk
2013-Apr-16 20:08 UTC
[PATCH] fixes for v3.10 in the CPU hotplug patch (v1).
The first three patches fix outstanding issues with v3.9 (and earlier) kernels where the simple sequence of: echo 0 > /sys/devices/system/cpu/cpu1/online echo 1 > /sys/devices/system/cpu/cpu1/online would embarrassingly not work. As such they also have the stable@vger.kernel.org on them. [PATCH 1/9] xen/smp: Fix leakage of timer interrupt line for every [PATCH 2/9] xen/smp/spinlock: Fix leakage of the spinlock interrupt [PATCH 3/9] xen/time: Fix kasprintf splat when allocating timer%d IRQ The rest are just cleanups and some coalescing of the PV and PVHVM paths. arch/x86/xen/enlighten.c | 5 ++++- arch/x86/xen/smp.c | 21 ++++++++++++++------- arch/x86/xen/spinlock.c | 25 +++++++++++++++++++++++++ arch/x86/xen/time.c | 13 ++++++++++--- drivers/xen/events.c | 20 +++++++++++++++++++- 5 files changed, 72 insertions(+), 12 deletions(-) Konrad Rzeszutek Wilk (9): xen/smp: Fix leakage of timer interrupt line for every CPU online/offline. xen/smp/spinlock: Fix leakage of the spinlock interrupt line for every CPU online/offline xen/time: Fix kasprintf splat when allocating timer%d IRQ line. xen/events: Check that IRQ value passed in is valid. xen/time: Add default value of -1 for IRQ and check for that. xen/spinlock: Check against default value of -1 for IRQ line. xen/spinlock: Disable IRQ spinlock (PV) allocation on PVHVM xen/smp/pvhvm: Don''t initialize IRQ_WORKER as we are using the native one. xen/smp: Unifiy some of the PVs and PVHVM offline CPU path
Konrad Rzeszutek Wilk
2013-Apr-16 20:08 UTC
[PATCH 1/9] xen/smp: Fix leakage of timer interrupt line for every CPU online/offline.
In the PVHVM path when we do CPU online/offline path we would leak the timer%d IRQ line everytime we do a offline event. The online path (xen_hvm_setup_cpu_clockevents via x86_cpuinit.setup_percpu_clockev) would allocate a new interrupt line for the timer%d. But we would still use the old interrupt line leading to: kernel BUG at /home/konrad/ssd/konrad/linux/kernel/hrtimer.c:1261! invalid opcode: 0000 [#1] SMP RIP: 0010:[<ffffffff810b9e21>] [<ffffffff810b9e21>] hrtimer_interrupt+0x261/0x270 .. snip.. <IRQ> [<ffffffff810445ef>] xen_timer_interrupt+0x2f/0x1b0 [<ffffffff81104825>] ? stop_machine_cpu_stop+0xb5/0xf0 [<ffffffff8111434c>] handle_irq_event_percpu+0x7c/0x240 [<ffffffff811175b9>] handle_percpu_irq+0x49/0x70 [<ffffffff813a74a3>] __xen_evtchn_do_upcall+0x1c3/0x2f0 [<ffffffff813a760a>] xen_evtchn_do_upcall+0x2a/0x40 [<ffffffff8167c26d>] xen_hvm_callback_vector+0x6d/0x80 <EOI> [<ffffffff81666d01>] ? start_secondary+0x193/0x1a8 [<ffffffff81666cfd>] ? start_secondary+0x18f/0x1a8 There is also the oddity (timer1) in the /proc/interrupts after offlining CPU1: 64: 1121 0 xen-percpu-virq timer0 78: 0 0 xen-percpu-virq timer1 84: 0 2483 xen-percpu-virq timer2 This patch fixes it. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> CC: stable@vger.kernel.org --- arch/x86/xen/smp.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c index 09ea61d..f80e69c 100644 --- a/arch/x86/xen/smp.c +++ b/arch/x86/xen/smp.c @@ -662,6 +662,7 @@ static void xen_hvm_cpu_die(unsigned int cpu) unbind_from_irqhandler(per_cpu(xen_debug_irq, cpu), NULL); unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu), NULL); unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL); + xen_teardown_timer(cpu); native_cpu_die(cpu); } -- 1.8.1.4
Konrad Rzeszutek Wilk
2013-Apr-16 20:09 UTC
[PATCH 2/9] xen/smp/spinlock: Fix leakage of the spinlock interrupt line for every CPU online/offline
While we don''t use the spinlock interrupt line (see for details commit f10cd522c5fbfec9ae3cc01967868c9c2401ed23 - xen: disable PV spinlocks on HVM) - we should still do the proper init / deinit sequence. We did not do that correctly and for the CPU init for PVHVM guest we would allocate an interrupt line - but failed to deallocate the old interrupt line. This resulted in leakage of an irq_desc but more importantly this splat as we online an offlined CPU: genirq: Flags mismatch irq 71. 0002cc20 (spinlock1) vs. 0002cc20 (spinlock1) Pid: 2542, comm: init.late Not tainted 3.9.0-rc6upstream #1 Call Trace: [<ffffffff811156de>] __setup_irq+0x23e/0x4a0 [<ffffffff81194191>] ? kmem_cache_alloc_trace+0x221/0x250 [<ffffffff811161bb>] request_threaded_irq+0xfb/0x160 [<ffffffff8104c6f0>] ? xen_spin_trylock+0x20/0x20 [<ffffffff813a8423>] bind_ipi_to_irqhandler+0xa3/0x160 [<ffffffff81303758>] ? kasprintf+0x38/0x40 [<ffffffff8104c6f0>] ? xen_spin_trylock+0x20/0x20 [<ffffffff810cad35>] ? update_max_interval+0x15/0x40 [<ffffffff816605db>] xen_init_lock_cpu+0x3c/0x78 [<ffffffff81660029>] xen_hvm_cpu_notify+0x29/0x33 [<ffffffff81676bdd>] notifier_call_chain+0x4d/0x70 [<ffffffff810bb2a9>] __raw_notifier_call_chain+0x9/0x10 [<ffffffff8109402b>] __cpu_notify+0x1b/0x30 [<ffffffff8166834a>] _cpu_up+0xa0/0x14b [<ffffffff816684ce>] cpu_up+0xd9/0xec [<ffffffff8165f754>] store_online+0x94/0xd0 [<ffffffff8141d15b>] dev_attr_store+0x1b/0x20 [<ffffffff81218f44>] sysfs_write_file+0xf4/0x170 [<ffffffff811a2864>] vfs_write+0xb4/0x130 [<ffffffff811a302a>] sys_write+0x5a/0xa0 [<ffffffff8167ada9>] system_call_fastpath+0x16/0x1b cpu 1 spinlock event irq -16 smpboot: Booting Node 0 Processor 1 APIC 0x2 And if one looks at the /proc/interrupts right after offlining (CPU1): 70: 0 0 xen-percpu-ipi spinlock0 71: 0 0 xen-percpu-ipi spinlock1 77: 0 0 xen-percpu-ipi spinlock2 There is the oddity of the ''spinlock1'' still being present. CC: stable@vger.kernel.org Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- arch/x86/xen/smp.c | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c index f80e69c..22c800a 100644 --- a/arch/x86/xen/smp.c +++ b/arch/x86/xen/smp.c @@ -662,6 +662,7 @@ static void xen_hvm_cpu_die(unsigned int cpu) unbind_from_irqhandler(per_cpu(xen_debug_irq, cpu), NULL); unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu), NULL); unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL); + xen_uninit_lock_cpu(cpu); xen_teardown_timer(cpu); native_cpu_die(cpu); } -- 1.8.1.4
Konrad Rzeszutek Wilk
2013-Apr-16 20:09 UTC
[PATCH 3/9] xen/time: Fix kasprintf splat when allocating timer%d IRQ line.
When we online the CPU, we get this splat: smpboot: Booting Node 0 Processor 1 APIC 0x2 installing Xen timer for CPU 1 BUG: sleeping function called from invalid context at /home/konrad/ssd/konrad/linux/mm/slab.c:3179 in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/1 Pid: 0, comm: swapper/1 Not tainted 3.9.0-rc6upstream-00001-g3884fad #1 Call Trace: [<ffffffff810c1fea>] __might_sleep+0xda/0x100 [<ffffffff81194617>] __kmalloc_track_caller+0x1e7/0x2c0 [<ffffffff81303758>] ? kasprintf+0x38/0x40 [<ffffffff813036eb>] kvasprintf+0x5b/0x90 [<ffffffff81303758>] kasprintf+0x38/0x40 [<ffffffff81044510>] xen_setup_timer+0x30/0xb0 [<ffffffff810445af>] xen_hvm_setup_cpu_clockevents+0x1f/0x30 [<ffffffff81666d0a>] start_secondary+0x19c/0x1a8 The solution to that is use kasprintf in the CPU hotplug path that ''online''s the CPU. That is, do it in in xen_hvm_cpu_notify, and remove the call to in xen_hvm_setup_cpu_clockevents. Unfortunatly the later is not a good idea as the bootup path does not use xen_hvm_cpu_notify so we would end up never allocating timer%d interrupt lines when booting. As such add the check for atomic() to continue. CC: stable@vger.kernel.org Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- arch/x86/xen/enlighten.c | 5 ++++- arch/x86/xen/time.c | 6 +++++- 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c index 47d3243..ddbd54a 100644 --- a/arch/x86/xen/enlighten.c +++ b/arch/x86/xen/enlighten.c @@ -1641,8 +1641,11 @@ static int __cpuinit xen_hvm_cpu_notify(struct notifier_block *self, switch (action) { case CPU_UP_PREPARE: xen_vcpu_setup(cpu); - if (xen_have_vector_callback) + if (xen_have_vector_callback) { xen_init_lock_cpu(cpu); + if (xen_feature(XENFEAT_hvm_safe_pvclock)) + xen_setup_timer(cpu); + } break; default: break; diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c index 0296a95..054cc01 100644 --- a/arch/x86/xen/time.c +++ b/arch/x86/xen/time.c @@ -497,7 +497,11 @@ static void xen_hvm_setup_cpu_clockevents(void) { int cpu = smp_processor_id(); xen_setup_runstate_info(cpu); - xen_setup_timer(cpu); + /* + * xen_setup_timer(cpu) - snprintf is bad in atomic context. Hence + * doing it xen_hvm_cpu_notify (which gets called by smp_init during + * early bootup and also during CPU hotplug events). + */ xen_setup_cpu_clockevents(); } -- 1.8.1.4
Konrad Rzeszutek Wilk
2013-Apr-16 20:09 UTC
[PATCH 4/9] xen/events: Check that IRQ value passed in is valid.
We naively assume that the IRQ value passed in is correct. If it is not, then any dereference operation for the ''info'' structure will result in crash - so might as well guard ourselves and sprinkle copious amounts of WARN_ON. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- drivers/xen/events.c | 20 +++++++++++++++++++- 1 file changed, 19 insertions(+), 1 deletion(-) diff --git a/drivers/xen/events.c b/drivers/xen/events.c index bb65f75..94daed1 100644 --- a/drivers/xen/events.c +++ b/drivers/xen/events.c @@ -515,6 +515,9 @@ static void xen_free_irq(unsigned irq) { struct irq_info *info = irq_get_handler_data(irq); + if (WARN_ON(!info)) + return; + list_del(&info->list); irq_set_handler_data(irq, NULL); @@ -1003,6 +1006,9 @@ static void unbind_from_irq(unsigned int irq) int evtchn = evtchn_from_irq(irq); struct irq_info *info = irq_get_handler_data(irq); + if (WARN_ON(!info)) + return; + mutex_lock(&irq_mapping_update_lock); if (info->refcnt > 0) { @@ -1130,6 +1136,10 @@ int bind_ipi_to_irqhandler(enum ipi_vector ipi, void unbind_from_irqhandler(unsigned int irq, void *dev_id) { + struct irq_info *info = irq_get_handler_data(irq); + + if (WARN_ON(!info)) + return; free_irq(irq, dev_id); unbind_from_irq(irq); } @@ -1441,6 +1451,9 @@ void rebind_evtchn_irq(int evtchn, int irq) { struct irq_info *info = info_for_irq(irq); + if (WARN_ON(!info)) + return; + /* Make sure the irq is masked, since the new event channel will also be masked. */ disable_irq(irq); @@ -1714,7 +1727,12 @@ void xen_poll_irq(int irq) int xen_test_irq_shared(int irq) { struct irq_info *info = info_for_irq(irq); - struct physdev_irq_status_query irq_status = { .irq = info->u.pirq.pirq }; + struct physdev_irq_status_query irq_status; + + if (WARN_ON(!info)) + return -ENOENT; + + irq_status.irq = info->u.pirq.pirq; if (HYPERVISOR_physdev_op(PHYSDEVOP_irq_status_query, &irq_status)) return 0; -- 1.8.1.4
Konrad Rzeszutek Wilk
2013-Apr-16 20:09 UTC
[PATCH 5/9] xen/time: Add default value of -1 for IRQ and check for that.
If the timer interrupt has been de-init or is just now being initialized, the default value of -1 should be preset as interrupt line. Check for that and if something is odd WARN us. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- arch/x86/xen/time.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c index 054cc01..3d88bfd 100644 --- a/arch/x86/xen/time.c +++ b/arch/x86/xen/time.c @@ -377,7 +377,7 @@ static const struct clock_event_device xen_vcpuop_clockevent = { static const struct clock_event_device *xen_clockevent &xen_timerop_clockevent; -static DEFINE_PER_CPU(struct clock_event_device, xen_clock_events); +static DEFINE_PER_CPU(struct clock_event_device, xen_clock_events) = { .irq = -1 }; static irqreturn_t xen_timer_interrupt(int irq, void *dev_id) { @@ -401,6 +401,9 @@ void xen_setup_timer(int cpu) struct clock_event_device *evt; int irq; + evt = &per_cpu(xen_clock_events, cpu); + WARN(evt->irq >= 0, "IRQ%d for CPU%d is already allocated\n", evt->irq, cpu); + printk(KERN_INFO "installing Xen timer for CPU %d\n", cpu); name = kasprintf(GFP_KERNEL, "timer%d", cpu); @@ -413,7 +416,6 @@ void xen_setup_timer(int cpu) IRQF_FORCE_RESUME, name, NULL); - evt = &per_cpu(xen_clock_events, cpu); memcpy(evt, xen_clockevent, sizeof(*evt)); evt->cpumask = cpumask_of(cpu); @@ -426,6 +428,7 @@ void xen_teardown_timer(int cpu) BUG_ON(cpu == 0); evt = &per_cpu(xen_clock_events, cpu); unbind_from_irqhandler(evt->irq, NULL); + evt->irq = -1; } void xen_setup_cpu_clockevents(void) -- 1.8.1.4
Konrad Rzeszutek Wilk
2013-Apr-16 20:09 UTC
[PATCH 6/9] xen/spinlock: Check against default value of -1 for IRQ line.
The default (uninitialized) value of the IRQ line is -1. Check if we already have allocated an spinlock interrupt line and if somebody is trying to do it again. Also set it to -1 when we offline the CPU. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- arch/x86/xen/spinlock.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/arch/x86/xen/spinlock.c b/arch/x86/xen/spinlock.c index f7a080e..47ae032 100644 --- a/arch/x86/xen/spinlock.c +++ b/arch/x86/xen/spinlock.c @@ -364,6 +364,9 @@ void __cpuinit xen_init_lock_cpu(int cpu) int irq; const char *name; + WARN(per_cpu(lock_kicker_irq, cpu) > 0, "spinlock on CPU%d exists on IRQ%d!\n", + cpu, per_cpu(lock_kicker_irq, cpu)); + name = kasprintf(GFP_KERNEL, "spinlock%d", cpu); irq = bind_ipi_to_irqhandler(XEN_SPIN_UNLOCK_VECTOR, cpu, @@ -383,6 +386,7 @@ void __cpuinit xen_init_lock_cpu(int cpu) void xen_uninit_lock_cpu(int cpu) { unbind_from_irqhandler(per_cpu(lock_kicker_irq, cpu), NULL); + per_cpu(lock_kicker_irq, cpu) = -1; } void __init xen_init_spinlocks(void) -- 1.8.1.4
Konrad Rzeszutek Wilk
2013-Apr-16 20:09 UTC
[PATCH 7/9] xen/spinlock: Disable IRQ spinlock (PV) allocation on PVHVM
See git commit f10cd522c5fbfec9ae3cc01967868c9c2401ed23 (xen: disable PV spinlocks on HVM) for details. But we did not disable it everywhere - which means that when we boot as PVHVM we end up allocating per-CPU irq line for spinlock. This fixes that. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- arch/x86/xen/spinlock.c | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/arch/x86/xen/spinlock.c b/arch/x86/xen/spinlock.c index 47ae032..8b54603 100644 --- a/arch/x86/xen/spinlock.c +++ b/arch/x86/xen/spinlock.c @@ -367,6 +367,13 @@ void __cpuinit xen_init_lock_cpu(int cpu) WARN(per_cpu(lock_kicker_irq, cpu) > 0, "spinlock on CPU%d exists on IRQ%d!\n", cpu, per_cpu(lock_kicker_irq, cpu)); + /* + * See git commit f10cd522c5fbfec9ae3cc01967868c9c2401ed23 + * (xen: disable PV spinlocks on HVM) + */ + if (xen_hvm_domain()) + return; + name = kasprintf(GFP_KERNEL, "spinlock%d", cpu); irq = bind_ipi_to_irqhandler(XEN_SPIN_UNLOCK_VECTOR, cpu, @@ -385,12 +392,26 @@ void __cpuinit xen_init_lock_cpu(int cpu) void xen_uninit_lock_cpu(int cpu) { + /* + * See git commit f10cd522c5fbfec9ae3cc01967868c9c2401ed23 + * (xen: disable PV spinlocks on HVM) + */ + if (xen_hvm_domain()) + return; + unbind_from_irqhandler(per_cpu(lock_kicker_irq, cpu), NULL); per_cpu(lock_kicker_irq, cpu) = -1; } void __init xen_init_spinlocks(void) { + /* + * See git commit f10cd522c5fbfec9ae3cc01967868c9c2401ed23 + * (xen: disable PV spinlocks on HVM) + */ + if (xen_hvm_domain()) + return; + BUILD_BUG_ON(sizeof(struct xen_spinlock) > sizeof(arch_spinlock_t)); pv_lock_ops.spin_is_locked = xen_spin_is_locked; -- 1.8.1.4
Konrad Rzeszutek Wilk
2013-Apr-16 20:09 UTC
[PATCH 8/9] xen/smp/pvhvm: Don''t initialize IRQ_WORKER as we are using the native one.
There is no need to use the PV version of the IRQ_WORKER mechanism as under PVHVM we are using the native version. The native version is using the SMP API. They just sit around unused: 69: 0 0 xen-percpu-ipi irqwork0 83: 0 0 xen-percpu-ipi irqwork1 Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- arch/x86/xen/smp.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c index 22c800a..415694c 100644 --- a/arch/x86/xen/smp.c +++ b/arch/x86/xen/smp.c @@ -144,6 +144,13 @@ static int xen_smp_intr_init(unsigned int cpu) goto fail; per_cpu(xen_callfuncsingle_irq, cpu) = rc; + /* + * The IRQ worker on PVHVM goes through the native path and uses the + * IPI mechanism. + */ + if (xen_hvm_domain()) + return 0; + callfunc_name = kasprintf(GFP_KERNEL, "irqwork%d", cpu); rc = bind_ipi_to_irqhandler(XEN_IRQ_WORK_VECTOR, cpu, @@ -167,6 +174,9 @@ static int xen_smp_intr_init(unsigned int cpu) if (per_cpu(xen_callfuncsingle_irq, cpu) >= 0) unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu), NULL); + if (xen_hvm_domain()) + return rc; + if (per_cpu(xen_irq_work, cpu) >= 0) unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL); @@ -661,7 +671,8 @@ static void xen_hvm_cpu_die(unsigned int cpu) unbind_from_irqhandler(per_cpu(xen_callfunc_irq, cpu), NULL); unbind_from_irqhandler(per_cpu(xen_debug_irq, cpu), NULL); unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu), NULL); - unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL); + if (!xen_hvm_domain()) + unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL); xen_uninit_lock_cpu(cpu); xen_teardown_timer(cpu); native_cpu_die(cpu); -- 1.8.1.4
Konrad Rzeszutek Wilk
2013-Apr-16 20:09 UTC
[PATCH 9/9] xen/smp: Unifiy some of the PVs and PVHVM offline CPU path
The "xen_cpu_die" and "xen_hvm_cpu_die" are very similar. Lets coalesce them. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- arch/x86/xen/smp.c | 14 ++++---------- 1 file changed, 4 insertions(+), 10 deletions(-) diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c index 415694c..0d466d7 100644 --- a/arch/x86/xen/smp.c +++ b/arch/x86/xen/smp.c @@ -428,7 +428,7 @@ static int xen_cpu_disable(void) static void xen_cpu_die(unsigned int cpu) { - while (HYPERVISOR_vcpu_op(VCPUOP_is_up, cpu, NULL)) { + while (xen_pv_domain() && HYPERVISOR_vcpu_op(VCPUOP_is_up, cpu, NULL)) { current->state = TASK_UNINTERRUPTIBLE; schedule_timeout(HZ/10); } @@ -436,7 +436,8 @@ static void xen_cpu_die(unsigned int cpu) unbind_from_irqhandler(per_cpu(xen_callfunc_irq, cpu), NULL); unbind_from_irqhandler(per_cpu(xen_debug_irq, cpu), NULL); unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu), NULL); - unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL); + if (!xen_hvm_domain()) + unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL); xen_uninit_lock_cpu(cpu); xen_teardown_timer(cpu); } @@ -667,14 +668,7 @@ static int __cpuinit xen_hvm_cpu_up(unsigned int cpu, struct task_struct *tidle) static void xen_hvm_cpu_die(unsigned int cpu) { - unbind_from_irqhandler(per_cpu(xen_resched_irq, cpu), NULL); - unbind_from_irqhandler(per_cpu(xen_callfunc_irq, cpu), NULL); - unbind_from_irqhandler(per_cpu(xen_debug_irq, cpu), NULL); - unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu), NULL); - if (!xen_hvm_domain()) - unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL); - xen_uninit_lock_cpu(cpu); - xen_teardown_timer(cpu); + xen_cpu_die(cpu); native_cpu_die(cpu); } -- 1.8.1.4
Stefano Stabellini
2013-Apr-26 16:06 UTC
Re: [PATCH 1/9] xen/smp: Fix leakage of timer interrupt line for every CPU online/offline.
On Tue, 16 Apr 2013, Konrad Rzeszutek Wilk wrote:> In the PVHVM path when we do CPU online/offline path we would > leak the timer%d IRQ line everytime we do a offline event. The > online path (xen_hvm_setup_cpu_clockevents via > x86_cpuinit.setup_percpu_clockev) would allocate a new interrupt > line for the timer%d. > > But we would still use the old interrupt line leading to: > > kernel BUG at /home/konrad/ssd/konrad/linux/kernel/hrtimer.c:1261! > invalid opcode: 0000 [#1] SMP > RIP: 0010:[<ffffffff810b9e21>] [<ffffffff810b9e21>] hrtimer_interrupt+0x261/0x270 > .. snip.. > <IRQ> > [<ffffffff810445ef>] xen_timer_interrupt+0x2f/0x1b0 > [<ffffffff81104825>] ? stop_machine_cpu_stop+0xb5/0xf0 > [<ffffffff8111434c>] handle_irq_event_percpu+0x7c/0x240 > [<ffffffff811175b9>] handle_percpu_irq+0x49/0x70 > [<ffffffff813a74a3>] __xen_evtchn_do_upcall+0x1c3/0x2f0 > [<ffffffff813a760a>] xen_evtchn_do_upcall+0x2a/0x40 > [<ffffffff8167c26d>] xen_hvm_callback_vector+0x6d/0x80 > <EOI> > [<ffffffff81666d01>] ? start_secondary+0x193/0x1a8 > [<ffffffff81666cfd>] ? start_secondary+0x18f/0x1a8 > > There is also the oddity (timer1) in the /proc/interrupts after > offlining CPU1: > > 64: 1121 0 xen-percpu-virq timer0 > 78: 0 0 xen-percpu-virq timer1 > 84: 0 2483 xen-percpu-virq timer2 > > This patch fixes it. > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > CC: stable@vger.kernel.orgAcked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>> arch/x86/xen/smp.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c > index 09ea61d..f80e69c 100644 > --- a/arch/x86/xen/smp.c > +++ b/arch/x86/xen/smp.c > @@ -662,6 +662,7 @@ static void xen_hvm_cpu_die(unsigned int cpu) > unbind_from_irqhandler(per_cpu(xen_debug_irq, cpu), NULL); > unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu), NULL); > unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL); > + xen_teardown_timer(cpu); > native_cpu_die(cpu); > } > > -- > 1.8.1.4 >
Stefano Stabellini
2013-Apr-26 16:06 UTC
Re: [PATCH 2/9] xen/smp/spinlock: Fix leakage of the spinlock interrupt line for every CPU online/offline
On Tue, 16 Apr 2013, Konrad Rzeszutek Wilk wrote:> While we don''t use the spinlock interrupt line (see for details > commit f10cd522c5fbfec9ae3cc01967868c9c2401ed23 - > xen: disable PV spinlocks on HVM) - we should still do the proper > init / deinit sequence. We did not do that correctly and for the > CPU init for PVHVM guest we would allocate an interrupt line - but > failed to deallocate the old interrupt line. > > This resulted in leakage of an irq_desc but more importantly this splat > as we online an offlined CPU: > > genirq: Flags mismatch irq 71. 0002cc20 (spinlock1) vs. 0002cc20 (spinlock1) > Pid: 2542, comm: init.late Not tainted 3.9.0-rc6upstream #1 > Call Trace: > [<ffffffff811156de>] __setup_irq+0x23e/0x4a0 > [<ffffffff81194191>] ? kmem_cache_alloc_trace+0x221/0x250 > [<ffffffff811161bb>] request_threaded_irq+0xfb/0x160 > [<ffffffff8104c6f0>] ? xen_spin_trylock+0x20/0x20 > [<ffffffff813a8423>] bind_ipi_to_irqhandler+0xa3/0x160 > [<ffffffff81303758>] ? kasprintf+0x38/0x40 > [<ffffffff8104c6f0>] ? xen_spin_trylock+0x20/0x20 > [<ffffffff810cad35>] ? update_max_interval+0x15/0x40 > [<ffffffff816605db>] xen_init_lock_cpu+0x3c/0x78 > [<ffffffff81660029>] xen_hvm_cpu_notify+0x29/0x33 > [<ffffffff81676bdd>] notifier_call_chain+0x4d/0x70 > [<ffffffff810bb2a9>] __raw_notifier_call_chain+0x9/0x10 > [<ffffffff8109402b>] __cpu_notify+0x1b/0x30 > [<ffffffff8166834a>] _cpu_up+0xa0/0x14b > [<ffffffff816684ce>] cpu_up+0xd9/0xec > [<ffffffff8165f754>] store_online+0x94/0xd0 > [<ffffffff8141d15b>] dev_attr_store+0x1b/0x20 > [<ffffffff81218f44>] sysfs_write_file+0xf4/0x170 > [<ffffffff811a2864>] vfs_write+0xb4/0x130 > [<ffffffff811a302a>] sys_write+0x5a/0xa0 > [<ffffffff8167ada9>] system_call_fastpath+0x16/0x1b > cpu 1 spinlock event irq -16 > smpboot: Booting Node 0 Processor 1 APIC 0x2 > > And if one looks at the /proc/interrupts right after > offlining (CPU1): > > 70: 0 0 xen-percpu-ipi spinlock0 > 71: 0 0 xen-percpu-ipi spinlock1 > 77: 0 0 xen-percpu-ipi spinlock2 > > There is the oddity of the ''spinlock1'' still being present. > > CC: stable@vger.kernel.org > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>> arch/x86/xen/smp.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c > index f80e69c..22c800a 100644 > --- a/arch/x86/xen/smp.c > +++ b/arch/x86/xen/smp.c > @@ -662,6 +662,7 @@ static void xen_hvm_cpu_die(unsigned int cpu) > unbind_from_irqhandler(per_cpu(xen_debug_irq, cpu), NULL); > unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu), NULL); > unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL); > + xen_uninit_lock_cpu(cpu); > xen_teardown_timer(cpu); > native_cpu_die(cpu); > } > -- > 1.8.1.4 >
Stefano Stabellini
2013-Apr-26 16:11 UTC
Re: [PATCH 3/9] xen/time: Fix kasprintf splat when allocating timer%d IRQ line.
On Tue, 16 Apr 2013, Konrad Rzeszutek Wilk wrote:> When we online the CPU, we get this splat: > > smpboot: Booting Node 0 Processor 1 APIC 0x2 > installing Xen timer for CPU 1 > BUG: sleeping function called from invalid context at /home/konrad/ssd/konrad/linux/mm/slab.c:3179 > in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/1 > Pid: 0, comm: swapper/1 Not tainted 3.9.0-rc6upstream-00001-g3884fad #1 > Call Trace: > [<ffffffff810c1fea>] __might_sleep+0xda/0x100 > [<ffffffff81194617>] __kmalloc_track_caller+0x1e7/0x2c0 > [<ffffffff81303758>] ? kasprintf+0x38/0x40 > [<ffffffff813036eb>] kvasprintf+0x5b/0x90 > [<ffffffff81303758>] kasprintf+0x38/0x40 > [<ffffffff81044510>] xen_setup_timer+0x30/0xb0 > [<ffffffff810445af>] xen_hvm_setup_cpu_clockevents+0x1f/0x30 > [<ffffffff81666d0a>] start_secondary+0x19c/0x1a8 > > The solution to that is use kasprintf in the CPU hotplug path > that ''online''s the CPU. That is, do it in in xen_hvm_cpu_notify, > and remove the call to in xen_hvm_setup_cpu_clockevents. > > Unfortunatly the later is not a good idea as the bootup path > does not use xen_hvm_cpu_notify so we would end up never allocating > timer%d interrupt lines when booting. As such add the check for > atomic() to continue.This last is not reflected in the code. Also, is it actually OK to move xen_setup_timer out of xen_hvm_setup_cpu_clockevents? xen_setup_cpu_clockevents registers xen_clock_events as clocksource and xen_clock_events is setup by xen_setup_timer so we need to make sure that the call order remains the same.> CC: stable@vger.kernel.org > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > --- > arch/x86/xen/enlighten.c | 5 ++++- > arch/x86/xen/time.c | 6 +++++- > 2 files changed, 9 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c > index 47d3243..ddbd54a 100644 > --- a/arch/x86/xen/enlighten.c > +++ b/arch/x86/xen/enlighten.c > @@ -1641,8 +1641,11 @@ static int __cpuinit xen_hvm_cpu_notify(struct notifier_block *self, > switch (action) { > case CPU_UP_PREPARE: > xen_vcpu_setup(cpu); > - if (xen_have_vector_callback) > + if (xen_have_vector_callback) { > xen_init_lock_cpu(cpu); > + if (xen_feature(XENFEAT_hvm_safe_pvclock)) > + xen_setup_timer(cpu); > + } > break; > default: > break; > diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c > index 0296a95..054cc01 100644 > --- a/arch/x86/xen/time.c > +++ b/arch/x86/xen/time.c > @@ -497,7 +497,11 @@ static void xen_hvm_setup_cpu_clockevents(void) > { > int cpu = smp_processor_id(); > xen_setup_runstate_info(cpu); > - xen_setup_timer(cpu); > + /* > + * xen_setup_timer(cpu) - snprintf is bad in atomic context. Hence > + * doing it xen_hvm_cpu_notify (which gets called by smp_init during > + * early bootup and also during CPU hotplug events). > + */ > xen_setup_cpu_clockevents(); > } > > -- > 1.8.1.4 >
Stefano Stabellini
2013-Apr-26 16:12 UTC
Re: [PATCH 4/9] xen/events: Check that IRQ value passed in is valid.
On Tue, 16 Apr 2013, Konrad Rzeszutek Wilk wrote:> We naively assume that the IRQ value passed in is correct. > If it is not, then any dereference operation for the ''info'' > structure will result in crash - so might as well guard ourselves > and sprinkle copious amounts of WARN_ON. > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>> drivers/xen/events.c | 20 +++++++++++++++++++- > 1 file changed, 19 insertions(+), 1 deletion(-) > > diff --git a/drivers/xen/events.c b/drivers/xen/events.c > index bb65f75..94daed1 100644 > --- a/drivers/xen/events.c > +++ b/drivers/xen/events.c > @@ -515,6 +515,9 @@ static void xen_free_irq(unsigned irq) > { > struct irq_info *info = irq_get_handler_data(irq); > > + if (WARN_ON(!info)) > + return; > + > list_del(&info->list); > > irq_set_handler_data(irq, NULL); > @@ -1003,6 +1006,9 @@ static void unbind_from_irq(unsigned int irq) > int evtchn = evtchn_from_irq(irq); > struct irq_info *info = irq_get_handler_data(irq); > > + if (WARN_ON(!info)) > + return; > + > mutex_lock(&irq_mapping_update_lock); > > if (info->refcnt > 0) { > @@ -1130,6 +1136,10 @@ int bind_ipi_to_irqhandler(enum ipi_vector ipi, > > void unbind_from_irqhandler(unsigned int irq, void *dev_id) > { > + struct irq_info *info = irq_get_handler_data(irq); > + > + if (WARN_ON(!info)) > + return; > free_irq(irq, dev_id); > unbind_from_irq(irq); > } > @@ -1441,6 +1451,9 @@ void rebind_evtchn_irq(int evtchn, int irq) > { > struct irq_info *info = info_for_irq(irq); > > + if (WARN_ON(!info)) > + return; > + > /* Make sure the irq is masked, since the new event channel > will also be masked. */ > disable_irq(irq); > @@ -1714,7 +1727,12 @@ void xen_poll_irq(int irq) > int xen_test_irq_shared(int irq) > { > struct irq_info *info = info_for_irq(irq); > - struct physdev_irq_status_query irq_status = { .irq = info->u.pirq.pirq }; > + struct physdev_irq_status_query irq_status; > + > + if (WARN_ON(!info)) > + return -ENOENT; > + > + irq_status.irq = info->u.pirq.pirq; > > if (HYPERVISOR_physdev_op(PHYSDEVOP_irq_status_query, &irq_status)) > return 0; > -- > 1.8.1.4 >
Stefano Stabellini
2013-Apr-26 16:15 UTC
Re: [PATCH 5/9] xen/time: Add default value of -1 for IRQ and check for that.
On Tue, 16 Apr 2013, Konrad Rzeszutek Wilk wrote:> If the timer interrupt has been de-init or is just now being > initialized, the default value of -1 should be preset as > interrupt line. Check for that and if something is odd > WARN us. > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>> arch/x86/xen/time.c | 7 +++++-- > 1 file changed, 5 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c > index 054cc01..3d88bfd 100644 > --- a/arch/x86/xen/time.c > +++ b/arch/x86/xen/time.c > @@ -377,7 +377,7 @@ static const struct clock_event_device xen_vcpuop_clockevent = { > > static const struct clock_event_device *xen_clockevent > &xen_timerop_clockevent; > -static DEFINE_PER_CPU(struct clock_event_device, xen_clock_events); > +static DEFINE_PER_CPU(struct clock_event_device, xen_clock_events) = { .irq = -1 }; > > static irqreturn_t xen_timer_interrupt(int irq, void *dev_id) > { > @@ -401,6 +401,9 @@ void xen_setup_timer(int cpu) > struct clock_event_device *evt; > int irq; > > + evt = &per_cpu(xen_clock_events, cpu); > + WARN(evt->irq >= 0, "IRQ%d for CPU%d is already allocated\n", evt->irq, cpu); > + > printk(KERN_INFO "installing Xen timer for CPU %d\n", cpu); > > name = kasprintf(GFP_KERNEL, "timer%d", cpu); > @@ -413,7 +416,6 @@ void xen_setup_timer(int cpu) > IRQF_FORCE_RESUME, > name, NULL); > > - evt = &per_cpu(xen_clock_events, cpu); > memcpy(evt, xen_clockevent, sizeof(*evt)); > > evt->cpumask = cpumask_of(cpu); > @@ -426,6 +428,7 @@ void xen_teardown_timer(int cpu) > BUG_ON(cpu == 0); > evt = &per_cpu(xen_clock_events, cpu); > unbind_from_irqhandler(evt->irq, NULL); > + evt->irq = -1; > } > > void xen_setup_cpu_clockevents(void) > -- > 1.8.1.4 >
Stefano Stabellini
2013-Apr-26 16:18 UTC
Re: [PATCH 6/9] xen/spinlock: Check against default value of -1 for IRQ line.
On Tue, 16 Apr 2013, Konrad Rzeszutek Wilk wrote:> The default (uninitialized) value of the IRQ line is -1. > Check if we already have allocated an spinlock interrupt line > and if somebody is trying to do it again. Also set it to -1 > when we offline the CPU. > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > --- > arch/x86/xen/spinlock.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/arch/x86/xen/spinlock.c b/arch/x86/xen/spinlock.c > index f7a080e..47ae032 100644 > --- a/arch/x86/xen/spinlock.c > +++ b/arch/x86/xen/spinlock.c > @@ -364,6 +364,9 @@ void __cpuinit xen_init_lock_cpu(int cpu) > int irq; > const char *name; > > + WARN(per_cpu(lock_kicker_irq, cpu) > 0, "spinlock on CPU%d exists on IRQ%d!\n",shouldn''t this be >= ^> + cpu, per_cpu(lock_kicker_irq, cpu)); > > name = kasprintf(GFP_KERNEL, "spinlock%d", cpu); > irq = bind_ipi_to_irqhandler(XEN_SPIN_UNLOCK_VECTOR, > cpu, > @@ -383,6 +386,7 @@ void __cpuinit xen_init_lock_cpu(int cpu) > void xen_uninit_lock_cpu(int cpu) > { > unbind_from_irqhandler(per_cpu(lock_kicker_irq, cpu), NULL); > + per_cpu(lock_kicker_irq, cpu) = -1; > } > > void __init xen_init_spinlocks(void) > -- > 1.8.1.4 >
Stefano Stabellini
2013-Apr-26 16:20 UTC
Re: [PATCH 7/9] xen/spinlock: Disable IRQ spinlock (PV) allocation on PVHVM
On Tue, 16 Apr 2013, Konrad Rzeszutek Wilk wrote:> See git commit f10cd522c5fbfec9ae3cc01967868c9c2401ed23 > (xen: disable PV spinlocks on HVM) for details. > > But we did not disable it everywhere - which means that when > we boot as PVHVM we end up allocating per-CPU irq line for > spinlock. This fixes that. > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>Is there any point in calling xen_init_lock_cpu in PVHVM guests? At that point we might as well remove the call from xen_hvm_cpu_notify.> arch/x86/xen/spinlock.c | 21 +++++++++++++++++++++ > 1 file changed, 21 insertions(+) > > diff --git a/arch/x86/xen/spinlock.c b/arch/x86/xen/spinlock.c > index 47ae032..8b54603 100644 > --- a/arch/x86/xen/spinlock.c > +++ b/arch/x86/xen/spinlock.c > @@ -367,6 +367,13 @@ void __cpuinit xen_init_lock_cpu(int cpu) > WARN(per_cpu(lock_kicker_irq, cpu) > 0, "spinlock on CPU%d exists on IRQ%d!\n", > cpu, per_cpu(lock_kicker_irq, cpu)); > > + /* > + * See git commit f10cd522c5fbfec9ae3cc01967868c9c2401ed23 > + * (xen: disable PV spinlocks on HVM) > + */ > + if (xen_hvm_domain()) > + return; > + > name = kasprintf(GFP_KERNEL, "spinlock%d", cpu); > irq = bind_ipi_to_irqhandler(XEN_SPIN_UNLOCK_VECTOR, > cpu, > @@ -385,12 +392,26 @@ void __cpuinit xen_init_lock_cpu(int cpu) > > void xen_uninit_lock_cpu(int cpu) > { > + /* > + * See git commit f10cd522c5fbfec9ae3cc01967868c9c2401ed23 > + * (xen: disable PV spinlocks on HVM) > + */ > + if (xen_hvm_domain()) > + return; > + > unbind_from_irqhandler(per_cpu(lock_kicker_irq, cpu), NULL); > per_cpu(lock_kicker_irq, cpu) = -1; > } > > void __init xen_init_spinlocks(void) > { > + /* > + * See git commit f10cd522c5fbfec9ae3cc01967868c9c2401ed23 > + * (xen: disable PV spinlocks on HVM) > + */ > + if (xen_hvm_domain()) > + return; > + > BUILD_BUG_ON(sizeof(struct xen_spinlock) > sizeof(arch_spinlock_t)); > > pv_lock_ops.spin_is_locked = xen_spin_is_locked; > -- > 1.8.1.4 >
Stefano Stabellini
2013-Apr-26 16:27 UTC
Re: [PATCH 8/9] xen/smp/pvhvm: Don''t initialize IRQ_WORKER as we are using the native one.
On Tue, 16 Apr 2013, Konrad Rzeszutek Wilk wrote:> There is no need to use the PV version of the IRQ_WORKER mechanism > as under PVHVM we are using the native version. The native > version is using the SMP API. > > They just sit around unused: > > 69: 0 0 xen-percpu-ipi irqwork0 > 83: 0 0 xen-percpu-ipi irqwork1 > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>Might be worth trying to make it work instead? Is it just because we don''t set the apic->send_IPI_* functions to the xen specific version on PVHVM?> arch/x86/xen/smp.c | 13 ++++++++++++- > 1 file changed, 12 insertions(+), 1 deletion(-) > > diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c > index 22c800a..415694c 100644 > --- a/arch/x86/xen/smp.c > +++ b/arch/x86/xen/smp.c > @@ -144,6 +144,13 @@ static int xen_smp_intr_init(unsigned int cpu) > goto fail; > per_cpu(xen_callfuncsingle_irq, cpu) = rc; > > + /* > + * The IRQ worker on PVHVM goes through the native path and uses the > + * IPI mechanism. > + */ > + if (xen_hvm_domain()) > + return 0; > + > callfunc_name = kasprintf(GFP_KERNEL, "irqwork%d", cpu); > rc = bind_ipi_to_irqhandler(XEN_IRQ_WORK_VECTOR, > cpu, > @@ -167,6 +174,9 @@ static int xen_smp_intr_init(unsigned int cpu) > if (per_cpu(xen_callfuncsingle_irq, cpu) >= 0) > unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu), > NULL); > + if (xen_hvm_domain()) > + return rc; > + > if (per_cpu(xen_irq_work, cpu) >= 0) > unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL); > > @@ -661,7 +671,8 @@ static void xen_hvm_cpu_die(unsigned int cpu) > unbind_from_irqhandler(per_cpu(xen_callfunc_irq, cpu), NULL); > unbind_from_irqhandler(per_cpu(xen_debug_irq, cpu), NULL); > unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu), NULL); > - unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL); > + if (!xen_hvm_domain()) > + unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL); > xen_uninit_lock_cpu(cpu); > xen_teardown_timer(cpu); > native_cpu_die(cpu); > -- > 1.8.1.4 >
Konrad Rzeszutek Wilk
2013-Apr-29 18:34 UTC
Re: [PATCH 8/9] xen/smp/pvhvm: Don''t initialize IRQ_WORKER as we are using the native one.
On Fri, Apr 26, 2013 at 05:27:20PM +0100, Stefano Stabellini wrote:> On Tue, 16 Apr 2013, Konrad Rzeszutek Wilk wrote: > > There is no need to use the PV version of the IRQ_WORKER mechanism > > as under PVHVM we are using the native version. The native > > version is using the SMP API. > > > > They just sit around unused: > > > > 69: 0 0 xen-percpu-ipi irqwork0 > > 83: 0 0 xen-percpu-ipi irqwork1 > > > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > Might be worth trying to make it work instead? > Is it just because we don''t set the apic->send_IPI_* functions to the > xen specific version on PVHVM? >Right. We use the baremetal mechanism to do it. And it works fine.> > > arch/x86/xen/smp.c | 13 ++++++++++++- > > 1 file changed, 12 insertions(+), 1 deletion(-) > > > > diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c > > index 22c800a..415694c 100644 > > --- a/arch/x86/xen/smp.c > > +++ b/arch/x86/xen/smp.c > > @@ -144,6 +144,13 @@ static int xen_smp_intr_init(unsigned int cpu) > > goto fail; > > per_cpu(xen_callfuncsingle_irq, cpu) = rc; > > > > + /* > > + * The IRQ worker on PVHVM goes through the native path and uses the > > + * IPI mechanism. > > + */ > > + if (xen_hvm_domain()) > > + return 0; > > + > > callfunc_name = kasprintf(GFP_KERNEL, "irqwork%d", cpu); > > rc = bind_ipi_to_irqhandler(XEN_IRQ_WORK_VECTOR, > > cpu, > > @@ -167,6 +174,9 @@ static int xen_smp_intr_init(unsigned int cpu) > > if (per_cpu(xen_callfuncsingle_irq, cpu) >= 0) > > unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu), > > NULL); > > + if (xen_hvm_domain()) > > + return rc; > > + > > if (per_cpu(xen_irq_work, cpu) >= 0) > > unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL); > > > > @@ -661,7 +671,8 @@ static void xen_hvm_cpu_die(unsigned int cpu) > > unbind_from_irqhandler(per_cpu(xen_callfunc_irq, cpu), NULL); > > unbind_from_irqhandler(per_cpu(xen_debug_irq, cpu), NULL); > > unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu), NULL); > > - unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL); > > + if (!xen_hvm_domain()) > > + unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL); > > xen_uninit_lock_cpu(cpu); > > xen_teardown_timer(cpu); > > native_cpu_die(cpu); > > -- > > 1.8.1.4 > >
Konrad Rzeszutek Wilk
2013-Apr-29 18:34 UTC
Re: [PATCH 7/9] xen/spinlock: Disable IRQ spinlock (PV) allocation on PVHVM
On Fri, Apr 26, 2013 at 05:20:23PM +0100, Stefano Stabellini wrote:> On Tue, 16 Apr 2013, Konrad Rzeszutek Wilk wrote: > > See git commit f10cd522c5fbfec9ae3cc01967868c9c2401ed23 > > (xen: disable PV spinlocks on HVM) for details. > > > > But we did not disable it everywhere - which means that when > > we boot as PVHVM we end up allocating per-CPU irq line for > > spinlock. This fixes that. > > > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > Is there any point in calling xen_init_lock_cpu in PVHVM guests?I was thinking about it.. but I still have hope that I will be able to take Jeremy''s patches for paravirt locking and redo them a bit.> At that point we might as well remove the call from xen_hvm_cpu_notify. > > > > arch/x86/xen/spinlock.c | 21 +++++++++++++++++++++ > > 1 file changed, 21 insertions(+) > > > > diff --git a/arch/x86/xen/spinlock.c b/arch/x86/xen/spinlock.c > > index 47ae032..8b54603 100644 > > --- a/arch/x86/xen/spinlock.c > > +++ b/arch/x86/xen/spinlock.c > > @@ -367,6 +367,13 @@ void __cpuinit xen_init_lock_cpu(int cpu) > > WARN(per_cpu(lock_kicker_irq, cpu) > 0, "spinlock on CPU%d exists on IRQ%d!\n", > > cpu, per_cpu(lock_kicker_irq, cpu)); > > > > + /* > > + * See git commit f10cd522c5fbfec9ae3cc01967868c9c2401ed23 > > + * (xen: disable PV spinlocks on HVM) > > + */ > > + if (xen_hvm_domain()) > > + return; > > + > > name = kasprintf(GFP_KERNEL, "spinlock%d", cpu); > > irq = bind_ipi_to_irqhandler(XEN_SPIN_UNLOCK_VECTOR, > > cpu, > > @@ -385,12 +392,26 @@ void __cpuinit xen_init_lock_cpu(int cpu) > > > > void xen_uninit_lock_cpu(int cpu) > > { > > + /* > > + * See git commit f10cd522c5fbfec9ae3cc01967868c9c2401ed23 > > + * (xen: disable PV spinlocks on HVM) > > + */ > > + if (xen_hvm_domain()) > > + return; > > + > > unbind_from_irqhandler(per_cpu(lock_kicker_irq, cpu), NULL); > > per_cpu(lock_kicker_irq, cpu) = -1; > > } > > > > void __init xen_init_spinlocks(void) > > { > > + /* > > + * See git commit f10cd522c5fbfec9ae3cc01967868c9c2401ed23 > > + * (xen: disable PV spinlocks on HVM) > > + */ > > + if (xen_hvm_domain()) > > + return; > > + > > BUILD_BUG_ON(sizeof(struct xen_spinlock) > sizeof(arch_spinlock_t)); > > > > pv_lock_ops.spin_is_locked = xen_spin_is_locked; > > -- > > 1.8.1.4 > >
Konrad Rzeszutek Wilk
2013-Apr-29 18:35 UTC
Re: [PATCH 6/9] xen/spinlock: Check against default value of -1 for IRQ line.
On Fri, Apr 26, 2013 at 05:18:01PM +0100, Stefano Stabellini wrote:> On Tue, 16 Apr 2013, Konrad Rzeszutek Wilk wrote: > > The default (uninitialized) value of the IRQ line is -1. > > Check if we already have allocated an spinlock interrupt line > > and if somebody is trying to do it again. Also set it to -1 > > when we offline the CPU. > > > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > --- > > arch/x86/xen/spinlock.c | 4 ++++ > > 1 file changed, 4 insertions(+) > > > > diff --git a/arch/x86/xen/spinlock.c b/arch/x86/xen/spinlock.c > > index f7a080e..47ae032 100644 > > --- a/arch/x86/xen/spinlock.c > > +++ b/arch/x86/xen/spinlock.c > > @@ -364,6 +364,9 @@ void __cpuinit xen_init_lock_cpu(int cpu) > > int irq; > > const char *name; > > > > + WARN(per_cpu(lock_kicker_irq, cpu) > 0, "spinlock on CPU%d exists on IRQ%d!\n", > shouldn''t this be >= ^ >Yes. Thanks for catching.> > > + cpu, per_cpu(lock_kicker_irq, cpu)); > > > > name = kasprintf(GFP_KERNEL, "spinlock%d", cpu); > > irq = bind_ipi_to_irqhandler(XEN_SPIN_UNLOCK_VECTOR, > > cpu, > > @@ -383,6 +386,7 @@ void __cpuinit xen_init_lock_cpu(int cpu) > > void xen_uninit_lock_cpu(int cpu) > > { > > unbind_from_irqhandler(per_cpu(lock_kicker_irq, cpu), NULL); > > + per_cpu(lock_kicker_irq, cpu) = -1; > > } > > > > void __init xen_init_spinlocks(void) > > -- > > 1.8.1.4 > >
Konrad Rzeszutek Wilk
2013-Apr-29 18:36 UTC
Re: [PATCH 3/9] xen/time: Fix kasprintf splat when allocating timer%d IRQ line.
On Fri, Apr 26, 2013 at 05:11:35PM +0100, Stefano Stabellini wrote:> On Tue, 16 Apr 2013, Konrad Rzeszutek Wilk wrote: > > When we online the CPU, we get this splat: > > > > smpboot: Booting Node 0 Processor 1 APIC 0x2 > > installing Xen timer for CPU 1 > > BUG: sleeping function called from invalid context at /home/konrad/ssd/konrad/linux/mm/slab.c:3179 > > in_atomic(): 1, irqs_disabled(): 0, pid: 0, name: swapper/1 > > Pid: 0, comm: swapper/1 Not tainted 3.9.0-rc6upstream-00001-g3884fad #1 > > Call Trace: > > [<ffffffff810c1fea>] __might_sleep+0xda/0x100 > > [<ffffffff81194617>] __kmalloc_track_caller+0x1e7/0x2c0 > > [<ffffffff81303758>] ? kasprintf+0x38/0x40 > > [<ffffffff813036eb>] kvasprintf+0x5b/0x90 > > [<ffffffff81303758>] kasprintf+0x38/0x40 > > [<ffffffff81044510>] xen_setup_timer+0x30/0xb0 > > [<ffffffff810445af>] xen_hvm_setup_cpu_clockevents+0x1f/0x30 > > [<ffffffff81666d0a>] start_secondary+0x19c/0x1a8 > > > > The solution to that is use kasprintf in the CPU hotplug path > > that ''online''s the CPU. That is, do it in in xen_hvm_cpu_notify, > > and remove the call to in xen_hvm_setup_cpu_clockevents. > > > > Unfortunatly the later is not a good idea as the bootup path > > does not use xen_hvm_cpu_notify so we would end up never allocating > > timer%d interrupt lines when booting. As such add the check for > > atomic() to continue. > > This last is not reflected in the code.I found out that it was not needed.> > Also, is it actually OK to move xen_setup_timer out of > xen_hvm_setup_cpu_clockevents?Yes. It ends up being called earlier - in the notifier.> > xen_setup_cpu_clockevents registers xen_clock_events as clocksource and > xen_clock_events is setup by xen_setup_timer so we need to make sure > that the call order remains the same.The order is still the same.> > > > CC: stable@vger.kernel.org > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > --- > > arch/x86/xen/enlighten.c | 5 ++++- > > arch/x86/xen/time.c | 6 +++++- > > 2 files changed, 9 insertions(+), 2 deletions(-) > > > > diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c > > index 47d3243..ddbd54a 100644 > > --- a/arch/x86/xen/enlighten.c > > +++ b/arch/x86/xen/enlighten.c > > @@ -1641,8 +1641,11 @@ static int __cpuinit xen_hvm_cpu_notify(struct notifier_block *self, > > switch (action) { > > case CPU_UP_PREPARE: > > xen_vcpu_setup(cpu); > > - if (xen_have_vector_callback) > > + if (xen_have_vector_callback) { > > xen_init_lock_cpu(cpu); > > + if (xen_feature(XENFEAT_hvm_safe_pvclock)) > > + xen_setup_timer(cpu); > > + } > > break; > > default: > > break; > > diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c > > index 0296a95..054cc01 100644 > > --- a/arch/x86/xen/time.c > > +++ b/arch/x86/xen/time.c > > @@ -497,7 +497,11 @@ static void xen_hvm_setup_cpu_clockevents(void) > > { > > int cpu = smp_processor_id(); > > xen_setup_runstate_info(cpu); > > - xen_setup_timer(cpu); > > + /* > > + * xen_setup_timer(cpu) - snprintf is bad in atomic context. Hence > > + * doing it xen_hvm_cpu_notify (which gets called by smp_init during > > + * early bootup and also during CPU hotplug events). > > + */ > > xen_setup_cpu_clockevents(); > > } > > > > -- > > 1.8.1.4 > >
Stefano Stabellini
2013-May-01 13:25 UTC
Re: [PATCH 8/9] xen/smp/pvhvm: Don''t initialize IRQ_WORKER as we are using the native one.
On Mon, 29 Apr 2013, Konrad Rzeszutek Wilk wrote:> On Fri, Apr 26, 2013 at 05:27:20PM +0100, Stefano Stabellini wrote: > > On Tue, 16 Apr 2013, Konrad Rzeszutek Wilk wrote: > > > There is no need to use the PV version of the IRQ_WORKER mechanism > > > as under PVHVM we are using the native version. The native > > > version is using the SMP API. > > > > > > They just sit around unused: > > > > > > 69: 0 0 xen-percpu-ipi irqwork0 > > > 83: 0 0 xen-percpu-ipi irqwork1 > > > > > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > > > Might be worth trying to make it work instead? > > Is it just because we don''t set the apic->send_IPI_* functions to the > > xen specific version on PVHVM? > > > > Right. We use the baremetal mechanism to do it. And it works fine.OK, it works fine, but won''t it generate many mores trap and emulate cycles?> > > arch/x86/xen/smp.c | 13 ++++++++++++- > > > 1 file changed, 12 insertions(+), 1 deletion(-) > > > > > > diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c > > > index 22c800a..415694c 100644 > > > --- a/arch/x86/xen/smp.c > > > +++ b/arch/x86/xen/smp.c > > > @@ -144,6 +144,13 @@ static int xen_smp_intr_init(unsigned int cpu) > > > goto fail; > > > per_cpu(xen_callfuncsingle_irq, cpu) = rc; > > > > > > + /* > > > + * The IRQ worker on PVHVM goes through the native path and uses the > > > + * IPI mechanism. > > > + */ > > > + if (xen_hvm_domain()) > > > + return 0; > > > + > > > callfunc_name = kasprintf(GFP_KERNEL, "irqwork%d", cpu); > > > rc = bind_ipi_to_irqhandler(XEN_IRQ_WORK_VECTOR, > > > cpu, > > > @@ -167,6 +174,9 @@ static int xen_smp_intr_init(unsigned int cpu) > > > if (per_cpu(xen_callfuncsingle_irq, cpu) >= 0) > > > unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu), > > > NULL); > > > + if (xen_hvm_domain()) > > > + return rc; > > > + > > > if (per_cpu(xen_irq_work, cpu) >= 0) > > > unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL); > > > > > > @@ -661,7 +671,8 @@ static void xen_hvm_cpu_die(unsigned int cpu) > > > unbind_from_irqhandler(per_cpu(xen_callfunc_irq, cpu), NULL); > > > unbind_from_irqhandler(per_cpu(xen_debug_irq, cpu), NULL); > > > unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu), NULL); > > > - unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL); > > > + if (!xen_hvm_domain()) > > > + unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL); > > > xen_uninit_lock_cpu(cpu); > > > xen_teardown_timer(cpu); > > > native_cpu_die(cpu); > > > -- > > > 1.8.1.4 > > > >
Konrad Rzeszutek Wilk
2013-May-01 14:57 UTC
Re: [PATCH 8/9] xen/smp/pvhvm: Don''t initialize IRQ_WORKER as we are using the native one.
On Wed, May 01, 2013 at 02:25:16PM +0100, Stefano Stabellini wrote:> On Mon, 29 Apr 2013, Konrad Rzeszutek Wilk wrote: > > On Fri, Apr 26, 2013 at 05:27:20PM +0100, Stefano Stabellini wrote: > > > On Tue, 16 Apr 2013, Konrad Rzeszutek Wilk wrote: > > > > There is no need to use the PV version of the IRQ_WORKER mechanism > > > > as under PVHVM we are using the native version. The native > > > > version is using the SMP API. > > > > > > > > They just sit around unused: > > > > > > > > 69: 0 0 xen-percpu-ipi irqwork0 > > > > 83: 0 0 xen-percpu-ipi irqwork1 > > > > > > > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > > > > > Might be worth trying to make it work instead? > > > Is it just because we don''t set the apic->send_IPI_* functions to the > > > xen specific version on PVHVM? > > > > > > > Right. We use the baremetal mechanism to do it. And it works fine. > > OK, it works fine, but won''t it generate many mores trap and emulate > cycles?No idea. We can certainly make use of the PV IPI mechanism for IRQ_WORKER type mechaism but I would have to play with xentrace to get a good handle of what is involved (And how the v Posted interrupt thing affects this). Right now that is something I can''t do (buried in bugs).
Stefano Stabellini
2013-May-01 15:07 UTC
Re: [PATCH 8/9] xen/smp/pvhvm: Don''t initialize IRQ_WORKER as we are using the native one.
On Wed, 1 May 2013, Konrad Rzeszutek Wilk wrote:> On Wed, May 01, 2013 at 02:25:16PM +0100, Stefano Stabellini wrote: > > On Mon, 29 Apr 2013, Konrad Rzeszutek Wilk wrote: > > > On Fri, Apr 26, 2013 at 05:27:20PM +0100, Stefano Stabellini wrote: > > > > On Tue, 16 Apr 2013, Konrad Rzeszutek Wilk wrote: > > > > > There is no need to use the PV version of the IRQ_WORKER mechanism > > > > > as under PVHVM we are using the native version. The native > > > > > version is using the SMP API. > > > > > > > > > > They just sit around unused: > > > > > > > > > > 69: 0 0 xen-percpu-ipi irqwork0 > > > > > 83: 0 0 xen-percpu-ipi irqwork1 > > > > > > > > > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > > > > > > > Might be worth trying to make it work instead? > > > > Is it just because we don''t set the apic->send_IPI_* functions to the > > > > xen specific version on PVHVM? > > > > > > > > > > Right. We use the baremetal mechanism to do it. And it works fine. > > > > OK, it works fine, but won''t it generate many mores trap and emulate > > cycles? > > No idea. We can certainly make use of the PV IPI mechanism for IRQ_WORKER > type mechaism but I would have to play with xentrace to get a good handle > of what is involved (And how the v Posted interrupt thing affects this). > > Right now that is something I can''t do (buried in bugs).OK