Konrad Rzeszutek Wilk
2013-Jun-05 15:54 UTC
[PATCH] fix kmemleak complaining about memory leaks when offlining/onlining vCPUS (v1).
When I fixed the online/offline vCPU mechanism to work again under PVHVM guests, I saw this: [ 665.026885] kmemleak: 7 new suspected memory leaks (see /sys/kernel/debug/kmemleak) but due to not having enough time did not fix it. This patchset fixes it by stashing the ''char *'' of said interrupts line during their creation. When they are torn down when a VCPU is offline we use said per-cpu pointer to free it. As part of it I also added some error checking and some default values. Please look. arch/x86/xen/smp.c | 91 ++++++++++++++++++++++++++++++++----------------- arch/x86/xen/spinlock.c | 5 +++ arch/x86/xen/time.c | 41 +++++++++++++++------- 3 files changed, 92 insertions(+), 45 deletions(-) Konrad Rzeszutek Wilk (9): xen/smp: Coalesce the free_irq calls in one function. xen/smp: Introduce a common structure to contain the IRQ name and interrupt line. xen/smp: Set the per-cpu IRQ number to a valid default. xen/smp: Don''t leak interrupt name when offlining. xen/spinlock: Don''t leak interrupt name when offlining. xen/time: Encapsulate the struct clock_event_device in another structure. xen/time: Don''t leak interrupt name when offlining. xen/time: Check that the per_cpu data structure has data before freeing. xen/time: Free onlined per-cpu data structure if we want to online it again.
Konrad Rzeszutek Wilk
2013-Jun-05 15:54 UTC
[PATCH 1/9] xen/smp: Coalesce the free_irq calls in one function.
There are two functions that do a bunch of ''free_irq'' on the per_cpu IRQ. Instead of having duplicate code just move it to one function. This is just code movement. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- arch/x86/xen/smp.c | 40 +++++++++++++++++++--------------------- 1 file changed, 19 insertions(+), 21 deletions(-) diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c index fb44426..19fc9f3 100644 --- a/arch/x86/xen/smp.c +++ b/arch/x86/xen/smp.c @@ -98,6 +98,23 @@ static void __cpuinit cpu_bringup_and_idle(void) cpu_startup_entry(CPUHP_ONLINE); } +static void xen_smp_intr_free(unsigned int cpu) +{ + if (per_cpu(xen_resched_irq, cpu) >= 0) + unbind_from_irqhandler(per_cpu(xen_resched_irq, cpu), NULL); + if (per_cpu(xen_callfunc_irq, cpu) >= 0) + unbind_from_irqhandler(per_cpu(xen_callfunc_irq, cpu), NULL); + if (per_cpu(xen_debug_irq, cpu) >= 0) + unbind_from_irqhandler(per_cpu(xen_debug_irq, cpu), NULL); + if (per_cpu(xen_callfuncsingle_irq, cpu) >= 0) + unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu), + NULL); + if (xen_hvm_domain()) + return; + + if (per_cpu(xen_irq_work, cpu) >= 0) + unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL); +}; static int xen_smp_intr_init(unsigned int cpu) { int rc; @@ -165,21 +182,7 @@ static int xen_smp_intr_init(unsigned int cpu) return 0; fail: - if (per_cpu(xen_resched_irq, cpu) >= 0) - unbind_from_irqhandler(per_cpu(xen_resched_irq, cpu), NULL); - if (per_cpu(xen_callfunc_irq, cpu) >= 0) - unbind_from_irqhandler(per_cpu(xen_callfunc_irq, cpu), NULL); - if (per_cpu(xen_debug_irq, cpu) >= 0) - unbind_from_irqhandler(per_cpu(xen_debug_irq, cpu), NULL); - if (per_cpu(xen_callfuncsingle_irq, cpu) >= 0) - unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu), - NULL); - if (xen_hvm_domain()) - return rc; - - if (per_cpu(xen_irq_work, cpu) >= 0) - unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL); - + xen_smp_intr_free(cpu); return rc; } @@ -432,12 +435,7 @@ static void xen_cpu_die(unsigned int cpu) current->state = TASK_UNINTERRUPTIBLE; schedule_timeout(HZ/10); } - unbind_from_irqhandler(per_cpu(xen_resched_irq, cpu), NULL); - unbind_from_irqhandler(per_cpu(xen_callfunc_irq, cpu), NULL); - unbind_from_irqhandler(per_cpu(xen_debug_irq, cpu), NULL); - unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu), NULL); - if (!xen_hvm_domain()) - unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL); + xen_smp_intr_free(cpu); xen_uninit_lock_cpu(cpu); xen_teardown_timer(cpu); } -- 1.8.1.4
Konrad Rzeszutek Wilk
2013-Jun-05 15:54 UTC
[PATCH 2/9] xen/smp: Introduce a common structure to contain the IRQ name and interrupt line.
This patch adds a new structure to contain the common two things that each of the per-cpu interrupts need: - an interrupt number, - and the name of the interrupt (to be added in ''xen/smp: Don''t leak interrupt name when offlining''). This allows us to carry the tuple of the per-cpu interrupt data structure and expand it as we need in the future. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- arch/x86/xen/smp.c | 44 ++++++++++++++++++++++++-------------------- 1 file changed, 24 insertions(+), 20 deletions(-) diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c index 19fc9f3..f5b29ec 100644 --- a/arch/x86/xen/smp.c +++ b/arch/x86/xen/smp.c @@ -39,11 +39,15 @@ cpumask_var_t xen_cpu_initialized_map; -static DEFINE_PER_CPU(int, xen_resched_irq); -static DEFINE_PER_CPU(int, xen_callfunc_irq); -static DEFINE_PER_CPU(int, xen_callfuncsingle_irq); -static DEFINE_PER_CPU(int, xen_irq_work); -static DEFINE_PER_CPU(int, xen_debug_irq) = -1; +struct xen_common_irq { + int irq; + char *name; +}; +static DEFINE_PER_CPU(struct xen_common_irq, xen_resched_irq); +static DEFINE_PER_CPU(struct xen_common_irq, xen_callfunc_irq); +static DEFINE_PER_CPU(struct xen_common_irq, xen_callfuncsingle_irq); +static DEFINE_PER_CPU(struct xen_common_irq, xen_irq_work); +static DEFINE_PER_CPU(struct xen_common_irq, xen_debug_irq) = { .irq = -1 }; static irqreturn_t xen_call_function_interrupt(int irq, void *dev_id); static irqreturn_t xen_call_function_single_interrupt(int irq, void *dev_id); @@ -100,20 +104,20 @@ static void __cpuinit cpu_bringup_and_idle(void) static void xen_smp_intr_free(unsigned int cpu) { - if (per_cpu(xen_resched_irq, cpu) >= 0) - unbind_from_irqhandler(per_cpu(xen_resched_irq, cpu), NULL); - if (per_cpu(xen_callfunc_irq, cpu) >= 0) - unbind_from_irqhandler(per_cpu(xen_callfunc_irq, cpu), NULL); - if (per_cpu(xen_debug_irq, cpu) >= 0) - unbind_from_irqhandler(per_cpu(xen_debug_irq, cpu), NULL); - if (per_cpu(xen_callfuncsingle_irq, cpu) >= 0) - unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu), + if (per_cpu(xen_resched_irq, cpu).irq >= 0) + unbind_from_irqhandler(per_cpu(xen_resched_irq, cpu).irq, NULL); + if (per_cpu(xen_callfunc_irq, cpu).irq >= 0) + unbind_from_irqhandler(per_cpu(xen_callfunc_irq, cpu).irq, NULL); + if (per_cpu(xen_debug_irq, cpu).irq >= 0) + unbind_from_irqhandler(per_cpu(xen_debug_irq, cpu).irq, NULL); + if (per_cpu(xen_callfuncsingle_irq, cpu).irq >= 0) + unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu).irq, NULL); if (xen_hvm_domain()) return; - if (per_cpu(xen_irq_work, cpu) >= 0) - unbind_from_irqhandler(per_cpu(xen_irq_work, cpu), NULL); + if (per_cpu(xen_irq_work, cpu).irq >= 0) + unbind_from_irqhandler(per_cpu(xen_irq_work, cpu).irq, NULL); }; static int xen_smp_intr_init(unsigned int cpu) { @@ -129,7 +133,7 @@ static int xen_smp_intr_init(unsigned int cpu) NULL); if (rc < 0) goto fail; - per_cpu(xen_resched_irq, cpu) = rc; + per_cpu(xen_resched_irq, cpu).irq = rc; callfunc_name = kasprintf(GFP_KERNEL, "callfunc%d", cpu); rc = bind_ipi_to_irqhandler(XEN_CALL_FUNCTION_VECTOR, @@ -140,7 +144,7 @@ static int xen_smp_intr_init(unsigned int cpu) NULL); if (rc < 0) goto fail; - per_cpu(xen_callfunc_irq, cpu) = rc; + per_cpu(xen_callfunc_irq, cpu).irq = rc; debug_name = kasprintf(GFP_KERNEL, "debug%d", cpu); rc = bind_virq_to_irqhandler(VIRQ_DEBUG, cpu, xen_debug_interrupt, @@ -148,7 +152,7 @@ static int xen_smp_intr_init(unsigned int cpu) debug_name, NULL); if (rc < 0) goto fail; - per_cpu(xen_debug_irq, cpu) = rc; + per_cpu(xen_debug_irq, cpu).irq = rc; callfunc_name = kasprintf(GFP_KERNEL, "callfuncsingle%d", cpu); rc = bind_ipi_to_irqhandler(XEN_CALL_FUNCTION_SINGLE_VECTOR, @@ -159,7 +163,7 @@ static int xen_smp_intr_init(unsigned int cpu) NULL); if (rc < 0) goto fail; - per_cpu(xen_callfuncsingle_irq, cpu) = rc; + per_cpu(xen_callfuncsingle_irq, cpu).irq = rc; /* * The IRQ worker on PVHVM goes through the native path and uses the @@ -177,7 +181,7 @@ static int xen_smp_intr_init(unsigned int cpu) NULL); if (rc < 0) goto fail; - per_cpu(xen_irq_work, cpu) = rc; + per_cpu(xen_irq_work, cpu).irq = rc; return 0; -- 1.8.1.4
Konrad Rzeszutek Wilk
2013-Jun-05 15:54 UTC
[PATCH 3/9] xen/smp: Set the per-cpu IRQ number to a valid default.
When we free it we want to make sure to set it to a default value of -1 so that we don''t double-free it (in case somebody calls us twice). Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- arch/x86/xen/smp.c | 28 +++++++++++++++++++--------- 1 file changed, 19 insertions(+), 9 deletions(-) diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c index f5b29ec..6a483cd 100644 --- a/arch/x86/xen/smp.c +++ b/arch/x86/xen/smp.c @@ -43,10 +43,10 @@ struct xen_common_irq { int irq; char *name; }; -static DEFINE_PER_CPU(struct xen_common_irq, xen_resched_irq); -static DEFINE_PER_CPU(struct xen_common_irq, xen_callfunc_irq); -static DEFINE_PER_CPU(struct xen_common_irq, xen_callfuncsingle_irq); -static DEFINE_PER_CPU(struct xen_common_irq, xen_irq_work); +static DEFINE_PER_CPU(struct xen_common_irq, xen_resched_irq) = { .irq = -1 }; +static DEFINE_PER_CPU(struct xen_common_irq, xen_callfunc_irq) = { .irq = -1 }; +static DEFINE_PER_CPU(struct xen_common_irq, xen_callfuncsingle_irq) = { .irq = -1 }; +static DEFINE_PER_CPU(struct xen_common_irq, xen_irq_work) = { .irq = -1 }; static DEFINE_PER_CPU(struct xen_common_irq, xen_debug_irq) = { .irq = -1 }; static irqreturn_t xen_call_function_interrupt(int irq, void *dev_id); @@ -104,20 +104,30 @@ static void __cpuinit cpu_bringup_and_idle(void) static void xen_smp_intr_free(unsigned int cpu) { - if (per_cpu(xen_resched_irq, cpu).irq >= 0) + if (per_cpu(xen_resched_irq, cpu).irq >= 0) { unbind_from_irqhandler(per_cpu(xen_resched_irq, cpu).irq, NULL); - if (per_cpu(xen_callfunc_irq, cpu).irq >= 0) + per_cpu(xen_resched_irq, cpu).irq = -1; + } + if (per_cpu(xen_callfunc_irq, cpu).irq >= 0) { unbind_from_irqhandler(per_cpu(xen_callfunc_irq, cpu).irq, NULL); - if (per_cpu(xen_debug_irq, cpu).irq >= 0) + per_cpu(xen_callfunc_irq, cpu).irq = -1; + } + if (per_cpu(xen_debug_irq, cpu).irq >= 0) { unbind_from_irqhandler(per_cpu(xen_debug_irq, cpu).irq, NULL); - if (per_cpu(xen_callfuncsingle_irq, cpu).irq >= 0) + per_cpu(xen_debug_irq, cpu).irq = -1; + } + if (per_cpu(xen_callfuncsingle_irq, cpu).irq >= 0) { unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu).irq, NULL); + per_cpu(xen_callfuncsingle_irq, cpu).irq = -1; + } if (xen_hvm_domain()) return; - if (per_cpu(xen_irq_work, cpu).irq >= 0) + if (per_cpu(xen_irq_work, cpu).irq >= 0) { unbind_from_irqhandler(per_cpu(xen_irq_work, cpu).irq, NULL); + per_cpu(xen_irq_work, cpu).irq = -1; + } }; static int xen_smp_intr_init(unsigned int cpu) { -- 1.8.1.4
Konrad Rzeszutek Wilk
2013-Jun-05 15:54 UTC
[PATCH 4/9] xen/smp: Don''t leak interrupt name when offlining.
When the user does: echo 0 > /sys/devices/system/cpu/cpu1/online echo 1 > /sys/devices/system/cpu/cpu1/online kmemleak reports: kmemleak: 7 new suspected memory leaks (see /sys/kernel/debug/kmemleak) unreferenced object 0xffff88003fa51240 (size 32): comm "swapper/0", pid 1, jiffies 4294667339 (age 1027.789s) hex dump (first 32 bytes): 72 65 73 63 68 65 64 31 00 00 00 00 00 00 00 00 resched1........ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff81660721>] kmemleak_alloc+0x21/0x50 [<ffffffff81190aac>] __kmalloc_track_caller+0xec/0x2a0 [<ffffffff812fe1bb>] kvasprintf+0x5b/0x90 [<ffffffff812fe228>] kasprintf+0x38/0x40 [<ffffffff81047ed1>] xen_smp_intr_init+0x41/0x2c0 [<ffffffff816636d3>] xen_cpu_up+0x393/0x3e8 [<ffffffff8166bbf5>] _cpu_up+0xd1/0x14b [<ffffffff8166bd48>] cpu_up+0xd9/0xec [<ffffffff81ae6e4a>] smp_init+0x4b/0xa3 [<ffffffff81ac4981>] kernel_init_freeable+0xdb/0x1e6 [<ffffffff8165ce39>] kernel_init+0x9/0xf0 [<ffffffff8167edfc>] ret_from_fork+0x7c/0xb0 [<ffffffffffffffff>] 0xffffffffffffffff This patch fixes some of it by using the ''struct xen_common_irq->name'' field to stash away the char so that it can be freed when the interrupt line is destroyed. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- arch/x86/xen/smp.c | 17 ++++++++++++++++- 1 file changed, 16 insertions(+), 1 deletion(-) diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c index 6a483cd..37fbe71 100644 --- a/arch/x86/xen/smp.c +++ b/arch/x86/xen/smp.c @@ -107,19 +107,27 @@ static void xen_smp_intr_free(unsigned int cpu) if (per_cpu(xen_resched_irq, cpu).irq >= 0) { unbind_from_irqhandler(per_cpu(xen_resched_irq, cpu).irq, NULL); per_cpu(xen_resched_irq, cpu).irq = -1; + kfree(per_cpu(xen_resched_irq, cpu).name); + per_cpu(xen_resched_irq, cpu).name = NULL; } if (per_cpu(xen_callfunc_irq, cpu).irq >= 0) { unbind_from_irqhandler(per_cpu(xen_callfunc_irq, cpu).irq, NULL); per_cpu(xen_callfunc_irq, cpu).irq = -1; + kfree(per_cpu(xen_callfunc_irq, cpu).name); + per_cpu(xen_callfunc_irq, cpu).name = NULL; } if (per_cpu(xen_debug_irq, cpu).irq >= 0) { unbind_from_irqhandler(per_cpu(xen_debug_irq, cpu).irq, NULL); per_cpu(xen_debug_irq, cpu).irq = -1; + kfree(per_cpu(xen_debug_irq, cpu).name); + per_cpu(xen_debug_irq, cpu).name = NULL; } if (per_cpu(xen_callfuncsingle_irq, cpu).irq >= 0) { unbind_from_irqhandler(per_cpu(xen_callfuncsingle_irq, cpu).irq, NULL); per_cpu(xen_callfuncsingle_irq, cpu).irq = -1; + kfree(per_cpu(xen_callfuncsingle_irq, cpu).name); + per_cpu(xen_callfuncsingle_irq, cpu).name = NULL; } if (xen_hvm_domain()) return; @@ -127,12 +135,14 @@ static void xen_smp_intr_free(unsigned int cpu) if (per_cpu(xen_irq_work, cpu).irq >= 0) { unbind_from_irqhandler(per_cpu(xen_irq_work, cpu).irq, NULL); per_cpu(xen_irq_work, cpu).irq = -1; + kfree(per_cpu(xen_irq_work, cpu).name); + per_cpu(xen_irq_work, cpu).name = NULL; } }; static int xen_smp_intr_init(unsigned int cpu) { int rc; - const char *resched_name, *callfunc_name, *debug_name; + char *resched_name, *callfunc_name, *debug_name; resched_name = kasprintf(GFP_KERNEL, "resched%d", cpu); rc = bind_ipi_to_irqhandler(XEN_RESCHEDULE_VECTOR, @@ -144,6 +154,7 @@ static int xen_smp_intr_init(unsigned int cpu) if (rc < 0) goto fail; per_cpu(xen_resched_irq, cpu).irq = rc; + per_cpu(xen_resched_irq, cpu).name = resched_name; callfunc_name = kasprintf(GFP_KERNEL, "callfunc%d", cpu); rc = bind_ipi_to_irqhandler(XEN_CALL_FUNCTION_VECTOR, @@ -155,6 +166,7 @@ static int xen_smp_intr_init(unsigned int cpu) if (rc < 0) goto fail; per_cpu(xen_callfunc_irq, cpu).irq = rc; + per_cpu(xen_callfunc_irq, cpu).name = callfunc_name; debug_name = kasprintf(GFP_KERNEL, "debug%d", cpu); rc = bind_virq_to_irqhandler(VIRQ_DEBUG, cpu, xen_debug_interrupt, @@ -163,6 +175,7 @@ static int xen_smp_intr_init(unsigned int cpu) if (rc < 0) goto fail; per_cpu(xen_debug_irq, cpu).irq = rc; + per_cpu(xen_debug_irq, cpu).name = debug_name; callfunc_name = kasprintf(GFP_KERNEL, "callfuncsingle%d", cpu); rc = bind_ipi_to_irqhandler(XEN_CALL_FUNCTION_SINGLE_VECTOR, @@ -174,6 +187,7 @@ static int xen_smp_intr_init(unsigned int cpu) if (rc < 0) goto fail; per_cpu(xen_callfuncsingle_irq, cpu).irq = rc; + per_cpu(xen_callfuncsingle_irq, cpu).name = callfunc_name; /* * The IRQ worker on PVHVM goes through the native path and uses the @@ -192,6 +206,7 @@ static int xen_smp_intr_init(unsigned int cpu) if (rc < 0) goto fail; per_cpu(xen_irq_work, cpu).irq = rc; + per_cpu(xen_irq_work, cpu).name = callfunc_name; return 0; -- 1.8.1.4
Konrad Rzeszutek Wilk
2013-Jun-05 15:54 UTC
[PATCH 5/9] xen/spinlock: Don''t leak interrupt name when offlining.
When the user does: echo 0 > /sys/devices/system/cpu/cpu1/online echo 1 > /sys/devices/system/cpu/cpu1/online kmemleak reports: kmemleak: 7 new suspected memory leaks (see /sys/kernel/debug/kmemleak) unreferenced object 0xffff88003fa51260 (size 32): comm "swapper/0", pid 1, jiffies 4294667339 (age 1027.789s) hex dump (first 32 bytes): 73 70 69 6e 6c 6f 63 6b 31 00 00 00 00 00 00 00 spinlock1....... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff81660721>] kmemleak_alloc+0x21/0x50 [<ffffffff81190aac>] __kmalloc_track_caller+0xec/0x2a0 [<ffffffff812fe1bb>] kvasprintf+0x5b/0x90 [<ffffffff812fe228>] kasprintf+0x38/0x40 [<ffffffff81663789>] xen_init_lock_cpu+0x61/0xbe [<ffffffff816633a6>] xen_cpu_up+0x66/0x3e8 [<ffffffff8166bbf5>] _cpu_up+0xd1/0x14b [<ffffffff8166bd48>] cpu_up+0xd9/0xec [<ffffffff81ae6e4a>] smp_init+0x4b/0xa3 [<ffffffff81ac4981>] kernel_init_freeable+0xdb/0x1e6 [<ffffffff8165ce39>] kernel_init+0x9/0xf0 [<ffffffff8167edfc>] ret_from_fork+0x7c/0xb0 [<ffffffffffffffff>] 0xffffffffffffffff Instead of doing it like the "xen/smp: Don''t leak interrupt name when offlining" patch did (which has a per-cpu structure which contains both the IRQ number and char*) we use a per-cpu pointers to a *char. The reason is that the "__this_cpu_read(lock_kicker_irq);" macro blows up with "__bad_size_call_parameter()" as the size of the returned structure is not within the parameters of what it expects and optimizes for. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- arch/x86/xen/spinlock.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/arch/x86/xen/spinlock.c b/arch/x86/xen/spinlock.c index 3002ec1..d6c1857 100644 --- a/arch/x86/xen/spinlock.c +++ b/arch/x86/xen/spinlock.c @@ -7,6 +7,7 @@ #include <linux/debugfs.h> #include <linux/log2.h> #include <linux/gfp.h> +#include <linux/slab.h> #include <asm/paravirt.h> @@ -165,6 +166,7 @@ static int xen_spin_trylock(struct arch_spinlock *lock) return old == 0; } +static DEFINE_PER_CPU(char *, irq_name); static DEFINE_PER_CPU(int, lock_kicker_irq) = -1; static DEFINE_PER_CPU(struct xen_spinlock *, lock_spinners); @@ -385,6 +387,7 @@ void __cpuinit xen_init_lock_cpu(int cpu) if (irq >= 0) { disable_irq(irq); /* make sure it''s never delivered */ per_cpu(lock_kicker_irq, cpu) = irq; + per_cpu(irq_name, cpu) = name; } printk("cpu %d spinlock event irq %d\n", cpu, irq); @@ -401,6 +404,8 @@ void xen_uninit_lock_cpu(int cpu) unbind_from_irqhandler(per_cpu(lock_kicker_irq, cpu), NULL); per_cpu(lock_kicker_irq, cpu) = -1; + kfree(per_cpu(irq_name, cpu)); + per_cpu(irq_name, cpu) = NULL; } void __init xen_init_spinlocks(void) -- 1.8.1.4
Konrad Rzeszutek Wilk
2013-Jun-05 15:54 UTC
[PATCH 6/9] xen/time: Encapsulate the struct clock_event_device in another structure.
We don''t do any code movement. We just encapsulate the struct clock_event_device in a new structure which contains said structure and a pointer to a char *name. The ''name'' will be used in ''xen/time: Don''t leak interrupt name when offlining''. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- arch/x86/xen/time.c | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c index 3d88bfd..5190687 100644 --- a/arch/x86/xen/time.c +++ b/arch/x86/xen/time.c @@ -377,11 +377,16 @@ static const struct clock_event_device xen_vcpuop_clockevent = { static const struct clock_event_device *xen_clockevent &xen_timerop_clockevent; -static DEFINE_PER_CPU(struct clock_event_device, xen_clock_events) = { .irq = -1 }; + +struct xen_clock_event_device { + struct clock_event_device evt; + char *name; +}; +static DEFINE_PER_CPU(struct xen_clock_event_device, xen_clock_events) = { .evt.irq = -1 }; static irqreturn_t xen_timer_interrupt(int irq, void *dev_id) { - struct clock_event_device *evt = &__get_cpu_var(xen_clock_events); + struct clock_event_device *evt = &__get_cpu_var(xen_clock_events).evt; irqreturn_t ret; ret = IRQ_NONE; @@ -401,7 +406,7 @@ void xen_setup_timer(int cpu) struct clock_event_device *evt; int irq; - evt = &per_cpu(xen_clock_events, cpu); + evt = &per_cpu(xen_clock_events, cpu).evt; WARN(evt->irq >= 0, "IRQ%d for CPU%d is already allocated\n", evt->irq, cpu); printk(KERN_INFO "installing Xen timer for CPU %d\n", cpu); @@ -426,7 +431,7 @@ void xen_teardown_timer(int cpu) { struct clock_event_device *evt; BUG_ON(cpu == 0); - evt = &per_cpu(xen_clock_events, cpu); + evt = &per_cpu(xen_clock_events, cpu).evt; unbind_from_irqhandler(evt->irq, NULL); evt->irq = -1; } @@ -435,7 +440,7 @@ void xen_setup_cpu_clockevents(void) { BUG_ON(preemptible()); - clockevents_register_device(&__get_cpu_var(xen_clock_events)); + clockevents_register_device(&__get_cpu_var(xen_clock_events).evt); } void xen_timer_resume(void) -- 1.8.1.4
Konrad Rzeszutek Wilk
2013-Jun-05 15:54 UTC
[PATCH 7/9] xen/time: Don''t leak interrupt name when offlining.
When the user does: echo 0 > /sys/devices/system/cpu/cpu1/online echo 1 > /sys/devices/system/cpu/cpu1/online kmemleak reports: kmemleak: 7 new suspected memory leaks (see /sys/kernel/debug/kmemleak) One of the leaks is from xen/time: unreferenced object 0xffff88003fa51280 (size 32): comm "swapper/0", pid 1, jiffies 4294667339 (age 1027.789s) hex dump (first 32 bytes): 74 69 6d 65 72 31 00 00 00 00 00 00 00 00 00 00 timer1.......... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<ffffffff81660721>] kmemleak_alloc+0x21/0x50 [<ffffffff81190aac>] __kmalloc_track_caller+0xec/0x2a0 [<ffffffff812fe1bb>] kvasprintf+0x5b/0x90 [<ffffffff812fe228>] kasprintf+0x38/0x40 [<ffffffff81041ec1>] xen_setup_timer+0x51/0xf0 [<ffffffff8166339f>] xen_cpu_up+0x5f/0x3e8 [<ffffffff8166bbf5>] _cpu_up+0xd1/0x14b [<ffffffff8166bd48>] cpu_up+0xd9/0xec [<ffffffff81ae6e4a>] smp_init+0x4b/0xa3 [<ffffffff81ac4981>] kernel_init_freeable+0xdb/0x1e6 [<ffffffff8165ce39>] kernel_init+0x9/0xf0 [<ffffffff8167edfc>] ret_from_fork+0x7c/0xb0 [<ffffffffffffffff>] 0xffffffffffffffff This patch fixes it by stashing away the ''name'' in the per-cpu data structure and freeing it when offlining the CPU. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- arch/x86/xen/time.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c index 5190687..011f1bf 100644 --- a/arch/x86/xen/time.c +++ b/arch/x86/xen/time.c @@ -14,6 +14,7 @@ #include <linux/kernel_stat.h> #include <linux/math64.h> #include <linux/gfp.h> +#include <linux/slab.h> #include <asm/pvclock.h> #include <asm/xen/hypervisor.h> @@ -402,7 +403,7 @@ static irqreturn_t xen_timer_interrupt(int irq, void *dev_id) void xen_setup_timer(int cpu) { - const char *name; + char *name; struct clock_event_device *evt; int irq; @@ -425,6 +426,7 @@ void xen_setup_timer(int cpu) evt->cpumask = cpumask_of(cpu); evt->irq = irq; + per_cpu(xen_clock_events, cpu).name = name; } void xen_teardown_timer(int cpu) @@ -434,6 +436,8 @@ void xen_teardown_timer(int cpu) evt = &per_cpu(xen_clock_events, cpu).evt; unbind_from_irqhandler(evt->irq, NULL); evt->irq = -1; + kfree(per_cpu(xen_clock_events, cpu).name); + per_cpu(xen_clock_events, cpu).name = NULL; } void xen_setup_cpu_clockevents(void) -- 1.8.1.4
Konrad Rzeszutek Wilk
2013-Jun-05 15:54 UTC
[PATCH 8/9] xen/time: Check that the per_cpu data structure has data before freeing.
We don''t check whether the per_cpu data structure has actually been freed in the past. This checks it and if it has been freed in the past then just continues on without double-freeing. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- arch/x86/xen/time.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c index 011f1bf..6a56ae0 100644 --- a/arch/x86/xen/time.c +++ b/arch/x86/xen/time.c @@ -434,10 +434,13 @@ void xen_teardown_timer(int cpu) struct clock_event_device *evt; BUG_ON(cpu == 0); evt = &per_cpu(xen_clock_events, cpu).evt; - unbind_from_irqhandler(evt->irq, NULL); - evt->irq = -1; - kfree(per_cpu(xen_clock_events, cpu).name); - per_cpu(xen_clock_events, cpu).name = NULL; + + if (evt->irq >= 0) { + unbind_from_irqhandler(evt->irq, NULL); + evt->irq = -1; + kfree(per_cpu(xen_clock_events, cpu).name); + per_cpu(xen_clock_events, cpu).name = NULL; + } } void xen_setup_cpu_clockevents(void) -- 1.8.1.4
Konrad Rzeszutek Wilk
2013-Jun-05 15:54 UTC
[PATCH 9/9] xen/time: Free onlined per-cpu data structure if we want to online it again.
If the per-cpu time data structure has been onlined already and we are trying to online it again, then free the previous copy before blindly over-writting it. A developer naturally should not call this function multiple times but just in case. Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- arch/x86/xen/time.c | 29 ++++++++++++++++------------- 1 file changed, 16 insertions(+), 13 deletions(-) diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c index 6a56ae0..aec0b14 100644 --- a/arch/x86/xen/time.c +++ b/arch/x86/xen/time.c @@ -401,6 +401,20 @@ static irqreturn_t xen_timer_interrupt(int irq, void *dev_id) return ret; } +void xen_teardown_timer(int cpu) +{ + struct clock_event_device *evt; + BUG_ON(cpu == 0); + evt = &per_cpu(xen_clock_events, cpu).evt; + + if (evt->irq >= 0) { + unbind_from_irqhandler(evt->irq, NULL); + evt->irq = -1; + kfree(per_cpu(xen_clock_events, cpu).name); + per_cpu(xen_clock_events, cpu).name = NULL; + } +} + void xen_setup_timer(int cpu) { char *name; @@ -409,6 +423,8 @@ void xen_setup_timer(int cpu) evt = &per_cpu(xen_clock_events, cpu).evt; WARN(evt->irq >= 0, "IRQ%d for CPU%d is already allocated\n", evt->irq, cpu); + if (evt->irq >= 0) + xen_teardown_timer(cpu); printk(KERN_INFO "installing Xen timer for CPU %d\n", cpu); @@ -429,19 +445,6 @@ void xen_setup_timer(int cpu) per_cpu(xen_clock_events, cpu).name = name; } -void xen_teardown_timer(int cpu) -{ - struct clock_event_device *evt; - BUG_ON(cpu == 0); - evt = &per_cpu(xen_clock_events, cpu).evt; - - if (evt->irq >= 0) { - unbind_from_irqhandler(evt->irq, NULL); - evt->irq = -1; - kfree(per_cpu(xen_clock_events, cpu).name); - per_cpu(xen_clock_events, cpu).name = NULL; - } -} void xen_setup_cpu_clockevents(void) { -- 1.8.1.4
Jan Beulich
2013-Jun-06 07:16 UTC
Re: [PATCH 2/9] xen/smp: Introduce a common structure to contain the IRQ name and interrupt line.
>>> On 05.06.13 at 17:54, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote: > This patch adds a new structure to contain the common two things > that each of the per-cpu interrupts need: > - an interrupt number, > - and the name of the interrupt (to be added in ''xen/smp: Don''t leak > interrupt name when offlining''). > > This allows us to carry the tuple of the per-cpu interrupt data structure > and expand it as we need in the future. > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > --- > arch/x86/xen/smp.c | 44 ++++++++++++++++++++++++-------------------- > 1 file changed, 24 insertions(+), 20 deletions(-) > > diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c > index 19fc9f3..f5b29ec 100644 > --- a/arch/x86/xen/smp.c > +++ b/arch/x86/xen/smp.c > @@ -39,11 +39,15 @@ > > cpumask_var_t xen_cpu_initialized_map; > > -static DEFINE_PER_CPU(int, xen_resched_irq); > -static DEFINE_PER_CPU(int, xen_callfunc_irq); > -static DEFINE_PER_CPU(int, xen_callfuncsingle_irq); > -static DEFINE_PER_CPU(int, xen_irq_work); > -static DEFINE_PER_CPU(int, xen_debug_irq) = -1; > +struct xen_common_irq { > + int irq; > + char *name; > +}; > +static DEFINE_PER_CPU(struct xen_common_irq, xen_resched_irq); > +static DEFINE_PER_CPU(struct xen_common_irq, xen_callfunc_irq); > +static DEFINE_PER_CPU(struct xen_common_irq, xen_callfuncsingle_irq); > +static DEFINE_PER_CPU(struct xen_common_irq, xen_irq_work); > +static DEFINE_PER_CPU(struct xen_common_irq, xen_debug_irq) = { .irq = -1 };I have to admit that I''m quite puzzled by this still being massaged the way it is, rather than getting converted to proper per-CPU IRQs (i.e. with one global IRQ and only per-CPU event channels). Not only conserves this on resources (irrespective of IRQs not being a scarce resources anymore with SPARSE_IRQS), it also makes /proc/interrupts output a lot more manageable on domains with many vCPU-s. I did this years ago for our kernels, but due to time constraints can''t offer to do the same for the upstream kernel until we get ready to switch to using it in favor of the forward port. Jan
Konrad Rzeszutek Wilk
2013-Jun-06 18:58 UTC
Re: [PATCH 2/9] xen/smp: Introduce a common structure to contain the IRQ name and interrupt line.
On Thu, Jun 06, 2013 at 08:16:39AM +0100, Jan Beulich wrote:> >>> On 05.06.13 at 17:54, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote: > > This patch adds a new structure to contain the common two things > > that each of the per-cpu interrupts need: > > - an interrupt number, > > - and the name of the interrupt (to be added in ''xen/smp: Don''t leak > > interrupt name when offlining''). > > > > This allows us to carry the tuple of the per-cpu interrupt data structure > > and expand it as we need in the future. > > > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > --- > > arch/x86/xen/smp.c | 44 ++++++++++++++++++++++++-------------------- > > 1 file changed, 24 insertions(+), 20 deletions(-) > > > > diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c > > index 19fc9f3..f5b29ec 100644 > > --- a/arch/x86/xen/smp.c > > +++ b/arch/x86/xen/smp.c > > @@ -39,11 +39,15 @@ > > > > cpumask_var_t xen_cpu_initialized_map; > > > > -static DEFINE_PER_CPU(int, xen_resched_irq); > > -static DEFINE_PER_CPU(int, xen_callfunc_irq); > > -static DEFINE_PER_CPU(int, xen_callfuncsingle_irq); > > -static DEFINE_PER_CPU(int, xen_irq_work); > > -static DEFINE_PER_CPU(int, xen_debug_irq) = -1; > > +struct xen_common_irq { > > + int irq; > > + char *name; > > +}; > > +static DEFINE_PER_CPU(struct xen_common_irq, xen_resched_irq); > > +static DEFINE_PER_CPU(struct xen_common_irq, xen_callfunc_irq); > > +static DEFINE_PER_CPU(struct xen_common_irq, xen_callfuncsingle_irq); > > +static DEFINE_PER_CPU(struct xen_common_irq, xen_irq_work); > > +static DEFINE_PER_CPU(struct xen_common_irq, xen_debug_irq) = { .irq = -1 }; > > I have to admit that I''m quite puzzled by this still being massaged > the way it is, rather than getting converted to proper per-CPU > IRQs (i.e. with one global IRQ and only per-CPU event channels). > Not only conserves this on resources (irrespective of IRQs not > being a scarce resources anymore with SPARSE_IRQS), it also > makes /proc/interrupts output a lot more manageable on domains > with many vCPU-s. > > I did this years ago for our kernels, but due to time constraints > can''t offer to do the same for the upstream kernel until we get > ready to switch to using it in favor of the forward port.I believe it was just the matter of other higher priority items preempting it. Let me put it on the "v3.11 and the future list" so that I won''t forget. Thanks!