Stefano Stabellini
2013-Jul-23 17:26 UTC
[PATCH v3 0/2] make ballooned out pages have a valid mapping at all times
Hi all,

this patch series limits problems caused by TCP retransmits on NFS when
the original block pages were mapped from a foreign domain and the
mapping is now gone. It accomplishes the goal by:

1) mapping all ballooned out pages to a per-cpu "balloon_scratch_page";

2) making sure that once a grant is unmapped, the original mapping to
   the per-cpu balloon_scratch_page is restored atomically.

The first patch accomplishes (1); the second patch uses
GNTTABOP_unmap_and_replace to atomically unmap a grant and restore the
original mapping.


Stefano Stabellini (2):
      xen/balloon: set a mapping for ballooned out pages
      xen/m2p: use GNTTABOP_unmap_and_replace to reinstate the original mapping

 arch/x86/xen/p2m.c    |   22 ++++++++++++------
 drivers/xen/balloon.c |   58 ++++++++++++++++++++++++++++++++++++++++++++++--
 drivers/xen/gntdev.c  |   11 +-------
 include/xen/balloon.h |    3 ++
 4 files changed, 75 insertions(+), 19 deletions(-)

Cheers,

Stefano
Stefano Stabellini
2013-Jul-23 17:27 UTC
[PATCH v3 1/2] xen/balloon: set a mapping for ballooned out pages
Currently ballooned out pages are mapped to 0 and have INVALID_P2M_ENTRY
in the p2m. These ballooned out pages are used to map foreign grants
by gntdev and blkback (see alloc_xenballooned_pages).

Allocate a page per cpu and map all the ballooned out pages to the
corresponding mfn. Set the p2m accordingly. This way reading from a
ballooned out page won't cause a kernel crash (see
http://lists.xen.org/archives/html/xen-devel/2012-12/msg01154.html).

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
CC: alex@alex.org.uk
CC: dcrisan@flexiant.com
---
 drivers/xen/balloon.c |   58 ++++++++++++++++++++++++++++++++++++++++++++++--
 include/xen/balloon.h |    3 ++
 2 files changed, 58 insertions(+), 3 deletions(-)

diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 930fb68..b9260dd 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -36,6 +36,7 @@
  * IN THE SOFTWARE.
  */

+#include <linux/cpu.h>
 #include <linux/kernel.h>
 #include <linux/sched.h>
 #include <linux/errno.h>
@@ -50,6 +51,7 @@
 #include <linux/notifier.h>
 #include <linux/memory.h>
 #include <linux/memory_hotplug.h>
+#include <linux/percpu-defs.h>

 #include <asm/page.h>
 #include <asm/pgalloc.h>
@@ -88,6 +90,8 @@ EXPORT_SYMBOL_GPL(balloon_stats);
 /* We increase/decrease in batches which fit in a page */
 static xen_pfn_t frame_list[PAGE_SIZE / sizeof(unsigned long)];

+static DEFINE_PER_CPU(struct page *, balloon_scratch_page);
+

 #ifdef CONFIG_HIGHMEM
 #define inc_totalhigh_pages() (totalhigh_pages++)
@@ -423,7 +427,8 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp)
 		if (xen_pv_domain() && !PageHighMem(page)) {
 			ret = HYPERVISOR_update_va_mapping(
 				(unsigned long)__va(pfn << PAGE_SHIFT),
-				__pte_ma(0), 0);
+				pfn_pte(page_to_pfn(__get_cpu_var(balloon_scratch_page)),
+					PAGE_KERNEL_RO), 0);
 			BUG_ON(ret);
 		}
 #endif
@@ -436,7 +441,8 @@
 	/* No more mappings: invalidate P2M and add to balloon. */
 	for (i = 0; i < nr_pages; i++) {
 		pfn = mfn_to_pfn(frame_list[i]);
-		__set_phys_to_machine(pfn, INVALID_P2M_ENTRY);
+		__set_phys_to_machine(pfn,
+			pfn_to_mfn(page_to_pfn(__get_cpu_var(balloon_scratch_page))));
 		balloon_append(pfn_to_page(pfn));
 	}

@@ -491,6 +497,18 @@ static void balloon_process(struct work_struct *work)
 	mutex_unlock(&balloon_mutex);
 }

+struct page* get_balloon_scratch_page(void)
+{
+	struct page *ret = get_cpu_var(balloon_scratch_page);
+	BUG_ON(ret == NULL);
+	return ret;
+}
+
+void put_balloon_scratch_page(void)
+{
+	put_cpu_var(balloon_scratch_page);
+}
+
 /* Resets the Xen limit, sets new target, and kicks off processing. */
 void balloon_set_new_target(unsigned long target)
 {
@@ -584,13 +602,47 @@ static void __init balloon_add_region(unsigned long start_pfn,
 	}
 }

+static int __cpuinit balloon_cpu_notify(struct notifier_block *self,
+				    unsigned long action, void *hcpu)
+{
+	int cpu = (long)hcpu;
+	switch (action) {
+	case CPU_UP_PREPARE:
+		if (per_cpu(balloon_scratch_page, cpu) != NULL)
+			break;
+		per_cpu(balloon_scratch_page, cpu) = alloc_page(GFP_KERNEL);
+		if (per_cpu(balloon_scratch_page, cpu) == NULL) {
+			pr_warn("Failed to allocate balloon_scratch_page for cpu %d\n", cpu);
+			return NOTIFY_BAD;
+		}
+		break;
+	default:
+		break;
+	}
+	return NOTIFY_OK;
+}
+
+static struct notifier_block balloon_cpu_notifier __cpuinitdata = {
+	.notifier_call	= balloon_cpu_notify,
+};
+
 static int __init balloon_init(void)
 {
-	int i;
+	int i, cpu;

 	if (!xen_domain())
 		return -ENODEV;

+	for_each_online_cpu(cpu)
+	{
+		per_cpu(balloon_scratch_page, cpu) = alloc_page(GFP_KERNEL);
+		if (per_cpu(balloon_scratch_page, cpu) == NULL) {
+			pr_warn("Failed to allocate balloon_scratch_page for cpu %d\n", cpu);
+			return -ENOMEM;
+		}
+	}
+	register_cpu_notifier(&balloon_cpu_notifier);
+
 	pr_info("xen/balloon: Initialising balloon driver.\n");

 	balloon_stats.current_pages = xen_pv_domain()
diff --git a/include/xen/balloon.h b/include/xen/balloon.h
index cc2e1a7..7a819b7 100644
--- a/include/xen/balloon.h
+++ b/include/xen/balloon.h
@@ -29,6 +29,9 @@ int alloc_xenballooned_pages(int nr_pages, struct page **pages,
 		bool highmem);
 void free_xenballooned_pages(int nr_pages, struct page **pages);

+struct page* get_balloon_scratch_page(void);
+void put_balloon_scratch_page(void);
+
 struct device;
 #ifdef CONFIG_XEN_SELFBALLOONING
 extern int register_xen_selfballooning(struct device *dev);
-- 
1.7.2.5
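For reference, a minimal sketch of how a caller might pair the two new
accessors (the caller below is hypothetical, for illustration only; the
real user is the m2p override code in patch 2). get_balloon_scratch_page()
wraps get_cpu_var(), so it returns the current CPU's scratch page with
preemption disabled until the matching put_balloon_scratch_page():

	/* hypothetical caller, for illustration only */
	struct page *scratch = get_balloon_scratch_page();
	unsigned long addr = (unsigned long)
		__va(page_to_pfn(scratch) << PAGE_SHIFT);

	/* ... use addr as a harmless replacement mapping ... */

	put_balloon_scratch_page();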
Stefano Stabellini
2013-Jul-23 17:27 UTC
[PATCH v3 2/2] xen/m2p: use GNTTABOP_unmap_and_replace to reinstate the original mapping
GNTTABOP_unmap_grant_ref unmaps a grant and replaces it with a 0
mapping instead of reinstating the original mapping.
Doing so separately would be racy.

To unmap a grant and reinstate the original mapping atomically we use
GNTTABOP_unmap_and_replace.
GNTTABOP_unmap_and_replace doesn't work with GNTMAP_contains_pte, so
don't use it for kmaps.

GNTTABOP_unmap_and_replace zeroes the mapping passed in new_addr so we
have to reinstate it, however that is a per-cpu mapping only used for
balloon scratch pages, so we can be sure that it's not going to be
accessed while the mapping is not valid.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: David Vrabel <david.vrabel@citrix.com>
CC: alex@alex.org.uk
CC: dcrisan@flexiant.com
---
 arch/x86/xen/p2m.c   |   22 +++++++++++++++-------
 drivers/xen/gntdev.c |   11 ++---------
 2 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/arch/x86/xen/p2m.c b/arch/x86/xen/p2m.c
index 95fb2aa..0d4ec35 100644
--- a/arch/x86/xen/p2m.c
+++ b/arch/x86/xen/p2m.c
@@ -161,6 +161,7 @@
 #include <asm/xen/page.h>
 #include <asm/xen/hypercall.h>
 #include <asm/xen/hypervisor.h>
+#include <xen/balloon.h>
 #include <xen/grant_table.h>

 #include "multicalls.h"
@@ -967,7 +968,10 @@ int m2p_remove_override(struct page *page,
 	if (kmap_op != NULL) {
 		if (!PageHighMem(page)) {
 			struct multicall_space mcs;
-			struct gnttab_unmap_grant_ref *unmap_op;
+			struct gnttab_unmap_and_replace *unmap_op;
+			struct page *scratch_page = get_balloon_scratch_page();
+			unsigned long scratch_page_address = (unsigned long)
+				__va(page_to_pfn(scratch_page) << PAGE_SHIFT);

 			/*
 			 * It might be that we queued all the m2p grant table
@@ -990,21 +994,25 @@
 			}

 			mcs = xen_mc_entry(
-					sizeof(struct gnttab_unmap_grant_ref));
+					sizeof(struct gnttab_unmap_and_replace));
 			unmap_op = mcs.args;
 			unmap_op->host_addr = kmap_op->host_addr;
+			unmap_op->new_addr = scratch_page_address;
 			unmap_op->handle = kmap_op->handle;
-			unmap_op->dev_bus_addr = 0;

 			MULTI_grant_table_op(mcs.mc,
-					GNTTABOP_unmap_grant_ref, unmap_op, 1);
+					GNTTABOP_unmap_and_replace, unmap_op, 1);

 			xen_mc_issue(PARAVIRT_LAZY_MMU);

-			set_pte_at(&init_mm, address, ptep,
-					pfn_pte(pfn, PAGE_KERNEL));
-			__flush_tlb_single(address);
+			mcs = __xen_mc_entry(0);
+			MULTI_update_va_mapping(mcs.mc, scratch_page_address,
+					pfn_pte(page_to_pfn(get_balloon_scratch_page()),
+					PAGE_KERNEL_RO), 0);
+			xen_mc_issue(PARAVIRT_LAZY_MMU);
+			kmap_op->host_addr = 0;
+			put_balloon_scratch_page();
 		}
 	}
diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 3c8803f..51f4c95 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -270,19 +270,12 @@ static int map_grant_pages(struct grant_map *map)
 		 * with find_grant_ptes.
 		 */
 		for (i = 0; i < map->count; i++) {
-			unsigned level;
 			unsigned long address = (unsigned long)
 				pfn_to_kaddr(page_to_pfn(map->pages[i]));
-			pte_t *ptep;
-			u64 pte_maddr = 0;
 			BUG_ON(PageHighMem(map->pages[i]));

-			ptep = lookup_address(address, &level);
-			pte_maddr = arbitrary_virt_to_machine(ptep).maddr;
-			gnttab_set_map_op(&map->kmap_ops[i], pte_maddr,
-				map->flags |
-					GNTMAP_host_map |
-					GNTMAP_contains_pte,
+			gnttab_set_map_op(&map->kmap_ops[i], address,
+				map->flags | GNTMAP_host_map,
 				map->grants[i].ref,
 				map->grants[i].domid);
 		}
-- 
1.7.2.5
Ian Campbell
2013-Jul-23 18:00 UTC
Re: [PATCH v3 1/2] xen/balloon: set a mapping for ballooned out pages
On Tue, 2013-07-23 at 18:27 +0100, Stefano Stabellini wrote:
> +static int __cpuinit balloon_cpu_notify(struct notifier_block *self,
> +				    unsigned long action, void *hcpu)
> +{
> +	int cpu = (long)hcpu;
> +	switch (action) {
> +	case CPU_UP_PREPARE:
> +		if (per_cpu(balloon_scratch_page, cpu) != NULL)
> +			break;

Thinking about this a bit more -- do we know what happens to the per-cpu
area for a CPU which is unplugged and then reintroduced? Is it preserved
or is it reset?

If it is reset then this gets more complicated :-( We might be able to
use the core mm page reference count, so that when the last reference is
removed the page is automatically reclaimed. We can obviously take a
reference whenever we add a mapping of the trade page, but I'm not sure
we are always on the path which removes such mappings... Even then you
could waste pages for some potentially large amount of time each time
you replug a VCPU.

Urg, I really hope the per-cpu area is preserved!

Ian.
Konrad Rzeszutek Wilk
2013-Jul-23 19:05 UTC
Re: [PATCH v3 1/2] xen/balloon: set a mapping for ballooned out pages
On Tue, Jul 23, 2013 at 07:00:09PM +0100, Ian Campbell wrote:
> On Tue, 2013-07-23 at 18:27 +0100, Stefano Stabellini wrote:
> > +static int __cpuinit balloon_cpu_notify(struct notifier_block *self,
> > +				    unsigned long action, void *hcpu)
> > +{
> > +	int cpu = (long)hcpu;
> > +	switch (action) {
> > +	case CPU_UP_PREPARE:
> > +		if (per_cpu(balloon_scratch_page, cpu) != NULL)
> > +			break;
> 
> Thinking about this a bit more -- do we know what happens to the per-cpu
> area for a CPU which is unplugged and then reintroduced? Is it preserved
> or is it reset?
> 
> If it is reset then this gets more complicated :-( We might be able to
> use the core mm page reference count, so that when the last reference is
> removed the page is automatically reclaimed. We can obviously take a
> reference whenever we add a mapping of the trade page, but I'm not sure
> we are always on the path which removes such mappings... Even then you
> could waste pages for some potentially large amount of time each time
> you replug a VCPU.
> 
> Urg, I really hope the per-cpu area is preserved!

It is. During bootup time you see this:

[ 0.000000] smpboot: Allowing 128 CPUs, 96 hotplug CPU
[ 0.000000] setup_percpu: NR_CPUS:512 nr_cpumask_bits:512 nr_cpu_ids:128 nr_node_ids:1

which means that all of the per_CPU are shrunk down to 128 (from
CONFIG_NR_CPUS=512 was built with) and stays for the lifetime of the kernel.

You might have to clear it when the vCPU comes back up though - otherwise you
will have garbage.

Or you can use the zalloc_cpumask_var_node which will allocate a dynamic
version of this. (based on the possible_cpus - so in this case 128).
> 
> Ian.
> 
Stefano Stabellini
2013-Jul-24 11:05 UTC
Re: [PATCH v3 1/2] xen/balloon: set a mapping for ballooned out pages
On Tue, 23 Jul 2013, Konrad Rzeszutek Wilk wrote:
> On Tue, Jul 23, 2013 at 07:00:09PM +0100, Ian Campbell wrote:
> > On Tue, 2013-07-23 at 18:27 +0100, Stefano Stabellini wrote:
> > > +static int __cpuinit balloon_cpu_notify(struct notifier_block *self,
> > > +				    unsigned long action, void *hcpu)
> > > +{
> > > +	int cpu = (long)hcpu;
> > > +	switch (action) {
> > > +	case CPU_UP_PREPARE:
> > > +		if (per_cpu(balloon_scratch_page, cpu) != NULL)
> > > +			break;
> > 
> > Thinking about this a bit more -- do we know what happens to the per-cpu
> > area for a CPU which is unplugged and then reintroduced? Is it preserved
> > or is it reset?
> > 
> > If it is reset then this gets more complicated :-( We might be able to
> > use the core mm page reference count, so that when the last reference is
> > removed the page is automatically reclaimed. We can obviously take a
> > reference whenever we add a mapping of the trade page, but I'm not sure
> > we are always on the path which removes such mappings... Even then you
> > could waste pages for some potentially large amount of time each time
> > you replug a VCPU.
> > 
> > Urg, I really hope the per-cpu area is preserved!
> 
> It is. During bootup time you see this:
> 
> [ 0.000000] smpboot: Allowing 128 CPUs, 96 hotplug CPU
> [ 0.000000] setup_percpu: NR_CPUS:512 nr_cpumask_bits:512 nr_cpu_ids:128 nr_node_ids:1
> 
> which means that all of the per_CPU are shrunk down to 128 (from
> CONFIG_NR_CPUS=512 was built with) and stays for the lifetime of the kernel.
> 
> You might have to clear it when the vCPU comes back up though - otherwise you
> will have garbage.

I don't see anything in the hotplug code that would modify the value of
the per_cpu area of offline cpus.
Konrad Rzeszutek Wilk
2013-Jul-24 14:58 UTC
Re: [PATCH v3 1/2] xen/balloon: set a mapping for ballooned out pages
On Wed, Jul 24, 2013 at 12:05:05PM +0100, Stefano Stabellini wrote:
> On Tue, 23 Jul 2013, Konrad Rzeszutek Wilk wrote:
> > On Tue, Jul 23, 2013 at 07:00:09PM +0100, Ian Campbell wrote:
> > > On Tue, 2013-07-23 at 18:27 +0100, Stefano Stabellini wrote:
> > > > +static int __cpuinit balloon_cpu_notify(struct notifier_block *self,
> > > > +				    unsigned long action, void *hcpu)
> > > > +{
> > > > +	int cpu = (long)hcpu;
> > > > +	switch (action) {
> > > > +	case CPU_UP_PREPARE:
> > > > +		if (per_cpu(balloon_scratch_page, cpu) != NULL)
> > > > +			break;
> > > 
> > > Thinking about this a bit more -- do we know what happens to the per-cpu
> > > area for a CPU which is unplugged and then reintroduced? Is it preserved
> > > or is it reset?
> > > 
> > > If it is reset then this gets more complicated :-( We might be able to
> > > use the core mm page reference count, so that when the last reference is
> > > removed the page is automatically reclaimed. We can obviously take a
> > > reference whenever we add a mapping of the trade page, but I'm not sure
> > > we are always on the path which removes such mappings... Even then you
> > > could waste pages for some potentially large amount of time each time
> > > you replug a VCPU.
> > > 
> > > Urg, I really hope the per-cpu area is preserved!
> > 
> > It is. During bootup time you see this:
> > 
> > [ 0.000000] smpboot: Allowing 128 CPUs, 96 hotplug CPU
> > [ 0.000000] setup_percpu: NR_CPUS:512 nr_cpumask_bits:512 nr_cpu_ids:128 nr_node_ids:1
> > 
> > which means that all of the per_CPU are shrunk down to 128 (from
> > CONFIG_NR_CPUS=512 was built with) and stays for the lifetime of the kernel.
> > 
> > You might have to clear it when the vCPU comes back up though - otherwise you
> > will have garbage.
> 
> I don't see anything in the hotplug code that would modify the value of
> the per_cpu area of offline cpus.

You might have never onlined the CPUs and the kernel is built with DEBUG options
which poison the page.

Anyhow, doing a memset seems like a prudent thing to do? Perhaps when
built with CONFG_DEBUG_XENFS you add poison values to it?
David Vrabel
2013-Jul-24 17:37 UTC
Re: [Xen-devel] [PATCH v3 1/2] xen/balloon: set a mapping for ballooned out pages
On 23/07/13 18:27, Stefano Stabellini wrote:
> Currently ballooned out pages are mapped to 0 and have INVALID_P2M_ENTRY
> in the p2m. These ballooned out pages are used to map foreign grants
> by gntdev and blkback (see alloc_xenballooned_pages).
> 
> Allocate a page per cpu and map all the ballooned out pages to the
> corresponding mfn. Set the p2m accordingly. This way reading from a
> ballooned out page won't cause a kernel crash (see
> http://lists.xen.org/archives/html/xen-devel/2012-12/msg01154.html).

Reviewed-by: David Vrabel <david.vrabel@citrix.com>

A number of users of DEFINE_PER_CPU() initialize it with
for_each_possible_cpu() without registering a cpu notifier, so I think
there is no risk that offlining a CPU clears its per-cpu data and the
code as-is is fine.

David
Ian Campbell
2013-Jul-25 03:31 UTC
Re: [PATCH v3 1/2] xen/balloon: set a mapping for ballooned out pages
On Wed, 2013-07-24 at 10:58 -0400, Konrad Rzeszutek Wilk wrote:
> On Wed, Jul 24, 2013 at 12:05:05PM +0100, Stefano Stabellini wrote:
> > On Tue, 23 Jul 2013, Konrad Rzeszutek Wilk wrote:
> > > On Tue, Jul 23, 2013 at 07:00:09PM +0100, Ian Campbell wrote:
> > > > On Tue, 2013-07-23 at 18:27 +0100, Stefano Stabellini wrote:
> > > > > +static int __cpuinit balloon_cpu_notify(struct notifier_block *self,
> > > > > +				    unsigned long action, void *hcpu)
> > > > > +{
> > > > > +	int cpu = (long)hcpu;
> > > > > +	switch (action) {
> > > > > +	case CPU_UP_PREPARE:
> > > > > +		if (per_cpu(balloon_scratch_page, cpu) != NULL)
> > > > > +			break;
> > > > 
> > > > Thinking about this a bit more -- do we know what happens to the per-cpu
> > > > area for a CPU which is unplugged and then reintroduced? Is it preserved
> > > > or is it reset?
> > > > 
> > > > If it is reset then this gets more complicated :-( We might be able to
> > > > use the core mm page reference count, so that when the last reference is
> > > > removed the page is automatically reclaimed. We can obviously take a
> > > > reference whenever we add a mapping of the trade page, but I'm not sure
> > > > we are always on the path which removes such mappings... Even then you
> > > > could waste pages for some potentially large amount of time each time
> > > > you replug a VCPU.
> > > > 
> > > > Urg, I really hope the per-cpu area is preserved!
> > > 
> > > It is. During bootup time you see this:
> > > 
> > > [ 0.000000] smpboot: Allowing 128 CPUs, 96 hotplug CPU
> > > [ 0.000000] setup_percpu: NR_CPUS:512 nr_cpumask_bits:512 nr_cpu_ids:128 nr_node_ids:1
> > > 
> > > which means that all of the per_CPU are shrunk down to 128 (from
> > > CONFIG_NR_CPUS=512 was built with) and stays for the lifetime of the kernel.
> > > 
> > > You might have to clear it when the vCPU comes back up though - otherwise you
> > > will have garbage.
> > 
> > I don't see anything in the hotplug code that would modify the value of
> > the per_cpu area of offline cpus.
> 
> You might have never onlined the CPUs and the kernel is built with DEBUG options
> which poison the page.
> 
> Anyhow, doing a memset seems like a prudent thing to do? Perhaps when
> built with CONFG_DEBUG_XENFS you add poison values to it?

The point is that the patches need for the per-cpu areas to *not* be
reinitialised over a vcpu unplug+plug, otherwise we will leak the
original page when we allocate the new one on plug.

We can't just free the page on vcpu unplug because it might still be in
use.

Ian.
Konrad Rzeszutek Wilk
2013-Jul-29 14:10 UTC
Re: [PATCH v3 1/2] xen/balloon: set a mapping for ballooned out pages
On Thu, Jul 25, 2013 at 04:31:07AM +0100, Ian Campbell wrote:
> On Wed, 2013-07-24 at 10:58 -0400, Konrad Rzeszutek Wilk wrote:
> > On Wed, Jul 24, 2013 at 12:05:05PM +0100, Stefano Stabellini wrote:
> > > On Tue, 23 Jul 2013, Konrad Rzeszutek Wilk wrote:
> > > > On Tue, Jul 23, 2013 at 07:00:09PM +0100, Ian Campbell wrote:
> > > > > On Tue, 2013-07-23 at 18:27 +0100, Stefano Stabellini wrote:
> > > > > > +static int __cpuinit balloon_cpu_notify(struct notifier_block *self,
> > > > > > +				    unsigned long action, void *hcpu)
> > > > > > +{
> > > > > > +	int cpu = (long)hcpu;
> > > > > > +	switch (action) {
> > > > > > +	case CPU_UP_PREPARE:
> > > > > > +		if (per_cpu(balloon_scratch_page, cpu) != NULL)
> > > > > > +			break;
> > > > > 
> > > > > Thinking about this a bit more -- do we know what happens to the per-cpu
> > > > > area for a CPU which is unplugged and then reintroduced? Is it preserved
> > > > > or is it reset?
> > > > > 
> > > > > If it is reset then this gets more complicated :-( We might be able to
> > > > > use the core mm page reference count, so that when the last reference is
> > > > > removed the page is automatically reclaimed. We can obviously take a
> > > > > reference whenever we add a mapping of the trade page, but I'm not sure
> > > > > we are always on the path which removes such mappings... Even then you
> > > > > could waste pages for some potentially large amount of time each time
> > > > > you replug a VCPU.
> > > > > 
> > > > > Urg, I really hope the per-cpu area is preserved!
> > > > 
> > > > It is. During bootup time you see this:
> > > > 
> > > > [ 0.000000] smpboot: Allowing 128 CPUs, 96 hotplug CPU
> > > > [ 0.000000] setup_percpu: NR_CPUS:512 nr_cpumask_bits:512 nr_cpu_ids:128 nr_node_ids:1
> > > > 
> > > > which means that all of the per_CPU are shrunk down to 128 (from
> > > > CONFIG_NR_CPUS=512 was built with) and stays for the lifetime of the kernel.
> > > > 
> > > > You might have to clear it when the vCPU comes back up though - otherwise you
> > > > will have garbage.
> > > 
> > > I don't see anything in the hotplug code that would modify the value of
> > > the per_cpu area of offline cpus.
> > 
> > You might have never onlined the CPUs and the kernel is built with DEBUG options
> > which poison the page.
> > 
> > Anyhow, doing a memset seems like a prudent thing to do? Perhaps when
> > built with CONFG_DEBUG_XENFS you add poison values to it?
> 
> The point is that the patches need for the per-cpu areas to *not* be
> reinitialised over a vcpu unplug+plug, otherwise we will leak the
> original page when we allocate the new one on plug.

OK.
> 
> We can't just free the page on vcpu unplug because it might still be in
> use.

I am still worried about before-the-cpu-is-up-the-per-cpu-has-garbage case.
We could add code in the boot-before-smp (so when there is only one CPU) to
do:

for_each_possible(cpu)
	memset(__per_cpu(some_memory),0,sizeof...);

and then I think it satisfies your concerns and mine?
> 
> Ian.
> 
Stefano Stabellini
2013-Aug-04 14:30 UTC
Re: [PATCH v3 1/2] xen/balloon: set a mapping for ballooned out pages
On Mon, 29 Jul 2013, Konrad Rzeszutek Wilk wrote:
> On Thu, Jul 25, 2013 at 04:31:07AM +0100, Ian Campbell wrote:
> > On Wed, 2013-07-24 at 10:58 -0400, Konrad Rzeszutek Wilk wrote:
> > > On Wed, Jul 24, 2013 at 12:05:05PM +0100, Stefano Stabellini wrote:
> > > > On Tue, 23 Jul 2013, Konrad Rzeszutek Wilk wrote:
> > > > > On Tue, Jul 23, 2013 at 07:00:09PM +0100, Ian Campbell wrote:
> > > > > > On Tue, 2013-07-23 at 18:27 +0100, Stefano Stabellini wrote:
> > > > > > > +static int __cpuinit balloon_cpu_notify(struct notifier_block *self,
> > > > > > > +				    unsigned long action, void *hcpu)
> > > > > > > +{
> > > > > > > +	int cpu = (long)hcpu;
> > > > > > > +	switch (action) {
> > > > > > > +	case CPU_UP_PREPARE:
> > > > > > > +		if (per_cpu(balloon_scratch_page, cpu) != NULL)
> > > > > > > +			break;
> > > > > > 
> > > > > > Thinking about this a bit more -- do we know what happens to the per-cpu
> > > > > > area for a CPU which is unplugged and then reintroduced? Is it preserved
> > > > > > or is it reset?
> > > > > > 
> > > > > > If it is reset then this gets more complicated :-( We might be able to
> > > > > > use the core mm page reference count, so that when the last reference is
> > > > > > removed the page is automatically reclaimed. We can obviously take a
> > > > > > reference whenever we add a mapping of the trade page, but I'm not sure
> > > > > > we are always on the path which removes such mappings... Even then you
> > > > > > could waste pages for some potentially large amount of time each time
> > > > > > you replug a VCPU.
> > > > > > 
> > > > > > Urg, I really hope the per-cpu area is preserved!
> > > > > 
> > > > > It is. During bootup time you see this:
> > > > > 
> > > > > [ 0.000000] smpboot: Allowing 128 CPUs, 96 hotplug CPU
> > > > > [ 0.000000] setup_percpu: NR_CPUS:512 nr_cpumask_bits:512 nr_cpu_ids:128 nr_node_ids:1
> > > > > 
> > > > > which means that all of the per_CPU are shrunk down to 128 (from
> > > > > CONFIG_NR_CPUS=512 was built with) and stays for the lifetime of the kernel.
> > > > > 
> > > > > You might have to clear it when the vCPU comes back up though - otherwise you
> > > > > will have garbage.
> > > > 
> > > > I don't see anything in the hotplug code that would modify the value of
> > > > the per_cpu area of offline cpus.
> > > 
> > > You might have never onlined the CPUs and the kernel is built with DEBUG options
> > > which poison the page.
> > > 
> > > Anyhow, doing a memset seems like a prudent thing to do? Perhaps when
> > > built with CONFG_DEBUG_XENFS you add poison values to it?
> > 
> > The point is that the patches need for the per-cpu areas to *not* be
> > reinitialised over a vcpu unplug+plug, otherwise we will leak the
> > original page when we allocate the new one on plug.
> 
> OK.
> > 
> > We can't just free the page on vcpu unplug because it might still be in
> > use.
> 
> I am still worried about before-the-cpu-is-up-the-per-cpu-has-garbage case.
> We could add code in the boot-before-smp (so when there is only one CPU) to
> do:
> 
> for_each_possible(cpu)
> 	memset(__per_cpu(some_memory),0,sizeof...);
> 
> and then I think it satisfies your concerns and mine?

OK, I'll add an early_initcall.
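A rough sketch of what that early_initcall might look like (hypothetical
code, for illustration only; the eventual follow-up patch may differ):
clear the per-cpu scratch page pointers for every possible CPU before any
secondary CPU is brought up, so that CPU_UP_PREPARE never sees garbage in
the per-cpu area.

	/* hypothetical sketch, not the posted patch */
	static int __init balloon_clear_scratch_pages(void)
	{
		int cpu;

		for_each_possible_cpu(cpu)
			per_cpu(balloon_scratch_page, cpu) = NULL;

		return 0;
	}
	early_initcall(balloon_clear_scratch_pages);

early_initcall()s run from do_pre_smp_initcalls(), before the secondary
CPUs are onlined, which matches the "boot-before-smp, only one CPU"
requirement discussed above.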