Stefano Stabellini
2013-Jul-22 16:21 UTC
[PATCH v2 0/2] make ballooned out pages have a valid mapping at all times
Hi all, this patch series limits problems caused by tcp retransmits on NFS when the original block pages were mapped from a foreign domain and now the mapping is gone. It accomplishes the goal by: 1) mapping all ballooned out pages to a per-cpu "balloon_trade_page"; 2) making sure that once a grant is unmapped, the original mapping to the per-cpu balloon_trade_page is restored atomically. The first patch accomplishes (1), the second patch uses GNTTABOP_unmap_and_replace to atomically unmap a grant and restore the original mapping. Stefano Stabellini (2): xen/balloon: set a mapping for ballooned out pages xen/m2p: use GNTTABOP_unmap_and_replace to reinstate the original mapping arch/x86/xen/p2m.c | 20 +++++++++++------ drivers/xen/balloon.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++-- drivers/xen/gntdev.c | 11 +-------- include/xen/balloon.h | 2 + 4 files changed, 68 insertions(+), 19 deletions(-) Cheers, Stefano
Stefano Stabellini
2013-Jul-22 16:28 UTC
[PATCH v2 1/2] xen/balloon: set a mapping for ballooned out pages
Currently ballooned out pages are mapped to 0 and have INVALID_P2M_ENTRY in the p2m. These ballooned out pages are used to map foreign grants by gntdev and blkback (see alloc_xenballooned_pages). Allocate a page per cpu and map all the ballooned out pages to the corresponding mfn. Set the p2m accordingly. This way reading from a ballooned out page won''t cause a kernel crash (see http://lists.xen.org/archives/html/xen-devel/2012-12/msg01154.html). Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> CC: alex@alex.org.uk CC: dcrisan@flexiant.com --- drivers/xen/balloon.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++-- include/xen/balloon.h | 2 + 2 files changed, 53 insertions(+), 3 deletions(-) diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c index 930fb68..82bdd0c 100644 --- a/drivers/xen/balloon.c +++ b/drivers/xen/balloon.c @@ -36,6 +36,7 @@ * IN THE SOFTWARE. */ +#include <linux/cpu.h> #include <linux/kernel.h> #include <linux/sched.h> #include <linux/errno.h> @@ -88,6 +89,8 @@ EXPORT_SYMBOL_GPL(balloon_stats); /* We increase/decrease in batches which fit in a page */ static xen_pfn_t frame_list[PAGE_SIZE / sizeof(unsigned long)]; +static struct page** balloon_trade_pages; + #ifdef CONFIG_HIGHMEM #define inc_totalhigh_pages() (totalhigh_pages++) @@ -423,7 +426,8 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp) if (xen_pv_domain() && !PageHighMem(page)) { ret = HYPERVISOR_update_va_mapping( (unsigned long)__va(pfn << PAGE_SHIFT), - __pte_ma(0), 0); + pfn_pte(page_to_pfn(get_balloon_trade_page()), + PAGE_KERNEL_RO), 0); BUG_ON(ret); } #endif @@ -436,7 +440,8 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp) /* No more mappings: invalidate P2M and add to balloon. */ for (i = 0; i < nr_pages; i++) { pfn = mfn_to_pfn(frame_list[i]); - __set_phys_to_machine(pfn, INVALID_P2M_ENTRY); + __set_phys_to_machine(pfn, + pfn_to_mfn(page_to_pfn(get_balloon_trade_page()))); balloon_append(pfn_to_page(pfn)); } @@ -491,6 +496,12 @@ static void balloon_process(struct work_struct *work) mutex_unlock(&balloon_mutex); } +struct page* get_balloon_trade_page(void) +{ + BUG_ON(balloon_trade_pages[smp_processor_id()] == NULL); + return balloon_trade_pages[smp_processor_id()]; +} + /* Resets the Xen limit, sets new target, and kicks off processing. */ void balloon_set_new_target(unsigned long target) { @@ -584,13 +595,50 @@ static void __init balloon_add_region(unsigned long start_pfn, } } +static int __cpuinit balloon_cpu_notify(struct notifier_block *self, + unsigned long action, void *hcpu) +{ + int cpu = (long)hcpu; + switch (action) { + case CPU_UP_PREPARE: + balloon_trade_pages[cpu] = alloc_page(GFP_KERNEL); + if (balloon_trade_pages[cpu] == NULL) { + pr_warn("Failed to allocate balloon_trade_page for cpu %d\n", cpu); + return NOTIFY_BAD; + } + break; + default: + break; + } + return NOTIFY_OK; +} + +static struct notifier_block balloon_cpu_notifier __cpuinitdata = { + .notifier_call = balloon_cpu_notify, +}; + static int __init balloon_init(void) { - int i; + int i, cpu; if (!xen_domain()) return -ENODEV; + balloon_trade_pages = kzalloc(num_possible_cpus() * sizeof(struct page*), + GFP_KERNEL); + if (balloon_trade_pages == NULL) + return -ENOMEM; + + for_each_online_cpu(cpu) + { + balloon_trade_pages[cpu] = alloc_page(GFP_KERNEL); + if (balloon_trade_pages[cpu] == NULL) { + pr_warn("Failed to allocate balloon_trade_page for cpu %d\n", cpu); + return -ENOMEM; + } + } + register_cpu_notifier(&balloon_cpu_notifier); + pr_info("xen/balloon: Initialising balloon driver.\n"); balloon_stats.current_pages = xen_pv_domain() diff --git a/include/xen/balloon.h b/include/xen/balloon.h index cc2e1a7..e94b4aa 100644 --- a/include/xen/balloon.h +++ b/include/xen/balloon.h @@ -29,6 +29,8 @@ int alloc_xenballooned_pages(int nr_pages, struct page **pages, bool highmem); void free_xenballooned_pages(int nr_pages, struct page **pages); +struct page* get_balloon_trade_page(void); + struct device; #ifdef CONFIG_XEN_SELFBALLOONING extern int register_xen_selfballooning(struct device *dev); -- 1.7.2.5
Stefano Stabellini
2013-Jul-22 16:28 UTC
[PATCH v2 2/2] xen/m2p: use GNTTABOP_unmap_and_replace to reinstate the original mapping
GNTTABOP_unmap_grant_ref unmaps a grant and replaces it with a 0 mapping instead of reinstating the original mapping. Doing so separately would be racy. To unmap a grant and reinstate the original mapping atomically we use GNTTABOP_unmap_and_replace. GNTTABOP_unmap_and_replace doesn''t work with GNTMAP_contains_pte, so don''t use it for kmaps. GNTTABOP_unmap_and_replace zeroes the mapping passed in new_addr so we have to reinstate it, however that is a per-cpu mapping only used for balloon trade pages, so we can be sure that it''s not going to be accessed while the mapping is not valid. Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> CC: alex@alex.org.uk CC: dcrisan@flexiant.com --- arch/x86/xen/p2m.c | 20 +++++++++++++------- drivers/xen/gntdev.c | 11 ++--------- 2 files changed, 15 insertions(+), 16 deletions(-) diff --git a/arch/x86/xen/p2m.c b/arch/x86/xen/p2m.c index 95fb2aa..c8c273d 100644 --- a/arch/x86/xen/p2m.c +++ b/arch/x86/xen/p2m.c @@ -161,6 +161,7 @@ #include <asm/xen/page.h> #include <asm/xen/hypercall.h> #include <asm/xen/hypervisor.h> +#include <xen/balloon.h> #include <xen/grant_table.h> #include "multicalls.h" @@ -967,7 +968,9 @@ int m2p_remove_override(struct page *page, if (kmap_op != NULL) { if (!PageHighMem(page)) { struct multicall_space mcs; - struct gnttab_unmap_grant_ref *unmap_op; + struct gnttab_unmap_and_replace *unmap_op; + unsigned long trade_page_address = (unsigned long) + __va(page_to_pfn(get_balloon_trade_page()) << PAGE_SHIFT); /* * It might be that we queued all the m2p grant table @@ -990,20 +993,23 @@ int m2p_remove_override(struct page *page, } mcs = xen_mc_entry( - sizeof(struct gnttab_unmap_grant_ref)); + sizeof(struct gnttab_unmap_and_replace)); unmap_op = mcs.args; unmap_op->host_addr = kmap_op->host_addr; + unmap_op->new_addr = trade_page_address; unmap_op->handle = kmap_op->handle; - unmap_op->dev_bus_addr = 0; MULTI_grant_table_op(mcs.mc, - GNTTABOP_unmap_grant_ref, unmap_op, 1); + GNTTABOP_unmap_and_replace, unmap_op, 1); xen_mc_issue(PARAVIRT_LAZY_MMU); - set_pte_at(&init_mm, address, ptep, - pfn_pte(pfn, PAGE_KERNEL)); - __flush_tlb_single(address); + mcs = __xen_mc_entry(0); + MULTI_update_va_mapping(mcs.mc, trade_page_address, + pfn_pte(page_to_pfn(get_balloon_trade_page()), + PAGE_KERNEL_RO), 0); + xen_mc_issue(PARAVIRT_LAZY_MMU); + kmap_op->host_addr = 0; } } diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c index 3c8803f..51f4c95 100644 --- a/drivers/xen/gntdev.c +++ b/drivers/xen/gntdev.c @@ -270,19 +270,12 @@ static int map_grant_pages(struct grant_map *map) * with find_grant_ptes. */ for (i = 0; i < map->count; i++) { - unsigned level; unsigned long address = (unsigned long) pfn_to_kaddr(page_to_pfn(map->pages[i])); - pte_t *ptep; - u64 pte_maddr = 0; BUG_ON(PageHighMem(map->pages[i])); - ptep = lookup_address(address, &level); - pte_maddr = arbitrary_virt_to_machine(ptep).maddr; - gnttab_set_map_op(&map->kmap_ops[i], pte_maddr, - map->flags | - GNTMAP_host_map | - GNTMAP_contains_pte, + gnttab_set_map_op(&map->kmap_ops[i], address, + map->flags | GNTMAP_host_map, map->grants[i].ref, map->grants[i].domid); } -- 1.7.2.5
David Vrabel
2013-Jul-22 16:51 UTC
Re: [PATCH v2 1/2] xen/balloon: set a mapping for ballooned out pages
On 22/07/13 17:28, Stefano Stabellini wrote:> Currently ballooned out pages are mapped to 0 and have INVALID_P2M_ENTRY > in the p2m. These ballooned out pages are used to map foreign grants > by gntdev and blkback (see alloc_xenballooned_pages). > > Allocate a page per cpu and map all the ballooned out pages to the > corresponding mfn. Set the p2m accordingly. This way reading from a > ballooned out page won''t cause a kernel crash (see > http://lists.xen.org/archives/html/xen-devel/2012-12/msg01154.html).Generally in favour of this approach. I can''t help thinking the "trade page" doesn''t really mean anything but I can''t think of a better, more descriptive name.> --- a/drivers/xen/balloon.c > +++ b/drivers/xen/balloon.c > @@ -36,6 +36,7 @@ > * IN THE SOFTWARE. > */ > > +#include <linux/cpu.h> > #include <linux/kernel.h> > #include <linux/sched.h> > #include <linux/errno.h> > @@ -88,6 +89,8 @@ EXPORT_SYMBOL_GPL(balloon_stats); > > /* We increase/decrease in batches which fit in a page */ > static xen_pfn_t frame_list[PAGE_SIZE / sizeof(unsigned long)]; > +static struct page** balloon_trade_pages;static DECLARE_PER_CPU(struct page *, balloon_trade_pages);> #ifdef CONFIG_HIGHMEM > #define inc_totalhigh_pages() (totalhigh_pages++) > @@ -423,7 +426,8 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp) > if (xen_pv_domain() && !PageHighMem(page)) { > ret = HYPERVISOR_update_va_mapping( > (unsigned long)__va(pfn << PAGE_SHIFT), > - __pte_ma(0), 0); > + pfn_pte(page_to_pfn(get_balloon_trade_page()), > + PAGE_KERNEL_RO), 0);Preemption needs to be disabled while using the trade page, see suggestion below.> +struct page* get_balloon_trade_page(void) > +{ > + BUG_ON(balloon_trade_pages[smp_processor_id()] == NULL); > + return balloon_trade_pages[smp_processor_id()]; > +}I think you need a get_balloon_trade_page() and put_balloon_trade_page() pair that call get_cpu_var() and put_cpu_var() to ensure that preemption is disabled while using it. David
David Vrabel
2013-Jul-22 17:02 UTC
Re: [PATCH v2 2/2] xen/m2p: use GNTTABOP_unmap_and_replace to reinstate the original mapping
On 22/07/13 17:28, Stefano Stabellini wrote:> GNTTABOP_unmap_grant_ref unmaps a grant and replaces it with a 0 > mapping instead of reinstating the original mapping. > Doing so separately would be racy. > > To unmap a grant and reinstate the original mapping atomically we use > GNTTABOP_unmap_and_replace. > GNTTABOP_unmap_and_replace doesn''t work with GNTMAP_contains_pte, so > don''t use it for kmaps. GNTTABOP_unmap_and_replace zeroes the mapping > passed in new_addr so we have to reinstate it, however that is a > per-cpu mapping only used for balloon trade pages, so we can be sure that > it''s not going to be accessed while the mapping is not valid.This solves the problem of userspace accessing a disk image on an NFS mount but what would blkback talking to an iSCSI LUN? Will that need similar fixes to blkback? This series does not need to fix this now though.> --- a/arch/x86/xen/p2m.c > +++ b/arch/x86/xen/p2m.c > @@ -161,6 +161,7 @@ > #include <asm/xen/page.h> > #include <asm/xen/hypercall.h> > #include <asm/xen/hypervisor.h> > +#include <xen/balloon.h> > #include <xen/grant_table.h> > > #include "multicalls.h" > @@ -967,7 +968,9 @@ int m2p_remove_override(struct page *page, > if (kmap_op != NULL) { > if (!PageHighMem(page)) { > struct multicall_space mcs; > - struct gnttab_unmap_grant_ref *unmap_op; > + struct gnttab_unmap_and_replace *unmap_op; > + unsigned long trade_page_address = (unsigned long) > + __va(page_to_pfn(get_balloon_trade_page()) << PAGE_SHIFT);struct page *trade_page = get_balloon_trade_page(); unsigned long trade_page_address = page_address(trade_page); (And the corresponding put_balloon_trade_page() once you''ve added it.) Otherwise, Reviewed-by: David Vrabel <david.vrabel@citrix.com> David
Ian Campbell
2013-Jul-22 17:22 UTC
Re: [PATCH v2 1/2] xen/balloon: set a mapping for ballooned out pages
On Mon, 2013-07-22 at 17:51 +0100, David Vrabel wrote:> On 22/07/13 17:28, Stefano Stabellini wrote: > > Currently ballooned out pages are mapped to 0 and have INVALID_P2M_ENTRY > > in the p2m. These ballooned out pages are used to map foreign grants > > by gntdev and blkback (see alloc_xenballooned_pages). > > > > Allocate a page per cpu and map all the ballooned out pages to the > > corresponding mfn. Set the p2m accordingly. This way reading from a > > ballooned out page won''t cause a kernel crash (see > > http://lists.xen.org/archives/html/xen-devel/2012-12/msg01154.html). > > Generally in favour of this approach. > > I can''t help thinking the "trade page" doesn''t really mean anything but > I can''t think of a better, more descriptive name. > > > --- a/drivers/xen/balloon.c > > +++ b/drivers/xen/balloon.c > > @@ -36,6 +36,7 @@ > > * IN THE SOFTWARE. > > */ > > > > +#include <linux/cpu.h> > > #include <linux/kernel.h> > > #include <linux/sched.h> > > #include <linux/errno.h> > > @@ -88,6 +89,8 @@ EXPORT_SYMBOL_GPL(balloon_stats); > > > > /* We increase/decrease in batches which fit in a page */ > > static xen_pfn_t frame_list[PAGE_SIZE / sizeof(unsigned long)]; > > +static struct page** balloon_trade_pages; > > static DECLARE_PER_CPU(struct page *, balloon_trade_pages);And use this_cpu(balloon_trade_page) etc.> > #ifdef CONFIG_HIGHMEM > > #define inc_totalhigh_pages() (totalhigh_pages++) > > @@ -423,7 +426,8 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp) > > if (xen_pv_domain() && !PageHighMem(page)) { > > ret = HYPERVISOR_update_va_mapping( > > (unsigned long)__va(pfn << PAGE_SHIFT), > > - __pte_ma(0), 0); > > + pfn_pte(page_to_pfn(get_balloon_trade_page()), > > + PAGE_KERNEL_RO), 0); > > Preemption needs to be disabled while using the trade page, see > suggestion below.Hopefully you mean just when setting up/manipulating it? Otherwise that would basically mean all the time since just ballooning out a single page will cause it to be used.> > > +struct page* get_balloon_trade_page(void) > > +{ > > + BUG_ON(balloon_trade_pages[smp_processor_id()] == NULL); > > + return balloon_trade_pages[smp_processor_id()]; > > +} > > I think you need a get_balloon_trade_page() and put_balloon_trade_page() > pair that call get_cpu_var() and put_cpu_var() to ensure that preemption > is disabled while using it. > > David
David Vrabel
2013-Jul-22 17:26 UTC
Re: [PATCH v2 1/2] xen/balloon: set a mapping for ballooned out pages
On 22/07/13 18:22, Ian Campbell wrote:> On Mon, 2013-07-22 at 17:51 +0100, David Vrabel wrote: >> On 22/07/13 17:28, Stefano Stabellini wrote: >>> >>> #ifdef CONFIG_HIGHMEM >>> #define inc_totalhigh_pages() (totalhigh_pages++) >>> @@ -423,7 +426,8 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp) >>> if (xen_pv_domain() && !PageHighMem(page)) { >>> ret = HYPERVISOR_update_va_mapping( >>> (unsigned long)__va(pfn << PAGE_SHIFT), >>> - __pte_ma(0), 0); >>> + pfn_pte(page_to_pfn(get_balloon_trade_page()), >>> + PAGE_KERNEL_RO), 0); >> >> Preemption needs to be disabled while using the trade page, see >> suggestion below. > > Hopefully you mean just when setting up/manipulating it?Yes, sorry. get_...() update_va_mapping() put_...() David
Ian Campbell
2013-Jul-22 17:32 UTC
Re: [PATCH v2 1/2] xen/balloon: set a mapping for ballooned out pages
On Mon, 2013-07-22 at 18:26 +0100, David Vrabel wrote:> On 22/07/13 18:22, Ian Campbell wrote: > > On Mon, 2013-07-22 at 17:51 +0100, David Vrabel wrote: > >> On 22/07/13 17:28, Stefano Stabellini wrote: > >>> > >>> #ifdef CONFIG_HIGHMEM > >>> #define inc_totalhigh_pages() (totalhigh_pages++) > >>> @@ -423,7 +426,8 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp) > >>> if (xen_pv_domain() && !PageHighMem(page)) { > >>> ret = HYPERVISOR_update_va_mapping( > >>> (unsigned long)__va(pfn << PAGE_SHIFT), > >>> - __pte_ma(0), 0); > >>> + pfn_pte(page_to_pfn(get_balloon_trade_page()), > >>> + PAGE_KERNEL_RO), 0); > >> > >> Preemption needs to be disabled while using the trade page, see > >> suggestion below. > > > > Hopefully you mean just when setting up/manipulating it? > > Yes, sorry. > > get_...() > update_va_mapping() > put_...()I can see why it would matter in the unmap_and_replace+fixup case (since you need them to happen "atomically") but why in this case? We don''t actually care which of the trade pages gets used for this purpose, so even if we happen to get preempted and change CPU it doesn''t really matter. Hrm, unless the CPU goes offline I suppose, ah but we don''t free the page when the cpu goes down (this is good). Oh, that made me notice: + case CPU_UP_PREPARE: + balloon_trade_pages[cpu] = alloc_page(GFP_KERNEL); will leak the previously allocated page if you take the CPU down then up again. Ian.
David Vrabel
2013-Jul-23 09:58 UTC
Re: [PATCH v2 1/2] xen/balloon: set a mapping for ballooned out pages
On 22/07/13 18:32, Ian Campbell wrote:> On Mon, 2013-07-22 at 18:26 +0100, David Vrabel wrote: >> On 22/07/13 18:22, Ian Campbell wrote: >>> On Mon, 2013-07-22 at 17:51 +0100, David Vrabel wrote: >>>> On 22/07/13 17:28, Stefano Stabellini wrote: >>>>> >>>>> #ifdef CONFIG_HIGHMEM >>>>> #define inc_totalhigh_pages() (totalhigh_pages++) >>>>> @@ -423,7 +426,8 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp) >>>>> if (xen_pv_domain() && !PageHighMem(page)) { >>>>> ret = HYPERVISOR_update_va_mapping( >>>>> (unsigned long)__va(pfn << PAGE_SHIFT), >>>>> - __pte_ma(0), 0); >>>>> + pfn_pte(page_to_pfn(get_balloon_trade_page()), >>>>> + PAGE_KERNEL_RO), 0); >>>> >>>> Preemption needs to be disabled while using the trade page, see >>>> suggestion below. >>> >>> Hopefully you mean just when setting up/manipulating it? >> >> Yes, sorry. >> >> get_...() >> update_va_mapping() >> put_...() > > I can see why it would matter in the unmap_and_replace+fixup case (since > you need them to happen "atomically") but why in this case? We don''t > actually care which of the trade pages gets used for this purpose, so > even if we happen to get preempted and change CPU it doesn''t really > matter.If a trade page from another CPU is used, it may concurrently have it''s MFN cleared by a unmap_and_replace call on the other CPU. David
Roger Pau Monné
2013-Jul-23 10:35 UTC
Re: [PATCH v2 2/2] xen/m2p: use GNTTABOP_unmap_and_replace to reinstate the original mapping
On 22/07/13 19:02, David Vrabel wrote:> On 22/07/13 17:28, Stefano Stabellini wrote: >> GNTTABOP_unmap_grant_ref unmaps a grant and replaces it with a 0 >> mapping instead of reinstating the original mapping. >> Doing so separately would be racy. >> >> To unmap a grant and reinstate the original mapping atomically we use >> GNTTABOP_unmap_and_replace. >> GNTTABOP_unmap_and_replace doesn''t work with GNTMAP_contains_pte, so >> don''t use it for kmaps. GNTTABOP_unmap_and_replace zeroes the mapping >> passed in new_addr so we have to reinstate it, however that is a >> per-cpu mapping only used for balloon trade pages, so we can be sure that >> it''s not going to be accessed while the mapping is not valid. > > This solves the problem of userspace accessing a disk image on an NFS > mount but what would blkback talking to an iSCSI LUN? Will that need > similar fixes to blkback? This series does not need to fix this now though.I have not played much with iSCSI and blkback, but I guess the same issue exists there, the only difference is that we don''t perform the grant mappings with GNTMAP_contains_pte, so less modifications are required. Maybe the unmap and reinstate of the trade page address could be placed somewhere more generic, so that I don''t have to also code the multicall in blkback? Adding a proper unmap and replace op that reinstates the trade page address to grant-table.c does seem like the more logical option IMHO. (I can do that after this patch is committed).
Stefano Stabellini
2013-Jul-23 12:37 UTC
Re: [PATCH v2 2/2] xen/m2p: use GNTTABOP_unmap_and_replace to reinstate the original mapping
On Tue, 23 Jul 2013, Roger Pau Monné wrote:> On 22/07/13 19:02, David Vrabel wrote: > > On 22/07/13 17:28, Stefano Stabellini wrote: > >> GNTTABOP_unmap_grant_ref unmaps a grant and replaces it with a 0 > >> mapping instead of reinstating the original mapping. > >> Doing so separately would be racy. > >> > >> To unmap a grant and reinstate the original mapping atomically we use > >> GNTTABOP_unmap_and_replace. > >> GNTTABOP_unmap_and_replace doesn''t work with GNTMAP_contains_pte, so > >> don''t use it for kmaps. GNTTABOP_unmap_and_replace zeroes the mapping > >> passed in new_addr so we have to reinstate it, however that is a > >> per-cpu mapping only used for balloon trade pages, so we can be sure that > >> it''s not going to be accessed while the mapping is not valid. > > > > This solves the problem of userspace accessing a disk image on an NFS > > mount but what would blkback talking to an iSCSI LUN? Will that need > > similar fixes to blkback? This series does not need to fix this now though. > > I have not played much with iSCSI and blkback, but I guess the same > issue exists there, the only difference is that we don''t perform the > grant mappings with GNTMAP_contains_pte, so less modifications are required. > > Maybe the unmap and reinstate of the trade page address could be placed > somewhere more generic, so that I don''t have to also code the multicall > in blkback?I don''t think that can be done with the current limitations of the GNTTABOP_unmap_and_replace hypercall.> Adding a proper unmap and replace op that reinstates the > trade page address to grant-table.c does seem like the more logical > option IMHO. (I can do that after this patch is committed).That might be the only way to solve the problem for blkback. http://marc.info/?l=xen-devel&m=137442818429129 should be a good start. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Stefano Stabellini
2013-Jul-23 12:47 UTC
Re: [PATCH v2 2/2] xen/m2p: use GNTTABOP_unmap_and_replace to reinstate the original mapping
On Mon, 22 Jul 2013, David Vrabel wrote:> On 22/07/13 17:28, Stefano Stabellini wrote: > > GNTTABOP_unmap_grant_ref unmaps a grant and replaces it with a 0 > > mapping instead of reinstating the original mapping. > > Doing so separately would be racy. > > > > To unmap a grant and reinstate the original mapping atomically we use > > GNTTABOP_unmap_and_replace. > > GNTTABOP_unmap_and_replace doesn''t work with GNTMAP_contains_pte, so > > don''t use it for kmaps. GNTTABOP_unmap_and_replace zeroes the mapping > > passed in new_addr so we have to reinstate it, however that is a > > per-cpu mapping only used for balloon trade pages, so we can be sure that > > it''s not going to be accessed while the mapping is not valid. > > This solves the problem of userspace accessing a disk image on an NFS > mount but what would blkback talking to an iSCSI LUN? Will that need > similar fixes to blkback? This series does not need to fix this now though.No, it wouldn''t. We actually need a better hypercall than GNTTABOP_unmap_and_replace for that.> > --- a/arch/x86/xen/p2m.c > > +++ b/arch/x86/xen/p2m.c > > @@ -161,6 +161,7 @@ > > #include <asm/xen/page.h> > > #include <asm/xen/hypercall.h> > > #include <asm/xen/hypervisor.h> > > +#include <xen/balloon.h> > > #include <xen/grant_table.h> > > > > #include "multicalls.h" > > @@ -967,7 +968,9 @@ int m2p_remove_override(struct page *page, > > if (kmap_op != NULL) { > > if (!PageHighMem(page)) { > > struct multicall_space mcs; > > - struct gnttab_unmap_grant_ref *unmap_op; > > + struct gnttab_unmap_and_replace *unmap_op; > > + unsigned long trade_page_address = (unsigned long) > > + __va(page_to_pfn(get_balloon_trade_page()) << PAGE_SHIFT); > > struct page *trade_page = get_balloon_trade_page(); > unsigned long trade_page_address = page_address(trade_page); > > (And the corresponding put_balloon_trade_page() once you''ve added it.) > > Otherwise, > > Reviewed-by: David Vrabel <david.vrabel@citrix.com>>thanks
Stefano Stabellini
2013-Jul-23 12:49 UTC
Re: [PATCH v2 1/2] xen/balloon: set a mapping for ballooned out pages
On Mon, 22 Jul 2013, Ian Campbell wrote:> On Mon, 2013-07-22 at 18:26 +0100, David Vrabel wrote: > > On 22/07/13 18:22, Ian Campbell wrote: > > > On Mon, 2013-07-22 at 17:51 +0100, David Vrabel wrote: > > >> On 22/07/13 17:28, Stefano Stabellini wrote: > > >>> > > >>> #ifdef CONFIG_HIGHMEM > > >>> #define inc_totalhigh_pages() (totalhigh_pages++) > > >>> @@ -423,7 +426,8 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp) > > >>> if (xen_pv_domain() && !PageHighMem(page)) { > > >>> ret = HYPERVISOR_update_va_mapping( > > >>> (unsigned long)__va(pfn << PAGE_SHIFT), > > >>> - __pte_ma(0), 0); > > >>> + pfn_pte(page_to_pfn(get_balloon_trade_page()), > > >>> + PAGE_KERNEL_RO), 0); > > >> > > >> Preemption needs to be disabled while using the trade page, see > > >> suggestion below. > > > > > > Hopefully you mean just when setting up/manipulating it? > > > > Yes, sorry. > > > > get_...() > > update_va_mapping() > > put_...() > > I can see why it would matter in the unmap_and_replace+fixup case (since > you need them to happen "atomically") but why in this case? We don''t > actually care which of the trade pages gets used for this purpose, so > even if we happen to get preempted and change CPU it doesn''t really > matter. Hrm, unless the CPU goes offline I suppose, ah but we don''t free > the page when the cpu goes down (this is good).I agree with Ian, I think it only matters in the m2p code.> Oh, that made me notice: > > + case CPU_UP_PREPARE: > + balloon_trade_pages[cpu] = alloc_page(GFP_KERNEL); > > will leak the previously allocated page if you take the CPU down then up > again.That means I''ll have to free the page on CPU_DOWN_PREPARE
Konrad Rzeszutek Wilk
2013-Jul-23 13:57 UTC
Re: [PATCH v2 2/2] xen/m2p: use GNTTABOP_unmap_and_replace to reinstate the original mapping
On Mon, Jul 22, 2013 at 06:02:45PM +0100, David Vrabel wrote:> On 22/07/13 17:28, Stefano Stabellini wrote: > > GNTTABOP_unmap_grant_ref unmaps a grant and replaces it with a 0 > > mapping instead of reinstating the original mapping. > > Doing so separately would be racy. > > > > To unmap a grant and reinstate the original mapping atomically we use > > GNTTABOP_unmap_and_replace. > > GNTTABOP_unmap_and_replace doesn''t work with GNTMAP_contains_pte, so > > don''t use it for kmaps. GNTTABOP_unmap_and_replace zeroes the mapping > > passed in new_addr so we have to reinstate it, however that is a > > per-cpu mapping only used for balloon trade pages, so we can be sure that > > it''s not going to be accessed while the mapping is not valid. > > This solves the problem of userspace accessing a disk image on an NFS > mount but what would blkback talking to an iSCSI LUN? Will that need > similar fixes to blkback? This series does not need to fix this now though.I am not sure I follow. The blkback using iSCSI LUNs works just fine (I am using that - I have LVs of guests on an iSCSI disk). I think the problem you are alluding to is when blkfront and netback are both involved. Meaning you have an iSCSI target in an PV DomU guest exporting its LUNs to other domU guests. The iSCSI target resides on top of blkfront. That means that within the guest you have - netback and blkfront using the same M2P entry. That is a seperate issue I think that this patch. But I could also be misremembering the problem statement. Roger had encountered it and dig some diagnosis of it.> > > --- a/arch/x86/xen/p2m.c > > +++ b/arch/x86/xen/p2m.c > > @@ -161,6 +161,7 @@ > > #include <asm/xen/page.h> > > #include <asm/xen/hypercall.h> > > #include <asm/xen/hypervisor.h> > > +#include <xen/balloon.h> > > #include <xen/grant_table.h> > > > > #include "multicalls.h" > > @@ -967,7 +968,9 @@ int m2p_remove_override(struct page *page, > > if (kmap_op != NULL) { > > if (!PageHighMem(page)) { > > struct multicall_space mcs; > > - struct gnttab_unmap_grant_ref *unmap_op; > > + struct gnttab_unmap_and_replace *unmap_op; > > + unsigned long trade_page_address = (unsigned long) > > + __va(page_to_pfn(get_balloon_trade_page()) << PAGE_SHIFT); > > struct page *trade_page = get_balloon_trade_page(); > unsigned long trade_page_address = page_address(trade_page); > > (And the corresponding put_balloon_trade_page() once you''ve added it.) > > Otherwise, > > Reviewed-by: David Vrabel <david.vrabel@citrix.com> > > David
Konrad Rzeszutek Wilk
2013-Jul-23 13:58 UTC
Re: [PATCH v2 1/2] xen/balloon: set a mapping for ballooned out pages
On Mon, Jul 22, 2013 at 05:51:35PM +0100, David Vrabel wrote:> On 22/07/13 17:28, Stefano Stabellini wrote: > > Currently ballooned out pages are mapped to 0 and have INVALID_P2M_ENTRY > > in the p2m. These ballooned out pages are used to map foreign grants > > by gntdev and blkback (see alloc_xenballooned_pages). > > > > Allocate a page per cpu and map all the ballooned out pages to the > > corresponding mfn. Set the p2m accordingly. This way reading from a > > ballooned out page won''t cause a kernel crash (see > > http://lists.xen.org/archives/html/xen-devel/2012-12/msg01154.html). > > Generally in favour of this approach. > > I can''t help thinking the "trade page" doesn''t really mean anything but > I can''t think of a better, more descriptive name.scratch page?> > > --- a/drivers/xen/balloon.c > > +++ b/drivers/xen/balloon.c > > @@ -36,6 +36,7 @@ > > * IN THE SOFTWARE. > > */ > > > > +#include <linux/cpu.h> > > #include <linux/kernel.h> > > #include <linux/sched.h> > > #include <linux/errno.h> > > @@ -88,6 +89,8 @@ EXPORT_SYMBOL_GPL(balloon_stats); > > > > /* We increase/decrease in batches which fit in a page */ > > static xen_pfn_t frame_list[PAGE_SIZE / sizeof(unsigned long)]; > > +static struct page** balloon_trade_pages; > > static DECLARE_PER_CPU(struct page *, balloon_trade_pages); > > > #ifdef CONFIG_HIGHMEM > > #define inc_totalhigh_pages() (totalhigh_pages++) > > @@ -423,7 +426,8 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp) > > if (xen_pv_domain() && !PageHighMem(page)) { > > ret = HYPERVISOR_update_va_mapping( > > (unsigned long)__va(pfn << PAGE_SHIFT), > > - __pte_ma(0), 0); > > + pfn_pte(page_to_pfn(get_balloon_trade_page()), > > + PAGE_KERNEL_RO), 0); > > Preemption needs to be disabled while using the trade page, see > suggestion below. > > > +struct page* get_balloon_trade_page(void) > > +{ > > + BUG_ON(balloon_trade_pages[smp_processor_id()] == NULL); > > + return balloon_trade_pages[smp_processor_id()]; > > +} > > I think you need a get_balloon_trade_page() and put_balloon_trade_page() > pair that call get_cpu_var() and put_cpu_var() to ensure that preemption > is disabled while using it. > > David
Ian Campbell
2013-Jul-23 14:23 UTC
Re: [PATCH v2 2/2] xen/m2p: use GNTTABOP_unmap_and_replace to reinstate the original mapping
On Tue, 2013-07-23 at 09:57 -0400, Konrad Rzeszutek Wilk wrote:> On Mon, Jul 22, 2013 at 06:02:45PM +0100, David Vrabel wrote: > > On 22/07/13 17:28, Stefano Stabellini wrote: > > > GNTTABOP_unmap_grant_ref unmaps a grant and replaces it with a 0 > > > mapping instead of reinstating the original mapping. > > > Doing so separately would be racy. > > > > > > To unmap a grant and reinstate the original mapping atomically we use > > > GNTTABOP_unmap_and_replace. > > > GNTTABOP_unmap_and_replace doesn''t work with GNTMAP_contains_pte, so > > > don''t use it for kmaps. GNTTABOP_unmap_and_replace zeroes the mapping > > > passed in new_addr so we have to reinstate it, however that is a > > > per-cpu mapping only used for balloon trade pages, so we can be sure that > > > it''s not going to be accessed while the mapping is not valid. > > > > This solves the problem of userspace accessing a disk image on an NFS > > mount but what would blkback talking to an iSCSI LUN? Will that need > > similar fixes to blkback? This series does not need to fix this now though. > > I am not sure I follow. The blkback using iSCSI LUNs works just fine > (I am using that - I have LVs of guests on an iSCSI disk).It is very rare but if you get a TCP retransmit on the iSCSI stream then it is possible to get the same race with an ACK turning up and completing the I/O while the transmit is still pending. I''m pretty sure there have been reports of this in the past,but I can also easily believe you aren''t seeing it, you''d need a heavily stressed and/or flakey network and a big dollop of bad luck... Ian.
Ian Campbell
2013-Jul-23 14:24 UTC
Re: [PATCH v2 1/2] xen/balloon: set a mapping for ballooned out pages
On Tue, 2013-07-23 at 13:49 +0100, Stefano Stabellini wrote:> On Mon, 22 Jul 2013, Ian Campbell wrote: > > On Mon, 2013-07-22 at 18:26 +0100, David Vrabel wrote: > > > On 22/07/13 18:22, Ian Campbell wrote: > > > > On Mon, 2013-07-22 at 17:51 +0100, David Vrabel wrote: > > > >> On 22/07/13 17:28, Stefano Stabellini wrote: > > > >>> > > > >>> #ifdef CONFIG_HIGHMEM > > > >>> #define inc_totalhigh_pages() (totalhigh_pages++) > > > >>> @@ -423,7 +426,8 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp) > > > >>> if (xen_pv_domain() && !PageHighMem(page)) { > > > >>> ret = HYPERVISOR_update_va_mapping( > > > >>> (unsigned long)__va(pfn << PAGE_SHIFT), > > > >>> - __pte_ma(0), 0); > > > >>> + pfn_pte(page_to_pfn(get_balloon_trade_page()), > > > >>> + PAGE_KERNEL_RO), 0); > > > >> > > > >> Preemption needs to be disabled while using the trade page, see > > > >> suggestion below. > > > > > > > > Hopefully you mean just when setting up/manipulating it? > > > > > > Yes, sorry. > > > > > > get_...() > > > update_va_mapping() > > > put_...() > > > > I can see why it would matter in the unmap_and_replace+fixup case (since > > you need them to happen "atomically") but why in this case? We don''t > > actually care which of the trade pages gets used for this purpose, so > > even if we happen to get preempted and change CPU it doesn''t really > > matter. Hrm, unless the CPU goes offline I suppose, ah but we don''t free > > the page when the cpu goes down (this is good). > > I agree with Ian, I think it only matters in the m2p code. > > > > Oh, that made me notice: > > > > + case CPU_UP_PREPARE: > > + balloon_trade_pages[cpu] = alloc_page(GFP_KERNEL); > > > > will leak the previously allocated page if you take the CPU down then up > > again. > > That means I''ll have to free the page on CPU_DOWN_PREPARENo, don''t do that. The page may still be in use in the page tables, taking the CPU down doesn''t make all uses of its trade page go away. You''ll need to add a check in CPU_UP_PREPARE to only allocate a page if one doesn''t already exist, I think. Ian.
Ian Campbell
2013-Jul-23 14:27 UTC
Re: [PATCH v2 1/2] xen/balloon: set a mapping for ballooned out pages
On Tue, 2013-07-23 at 10:58 +0100, David Vrabel wrote:> On 22/07/13 18:32, Ian Campbell wrote: > > On Mon, 2013-07-22 at 18:26 +0100, David Vrabel wrote: > >> On 22/07/13 18:22, Ian Campbell wrote: > >>> On Mon, 2013-07-22 at 17:51 +0100, David Vrabel wrote: > >>>> On 22/07/13 17:28, Stefano Stabellini wrote: > >>>>> > >>>>> #ifdef CONFIG_HIGHMEM > >>>>> #define inc_totalhigh_pages() (totalhigh_pages++) > >>>>> @@ -423,7 +426,8 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp) > >>>>> if (xen_pv_domain() && !PageHighMem(page)) { > >>>>> ret = HYPERVISOR_update_va_mapping( > >>>>> (unsigned long)__va(pfn << PAGE_SHIFT), > >>>>> - __pte_ma(0), 0); > >>>>> + pfn_pte(page_to_pfn(get_balloon_trade_page()), > >>>>> + PAGE_KERNEL_RO), 0); > >>>> > >>>> Preemption needs to be disabled while using the trade page, see > >>>> suggestion below. > >>> > >>> Hopefully you mean just when setting up/manipulating it? > >> > >> Yes, sorry. > >> > >> get_...() > >> update_va_mapping() > >> put_...() > > > > I can see why it would matter in the unmap_and_replace+fixup case (since > > you need them to happen "atomically") but why in this case? We don''t > > actually care which of the trade pages gets used for this purpose, so > > even if we happen to get preempted and change CPU it doesn''t really > > matter. > > If a trade page from another CPU is used, it may concurrently have it''s > MFN cleared by a unmap_and_replace call on the other CPU.The call on the other CPU clears the virtual address, not the MFN, so long as we have the MFN in our hand we are OK, I think? Nothing updates the m2p or p2m does it? Or is there a race between get_trade_page and page_to_pfn (or maybe pfn_pte)?> > David
Alex Bligh
2013-Jul-23 14:48 UTC
Re: [PATCH v2 2/2] xen/m2p: use GNTTABOP_unmap_and_replace to reinstate the original mapping
--On 23 July 2013 15:23:16 +0100 Ian Campbell <ian.campbell@citrix.com> wrote:> but I can also easily > believe you aren''t seeing it, you''d need a heavily stressed and/or > flakey network and a big dollop of bad luck...We appear to have a healthy supply of one or the other ... -- Alex Bligh
Ian Campbell
2013-Jul-23 14:51 UTC
Re: [PATCH v2 2/2] xen/m2p: use GNTTABOP_unmap_and_replace to reinstate the original mapping
On Tue, 2013-07-23 at 15:48 +0100, Alex Bligh wrote:> > --On 23 July 2013 15:23:16 +0100 Ian Campbell <ian.campbell@citrix.com> > wrote: > > > but I can also easily > > believe you aren''t seeing it, you''d need a heavily stressed and/or > > flakey network and a big dollop of bad luck... > > We appear to have a healthy supply of one or the other ...The NFS variant is a little easier to trigger because you also have retransmits at the RPC layer. The pure TCP retransmit issue is harder to hit. Ian.
David Vrabel
2013-Jul-23 15:03 UTC
Re: [PATCH v2 1/2] xen/balloon: set a mapping for ballooned out pages
On 23/07/13 15:27, Ian Campbell wrote:> On Tue, 2013-07-23 at 10:58 +0100, David Vrabel wrote: >> On 22/07/13 18:32, Ian Campbell wrote: >>> On Mon, 2013-07-22 at 18:26 +0100, David Vrabel wrote: >>>> On 22/07/13 18:22, Ian Campbell wrote: >>>>> On Mon, 2013-07-22 at 17:51 +0100, David Vrabel wrote: >>>>>> On 22/07/13 17:28, Stefano Stabellini wrote: >>>>>>> >>>>>>> #ifdef CONFIG_HIGHMEM >>>>>>> #define inc_totalhigh_pages() (totalhigh_pages++) >>>>>>> @@ -423,7 +426,8 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp) >>>>>>> if (xen_pv_domain() && !PageHighMem(page)) { >>>>>>> ret = HYPERVISOR_update_va_mapping( >>>>>>> (unsigned long)__va(pfn << PAGE_SHIFT), >>>>>>> - __pte_ma(0), 0); >>>>>>> + pfn_pte(page_to_pfn(get_balloon_trade_page()), >>>>>>> + PAGE_KERNEL_RO), 0); >>>>>> >>>>>> Preemption needs to be disabled while using the trade page, see >>>>>> suggestion below. >>>>> >>>>> Hopefully you mean just when setting up/manipulating it? >>>> >>>> Yes, sorry. >>>> >>>> get_...() >>>> update_va_mapping() >>>> put_...() >>> >>> I can see why it would matter in the unmap_and_replace+fixup case (since >>> you need them to happen "atomically") but why in this case? We don''t >>> actually care which of the trade pages gets used for this purpose, so >>> even if we happen to get preempted and change CPU it doesn''t really >>> matter. >> >> If a trade page from another CPU is used, it may concurrently have it''s >> MFN cleared by a unmap_and_replace call on the other CPU. > > The call on the other CPU clears the virtual address, not the MFN, so > long as we have the MFN in our hand we are OK, I think? Nothing updates > the m2p or p2m does it?Yes, I think you''re correct here. But, does this need to be mfn_pte(page_to_mfn(get_balloon_trade_page())? The pte parameter to update_va_mapping() needs to contains the machine address, right? David
Alex Bligh
2013-Jul-23 15:06 UTC
Re: [PATCH v2 2/2] xen/m2p: use GNTTABOP_unmap_and_replace to reinstate the original mapping
Ian, --On 23 July 2013 15:51:41 +0100 Ian Campbell <ian.campbell@citrix.com> wrote:>> We appear to have a healthy supply of one or the other ... > > The NFS variant is a little easier to trigger because you also have > retransmits at the RPC layer. The pure TCP retransmit issue is harder to > hit.This figures, as we see it consistently with NFS on a low spec development filer which takes all sorts of abuse, but not on iSCSI (same filer I think). -- Alex Bligh
Ian Campbell
2013-Jul-23 17:04 UTC
Re: [PATCH v2 1/2] xen/balloon: set a mapping for ballooned out pages
On Tue, 2013-07-23 at 16:03 +0100, David Vrabel wrote:> On 23/07/13 15:27, Ian Campbell wrote: > > On Tue, 2013-07-23 at 10:58 +0100, David Vrabel wrote: > >> On 22/07/13 18:32, Ian Campbell wrote: > >>> On Mon, 2013-07-22 at 18:26 +0100, David Vrabel wrote: > >>>> On 22/07/13 18:22, Ian Campbell wrote: > >>>>> On Mon, 2013-07-22 at 17:51 +0100, David Vrabel wrote: > >>>>>> On 22/07/13 17:28, Stefano Stabellini wrote: > >>>>>>> > >>>>>>> #ifdef CONFIG_HIGHMEM > >>>>>>> #define inc_totalhigh_pages() (totalhigh_pages++) > >>>>>>> @@ -423,7 +426,8 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp) > >>>>>>> if (xen_pv_domain() && !PageHighMem(page)) { > >>>>>>> ret = HYPERVISOR_update_va_mapping( > >>>>>>> (unsigned long)__va(pfn << PAGE_SHIFT), > >>>>>>> - __pte_ma(0), 0); > >>>>>>> + pfn_pte(page_to_pfn(get_balloon_trade_page()), > >>>>>>> + PAGE_KERNEL_RO), 0); > >>>>>> > >>>>>> Preemption needs to be disabled while using the trade page, see > >>>>>> suggestion below. > >>>>> > >>>>> Hopefully you mean just when setting up/manipulating it? > >>>> > >>>> Yes, sorry. > >>>> > >>>> get_...() > >>>> update_va_mapping() > >>>> put_...() > >>> > >>> I can see why it would matter in the unmap_and_replace+fixup case (since > >>> you need them to happen "atomically") but why in this case? We don''t > >>> actually care which of the trade pages gets used for this purpose, so > >>> even if we happen to get preempted and change CPU it doesn''t really > >>> matter. > >> > >> If a trade page from another CPU is used, it may concurrently have it''s > >> MFN cleared by a unmap_and_replace call on the other CPU. > > > > The call on the other CPU clears the virtual address, not the MFN, so > > long as we have the MFN in our hand we are OK, I think? Nothing updates > > the m2p or p2m does it? > > Yes, I think you''re correct here.Oh good! That said if other callers do need the locking it would be better to use it consistently everywhere I suppose, plus it means we have to think less hard about correctness ;-)> But, does this need to be mfn_pte(page_to_mfn(get_balloon_trade_page())? > The pte parameter to update_va_mapping() needs to contains the machine > address, right?I think pfn_pte(page_to_pfn(...)) and mfn_pte(page_to_mfn(...)) are equivalent, just the p2m translation happens in pfn_pte vs page_to_mfn. The pfn version is the more normal "generic" thing whereas the mfn versions are Xen special cases used for special things (like mfns with no equivalent pfn, 1:1 mappings and the like). Ian.
Stefano Stabellini
2013-Jul-23 17:20 UTC
Re: [PATCH v2 1/2] xen/balloon: set a mapping for ballooned out pages
On Tue, 23 Jul 2013, David Vrabel wrote:> On 23/07/13 15:27, Ian Campbell wrote: > > On Tue, 2013-07-23 at 10:58 +0100, David Vrabel wrote: > >> On 22/07/13 18:32, Ian Campbell wrote: > >>> On Mon, 2013-07-22 at 18:26 +0100, David Vrabel wrote: > >>>> On 22/07/13 18:22, Ian Campbell wrote: > >>>>> On Mon, 2013-07-22 at 17:51 +0100, David Vrabel wrote: > >>>>>> On 22/07/13 17:28, Stefano Stabellini wrote: > >>>>>>> > >>>>>>> #ifdef CONFIG_HIGHMEM > >>>>>>> #define inc_totalhigh_pages() (totalhigh_pages++) > >>>>>>> @@ -423,7 +426,8 @@ static enum bp_state decrease_reservation(unsigned long nr_pages, gfp_t gfp) > >>>>>>> if (xen_pv_domain() && !PageHighMem(page)) { > >>>>>>> ret = HYPERVISOR_update_va_mapping( > >>>>>>> (unsigned long)__va(pfn << PAGE_SHIFT), > >>>>>>> - __pte_ma(0), 0); > >>>>>>> + pfn_pte(page_to_pfn(get_balloon_trade_page()), > >>>>>>> + PAGE_KERNEL_RO), 0); > >>>>>> > >>>>>> Preemption needs to be disabled while using the trade page, see > >>>>>> suggestion below. > >>>>> > >>>>> Hopefully you mean just when setting up/manipulating it? > >>>> > >>>> Yes, sorry. > >>>> > >>>> get_...() > >>>> update_va_mapping() > >>>> put_...() > >>> > >>> I can see why it would matter in the unmap_and_replace+fixup case (since > >>> you need them to happen "atomically") but why in this case? We don''t > >>> actually care which of the trade pages gets used for this purpose, so > >>> even if we happen to get preempted and change CPU it doesn''t really > >>> matter. > >> > >> If a trade page from another CPU is used, it may concurrently have it''s > >> MFN cleared by a unmap_and_replace call on the other CPU. > > > > The call on the other CPU clears the virtual address, not the MFN, so > > long as we have the MFN in our hand we are OK, I think? Nothing updates > > the m2p or p2m does it? > > Yes, I think you''re correct here.Yep, only the pte corresponding to trade_page_address is cleared. Nothing in Linux access a trade page from trade_page_address. However Xen does, when Linux calls GNTTABOP_unmap_and_replace. So we only need to make sure that the trade_page of this cpu is used in GNTTABOP_unmap_and_replace and the following update_va_mapping.> But, does this need to be mfn_pte(page_to_mfn(get_balloon_trade_page())? > The pte parameter to update_va_mapping() needs to contains the machine > address, right?Yep, pfn_pte does that already.