Jeremy Fitzhardinge
2011-Jan-24 22:55 UTC
[Xen-devel] [PATCH 0/9] Add apply_to_page_range_batch() and use it
From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

I'm proposing this series for 2.6.39.

We've had apply_to_page_range() for a while, which is a general way to
apply a function to ptes across a range of addresses - including
allocating any missing parts of the pagetable as needed.  This logic
is replicated in a number of places throughout the kernel, but it
hasn't been widely replaced by this function, partly because of
concerns about the overhead of calling the function once per pte.

This series adds apply_to_page_range_batch() (and reimplements
apply_to_page_range() in terms of it), which calls the pte operation
function once per pte page, moving the inner loop into the callback
function.

apply_to_page_range(_batch) also calls its callback with lazy mmu
updates enabled, which allows batching of the operations in
environments where this is beneficial (i.e., virtualization).  The only
caveat this introduces is that callbacks can't expect to immediately
see the effects of the pte updates in memory.

Since this is effectively identical to the code in lib/ioremap.c and
mm/vmalloc.c (twice!), I replace their open-coded variants.  I'm sure
there are other places in the kernel which could do with this (I only
stumbled over ioremap by accident).

I also add a minor optimisation to vunmap_page_range() to use a
plain pte_clear() rather than the more expensive and unnecessary
ptep_get_and_clear().

Jeremy Fitzhardinge (9):
  mm: remove unused "token" argument from apply_to_page_range callback.
  mm: add apply_to_page_range_batch()
  ioremap: use apply_to_page_range_batch() for ioremap_page_range()
  vmalloc: use plain pte_clear() for unmaps
  vmalloc: use apply_to_page_range_batch() for vunmap_page_range()
  vmalloc: use apply_to_page_range_batch() for
    vmap_page_range_noflush()
  vmalloc: use apply_to_page_range_batch() in alloc_vm_area()
  xen/mmu: use apply_to_page_range_batch() in
    xen_remap_domain_mfn_range()
  xen/grant-table: use apply_to_page_range_batch()

 arch/x86/xen/grant-table.c |   30 +++++----
 arch/x86/xen/mmu.c         |   18 +++--
 include/linux/mm.h         |    9 ++-
 lib/ioremap.c              |   85 +++++++------------------
 mm/memory.c                |   57 ++++++++++++-----
 mm/vmalloc.c               |  150 ++++++++++++--------------------------------
 6 files changed, 140 insertions(+), 209 deletions(-)

--
1.7.3.4

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
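A rough userspace model of the "once per pte page" behaviour described in the cover letter may help when reading the patches. It is only a sketch: it assumes 4 KiB pages and 512 entries per pte page (the x86-64 values), it does not touch real page tables, and the names walk_range_batch(), count_batch() and measure() are invented for illustration.

```c
#include <stddef.h>

#define PAGE_SIZE     4096UL
#define PTRS_PER_PTE  512UL                       /* assumed x86-64 value */
#define PTE_PAGE_SPAN (PAGE_SIZE * PTRS_PER_PTE)  /* bytes covered by one pte page */

/* Hypothetical stand-in for pte_batch_fn_t; the model carries no pte_t pointer. */
typedef int (*batch_fn_t)(unsigned count, unsigned long addr, void *data);

/*
 * Call fn once for each pte page overlapping [addr, addr + size).
 * addr and size are assumed to be page aligned.
 * Returns the first non-zero value returned by fn, or 0.
 */
static int walk_range_batch(unsigned long addr, unsigned long size,
                            batch_fn_t fn, void *data)
{
    unsigned long end = addr + size;

    while (addr != end) {
        /* Start of the next pte page, clamped to the end of the range.
         * Comparing "- 1" values keeps the clamp correct if next wraps to 0. */
        unsigned long next = (addr + PTE_PAGE_SPAN) & ~(PTE_PAGE_SPAN - 1);
        int err;

        if (next - 1 >= end - 1)
            next = end;

        err = fn((unsigned)((next - addr) / PAGE_SIZE), addr, data);
        if (err)
            return err;
        addr = next;
    }
    return 0;
}

struct walk_stats {
    unsigned long calls;   /* number of callback invocations */
    unsigned long ptes;    /* total entries handed to the callback */
};

/* Batch callback that only records how it was called. */
static int count_batch(unsigned count, unsigned long addr, void *data)
{
    struct walk_stats *st = data;

    (void)addr;
    st->calls++;
    st->ptes += count;
    return 0;
}

/* Walk a range and report how many callbacks were needed. */
static struct walk_stats measure(unsigned long addr, unsigned long size)
{
    struct walk_stats st = { 0, 0 };

    walk_range_batch(addr, size, count_batch, &st);
    return st;
}
```

In this model a 1024-page range starting at address 0 needs two callback invocations, where a per-pte interface would need 1024. A 24-page range that straddles a pte-page boundary is also split into two calls.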
Jeremy Fitzhardinge
2011-Jan-24 22:55 UTC
[Xen-devel] [PATCH 1/9] mm: remove unused "token" argument from apply_to_page_range callback.
From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

The argument is basically the struct page of the pte_t * passed into
the callback.  But there's no need to pass that, since it can be fairly
easily derived from the pte_t * itself if needed (and no current users
need to do that anyway).

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
 arch/x86/xen/grant-table.c |    6 ++----
 arch/x86/xen/mmu.c         |    3 +--
 include/linux/mm.h         |    3 +--
 mm/memory.c                |    2 +-
 mm/vmalloc.c               |    2 +-
 5 files changed, 6 insertions(+), 10 deletions(-)

diff --git a/arch/x86/xen/grant-table.c b/arch/x86/xen/grant-table.c
index 49ba9b5..5bf892a 100644
--- a/arch/x86/xen/grant-table.c
+++ b/arch/x86/xen/grant-table.c
@@ -44,8 +44,7 @@

 #include <asm/pgtable.h>

-static int map_pte_fn(pte_t *pte, struct page *pmd_page,
-		      unsigned long addr, void *data)
+static int map_pte_fn(pte_t *pte, unsigned long addr, void *data)
 {
 	unsigned long **frames = (unsigned long **)data;

@@ -54,8 +53,7 @@ static int map_pte_fn(pte_t *pte, struct page *pmd_page,
 	return 0;
 }

-static int unmap_pte_fn(pte_t *pte, struct page *pmd_page,
-			unsigned long addr, void *data)
+static int unmap_pte_fn(pte_t *pte, unsigned long addr, void *data)
 {

 	set_pte_at(&init_mm, addr, pte, __pte(0));
diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index 5e92b61..38ba804 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -2292,8 +2292,7 @@ struct remap_data {
 	struct mmu_update *mmu_update;
 };

-static int remap_area_mfn_pte_fn(pte_t *ptep, pgtable_t token,
-				 unsigned long addr, void *data)
+static int remap_area_mfn_pte_fn(pte_t *ptep, unsigned long addr, void *data)
 {
 	struct remap_data *rmd = data;
 	pte_t pte = pte_mkspecial(pfn_pte(rmd->mfn++, rmd->prot));
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 956a355..bb898ec 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1529,8 +1529,7 @@ struct page *follow_page(struct vm_area_struct *, unsigned long address,
 #define FOLL_MLOCK	0x40	/* mark page as mlocked */
 #define FOLL_SPLIT	0x80	/* don't return transhuge pages, split them */

-typedef int (*pte_fn_t)(pte_t *pte, pgtable_t token, unsigned long addr,
-			void *data);
+typedef int (*pte_fn_t)(pte_t *pte, unsigned long addr, void *data);
 extern int apply_to_page_range(struct mm_struct *mm, unsigned long address,
 			       unsigned long size, pte_fn_t fn, void *data);

diff --git a/mm/memory.c b/mm/memory.c
index 31250fa..740470c 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2032,7 +2032,7 @@ static int apply_to_pte_range(struct mm_struct *mm, pmd_t *pmd,
 	token = pmd_pgtable(*pmd);

 	do {
-		err = fn(pte++, token, addr, data);
+		err = fn(pte++, addr, data);
 		if (err)
 			break;
 	} while (addr += PAGE_SIZE, addr != end);
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index f9b1667..5ddbdfe 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2061,7 +2061,7 @@ void __attribute__((weak)) vmalloc_sync_all(void)
 }


-static int f(pte_t *pte, pgtable_t table, unsigned long addr, void *data)
+static int f(pte_t *pte, unsigned long addr, void *data)
 {
 	/* apply_to_page_range() does all the hard work. */
 	return 0;
--
1.7.3.4
Jeremy Fitzhardinge
2011-Jan-24 22:56 UTC
[Xen-devel] [PATCH 2/9] mm: add apply_to_page_range_batch()
From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

apply_to_page_range() calls its callback function once for each pte,
which is pretty inefficient since it will almost always be operating
on a batch of adjacent ptes.  apply_to_page_range_batch() calls its
callback with both a pte_t * and a count, so it can operate on multiple
ptes at once.

The callback is expected to handle all its ptes, or return an error.
For both apply_to_page_range and apply_to_page_range_batch, it is up to
the caller to work out how much progress was made if either fails with
an error.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
 include/linux/mm.h |    6 +++++
 mm/memory.c        |   57 +++++++++++++++++++++++++++++++++++++--------------
 2 files changed, 47 insertions(+), 16 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index bb898ec..5a32a8a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1533,6 +1533,12 @@ typedef int (*pte_fn_t)(pte_t *pte, unsigned long addr, void *data);
 extern int apply_to_page_range(struct mm_struct *mm, unsigned long address,
 			       unsigned long size, pte_fn_t fn, void *data);

+typedef int (*pte_batch_fn_t)(pte_t *pte, unsigned count,
+			      unsigned long addr, void *data);
+extern int apply_to_page_range_batch(struct mm_struct *mm,
+				     unsigned long address, unsigned long size,
+				     pte_batch_fn_t fn, void *data);
+
 #ifdef CONFIG_PROC_FS
 void vm_stat_account(struct mm_struct *, unsigned long, struct file *, long);
 #else
diff --git a/mm/memory.c b/mm/memory.c
index 740470c..496e4e6 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2012,11 +2012,10 @@ EXPORT_SYMBOL(remap_pfn_range);

 static int apply_to_pte_range(struct mm_struct *mm, pmd_t *pmd,
 				     unsigned long addr, unsigned long end,
-				     pte_fn_t fn, void *data)
+				     pte_batch_fn_t fn, void *data)
 {
 	pte_t *pte;
 	int err;
-	pgtable_t token;
 	spinlock_t *uninitialized_var(ptl);

 	pte = (mm == &init_mm) ?
@@ -2028,25 +2027,17 @@ static int apply_to_pte_range(struct mm_struct *mm, pmd_t *pmd,
 	BUG_ON(pmd_huge(*pmd));

 	arch_enter_lazy_mmu_mode();
-
-	token = pmd_pgtable(*pmd);
-
-	do {
-		err = fn(pte++, addr, data);
-		if (err)
-			break;
-	} while (addr += PAGE_SIZE, addr != end);
-
+	err = fn(pte, (end - addr) / PAGE_SIZE, addr, data);
 	arch_leave_lazy_mmu_mode();

 	if (mm != &init_mm)
-		pte_unmap_unlock(pte-1, ptl);
+		pte_unmap_unlock(pte, ptl);
 	return err;
 }

 static int apply_to_pmd_range(struct mm_struct *mm, pud_t *pud,
 				     unsigned long addr, unsigned long end,
-				     pte_fn_t fn, void *data)
+				     pte_batch_fn_t fn, void *data)
 {
 	pmd_t *pmd;
 	unsigned long next;
@@ -2068,7 +2059,7 @@ static int apply_to_pmd_range(struct mm_struct *mm, pud_t *pud,

 static int apply_to_pud_range(struct mm_struct *mm, pgd_t *pgd,
 				     unsigned long addr, unsigned long end,
-				     pte_fn_t fn, void *data)
+				     pte_batch_fn_t fn, void *data)
 {
 	pud_t *pud;
 	unsigned long next;
@@ -2090,8 +2081,9 @@ static int apply_to_pud_range(struct mm_struct *mm, pgd_t *pgd,
  * Scan a region of virtual memory, filling in page tables as necessary
  * and calling a provided function on each leaf page table.
  */
-int apply_to_page_range(struct mm_struct *mm, unsigned long addr,
-			unsigned long size, pte_fn_t fn, void *data)
+int apply_to_page_range_batch(struct mm_struct *mm,
+			      unsigned long addr, unsigned long size,
+			      pte_batch_fn_t fn, void *data)
 {
 	pgd_t *pgd;
 	unsigned long next;
@@ -2109,6 +2101,39 @@ int apply_to_page_range(struct mm_struct *mm, unsigned long addr,

 	return err;
 }
+EXPORT_SYMBOL_GPL(apply_to_page_range_batch);
+
+struct pte_single_fn
+{
+	pte_fn_t fn;
+	void *data;
+};
+
+static int apply_pte_batch(pte_t *pte, unsigned count,
+			   unsigned long addr, void *data)
+{
+	struct pte_single_fn *single = data;
+	int err = 0;
+
+	while (count--) {
+		err = single->fn(pte, addr, single->data);
+		if (err)
+			break;
+
+		addr += PAGE_SIZE;
+		pte++;
+	}
+
+	return err;
+}
+
+int apply_to_page_range(struct mm_struct *mm, unsigned long addr,
+			unsigned long size, pte_fn_t fn, void *data)
+{
+	struct pte_single_fn single = { .fn = fn, .data = data };
+	return apply_to_page_range_batch(mm, addr, size,
+					 apply_pte_batch, &single);
+}
 EXPORT_SYMBOL_GPL(apply_to_page_range);

 /*
--
1.7.3.4
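The adapter in this patch (apply_pte_batch() wrapping a per-pte callback) and the "caller works out the progress" rule can be tried outside the kernel. The following is a minimal userspace sketch, assuming a plain unsigned long as a stand-in for pte_t. The names model_pte_t, batch_from_single(), fill_one() and fill_range() are hypothetical and exist only for this illustration.

```c
#include <stddef.h>

#define PAGE_SIZE 4096UL

typedef unsigned long model_pte_t;   /* hypothetical stand-in for pte_t */

typedef int (*single_fn_t)(model_pte_t *pte, unsigned long addr, void *data);
typedef int (*batch_fn_t)(model_pte_t *pte, unsigned count,
                          unsigned long addr, void *data);

/* Bundles a per-entry callback with its private data, as pte_single_fn does. */
struct single_wrap {
    single_fn_t fn;
    void *data;
};

/* Batch callback that replays the batch one entry at a time. */
static int batch_from_single(model_pte_t *pte, unsigned count,
                             unsigned long addr, void *data)
{
    struct single_wrap *w = data;
    int err = 0;

    while (count--) {
        err = w->fn(pte, addr, w->data);
        if (err)
            break;          /* stop at the first failing entry */
        addr += PAGE_SIZE;
        pte++;
    }
    return err;
}

/* Example per-entry callback: fills entries until a budget is used up. */
struct fill_state {
    unsigned done;     /* entries written so far; this is how the caller sees progress */
    unsigned budget;   /* return -1 once this many entries have been written */
};

static int fill_one(model_pte_t *pte, unsigned long addr, void *data)
{
    struct fill_state *st = data;

    if (st->done == st->budget)
        return -1;
    *pte = addr | 1UL;   /* arbitrary "present" marker for the model */
    st->done++;
    return 0;
}

/* Run fill_one over n entries through the adapter; returns its error code. */
static int fill_range(model_pte_t *ptes, unsigned n, unsigned long addr,
                      struct fill_state *st)
{
    struct single_wrap w = { fill_one, st };
    batch_fn_t fn = batch_from_single;

    return fn(ptes, n, addr, &w);
}
```

The error return alone does not say how far the walk got. In this sketch the callback records its own progress in fill_state.done, which is one way a caller could recover that information.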
Jeremy Fitzhardinge
2011-Jan-24 22:56 UTC
[Xen-devel] [PATCH 3/9] ioremap: use apply_to_page_range_batch() for ioremap_page_range()
From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
 lib/ioremap.c |   85 +++++++++++++++------------------------------------------
 1 files changed, 22 insertions(+), 63 deletions(-)

diff --git a/lib/ioremap.c b/lib/ioremap.c
index da4e2ad..e75d0d1 100644
--- a/lib/ioremap.c
+++ b/lib/ioremap.c
@@ -13,81 +13,40 @@
 #include <asm/cacheflush.h>
 #include <asm/pgtable.h>

-static int ioremap_pte_range(pmd_t *pmd, unsigned long addr,
-		unsigned long end, phys_addr_t phys_addr, pgprot_t prot)
+struct ioremap_data
 {
-	pte_t *pte;
+	phys_addr_t phys_addr;
+	pgprot_t prot;
+};
+
+static int ioremap_pte_range(pte_t *pte, unsigned count,
+			     unsigned long addr, void *v)
+{
+	struct ioremap_data *data = v;
 	u64 pfn;

-	pfn = phys_addr >> PAGE_SHIFT;
-	pte = pte_alloc_kernel(pmd, addr);
-	if (!pte)
-		return -ENOMEM;
-	do {
-		BUG_ON(!pte_none(*pte));
-		set_pte_at(&init_mm, addr, pte, pfn_pte(pfn, prot));
-		pfn++;
-	} while (pte++, addr += PAGE_SIZE, addr != end);
-	return 0;
-}
+	pfn = data->phys_addr >> PAGE_SHIFT;
+	data->phys_addr += count * PAGE_SIZE;

-static inline int ioremap_pmd_range(pud_t *pud, unsigned long addr,
-		unsigned long end, phys_addr_t phys_addr, pgprot_t prot)
-{
-	pmd_t *pmd;
-	unsigned long next;
+	while (count--) {
+		BUG_ON(!pte_none(*pte));

-	phys_addr -= addr;
-	pmd = pmd_alloc(&init_mm, pud, addr);
-	if (!pmd)
-		return -ENOMEM;
-	do {
-		next = pmd_addr_end(addr, end);
-		if (ioremap_pte_range(pmd, addr, next, phys_addr + addr, prot))
-			return -ENOMEM;
-	} while (pmd++, addr = next, addr != end);
-	return 0;
-}
+		set_pte_at(&init_mm, addr, pte++, pfn_pte(pfn++, data->prot));

-static inline int ioremap_pud_range(pgd_t *pgd, unsigned long addr,
-		unsigned long end, phys_addr_t phys_addr, pgprot_t prot)
-{
-	pud_t *pud;
-	unsigned long next;
+		addr += PAGE_SIZE;
+	}

-	phys_addr -= addr;
-	pud = pud_alloc(&init_mm, pgd, addr);
-	if (!pud)
-		return -ENOMEM;
-	do {
-		next = pud_addr_end(addr, end);
-		if (ioremap_pmd_range(pud, addr, next, phys_addr + addr, prot))
-			return -ENOMEM;
-	} while (pud++, addr = next, addr != end);
 	return 0;
 }

-int ioremap_page_range(unsigned long addr,
-		       unsigned long end, phys_addr_t phys_addr, pgprot_t prot)
+int ioremap_page_range(unsigned long addr, unsigned long end,
+		       phys_addr_t phys_addr, pgprot_t prot)
 {
-	pgd_t *pgd;
-	unsigned long start;
-	unsigned long next;
-	int err;
-
-	BUG_ON(addr >= end);
-
-	start = addr;
-	phys_addr -= addr;
-	pgd = pgd_offset_k(addr);
-	do {
-		next = pgd_addr_end(addr, end);
-		err = ioremap_pud_range(pgd, addr, next, phys_addr+addr, prot);
-		if (err)
-			break;
-	} while (pgd++, addr = next, addr != end);
+	struct ioremap_data data = { .phys_addr = phys_addr, .prot = prot };
+	int err = apply_to_page_range_batch(&init_mm, addr, end - addr,
+					    ioremap_pte_range, &data);

-	flush_cache_vmap(start, end);
+	flush_cache_vmap(addr, end);

 	return err;
 }
--
1.7.3.4
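The new ioremap_pte_range() relies on state carried in its data argument: each call advances phys_addr by count pages, so the next batch continues where the previous one stopped. A small userspace sketch of that pattern follows, assuming 4 KiB pages and a plain array of frame numbers in place of real ptes. The names remap_cursor and map_batch() are made up for the example.

```c
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* Hypothetical counterpart of struct ioremap_data, without the pgprot. */
struct remap_cursor {
    uint64_t phys_addr;   /* physical address the next batch starts at */
};

/*
 * Batch callback: write consecutive frame numbers into pfns[0..count-1]
 * and advance the cursor so a later call continues from there.
 */
static int map_batch(uint64_t *pfns, unsigned count, void *v)
{
    struct remap_cursor *cur = v;
    uint64_t pfn = cur->phys_addr >> PAGE_SHIFT;

    /* Advance first, as the patch does, before count is consumed below. */
    cur->phys_addr += (uint64_t)count * PAGE_SIZE;

    while (count--)
        *pfns++ = pfn++;

    return 0;
}
```

Calling map_batch() twice on adjacent slices of one array, with the same cursor, yields one unbroken run of frame numbers. The patch depends on the same property whenever a mapping spans more than one pte page.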
Jeremy Fitzhardinge
2011-Jan-24 22:56 UTC
[Xen-devel] [PATCH 4/9] vmalloc: use plain pte_clear() for unmaps
From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

ptep_get_and_clear() is potentially moderately expensive (at least an
atomic operation, or potentially a trap-and-fault when virtualized)
so use a plain pte_clear().

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
 mm/vmalloc.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 5ddbdfe..c06dc1e 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -39,8 +39,9 @@ static void vunmap_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end)

 	pte = pte_offset_kernel(pmd, addr);
 	do {
-		pte_t ptent = ptep_get_and_clear(&init_mm, addr, pte);
+		pte_t ptent = *pte;
 		WARN_ON(!pte_none(ptent) && !pte_present(ptent));
+		pte_clear(&init_mm, addr, pte);
 	} while (pte++, addr += PAGE_SIZE, addr != end);
 }

--
1.7.3.4
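The difference between the two approaches can be modelled in userspace with C11 atomics. This is only an analogy: the real ptep_get_and_clear() and pte_clear() are architecture-specific, and the names model_pte_t, get_and_clear() and read_then_clear() are invented here.

```c
#include <stdatomic.h>

typedef _Atomic unsigned long model_pte_t;   /* hypothetical stand-in for pte_t */

/* Models ptep_get_and_clear(): a single atomic read-modify-write. */
static unsigned long get_and_clear(model_pte_t *pte)
{
    return atomic_exchange(pte, 0UL);
}

/*
 * Models the patched code: an ordinary read followed by an ordinary store.
 * The pair is not atomic, so an update landing between the two steps
 * would be overwritten.
 */
static unsigned long read_then_clear(model_pte_t *pte)
{
    unsigned long old = atomic_load_explicit(pte, memory_order_relaxed);

    atomic_store_explicit(pte, 0UL, memory_order_relaxed);
    return old;
}
```

Both functions leave the entry cleared and return the old value. The second avoids the atomic exchange at the cost of atomicity, and in the patch the value read is only used for the WARN_ON() sanity check.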
Jeremy Fitzhardinge
2011-Jan-24 22:56 UTC
[Xen-devel] [PATCH 5/9] vmalloc: use apply_to_page_range_batch() for vunmap_page_range()
From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

There's no need to open-code it when there's a helpful utility function
to do the job.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Nick Piggin <npiggin@kernel.dk>
---
 mm/vmalloc.c |   53 +++++++++--------------------------------------------
 1 files changed, 9 insertions(+), 44 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index c06dc1e..e99aa3b 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -33,59 +33,24 @@

 /*** Page table manipulation functions ***/

-static void vunmap_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end)
+static int vunmap_pte(pte_t *pte, unsigned count,
+		      unsigned long addr, void *data)
 {
-	pte_t *pte;
-
-	pte = pte_offset_kernel(pmd, addr);
-	do {
+	while (count--) {
 		pte_t ptent = *pte;
-		WARN_ON(!pte_none(ptent) && !pte_present(ptent));
-		pte_clear(&init_mm, addr, pte);
-	} while (pte++, addr += PAGE_SIZE, addr != end);
-}
-
-static void vunmap_pmd_range(pud_t *pud, unsigned long addr, unsigned long end)
-{
-	pmd_t *pmd;
-	unsigned long next;

-	pmd = pmd_offset(pud, addr);
-	do {
-		next = pmd_addr_end(addr, end);
-		if (pmd_none_or_clear_bad(pmd))
-			continue;
-		vunmap_pte_range(pmd, addr, next);
-	} while (pmd++, addr = next, addr != end);
-}
+		WARN_ON(!pte_none(ptent) && !pte_present(ptent));

-static void vunmap_pud_range(pgd_t *pgd, unsigned long addr, unsigned long end)
-{
-	pud_t *pud;
-	unsigned long next;
+		pte_clear(&init_mm, addr, pte++);
+		addr += PAGE_SIZE;
+	}

-	pud = pud_offset(pgd, addr);
-	do {
-		next = pud_addr_end(addr, end);
-		if (pud_none_or_clear_bad(pud))
-			continue;
-		vunmap_pmd_range(pud, addr, next);
-	} while (pud++, addr = next, addr != end);
+	return 0;
 }

 static void vunmap_page_range(unsigned long addr, unsigned long end)
 {
-	pgd_t *pgd;
-	unsigned long next;
-
-	BUG_ON(addr >= end);
-	pgd = pgd_offset_k(addr);
-	do {
-		next = pgd_addr_end(addr, end);
-		if (pgd_none_or_clear_bad(pgd))
-			continue;
-		vunmap_pud_range(pgd, addr, next);
-	} while (pgd++, addr = next, addr != end);
+	apply_to_page_range_batch(&init_mm, addr, end - addr, vunmap_pte, NULL);
 }

 static int vmap_pte_range(pmd_t *pmd, unsigned long addr,
--
1.7.3.4
Jeremy Fitzhardinge
2011-Jan-24 22:56 UTC
[Xen-devel] [PATCH 6/9] vmalloc: use apply_to_page_range_batch() for vmap_page_range_noflush()
From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

There's no need to open-code it when there's a helpful utility
function.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Nick Piggin <npiggin@kernel.dk>
---
 mm/vmalloc.c |   92 ++++++++++++++++++---------------------------------------
 1 files changed, 29 insertions(+), 63 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index e99aa3b..cf4e705 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -53,63 +53,34 @@ static void vunmap_page_range(unsigned long addr, unsigned long end)
 	apply_to_page_range_batch(&init_mm, addr, end - addr, vunmap_pte, NULL);
 }

-static int vmap_pte_range(pmd_t *pmd, unsigned long addr,
-		unsigned long end, pgprot_t prot, struct page **pages, int *nr)
+struct vmap_data
 {
-	pte_t *pte;
+	struct page **pages;
+	unsigned index;
+	pgprot_t prot;
+};

-	/*
-	 * nr is a running index into the array which helps higher level
-	 * callers keep track of where we're up to.
-	 */
+static int vmap_pte(pte_t *pte, unsigned count,
+		    unsigned long addr, void *data)
+{
+	struct vmap_data *vmap = data;

-	pte = pte_alloc_kernel(pmd, addr);
-	if (!pte)
-		return -ENOMEM;
-	do {
-		struct page *page = pages[*nr];
+	while (count--) {
+		struct page *page = vmap->pages[vmap->index];

 		if (WARN_ON(!pte_none(*pte)))
 			return -EBUSY;
+
 		if (WARN_ON(!page))
 			return -ENOMEM;
-		set_pte_at(&init_mm, addr, pte, mk_pte(page, prot));
-		(*nr)++;
-	} while (pte++, addr += PAGE_SIZE, addr != end);
-	return 0;
-}

-static int vmap_pmd_range(pud_t *pud, unsigned long addr,
-		unsigned long end, pgprot_t prot, struct page **pages, int *nr)
-{
-	pmd_t *pmd;
-	unsigned long next;
-
-	pmd = pmd_alloc(&init_mm, pud, addr);
-	if (!pmd)
-		return -ENOMEM;
-	do {
-		next = pmd_addr_end(addr, end);
-		if (vmap_pte_range(pmd, addr, next, prot, pages, nr))
-			return -ENOMEM;
-	} while (pmd++, addr = next, addr != end);
-	return 0;
-}
+		set_pte_at(&init_mm, addr, pte, mk_pte(page, vmap->prot));

-static int vmap_pud_range(pgd_t *pgd, unsigned long addr,
-		unsigned long end, pgprot_t prot, struct page **pages, int *nr)
-{
-	pud_t *pud;
-	unsigned long next;
+		pte++;
+		addr += PAGE_SIZE;
+		vmap->index++;
+	}

-	pud = pud_alloc(&init_mm, pgd, addr);
-	if (!pud)
-		return -ENOMEM;
-	do {
-		next = pud_addr_end(addr, end);
-		if (vmap_pmd_range(pud, addr, next, prot, pages, nr))
-			return -ENOMEM;
-	} while (pud++, addr = next, addr != end);
 	return 0;
 }

@@ -122,22 +93,17 @@ static int vmap_pud_range(pgd_t *pgd, unsigned long addr,
 static int vmap_page_range_noflush(unsigned long start, unsigned long end,
 				   pgprot_t prot, struct page **pages)
 {
-	pgd_t *pgd;
-	unsigned long next;
-	unsigned long addr = start;
-	int err = 0;
-	int nr = 0;
-
-	BUG_ON(addr >= end);
-	pgd = pgd_offset_k(addr);
-	do {
-		next = pgd_addr_end(addr, end);
-		err = vmap_pud_range(pgd, addr, next, prot, pages, &nr);
-		if (err)
-			return err;
-	} while (pgd++, addr = next, addr != end);
-
-	return nr;
+	int err;
+	struct vmap_data vmap = {
+		.pages = pages,
+		.index = 0,
+		.prot = prot
+	};
+
+	err = apply_to_page_range_batch(&init_mm, start, end - start,
+					vmap_pte, &vmap);
+
+	return err ? err : vmap.index;
 }

 static int vmap_page_range(unsigned long start, unsigned long end,
--
1.7.3.4
Jeremy Fitzhardinge
2011-Jan-24 22:56 UTC
[Xen-devel] [PATCH 7/9] vmalloc: use apply_to_page_range_batch() in alloc_vm_area()
From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
 mm/vmalloc.c |    8 ++++----
 1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index cf4e705..64d395f 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1993,9 +1993,9 @@ void __attribute__((weak)) vmalloc_sync_all(void)
 }


-static int f(pte_t *pte, unsigned long addr, void *data)
+static int f(pte_t *pte, unsigned count, unsigned long addr, void *data)
 {
-	/* apply_to_page_range() does all the hard work. */
+	/* apply_to_page_range_batch() does all the hard work. */
 	return 0;
 }

@@ -2024,8 +2024,8 @@ struct vm_struct *alloc_vm_area(size_t size)
 	 * This ensures that page tables are constructed for this region
 	 * of kernel virtual address space and mapped into init_mm.
 	 */
-	if (apply_to_page_range(&init_mm, (unsigned long)area->addr,
-				area->size, f, NULL)) {
+	if (apply_to_page_range_batch(&init_mm, (unsigned long)area->addr,
+				      area->size, f, NULL)) {
 		free_vm_area(area);
 		return NULL;
 	}
--
1.7.3.4
Jeremy Fitzhardinge
2011-Jan-24 22:56 UTC
[Xen-devel] [PATCH 8/9] xen/mmu: use apply_to_page_range_batch() in xen_remap_domain_mfn_range()
From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
 arch/x86/xen/mmu.c |   19 ++++++++++++-------
 1 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index 38ba804..25da278 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -2292,14 +2292,19 @@ struct remap_data {
 	struct mmu_update *mmu_update;
 };

-static int remap_area_mfn_pte_fn(pte_t *ptep, unsigned long addr, void *data)
+static int remap_area_mfn_pte_fn(pte_t *ptep, unsigned count,
+				 unsigned long addr, void *data)
 {
 	struct remap_data *rmd = data;
-	pte_t pte = pte_mkspecial(pfn_pte(rmd->mfn++, rmd->prot));

-	rmd->mmu_update->ptr = arbitrary_virt_to_machine(ptep).maddr;
-	rmd->mmu_update->val = pte_val_ma(pte);
-	rmd->mmu_update++;
+	while (count--) {
+		pte_t pte = pte_mkspecial(pfn_pte(rmd->mfn++, rmd->prot));
+
+		rmd->mmu_update->ptr = arbitrary_virt_to_machine(ptep).maddr;
+		rmd->mmu_update->val = pte_val_ma(pte);
+		rmd->mmu_update++;
+		ptep++;
+	}

 	return 0;
 }
@@ -2328,8 +2333,8 @@ int xen_remap_domain_mfn_range(struct vm_area_struct *vma,
 		range = (unsigned long)batch << PAGE_SHIFT;

 		rmd.mmu_update = mmu_update;
-		err = apply_to_page_range(vma->vm_mm, addr, range,
-					  remap_area_mfn_pte_fn, &rmd);
+		err = apply_to_page_range_batch(vma->vm_mm, addr, range,
+						remap_area_mfn_pte_fn, &rmd);
 		if (err)
 			goto out;

--
1.7.3.4
Jeremy Fitzhardinge
2011-Jan-24 22:56 UTC
[Xen-devel] [PATCH 9/9] xen/grant-table: use apply_to_page_range_batch()
From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>

No need to call the callback per-pte.

Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
---
 arch/x86/xen/grant-table.c |   28 ++++++++++++++++++----------
 1 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/arch/x86/xen/grant-table.c b/arch/x86/xen/grant-table.c
index 5bf892a..11a8a45 100644
--- a/arch/x86/xen/grant-table.c
+++ b/arch/x86/xen/grant-table.c
@@ -44,19 +44,27 @@

 #include <asm/pgtable.h>

-static int map_pte_fn(pte_t *pte, unsigned long addr, void *data)
+static int map_pte_fn(pte_t *pte, unsigned count, unsigned long addr, void *data)
 {
 	unsigned long **frames = (unsigned long **)data;

-	set_pte_at(&init_mm, addr, pte, mfn_pte((*frames)[0], PAGE_KERNEL));
-	(*frames)++;
+	while (count--) {
+		set_pte_at(&init_mm, addr, pte, mfn_pte((*frames)[0], PAGE_KERNEL));
+		(*frames)++;
+		pte++;
+		addr += PAGE_SIZE;
+	}
 	return 0;
 }

-static int unmap_pte_fn(pte_t *pte, unsigned long addr, void *data)
+static int unmap_pte_fn(pte_t *pte, unsigned count, unsigned long addr, void *data)
 {
+	while (count--) {
+		pte_clear(&init_mm, addr, pte);
+		addr += PAGE_SIZE;
+		pte++;
+	}

-	set_pte_at(&init_mm, addr, pte, __pte(0));
 	return 0;
 }

@@ -75,15 +83,15 @@ int arch_gnttab_map_shared(unsigned long *frames, unsigned long nr_gframes,
 		*__shared = shared;
 	}

-	rc = apply_to_page_range(&init_mm, (unsigned long)shared,
-				 PAGE_SIZE * nr_gframes,
-				 map_pte_fn, &frames);
+	rc = apply_to_page_range_batch(&init_mm, (unsigned long)shared,
+				       PAGE_SIZE * nr_gframes,
+				       map_pte_fn, &frames);
 	return rc;
 }

 void arch_gnttab_unmap_shared(struct grant_entry *shared,
 			      unsigned long nr_gframes)
 {
-	apply_to_page_range(&init_mm, (unsigned long)shared,
-			    PAGE_SIZE * nr_gframes, unmap_pte_fn, NULL);
+	apply_to_page_range_batch(&init_mm, (unsigned long)shared,
+				  PAGE_SIZE * nr_gframes, unmap_pte_fn, NULL);
 }
--
1.7.3.4
Andrew Morton
2011-Jan-28 00:18 UTC
[Xen-devel] Re: [PATCH 0/9] Add apply_to_page_range_batch() and use it
On Mon, 24 Jan 2011 14:55:58 -0800
Jeremy Fitzhardinge <jeremy@goop.org> wrote:

> From: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
>
> I'm proposing this series for 2.6.39.
>
> We've had apply_to_page_range() for a while, which is a general way to
> apply a function to ptes across a range of addresses - including
> allocating any missing parts of the pagetable as needed.  This logic
> is replicated in a number of places throughout the kernel, but it
> hasn't been widely replaced by this function, partly because of
> concerns about the overhead of calling the function once per pte.
>
> This series adds apply_to_page_range_batch() (and reimplements
> apply_to_page_range() in terms of it), which calls the pte operation
> function once per pte page, moving the inner loop into the callback
> function.
>
> apply_to_page_range(_batch) also calls its callback with lazy mmu
> updates enabled, which allows batching of the operations in
> environments where this is beneficial (i.e., virtualization).  The only
> caveat this introduces is that callbacks can't expect to immediately
> see the effects of the pte updates in memory.
>
> Since this is effectively identical to the code in lib/ioremap.c and
> mm/vmalloc.c (twice!), I replace their open-coded variants.  I'm sure
> there are other places in the kernel which could do with this (I only
> stumbled over ioremap by accident).
>
> I also add a minor optimisation to vunmap_page_range() to use a
> plain pte_clear() rather than the more expensive and unnecessary
> ptep_get_and_clear().
>
> Jeremy Fitzhardinge (9):
>   mm: remove unused "token" argument from apply_to_page_range callback.
>   mm: add apply_to_page_range_batch()
>   ioremap: use apply_to_page_range_batch() for ioremap_page_range()
>   vmalloc: use plain pte_clear() for unmaps
>   vmalloc: use apply_to_page_range_batch() for vunmap_page_range()
>   vmalloc: use apply_to_page_range_batch() for
>     vmap_page_range_noflush()
>   vmalloc: use apply_to_page_range_batch() in alloc_vm_area()
>   xen/mmu: use apply_to_page_range_batch() in
>     xen_remap_domain_mfn_range()
>   xen/grant-table: use apply_to_page_range_batch()
>
>  arch/x86/xen/grant-table.c |   30 +++++----
>  arch/x86/xen/mmu.c         |   18 +++--
>  include/linux/mm.h         |    9 ++-
>  lib/ioremap.c              |   85 +++++++------------------
>  mm/memory.c                |   57 ++++++++++++-----
>  mm/vmalloc.c               |  150 ++++++++++++--------------------------------
>  6 files changed, 140 insertions(+), 209 deletions(-)

That all looks good to me.