Jiang Liu
2013-Mar-05 14:54 UTC
[RFC PATCH v1 00/33] accurately calculate pages managed by buddy system
The original goal of this patchset was to fix the bug reported at https://bugzilla.kernel.org/show_bug.cgi?id=53501. It has since been expanded to also reduce code duplicated by memory initialization. In total it removes about 550 lines of code.

Patch 1:      Extract common helper functions from free_init_mem() and
              free_initrd_mem() on different architectures.
Patch 2-27:   Use the helper functions to simplify free_init_mem() and
              free_initrd_mem() on different architectures. This removes
              about 500 lines of code.
Patch 28:     Introduce a common helper function to free highmem pages
              when initializing the memory subsystem.
Patch 29-32:  Adjust totalhigh_pages, totalram_pages and
              zone->managed_pages together when reserving/unreserving
              pages.
Patch 33:     Change /sys/.../node/nodex/meminfo to report available
              pages within the node as "MemTotal".

We have only tested this patchset on x86 platforms, and have done basic compilation tests using cross-compilers from ftp.kernel.org. That means some code may not compile on some architectures. Any help testing this patchset is welcome!
Jiang Liu (33):
  mm: introduce common help functions to deal with reserved/managed pages
  mm/alpha: use common help functions to free reserved pages
  mm/ARM: use common help functions to free reserved pages
  mm/avr32: use common help functions to free reserved pages
  mm/blackfin: use common help functions to free reserved pages
  mm/c6x: use common help functions to free reserved pages
  mm/cris: use common help functions to free reserved pages
  mm/FRV: use common help functions to free reserved pages
  mm/h8300: use common help functions to free reserved pages
  mm/IA64: use common help functions to free reserved pages
  mm/m32r: use common help functions to free reserved pages
  mm/m68k: use common help functions to free reserved pages
  mm/microblaze: use common help functions to free reserved pages
  mm/MIPS: use common help functions to free reserved pages
  mm/mn10300: use common help functions to free reserved pages
  mm/openrisc: use common help functions to free reserved pages
  mm/parisc: use common help functions to free reserved pages
  mm/ppc: use common help functions to free reserved pages
  mm/s390: use common help functions to free reserved pages
  mm/score: use common help functions to free reserved pages
  mm/SH: use common help functions to free reserved pages
  mm/SPARC: use common help functions to free reserved pages
  mm/um: use common help functions to free reserved pages
  mm/unicore32: use common help functions to free reserved pages
  mm/x86: use common help functions to free reserved pages
  mm/xtensa: use common help functions to free reserved pages
  mm,kexec: use common help functions to free reserved pages
  mm: introduce free_highmem_page() helper to free highmem pages into buddy system
  mm: accurately calculate zone->managed_pages for highmem zones
  mm: use a dedicated lock to protect totalram_pages and zone->managed_pages
  mm: avoid using __free_pages_bootmem() at runtime
  mm: correctly update zone->managed_pages
  mm: report available pages as "MemTotal" for each NUMA node
 arch/alpha/kernel/sys_nautilus.c             |    5 +-
 arch/alpha/mm/init.c                         |   24 ++-------
 arch/alpha/mm/numa.c                         |    3 +-
 arch/arm/mm/init.c                           |   46 ++++++-----------
 arch/arm64/mm/init.c                         |   26 +---------
 arch/avr32/mm/init.c                         |   24 +--------
 arch/blackfin/mm/init.c                      |   20 +-------
 arch/c6x/mm/init.c                           |   30 +----------
 arch/cris/mm/init.c                          |   16 +-----
 arch/frv/mm/init.c                           |   32 ++----------
 arch/h8300/mm/init.c                         |   28 +----------
 arch/ia64/mm/init.c                          |   23 ++-------
 arch/m32r/mm/init.c                          |   26 ++--------
 arch/m68k/mm/init.c                          |   24 +--------
 arch/microblaze/include/asm/setup.h          |    1 -
 arch/microblaze/mm/init.c                    |   33 ++--------
 arch/mips/mm/init.c                          |   36 ++++--------
 arch/mips/sgi-ip27/ip27-memory.c             |    4 +-
 arch/mn10300/mm/init.c                       |   23 +--------
 arch/openrisc/mm/init.c                      |   27 ++--------
 arch/parisc/mm/init.c                        |   24 ++-------
 arch/powerpc/kernel/crash_dump.c             |    5 +-
 arch/powerpc/kernel/fadump.c                 |    5 +-
 arch/powerpc/kernel/kvm.c                    |    7 +--
 arch/powerpc/mm/mem.c                        |   34 ++-----------
 arch/powerpc/platforms/512x/mpc512x_shared.c |    5 +-
 arch/s390/mm/init.c                          |   35 +++---------
 arch/score/mm/init.c                         |   33 ++--------
 arch/sh/mm/init.c                            |   26 ++--------
 arch/sparc/kernel/leon_smp.c                 |   15 ++----
 arch/sparc/mm/init_32.c                      |   50 +++---------------
 arch/sparc/mm/init_64.c                      |   25 ++--------
 arch/tile/mm/init.c                          |    4 +-
 arch/um/kernel/mem.c                         |   25 ++--------
 arch/unicore32/mm/init.c                     |   26 +---------
 arch/x86/mm/init.c                           |    5 +-
 arch/x86/mm/init_32.c                        |   10 +---
 arch/x86/mm/init_64.c                        |   18 +------
 arch/xtensa/mm/init.c                        |   21 ++------
 drivers/virtio/virtio_balloon.c              |    8 +--
 drivers/xen/balloon.c                        |   19 ++-----
 include/linux/mm.h                           |   36 ++++++++++++++
 include/linux/mmzone.h                       |   14 ++++--
 kernel/kexec.c                               |    8 +--
 mm/bootmem.c                                 |   16 ++----
 mm/hugetlb.c                                 |    2 +-
 mm/memory_hotplug.c                          |   31 ++--------
 mm/nobootmem.c                               |   14 ++----
 mm/page_alloc.c                              |   69 ++++++++++++++++++++++----
 49 files changed, 248 insertions(+), 793 deletions(-)

-- 
1.7.9.5
Jiang Liu
2013-Mar-05 14:54 UTC
[RFC PATCH v1 01/33] mm: introduce common help functions to deal with reserved/managed pages
Code to deal with reserved/managed pages is duplicated by many architectures, so introduce common helper functions to reduce the duplication. These helpers will also be used to concentrate the code that modifies totalram_pages and zone->managed_pages, which makes the code much clearer.

Signed-off-by: Jiang Liu <jiang.liu at huawei.com>
---
 include/linux/mm.h |   37 +++++++++++++++++++++++++++++++++++++
 mm/page_alloc.c    |   20 ++++++++++++++++++++
 2 files changed, 57 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 7acc9dc..881461c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1295,6 +1295,43 @@ extern void free_area_init_node(int nid, unsigned long * zones_size,
 		unsigned long zone_start_pfn, unsigned long *zholes_size);
 extern void free_initmem(void);
 
+/* Help functions to deal with reserved/managed pages. */
+extern unsigned long free_reserved_area(unsigned long start, unsigned long end,
+					int poison, char *s);
+
+static inline void adjust_managed_page_count(struct page *page, long count)
+{
+	totalram_pages += count;
+}
+
+static inline void __free_reserved_page(struct page *page)
+{
+	ClearPageReserved(page);
+	init_page_count(page);
+	__free_page(page);
+}
+
+static inline void free_reserved_page(struct page *page)
+{
+	__free_reserved_page(page);
+	adjust_managed_page_count(page, 1);
+}
+
+static inline void mark_page_reserved(struct page *page)
+{
+	SetPageReserved(page);
+	adjust_managed_page_count(page, -1);
+}
+
+static inline void free_initmem_default(int poison)
+{
+	extern char __init_begin[], __init_end[];
+
+	free_reserved_area(PAGE_ALIGN((unsigned long)&__init_begin),
+			   ((unsigned long)&__init_end) & PAGE_MASK,
+			   poison, "unused kernel");
+}
+
 #ifdef CONFIG_HAVE_MEMBLOCK_NODE_MAP
 /*
  * With CONFIG_HAVE_MEMBLOCK_NODE_MAP set, an architecture may initialise its
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 8fcced7..0fadb09 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5113,6 +5113,26 @@
 early_param("movablecore", cmdline_parse_movablecore);
 #endif /* CONFIG_HAVE_MEMBLOCK_NODE_MAP */
 
+unsigned long free_reserved_area(unsigned long start, unsigned long end,
+				 int poison, char *s)
+{
+	unsigned long pages, pos;
+
+	pos = start = PAGE_ALIGN(start);
+	end &= PAGE_MASK;
+	for (pages = 0; pos < end; pos += PAGE_SIZE, pages++) {
+		if (poison)
+			memset((void *)pos, poison, PAGE_SIZE);
+		free_reserved_page(virt_to_page(pos));
+	}
+
+	if (pages && s)
+		pr_info("Freeing %s memory: %ldK (%lx - %lx)\n",
+			s, pages << (PAGE_SHIFT - 10), start, end);
+
+	return pages;
+}
+
 /**
  * set_dma_reserve - set the specified number of pages reserved in the first zone
  * @new_dma_reserve: The number of pages to mark reserved
-- 
1.7.9.5
Jiang Liu
2013-Mar-05 14:55 UTC
[RFC PATCH v1 32/33] mm: correctly update zone->managed_pages
Enhance adjust_managed_page_count() to also adjust totalhigh_pages for highmem pages, and change code that directly adjusts totalram_pages to use adjust_managed_page_count() instead, because it adjusts totalram_pages, totalhigh_pages and zone->managed_pages together in a safe way.

Remove inc_totalhigh_pages() and dec_totalhigh_pages() from the xen/balloon driver because adjust_managed_page_count() now adjusts totalhigh_pages itself. This patch also enhances the virtio_balloon driver to adjust totalhigh_pages when reserving/unreserving pages.

Signed-off-by: Jiang Liu <jiang.liu at huawei.com>
Cc: Chris Metcalf <cmetcalf at tilera.com>
Cc: Rusty Russell <rusty at rustcorp.com.au>
Cc: "Michael S. Tsirkin" <mst at redhat.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk at oracle.com>
Cc: Jeremy Fitzhardinge <jeremy at goop.org>
Cc: Wen Congyang <wency at cn.fujitsu.com>
Cc: Andrew Morton <akpm at linux-foundation.org>
Cc: Tang Chen <tangchen at cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki at jp.fujitsu.com>
Cc: Mel Gorman <mgorman at suse.de>
Cc: Minchan Kim <minchan at kernel.org>
Cc: linux-kernel at vger.kernel.org
Cc: virtualization at lists.linux-foundation.org
Cc: xen-devel at lists.xensource.com
Cc: linux-mm at kvack.org
---
 arch/tile/mm/init.c             |    4 ++--
 drivers/virtio/virtio_balloon.c |    8 +++++---
 drivers/xen/balloon.c           |   19 ++++---------------
 mm/hugetlb.c                    |    2 +-
 mm/memory_hotplug.c             |   15 +++------------
 mm/page_alloc.c                 |    6 ++++++
 6 files changed, 21 insertions(+), 33 deletions(-)

diff --git a/arch/tile/mm/init.c b/arch/tile/mm/init.c
index 2749515..5886aef 100644
--- a/arch/tile/mm/init.c
+++ b/arch/tile/mm/init.c
@@ -720,7 +720,7 @@ static void __init init_free_pfn_range(unsigned long start, unsigned long end)
 		}
 		init_page_count(page);
 		__free_pages(page, order);
-		totalram_pages += count;
+		adjust_managed_page_count(page, count);
 
 		page += count;
 		pfn += count;
@@ -1033,7 +1033,7 @@ static void free_init_pages(char *what, unsigned long begin, unsigned long end)
 			   pfn_pte(pfn, PAGE_KERNEL));
 		memset((void *)addr, POISON_FREE_INITMEM, PAGE_SIZE);
 		free_page(addr);
-		totalram_pages++;
+		adjust_managed_page_count(page, 1);
 	}
 	pr_info("Freeing %s: %ldk freed\n", what, (end - begin) >> 10);
 }
diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 8dab163..4c6ec53 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -148,7 +148,7 @@ static void fill_balloon(struct virtio_balloon *vb, size_t num)
 		}
 		set_page_pfns(vb->pfns + vb->num_pfns, page);
 		vb->num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE;
-		totalram_pages--;
+		adjust_managed_page_count(page, -1);
 	}
 
 	/* Did we get any? */
@@ -160,11 +160,13 @@ static void fill_balloon(struct virtio_balloon *vb, size_t num)
 static void release_pages_by_pfn(const u32 pfns[], unsigned int num)
 {
 	unsigned int i;
+	struct page *page;
 
 	/* Find pfns pointing at start of each page, get pages and free them. */
 	for (i = 0; i < num; i += VIRTIO_BALLOON_PAGES_PER_PAGE) {
-		balloon_page_free(balloon_pfn_to_page(pfns[i]));
-		totalram_pages++;
+		page = balloon_pfn_to_page(pfns[i]);
+		balloon_page_free(page);
+		adjust_managed_page_count(page, 1);
 	}
 }
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index a56776d..a5fdbcc 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -89,14 +89,6 @@ EXPORT_SYMBOL_GPL(balloon_stats);
 /* We increase/decrease in batches which fit in a page */
 static xen_pfn_t frame_list[PAGE_SIZE / sizeof(unsigned long)];
 
-#ifdef CONFIG_HIGHMEM
-#define inc_totalhigh_pages() (totalhigh_pages++)
-#define dec_totalhigh_pages() (totalhigh_pages--)
-#else
-#define inc_totalhigh_pages() do {} while (0)
-#define dec_totalhigh_pages() do {} while (0)
-#endif
-
 /* List of ballooned pages, threaded through the mem_map array. */
 static LIST_HEAD(ballooned_pages);
 
@@ -132,9 +124,7 @@ static void __balloon_append(struct page *page)
 static void balloon_append(struct page *page)
 {
 	__balloon_append(page);
-	if (PageHighMem(page))
-		dec_totalhigh_pages();
-	totalram_pages--;
+	adjust_managed_page_count(page, -1);
 }
 
 /* balloon_retrieve: rescue a page from the balloon, if it is not empty. */
@@ -151,13 +141,12 @@ static struct page *balloon_retrieve(bool prefer_highmem)
 	page = list_entry(ballooned_pages.next, struct page, lru);
 	list_del(&page->lru);
 
-	if (PageHighMem(page)) {
+	if (PageHighMem(page))
 		balloon_stats.balloon_high--;
-		inc_totalhigh_pages();
-	} else
+	else
 		balloon_stats.balloon_low--;
 
-	totalram_pages++;
+	adjust_managed_page_count(page, 1);
 
 	return page;
 }
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 0a0be33..a381818 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1246,7 +1246,7 @@ static void __init gather_bootmem_prealloc(void)
 		 * side-effects, like CommitLimit going negative.
 		 */
 		if (h->order > (MAX_ORDER - 1))
-			totalram_pages += 1 << h->order;
+			adjust_managed_page_count(page, 1 << h->order);
 	}
 }
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index af9e87f..f9ce564 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -760,20 +760,13 @@ EXPORT_SYMBOL_GPL(__online_page_set_limits);
 
 void __online_page_increment_counters(struct page *page)
 {
-	totalram_pages++;
-
-#ifdef CONFIG_HIGHMEM
-	if (PageHighMem(page))
-		totalhigh_pages++;
-#endif
+	adjust_managed_page_count(page, 1);
 }
 EXPORT_SYMBOL_GPL(__online_page_increment_counters);
 
 void __online_page_free(struct page *page)
 {
-	ClearPageReserved(page);
-	init_page_count(page);
-	__free_page(page);
+	__free_reserved_page(page);
 }
 EXPORT_SYMBOL_GPL(__online_page_free);
 
@@ -970,7 +963,6 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
 		return ret;
 	}
 
-	zone->managed_pages += onlined_pages;
 	zone->present_pages += onlined_pages;
 	zone->zone_pgdat->node_present_pages += onlined_pages;
 	if (onlined_pages) {
@@ -1554,10 +1546,9 @@ repeat:
 	/* reset pagetype flags and makes migrate type to be MOVABLE */
 	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
 	/* removal success */
-	zone->managed_pages -= offlined_pages;
+	adjust_managed_page_count(pfn_to_page(start_pfn), -offlined_pages);
 	zone->present_pages -= offlined_pages;
 	zone->zone_pgdat->node_present_pages -= offlined_pages;
-	totalram_pages -= offlined_pages;
 
 	init_per_zone_wmark_min();
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d252443..041eb92 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -780,7 +780,9 @@ void __init init_cma_reserved_pageblock(struct page *page)
 #ifdef CONFIG_HIGHMEM
 	if (PageHighMem(page))
 		totalhigh_pages += pageblock_nr_pages;
+	else
 #endif
+		page_zone(page)->managed_pages += pageblock_nr_pages;
 }
 #endif
 
@@ -5119,6 +5121,10 @@ void adjust_managed_page_count(struct page *page, long count)
 	page_zone(page)->managed_pages += count;
 	totalram_pages += count;
+#ifdef CONFIG_HIGHMEM
+	if (PageHighMem(page))
+		totalhigh_pages += count;
+#endif
 
 	if (lock)
 		spin_unlock(&managed_page_count_lock);
-- 
1.7.9.5
Sam Ravnborg
2013-Mar-05 19:47 UTC
[RFC PATCH v1 01/33] mm: introduce common help functions to deal with reserved/managed pages
On Tue, Mar 05, 2013 at 10:54:44PM +0800, Jiang Liu wrote:
> Code to deal with reserved/managed pages are duplicated by many
> architectures, so introduce common help functions to reduce duplicated
> code. These common help functions will also be used to concentrate code
> to modify totalram_pages and zone->managed_pages, which makes the code
> much more clear.
> 
> Signed-off-by: Jiang Liu <jiang.liu at huawei.com>
> ---
>  include/linux/mm.h |   37 +++++++++++++++++++++++++++++++++++++
>  mm/page_alloc.c    |   20 ++++++++++++++++++++
>  2 files changed, 57 insertions(+)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 7acc9dc..881461c 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1295,6 +1295,43 @@ extern void free_area_init_node(int nid, unsigned long * zones_size,
>  		unsigned long zone_start_pfn, unsigned long *zholes_size);
>  extern void free_initmem(void);
> 
> +/* Help functions to deal with reserved/managed pages. */
> +extern unsigned long free_reserved_area(unsigned long start, unsigned long end,
> +					int poison, char *s);
> +
> +static inline void adjust_managed_page_count(struct page *page, long count)
> +{
> +	totalram_pages += count;
> +}

What is the purpose of the unused page argument?

> +
> +static inline void __free_reserved_page(struct page *page)
> +{
> +	ClearPageReserved(page);
> +	init_page_count(page);
> +	__free_page(page);
> +}

This method is useful for architectures which implement HIGHMEM, like
32-bit x86 and 32-bit sparc. That calls for a name without underscores.

> +
> +static inline void free_reserved_page(struct page *page)
> +{
> +	__free_reserved_page(page);
> +	adjust_managed_page_count(page, 1);
> +}
> +
> +static inline void mark_page_reserved(struct page *page)
> +{
> +	SetPageReserved(page);
> +	adjust_managed_page_count(page, -1);
> +}
> +
> +static inline void free_initmem_default(int poison)
> +{

Why require the user to supply the poison argument? If this is the
default implementation, then use the default poison value too
(POISON_FREE_INITMEM).

> +	extern char __init_begin[], __init_end[];
> +
> +	free_reserved_area(PAGE_ALIGN((unsigned long)&__init_begin),
> +			   ((unsigned long)&__init_end) & PAGE_MASK,
> +			   poison, "unused kernel");
> +}

Maybe it is just me, who is not used to this area of the kernel, but a
few comments describing the purpose of each function would have helped
me.

	Sam