Rafael Aquini
2012-Nov-07 03:05 UTC
[PATCH v11 0/7] make balloon pages movable by compaction
Memory fragmentation introduced by ballooning might significantly reduce the
number of 2MB contiguous memory blocks that can be used within a guest, thus
imposing performance penalties associated with the reduced number of
transparent huge pages that could be used by the guest workload.

This patch set follows the main idea discussed at the 2012 LSF/MM Summit
session "Ballooning for transparent huge pages" -- http://lwn.net/Articles/490114/
-- to introduce the required changes to the virtio_balloon driver, as well as
to the core compaction & migration bits, in order to make those subsystems
aware of ballooned pages and allow memory balloon pages to become movable
within a guest, thus avoiding the aforementioned fragmentation issue.

The numbers below show how this patch set helps compaction be more effective
on memory-ballooned guests. Results for the STRESS-HIGHALLOC benchmark, from
Mel Gorman's mmtests suite, running on a 4 GB RAM KVM guest that was
ballooning 512 MB of RAM in 64 MB chunks every minute (inflating/deflating)
while the test was running:

===BEGIN stress-highalloc

STRESS-HIGHALLOC
                 highalloc-3.7     highalloc-3.7
                     rc4-clean         rc4-patch
Pass 1          55.00 ( 0.00%)    62.00 ( 7.00%)
Pass 2          54.00 ( 0.00%)    62.00 ( 8.00%)
while Rested    75.00 ( 0.00%)    80.00 ( 5.00%)

MMTests Statistics: duration
                 3.7          3.7
           rc4-clean    rc4-patch
User         1207.59      1207.46
System       1300.55      1299.61
Elapsed      2273.72      2157.06

MMTests Statistics: vmstat
                                3.7          3.7
                          rc4-clean    rc4-patch
Page Ins                    3581516      2374368
Page Outs                  11148692     10410332
Swap Ins                         80           47
Swap Outs                      3641          476
Direct pages scanned          37978        33826
Kswapd pages scanned        1828245      1342869
Kswapd pages reclaimed      1710236      1304099
Direct pages reclaimed        32207        31005
Kswapd efficiency               93%          97%
Kswapd velocity             804.077      622.546
Direct efficiency               84%          91%
Direct velocity              16.703       15.682
Percentage direct scans          2%           2%
Page writes by reclaim        79252         9704
Page writes file              75611         9228
Page writes anon               3641          476
Page reclaim immediate        16764        11014
Page rescued immediate            0            0
Slabs scanned               2171904      2152448
Direct inode steals             385         2261
Kswapd inode steals          659137       609670
Kswapd skipped wait               1           69
THP fault alloc                 546          631
THP collapse alloc              361          339
THP splits                      259          263
THP fault fallback               98           50
THP collapse fail                20           17
Compaction stalls               747          499
Compaction success              244          145
Compaction failures             503          354
Compaction pages moved       370888       474837
Compaction move failure       77378        65259

===END stress-highalloc

Rafael Aquini (7):
  mm: adjust address_space_operations.migratepage() return code
  mm: redefine address_space.assoc_mapping
  mm: introduce a common interface for balloon pages mobility
  mm: introduce compaction and migration for ballooned pages
  virtio_balloon: introduce migration primitives to balloon pages
  mm: introduce putback_movable_pages()
  mm: add vm event counters for balloon pages compaction

 drivers/virtio/virtio_balloon.c    | 136 +++++++++++++++++--
 fs/buffer.c                        |  12 +-
 fs/gfs2/glock.c                    |   2 +-
 fs/hugetlbfs/inode.c               |   4 +-
 fs/inode.c                         |   2 +-
 fs/nilfs2/page.c                   |   2 +-
 include/linux/balloon_compaction.h | 220 ++++++++++++++++++++++++++++++
 include/linux/fs.h                 |   2 +-
 include/linux/migrate.h            |  19 +++
 include/linux/pagemap.h            |  16 +++
 include/linux/vm_event_item.h      |   8 +-
 mm/Kconfig                         |  15 ++
 mm/Makefile                        |   1 +
 mm/balloon_compaction.c            | 271 +++++++++++++++++++++++++++++++++++++
 mm/compaction.c                    |  27 +++-
 mm/migrate.c                       |  77 +++++++++--
 mm/page_alloc.c                    |   2 +-
 mm/vmstat.c                        |  10 +-
 18 files changed, 782 insertions(+), 44 deletions(-)
 create mode 100644 include/linux/balloon_compaction.h
 create mode 100644 mm/balloon_compaction.c

Change log:

v11:
 * Address AKPM's last review suggestions;
 * Extend the balloon compaction common API and simplify its usage in the
   driver;
 * Minor nitpicks on code commentary;
v10:
 * Adjust leak_balloon() wait_event logic to make a clear locking scheme (MST);
 * Drop the RCU protection approach for dereferencing balloon's page->mapping;
 * Minor nitpicks on code commentary (MST);
v9:
 * Adjust rcu_dereference usage to leverage page lock protection (Paul, Peter);
 * Enhance doc on compaction interface introduced to balloon driver (Michael);
 * Fix issue with isolated pages breaking leak_balloon() logic (Michael);
v8:
 * Introduce a common MM interface for balloon driver page compaction (Michael);
 * Remove the global state preventing multiple balloon device support (Michael);
 * Introduce RCU protection/synchronization to balloon page->mapping (Michael);
v7:
 * Fix a potential page leak case in 'putback_balloon_page' (Mel);
 * Adjust vm-events-counter patch and remove its drop-on-merge message (Rik);
 * Add 'putback_movable_pages' to avoid hacks on 'putback_lru_pages' (Minchan);
v6:
 * Rename 'is_balloon_page()' to 'movable_balloon_page()' (Rik);
v5:
 * Address Andrew Morton's review comments on the patch series;
 * Address a couple extra nitpick suggestions on PATCH 01 (Minchan);
v4:
 * Address Rusty Russell's review comments on PATCH 02;
 * Rebase the virtio_balloon patch on 9c378abc5c0c6fc8e3acf5968924d274503819b3;
v3:
 * Address reviewers' nitpick suggestions on PATCH 01 (Mel, Minchan);
v2:
 * Address Mel Gorman's review comments on PATCH 01;

--
1.7.11.7
Rafael Aquini
2012-Nov-07 03:05 UTC
[PATCH v11 1/7] mm: adjust address_space_operations.migratepage() return code
This patch introduces MIGRATEPAGE_SUCCESS as the default return code for the
address_space_operations.migratepage() method, and documents the expected
return codes for that method in failure cases.

Signed-off-by: Rafael Aquini <aquini at redhat.com>
---
 fs/hugetlbfs/inode.c    |  4 ++--
 include/linux/migrate.h |  7 +++++++
 mm/migrate.c            | 22 +++++++++++-----------
 3 files changed, 20 insertions(+), 13 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index c5bc355..bdeda2c 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -608,11 +608,11 @@ static int hugetlbfs_migrate_page(struct address_space *mapping,
 	int rc;
 
 	rc = migrate_huge_page_move_mapping(mapping, newpage, page);
-	if (rc)
+	if (rc != MIGRATEPAGE_SUCCESS)
 		return rc;
 	migrate_page_copy(newpage, page);
 
-	return 0;
+	return MIGRATEPAGE_SUCCESS;
 }
 
 static int hugetlbfs_statfs(struct dentry *dentry, struct kstatfs *buf)
diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index ce7e667..a4e886d 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -7,6 +7,13 @@
 
 typedef struct page *new_page_t(struct page *, unsigned long private, int **);
 
+/*
+ * Return values from address_space_operations.migratepage():
+ * - negative errno on page migration failure;
+ * - zero on page migration success;
+ */
+#define MIGRATEPAGE_SUCCESS	0
+
 #ifdef CONFIG_MIGRATION
 
 extern void putback_lru_pages(struct list_head *l);
diff --git a/mm/migrate.c b/mm/migrate.c
index 77ed2d7..98c7a89 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -286,7 +286,7 @@ static int migrate_page_move_mapping(struct address_space *mapping,
 		/* Anonymous page without mapping */
 		if (page_count(page) != 1)
 			return -EAGAIN;
-		return 0;
+		return MIGRATEPAGE_SUCCESS;
 	}
 
 	spin_lock_irq(&mapping->tree_lock);
@@ -356,7 +356,7 @@ static int migrate_page_move_mapping(struct address_space *mapping,
 	}
 
 	spin_unlock_irq(&mapping->tree_lock);
 
-	return 0;
+	return MIGRATEPAGE_SUCCESS;
 }
 
 /*
@@ -372,7 +372,7 @@ int migrate_huge_page_move_mapping(struct address_space *mapping,
 	if (!mapping) {
 		if (page_count(page) != 1)
 			return -EAGAIN;
-		return 0;
+		return MIGRATEPAGE_SUCCESS;
 	}
 
 	spin_lock_irq(&mapping->tree_lock);
@@ -399,7 +399,7 @@ int migrate_huge_page_move_mapping(struct address_space *mapping,
 	page_unfreeze_refs(page, expected_count - 1);
 	spin_unlock_irq(&mapping->tree_lock);
 
-	return 0;
+	return MIGRATEPAGE_SUCCESS;
 }
 
 /*
@@ -486,11 +486,11 @@ int migrate_page(struct address_space *mapping,
 
 	rc = migrate_page_move_mapping(mapping, newpage, page, NULL, mode);
 
-	if (rc)
+	if (rc != MIGRATEPAGE_SUCCESS)
 		return rc;
 
 	migrate_page_copy(newpage, page);
-	return 0;
+	return MIGRATEPAGE_SUCCESS;
 }
 EXPORT_SYMBOL(migrate_page);
 
@@ -513,7 +513,7 @@ int buffer_migrate_page(struct address_space *mapping,
 
 	rc = migrate_page_move_mapping(mapping, newpage, page, head, mode);
 
-	if (rc)
+	if (rc != MIGRATEPAGE_SUCCESS)
 		return rc;
 
 	/*
@@ -549,7 +549,7 @@ int buffer_migrate_page(struct address_space *mapping,
 
 	} while (bh != head);
 
-	return 0;
+	return MIGRATEPAGE_SUCCESS;
 }
 EXPORT_SYMBOL(buffer_migrate_page);
 #endif
@@ -814,7 +814,7 @@ skip_unmap:
 		put_anon_vma(anon_vma);
 
 uncharge:
-	mem_cgroup_end_migration(mem, page, newpage, rc == 0);
+	mem_cgroup_end_migration(mem, page, newpage, rc == MIGRATEPAGE_SUCCESS);
 unlock:
 	unlock_page(page);
 out:
@@ -987,7 +987,7 @@ int migrate_pages(struct list_head *from,
 			case -EAGAIN:
 				retry++;
 				break;
-			case 0:
+			case MIGRATEPAGE_SUCCESS:
 				break;
 			default:
 				/* Permanent failure */
@@ -1024,7 +1024,7 @@ int migrate_huge_page(struct page *hpage, new_page_t get_new_page,
 			/* try again */
 			cond_resched();
 			break;
-		case 0:
+		case MIGRATEPAGE_SUCCESS:
 			goto out;
 		default:
 			rc = -EIO;

--
1.7.11.7
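[As a quick illustration of the convention this patch establishes, here is a
minimal sketch of a ->migratepage() implementation for a hypothetical "myfs"
filesystem. "myfs" is not part of this series; migrate_page() and the
return-code convention are the real interfaces shown above.]

#include <linux/fs.h>
#include <linux/migrate.h>

static int myfs_migrate_page(struct address_space *mapping,
			     struct page *newpage, struct page *page,
			     enum migrate_mode mode)
{
	/* generic path: move the mapping entry and copy page contents */
	int rc = migrate_page(mapping, newpage, page, mode);

	if (rc != MIGRATEPAGE_SUCCESS)	/* negative errno on failure */
		return rc;

	/* any filesystem-private fixups would go here */
	return MIGRATEPAGE_SUCCESS;	/* still 0, but self-documenting */
}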
Rafael Aquini
2012-Nov-07 03:05 UTC
[PATCH v11 2/7] mm: redefine address_space.assoc_mapping
This patch overhauls struct address_space.assoc_mapping, renaming it to
address_space.private_data and redefining its type as void*. This approach
consistently names the .private_* elements of struct address_space, and allows
extended use of address_space associations with other data structures through
->private_data. All users of the old ->assoc_mapping element are converted to
reflect its new name and type.

Signed-off-by: Rafael Aquini <aquini at redhat.com>
---
 fs/buffer.c        | 12 ++++++------
 fs/gfs2/glock.c    |  2 +-
 fs/inode.c         |  2 +-
 fs/nilfs2/page.c   |  2 +-
 include/linux/fs.h |  2 +-
 5 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index b5f0442..e0bad95 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -555,7 +555,7 @@ void emergency_thaw_all(void)
  */
 int sync_mapping_buffers(struct address_space *mapping)
 {
-	struct address_space *buffer_mapping = mapping->assoc_mapping;
+	struct address_space *buffer_mapping = mapping->private_data;
 
 	if (buffer_mapping == NULL || list_empty(&mapping->private_list))
 		return 0;
@@ -588,10 +588,10 @@ void mark_buffer_dirty_inode(struct buffer_head *bh, struct inode *inode)
 	struct address_space *buffer_mapping = bh->b_page->mapping;
 
 	mark_buffer_dirty(bh);
-	if (!mapping->assoc_mapping) {
-		mapping->assoc_mapping = buffer_mapping;
+	if (!mapping->private_data) {
+		mapping->private_data = buffer_mapping;
 	} else {
-		BUG_ON(mapping->assoc_mapping != buffer_mapping);
+		BUG_ON(mapping->private_data != buffer_mapping);
 	}
 	if (!bh->b_assoc_map) {
 		spin_lock(&buffer_mapping->private_lock);
@@ -788,7 +788,7 @@ void invalidate_inode_buffers(struct inode *inode)
 	if (inode_has_buffers(inode)) {
 		struct address_space *mapping = &inode->i_data;
 		struct list_head *list = &mapping->private_list;
-		struct address_space *buffer_mapping = mapping->assoc_mapping;
+		struct address_space *buffer_mapping = mapping->private_data;
 
 		spin_lock(&buffer_mapping->private_lock);
 		while (!list_empty(list))
@@ -811,7 +811,7 @@ int remove_inode_buffers(struct inode *inode)
 	if (inode_has_buffers(inode)) {
 		struct address_space *mapping = &inode->i_data;
 		struct list_head *list = &mapping->private_list;
-		struct address_space *buffer_mapping = mapping->assoc_mapping;
+		struct address_space *buffer_mapping = mapping->private_data;
 
 		spin_lock(&buffer_mapping->private_lock);
 		while (!list_empty(list)) {
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index e6c2fd5..0f22d09 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -768,7 +768,7 @@ int gfs2_glock_get(struct gfs2_sbd *sdp, u64 number,
 		mapping->host = s->s_bdev->bd_inode;
 		mapping->flags = 0;
 		mapping_set_gfp_mask(mapping, GFP_NOFS);
-		mapping->assoc_mapping = NULL;
+		mapping->private_data = NULL;
 		mapping->backing_dev_info = s->s_bdi;
 		mapping->writeback_index = 0;
 	}
diff --git a/fs/inode.c b/fs/inode.c
index b03c719..4cac8e1 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -165,7 +165,7 @@ int inode_init_always(struct super_block *sb, struct inode *inode)
 	mapping->host = inode;
 	mapping->flags = 0;
 	mapping_set_gfp_mask(mapping, GFP_HIGHUSER_MOVABLE);
-	mapping->assoc_mapping = NULL;
+	mapping->private_data = NULL;
 	mapping->backing_dev_info = &default_backing_dev_info;
 	mapping->writeback_index = 0;
 
diff --git a/fs/nilfs2/page.c b/fs/nilfs2/page.c
index 3e7b2a0..07f76db 100644
--- a/fs/nilfs2/page.c
+++ b/fs/nilfs2/page.c
@@ -431,7 +431,7 @@ void nilfs_mapping_init(struct address_space *mapping, struct inode *inode,
 	mapping->host = inode;
 	mapping->flags = 0;
 	mapping_set_gfp_mask(mapping, GFP_NOFS);
-	mapping->assoc_mapping = NULL;
+	mapping->private_data = NULL;
 	mapping->backing_dev_info = bdi;
 	mapping->a_ops = &empty_aops;
 }
diff --git a/include/linux/fs.h b/include/linux/fs.h
index b33cfc9..0982565 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -418,7 +418,7 @@ struct address_space {
 	struct backing_dev_info *backing_dev_info; /* device readahead, etc */
 	spinlock_t		private_lock;	/* for use by the address_space */
 	struct list_head	private_list;	/* ditto */
-	struct address_space	*assoc_mapping;	/* ditto */
+	void			*private_data;	/* ditto */
 } __attribute__((aligned(sizeof(long))));
 	/*
 	 * On most architectures that alignment is already the case; but

--
1.7.11.7
Rafael Aquini
2012-Nov-07 03:05 UTC
[PATCH v11 3/7] mm: introduce a common interface for balloon pages mobility
Memory fragmentation introduced by ballooning might significantly reduce the
number of 2MB contiguous memory blocks that can be used within a guest, thus
imposing performance penalties associated with the reduced number of
transparent huge pages that could be used by the guest workload.

This patch introduces a common interface to help a balloon driver make its
page set movable by compaction, thus allowing the system to better leverage
compaction efforts on memory defragmentation.

Signed-off-by: Rafael Aquini <aquini at redhat.com>
---
 include/linux/balloon_compaction.h | 220 ++++++++++++++++++++++++++++++
 include/linux/migrate.h            |  10 ++
 include/linux/pagemap.h            |  16 +++
 mm/Kconfig                         |  15 +++
 mm/Makefile                        |   1 +
 mm/balloon_compaction.c            | 269 +++++++++++++++++++++++++++++++++++++
 6 files changed, 531 insertions(+)
 create mode 100644 include/linux/balloon_compaction.h
 create mode 100644 mm/balloon_compaction.c

diff --git a/include/linux/balloon_compaction.h b/include/linux/balloon_compaction.h
new file mode 100644
index 0000000..1865bd5
--- /dev/null
+++ b/include/linux/balloon_compaction.h
@@ -0,0 +1,220 @@
+/*
+ * include/linux/balloon_compaction.h
+ *
+ * Common interface definitions for making balloon pages movable by compaction.
+ *
+ * Copyright (C) 2012, Red Hat, Inc.  Rafael Aquini <aquini at redhat.com>
+ */
+#ifndef _LINUX_BALLOON_COMPACTION_H
+#define _LINUX_BALLOON_COMPACTION_H
+#include <linux/pagemap.h>
+#include <linux/migrate.h>
+#include <linux/gfp.h>
+#include <linux/err.h>
+
+/*
+ * Balloon device information descriptor.
+ * This struct is used to allow the common balloon compaction interface
+ * procedures to find the proper balloon device holding memory pages they'll
+ * have to cope with for page compaction / migration, as well as it serves the
+ * balloon driver as a page book-keeper for its registered balloon devices.
+ */
+struct balloon_dev_info {
+	void *balloon_device;		/* balloon device descriptor */
+	struct address_space *mapping;	/* balloon special page->mapping */
+	unsigned long isolated_pages;	/* # of isolated pages for migration */
+	spinlock_t pages_lock;		/* Protection to pages list */
+	struct list_head pages;		/* Pages enqueued & handled to Host */
+};
+
+extern struct page *balloon_page_enqueue(struct balloon_dev_info *b_dev_info);
+extern struct page *balloon_page_dequeue(struct balloon_dev_info *b_dev_info);
+extern struct balloon_dev_info *balloon_devinfo_alloc(
+						void *balloon_dev_descriptor);
+
+static inline void balloon_devinfo_free(struct balloon_dev_info *b_dev_info)
+{
+	kfree(b_dev_info);
+}
+
+#ifdef CONFIG_BALLOON_COMPACTION
+extern bool balloon_page_isolate(struct page *page);
+extern void balloon_page_putback(struct page *page);
+extern int balloon_page_migrate(struct page *newpage,
+				struct page *page, enum migrate_mode mode);
+extern struct address_space
+*balloon_mapping_alloc(struct balloon_dev_info *b_dev_info,
+			const struct address_space_operations *a_ops);
+
+static inline void balloon_mapping_free(struct address_space *balloon_mapping)
+{
+	kfree(balloon_mapping);
+}
+
+/*
+ * balloon_page_insert - insert a page into the balloon's page list and make
+ *			 the page->mapping assignment accordingly.
+ * @page    : page to be assigned as a 'balloon page'
+ * @mapping : allocated special 'balloon_mapping'
+ * @head    : balloon's device page list head
+ */
+static inline void balloon_page_insert(struct page *page,
+				       struct address_space *mapping,
+				       struct list_head *head)
+{
+	list_add(&page->lru, head);
+	/*
+	 * Make sure the page is already inserted on balloon's page list
+	 * before assigning its ->mapping.
+	 */
+	smp_wmb();
+	page->mapping = mapping;
+}
+
+/*
+ * balloon_page_delete - clear the page->mapping and delete the page from
+ *			 balloon's page list accordingly.
+ * @page    : page to be released from balloon's page list
+ */
+static inline void balloon_page_delete(struct page *page)
+{
+	page->mapping = NULL;
+	/*
+	 * Make sure page->mapping is cleared before we proceed with
+	 * balloon's page list deletion.
+	 */
+	smp_wmb();
+	list_del(&page->lru);
+}
+
+/*
+ * __is_movable_balloon_page - helper to perform @page mapping->flags tests
+ */
+static inline bool __is_movable_balloon_page(struct page *page)
+{
+	/*
+	 * We might attempt to read ->mapping concurrently to other
+	 * threads trying to write to it.
+	 */
+	struct address_space *mapping = ACCESS_ONCE(page->mapping);
+	smp_read_barrier_depends();
+	return mapping_balloon(mapping);
+}
+
+/*
+ * balloon_page_movable - test page->mapping->flags to identify balloon pages
+ *			  that can be moved by compaction/migration.
+ *
+ * This function is used at core compaction's page isolation scheme, therefore
+ * most pages exposed to it are not enlisted as balloon pages and so, to avoid
+ * undesired side effects like racing against __free_pages(), we cannot afford
+ * holding the page locked while testing page->mapping->flags here.
+ *
+ * As we might return false positives in the case of a balloon page being just
+ * released under us, the page->mapping->flags need to be re-tested later,
+ * under the proper page lock, at the functions that will be coping with the
+ * balloon page case.
+ */
+static inline bool balloon_page_movable(struct page *page)
+{
+	/*
+	 * Before dereferencing and testing mapping->flags, let's make sure
+	 * this is not a page that uses ->mapping in a different way.
+	 */
+	if (!PageSlab(page) && !PageSwapCache(page) && !PageAnon(page) &&
+	    !page_mapped(page))
+		return __is_movable_balloon_page(page);
+
+	return false;
+}
+
+/*
+ * balloon_page_device - get the b_dev_info descriptor for the balloon device
+ *			 that enqueues the given page.
+ */
+static inline struct balloon_dev_info *balloon_page_device(struct page *page)
+{
+	struct address_space *mapping = page->mapping;
+	if (likely(mapping))
+		return mapping->private_data;
+
+	return NULL;
+}
+
+static inline gfp_t balloon_mapping_gfp_mask(void)
+{
+	return GFP_HIGHUSER_MOVABLE;
+}
+
+static inline void balloon_event_count(enum vm_event_item item)
+{
+	count_vm_event(item);
+}
+
+static inline bool balloon_compaction_check(void)
+{
+	return true;
+}
+
+#else /* !CONFIG_BALLOON_COMPACTION */
+
+static inline void *balloon_mapping_alloc(void *balloon_device,
+				const struct address_space_operations *a_ops)
+{
+	return ERR_PTR(-EOPNOTSUPP);
+}
+
+static inline void balloon_mapping_free(struct address_space *balloon_mapping)
+{
+	return;
+}
+
+static inline void balloon_page_insert(struct page *page,
+				       struct address_space *mapping,
+				       struct list_head *head)
+{
+	list_add(&page->lru, head);
+}
+
+static inline void balloon_page_delete(struct page *page)
+{
+	list_del(&page->lru);
+}
+
+static inline bool balloon_page_movable(struct page *page)
+{
+	return false;
+}
+
+static inline bool balloon_page_isolate(struct page *page)
+{
+	return false;
+}
+
+static inline void balloon_page_putback(struct page *page)
+{
+	return;
+}
+
+static inline int balloon_page_migrate(struct page *newpage,
+				struct page *page, enum migrate_mode mode)
+{
+	return 0;
+}
+
+static inline gfp_t balloon_mapping_gfp_mask(void)
+{
+	return GFP_HIGHUSER;
+}
+
+static inline void balloon_event_count(enum vm_event_item item)
+{
+	return;
+}
+
+static inline bool balloon_compaction_check(void)
+{
+	return false;
+}
+#endif /* CONFIG_BALLOON_COMPACTION */
+#endif /* _LINUX_BALLOON_COMPACTION_H */
diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index a4e886d..e570c3c 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -11,8 +11,18 @@ typedef struct page *new_page_t(struct page *, unsigned long private, int **);
  * Return values from address_space_operations.migratepage():
  * - negative errno on page migration failure;
  * - zero on page migration success;
+ *
+ * The balloon page migration introduces this special case where a 'distinct'
+ * return code is used to flag a successful page migration to unmap_and_move().
+ * This approach is necessary because page migration can race against balloon
+ * deflation procedure, and for such case we could introduce a nasty page leak
+ * if a successfully migrated balloon page gets released concurrently with
+ * migration's unmap_and_move() wrap-up steps.
  */
 #define MIGRATEPAGE_SUCCESS		0
+#define MIGRATEPAGE_BALLOON_SUCCESS	1 /* special ret code for balloon page
+					   * successful migration case.
+					   */
 
 #ifdef CONFIG_MIGRATION
 
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index e42c762..6da609d 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -24,6 +24,7 @@ enum mapping_flags {
 	AS_ENOSPC	= __GFP_BITS_SHIFT + 1,	/* ENOSPC on async write */
 	AS_MM_ALL_LOCKS	= __GFP_BITS_SHIFT + 2,	/* under mm_take_all_locks() */
 	AS_UNEVICTABLE	= __GFP_BITS_SHIFT + 3,	/* e.g., ramdisk, SHM_LOCK */
+	AS_BALLOON_MAP	= __GFP_BITS_SHIFT + 4, /* balloon page special map */
 };
 
 static inline void mapping_set_error(struct address_space *mapping, int error)
@@ -53,6 +54,21 @@ static inline int mapping_unevictable(struct address_space *mapping)
 	return !!mapping;
 }
 
+static inline void mapping_set_balloon(struct address_space *mapping)
+{
+	set_bit(AS_BALLOON_MAP, &mapping->flags);
+}
+
+static inline void mapping_clear_balloon(struct address_space *mapping)
+{
+	clear_bit(AS_BALLOON_MAP, &mapping->flags);
+}
+
+static inline int mapping_balloon(struct address_space *mapping)
+{
+	return mapping && test_bit(AS_BALLOON_MAP, &mapping->flags);
+}
+
 static inline gfp_t mapping_gfp_mask(struct address_space * mapping)
 {
 	return (__force gfp_t)mapping->flags & __GFP_BITS_MASK;
diff --git a/mm/Kconfig b/mm/Kconfig
index a3f8ddd..b119172 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -188,6 +188,21 @@ config SPLIT_PTLOCK_CPUS
 	default "4"
 
 #
+# support for memory balloon compaction
+config BALLOON_COMPACTION
+	bool "Allow for balloon memory compaction/migration"
+	select COMPACTION
+	depends on VIRTIO_BALLOON
+	help
+	  Memory fragmentation introduced by ballooning might reduce
+	  significantly the number of 2MB contiguous memory blocks that can be
+	  used within a guest, thus imposing performance penalties associated
+	  with the reduced number of transparent huge pages that could be used
+	  by the guest workload. Allowing the compaction & migration for memory
+	  pages enlisted as being part of memory balloon devices avoids the
+	  scenario aforementioned and helps improving memory defragmentation.
+
+#
 # support for memory compaction
 config COMPACTION
 	bool "Allow for memory compaction"
diff --git a/mm/Makefile b/mm/Makefile
index 6b025f8..21e10ee 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -57,3 +57,4 @@ obj-$(CONFIG_DEBUG_KMEMLEAK) += kmemleak.o
 obj-$(CONFIG_DEBUG_KMEMLEAK_TEST) += kmemleak-test.o
 obj-$(CONFIG_CLEANCACHE) += cleancache.o
 obj-$(CONFIG_MEMORY_ISOLATION) += page_isolation.o
+obj-$(CONFIG_BALLOON_COMPACTION) += balloon_compaction.o
diff --git a/mm/balloon_compaction.c b/mm/balloon_compaction.c
new file mode 100644
index 0000000..90935aa
--- /dev/null
+++ b/mm/balloon_compaction.c
@@ -0,0 +1,269 @@
+/*
+ * mm/balloon_compaction.c
+ *
+ * Common interface for making balloon pages movable by compaction.
+ *
+ * Copyright (C) 2012, Red Hat, Inc.  Rafael Aquini <aquini at redhat.com>
+ */
+#include <linux/mm.h>
+#include <linux/slab.h>
+#include <linux/export.h>
+#include <linux/balloon_compaction.h>
+
+/*
+ * balloon_devinfo_alloc - allocates a balloon device information descriptor.
+ * @balloon_dev_descriptor: pointer to reference the balloon device which
+ *			    this struct balloon_dev_info will be servicing.
+ *
+ * Driver must call it to properly allocate and initialize an instance of
+ * struct balloon_dev_info which will be used to reference a balloon device
+ * as well as to keep track of the balloon device page list.
+ */
+struct balloon_dev_info *balloon_devinfo_alloc(void *balloon_dev_descriptor)
+{
+	struct balloon_dev_info *b_dev_info;
+	b_dev_info = kmalloc(sizeof(*b_dev_info), GFP_KERNEL);
+	if (!b_dev_info)
+		return ERR_PTR(-ENOMEM);
+
+	b_dev_info->balloon_device = balloon_dev_descriptor;
+	b_dev_info->mapping = NULL;
+	b_dev_info->isolated_pages = 0;
+	spin_lock_init(&b_dev_info->pages_lock);
+	INIT_LIST_HEAD(&b_dev_info->pages);
+
+	return b_dev_info;
+}
+EXPORT_SYMBOL_GPL(balloon_devinfo_alloc);
+
+/*
+ * balloon_page_enqueue - allocates a new page and inserts it into the balloon
+ *			  page list.
+ * @b_dev_info: balloon device descriptor where we will insert a new page to
+ *
+ * Driver must call it to properly allocate a new enlisted balloon page
+ * before definitively removing it from the guest system.
+ * This function returns the page address for the recently enqueued page or
+ * NULL in the case we fail to allocate a new page this turn.
+ */
+struct page *balloon_page_enqueue(struct balloon_dev_info *b_dev_info)
+{
+	unsigned long flags;
+	struct page *page = alloc_page(balloon_mapping_gfp_mask() |
+					__GFP_NOMEMALLOC | __GFP_NORETRY);
+	if (!page)
+		return NULL;
+
+	BUG_ON(!trylock_page(page));
+	spin_lock_irqsave(&b_dev_info->pages_lock, flags);
+	balloon_page_insert(page, b_dev_info->mapping, &b_dev_info->pages);
+	spin_unlock_irqrestore(&b_dev_info->pages_lock, flags);
+	unlock_page(page);
+	return page;
+}
+EXPORT_SYMBOL_GPL(balloon_page_enqueue);
+
+/*
+ * balloon_page_dequeue - removes a page from balloon's page list and returns
+ *			  its address to allow the driver to release the page.
+ * @b_dev_info: balloon device descriptor where we will grab a page from.
+ *
+ * Driver must call it to properly de-allocate a previously enlisted balloon
+ * page before definitively releasing it back to the guest system.
+ * This function returns the page address for the recently dequeued page or
+ * NULL in the case we find balloon's page list temporarily empty due to
+ * compaction isolated pages.
+ */
+struct page *balloon_page_dequeue(struct balloon_dev_info *b_dev_info)
+{
+	struct page *page, *tmp;
+	unsigned long flags;
+	bool dequeued_page;
+
+	dequeued_page = false;
+	list_for_each_entry_safe(page, tmp, &b_dev_info->pages, lru) {
+		if (trylock_page(page)) {
+			spin_lock_irqsave(&b_dev_info->pages_lock, flags);
+			balloon_page_delete(page);
+			spin_unlock_irqrestore(&b_dev_info->pages_lock, flags);
+			unlock_page(page);
+			dequeued_page = true;
+			break;
+		}
+	}
+
+	if (!dequeued_page) {
+		spin_lock_irqsave(&b_dev_info->pages_lock, flags);
+		if (unlikely(list_empty(&b_dev_info->pages) &&
+			     !b_dev_info->isolated_pages))
+			BUG();
+		spin_unlock_irqrestore(&b_dev_info->pages_lock, flags);
+		page = NULL;
+	}
+	return page;
+}
+EXPORT_SYMBOL_GPL(balloon_page_dequeue);
+
+#ifdef CONFIG_BALLOON_COMPACTION
+/*
+ * balloon_mapping_alloc - allocates a special ->mapping for ballooned pages.
+ * @b_dev_info: holds the balloon device information descriptor.
+ * @a_ops: balloon_mapping address_space_operations descriptor.
+ *
+ * Driver must call it to properly allocate and initialize an instance of
+ * struct address_space which will be used as the special page->mapping for
+ * balloon device enlisted page instances.
+ */
+struct address_space *balloon_mapping_alloc(struct balloon_dev_info *b_dev_info,
+				const struct address_space_operations *a_ops)
+{
+	struct address_space *mapping;
+
+	mapping = kmalloc(sizeof(*mapping), GFP_KERNEL);
+	if (!mapping)
+		return ERR_PTR(-ENOMEM);
+
+	/*
+	 * Give a clean 'zeroed' status to all elements of this special
+	 * balloon page->mapping struct address_space instance.
+	 */
+	address_space_init_once(mapping);
+
+	/*
+	 * Set mapping->flags appropriately, to allow balloon pages
+	 * ->mapping identification.
+	 */
+	mapping_set_balloon(mapping);
+	mapping_set_gfp_mask(mapping, balloon_mapping_gfp_mask());
+
+	/* balloon's page->mapping->a_ops callback descriptor */
+	mapping->a_ops = a_ops;
+
+	/*
+	 * Establish a pointer reference back to the balloon device descriptor
+	 * this particular page->mapping will be servicing.
+	 * This is used by compaction / migration procedures to identify and
+	 * access the balloon device pageset while isolating / migrating pages.
+	 *
+	 * As some balloon drivers can register multiple balloon devices
+	 * for a single guest, this also helps compaction / migration to
+	 * properly deal with multiple balloon pagesets, when required.
+	 */
+	mapping->private_data = b_dev_info;
+	b_dev_info->mapping = mapping;
+
+	return mapping;
+}
+EXPORT_SYMBOL_GPL(balloon_mapping_alloc);
+
+static inline void __isolate_balloon_page(struct page *page)
+{
+	struct balloon_dev_info *b_dev_info = page->mapping->private_data;
+	unsigned long flags;
+	spin_lock_irqsave(&b_dev_info->pages_lock, flags);
+	list_del(&page->lru);
+	b_dev_info->isolated_pages++;
+	spin_unlock_irqrestore(&b_dev_info->pages_lock, flags);
+}
+
+static inline void __putback_balloon_page(struct page *page)
+{
+	struct balloon_dev_info *b_dev_info = page->mapping->private_data;
+	unsigned long flags;
+	spin_lock_irqsave(&b_dev_info->pages_lock, flags);
+	list_add(&page->lru, &b_dev_info->pages);
+	b_dev_info->isolated_pages--;
+	spin_unlock_irqrestore(&b_dev_info->pages_lock, flags);
+}
+
+static inline int __migrate_balloon_page(struct address_space *mapping,
+		struct page *newpage, struct page *page, enum migrate_mode mode)
+{
+	return page->mapping->a_ops->migratepage(mapping, newpage, page, mode);
+}
+
+/* __isolate_lru_page() counterpart for a ballooned page */
+bool balloon_page_isolate(struct page *page)
+{
+	/*
+	 * Avoid burning cycles with pages that are yet under __free_pages(),
+	 * or just got freed under us.
+	 *
+	 * In case we 'win' a race for a balloon page being freed under us and
+	 * raise its refcount preventing __free_pages() from doing its job,
+	 * the put_page() at the end of this block will take care of
+	 * releasing this page, thus avoiding a nasty leakage.
+	 */
+	if (likely(get_page_unless_zero(page))) {
+		/*
+		 * As balloon pages are not isolated from LRU lists, concurrent
+		 * compaction threads can race against page migration functions
+		 * as well as race against the balloon driver releasing a page.
+		 *
+		 * In order to avoid having an already isolated balloon page
+		 * being (wrongly) re-isolated while it is under migration,
+		 * or to avoid attempting to isolate pages being released by
+		 * the balloon driver, let's be sure we have the page lock
+		 * before proceeding with the balloon page isolation steps.
+		 */
+		if (likely(trylock_page(page))) {
+			/*
+			 * A ballooned page, by default, has just one refcount.
+			 * Prevent concurrent compaction threads from isolating
+			 * an already isolated balloon page by refcount check.
+			 */
+			if (__is_movable_balloon_page(page) &&
+			    page_count(page) == 2) {
+				__isolate_balloon_page(page);
+				unlock_page(page);
+				return true;
+			}
+			unlock_page(page);
+		}
+		put_page(page);
+	}
+	return false;
+}
+
+/* putback_lru_page() counterpart for a ballooned page */
+void balloon_page_putback(struct page *page)
+{
+	/*
+	 * 'lock_page()' stabilizes the page and prevents races against
+	 * concurrent isolation threads attempting to re-isolate it.
+	 */
+	lock_page(page);
+
+	if (__is_movable_balloon_page(page)) {
+		__putback_balloon_page(page);
+		put_page(page);
+	} else {
+		__WARN();
+		dump_page(page);
+	}
+	unlock_page(page);
+}
+
+/* move_to_new_page() counterpart for a ballooned page */
+int balloon_page_migrate(struct page *newpage,
+			 struct page *page, enum migrate_mode mode)
+{
+	struct address_space *mapping;
+	int rc = -EAGAIN;
+
+	BUG_ON(!trylock_page(newpage));
+
+	if (WARN_ON(!__is_movable_balloon_page(page))) {
+		dump_page(page);
+		unlock_page(newpage);
+		return rc;
+	}
+
+	mapping = page->mapping;
+	if (mapping)
+		rc = __migrate_balloon_page(mapping, newpage, page, mode);
+
+	unlock_page(newpage);
+	return rc;
+}
+#endif /* CONFIG_BALLOON_COMPACTION */

--
1.7.11.7
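[To make the intended call flow concrete, below is a hedged sketch of how a
balloon driver would consume this interface. The driver name and the aops
callback are hypothetical placeholders; the balloon_* calls are the ones added
by this patch, and the error handling is deliberately simplified (a real
driver, like virtio_balloon in patch 5, tolerates -EOPNOTSUPP from
balloon_mapping_alloc() when CONFIG_BALLOON_COMPACTION is unset).]

#include <linux/err.h>
#include <linux/mm.h>
#include <linux/balloon_compaction.h>

/* hypothetical driver-provided migration callback (defined elsewhere) */
static int my_balloon_migratepage(struct address_space *mapping,
				  struct page *newpage, struct page *page,
				  enum migrate_mode mode);

static const struct address_space_operations my_balloon_aops = {
	.migratepage = my_balloon_migratepage,
};

static int my_balloon_setup(void *my_balloon_device)
{
	struct balloon_dev_info *b_dev_info;
	struct address_space *mapping;
	struct page *page;

	b_dev_info = balloon_devinfo_alloc(my_balloon_device);
	if (IS_ERR(b_dev_info))
		return PTR_ERR(b_dev_info);

	/* the special ->mapping is what tags pages as movable balloon pages */
	mapping = balloon_mapping_alloc(b_dev_info, &my_balloon_aops);
	if (IS_ERR(mapping)) {
		balloon_devinfo_free(b_dev_info);
		return PTR_ERR(mapping);
	}

	/* inflate: the page is enlisted and becomes visible to compaction */
	page = balloon_page_enqueue(b_dev_info);
	if (!page)
		return -ENOMEM;

	/* deflate: may return NULL while all pages are isolated by compaction */
	page = balloon_page_dequeue(b_dev_info);
	if (page)
		__free_page(page);

	return 0;
}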
Rafael Aquini
2012-Nov-07 03:05 UTC
[PATCH v11 4/7] mm: introduce compaction and migration for ballooned pages
Memory fragmentation introduced by ballooning might significantly reduce the
number of 2MB contiguous memory blocks that can be used within a guest, thus
imposing performance penalties associated with the reduced number of
transparent huge pages that could be used by the guest workload.

This patch introduces the helper functions as well as the necessary changes
to teach compaction and migration bits how to cope with pages which are part
of a guest memory balloon, in order to make them movable by memory compaction
procedures.

Signed-off-by: Rafael Aquini <aquini at redhat.com>
---
 mm/compaction.c | 21 +++++++++++++++++++--
 mm/migrate.c    | 36 ++++++++++++++++++++++++++++++++++--
 2 files changed, 53 insertions(+), 4 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index 9eef558..76abd84 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -14,6 +14,7 @@
 #include <linux/backing-dev.h>
 #include <linux/sysctl.h>
 #include <linux/sysfs.h>
+#include <linux/balloon_compaction.h>
 #include "internal.h"
 
 #if defined CONFIG_COMPACTION || defined CONFIG_CMA
@@ -565,9 +566,24 @@ isolate_migratepages_range(struct zone *zone, struct compact_control *cc,
 			goto next_pageblock;
 		}
 
-		/* Check may be lockless but that's ok as we recheck later */
-		if (!PageLRU(page))
+		/*
+		 * Check may be lockless but that's ok as we recheck later.
+		 * It's possible to migrate LRU pages and balloon pages;
+		 * skip any other type of page.
+		 */
+		if (!PageLRU(page)) {
+			if (unlikely(balloon_page_movable(page))) {
+				if (locked && balloon_page_isolate(page)) {
+					/* Successfully isolated */
+					cc->finished_update_migrate = true;
+					list_add(&page->lru, migratelist);
+					cc->nr_migratepages++;
+					nr_isolated++;
+					goto check_compact_cluster;
+				}
+			}
 			continue;
+		}
 
 		/*
 		 * PageLRU is set. lru_lock normally excludes isolation
@@ -621,6 +637,7 @@ isolate_migratepages_range(struct zone *zone, struct compact_control *cc,
 		cc->nr_migratepages++;
 		nr_isolated++;
 
+check_compact_cluster:
 		/* Avoid isolating too much */
 		if (cc->nr_migratepages == COMPACT_CLUSTER_MAX) {
 			++low_pfn;
diff --git a/mm/migrate.c b/mm/migrate.c
index 98c7a89..87ffe54 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -35,6 +35,7 @@
 #include <linux/hugetlb.h>
 #include <linux/hugetlb_cgroup.h>
 #include <linux/gfp.h>
+#include <linux/balloon_compaction.h>
 
 #include <asm/tlbflush.h>
 
@@ -79,7 +80,10 @@ void putback_lru_pages(struct list_head *l)
 		list_del(&page->lru);
 		dec_zone_page_state(page, NR_ISOLATED_ANON +
 				page_is_file_cache(page));
-		putback_lru_page(page);
+		if (unlikely(balloon_page_movable(page)))
+			balloon_page_putback(page);
+		else
+			putback_lru_page(page);
 	}
 }
 
@@ -778,6 +782,18 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 		}
 	}
 
+	if (unlikely(balloon_page_movable(page))) {
+		/*
+		 * A ballooned page does not need any special attention from
+		 * physical to virtual reverse mapping procedures.
+		 * Skip any attempt to unmap PTEs or to remap swap cache,
+		 * in order to avoid burning cycles at rmap level, and perform
+		 * the page migration right away (protected by page lock).
+		 */
+		rc = balloon_page_migrate(newpage, page, mode);
+		goto uncharge;
+	}
+
	/*
	 * Corner case handling:
	 * 1. When a new swap-cache page is read into, it is added to the LRU
@@ -814,7 +830,9 @@ skip_unmap:
 		put_anon_vma(anon_vma);
 
 uncharge:
-	mem_cgroup_end_migration(mem, page, newpage, rc == MIGRATEPAGE_SUCCESS);
+	mem_cgroup_end_migration(mem, page, newpage,
+				 (rc == MIGRATEPAGE_SUCCESS ||
+				  rc == MIGRATEPAGE_BALLOON_SUCCESS));
 unlock:
 	unlock_page(page);
 out:
@@ -846,6 +864,20 @@ static int unmap_and_move(new_page_t get_new_page, unsigned long private,
 		goto out;
 
 	rc = __unmap_and_move(page, newpage, force, offlining, mode);
+
+	if (unlikely(rc == MIGRATEPAGE_BALLOON_SUCCESS)) {
+		/*
+		 * A ballooned page has been migrated already.
+		 * Now, it's the time to remove the old page from the isolated
+		 * pageset list and handle it back to Buddy, wrap-up counters
+		 * and return.
+		 */
+		dec_zone_page_state(page, NR_ISOLATED_ANON +
+				    page_is_file_cache(page));
+		put_page(page);
+		__free_page(page);
+		return 0;
+	}
 out:
 	if (rc != -EAGAIN) {
 		/*

--
1.7.11.7
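[Distilled from the isolate_migratepages_range() hunk above, the page
classification compaction now applies can be sketched as follows. This is a
simplified restatement for readability, not code from the patch; note that the
lockless balloon_page_movable() test is only a hint which
balloon_page_isolate() re-checks under the page lock.]

#include <linux/mm.h>
#include <linux/balloon_compaction.h>

static bool can_isolate_for_migration(struct page *page, bool zone_locked,
				      struct list_head *migratelist)
{
	if (PageLRU(page))
		return true;	/* taken through the regular LRU isolation path */

	/* non-LRU: only movable balloon pages qualify */
	if (unlikely(balloon_page_movable(page)) &&
	    zone_locked && balloon_page_isolate(page)) {
		list_add(&page->lru, migratelist);
		return true;
	}
	return false;		/* skip any other type of page */
}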
Rafael Aquini
2012-Nov-07 03:05 UTC
[PATCH v11 5/7] virtio_balloon: introduce migration primitives to balloon pages
Memory fragmentation introduced by ballooning might significantly reduce the
number of 2MB contiguous memory blocks that can be used within a guest, thus
imposing performance penalties associated with the reduced number of
transparent huge pages that could be used by the guest workload.

Besides making balloon pages movable at allocation time and introducing the
necessary primitives to perform balloon page migration/compaction, this patch
also introduces the following locking scheme, in order to enhance the
synchronization methods for accessing elements of struct virtio_balloon, thus
providing protection against concurrent access introduced by parallel memory
migration threads.

 - balloon_lock (mutex): synchronizes the access demand to elements of
   struct virtio_balloon and its queue operations;

Signed-off-by: Rafael Aquini <aquini at redhat.com>
---
 drivers/virtio/virtio_balloon.c | 135 ++++++++++++++++++++++++++++++++++++----
 1 file changed, 123 insertions(+), 12 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 0908e60..69eede7 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -27,6 +27,7 @@
 #include <linux/delay.h>
 #include <linux/slab.h>
 #include <linux/module.h>
+#include <linux/balloon_compaction.h>
 
 /*
  * Balloon device works in 4K page units. So each page is pointed to by
@@ -34,6 +35,7 @@
  * page units.
  */
 #define VIRTIO_BALLOON_PAGES_PER_PAGE (PAGE_SIZE >> VIRTIO_BALLOON_PFN_SHIFT)
+#define VIRTIO_BALLOON_ARRAY_PFNS_MAX 256
 
 struct virtio_balloon
 {
@@ -52,15 +54,19 @@ struct virtio_balloon
 	/* Number of balloon pages we've told the Host we're not using. */
 	unsigned int num_pages;
 	/*
-	 * The pages we've told the Host we're not using.
+	 * The pages we've told the Host we're not using are enqueued
+	 * at vb_dev_info->pages list.
 	 * Each page on this list adds VIRTIO_BALLOON_PAGES_PER_PAGE
 	 * to num_pages above.
 	 */
-	struct list_head pages;
+	struct balloon_dev_info *vb_dev_info;
+
+	/* Synchronize access/update to this struct virtio_balloon elements */
+	struct mutex balloon_lock;
 
 	/* The array of pfns we tell the Host about. */
 	unsigned int num_pfns;
-	u32 pfns[256];
+	u32 pfns[VIRTIO_BALLOON_ARRAY_PFNS_MAX];
 
 	/* Memory statistics */
 	int need_stats_update;
@@ -122,18 +128,25 @@ static void set_page_pfns(u32 pfns[], struct page *page)
 
 static void fill_balloon(struct virtio_balloon *vb, size_t num)
 {
+	struct balloon_dev_info *vb_dev_info = vb->vb_dev_info;
+
+	static DEFINE_RATELIMIT_STATE(fill_balloon_rs,
+				      DEFAULT_RATELIMIT_INTERVAL,
+				      DEFAULT_RATELIMIT_BURST);
+
 	/* We can only do one array worth at a time. */
 	num = min(num, ARRAY_SIZE(vb->pfns));
 
+	mutex_lock(&vb->balloon_lock);
 	for (vb->num_pfns = 0; vb->num_pfns < num;
 	     vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
-		struct page *page = alloc_page(GFP_HIGHUSER | __GFP_NORETRY |
-					__GFP_NOMEMALLOC | __GFP_NOWARN);
+		struct page *page = balloon_page_enqueue(vb_dev_info);
+
 		if (!page) {
-			if (printk_ratelimit())
+			if (__ratelimit(&fill_balloon_rs))
 				dev_printk(KERN_INFO, &vb->vdev->dev,
 					   "Out of puff! Can't get %zu pages\n",
-					   num);
+					   VIRTIO_BALLOON_PAGES_PER_PAGE);
 			/* Sleep for at least 1/5 of a second before retry. */
 			msleep(200);
 			break;
@@ -141,7 +154,6 @@ static void fill_balloon(struct virtio_balloon *vb, size_t num)
 		set_page_pfns(vb->pfns + vb->num_pfns, page);
 		vb->num_pages += VIRTIO_BALLOON_PAGES_PER_PAGE;
 		totalram_pages--;
-		list_add(&page->lru, &vb->pages);
 	}
 
 	/* Didn't get any?  Oh well. */
@@ -149,6 +161,7 @@ static void fill_balloon(struct virtio_balloon *vb, size_t num)
 		return;
 
 	tell_host(vb, vb->inflate_vq);
+	mutex_unlock(&vb->balloon_lock);
 }
 
 static void release_pages_by_pfn(const u32 pfns[], unsigned int num)
@@ -165,14 +178,17 @@ static void release_pages_by_pfn(const u32 pfns[], unsigned int num)
 static void leak_balloon(struct virtio_balloon *vb, size_t num)
 {
 	struct page *page;
+	struct balloon_dev_info *vb_dev_info = vb->vb_dev_info;
 
 	/* We can only do one array worth at a time. */
 	num = min(num, ARRAY_SIZE(vb->pfns));
 
+	mutex_lock(&vb->balloon_lock);
 	for (vb->num_pfns = 0; vb->num_pfns < num;
 	     vb->num_pfns += VIRTIO_BALLOON_PAGES_PER_PAGE) {
-		page = list_first_entry(&vb->pages, struct page, lru);
-		list_del(&page->lru);
+		page = balloon_page_dequeue(vb_dev_info);
+		if (!page)
+			break;
 		set_page_pfns(vb->pfns + vb->num_pfns, page);
 		vb->num_pages -= VIRTIO_BALLOON_PAGES_PER_PAGE;
 	}
@@ -183,6 +199,7 @@ static void leak_balloon(struct virtio_balloon *vb, size_t num)
 	 * is true, we *have* to do it in this order
 	 */
 	tell_host(vb, vb->deflate_vq);
+	mutex_unlock(&vb->balloon_lock);
 	release_pages_by_pfn(vb->pfns, vb->num_pfns);
 }
 
@@ -339,9 +356,76 @@ static int init_vqs(struct virtio_balloon *vb)
 	return 0;
 }
 
+static const struct address_space_operations virtio_balloon_aops;
+#ifdef CONFIG_BALLOON_COMPACTION
+/*
+ * virtballoon_migratepage - perform the balloon page migration on behalf of
+ *			     a compaction thread.      (called under page lock)
+ * @mapping: the page->mapping which will be assigned to the new migrated page.
+ * @newpage: page that will replace the isolated page after migration finishes.
+ * @page   : the isolated (old) page that is about to be migrated to newpage.
+ * @mode   : compaction mode -- not used for balloon page migration.
+ *
+ * After a ballooned page gets isolated by compaction procedures, this is the
+ * function that performs the page migration on behalf of a compaction thread.
+ * The page migration for virtio balloon is done in a simple swap fashion which
+ * follows these two macro steps:
+ *  1) insert newpage into vb->pages list and update the host about it;
+ *  2) update the host about the old page removed from vb->pages list;
+ *
+ * This function performs the balloon page migration task.
+ * Called through balloon_mapping->a_ops->migratepage
+ */
+int virtballoon_migratepage(struct address_space *mapping,
+		struct page *newpage, struct page *page, enum migrate_mode mode)
+{
+	struct balloon_dev_info *vb_dev_info = balloon_page_device(page);
+	struct virtio_balloon *vb;
+	unsigned long flags;
+
+	BUG_ON(!vb_dev_info);
+
+	vb = vb_dev_info->balloon_device;
+
+	if (!mutex_trylock(&vb->balloon_lock))
+		return -EAGAIN;
+
+	/* balloon's page migration 1st step -- inflate "newpage" */
+	spin_lock_irqsave(&vb_dev_info->pages_lock, flags);
+	balloon_page_insert(newpage, mapping, &vb_dev_info->pages);
+	vb_dev_info->isolated_pages--;
+	spin_unlock_irqrestore(&vb_dev_info->pages_lock, flags);
+	vb->num_pfns = VIRTIO_BALLOON_PAGES_PER_PAGE;
+	set_page_pfns(vb->pfns, newpage);
+	tell_host(vb, vb->inflate_vq);
+
+	/*
+	 * balloon's page migration 2nd step -- deflate "page"
+	 *
+	 * It's safe to delete page->lru here because this page is at
+	 * an isolated migration list, and this step is expected to happen here
+	 */
+	balloon_page_delete(page);
+	vb->num_pfns = VIRTIO_BALLOON_PAGES_PER_PAGE;
+	set_page_pfns(vb->pfns, page);
+	tell_host(vb, vb->deflate_vq);
+
+	mutex_unlock(&vb->balloon_lock);
+
+	return MIGRATEPAGE_BALLOON_SUCCESS;
+}
+
+/* define the balloon_mapping->a_ops callback to allow balloon page migration */
+static const struct address_space_operations virtio_balloon_aops = {
+			.migratepage = virtballoon_migratepage,
+};
+#endif /* CONFIG_BALLOON_COMPACTION */
+
 static int virtballoon_probe(struct virtio_device *vdev)
 {
 	struct virtio_balloon *vb;
+	struct address_space *vb_mapping;
+	struct balloon_dev_info *vb_devinfo;
 	int err;
 
 	vdev->priv = vb = kmalloc(sizeof(*vb), GFP_KERNEL);
@@ -350,16 +434,37 @@ static int virtballoon_probe(struct virtio_device *vdev)
 		goto out;
 	}
 
-	INIT_LIST_HEAD(&vb->pages);
 	vb->num_pages = 0;
+	mutex_init(&vb->balloon_lock);
 	init_waitqueue_head(&vb->config_change);
 	init_waitqueue_head(&vb->acked);
 	vb->vdev = vdev;
 	vb->need_stats_update = 0;
 
+	vb_devinfo = balloon_devinfo_alloc(vb);
+	if (IS_ERR(vb_devinfo)) {
+		err = PTR_ERR(vb_devinfo);
+		goto out_free_vb;
+	}
+
+	vb_mapping = balloon_mapping_alloc(vb_devinfo,
+					   (balloon_compaction_check()) ?
+					   &virtio_balloon_aops : NULL);
+	if (IS_ERR(vb_mapping)) {
+		/*
+		 * IS_ERR(vb_mapping) && PTR_ERR(vb_mapping) == -EOPNOTSUPP
+		 * This means !CONFIG_BALLOON_COMPACTION, otherwise we get off.
+		 */
+		err = PTR_ERR(vb_mapping);
+		if (err != -EOPNOTSUPP)
+			goto out_free_vb_devinfo;
+	}
+
+	vb->vb_dev_info = vb_devinfo;
+
 	err = init_vqs(vb);
 	if (err)
-		goto out_free_vb;
+		goto out_free_vb_mapping;
 
 	vb->thread = kthread_run(balloon, vb, "vballoon");
 	if (IS_ERR(vb->thread)) {
@@ -371,6 +476,10 @@ static int virtballoon_probe(struct virtio_device *vdev)
 
 out_del_vqs:
 	vdev->config->del_vqs(vdev);
+out_free_vb_mapping:
+	balloon_mapping_free(vb_mapping);
+out_free_vb_devinfo:
+	balloon_devinfo_free(vb_devinfo);
 out_free_vb:
 	kfree(vb);
 out:
@@ -396,6 +505,8 @@ static void __devexit virtballoon_remove(struct virtio_device *vdev)
 
 	kthread_stop(vb->thread);
 	remove_common(vb);
+	balloon_mapping_free(vb->vb_dev_info->mapping);
+	balloon_devinfo_free(vb->vb_dev_info);
 	kfree(vb);
 }

--
1.7.11.7
The PATCH "mm: introduce compaction and migration for virtio ballooned pages" hacks around putback_lru_pages() in order to allow ballooned pages to be re-inserted on balloon page list as if a ballooned page was like a LRU page. As ballooned pages are not legitimate LRU pages, this patch introduces putback_movable_pages() to properly cope with cases where the isolated pageset contains ballooned pages and LRU pages, thus fixing the mentioned inelegant hack around putback_lru_pages(). Signed-off-by: Rafael Aquini <aquini at redhat.com> --- include/linux/migrate.h | 2 ++ mm/compaction.c | 6 +++--- mm/migrate.c | 20 ++++++++++++++++++++ mm/page_alloc.c | 2 +- 4 files changed, 26 insertions(+), 4 deletions(-) diff --git a/include/linux/migrate.h b/include/linux/migrate.h index e570c3c..ff074a4 100644 --- a/include/linux/migrate.h +++ b/include/linux/migrate.h @@ -27,6 +27,7 @@ typedef struct page *new_page_t(struct page *, unsigned long private, int **); #ifdef CONFIG_MIGRATION extern void putback_lru_pages(struct list_head *l); +extern void putback_movable_pages(struct list_head *l); extern int migrate_page(struct address_space *, struct page *, struct page *, enum migrate_mode); extern int migrate_pages(struct list_head *l, new_page_t x, @@ -50,6 +51,7 @@ extern int migrate_huge_page_move_mapping(struct address_space *mapping, #else static inline void putback_lru_pages(struct list_head *l) {} +static inline void putback_movable_pages(struct list_head *l) {} static inline int migrate_pages(struct list_head *l, new_page_t x, unsigned long private, bool offlining, enum migrate_mode mode) { return -ENOSYS; } diff --git a/mm/compaction.c b/mm/compaction.c index 76abd84..f268bd8 100644 --- a/mm/compaction.c +++ b/mm/compaction.c @@ -995,7 +995,7 @@ static int compact_zone(struct zone *zone, struct compact_control *cc) switch (isolate_migratepages(zone, cc)) { case ISOLATE_ABORT: ret = COMPACT_PARTIAL; - putback_lru_pages(&cc->migratepages); + putback_movable_pages(&cc->migratepages); cc->nr_migratepages = 0; goto out; case ISOLATE_NONE: @@ -1018,9 +1018,9 @@ static int compact_zone(struct zone *zone, struct compact_control *cc) trace_mm_compaction_migratepages(nr_migrate - nr_remaining, nr_remaining); - /* Release LRU pages not migrated */ + /* Release isolated pages not migrated */ if (err) { - putback_lru_pages(&cc->migratepages); + putback_movable_pages(&cc->migratepages); cc->nr_migratepages = 0; if (err == -ENOMEM) { ret = COMPACT_PARTIAL; diff --git a/mm/migrate.c b/mm/migrate.c index 87ffe54..adb3d44 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -80,6 +80,26 @@ void putback_lru_pages(struct list_head *l) list_del(&page->lru); dec_zone_page_state(page, NR_ISOLATED_ANON + page_is_file_cache(page)); + putback_lru_page(page); + } +} + +/* + * Put previously isolated pages back onto the appropriate lists + * from where they were once taken off for compaction/migration. 
+ * + * This function shall be used instead of putback_lru_pages(), + * whenever the isolated pageset has been built by isolate_migratepages_range() + */ +void putback_movable_pages(struct list_head *l) +{ + struct page *page; + struct page *page2; + + list_for_each_entry_safe(page, page2, l, lru) { + list_del(&page->lru); + dec_zone_page_state(page, NR_ISOLATED_ANON + + page_is_file_cache(page)); if (unlikely(balloon_page_movable(page))) balloon_page_putback(page); else diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 5b74de6..1cb0f93 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -5710,7 +5710,7 @@ static int __alloc_contig_migrate_range(struct compact_control *cc, 0, false, MIGRATE_SYNC); } - putback_lru_pages(&cc->migratepages); + putback_movable_pages(&cc->migratepages); return ret > 0 ? 0 : ret; } -- 1.7.11.7
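[The caller-side rule this patch establishes can be summed up in a short,
simplified sketch with an assumed get_new_page allocator; migrate_pages() and
putback_movable_pages() are the real interfaces shown above. Whenever the
isolated pageset was built by isolate_migratepages_range(), failed or leftover
pages must be returned via putback_movable_pages(), since the list may mix LRU
and balloon pages.]

#include <linux/migrate.h>

static void migrate_isolated_pageset(struct list_head *pageset,
				     new_page_t get_new_page,
				     unsigned long private)
{
	int err = migrate_pages(pageset, get_new_page, private,
				false, MIGRATE_SYNC);
	if (err)
		/* routes balloon pages back to the balloon page list,
		 * everything else back to the LRU lists */
		putback_movable_pages(pageset);
}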
Rafael Aquini
2012-Nov-07 03:05 UTC
[PATCH v11 7/7] mm: add vm event counters for balloon pages compaction
This patch introduces a new set of vm event counters to keep track of
ballooned pages compaction activity.

Signed-off-by: Rafael Aquini <aquini at redhat.com>
---
 drivers/virtio/virtio_balloon.c |  1 +
 include/linux/vm_event_item.h   |  8 +++++++-
 mm/balloon_compaction.c         |  2 ++
 mm/migrate.c                    |  1 +
 mm/vmstat.c                     | 10 +++++++++-
 5 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 69eede7..3756fc1 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -411,6 +411,7 @@ int virtballoon_migratepage(struct address_space *mapping,
 	tell_host(vb, vb->deflate_vq);
 
 	mutex_unlock(&vb->balloon_lock);
+	balloon_event_count(COMPACTBALLOONMIGRATED);
 
 	return MIGRATEPAGE_BALLOON_SUCCESS;
 }
diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
index 3d31145..cbd72fc 100644
--- a/include/linux/vm_event_item.h
+++ b/include/linux/vm_event_item.h
@@ -41,7 +41,13 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
 #ifdef CONFIG_COMPACTION
 		COMPACTBLOCKS, COMPACTPAGES, COMPACTPAGEFAILED,
 		COMPACTSTALL, COMPACTFAIL, COMPACTSUCCESS,
-#endif
+#ifdef CONFIG_BALLOON_COMPACTION
+		COMPACTBALLOONISOLATED, /* isolated from balloon pagelist */
+		COMPACTBALLOONMIGRATED, /* balloon page successfully migrated */
+		COMPACTBALLOONRELEASED, /* old-page released after migration */
+		COMPACTBALLOONRETURNED, /* putback to pagelist, not-migrated */
+#endif /* CONFIG_BALLOON_COMPACTION */
+#endif /* CONFIG_COMPACTION */
 #ifdef CONFIG_HUGETLB_PAGE
 		HTLB_BUDDY_PGALLOC, HTLB_BUDDY_PGALLOC_FAIL,
 #endif
diff --git a/mm/balloon_compaction.c b/mm/balloon_compaction.c
index 90935aa..32927eb 100644
--- a/mm/balloon_compaction.c
+++ b/mm/balloon_compaction.c
@@ -215,6 +215,7 @@ bool balloon_page_isolate(struct page *page)
 			if (__is_movable_balloon_page(page) &&
 			    page_count(page) == 2) {
 				__isolate_balloon_page(page);
+				balloon_event_count(COMPACTBALLOONISOLATED);
 				unlock_page(page);
 				return true;
 			}
@@ -237,6 +238,7 @@ void balloon_page_putback(struct page *page)
 	if (__is_movable_balloon_page(page)) {
 		__putback_balloon_page(page);
 		put_page(page);
+		balloon_event_count(COMPACTBALLOONRETURNED);
 	} else {
 		__WARN();
 		dump_page(page);
diff --git a/mm/migrate.c b/mm/migrate.c
index adb3d44..ee3037d 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -896,6 +896,7 @@ static int unmap_and_move(new_page_t get_new_page, unsigned long private,
 				    page_is_file_cache(page));
 		put_page(page);
 		__free_page(page);
+		balloon_event_count(COMPACTBALLOONRELEASED);
 		return 0;
 	}
 out:
diff --git a/mm/vmstat.c b/mm/vmstat.c
index c737057..1363edc 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -781,7 +781,15 @@ const char * const vmstat_text[] = {
 	"compact_stall",
 	"compact_fail",
 	"compact_success",
-#endif
+
+#ifdef CONFIG_BALLOON_COMPACTION
+	"compact_balloon_isolated",
+	"compact_balloon_migrated",
+	"compact_balloon_released",
+	"compact_balloon_returned",
+#endif /* CONFIG_BALLOON_COMPACTION */
+
+#endif /* CONFIG_COMPACTION */
 
 #ifdef CONFIG_HUGETLB_PAGE
 	"htlb_buddy_alloc_success",

--
1.7.11.7
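[For completeness, here is a small illustrative userspace C program, not part
of the patch, that dumps the four new counters from /proc/vmstat; the counter
names come from the mm/vmstat.c hunk above.]

#include <stdio.h>
#include <string.h>

int main(void)
{
	char line[256];
	FILE *f = fopen("/proc/vmstat", "r");

	if (!f) {
		perror("fopen /proc/vmstat");
		return 1;
	}
	/* prints compact_balloon_{isolated,migrated,released,returned} */
	while (fgets(line, sizeof(line), f))
		if (!strncmp(line, "compact_balloon_", 16))
			fputs(line, stdout);
	fclose(f);
	return 0;
}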
Michael S. Tsirkin
2012-Nov-10 15:55 UTC
[PATCH v11 7/7] mm: add vm event counters for balloon pages compaction
On Wed, Nov 07, 2012 at 01:05:54AM -0200, Rafael Aquini wrote:
> This patch introduces a new set of vm event counters to keep track of
> ballooned pages compaction activity.
>
> Signed-off-by: Rafael Aquini <aquini at redhat.com>
> ---
>  drivers/virtio/virtio_balloon.c |  1 +
>  include/linux/vm_event_item.h   |  8 +++++++-
>  mm/balloon_compaction.c         |  2 ++
>  mm/migrate.c                    |  1 +
>  mm/vmstat.c                     | 10 +++++++++-
>  5 files changed, 20 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
> index 69eede7..3756fc1 100644
> --- a/drivers/virtio/virtio_balloon.c
> +++ b/drivers/virtio/virtio_balloon.c
> @@ -411,6 +411,7 @@ int virtballoon_migratepage(struct address_space *mapping,
>  	tell_host(vb, vb->deflate_vq);
>  
>  	mutex_unlock(&vb->balloon_lock);
> +	balloon_event_count(COMPACTBALLOONMIGRATED);
>  
>  	return MIGRATEPAGE_BALLOON_SUCCESS;
>  }

Looks like any balloon would need to do this.
Can this chunk go into the caller instead?

> diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
> index 3d31145..cbd72fc 100644
> --- a/include/linux/vm_event_item.h
> +++ b/include/linux/vm_event_item.h
> @@ -41,7 +41,13 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
>  #ifdef CONFIG_COMPACTION
>  		COMPACTBLOCKS, COMPACTPAGES, COMPACTPAGEFAILED,
>  		COMPACTSTALL, COMPACTFAIL, COMPACTSUCCESS,
> -#endif
> +#ifdef CONFIG_BALLOON_COMPACTION
> +		COMPACTBALLOONISOLATED, /* isolated from balloon pagelist */
> +		COMPACTBALLOONMIGRATED, /* balloon page successfully migrated */
> +		COMPACTBALLOONRELEASED, /* old-page released after migration */
> +		COMPACTBALLOONRETURNED, /* putback to pagelist, not-migrated */
> +#endif /* CONFIG_BALLOON_COMPACTION */
> +#endif /* CONFIG_COMPACTION */
>  #ifdef CONFIG_HUGETLB_PAGE
>  		HTLB_BUDDY_PGALLOC, HTLB_BUDDY_PGALLOC_FAIL,
>  #endif
> diff --git a/mm/balloon_compaction.c b/mm/balloon_compaction.c
> index 90935aa..32927eb 100644
> --- a/mm/balloon_compaction.c
> +++ b/mm/balloon_compaction.c
> @@ -215,6 +215,7 @@ bool balloon_page_isolate(struct page *page)
>  			if (__is_movable_balloon_page(page) &&
>  			    page_count(page) == 2) {
>  				__isolate_balloon_page(page);
> +				balloon_event_count(COMPACTBALLOONISOLATED);
>  				unlock_page(page);
>  				return true;
>  			}
> @@ -237,6 +238,7 @@ void balloon_page_putback(struct page *page)
>  	if (__is_movable_balloon_page(page)) {
>  		__putback_balloon_page(page);
>  		put_page(page);
> +		balloon_event_count(COMPACTBALLOONRETURNED);
>  	} else {
>  		__WARN();
>  		dump_page(page);
> diff --git a/mm/migrate.c b/mm/migrate.c
> index adb3d44..ee3037d 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -896,6 +896,7 @@ static int unmap_and_move(new_page_t get_new_page, unsigned long private,
>  				    page_is_file_cache(page));
>  		put_page(page);
>  		__free_page(page);
> +		balloon_event_count(COMPACTBALLOONRELEASED);
>  		return 0;
>  	}
>  out:
> diff --git a/mm/vmstat.c b/mm/vmstat.c
> index c737057..1363edc 100644
> --- a/mm/vmstat.c
> +++ b/mm/vmstat.c
> @@ -781,7 +781,15 @@ const char * const vmstat_text[] = {
>  	"compact_stall",
>  	"compact_fail",
>  	"compact_success",
> -#endif
> +
> +#ifdef CONFIG_BALLOON_COMPACTION
> +	"compact_balloon_isolated",
> +	"compact_balloon_migrated",
> +	"compact_balloon_released",
> +	"compact_balloon_returned",
> +#endif /* CONFIG_BALLOON_COMPACTION */
> +
> +#endif /* CONFIG_COMPACTION */
>
>  #ifdef CONFIG_HUGETLB_PAGE
>  	"htlb_buddy_alloc_success",
> --
> 1.7.11.7