Michal Hocko
2019-Oct-24 08:42 UTC
[PATCH RFC v3 6/9] mm: Allow to offline PageOffline() pages with a reference count of 0
On Wed 23-10-19 12:03:51, David Hildenbrand wrote:
> > Do you see any downsides?
>
> The only downside I see is that we get more false negatives on
> has_unmovable_pages(), eventually resulting in the offlining stage after
> isolation to loop forever (as some PageOffline() pages are not movable
> (especially, XEN balloon, HyperV balloon), there won't be progress).
>
> I somewhat don't like forcing everybody that uses PageOffline() (especially
> all users of balloon compaction) to implement memory notifiers just to avoid
> that. Maybe, we even want to use PageOffline() in the future in the core
> (e.g., for memory holes instead of PG_reserved or similar).

There is only a handful of those and we need to deal with them anyway.
If you do not want to enforce them to create their own notifiers then we
can accommodate the hotplug code. __test_page_isolated_in_pageblock resp.
the call chain up can distinguish temporary and permanent failures
(EAGAIN vs. EBUSY). The current state when we always return EBUSY and
keep retrying forever is not optimal at all, right? A referenced PageOffline
could be an example of EBUSY; all other failures where we are effectively
waiting for pages to get freed finally would be EAGAIN.

It is a bit late in the process because a large portion of the work has
been done already but this doesn't sound like something to lose sleep
over.
-- 
Michal Hocko
SUSE Labs
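The EAGAIN vs. EBUSY distinction suggested above could look roughly like the following. This is only a userspace sketch to illustrate the idea, not kernel code: struct page_model, classify_page(), classify_range() and offline_range() are hypothetical names, and the "progress" step merely simulates temporary users dropping their references.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a page; NOT the kernel's struct page. */
struct page_model {
	bool offline;	/* models PageOffline() */
	bool movable;	/* models a movable page (e.g., balloon compaction) */
	int refcount;	/* models page_count() */
};

/*
 * Classify a single page:
 *   0       - nothing to wait for (no references left)
 *   -EBUSY  - permanent failure: unmovable PageOffline() page still referenced
 *   -EAGAIN - temporary failure: the page may still get freed or migrated
 */
static int classify_page(const struct page_model *p)
{
	if (p->refcount == 0)
		return 0;
	if (p->offline && !p->movable)
		return -EBUSY;
	return -EAGAIN;
}

/* Classify a whole range; a permanent failure wins over a temporary one. */
static int classify_range(const struct page_model *pages, size_t n)
{
	int ret = 0;

	for (size_t i = 0; i < n; i++) {
		int r = classify_page(&pages[i]);

		if (r == -EBUSY)
			return -EBUSY;
		if (r == -EAGAIN)
			ret = -EAGAIN;
	}
	return ret;
}

/* Retry only on temporary failures and give up after max_tries passes. */
static int offline_range(struct page_model *pages, size_t n, int max_tries)
{
	int ret = -EAGAIN;

	assert(max_tries > 0);
	for (int t = 0; t < max_tries; t++) {
		ret = classify_range(pages, n);
		if (ret != -EAGAIN)
			break;
		/* Simulated progress: temporary users drop one reference. */
		for (size_t i = 0; i < n; i++) {
			bool permanent = pages[i].offline && !pages[i].movable;

			if (!permanent && pages[i].refcount > 0)
				pages[i].refcount--;
		}
	}
	return ret;
}
```

With this split, a caller would bail out immediately on -EBUSY and keep retrying (possibly with a bound) only on -EAGAIN, instead of looping forever on every failure.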
David Hildenbrand
2019-Oct-24 08:51 UTC
[PATCH RFC v3 6/9] mm: Allow to offline PageOffline() pages with a reference count of 0
On 24.10.19 10:42, Michal Hocko wrote:
> On Wed 23-10-19 12:03:51, David Hildenbrand wrote:
>>> Do you see any downsides?
>>
>> The only downside I see is that we get more false negatives on
>> has_unmovable_pages(), eventually resulting in the offlining stage after
>> isolation to loop forever (as some PageOffline() pages are not movable
>> (especially, XEN balloon, HyperV balloon), there won't be progress).
>>
>> I somewhat don't like forcing everybody that uses PageOffline() (especially
>> all users of balloon compaction) to implement memory notifiers just to avoid
>> that. Maybe, we even want to use PageOffline() in the future in the core
>> (e.g., for memory holes instead of PG_reserved or similar).
>
> There is only a handful of those and we need to deal with them anyway.
> If you do not want to enforce them to create their own notifiers then we
> can accommodate the hotplug code. __test_page_isolated_in_pageblock resp.

Yeah, I would prefer offlining code to be able to deal with that without
notifier changes for all users.

> the call chain up can distinguish temporary and permanent failures
> (EAGAIN vs. EBUSY). The current state when we always return EBUSY and
> keep retrying forever is not optimal at all, right? A referenced PageOffline

Very right!

> could be an example of EBUSY; all other failures where we are effectively
> waiting for pages to get freed finally would be EAGAIN.

We have to watch out for PageOffline() pages that are actually movable
(balloon compaction). But that doesn't sound too hard.

>
> It is a bit late in the process because a large portion of the work has
> been done already but this doesn't sound like something to lose sleep
> over.
>

Right. I'll look into that to find out if this would work. And see if I
can reproduce what I described at all (theoretical thoughts) :)

Again, thanks for looking into this Michal!

-- 
Thanks,

David / dhildenb
David Hildenbrand
2019-Oct-25 11:28 UTC
[PATCH RFC] mm: Allow to offline unmovable PageOffline() pages if the driver agrees
virtio-mem wants to allow to offline memory blocks of which some parts
were unplugged, especially, to later offline and remove completely
unplugged memory blocks. The important part is that PageOffline() has to
remain set until the section is offline, so these pages will never get
accessed (e.g., when dumping). The pages should not be handed back to the
buddy (which would require clearing PageOffline() and result in issues if
offlining fails and the pages are suddenly in the buddy).

Let's allow to do that by allowing to isolate any PageOffline() page
when offlining. This way, we can reach the memory hotplug notifier
MEM_GOING_OFFLINE, where the driver can signal that it is fine with
offlining this page by dropping its reference count. PageOffline() pages
with a reference count of 0 can then be skipped when offlining the
pages (like if they were free, however they are not in the buddy).

Anybody who uses PageOffline() pages and does not agree to offline them
(e.g., Hyper-V balloon, XEN balloon, VMWare balloon for 2MB pages) will not
decrement the reference count and make offlining fail when trying to
migrate such an unmovable page. So there should be no observable change.
Same applies to balloon compaction users (movable PageOffline() pages), the
pages will simply be migrated.

Note 1: If offlining fails, a driver has to increment the reference
	count again in MEM_CANCEL_OFFLINE.

Note 2: A driver that makes use of this has to be aware that re-onlining
	the memory block has to be handled by hooking into onlining code
	(online_page_callback_t), re-setting PageOffline() on the pages
	and not giving them to the buddy.

Cc: Andrew Morton <akpm at linux-foundation.org>
Cc: Juergen Gross <jgross at suse.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk at oracle.com>
Cc: Pavel Tatashin <pavel.tatashin at microsoft.com>
Cc: Alexander Duyck <alexander.h.duyck at linux.intel.com>
Cc: Vlastimil Babka <vbabka at suse.cz>
Cc: Johannes Weiner <hannes at cmpxchg.org>
Cc: Anthony Yznaga <anthony.yznaga at oracle.com>
Cc: Michal Hocko <mhocko at suse.com>
Cc: Oscar Salvador <osalvador at suse.de>
Cc: Mel Gorman <mgorman at techsingularity.net>
Cc: Mike Rapoport <rppt at linux.ibm.com>
Cc: Dan Williams <dan.j.williams at intel.com>
Cc: Anshuman Khandual <anshuman.khandual at arm.com>
Cc: Qian Cai <cai at lca.pw>
Cc: Pingfan Liu <kernelfans at gmail.com>
Signed-off-by: David Hildenbrand <david at redhat.com>
---

Michal, this is the approach where we allow has_unmovable_pages() to succeed
to reach MEM_GOING_OFFLINE and fail later in case the driver did not agree.

Thoughts?

---
 include/linux/page-flags.h | 10 ++++++++++
 mm/memory_hotplug.c        | 41 ++++++++++++++++++++++++++++----------
 mm/page_alloc.c            | 24 ++++++++++++++++++++++
 mm/page_isolation.c        |  9 +++++++++
 4 files changed, 74 insertions(+), 10 deletions(-)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 3b8e5c5f7e1f..4897cc585af6 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -757,6 +757,16 @@ PAGE_TYPE_OPS(Buddy, buddy)
  * not onlined when onlining the section).
  * The content of these pages is effectively stale. Such pages should not
  * be touched (read/write/dump/save) except by their owner.
+ *
+ * If a driver wants to allow to offline unmovable PageOffline() pages without
+ * putting them back to the buddy, it can do so via the memory notifier by
+ * decrementing the reference count in MEM_GOING_OFFLINE and incrementing the
+ * reference count in MEM_CANCEL_OFFLINE. When offlining, the PageOffline()
+ * pages (now with a reference count of zero) are treated like free pages,
+ * allowing the containing memory block to get offlined. A driver that
+ * relies on this feature is aware that re-onlining the memory block will
+ * require to re-set the pages PageOffline() and not giving them to the
+ * buddy via online_page_callback_t.
  */
 PAGE_TYPE_OPS(Offline, offline)
 
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 561371ead39a..7a18b0bba045 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1223,11 +1223,15 @@ int test_pages_in_a_zone(unsigned long start_pfn, unsigned long end_pfn,
 
 /*
  * Scan pfn range [start,end) to find movable/migratable pages (LRU pages,
- * non-lru movable pages and hugepages). We scan pfn because it's much
- * easier than scanning over linked list. This function returns the pfn
- * of the first found movable page if it's found, otherwise 0.
+ * non-lru movable pages and hugepages).
+ *
+ * Returns:
+ *	0 in case a movable page is found and movable_pfn was updated.
+ *	-ENOENT in case no movable page was found.
+ *	-EBUSY in case a definitely unmovable page was found.
  */
-static unsigned long scan_movable_pages(unsigned long start, unsigned long end)
+static int scan_movable_pages(unsigned long start, unsigned long end,
+			      unsigned long *movable_pfn)
 {
 	unsigned long pfn;
 
@@ -1239,18 +1243,29 @@ static unsigned long scan_movable_pages(unsigned long start, unsigned long end)
 			continue;
 		page = pfn_to_page(pfn);
 		if (PageLRU(page))
-			return pfn;
+			goto found;
 		if (__PageMovable(page))
-			return pfn;
+			goto found;
+
+		/*
+		 * Unmovable PageOffline() pages where somebody still holds
+		 * a reference count (after MEM_GOING_OFFLINE) can definitely
+		 * not be offlined.
+		 */
+		if (PageOffline(page) && page_count(page))
+			return -EBUSY;
 
 		if (!PageHuge(page))
 			continue;
 		head = compound_head(page);
 		if (page_huge_active(head))
-			return pfn;
+			goto found;
 		skip = compound_nr(head) - (page - head);
 		pfn += skip - 1;
 	}
+	return -ENOENT;
+found:
+	*movable_pfn = pfn;
 	return 0;
 }
 
@@ -1496,7 +1511,8 @@ static int __ref __offline_pages(unsigned long start_pfn,
 	}
 
 	do {
-		for (pfn = start_pfn; pfn;) {
+		pfn = start_pfn;
+		do {
 			if (signal_pending(current)) {
 				ret = -EINTR;
 				reason = "signal backoff";
@@ -1506,14 +1522,19 @@ static int __ref __offline_pages(unsigned long start_pfn,
 			cond_resched();
 			lru_add_drain_all();
 
-			pfn = scan_movable_pages(pfn, end_pfn);
-			if (pfn) {
+			ret = scan_movable_pages(pfn, end_pfn, &pfn);
+			if (!ret) {
 				/*
 				 * TODO: fatal migration failures should bail
 				 * out
 				 */
 				do_migrate_range(pfn, end_pfn);
 			}
+		} while (!ret);
+
+		if (ret != -ENOENT) {
+			reason = "unmovable page";
+			goto failed_removal_isolated;
 		}
 
 		/*
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e2b0bdfdd586..1594f480532a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8273,6 +8273,19 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
 		if ((flags & MEMORY_OFFLINE) && PageHWPoison(page))
 			continue;
 
+		/*
+		 * We treat all PageOffline() pages as movable when offlining
+		 * to give drivers a chance to decrement their reference count
+		 * in MEM_GOING_OFFLINE in order to signalize that these pages
+		 * can be offlined as there are no direct references anymore.
+		 * For actually unmovable PageOffline() where the driver does
+		 * not support this, we will fail later when trying to actually
+		 * move these pages that still have a reference count > 0.
+		 * (false negatives in this function only)
+		 */
+		if ((flags & MEMORY_OFFLINE) && PageOffline(page))
+			continue;
+
 		if (__PageMovable(page))
 			continue;
 
@@ -8702,6 +8715,17 @@ __offline_isolated_pages(unsigned long start_pfn, unsigned long end_pfn)
 			offlined_pages++;
 			continue;
 		}
+		/*
+		 * At this point all remaining PageOffline() pages have a
+		 * reference count of 0 and can simply be skipped.
+		 */
+		if (PageOffline(page)) {
+			BUG_ON(page_count(page));
+			BUG_ON(PageBuddy(page));
+			pfn++;
+			offlined_pages++;
+			continue;
+		}
 
 		BUG_ON(page_count(page));
 		BUG_ON(!PageBuddy(page));
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 04ee1663cdbe..43b4dabfedc8 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -170,6 +170,7 @@ __first_valid_page(unsigned long pfn, unsigned long nr_pages)
  *			 a bit mask)
  *			 MEMORY_OFFLINE - isolate to offline (!allocate) memory
  *					  e.g., skip over PageHWPoison() pages
+ *					  and PageOffline() pages.
  *			 REPORT_FAILURE - report details about the failure to
  *			 isolate the range
  *
@@ -278,6 +279,14 @@ __test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn,
 		else if ((flags & MEMORY_OFFLINE) && PageHWPoison(page))
 			/* A HWPoisoned page cannot be also PageBuddy */
 			pfn++;
+		else if ((flags & MEMORY_OFFLINE) && PageOffline(page) &&
+			 !page_count(page))
+			/*
+			 * The responsible driver agreed to offline
+			 * PageOffline() pages by dropping its reference in
+			 * MEM_GOING_OFFLINE.
+			 */
+			pfn++;
 		else
 			break;
 	}
-- 
2.21.0
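
To make the handshake in the patch above easier to follow, here is a small userspace model of it. It is a sketch under stated assumptions, not kernel code: struct page_model, enum mem_event, driver_notify(), scan_movable() and try_offline() are hypothetical names that only mimic MEM_GOING_OFFLINE / MEM_CANCEL_OFFLINE and the return contract of scan_movable_pages() (0, -ENOENT, -EBUSY). Migration is simulated by marking the page free, and the driver is assumed to hold exactly one reference per unmovable PageOffline() page.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for MEM_GOING_OFFLINE / MEM_CANCEL_OFFLINE. */
enum mem_event { GOING_OFFLINE, CANCEL_OFFLINE };

/* Hypothetical userspace model of a page; NOT the kernel's struct page. */
struct page_model {
	bool offline;	/* models PageOffline() */
	bool movable;	/* models PageLRU()/__PageMovable() */
	int refcount;	/* models page_count() */
};

/*
 * Model of a cooperating driver: drop the reference on its unmovable
 * PageOffline() pages when offlining starts, take it back on cancel.
 */
static void driver_notify(struct page_model *pages, size_t n, enum mem_event ev)
{
	for (size_t i = 0; i < n; i++) {
		if (!pages[i].offline || pages[i].movable)
			continue;
		if (ev == GOING_OFFLINE) {
			assert(pages[i].refcount == 1);
			pages[i].refcount = 0;
		} else {
			assert(pages[i].refcount == 0);
			pages[i].refcount = 1;
		}
	}
}

/*
 * Same return contract as scan_movable_pages() in the patch:
 *   0       - movable page found, *movable_idx updated
 *   -ENOENT - no movable page found
 *   -EBUSY  - unmovable PageOffline() page that is still referenced
 */
static int scan_movable(const struct page_model *pages, size_t start,
			size_t end, size_t *movable_idx)
{
	for (size_t i = start; i < end; i++) {
		if (pages[i].movable) {
			*movable_idx = i;
			return 0;
		}
		if (pages[i].offline && pages[i].refcount)
			return -EBUSY;
	}
	return -ENOENT;
}

/* Model of the offlining loop: notify, migrate until done, or bail out. */
static int try_offline(struct page_model *pages, size_t n, bool driver_agrees)
{
	size_t idx = 0;
	int ret;

	if (driver_agrees)
		driver_notify(pages, n, GOING_OFFLINE);

	while ((ret = scan_movable(pages, 0, n, &idx)) == 0) {
		/* Simulated successful migration: the page is now free. */
		pages[idx].movable = false;
		pages[idx].offline = false;
		pages[idx].refcount = 0;
	}

	if (ret != -ENOENT) {
		/* Offlining failed: the driver has to take its reference back. */
		if (driver_agrees)
			driver_notify(pages, n, CANCEL_OFFLINE);
		return ret;
	}
	return 0;
}
```

In this model a driver that does not drop its reference makes the scan return -EBUSY right away, which corresponds to the "unmovable page" bail-out the patch adds to __offline_pages().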