Alexander Duyck
2021-Jan-06 16:59 UTC
[PATCH 1/6] mm: Add batch size for free page reporting
On Tue, Jan 5, 2021 at 7:47 PM Liang Li <liliang324 at gmail.com> wrote:
>
> Use the page order as the only threshold for page reporting
> is not flexible and has some flaws. Because scan a long free
> list is not cheap, it's better to wake up the page reporting
> worker when there are more pages, wake it up for a sigle page
> may not worth.
> This patch add a batch size as another threshold to control the
> waking up of reporting worker.
>
> Cc: Alexander Duyck <alexander.h.duyck at linux.intel.com>
> Cc: Mel Gorman <mgorman at techsingularity.net>
> Cc: Andrea Arcangeli <aarcange at redhat.com>
> Cc: Dan Williams <dan.j.williams at intel.com>
> Cc: Dave Hansen <dave.hansen at intel.com>
> Cc: David Hildenbrand <david at redhat.com>
> Cc: Michal Hocko <mhocko at kernel.org>
> Cc: Andrew Morton <akpm at linux-foundation.org>
> Cc: Alex Williamson <alex.williamson at redhat.com>
> Cc: Michael S. Tsirkin <mst at redhat.com>
> Cc: Liang Li <liliang324 at gmail.com>
> Signed-off-by: Liang Li <liliangleo at didiglobal.com>

So you are going to need a lot more explanation for this. Page
reporting already had the concept of batching as you could only scan
once every 2 seconds as I recall. Thus the "PAGE_REPORTING_DELAY".
The change you are making doesn't make any sense without additional
context.

> ---
>  mm/page_reporting.c |  1 +
>  mm/page_reporting.h | 12 ++++++++++--
>  2 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/mm/page_reporting.c b/mm/page_reporting.c
> index cd8e13d41df4..694df981ddd2 100644
> --- a/mm/page_reporting.c
> +++ b/mm/page_reporting.c
> @@ -12,6 +12,7 @@
>
>  #define PAGE_REPORTING_DELAY    (2 * HZ)
>  static struct page_reporting_dev_info __rcu *pr_dev_info __read_mostly;
> +unsigned long page_report_batch_size  __read_mostly = 16 * 1024 * 1024UL;
>
>  enum {
>          PAGE_REPORTING_IDLE = 0,
> diff --git a/mm/page_reporting.h b/mm/page_reporting.h
> index 2c385dd4ddbd..b8fb3bbb345f 100644
> --- a/mm/page_reporting.h
> +++ b/mm/page_reporting.h
> @@ -12,6 +12,8 @@
>
>  #define PAGE_REPORTING_MIN_ORDER        pageblock_order
>
> +extern unsigned long page_report_batch_size;
> +
>  #ifdef CONFIG_PAGE_REPORTING
>  DECLARE_STATIC_KEY_FALSE(page_reporting_enabled);
>  void __page_reporting_notify(void);
> @@ -33,6 +35,8 @@ static inline bool page_reported(struct page *page)
>   */
>  static inline void page_reporting_notify_free(unsigned int order)
>  {
> +        static long batch_size;
> +

I'm not sure it makes a ton of sense to place the value in an inline
function.
It might make more sense to put this new code in
__page_reporting_notify so that all callers would be referring to the
same batch_size value and you don't have to bother with the export of
the page_report_batch_size value.

>          /* Called from hot path in __free_one_page() */
>          if (!static_branch_unlikely(&page_reporting_enabled))
>                  return;
> @@ -41,8 +45,12 @@ static inline void page_reporting_notify_free(unsigned int order)
>          if (order < PAGE_REPORTING_MIN_ORDER)
>                  return;
>
> -        /* This will add a few cycles, but should be called infrequently */
> -        __page_reporting_notify();
> +        batch_size += (1 << order) << PAGE_SHIFT;
> +        if (batch_size >= page_report_batch_size) {
> +                batch_size = 0;

I would probably run this in the opposite direction. Rather than
running batch_size to zero I would look at adding a "batch_remaining"
and then when it is < 0 you could then reset it back to
page_report_batch_size. Doing that you only have to read one variable
most of the time instead of doing a comparison against two.

> +                /* This add a few cycles, but should be called infrequently */
> +                __page_reporting_notify();
> +        }
>  }
>  #else /* CONFIG_PAGE_REPORTING */
>  #define page_reported(_page)    false
> --
> 2.18.2
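
As a rough illustration of the countdown idea described in the reply
above, here is a minimal userspace sketch. It is not kernel code and
not the patch that was eventually merged. The names batch_remaining
and page_report_batch_size come from the thread; everything prefixed
with sketch_ or SKETCH_, the 4 KiB page size, and the wakeup counter
are assumptions made only so the example compiles on its own. The
counter lives in a single non-inline function, which is one possible
reading of the suggestion to keep the state in one place rather than
in an inline helper.

```c
#include <stdbool.h>

/* Assumed for illustration: 4 KiB pages and the 16 MiB default from the patch. */
#define SKETCH_PAGE_SHIFT   12
#define SKETCH_BATCH_BYTES  (16L * 1024 * 1024)

/* Mirrors page_report_batch_size from the quoted patch. */
static long page_report_batch_size = SKETCH_BATCH_BYTES;

/* Counts down from the batch size instead of counting up from zero. */
static long batch_remaining = SKETCH_BATCH_BYTES;

/* Hypothetical stand-in for waking the reporting worker. */
static unsigned int sketch_wakeups;

static void sketch_wake_worker(void)
{
        sketch_wakeups++;
}

/*
 * Account for one freed block of the given order. Returns true when the
 * batch is exhausted and the worker was woken.
 *
 * The common path touches only batch_remaining; page_report_batch_size
 * is read only on the reset path. The "<= 0" test mirrors the ">="
 * comparison in the quoted patch; the reply itself mentions "< 0".
 */
static bool sketch_notify_free(unsigned int order)
{
        batch_remaining -= (1L << order) << SKETCH_PAGE_SHIFT;
        if (batch_remaining > 0)
                return false;

        batch_remaining = page_report_batch_size;
        sketch_wake_worker();
        return true;
}
```

With these assumed values, an order-9 free accounts for 2 MiB, so the
eighth consecutive call to sketch_notify_free(9) is the first one that
wakes the worker, after which the countdown starts over.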