thr3ads.net - Linux Virtualization - [PATCH net-next 1/3] net: allow > 0 order atomic page alloc in skb_page_frag


Debabrata Banerjee

2014-Jan-03 00:42 UTC

[PATCH net-next 1/3] net: allow > 0 order atomic page alloc in skb_page_frag_refill

Currently, because of how mm behaves (3.10.y), the code is a problem
even before the patch. I believe what may fix it is, instead of just
removing the conditional on __GFP_WAIT, to make the initial order > 0
allocations GFP_ATOMIC and then fall back to the original gfp mask
for the order-0 allocation.

On systems that have highly fragmented main memory under pressure,
skb_page_frag_refill() causes problems. mm enters significant
compaction cycles on all CPUs, which in itself is bad (adding
considerable spinlock contention in isolate_migratepages_range() for
several seconds in kernel at 100% CPU). However, even without this
happening, we basically have large memory reclamation when only
order-3 allocations were necessary. For example, I might see half
the existing page cache on a system (2GB out of 8GB) reclaimed in a
burst, which effectively means the application has to wait even longer
after this compact/reclaim cycle for those pages to be read back from
disk. This is a significant reduction in useful memory from before
skb_page_frag_refill() existed, as one of our systems could run in
steady state with little free memory and 100% fragmentation. Now I see
10-30x more memory free (read: not utilized). Order > 0 allocations
were happening rarely before; now they happen consistently from this
function.

My suggestion above would avoid mm going through
__alloc_pages_direct_compact() and triggering the bad events above. It
will take me several days to try this experiment.
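
A minimal userspace sketch of the suggestion above, not kernel code.
It assumes a mock allocator, and every name in it (SK_GFP_WAIT,
SK_GFP_NOWAIT, mock_alloc, refill_proposed) is a hypothetical stand-in
with illustrative flag values, not the real gfp definitions. It only
shows the policy: attempts of order > 0 never carry the "may sleep"
bit, and the caller's original mask is used for order 0 alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for gfp bits; the values are illustrative only. */
#define SK_GFP_WAIT   0x1u  /* caller may sleep: reclaim/compaction allowed */
#define SK_GFP_NOWAIT 0x0u  /* models an atomic (non-sleeping) request */
#define SK_MAX_ORDER  3     /* largest order tried, order-3 as in the thread */

struct frag {
    int order;              /* order of the page that was obtained */
    unsigned int gfp_used;  /* mask used for the successful attempt */
    bool ok;                /* false if every attempt failed */
};

/* Mock memory state, assumed for the sketch. */
struct mock_mm {
    int max_free_order;       /* largest order currently available */
    int blocking_high_order;  /* order > 0 attempts that were allowed to sleep */
};

/* Mock allocator: succeeds if a block of that order is free, and counts
 * the high-order attempts that could have entered compaction/reclaim. */
static bool mock_alloc(struct mock_mm *mm, unsigned int gfp, int order)
{
    assert(mm != NULL);
    if (order > 0 && (gfp & SK_GFP_WAIT))
        mm->blocking_high_order++;
    return order <= mm->max_free_order;
}

/* Proposed policy: non-sleeping mask for order > 0, caller's mask for order 0. */
static struct frag refill_proposed(struct mock_mm *mm, unsigned int prio)
{
    struct frag f = { 0, 0, false };

    for (int order = SK_MAX_ORDER; order >= 0; order--) {
        unsigned int gfp = order ? SK_GFP_NOWAIT : prio;

        if (mock_alloc(mm, gfp, order)) {
            f.order = order;
            f.gfp_used = gfp;
            f.ok = true;
            return f;
        }
    }
    return f;
}
```

With only order-0 pages free, refill_proposed() falls through to order 0
using the caller's mask, and blocking_high_order stays at zero, which is
the property the suggestion is after.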

-Debabrata

On Tue, Dec 24, 2013 at 5:46 PM, David Miller <davem at davemloft.net> wrote:
>
> There is still feedback and/or minor adjustments being asked for wrt.
> this series. These changes have been sitting for more than a week
> which is a bit ridiculous.
>
> Please resubmit these changes once everything is resolved to
> everyone's satisfaction, thanks.

Eric Dumazet

2014-Jan-03 00:56 UTC

[PATCH net-next 1/3] net: allow > 0 order atomic page alloc in skb_page_frag_refill

On Thu, 2014-01-02 at 19:42 -0500, Debabrata Banerjee wrote:
> Currently, because of how mm behaves (3.10.y), the code is a problem
> even before the patch. I believe what may fix it is, instead of just
> removing the conditional on __GFP_WAIT, to make the initial order > 0
> allocations GFP_ATOMIC and then fall back to the original gfp mask
> for the order-0 allocation.
> 
> On systems that have highly fragmented main memory under pressure,
> skb_page_frag_refill() causes problems. mm enters significant
> compaction cycles on all CPUs, which in itself is bad (adding
> considerable spinlock contention in isolate_migratepages_range() for
> several seconds in kernel at 100% CPU). However, even without this
> happening, we basically have large memory reclamation when only
> order-3 allocations were necessary. For example, I might see half
> the existing page cache on a system (2GB out of 8GB) reclaimed in a
> burst, which effectively means the application has to wait even longer
> after this compact/reclaim cycle for those pages to be read back from
> disk. This is a significant reduction in useful memory from before
> skb_page_frag_refill() existed, as one of our systems could run in
> steady state with little free memory and 100% fragmentation. Now I see
> 10-30x more memory free (read: not utilized). Order > 0 allocations
> were happening rarely before; now they happen consistently from this
> function.
> 
> My suggestion above would avoid mm going through
> __alloc_pages_direct_compact() and triggering the bad events above. It
> will take me several days to try this experiment.

My suggestion is to use a recent kernel, and/or eventually backport the
mm fixes if any.

order-3 allocations should not reclaim 2GB out of 8GB.

There is a reason PAGE_ALLOC_COSTLY_ORDER exists and is 3

Eric Dumazet

2014-Jan-03 01:26 UTC

[PATCH net-next 1/3] net: allow > 0 order atomic page alloc in skb_page_frag_refill

On Thu, 2014-01-02 at 16:56 -0800, Eric Dumazet wrote:
> 
> My suggestion is to use a recent kernel, and/or eventually backport the
> mm fixes if any.
> 
> order-3 allocations should not reclaim 2GB out of 8GB.
> 
> There is a reason PAGE_ALLOC_COSTLY_ORDER exists and is 3

Hmm... it looks like I missed __GFP_NORETRY



diff --git a/net/core/sock.c b/net/core/sock.c
index 5393b4b719d7..5f42a4d70cb2 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1872,7 +1872,7 @@ bool skb_page_frag_refill(unsigned int sz, struct page_frag *pfrag, gfp_t prio)
 		gfp_t gfp = prio;
 
 		if (order)
-			gfp |= __GFP_COMP | __GFP_NOWARN;
+			gfp |= __GFP_COMP | __GFP_NOWARN | __GFP_NORETRY;
 		pfrag->page = alloc_pages(gfp, order);
 		if (likely(pfrag->page)) {
 			pfrag->offset = 0;
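
As a rough illustration of what the one-line change does to the mask, here
is a userspace sketch. The M_GFP_* constants and frag_gfp_for_order() are
hypothetical names with made-up bit values, not the kernel's definitions.
It only mirrors the shape of the hunk: the extra bits, now including the
no-retry bit, are added for order > 0 and the caller's mask is left
untouched for order 0.

```c
#include <assert.h>

/* Hypothetical stand-ins for gfp bits; the values are illustrative only. */
#define M_GFP_WAIT    0x1u
#define M_GFP_COMP    0x2u
#define M_GFP_NOWARN  0x4u
#define M_GFP_NORETRY 0x8u
#define M_MAX_ORDER   3

/* Mirrors the patched "if (order)" branch: high-order attempts get the
 * compound, no-warn and no-retry bits; order 0 keeps the caller's mask. */
static unsigned int frag_gfp_for_order(unsigned int prio, int order)
{
    unsigned int gfp = prio;

    assert(order >= 0 && order <= M_MAX_ORDER);
    if (order)
        gfp |= M_GFP_COMP | M_GFP_NOWARN | M_GFP_NORETRY;
    return gfp;
}
```

The order-0 fallback is unaffected by the change, so it is still attempted
with whatever mask the caller passed in.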
