Linus Torvalds
2018-Jul-11 16:23 UTC
[PATCH v35 1/5] mm: support to get hints of free page blocks
On Wed, Jul 11, 2018 at 2:21 AM Michal Hocko <mhocko at kernel.org> wrote:
>
> We already have an interface for that. alloc_pages(GFP_NOWAIT, MAX_ORDER -1).
> So why do we need any array based interface?

That was actually my original argument in the original thread - that
the only new interface people might want is one that just tells how
many of those MAX_ORDER-1 pages there are.

See the thread in v33 with the subject

  "[PATCH v33 1/4] mm: add a function to get free page blocks"

and look for me suggesting just using

    #define GFP_MINFLAGS (__GFP_NORETRY | __GFP_NOWARN | __GFP_THISNODE | __GFP_NOMEMALLOC)

    struct page *page = alloc_pages(GFP_MINFLAGS, MAX_ORDER-1);

for this all.

But I could also see an argument for "allocate N pages of size
MAX_ORDER-1", with some small N, simply because I can see the
advantage of not taking and releasing the locking and looking up the
zone individually N times.

If you want to get gigabytes of memory (or terabytes), doing it in
bigger chunks than one single maximum-sized page sounds fairly
reasonable.

I just don't think that "thousands of pages" is reasonable. But "tens
of max-sized pages" sounds fair enough to me, and it would certainly
not be a pain for the VM.

So I'm open to new interfaces. I just want those new interfaces to
make sense, and be low latency and simple for the VM to do. I'm
objecting to the incredibly baroque and heavy-weight one that can
return near-infinite amounts of memory.

The real advantage of just the existing "alloc_pages()" model is that
I think the ballooning people can use that to *test* things out. If it
turns out that taking and releasing the VM locks is a big cost, we can
see if a batch interface that allows you to get tens of pages at the
same time is worth it.

So yes, I'd suggest starting with just the existing alloc_pages. Maybe
it's not enough, but it should be good enough for testing.

              Linus
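The approach suggested above (repeatedly ask for one max-order block with non-blocking flags, stop at the first failure, and cap a batch at "tens" of blocks) can be modelled in userspace roughly as follows. This is only a sketch and not kernel code: `model_zone`, `model_alloc_max_order()`, `model_collect_hints()`, `model_release_hints()` and all `MODEL_*` constants are hypothetical names, and the stub allocator merely stands in for the `alloc_pages(GFP_MINFLAGS, MAX_ORDER-1)` call quoted in the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Illustrative assumptions, not values taken from the thread. */
#define MODEL_PAGE_SIZE   4096u
#define MODEL_MAX_ORDER   11u  /* largest block is order MODEL_MAX_ORDER - 1 */
#define MODEL_BATCH_LIMIT 32u  /* "tens of max-sized pages", not thousands */

/* Hypothetical stand-in for a memory zone's free list. */
struct model_zone {
    size_t free_max_order_blocks;
};

/* Token describing one max-order block handed out by the stub. */
struct model_block {
    size_t bytes;
};

/*
 * Stub for alloc_pages(GFP_MINFLAGS, MAX_ORDER-1): it never waits or
 * retries, and returns NULL as soon as no max-order block is free.
 */
static void *model_alloc_max_order(struct model_zone *zone)
{
    struct model_block *block;

    if (zone->free_max_order_blocks == 0)
        return NULL;

    block = malloc(sizeof(*block));
    if (!block)
        return NULL;

    block->bytes = (size_t)MODEL_PAGE_SIZE << (MODEL_MAX_ORDER - 1);
    zone->free_max_order_blocks--;
    return block;
}

/*
 * Collect up to MODEL_BATCH_LIMIT blocks (or `capacity`, if smaller),
 * stopping at the first failed allocation. Returns the number collected.
 */
static size_t model_collect_hints(struct model_zone *zone,
                                  void **blocks, size_t capacity)
{
    size_t n = 0;

    while (n < capacity && n < MODEL_BATCH_LIMIT) {
        void *block = model_alloc_max_order(zone);

        if (!block)
            break;
        blocks[n++] = block;
    }
    return n;
}

/* Hand every collected block back to the zone. */
static void model_release_hints(struct model_zone *zone,
                                void **blocks, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        free(blocks[i]);
        blocks[i] = NULL;
        zone->free_max_order_blocks++;
    }
}
```

A caller would fill an array with `model_collect_hints()`, report the blocks to the host, and then return them with `model_release_hints()`. Keeping each batch small is what lets such a loop give memory back quickly if the guest suddenly needs it.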
Wei Wang
2018-Jul-12 02:21 UTC
[PATCH v35 1/5] mm: support to get hints of free page blocks
On 07/12/2018 12:23 AM, Linus Torvalds wrote:
> On Wed, Jul 11, 2018 at 2:21 AM Michal Hocko <mhocko at kernel.org> wrote:
>> We already have an interface for that. alloc_pages(GFP_NOWAIT, MAX_ORDER -1).
>> So why do we need any array based interface?
> That was actually my original argument in the original thread - that
> the only new interface people might want is one that just tells how
> many of those MAX_ORDER-1 pages there are.
>
> See the thread in v33 with the subject
>
> "[PATCH v33 1/4] mm: add a function to get free page blocks"
>
> and look for me suggesting just using
>
> #define GFP_MINFLAGS (__GFP_NORETRY | __GFP_NOWARN |
> __GFP_THISNODE | __GFP_NOMEMALLOC)

Would it be better to remove __GFP_THISNODE? We actually want to get all
the guest free pages (from all the nodes).

Best,
Wei
Linus Torvalds
2018-Jul-12 02:30 UTC
[PATCH v35 1/5] mm: support to get hints of free page blocks
On Wed, Jul 11, 2018 at 7:17 PM Wei Wang <wei.w.wang at intel.com> wrote:
>
> Would it be better to remove __GFP_THISNODE? We actually want to get all
> the guest free pages (from all the nodes).

Maybe. Or maybe it would be better to have the memory balloon logic be
per-node? Maybe you don't want to remove too much memory from one
node? I think it's one of those "play with it" things.

I don't think that's the big issue, actually. I think the real issue
is how to react quickly and gracefully to "oops, I'm trying to give
memory away, but now the guest wants it back" while you're in the
middle of trying to create that 2TB list of pages.

IOW, I think the real work is in whatever tuning for the right
behavior. But I'm just guessing.

             Linus
Michal Hocko
2018-Jul-12 13:12 UTC
[PATCH v35 1/5] mm: support to get hints of free page blocks
[Hmm this one somehow got stuck in my outgoing emails]

On Wed 11-07-18 09:23:54, Linus Torvalds wrote:
[...]
> So I'm open to new interfaces. I just want those new interfaces to
> make sense, and be low latency and simple for the VM to do. I'm
> objecting to the incredibly baroque and heavy-weight one that can
> return near-infinite amounts of memory.

Mel was suggesting a bulk page allocator a year ago [1]. I can see only
the slab bulk API, so I am not sure what happened with that work. Anyway,
I think that starting with what we have right now is much more
appropriate than over-designing this thing from the very beginning.

[1] http://lkml.kernel.org/r/20170109163518.6001-5-mgorman at techsingularity.net
-- 
Michal Hocko
SUSE Labs