Dan Magenheimer
2008-Oct-11 21:44 UTC
[Xen-devel] [RFC] Replacing Xen''s xmalloc engine and(?) API
Xen hypervisor experts -- I''m looking at replacing the engine of Xen''s xmalloc/xfree (dynamic memory allocation) with a low fragmentation, fixed overhead dynamic allocator called TLSF ("two-level sequential fit"). TLSF info can be found here: http://rtportal.upv.es/rtmalloc/ TLSF has been modified to work with Linux and for an unlimited collection of page-sized regions by Nitin Gupta here: http://code.google.com/p/compcache/wiki/TLSFAllocator The code I started with (thanks Nitin!) is here: http://code.google.com/p/compcache/source/browse/trunk/sub-projects/allocators/tlsf-kmod (Note that a lot of this code can be stripped out for Xen.) Xen''s current xmalloc engine (essentially a first-fit algorithm) is adequate for current needs, but I''m working on a project (see "Paravirtualized Paging" at WIOV in December) that emphasizes its weaknesses, the largest being that maximum calltime is essentially unbounded, especially when allocating a large number of near-page-sized items. (I''m measuring millions of cycles to do some xfree''s.) Another project that may soon make it into Xen (see "Difference Engine" at OSDI in December) has similarly intense near-page-sized needs. Anyway, I''ve hacked a version of TLSF into Xen, enough to meet my needs for now, but one of the xmalloc weaknesses (from my pov anyway) is built into the API: xmalloc''ing a page actually allocates two contiguous pages, xmalloc''ing two pages actually allocates four and so on. Worse, xmalloc''ing slightly LESS than a page, actually allocates two contiguous pages, etc., because the current mechanism requires space for a block header. As a result, I''d like to propose a change to the xmalloc interface to make this issue more explicit: I''d like to change xmalloc/xfree to FAIL on allocation sizes greater than PAGE_SIZE - DELTA, where DELTA is a defined constant. Callers that require an allocation larger than that MUST use the page_alloc (and corresponding page_free) interfaces. In other words, for any dynamic allocation code that needs a dynamically computed size that might exceed a page, the test must be done on the caller-side... and the caller is responsible for remembering whether the subpage allocator or the page-plus allocator was used, and free''ing with the matching subpage-free or page-plus-free routine. While I''d never propose this unforgiving API for user-land code, I think it isn''t unreasonable in a hypervisor. One compromise might be to retain the existing interface for xmalloc_array() as calls to that routine are much more likely to exceed a page. Another might be to retain the existing xmalloc interface... and maybe even the existing engine... but supplement it with a new xmalloc_subpage() interface and change caller-side code over time to use the new interface(/engine). Comments? (on both the API proposal and the engine) Thanks, Dan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Stefan de Konink
2008-Oct-11 22:12 UTC
Re: [Xen-devel] [RFC] Replacing Xen''s xmalloc engine and(?) API
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512 Dan Magenheimer schreef:> Xen hypervisor experts --...> Comments? (on both the API proposal and the engine)Is this engine capable of progressing to a memory de-duplication style? Or is that outside the scope? Stefan -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.9 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iEYEAREKAAYFAkjxJMkACgkQYH1+F2Rqwn3H0wCfas1yNV+EXXwRuUVGZE5OpkKs WVQAn0ViQ/B3HtXaZLj7vPM0CHKOXc8B =2OZG -----END PGP SIGNATURE----- _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Dan Magenheimer
2008-Oct-11 22:59 UTC
RE: [Xen-devel] [RFC] Replacing Xen''s xmalloc engine and(?) API
Yes. Well, for basic page de-duplication it probably is irrelevant, but for "advanced features" it will be more capable because it will handle near-page-size allocations much better.> -----Original Message----- > From: Stefan de Konink [mailto:stefan@konink.de] > Sent: Saturday, October 11, 2008 4:12 PM > To: Dan Magenheimer > Cc: Xen-Devel (E-mail); Diwaker Gupta; nitingupta910@gmail.com; Kurt > Hackel > Subject: Re: [Xen-devel] [RFC] Replacing Xen''s xmalloc engine > and(?) API > > > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA512 > > Dan Magenheimer schreef: > > Xen hypervisor experts -- > > ... > > > Comments? (on both the API proposal and the engine) > > Is this engine capable of progressing to a memory > de-duplication style? > Or is that outside the scope? > > > Stefan > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v2.0.9 (GNU/Linux) > Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org > > iEYEAREKAAYFAkjxJMkACgkQYH1+F2Rqwn3H0wCfas1yNV+EXXwRuUVGZE5OpkKs > WVQAn0ViQ/B3HtXaZLj7vPM0CHKOXc8B > =2OZG > -----END PGP SIGNATURE----- > >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2008-Oct-12 09:16 UTC
Re: [Xen-devel] [RFC] Replacing Xen''s xmalloc engine and(?) API
On 11/10/08 22:44, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:> As a result, I''d like to propose a change to the xmalloc interface > to make this issue more explicit: I''d like to change xmalloc/xfree > to FAIL on allocation sizes greater than PAGE_SIZE - DELTA, where > DELTA is a defined constant. Callers that require an allocation > larger than that MUST use the page_alloc (and corresponding > page_free) interfaces. In other words, for any dynamic allocation > code that needs a dynamically computed size that might exceed a > page, the test must be done on the caller-side... and the caller > is responsible for remembering whether the subpage allocator or > the page-plus allocator was used, and free''ing with the matching > subpage-free or page-plus-free routine. While I''d never propose > this unforgiving API for user-land code, I think it isn''t unreasonable > in a hypervisor.This sounds crazy to me. xmalloc() should work like malloc(). -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2008-Oct-12 10:25 UTC
Re: [Xen-devel] [RFC] Replacing Xen''s xmalloc engine and(?) API
On 12/10/08 10:16, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:> On 11/10/08 22:44, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote: > >> As a result, I''d like to propose a change to the xmalloc interface >> to make this issue more explicit: I''d like to change xmalloc/xfree >> to FAIL on allocation sizes greater than PAGE_SIZE - DELTA, where >> DELTA is a defined constant. Callers that require an allocation >> larger than that MUST use the page_alloc (and corresponding >> page_free) interfaces. In other words, for any dynamic allocation >> code that needs a dynamically computed size that might exceed a >> page, the test must be done on the caller-side... and the caller >> is responsible for remembering whether the subpage allocator or >> the page-plus allocator was used, and free''ing with the matching >> subpage-free or page-plus-free routine. While I''d never propose >> this unforgiving API for user-land code, I think it isn''t unreasonable >> in a hypervisor. > > This sounds crazy to me. xmalloc() should work like malloc().For example, why not take Linux''s SLUB allocator? The fact it''s tried and tested in a real-world environment not unlike our own is a big advantage to my mind. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Dan Magenheimer
2008-Oct-12 16:28 UTC
RE: [Xen-devel] [RFC] Replacing Xen''s xmalloc engine and(?) API
> > On 11/10/08 22:44, "Dan Magenheimer" > <dan.magenheimer@oracle.com> wrote: > > > >> As a result, I''d like to propose a change to the xmalloc interface > >> to make this issue more explicit: I''d like to change xmalloc/xfree > >> to FAIL on allocation sizes greater than PAGE_SIZE - DELTA, where > >> DELTA is a defined constant. Callers that require an allocation > >> larger than that MUST use the page_alloc (and corresponding > >> page_free) interfaces. In other words, for any dynamic allocation > >> code that needs a dynamically computed size that might exceed a > >> page, the test must be done on the caller-side... and the caller > >> is responsible for remembering whether the subpage allocator or > >> the page-plus allocator was used, and free''ing with the matching > >> subpage-free or page-plus-free routine. While I''d never propose > >> this unforgiving API for user-land code, I think it isn''t > >> unreasonable in a hypervisor. > > This sounds crazy to me. xmalloc() should work like malloc().While I agree completely in principle, abstractions can be costly. In this case, the fact that xmalloc(PAGE_SIZE-1) actually uses 2*PAGE_SIZE of space (and fails even if there are lots of free pages but no pair of contiguous pages) is an undocumented consequence of the underlying implementation... which is not entirely unreasonable in user-land but is IMHO questionable in the guts of a hypervisor where memory shouldn''t be accidentally wasted. To date, Xen hasn''t focused much on optimizing memory usage but I think that will change over the next year or two.> For example, why not take Linux''s SLUB allocator? The fact > it''s tried and > tested in a real-world environment not unlike our own is a > big advantage to > my mind.The slub allocator is still quite a bit more complex than xmalloc or TLSF and its not clear that that complexity will add much value, or solve the worst-case issues when allocating a huge number of different-sized near-page-size units. TLSF is very similar to the existing xmalloc except that it has better response time, MUCH better worst-case response time, and better fragmentation characteristics. Adding a slub allocator to Xen might be useful to solve some other problems but it doesn''t help me (or future de-duplication code I don''t think). (BTW, my proposal above to change the API is orthogonal to this discussion.) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2008-Oct-12 16:47 UTC
Re: [Xen-devel] [RFC] Replacing Xen''s xmalloc engine and(?) API
On 12/10/08 17:28, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:>> This sounds crazy to me. xmalloc() should work like malloc(). > > While I agree completely in principle, abstractions can be costly. > In this case, the fact that xmalloc(PAGE_SIZE-1) actually uses > 2*PAGE_SIZE of space (and fails even if there are lots of free > pages but no pair of contiguous pages) is an undocumented consequence > of the underlying implementation... which is not entirely > unreasonable in user-land but is IMHO questionable in the guts > of a hypervisor where memory shouldn''t be accidentally wasted. > > To date, Xen hasn''t focused much on optimizing memory usage but > I think that will change over the next year or two.This wastage should be empirically measured by instrumentation and then optimisations made where worthwhile. That is somewhat orthogonal to the issue of what represents a sensible interface to xmalloc(), except to say that I would generally prefer a more sophisticated and efficient mechanism behind a simple interface, rather than punt complexity into callers (by making the costs hidden by the simple interface excessive and hence unusable; or by complicating the interface with weird constraints). My point in bringing up SLUB is that I assume it''s an allocator designed to work reasonably well across a range of allocation-request-size distributions, including those containing requests of size x-pages-minus-a-bit. I''d rather have a more complicated allocator than a more complicated xmalloc() interface. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Dan Magenheimer
2008-Oct-12 17:31 UTC
RE: [Xen-devel] [RFC] Replacing Xen''s xmalloc engine and(?) API
> >> This sounds crazy to me. xmalloc() should work like malloc(). > > > > While I agree completely in principle, abstractions can be costly. > > In this case, the fact that xmalloc(PAGE_SIZE-1) actually uses > > 2*PAGE_SIZE of space (and fails even if there are lots of free > > pages but no pair of contiguous pages) is an undocumented > consequence > > of the underlying implementation... which is not entirely > > unreasonable in user-land but is IMHO questionable in the guts > > of a hypervisor where memory shouldn''t be accidentally wasted. > > > > To date, Xen hasn''t focused much on optimizing memory usage but > > I think that will change over the next year or two. > > This wastage should be empirically measured by > instrumentation and then > optimisations made where worthwhile. That is somewhat > orthogonal to the > issue of what represents a sensible interface to xmalloc(), > except to say > that I would generally prefer a more sophisticated and > efficient mechanism > behind a simple interface, rather than punt complexity into > callers (by > making the costs hidden by the simple interface excessive and hence > unusable; or by complicating the interface with weird constraints).Fair enough. I will mimic the xmalloc API then. However, I *would* like to export (via #define in xmalloc.h or function call or something) the definition of DELTA (e.g. the xmalloc space overhead) so my caller-side code can avoid the wastage. I never want to accidentally xmalloc two pages when heap-alloc''ing one page will do.> My point in bringing up SLUB is that I assume it''s an > allocator designed to > work reasonably well across a range of allocation-request-size > distributions, including those containing requests of size > x-pages-minus-a-bit. I''d rather have a more complicated > allocator than a > more complicated xmalloc() interface.I''m no slab/slub expert but I think the interface only works well with fixed-size objects and when several of the fixed-size objects can be crammed into a single page. I have a large set of objects that are essentially random in size (but all less than or equal to a page). Dan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2008-Oct-12 17:42 UTC
Re: [Xen-devel] [RFC] Replacing Xen''s xmalloc engine and(?) API
On 12/10/08 18:31, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:>> This wastage should be empirically measured by >> instrumentation and then >> optimisations made where worthwhile. That is somewhat >> orthogonal to the >> issue of what represents a sensible interface to xmalloc(), >> except to say >> that I would generally prefer a more sophisticated and >> efficient mechanism >> behind a simple interface, rather than punt complexity into >> callers (by >> making the costs hidden by the simple interface excessive and hence >> unusable; or by complicating the interface with weird constraints). > > Fair enough. I will mimic the xmalloc API then. However, I *would* > like to export (via #define in xmalloc.h or function call or something) > the definition of DELTA (e.g. the xmalloc space overhead) so my > caller-side code can avoid the wastage. I never want to accidentally > xmalloc two pages when heap-alloc''ing one page will do.A better xmalloc() implementation would allocate the necessary two pages from alloc_heap_pages() and put the remaining just-less-than-a-page on its free lists, rather than waste it. Or put the allocation metadata out-of-band (e.g., in page_info, like SLUB) so that there is no DELTA. *However*, exposing DELTA for those callers who care about it, given limitations of current xmalloc() implementation, is a reasonable way to go as far as I''m concerned.>> My point in bringing up SLUB is that I assume it''s an >> allocator designed to >> work reasonably well across a range of allocation-request-size >> distributions, including those containing requests of size >> x-pages-minus-a-bit. I''d rather have a more complicated >> allocator than a >> more complicated xmalloc() interface. > > I''m no slab/slub expert but I think the interface only works > well with fixed-size objects and when several of the fixed-size > objects can be crammed into a single page. I have a large set > of objects that are essentially random in size (but all less > than or equal to a page).Well, you''d end up using the power-of-two sized caches. And since SLUB doesn''t put metadata in the data pages, the 4096-byte object cache will serve up true single pages. But yes, this is an aspect I hadn''t fully considered. I''m not wedded to use of SLUB, it was just my arguing stick. :-) -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2008-Oct-12 17:55 UTC
Re: [Xen-devel] [RFC] Replacing Xen''s xmalloc engine and(?) API
On 12/10/08 18:44, "Nitin Gupta" <nitingupta910@gmail.com> wrote:>> I''m no slab/slub expert but I think the interface only works >> well with fixed-size objects and when several of the fixed-size >> objects can be crammed into a single page. I have a large set >> of objects that are essentially random in size (but all less >> than or equal to a page). >> > > Using slab for your use case, which requires lot of allocations for > near page-sized objects, is really bad idea. For memory compression > project (compcache) I had very similar requirement. So, I tested > kmalloc which uses slabs of various sizes to allocate various sizes. > As expected its space efficiency was horrible. At least for compcache > workload it used ~40% more memory than TLSF. Please refer: > http://code.google.com/p/compcache/wiki/AllocatorsComparisonYes, I was just going to reply to Dan''s TLSF email to say that my default plan for improving xmalloc() was going to be to implement a sub-page buddy allocator or to port SLUB from Linux. Noth of these place metadata in the page-info structure, but both would end up allocating power-of-two-sized data blocks for random allocations of xmalloc(). The overheads of this -- for example if many allocations are just over 2048 bytes for example -- could be pretty nasty. I think TLSF sounds quite an intriquing alternative. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Dan Magenheimer
2008-Oct-12 18:03 UTC
RE: [Xen-devel] [RFC] Replacing Xen''s xmalloc engine and(?) API
Hi Nitin -- Excellent! Your xvMalloc looks even better! I''ll read the code and try it (tomorrow). Keir, is a "GNU Lesser GPL Version 3" license OK for Xen? Thanks, Dan> -----Original Message----- > From: Nitin Gupta [mailto:nitingupta910@gmail.com] > Sent: Sunday, October 12, 2008 11:45 AM > To: Dan Magenheimer > Cc: Keir Fraser; Xen-Devel (E-mail); Diwaker Gupta; Kurt Hackel > Subject: Re: [Xen-devel] [RFC] Replacing Xen''s xmalloc engine > and(?) API > > > On Sun, Oct 12, 2008 at 11:01 PM, Dan Magenheimer > <dan.magenheimer@oracle.com> wrote: > > > > I''m no slab/slub expert but I think the interface only works > > well with fixed-size objects and when several of the fixed-size > > objects can be crammed into a single page. I have a large set > > of objects that are essentially random in size (but all less > > than or equal to a page). > > > > Using slab for your use case, which requires lot of allocations for > near page-sized objects, is really bad idea. For memory compression > project (compcache) I had very similar requirement. So, I tested > kmalloc which uses slabs of various sizes to allocate various sizes. > As expected its space efficiency was horrible. At least for compcache > workload it used ~40% more memory than TLSF. Please refer: > http://code.google.com/p/compcache/wiki/AllocatorsComparison > > Regarding non-standard TLSF interface, it should be quite simple to > have wrappers around TLSF to have standard malloc/free API. > > Also I recently completed draft implementation of a new allocator - > xvMalloc (based on TLSF) that is specifically suited for allocation > behavior of compcache (I think "difference engine" project has almost > same requirements). > http://code.google.com/p/compcache/wiki/xvMalloc > > Nitin >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2008-Oct-12 18:34 UTC
Re: [Xen-devel] [RFC] Replacing Xen''s xmalloc engine and(?) API
On 12/10/08 19:03, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:> Excellent! Your xvMalloc looks even better! I''ll read the > code and try it (tomorrow). > > Keir, is a "GNU Lesser GPL Version 3" license OK for Xen?Unfortunately I doubt it, since Xen is GPLv2 and we wouldn''t really be linking against xvMalloc as a separate library (not least because it would probably get tailored a bit to get checked into the Xen tree) so the ''Lesser'' license probably doesn''t apply. If Nitin isn''t too attached to GPLv3 it would certainly be better for us if it used GPLv2, like TLSF. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel