Michael S. Tsirkin
2016-Jul-28 21:51 UTC
[virtio-dev] Re: [PATCH v2 repost 4/7] virtio-balloon: speed up inflate/deflate process
On Thu, Jul 28, 2016 at 06:36:18AM +0000, Li, Liang Z wrote:
> > > > This ends up doing a 1MB kmalloc() right? That seems a _bit_ big.
> > > > How big was the pfn buffer before?
> > >
> > > Yes, it is if the max pfn is more than 32GB.
> > > The size of the pfn buffer use before is 256*4 = 1024 Bytes, it's too
> > > small, and it's the main reason for bad performance.
> > > Use the max 1MB kmalloc is a balance between performance and
> > > flexibility, a large page bitmap covers the range of all the memory is
> > > no good for a system with huge amount of memory. If the bitmap is too
> > > small, it means we have to traverse a long list for many times, and it's bad
> > > for performance.
> > >
> > > Thanks!
> > > Liang
> >
> > These are all your implementation decisions though.
> >
> > If guest memory is so fragmented that you only have order 0 4k pages, then
> > allocating a huge 1M contiguous chunk is very problematic in and of itself.
> >
>
> The memory is allocated in the probe stage. This will not happen if the driver is
> loaded when booting the guest.
>
> > Most people rarely migrate and do not care how fast that happens.
> > Wasting a large chunk of memory (and it's zeroed for no good reason, so you
> > actually request host memory for it) for everyone to speed it up when it
> > does happen is not really an option.
> >
> If people don't plan to do inflating/deflating, they should not enable the virtio-balloon
> at the beginning, once they decide to use it, the driver should provide better performance
> as much as possible.

The reason people inflate/deflate is so they can overcommit memory.
Do they need to overcommit very quickly? I don't see why.
So let's get what we can for free but I don't really believe people would want to pay for it.

> 1MB is a very small portion for a VM with more than 32GB memory and it's the *worst case*,
> for VM with less than 32GB memory, the amount of RAM depends on VM's memory size
> and will be less than 1MB.

It's guest memory so might all be in swap and never touched, your memset at probe time
will fault it in and make hypervisor actually pay for it.

> If 1MB is too big, how about 512K, or 256K? 32K seems too small.
>
> Liang

It's only small because it makes you rescan the free list.
So maybe you should do something else.
I looked at it a bit. Instead of scanning the free list, how about scanning actual
page structures? If page is unused, pass it to host.
Solves the problem of rescanning multiple times, does it not?

Another idea: allocate a small bitmap at probe time (e.g. for deflate), allocate
a bunch more on each request. Use something like GFP_ATOMIC and a
scatter/gather, if that fails use the smaller bitmap.
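The sizes quoted in this exchange can be checked with a little arithmetic. The sketch below is only an illustration: it assumes 4 KiB pages and one bitmap bit per page, and the helper names are made up for this note rather than taken from the patch.

```c
#include <stdint.h>

/* Hypothetical helper: bytes needed for a bitmap holding one bit per page. */
uint64_t bitmap_bytes(uint64_t mem_bytes, uint64_t page_size)
{
    uint64_t pages = mem_bytes / page_size;
    return (pages + 7) / 8;          /* round up to whole bytes */
}

/* Hypothetical helper: guest memory described by a bitmap of the given size. */
uint64_t bitmap_coverage(uint64_t bitmap_len_bytes, uint64_t page_size)
{
    return bitmap_len_bytes * 8 * page_size;
}

/* Hypothetical helper: guest memory described by an array of page frame
 * numbers, one entry per page (the old buffer held 256 4-byte entries). */
uint64_t pfn_array_coverage(uint64_t entries, uint64_t page_size)
{
    return entries * page_size;
}
```

Under these assumptions, `bitmap_bytes(32ULL << 30, 4096)` is 1 MB, matching the "1MB if the max pfn is more than 32GB" figure. The old 1024-byte pfn buffer describes only 1 MB of guest memory per request, while a 32 KB bitmap covers 1 GB per pass, which is presumably why a smaller bitmap implies rescanning the free list many times on a large guest.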
Li, Liang Z
2016-Jul-29 00:46 UTC
[virtio-dev] Re: [PATCH v2 repost 4/7] virtio-balloon: speed up inflate/deflate process
> On Thu, Jul 28, 2016 at 06:36:18AM +0000, Li, Liang Z wrote:
> > > > > This ends up doing a 1MB kmalloc() right? That seems a _bit_ big.
> > > > > How big was the pfn buffer before?
> > > >
> > > > Yes, it is if the max pfn is more than 32GB.
> > > > The size of the pfn buffer use before is 256*4 = 1024 Bytes, it's
> > > > too small, and it's the main reason for bad performance.
> > > > Use the max 1MB kmalloc is a balance between performance and
> > > > flexibility, a large page bitmap covers the range of all the
> > > > memory is no good for a system with huge amount of memory. If the
> > > > bitmap is too small, it means we have to traverse a long list for
> > > > many times, and it's bad for performance.
> > > >
> > > > Thanks!
> > > > Liang
> > >
> > > These are all your implementation decisions though.
> > >
> > > If guest memory is so fragmented that you only have order 0 4k
> > > pages, then allocating a huge 1M contiguous chunk is very problematic in
> > > and of itself.
> > >
> >
> > The memory is allocated in the probe stage. This will not happen if
> > the driver is loaded when booting the guest.
> >
> > > Most people rarely migrate and do not care how fast that happens.
> > > Wasting a large chunk of memory (and it's zeroed for no good reason,
> > > so you actually request host memory for it) for everyone to speed it
> > > up when it does happen is not really an option.
> > >
> > If people don't plan to do inflating/deflating, they should not enable
> > the virtio-balloon at the beginning, once they decide to use it, the
> > driver should provide better performance as much as possible.
>
> The reason people inflate/deflate is so they can overcommit memory.
> Do they need to overcommit very quickly? I don't see why.
> So let's get what we can for free but I don't really believe people would want
> to pay for it.
>
> > 1MB is a very small portion for a VM with more than 32GB memory and
> > it's the *worst case*, for VM with less than 32GB memory, the amount
> > of RAM depends on VM's memory size and will be less than 1MB.
>
> It's guest memory so might all be in swap and never touched, your memset
> at probe time will fault it in and make hypervisor actually pay for it.
>
> > If 1MB is too big, how about 512K, or 256K? 32K seems too small.
> >
> > Liang
>
> It's only small because it makes you rescan the free list.
> So maybe you should do something else.
> I looked at it a bit. Instead of scanning the free list, how about scanning actual
> page structures? If page is unused, pass it to host.
> Solves the problem of rescanning multiple times, does it not?
>

Yes, agree.

> Another idea: allocate a small bitmap at probe time (e.g. for deflate), allocate
> a bunch more on each request. Use something like GFP_ATOMIC and a
> scatter/gather, if that fails use the smaller bitmap.
>

So, the aim of v3 is to use a smaller bitmap without too heavy performance penalty.

Thanks a lot!
Liang
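The "small bitmap at probe time, bigger one per request" idea quoted above can be sketched in user-space C. This is a simplified illustration, not driver code: the names are hypothetical, a single opportunistic allocation stands in for the scatter/gather list, and the injectable `try_alloc` callback stands in for whatever non-blocking allocator the driver would use (in the kernel, presumably something like `kmalloc()` with `GFP_ATOMIC`).

```c
#include <stddef.h>
#include <stdlib.h>

/* Allocator callback; returns NULL on failure (assumption for this sketch). */
typedef void *(*alloc_fn)(size_t);

struct bitmap_buf {
    unsigned char *bits;   /* bitmap storage to use for this request */
    size_t len;            /* size of that storage in bytes */
    int owned;             /* nonzero if bits must be freed after the request */
};

/* Try to get a large bitmap for this request; if the allocation fails,
 * fall back to the small buffer that was set aside at probe time. */
struct bitmap_buf get_request_bitmap(alloc_fn try_alloc, size_t want,
                                     unsigned char *small, size_t small_len)
{
    struct bitmap_buf b;
    void *p = try_alloc(want);

    if (p) {
        b.bits = p;
        b.len = want;
        b.owned = 1;
    } else {
        b.bits = small;
        b.len = small_len;
        b.owned = 0;
    }
    return b;
}

/* Test double that simulates memory pressure. */
void *always_fail(size_t n)
{
    (void)n;
    return NULL;
}
```

With this shape the common case pays nothing up front beyond the small buffer, and a failed allocation only costs extra passes with the smaller bitmap rather than a failed request.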
Dave Hansen
2016-Jul-29 19:48 UTC
[virtio-dev] Re: [PATCH v2 repost 4/7] virtio-balloon: speed up inflate/deflate process
On 07/28/2016 02:51 PM, Michael S. Tsirkin wrote:
>> > If 1MB is too big, how about 512K, or 256K? 32K seems too small.
>> >
> It's only small because it makes you rescan the free list.
> So maybe you should do something else.
> I looked at it a bit. Instead of scanning the free list, how about
> scanning actual page structures? If page is unused, pass it to host.
> Solves the problem of rescanning multiple times, does it not?

FWIW, I think the new data structure needs some work.

Before, we had a potentially very long list of 4k areas. Now, we've
just got a very large bitmap. The bitmap might not even be very dense
if we are ballooning relatively few things.

Can I suggest an alternate scheme? I think you actually need a hybrid
scheme that has bitmaps but also allows more flexibility in the pfn
ranges. The payload could be a number of records each containing 3 things:

	pfn, page order, length of bitmap (maybe in powers of 2)

Each record is followed by the bitmap. Or, if the bitmap length is 0,
immediately followed by another record. A bitmap length of 0 implies a
bitmap with the least significant bit set. Page order specifies how
many pages each bit represents.

This scheme could easily encode the new data structure you are proposing
by just setting pfn=0, order=0, and a very long bitmap length. But, it
could handle sparse bitmaps much better *and* represent large pages much
more efficiently.

There's plenty of space to fit a whole record in 64 bits.
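One way to read the record format proposed above is sketched below. The field widths (52 bits of pfn, 6 bits of page order, 6 bits of bitmap length) and all names are assumptions made for illustration; the message only says the three fields should fit in 64 bits. The sketch also assumes the length field holds log2 of the bitmap length in bits, so that a value of 0 means "no bitmap follows, one implied set bit".

```c
#include <stdint.h>

#define REC_ORDER_BITS 6
#define REC_LEN_BITS   6
#define REC_PFN_BITS   (64 - REC_ORDER_BITS - REC_LEN_BITS)   /* 52 */

/* Hypothetical unpacked form of one record. */
struct balloon_rec {
    uint64_t pfn;        /* first page frame number covered */
    unsigned order;      /* each bit stands for 2^order pages */
    unsigned len_log2;   /* bitmap holds 2^len_log2 bits; 0 = implied bit */
};

/* Pack the three fields into a single 64-bit word. */
uint64_t rec_pack(struct balloon_rec r)
{
    uint64_t pfn_mask = (1ULL << REC_PFN_BITS) - 1;

    return ((r.pfn & pfn_mask) << (REC_ORDER_BITS + REC_LEN_BITS)) |
           ((uint64_t)(r.order & 0x3f) << REC_LEN_BITS) |
           (uint64_t)(r.len_log2 & 0x3f);
}

/* Recover the three fields from a packed 64-bit word. */
struct balloon_rec rec_unpack(uint64_t v)
{
    struct balloon_rec r;

    r.len_log2 = (unsigned)(v & 0x3f);
    r.order = (unsigned)((v >> REC_LEN_BITS) & 0x3f);
    r.pfn = v >> (REC_ORDER_BITS + REC_LEN_BITS);
    return r;
}

/* Bytes of bitmap that follow the record; 0 when the single bit is implied. */
uint64_t rec_bitmap_bytes(struct balloon_rec r)
{
    if (r.len_log2 == 0)
        return 0;
    return ((1ULL << r.len_log2) + 7) / 8;
}

/* Maximum number of base pages the record can describe. */
uint64_t rec_max_pages(struct balloon_rec r)
{
    return (1ULL << r.len_log2) << r.order;
}
```

Under these assumptions a single 2 MB huge page is one 8-byte record (`order = 9`, `len_log2 = 0`, no bitmap), while the flat bitmap from the patch corresponds to `pfn = 0`, `order = 0`, `len_log2 = 23`, that is, a 1 MB bitmap covering 32 GB of 4 KiB pages.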
Li, Liang Z
2016-Aug-02 00:28 UTC
[virtio-dev] Re: [PATCH v2 repost 4/7] virtio-balloon: speed up inflate/deflate process
> > It's only small because it makes you rescan the free list.
> > So maybe you should do something else.
> > I looked at it a bit. Instead of scanning the free list, how about
> > scanning actual page structures? If page is unused, pass it to host.
> > Solves the problem of rescanning multiple times, does it not?
>
> FWIW, I think the new data structure needs some work.
>
> Before, we had a potentially very long list of 4k areas. Now, we've just got a
> very large bitmap. The bitmap might not even be very dense if we are
> ballooning relatively few things.
>
> Can I suggest an alternate scheme? I think you actually need a hybrid
> scheme that has bitmaps but also allows more flexibility in the pfn ranges.
> The payload could be a number of records each containing 3 things:
>
> 	pfn, page order, length of bitmap (maybe in powers of 2)
>
> Each record is followed by the bitmap. Or, if the bitmap length is 0,
> immediately followed by another record. A bitmap length of 0 implies a
> bitmap with the least significant bit set. Page order specifies how many
> pages each bit represents.
>
> This scheme could easily encode the new data structure you are proposing
> by just setting pfn=0, order=0, and a very long bitmap length. But, it could
> handle sparse bitmaps much better *and* represent large pages much more
> efficiently.
>
> There's plenty of space to fit a whole record in 64 bits.

I like your idea and it's more flexible, and it's very useful if we want to optimize the
page allocating stage further. I believe the memory fragmentation will not be very
serious, so the performance won't be too bad in the worst case.

Thanks!
Liang