Michael S. Tsirkin
2016-Jul-28 01:45 UTC
[PATCH v2 repost 4/7] virtio-balloon: speed up inflate/deflate process
On Thu, Jul 28, 2016 at 01:13:35AM +0000, Li, Liang Z wrote:
> > Subject: Re: [PATCH v2 repost 4/7] virtio-balloon: speed up inflate/deflate
> > process
> >
> > On 07/26/2016 06:23 PM, Liang Li wrote:
> > > + vb->pfn_limit = VIRTIO_BALLOON_PFNS_LIMIT;
> > > + vb->pfn_limit = min(vb->pfn_limit, get_max_pfn());
> > > + vb->bmap_len = ALIGN(vb->pfn_limit, BITS_PER_LONG) /
> > > +                BITS_PER_BYTE + 2 * sizeof(unsigned long);
> > > + hdr_len = sizeof(struct balloon_bmap_hdr);
> > > + vb->bmap_hdr = kzalloc(hdr_len + vb->bmap_len, GFP_KERNEL);
> >
> > This ends up doing a 1MB kmalloc() right? That seems a _bit_ big. How big
> > was the pfn buffer before?
>
> Yes, it is if the max pfn is more than 32GB.
> The size of the pfn buffer use before is 256*4 = 1024 Bytes, it's too small,
> and it's the main reason for bad performance.
> Use the max 1MB kmalloc is a balance between performance and flexibility,
> a large page bitmap covers the range of all the memory is no good for a system
> with huge amount of memory. If the bitmap is too small, it means we have
> to traverse a long list for many times, and it's bad for performance.
>
> Thanks!
> Liang

These are all your implementation decisions though.

If guest memory is so fragmented that you only have order 0 4k pages, then
allocating a huge 1M contiguous chunk is very problematic in and of itself.

Most people rarely migrate and do not care how fast that happens.
Wasting a large chunk of memory (and it's zeroed for no good reason, so you
actually request host memory for it) for everyone to speed it up when it
does happen is not really an option.

--
MST
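For readers following the size argument, the quoted bmap_len formula can be checked in user space. This is only a sketch: it assumes 4 KiB pages and assumes VIRTIO_BALLOON_PFNS_LIMIT corresponds to 32GB worth of page frames, which is inferred from the "1MB if the max pfn is more than 32GB" statement above rather than taken from the patch. The names pfns_for_gib() and bmap_len_bytes() are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT    12   /* assumption: 4 KiB pages */
#define BITS_PER_BYTE 8

/* Assumption: the limit is 32 GiB of page frames (8M pfns with 4 KiB pages). */
#define PFNS_LIMIT ((32ULL << 30) >> PAGE_SHIFT)

/* Hypothetical helper: number of page frames in a guest of `gib` GiB. */
static unsigned long long pfns_for_gib(unsigned long long gib)
{
    return (gib << 30) >> PAGE_SHIFT;
}

/* Round x up to the next multiple of a (a > 0). */
static unsigned long long align_up(unsigned long long x, unsigned long long a)
{
    assert(a > 0);
    return ((x + a - 1) / a) * a;
}

/*
 * Mirrors the quoted patch line:
 *   bmap_len = ALIGN(pfn_limit, BITS_PER_LONG) / BITS_PER_BYTE
 *              + 2 * sizeof(unsigned long);
 * where pfn_limit = min(PFNS_LIMIT, max_pfn).
 */
static unsigned long long bmap_len_bytes(unsigned long long max_pfn)
{
    unsigned long long bits_per_long = BITS_PER_BYTE * sizeof(unsigned long);
    unsigned long long pfn_limit = max_pfn < PFNS_LIMIT ? max_pfn : PFNS_LIMIT;

    return align_up(pfn_limit, bits_per_long) / BITS_PER_BYTE
           + 2 * sizeof(unsigned long);
}
```

Under these assumptions a 4 GiB guest needs about 128 KiB of bitmap, and any guest of 32 GiB or more hits the 1 MiB cap (plus two longs), compared with the old 256 * 4 = 1024 byte pfn array.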
Li, Liang Z
2016-Jul-28 06:36 UTC
[virtio-dev] Re: [PATCH v2 repost 4/7] virtio-balloon: speed up inflate/deflate process
> > > This ends up doing a 1MB kmalloc() right? That seems a _bit_ big.
> > > How big was the pfn buffer before?
> >
> > Yes, it is if the max pfn is more than 32GB.
> > The size of the pfn buffer use before is 256*4 = 1024 Bytes, it's too
> > small, and it's the main reason for bad performance.
> > Use the max 1MB kmalloc is a balance between performance and
> > flexibility, a large page bitmap covers the range of all the memory is
> > no good for a system with huge amount of memory. If the bitmap is too
> > small, it means we have to traverse a long list for many times, and it's
> > bad for performance.
> >
> > Thanks!
> > Liang
>
> These are all your implementation decisions though.
>
> If guest memory is so fragmented that you only have order 0 4k pages, then
> allocating a huge 1M contiguous chunk is very problematic in and of itself.

The memory is allocated in the probe stage. This will not happen if the driver is
loaded when booting the guest.

> Most people rarely migrate and do not care how fast that happens.
> Wasting a large chunk of memory (and it's zeroed for no good reason, so you
> actually request host memory for it) for everyone to speed it up when it
> does happen is not really an option.

If people don't plan to do inflating/deflating, they should not enable the virtio-balloon
at the beginning, once they decide to use it, the driver should provide better performance
as much as possible.

1MB is a very small portion for a VM with more than 32GB memory and it's the *worst case*,
for VM with less than 32GB memory, the amount of RAM depends on VM's memory size
and will be less than 1MB.

If 1MB is too big, how about 512K, or 256K? 32K seems too small.

Liang

> --
> MST
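To put rough numbers on the 1MB / 512K / 256K / 32K question: if each bitmap fill requires one traversal of the free list, as described above, the number of traversals is the guest's page-frame count divided by the bits in the bitmap, rounded up. The sketch below assumes 4 KiB pages and ignores the small header; scan_passes() is a hypothetical name, not something from the patch.

```c
#include <assert.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/*
 * Hypothetical helper: how many bitmap-sized windows are needed to cover
 * a guest of `guest_bytes`, i.e. how many times the free list would be
 * traversed if each traversal fills one bitmap of `bitmap_bytes`.
 */
static unsigned long long scan_passes(unsigned long long guest_bytes,
                                      unsigned long long bitmap_bytes)
{
    unsigned long long pfns = guest_bytes >> PAGE_SHIFT;
    unsigned long long pfns_per_pass = bitmap_bytes * 8; /* one bit per pfn */

    assert(pfns_per_pass > 0);
    return (pfns + pfns_per_pass - 1) / pfns_per_pass;   /* round up */
}
```

For a 32 GiB guest this gives 1 pass with a 1 MiB bitmap, 2 with 512 KiB, 4 with 256 KiB and 32 with 32 KiB. Whether the actual cost scales linearly with the pass count is not something this arithmetic can show.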
Michael S. Tsirkin
2016-Jul-28 21:51 UTC
[virtio-dev] Re: [PATCH v2 repost 4/7] virtio-balloon: speed up inflate/deflate process
On Thu, Jul 28, 2016 at 06:36:18AM +0000, Li, Liang Z wrote:
> > > > This ends up doing a 1MB kmalloc() right? That seems a _bit_ big.
> > > > How big was the pfn buffer before?
> > >
> > > Yes, it is if the max pfn is more than 32GB.
> > > The size of the pfn buffer use before is 256*4 = 1024 Bytes, it's too
> > > small, and it's the main reason for bad performance.
> > > Use the max 1MB kmalloc is a balance between performance and
> > > flexibility, a large page bitmap covers the range of all the memory is
> > > no good for a system with huge amount of memory. If the bitmap is too
> > > small, it means we have to traverse a long list for many times, and it's
> > > bad for performance.
> > >
> > > Thanks!
> > > Liang
> >
> > These are all your implementation decisions though.
> >
> > If guest memory is so fragmented that you only have order 0 4k pages, then
> > allocating a huge 1M contiguous chunk is very problematic in and of itself.
>
> The memory is allocated in the probe stage. This will not happen if the driver is
> loaded when booting the guest.
>
> > Most people rarely migrate and do not care how fast that happens.
> > Wasting a large chunk of memory (and it's zeroed for no good reason, so you
> > actually request host memory for it) for everyone to speed it up when it
> > does happen is not really an option.
>
> If people don't plan to do inflating/deflating, they should not enable the virtio-balloon
> at the beginning, once they decide to use it, the driver should provide better performance
> as much as possible.

The reason people inflate/deflate is so they can overcommit memory.
Do they need to overcommit very quickly? I don't see why.
So let's get what we can for free but I don't really believe
people would want to pay for it.

> 1MB is a very small portion for a VM with more than 32GB memory and it's the *worst case*,
> for VM with less than 32GB memory, the amount of RAM depends on VM's memory size
> and will be less than 1MB.

It's guest memory so might all be in swap and never touched,
your memset at probe time will fault it in and make hypervisor
actually pay for it.

> If 1MB is too big, how about 512K, or 256K? 32K seems too small.
>
> Liang

It's only small because it makes you rescan the free list.
So maybe you should do something else.
I looked at it a bit. Instead of scanning the free list, how about
scanning actual page structures? If page is unused, pass it to host.
Solves the problem of rescanning multiple times, does it not?

Another idea: allocate a small bitmap at probe time (e.g. for deflate),
allocate a bunch more on each request. Use something like
GFP_ATOMIC and a scatter/gather, if that fails use the smaller bitmap.

> > --
> > MST
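The second idea above (small bitmap at probe time, opportunistic larger allocation per request, fall back on failure) could look roughly like the following. This is a user-space sketch of one possible reading, not kernel code and not anything from the patch series: an injectable allocator stands in for a kmalloc() call with GFP_ATOMIC, page-sized chunks stand in for scatter/gather entries, and every name and size here (bmap_set, CHUNK_BYTES, SMALL_BYTES, and so on) is hypothetical. It also makes one extra assumption: a partial allocation is used as-is, and only a total failure falls back to the small bitmap.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK_BYTES 4096 /* hypothetical: one page per scatter/gather chunk */
#define MAX_CHUNKS  256  /* hypothetical upper bound per request */
#define SMALL_BYTES 1024 /* hypothetical size of the probe-time bitmap */

/* Stand-in for an allocation that is allowed to fail (e.g. GFP_ATOMIC). */
typedef void *(*alloc_fn)(size_t);

struct bmap_set {
    unsigned char *chunk[MAX_CHUNKS];
    size_t nchunks;
    size_t chunk_bytes;
    int using_fallback;
};

/* Stands in for the small bitmap allocated once at probe time. */
static unsigned char small_bmap[SMALL_BYTES];

/* Demo allocator that always fails, to exercise the fallback path. */
static void *failing_alloc(size_t n)
{
    (void)n;
    return NULL;
}

/* Try to get want_chunks page-sized bitmaps; fall back to the small one. */
static void bmap_set_acquire(struct bmap_set *s, size_t want_chunks,
                             alloc_fn try_alloc)
{
    size_t i;

    assert(want_chunks >= 1 && want_chunks <= MAX_CHUNKS);
    s->nchunks = 0;
    s->chunk_bytes = CHUNK_BYTES;
    s->using_fallback = 0;

    for (i = 0; i < want_chunks; i++) {
        void *p = try_alloc(CHUNK_BYTES);
        if (!p)
            break;              /* keep whatever was obtained so far */
        memset(p, 0, CHUNK_BYTES);
        s->chunk[s->nchunks++] = p;
    }

    if (s->nchunks == 0) {      /* nothing obtained: use probe-time bitmap */
        memset(small_bmap, 0, sizeof(small_bmap));
        s->chunk[0] = small_bmap;
        s->nchunks = 1;
        s->chunk_bytes = SMALL_BYTES;
        s->using_fallback = 1;
    }
}

/* Number of page frames one fill of this set can describe (one bit each). */
static size_t bmap_set_pfns(const struct bmap_set *s)
{
    return s->nchunks * s->chunk_bytes * 8;
}

/* Free per-request chunks; the probe-time bitmap is never freed here. */
static void bmap_set_release(struct bmap_set *s)
{
    size_t i;

    if (!s->using_fallback)
        for (i = 0; i < s->nchunks; i++)
            free(s->chunk[i]);
    s->nchunks = 0;
}
```

Calling bmap_set_acquire(&s, 4, malloc) would normally yield four chunks covering 4 * 4096 * 8 page frames, while passing failing_alloc yields the 1024-byte fallback. Nothing large is zeroed at probe time in this arrangement, which is the cost objected to above.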