Michael S. Tsirkin
2017-Mar-10 15:58 UTC
[PATCH v7 kernel 3/5] virtio-balloon: implementation of VIRTIO_BALLOON_F_CHUNK_TRANSFER
On Fri, Mar 10, 2017 at 07:37:28PM +0800, Wei Wang wrote:
> On 03/09/2017 10:14 PM, Matthew Wilcox wrote:
> > On Fri, Mar 03, 2017 at 01:40:28PM +0800, Wei Wang wrote:
> > > From: Liang Li <liang.z.li at intel.com>
> > > 1) allocating pages (6.5%)
> > > 2) sending PFNs to host (68.3%)
> > > 3) address translation (6.1%)
> > > 4) madvise (19%)
> > >
> > > This patch optimizes step 2) by transferring pages to the host in
> > > chunks. A chunk consists of guest physically contiguous pages, and
> > > it is offered to the host via a base PFN (i.e. the start PFN of
> > > those physically contiguous pages) and the size (i.e. the total
> > > number of the pages). A normal chunk is formatted as below:
> > > -----------------------------------------------
> > > | Base (52 bit)                | Size (12 bit)|
> > > -----------------------------------------------
> > > For large size chunks, an extended chunk format is used:
> > > -----------------------------------------------
> > > |                 Base (64 bit)               |
> > > -----------------------------------------------
> > > -----------------------------------------------
> > > |                 Size (64 bit)               |
> > > -----------------------------------------------
> > What's the advantage to extended chunks?  IOW, why is the added complexity
> > of having two chunk formats worth it?  You already reduced the overhead by
> > a factor of 4096 with normal chunks ... how often are extended chunks used
> > and how much more efficient are they than having several normal chunks?
> >
>
> Right, chunk_ext may be rarely used, thanks. I will remove chunk_ext if
> there is no objection from others.
>
> Best,
> Wei

I don't think we can drop this, this isn't an optimization.

One of the issues of current balloon is the 4k page size
assumption. For example if you free a huge page you
have to split it up and pass 4k chunks to host.
Quite often host can't free these 4k chunks at all (e.g.
when it's using huge tlb fs).
It's even sillier for architectures with base page size >4k.

So as long as we are changing things, let's not hard-code
the 12 shift thing everywhere.

Two things to consider:
- host should pass its base page size to guest
  this can be a separate patch and for now we can fall back on 12 bit if not there
- guest should pass full huge pages to host
  this should be done correctly to avoid breaking up huge pages

I would say yes let's use a single format but drop the "normal chunk"
and always use the extended one.
Also, size is in units of 4k, right? Please document that low 12 bit
are reserved, they will be handy as e.g. flags.
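
For reference, the two layouts in the quoted patch description can be sketched in userspace C roughly as follows. This is an illustration only, not the patch's code: the names `chunk_normal`, `chunk_extended`, `struct chunk_ext` and `CHUNK_SIZE_BITS` are made up here, and both the placement of the base in the high 52 bits and the choice to store the page count directly in the low 12 bits are assumptions read off the diagram.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the field placement is assumed from the diagram. */
#define CHUNK_SIZE_BITS 12
#define CHUNK_SIZE_MAX  ((1ULL << CHUNK_SIZE_BITS) - 1)

/* Normal chunk: 52-bit base PFN and 12-bit page count in one 64-bit word. */
static inline uint64_t chunk_normal(uint64_t base_pfn, uint64_t npages)
{
    assert(base_pfn < (1ULL << 52));
    /* Assumption: the count is stored as-is, so it must fit in 12 bits. */
    assert(npages >= 1 && npages <= CHUNK_SIZE_MAX);
    return (base_pfn << CHUNK_SIZE_BITS) | npages;
}

/* Extended chunk: a full 64-bit base PFN followed by a full 64-bit count. */
struct chunk_ext {
    uint64_t base;
    uint64_t size;
};

static inline struct chunk_ext chunk_extended(uint64_t base_pfn, uint64_t npages)
{
    struct chunk_ext c = { .base = base_pfn, .size = npages };
    return c;
}
```

Under these assumptions a 16-page run starting at PFN 0x100 fits in one normal chunk, while a run of 2^20 pages needs the extended form, which costs 16 bytes instead of 8.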
Matthew Wilcox
2017-Mar-10 17:11 UTC
[PATCH v7 kernel 3/5] virtio-balloon: implementation of VIRTIO_BALLOON_F_CHUNK_TRANSFER
On Fri, Mar 10, 2017 at 05:58:28PM +0200, Michael S. Tsirkin wrote:
> One of the issues of current balloon is the 4k page size
> assumption. For example if you free a huge page you
> have to split it up and pass 4k chunks to host.
> Quite often host can't free these 4k chunks at all (e.g.
> when it's using huge tlb fs).
> It's even sillier for architectures with base page size >4k.

I completely agree with you that we should be able to pass a hugepage
as a single chunk.  Also we shouldn't assume that host and guest have
the same page size.  I think we can come up with a scheme that actually
lets us encode that into a 64-bit word, something like this:

bit 0 clear => bits 1-11 encode a page count, bits 12-63 encode a PFN, page size 4k.
bit 0 set, bit 1 clear => bits 2-12 encode a page count, bits 13-63 encode a PFN, page size 8k
bits 0+1 set, bit 2 clear => bits 3-13 for page count, bits 14-63 for PFN, page size 16k.
bits 0-2 set, bit 3 clear => bits 4-14 for page count, bits 15-63 for PFN, page size 32k
bits 0-3 set, bit 4 clear => bits 5-15 for page count, bits 16-63 for PFN, page size 64k

That means we can always pass 2048 pages (of whatever page size) in a single chunk.  And
we support arbitrary power of two page sizes.  I suggest something like this:

u64 page_to_chunk(struct page *page)
{
        u64 chunk = page_to_pfn(page) << PAGE_SHIFT;
        chunk |= (1UL << compound_order(page)) - 1;
        return chunk;
}

(note this is a single page of order N, so we leave the page count bits
set to 0, meaning one page).

> Two things to consider:
> - host should pass its base page size to guest
>   this can be a separate patch and for now we can fall back on 12 bit if not there

With this encoding scheme, I don't think we need to do this?  As long as
it's *at least* 12 bit, then we're fine.

> - guest should pass full huge pages to host
>   this should be done correctly to avoid breaking up huge pages
> I would say yes let's use a single format but drop the "normal chunk"
> and always use the extended one.
> Also, size is in units of 4k, right? Please document that low 12 bit
> are reserved, they will be handy as e.g. flags.

What per-chunk flags are you thinking would be useful?
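
The encoding proposed in this message can be tried out in userspace with a sketch like the one below. It is not kernel code and not part of the patch: `chunk_encode`, `chunk_decode` and `struct chunk_fields` are hypothetical names, the order is taken relative to a 4k base page (the message's `page_to_chunk()` uses the guest's `PAGE_SHIFT` instead), and the count field is assumed to hold the page count minus one, following the remark that zeroed count bits mean one page.

```c
#include <assert.h>
#include <stdint.h>

#define BASE_SHIFT 12   /* smallest page size is 4k */
#define COUNT_BITS 11   /* field holds count - 1, so 1..2048 pages */

struct chunk_fields {
    unsigned int order;  /* page size is 4k << order */
    unsigned int count;  /* number of pages, 1..2048 */
    uint64_t addr;       /* byte address of the first page */
};

static inline uint64_t chunk_encode(uint64_t addr, unsigned int order,
                                    unsigned int count)
{
    assert(order <= 64 - BASE_SHIFT - 1);
    assert(count >= 1 && count <= (1u << COUNT_BITS));
    /* addr must be aligned to the page size so its low bits are free */
    assert((addr & ((1ULL << (order + BASE_SHIFT)) - 1)) == 0);

    return addr                                    /* bits order+12 .. 63 */
         | ((uint64_t)(count - 1) << (order + 1))  /* bits order+1 .. order+11 */
         | ((1ULL << order) - 1);                  /* 'order' ones, then a zero */
}

static inline struct chunk_fields chunk_decode(uint64_t chunk)
{
    struct chunk_fields f;
    unsigned int order = 0;

    /* The number of trailing one bits gives the order. */
    while (order < 64 - BASE_SHIFT && (chunk & (1ULL << order)))
        order++;
    assert(order <= 64 - BASE_SHIFT - 1);

    f.order = order;
    f.count = (unsigned int)((chunk >> (order + 1)) &
                             ((1u << COUNT_BITS) - 1)) + 1;
    f.addr = chunk & ~((1ULL << (order + BASE_SHIFT)) - 1);
    return f;
}
```

For example, a single 2 MiB page (order 9) at address 0x40000000 encodes to 0x400001ff, and 2048 pages of 4k starting at 0x1000 encode to 0x1ffe.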
Michael S. Tsirkin
2017-Mar-10 19:10 UTC
[PATCH v7 kernel 3/5] virtio-balloon: implementation of VIRTIO_BALLOON_F_CHUNK_TRANSFER
On Fri, Mar 10, 2017 at 09:11:44AM -0800, Matthew Wilcox wrote:
> On Fri, Mar 10, 2017 at 05:58:28PM +0200, Michael S. Tsirkin wrote:
> > One of the issues of current balloon is the 4k page size
> > assumption. For example if you free a huge page you
> > have to split it up and pass 4k chunks to host.
> > Quite often host can't free these 4k chunks at all (e.g.
> > when it's using huge tlb fs).
> > It's even sillier for architectures with base page size >4k.
>
> I completely agree with you that we should be able to pass a hugepage
> as a single chunk.  Also we shouldn't assume that host and guest have
> the same page size.  I think we can come up with a scheme that actually
> lets us encode that into a 64-bit word, something like this:
>
> bit 0 clear => bits 1-11 encode a page count, bits 12-63 encode a PFN, page size 4k.
> bit 0 set, bit 1 clear => bits 2-12 encode a page count, bits 13-63 encode a PFN, page size 8k
> bits 0+1 set, bit 2 clear => bits 3-13 for page count, bits 14-63 for PFN, page size 16k.
> bits 0-2 set, bit 3 clear => bits 4-14 for page count, bits 15-63 for PFN, page size 32k
> bits 0-3 set, bit 4 clear => bits 5-15 for page count, bits 16-63 for PFN, page size 64k

Huge page sizes go up to gigabytes.

> That means we can always pass 2048 pages (of whatever page size) in a single chunk.  And
> we support arbitrary power of two page sizes.  I suggest something like this:
>
> u64 page_to_chunk(struct page *page)
> {
>         u64 chunk = page_to_pfn(page) << PAGE_SHIFT;
>         chunk |= (1UL << compound_order(page)) - 1;
>         return chunk;
> }
>
> (note this is a single page of order N, so we leave the page count bits
> set to 0, meaning one page).
>
> > Two things to consider:
> > - host should pass its base page size to guest
> >   this can be a separate patch and for now we can fall back on 12 bit if not there
>
> With this encoding scheme, I don't think we need to do this?  As long as
> it's *at least* 12 bit, then we're fine.
>
> > - guest should pass full huge pages to host
> >   this should be done correctly to avoid breaking up huge pages
> > I would say yes let's use a single format but drop the "normal chunk"
> > and always use the extended one.
> > Also, size is in units of 4k, right? Please document that low 12 bit
> > are reserved, they will be handy as e.g. flags.
>
> What per-chunk flags are you thinking would be useful?
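
The quoted table stops at 64k, but the rule it follows can be extended mechanically to larger page sizes. The sketch below simply computes the bit positions for a given order (log2 of the page size divided by 4k); `layout_for_order` and `struct chunk_layout` are hypothetical names, and the arithmetic assumes the pattern in the table continues unchanged. Under that assumption a 1 GiB page is order 18, with the page count in bits 19-29 and the PFN starting at bit 30.

```c
#include <assert.h>

struct chunk_layout {
    unsigned int count_lo;  /* lowest bit of the page count field */
    unsigned int count_hi;  /* highest bit of the page count field */
    unsigned int pfn_lo;    /* lowest bit of the PFN field (runs to bit 63) */
};

/* order = log2(page size / 4k); bits 0..order-1 are set, bit 'order' is clear */
static inline struct chunk_layout layout_for_order(unsigned int order)
{
    struct chunk_layout l;

    assert(order <= 51);    /* the PFN field must keep at least bit 63 */
    l.count_lo = order + 1;
    l.count_hi = order + 11;
    l.pfn_lo = order + 12;
    return l;
}
```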
Michael S. Tsirkin
2017-Mar-10 19:35 UTC
[PATCH v7 kernel 3/5] virtio-balloon: implementation of VIRTIO_BALLOON_F_CHUNK_TRANSFER
On Fri, Mar 10, 2017 at 09:11:44AM -0800, Matthew Wilcox wrote:
> On Fri, Mar 10, 2017 at 05:58:28PM +0200, Michael S. Tsirkin wrote:
> > One of the issues of current balloon is the 4k page size
> > assumption. For example if you free a huge page you
> > have to split it up and pass 4k chunks to host.
> > Quite often host can't free these 4k chunks at all (e.g.
> > when it's using huge tlb fs).
> > It's even sillier for architectures with base page size >4k.
>
> I completely agree with you that we should be able to pass a hugepage
> as a single chunk.  Also we shouldn't assume that host and guest have
> the same page size.  I think we can come up with a scheme that actually
> lets us encode that into a 64-bit word, something like this:
>
> bit 0 clear => bits 1-11 encode a page count, bits 12-63 encode a PFN, page size 4k.
> bit 0 set, bit 1 clear => bits 2-12 encode a page count, bits 13-63 encode a PFN, page size 8k
> bits 0+1 set, bit 2 clear => bits 3-13 for page count, bits 14-63 for PFN, page size 16k.
> bits 0-2 set, bit 3 clear => bits 4-14 for page count, bits 15-63 for PFN, page size 32k
> bits 0-3 set, bit 4 clear => bits 5-15 for page count, bits 16-63 for PFN, page size 64k
> That means we can always pass 2048 pages (of whatever page size) in a single chunk.  And
> we support arbitrary power of two page sizes.  I suggest something like this:
>
> u64 page_to_chunk(struct page *page)
> {
>         u64 chunk = page_to_pfn(page) << PAGE_SHIFT;
>         chunk |= (1UL << compound_order(page)) - 1;
>         return chunk;
> }

You need to fill in the size, do you not?

>
> (note this is a single page of order N, so we leave the page count bits
> set to 0, meaning one page).
>
> > Two things to consider:
> > - host should pass its base page size to guest
> >   this can be a separate patch and for now we can fall back on 12 bit if not there
>
> With this encoding scheme, I don't think we need to do this?  As long as
> it's *at least* 12 bit, then we're fine.

I think we will still need something like this down the road.  The point
is that not all hosts are able to use 4k pages in a balloon.
So it's pointless for guest to pass 4k pages to such a host,
and we need host to tell guest the page size it needs.

However that's a separate feature that can wait until
another day.

> > - guest should pass full huge pages to host
> >   this should be done correctly to avoid breaking up huge pages
> > I would say yes let's use a single format but drop the "normal chunk"
> > and always use the extended one.
> > Also, size is in units of 4k, right? Please document that low 12 bit
> > are reserved, they will be handy as e.g. flags.
>
> What per-chunk flags are you thinking would be useful?

Not entirely sure but I think it would have been prudent to leave some free
if possible. Your encoding seems to use them all up, so be it.

--
MST
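
The point about the host's page size can be made concrete with a small sketch. No such interface exists in the thread, so everything here is hypothetical: `effective_page_shift` and `host_can_free` are invented names, and a `host_shift` of 0 is used as a stand-in for "the host did not advertise a page size", in which case the guest falls back on 12 bits as suggested earlier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FALLBACK_PAGE_SHIFT 12  /* 4k, used when the host advertises nothing */

/* host_shift == 0 stands for "not advertised" (an assumption of this sketch) */
static inline unsigned int effective_page_shift(unsigned int host_shift)
{
    if (host_shift == 0)
        return FALLBACK_PAGE_SHIFT;
    assert(host_shift >= FALLBACK_PAGE_SHIFT && host_shift < 64);
    return host_shift;
}

/*
 * True if [addr, addr + len) is made of whole host pages. Anything else
 * is a range the host cannot release, so passing it would be pointless.
 */
static inline bool host_can_free(uint64_t addr, uint64_t len,
                                 unsigned int host_shift)
{
    uint64_t mask = (1ULL << effective_page_shift(host_shift)) - 1;

    return len != 0 && (addr & mask) == 0 && (len & mask) == 0;
}
```

With a host that uses 2 MiB pages (shift 21), an aligned 2 MiB range passes the check while a lone 4k page does not; with no advertised size, the same 4k page is accepted.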
Wei Wang
2017-Mar-11 11:59 UTC
[PATCH v7 kernel 3/5] virtio-balloon: implementation of VIRTIO_BALLOON_F_CHUNK_TRANSFER
On 03/11/2017 01:11 AM, Matthew Wilcox wrote:
> On Fri, Mar 10, 2017 at 05:58:28PM +0200, Michael S. Tsirkin wrote:
>> One of the issues of current balloon is the 4k page size
>> assumption. For example if you free a huge page you
>> have to split it up and pass 4k chunks to host.
>> Quite often host can't free these 4k chunks at all (e.g.
>> when it's using huge tlb fs).
>> It's even sillier for architectures with base page size >4k.
> I completely agree with you that we should be able to pass a hugepage
> as a single chunk.  Also we shouldn't assume that host and guest have
> the same page size.  I think we can come up with a scheme that actually
> lets us encode that into a 64-bit word, something like this:
>
> bit 0 clear => bits 1-11 encode a page count, bits 12-63 encode a PFN, page size 4k.
> bit 0 set, bit 1 clear => bits 2-12 encode a page count, bits 13-63 encode a PFN, page size 8k
> bits 0+1 set, bit 2 clear => bits 3-13 for page count, bits 14-63 for PFN, page size 16k.
> bits 0-2 set, bit 3 clear => bits 4-14 for page count, bits 15-63 for PFN, page size 32k
> bits 0-3 set, bit 4 clear => bits 5-15 for page count, bits 16-63 for PFN, page size 64k
>
> That means we can always pass 2048 pages (of whatever page size) in a single chunk.  And
> we support arbitrary power of two page sizes.  I suggest something like this:
>
> u64 page_to_chunk(struct page *page)
> {
>         u64 chunk = page_to_pfn(page) << PAGE_SHIFT;
>         chunk |= (1UL << compound_order(page)) - 1;
>         return chunk;
> }
>
> (note this is a single page of order N, so we leave the page count bits
> set to 0, meaning one page).
>

I'm wondering: what if the guest needs to transfer this much physically
contiguous memory to the host: 1GB+2MB+64KB+32KB+16KB+4KB.
Is it going to use six 64-bit chunks? Would it be simpler if we just
use the 128-bit chunk format (we can drop the previous normal 64-bit
format)?

Best,
Wei
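
The count of six can be checked with a short sketch. It assumes the simplest use of the 64-bit encoding, where each chunk carries exactly one naturally aligned power-of-two block and the page count bits stay at zero, as in `page_to_chunk()`; the function name `chunks_one_block_each` is made up for this illustration. Under that assumption the number of chunks is the number of set bits in the length expressed in 4k pages.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of 64-bit chunks needed for 'len' bytes when every chunk holds
 * one power-of-two block (count field unused). Alignment of the start
 * address is ignored: each block is assumed to be naturally aligned.
 */
static inline unsigned int chunks_one_block_each(uint64_t len)
{
    uint64_t pages;
    unsigned int n = 0;

    assert((len & 0xfff) == 0);   /* whole 4k pages only */
    pages = len >> 12;
    while (pages) {
        pages &= pages - 1;       /* clear the lowest set bit */
        n++;
    }
    return n;
}
```

For 1GB+2MB+64KB+32KB+16KB+4KB this gives six chunks, i.e. 48 bytes, against 16 bytes for a single 128-bit base-plus-size chunk. Using the 11-bit page count field could merge some of those blocks, which this sketch deliberately leaves out.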