Matthew Wilcox
2017-Mar-10 17:11 UTC
[PATCH v7 kernel 3/5] virtio-balloon: implementation of VIRTIO_BALLOON_F_CHUNK_TRANSFER
On Fri, Mar 10, 2017 at 05:58:28PM +0200, Michael S. Tsirkin wrote:
> One of the issues of current balloon is the 4k page size
> assumption. For example if you free a huge page you
> have to split it up and pass 4k chunks to host.
> Quite often host can't free these 4k chunks at all (e.g.
> when it's using huge tlb fs).
> It's even sillier for architectures with base page size >4k.

I completely agree with you that we should be able to pass a hugepage
as a single chunk. Also we shouldn't assume that host and guest have
the same page size. I think we can come up with a scheme that actually
lets us encode that into a 64-bit word, something like this:

bit 0 clear => bits 1-11 encode a page count, bits 12-63 encode a PFN, page size 4k.
bit 0 set, bit 1 clear => bits 2-12 encode a page count, bits 13-63 encode a PFN, page size 8k
bits 0+1 set, bit 2 clear => bits 3-13 for page count, bits 14-63 for PFN, page size 16k.
bits 0-2 set, bit 3 clear => bits 4-14 for page count, bits 15-63 for PFN, page size 32k
bits 0-3 set, bit 4 clear => bits 5-15 for page count, bits 16-63 for PFN, page size 64k

That means we can always pass 2048 pages (of whatever page size) in a single chunk. And
we support arbitrary power of two page sizes. I suggest something like this:

u64 page_to_chunk(struct page *page)
{
	u64 chunk = page_to_pfn(page) << PAGE_SHIFT;
	chunk |= (1UL << compound_order(page)) - 1;
	return chunk;
}

(note this is a single page of order N, so we leave the page count bits
set to 0, meaning one page).

> Two things to consider:
> - host should pass its base page size to guest
>   this can be a separate patch and for now we can fall back on 12 bit if not there

With this encoding scheme, I don't think we need to do this? As long as
it's *at least* 12 bit, then we're fine.

> - guest should pass full huge pages to host
>   this should be done correctly to avoid breaking up huge pages
>
> I would say yes let's use a single format but drop the "normal chunk"
> and always use the extended one.
> Also, size is in units of 4k, right? Please document that low 12 bit
> are reserved, they will be handy as e.g. flags.

What per-chunk flags are you thinking would be useful?
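To make the proposed layout concrete, here is a minimal userspace sketch of packing and unpacking one chunk. It assumes a 4k base page and an 11-bit count field that stores "pages minus one" (so 0 means one page and 2047 means 2048). The names `chunk_encode`, `chunk_order`, `chunk_npages` and `chunk_addr` are hypothetical and not part of the patch series.

```c
#include <assert.h>
#include <stdint.h>

#define BASE_SHIFT 12   /* assumption: smallest page size is 4k */
#define COUNT_BITS 11   /* 11-bit page count field, as in the proposal */

/* Hypothetical helper: pack a physical address, a page order and a page
 * count into one 64-bit chunk. addr must be aligned to (4k << order). */
static uint64_t chunk_encode(uint64_t addr, unsigned order, unsigned npages)
{
    assert(order + BASE_SHIFT < 64);
    assert(npages >= 1 && npages <= (1u << COUNT_BITS));
    assert((addr & ((1ULL << (order + BASE_SHIFT)) - 1)) == 0);

    uint64_t chunk = addr;                          /* PFN sits in bits order+12..63 */
    chunk |= (uint64_t)(npages - 1) << (order + 1); /* count field, 0 means one page */
    chunk |= (1ULL << order) - 1;                   /* low `order` bits set, bit `order` clear */
    return chunk;
}

/* Count the trailing one bits to recover the order.
 * Assumes a well-formed chunk (order + 12 < 64). */
static unsigned chunk_order(uint64_t chunk)
{
    unsigned order = 0;
    while (chunk & 1) {
        chunk >>= 1;
        order++;
    }
    return order;
}

/* Number of pages described by the chunk (1..2048). */
static unsigned chunk_npages(uint64_t chunk)
{
    unsigned shift = chunk_order(chunk) + 1;
    return (unsigned)((chunk >> shift) & ((1u << COUNT_BITS) - 1)) + 1;
}

/* Physical address of the first page: clear everything below the PFN. */
static uint64_t chunk_addr(uint64_t chunk)
{
    unsigned shift = chunk_order(chunk) + BASE_SHIFT;
    return (chunk >> shift) << shift;
}
```

For example, `chunk_encode(0x200000, 9, 1)` describes a single 2MB huge page at physical address 2MB. The receiver can recover the order, count and address from the 64-bit value alone, without knowing the sender's page size.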
Matthew Wilcox
2017-Mar-10 21:18 UTC
[PATCH v7 kernel 3/5] virtio-balloon: implementation of VIRTIO_BALLOON_F_CHUNK_TRANSFER
On Fri, Mar 10, 2017 at 09:10:53PM +0200, Michael S. Tsirkin wrote:
> > I completely agree with you that we should be able to pass a hugepage
> > as a single chunk. Also we shouldn't assume that host and guest have
> > the same page size. I think we can come up with a scheme that actually
> > lets us encode that into a 64-bit word, something like this:
> >
> > bit 0 clear => bits 1-11 encode a page count, bits 12-63 encode a PFN, page size 4k.
> > bit 0 set, bit 1 clear => bits 2-12 encode a page count, bits 13-63 encode a PFN, page size 8k
> > bits 0+1 set, bit 2 clear => bits 3-13 for page count, bits 14-63 for PFN, page size 16k.
> > bits 0-2 set, bit 3 clear => bits 4-14 for page count, bits 15-63 for PFN, page size 32k
> > bits 0-3 set, bit 4 clear => bits 5-15 for page count, bits 16-63 for PFN, page size 64k
>
> huge page sizes go up to gigabytes.

There was supposed to be a '...' there. For a 16GB hugepage (largest
size I know of today), that'd be:

bits 0-21 set, 22 clear, 23-33 page count, 34-63 PFN, page size 16G
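The pattern behind the table generalises to any power-of-two page size. The sketch below derives the field positions from the page shift, assuming a 4k minimum page and an 11-bit count field. `struct chunk_layout` and `layout_for_page_shift` are hypothetical names used only for illustration.

```c
#include <assert.h>

/* Bit positions of each field for one page size (all values are bit indices). */
struct chunk_layout {
    unsigned order;     /* log2(page size / 4k); bits 0..order-1 are set */
    unsigned clear_bit; /* the first clear bit, which terminates the size marker */
    unsigned count_lo;  /* lowest bit of the 11-bit page count */
    unsigned count_hi;  /* highest bit of the page count */
    unsigned pfn_lo;    /* lowest bit of the PFN; it extends up to bit 63 */
};

/* Hypothetical helper: compute the layout for a page of size 1 << page_shift. */
static struct chunk_layout layout_for_page_shift(unsigned page_shift)
{
    assert(page_shift >= 12 && page_shift < 64);

    struct chunk_layout l;
    l.order = page_shift - 12;
    l.clear_bit = l.order;
    l.count_lo = l.order + 1;
    l.count_hi = l.order + 11;
    l.pfn_lo = l.order + 12;
    return l;
}
```

With `page_shift = 34` (16GB) this gives order 22, a clear bit at 22, a count in bits 23-33 and a PFN starting at bit 34, which matches the line above. The PFN always starts at the page shift itself, so a page-aligned physical address can be used directly as the upper part of the chunk.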
Matthew Wilcox
2017-Mar-10 21:25 UTC
[PATCH v7 kernel 3/5] virtio-balloon: implementation of VIRTIO_BALLOON_F_CHUNK_TRANSFER
On Fri, Mar 10, 2017 at 09:35:21PM +0200, Michael S. Tsirkin wrote:
> > bit 0 clear => bits 1-11 encode a page count, bits 12-63 encode a PFN, page size 4k.
> > bit 0 set, bit 1 clear => bits 2-12 encode a page count, bits 13-63 encode a PFN, page size 8k
> > bits 0+1 set, bit 2 clear => bits 3-13 for page count, bits 14-63 for PFN, page size 16k.
> > bits 0-2 set, bit 3 clear => bits 4-14 for page count, bits 15-63 for PFN, page size 32k
> > bits 0-3 set, bit 4 clear => bits 5-15 for page count, bits 16-63 for PFN, page size 64k
> > That means we can always pass 2048 pages (of whatever page size) in a single chunk. And
> > we support arbitrary power of two page sizes. I suggest something like this:
> >
> > u64 page_to_chunk(struct page *page)
> > {
> > 	u64 chunk = page_to_pfn(page) << PAGE_SHIFT;
> > 	chunk |= (1UL << compound_order(page)) - 1;
> > 	return chunk;
> > }
>
> You need to fill in the size, do you not?

I think I did ... (1UL << compound_order(page)) - 1 sets the bottom
N bits. Bit N+1 will already be clear. What am I missing?

> > > - host should pass its base page size to guest
> > >   this can be a separate patch and for now we can fall back on 12 bit if not there
> >
> > With this encoding scheme, I don't think we need to do this? As long as
> > it's *at least* 12 bit, then we're fine.
>
> I think we will still need something like this down the road. The point
> is that not all hosts are able to use 4k pages in a balloon.
> So it's pointless for guest to pass 4k pages to such a host,
> and we need host to tell guest the page size it needs.
>
> However that's a separate feature that can wait until
> another day.

Ah, the TRIM/DISCARD debate all over again ... should the guest batch up
or should the host do that work ... probably easier to account it in the
guest. Might be better to frame it as 'balloon chunk size' rather than
host page size as it might have nothing to do with the host page size.

> > What per-chunk flags are you thinking would be useful?
>
> Not entirely sure but I think would have been prudent to leave some free
> if possible. Your encoding seems to use them all up, so be it.

We don't necessarily have to support 2048 pages in a single chunk.
If it's worth reserving some bits, we can do that at the expense of
reducing the maximum number of pages per chunk.
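One way to picture that trade-off is the sketch below. It is purely illustrative: the thread does not define any flags, so the split of 3 flag bits and 8 count bits, the placement of the flags just above the count field, and every name here (`FLAG_BITS`, `SMALL_COUNT_BITS`, `flagged_chunk_encode`, and so on) are assumptions.

```c
#include <assert.h>
#include <stdint.h>

#define BASE_SHIFT       12                            /* assumption: 4k base page */
#define FLAG_BITS        3                             /* hypothetical: bits reserved for flags */
#define SMALL_COUNT_BITS (BASE_SHIFT - 1 - FLAG_BITS)  /* 8 bits remain for the count */
#define SMALL_MAX_PAGES  (1u << SMALL_COUNT_BITS)      /* 256 pages per chunk, down from 2048 */

/* Hypothetical variant of the encoding with flags above a shortened count field. */
static uint64_t flagged_chunk_encode(uint64_t addr, unsigned order,
                                     unsigned npages, unsigned flags)
{
    assert(order + BASE_SHIFT < 64);
    assert(npages >= 1 && npages <= SMALL_MAX_PAGES);
    assert(flags < (1u << FLAG_BITS));
    assert((addr & ((1ULL << (order + BASE_SHIFT)) - 1)) == 0);

    uint64_t chunk = addr;
    chunk |= (uint64_t)flags << (order + 1 + SMALL_COUNT_BITS);
    chunk |= (uint64_t)(npages - 1) << (order + 1);
    chunk |= (1ULL << order) - 1;   /* size marker: low `order` bits set */
    return chunk;
}

/* Count trailing one bits to recover the order (well-formed chunks only). */
static unsigned flagged_chunk_order(uint64_t chunk)
{
    unsigned order = 0;
    while (chunk & 1) {
        chunk >>= 1;
        order++;
    }
    return order;
}

static unsigned flagged_chunk_npages(uint64_t chunk)
{
    unsigned shift = flagged_chunk_order(chunk) + 1;
    return (unsigned)((chunk >> shift) & (SMALL_MAX_PAGES - 1)) + 1;
}

static unsigned flagged_chunk_flags(uint64_t chunk)
{
    unsigned shift = flagged_chunk_order(chunk) + 1 + SMALL_COUNT_BITS;
    return (unsigned)((chunk >> shift) & ((1u << FLAG_BITS) - 1));
}
```

Each bit taken for flags halves the maximum run length, so three flag bits reduce it from 2048 to 256 pages per chunk. The size marker and the PFN position are unaffected.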
Matthew Wilcox
2017-Mar-11 14:09 UTC
[PATCH v7 kernel 3/5] virtio-balloon: implementation of VIRTIO_BALLOON_F_CHUNK_TRANSFER
On Sat, Mar 11, 2017 at 07:59:31PM +0800, Wei Wang wrote:
> I'm thinking what if the guest needs to transfer this much physically
> contiguous memory to host: 1GB+2MB+64KB+32KB+16KB+4KB.
> Is it going to use six 64-bit chunks? Would it be simpler if we just
> use the 128-bit chunk format (we can drop the previous normal 64-bit
> format)?

Is that a likely thing for the guest to need to do though? Freeing a 1GB
page is much more likely, IMO.
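The count of six in the question comes from splitting the range into naturally aligned power-of-two blocks, one chunk per block. A minimal sketch of that split follows. It assumes the range starts on a boundary aligned to its largest block, so that emitting blocks largest-first keeps each one naturally aligned. `split_pow2` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: split a byte length (a multiple of 4k) into
 * power-of-two blocks, largest first. Returns the number of blocks
 * written to out[]. One block is emitted per set bit in the length. */
static unsigned split_pow2(uint64_t bytes, uint64_t *out, unsigned max_out)
{
    unsigned n = 0;

    assert(bytes % 4096 == 0);
    for (int bit = 63; bit >= 12; bit--) {
        uint64_t block = 1ULL << bit;
        if (bytes & block) {
            assert(n < max_out);
            out[n++] = block;
        }
    }
    return n;
}
```

For 1GB+2MB+64KB+32KB+16KB+4KB the length has six set bits, so this split yields six blocks and therefore six single-page chunks of different orders.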
Michael S. Tsirkin
2017-Mar-12 00:07 UTC
[PATCH v7 kernel 3/5] virtio-balloon: implementation of VIRTIO_BALLOON_F_CHUNK_TRANSFER
On Sat, Mar 11, 2017 at 07:59:31PM +0800, Wei Wang wrote:
> On 03/11/2017 01:11 AM, Matthew Wilcox wrote:
> > On Fri, Mar 10, 2017 at 05:58:28PM +0200, Michael S. Tsirkin wrote:
> > > One of the issues of current balloon is the 4k page size
> > > assumption. For example if you free a huge page you
> > > have to split it up and pass 4k chunks to host.
> > > Quite often host can't free these 4k chunks at all (e.g.
> > > when it's using huge tlb fs).
> > > It's even sillier for architectures with base page size >4k.
> > I completely agree with you that we should be able to pass a hugepage
> > as a single chunk. Also we shouldn't assume that host and guest have
> > the same page size. I think we can come up with a scheme that actually
> > lets us encode that into a 64-bit word, something like this:
> >
> > bit 0 clear => bits 1-11 encode a page count, bits 12-63 encode a PFN, page size 4k.
> > bit 0 set, bit 1 clear => bits 2-12 encode a page count, bits 13-63 encode a PFN, page size 8k
> > bits 0+1 set, bit 2 clear => bits 3-13 for page count, bits 14-63 for PFN, page size 16k.
> > bits 0-2 set, bit 3 clear => bits 4-14 for page count, bits 15-63 for PFN, page size 32k
> > bits 0-3 set, bit 4 clear => bits 5-15 for page count, bits 16-63 for PFN, page size 64k
> >
> > That means we can always pass 2048 pages (of whatever page size) in a single chunk. And
> > we support arbitrary power of two page sizes. I suggest something like this:
> >
> > u64 page_to_chunk(struct page *page)
> > {
> > 	u64 chunk = page_to_pfn(page) << PAGE_SHIFT;
> > 	chunk |= (1UL << compound_order(page)) - 1;
> > 	return chunk;
> > }
> >
> > (note this is a single page of order N, so we leave the page count bits
> > set to 0, meaning one page).
> >
>
> I'm thinking what if the guest needs to transfer this much physically
> contiguous memory to host: 1GB+2MB+64KB+32KB+16KB+4KB.
> Is it going to use six 64-bit chunks? Would it be simpler if we just
> use the 128-bit chunk format (we can drop the previous normal 64-bit
> format)?
>
> Best,
> Wei

I think I prefer that as a more straightforward approach, but
I can live with either approach.

-- 
MST