Michael S. Tsirkin
2016-Dec-15 15:54 UTC
[Qemu-devel] [PATCH kernel v5 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration
On Thu, Dec 15, 2016 at 07:34:33AM -0800, Dave Hansen wrote:
> On 12/14/2016 12:59 AM, Li, Liang Z wrote:
> >> Subject: Re: [Qemu-devel] [PATCH kernel v5 0/5] Extend virtio-balloon for
> >> fast (de)inflating & fast live migration
> >>
> >> On 12/08/2016 08:45 PM, Li, Liang Z wrote:
> >>> What's the conclusion of your discussion? It seems you want some
> >>> statistic before deciding whether to ripping the bitmap from the ABI,
> >>> am I right?
> >>
> >> I think Andrea and David feel pretty strongly that we should remove the
> >> bitmap, unless we have some data to support keeping it. I don't feel as
> >> strongly about it, but I think their critique of it is pretty valid. I think the
> >> consensus is that the bitmap needs to go.
> >>
> >> The only real question IMNHO is whether we should do a power-of-2 or a
> >> length. But, if we have 12 bits, then the argument for doing length is pretty
> >> strong. We don't need anywhere near 12 bits if doing power-of-2.
> >
> > Just found the MAX_ORDER should be limited to 12 if use length instead of order,
> > If the MAX_ORDER is configured to a value bigger than 12, it will make things more
> > complex to handle this case.
> >
> > If use order, we need to break a large memory range whose length is not the power of 2 into several
> > small ranges, it also make the code complex.
>
> I can't imagine it makes the code that much more complex. It adds a for
> loop. Right?
>
> > It seems we leave too many bit for the pfn, and the bits leave for length is not enough,
> > How about keep 45 bits for the pfn and 19 bits for length, 45 bits for pfn can cover 57 bits
> > physical address, that should be enough in the near feature.
> >
> > What's your opinion?
>
> I still think 'order' makes a lot of sense. But, as you say, 57 bits is
> enough for x86 for a while. Other architectures.... who knows?

I think you can probably assume page size >= 4K. But I would not want to
make any other assumptions. E.g. there are systems that absolutely
require you to set high bits for DMA.

I think we really want both length and order.

I understand how you are trying to pack them as tightly as possible.

However, I thought of a trick: we don't need to encode all possible orders.
For example, with 2 bits of order, we can make them mean:
00 - 4K pages
01 - 2M pages
02 - 1G pages

The guest can program the sizes for each order through config space.

We will have 10 bits left for length.

It might make sense to also allow the guest to program the number of bits
used for order; this will make it easy to extend without host changes.

--
MST
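As an illustration of the packing idea above, here is a minimal sketch of a 64-bit range descriptor that carries both an order index and a length. The exact layout (pfn in the top 52 bits, then 2 bits of order index, then 10 bits of length), the meaning of the length field (a count of units of the selected page size), and all names (`encode`, `decode`, `ORDER_SIZES`) are assumptions made for this example; nothing here is taken from the patch series itself.

```python
# Hypothetical layout (an assumption for illustration only):
#   bits 63..12  pfn (in 4K units)
#   bits 11..10  order index into a guest-programmed size table
#   bits  9..0   length, counted in units of the selected size
ORDER_BITS = 2
LENGTH_BITS = 10
PFN_SHIFT = ORDER_BITS + LENGTH_BITS  # 12 low bits are free if pages are >= 4K
PFN_BITS = 64 - PFN_SHIFT

# Sizes the guest would program through config space, indexed by order value.
ORDER_SIZES = [4 << 10, 2 << 20, 1 << 30]  # 4K, 2M, 1G


def encode(pfn, order_idx, length):
    """Pack one range descriptor into a 64-bit integer."""
    if not 0 <= order_idx < len(ORDER_SIZES):
        raise ValueError("order index not programmed")
    if not 0 < length < (1 << LENGTH_BITS):
        raise ValueError("length does not fit in the length field")
    if not 0 <= pfn < (1 << PFN_BITS):
        raise ValueError("pfn does not fit in the pfn field")
    return (pfn << PFN_SHIFT) | (order_idx << LENGTH_BITS) | length


def decode(desc):
    """Unpack a descriptor into (pfn, order_idx, length)."""
    length = desc & ((1 << LENGTH_BITS) - 1)
    order_idx = (desc >> LENGTH_BITS) & ((1 << ORDER_BITS) - 1)
    pfn = desc >> PFN_SHIFT
    return pfn, order_idx, length


def range_bytes(desc):
    """Number of bytes the descriptor covers."""
    _, order_idx, length = decode(desc)
    return length * ORDER_SIZES[order_idx]
```

With this layout, `encode(0x12345, 1, 3)` would describe three 2M units starting at pfn 0x12345. Making `ORDER_BITS` itself configurable, as suggested at the end of the message, would only change the two constants at the top.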
Li, Liang Z
2016-Dec-16 01:12 UTC
[Qemu-devel] [PATCH kernel v5 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration
> On Thu, Dec 15, 2016 at 07:34:33AM -0800, Dave Hansen wrote:
> > On 12/14/2016 12:59 AM, Li, Liang Z wrote:
> > >> Subject: Re: [Qemu-devel] [PATCH kernel v5 0/5] Extend
> > >> virtio-balloon for fast (de)inflating & fast live migration
> > >>
> > >> On 12/08/2016 08:45 PM, Li, Liang Z wrote:
> > >>> What's the conclusion of your discussion? It seems you want some
> > >>> statistic before deciding whether to ripping the bitmap from the
> > >>> ABI, am I right?
> > >>
> > >> I think Andrea and David feel pretty strongly that we should remove
> > >> the bitmap, unless we have some data to support keeping it. I
> > >> don't feel as strongly about it, but I think their critique of it
> > >> is pretty valid. I think the consensus is that the bitmap needs to go.
> > >>
> > >> The only real question IMNHO is whether we should do a power-of-2
> > >> or a length. But, if we have 12 bits, then the argument for doing
> > >> length is pretty strong. We don't need anywhere near 12 bits if doing
> > >> power-of-2.
> > >
> > > Just found the MAX_ORDER should be limited to 12 if use length
> > > instead of order, If the MAX_ORDER is configured to a value bigger
> > > than 12, it will make things more complex to handle this case.
> > >
> > > If use order, we need to break a large memory range whose length is
> > > not the power of 2 into several small ranges, it also make the code
> > > complex.
> >
> > I can't imagine it makes the code that much more complex. It adds a
> > for loop. Right?
> >
> > > It seems we leave too many bit for the pfn, and the bits leave for
> > > length is not enough, How about keep 45 bits for the pfn and 19 bits
> > > for length, 45 bits for pfn can cover 57 bits physical address, that should
> > > be enough in the near feature.
> > >
> > > What's your opinion?
> >
> > I still think 'order' makes a lot of sense. But, as you say, 57 bits
> > is enough for x86 for a while. Other architectures.... who knows?
>
> I think you can probably assume page size >= 4K. But I would not want to
> make any other assumptions. E.g. there are systems that absolutely require
> you to set high bits for DMA.
>
> I think we really want both length and order.
>
> I understand how you are trying to pack them as tightly as possible.
>
> However, I thought of a trick, we don't need to encode all possible orders.
> For example, with 2 bits of order, we can make them mean:
> 00 - 4K pages
> 01 - 2M pages
> 02 - 1G pages
>
> guest can program the sizes for each order through config space.
>
> We will have 10 bits left for length.
>

Please don't, we just get rid of the bitmap for simplification. :)

> It might make sense to also allow guest to program the number of bits used
> for order, this will make it easy to extend without host changes.
>

There is still the case where MAX_ORDER is configured to a large value, e.g. 36 for a system
with a huge amount of memory; then there are only 28 bits left for the pfn, which is not enough.
Should we limit the MAX_ORDER? I don't think so.

It seems using order is better.

Thanks!
Liang

> --
> MST
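The two arguments in this exchange can be sketched in a few lines. The first function is one possible form of the "for loop" mentioned in the quoted reply: it breaks an arbitrary page range into naturally aligned power-of-two chunks. The other two helpers redo the bit-budget arithmetic from the thread. All function names are hypothetical, and the natural-alignment requirement is an assumption borrowed from how buddy-allocator blocks behave, not something the thread specifies.

```python
def split_into_orders(start_pfn, nr_pages, max_order):
    """Cover [start_pfn, start_pfn + nr_pages) with (pfn, order) chunks."""
    chunks = []
    pfn = start_pfn
    remaining = nr_pages
    while remaining > 0:
        # Largest order that still fits in the pages that are left.
        order = min(max_order, remaining.bit_length() - 1)
        # Shrink until the chunk is naturally aligned (assumed requirement).
        while order > 0 and pfn & ((1 << order) - 1):
            order -= 1
        chunks.append((pfn, order))
        pfn += 1 << order
        remaining -= 1 << order
    return chunks


def pfn_bits_left(length_bits, word_bits=64):
    """Bits remaining for the pfn once the length field is reserved."""
    return word_bits - length_bits


def addressable_bits(pfn_bits, page_shift=12):
    """Physical address width covered by a pfn field, assuming 4K pages."""
    return pfn_bits + page_shift
```

For example, `split_into_orders(0, 13, 10)` yields chunks of 8, 4 and 1 pages. `pfn_bits_left(36)` gives the 28 bits mentioned above, and `addressable_bits(45)` gives the 57-bit physical address quoted earlier in the thread.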
Andrea Arcangeli
2016-Dec-16 15:40 UTC
[Qemu-devel] [PATCH kernel v5 0/5] Extend virtio-balloon for fast (de)inflating & fast live migration
On Fri, Dec 16, 2016 at 01:12:21AM +0000, Li, Liang Z wrote:
> There is still the case where MAX_ORDER is configured to a large value, e.g. 36 for a system
> with a huge amount of memory; then there are only 28 bits left for the pfn, which is not enough.

Not related to the balloon, but how would it help to set MAX_ORDER to 36?

What MAX_ORDER affects is that you won't be able to ask the kernel
page allocator for contiguous memory bigger than 1<<(MAX_ORDER-1), but
that's a driver issue, not relevant to the amount of RAM. Drivers won't
suddenly start to ask the kernel allocator to allocate compound pages
at orders >= 11 just because more RAM was added.

The higher the MAX_ORDER, the slower the kernel runs, so the smaller
the MAX_ORDER the better.

> Should we limit the MAX_ORDER? I don't think so.

We shouldn't strictly depend on the MAX_ORDER value, but it's mostly
limited already even if configurable at build time.

We definitely need it to reach at least the hugepage size; beyond that
it's mostly a driver issue, but drivers requiring large contiguous
allocations should rely on CMA only, or on vmalloc if they only require
the memory to be virtually contiguous, and not rely on a larger
MAX_ORDER that would slow down all kernel allocations/freeing.
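To put numbers on the 1<<(MAX_ORDER-1) limit discussed above, here is a small arithmetic sketch. It assumes 4K base pages, and the helper names are hypothetical; treating MAX_ORDER=11 as the typical build-time setting is also an assumption of this example rather than something stated in the message.

```python
PAGE_SHIFT = 12  # assumes 4K base pages


def max_contiguous_bytes(max_order, page_shift=PAGE_SHIFT):
    """Largest block the page allocator can hand out: 1 << (MAX_ORDER - 1) pages."""
    return (1 << (max_order - 1)) << page_shift


def order_for_size(size_bytes, page_shift=PAGE_SHIFT):
    """Smallest order whose block covers size_bytes."""
    pages = -(-size_bytes // (1 << page_shift))  # ceiling division
    return (pages - 1).bit_length()


def min_max_order_for(size_bytes, page_shift=PAGE_SHIFT):
    """Smallest MAX_ORDER that still allows a block of size_bytes."""
    return order_for_size(size_bytes, page_shift) + 1
```

Under these assumptions, `max_contiguous_bytes(11)` is 4 MiB, and a 2M hugepage needs order 9, so `min_max_order_for(2 << 20)` is 10. That is consistent with the point that MAX_ORDER only has to reach the hugepage size and does not need to scale with the amount of RAM.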