David Hildenbrand
2021-Jan-05 10:27 UTC
[RFC v2 PATCH 0/4] speed up page allocation for __GFP_ZERO
On 05.01.21 11:22, Liang Li wrote:
>>>> That's mostly already existing scheduling logic, no? (How many vms can I put onto a specific machine eventually)
>>>
>>> It depends on how the scheduling component is designed. Yes, you can put
>>> 10 VMs with 4C8G(4CPU, 8G RAM) on a host and 20 VMs with 2C4G on
>>> another one. But if one type of them, e.g. 4C8G are sold out, customers
>>> can't buy more 4C8G VM while there are some free 2C4G VMs, the resource
>>> reserved for them can be provided as 4C8G VMs
>>>
>>
>> 1. You can, just the startup time will be a little slower? E.g., grow
>> pre-allocated 4G file to 8G.
>>
>> 2. Or let's be creative: teach QEMU to construct a single
>> RAMBlock/MemoryRegion out of multiple tmpfs files. Works as long as you
>> don't go crazy on different VM sizes / size differences.
>>
>> 3. In your example above, you can dynamically rebalance as VMs are
>> getting sold, to make sure you always have "big ones" lying around you
>> can shrink on demand.
>>
> Yes, we can always come up with some ways to make things work.
> it will make the developer of the upper layer component crazy :)

I'd say that's life in upper layers to optimize special (!) use cases. :)

>>>
>>> You must know there are a lot of functions in the kernel which can
>>> be done in userspace. e.g. Some of the device emulations like APIC,
>>> vhost-net backend which has userspace implementation. :)
>>> Bad or not depends on the benefits the solution brings.
>>> From the viewpoint of a user space application, the kernel should
>>> provide high performance memory management service. That's why
>>> I think it should be done in the kernel.
>>
>> As I expressed a couple of times already, I don't see why using
>> hugetlbfs and implementing some sort of pre-zeroing there isn't sufficient.
>
> Did I miss something before? I thought you doubt the need for
> hugetlbfs free page pre zero out. Hugetlbfs is a good choice and is
> sufficient.

I remember even suggesting to focus on hugetlbfs during your KVM talk when chatting. Maybe I was not clear before.

>
>> We really don't *want* complicated things deep down in the mm core if
>> there are reasonable alternatives.
>>
> I understand your concern, we should have sufficient reason to add a new
> feature to the kernel. And for this one, its main value is to make the
> application's life easier. And implementing it in hugetlbfs can avoid
> adding more complexity to core MM.

Exactly, that's my point. Some people might still disagree with the hugetlbfs approach, but there it's easier to add tunables without affecting the overall system.

--
Thanks,

David / dhildenb