Blair Bethwaite
2017-Nov-14 17:52 UTC
Re: [libvirt-users] dramatic performance slowdown due to THP allocation failure with full pagecache
Thanks for the reply Daniel,

However I think you slightly misunderstood the scenario...

On 14 November 2017 at 10:32, Daniel P. Berrange <berrange@redhat.com> wrote:
> IOW, if your application has a certain expectation of performance that can only
> be satisfied by having the KVM guest backed by huge pages, then you should
> really change to explicitly reserve huge pages for the guests, and not rely on
> THP which inherently can't provide any guarantee in this area.

We already do this. The problem is not hugepage backing of the guest,
it is THP allocation inside the guest (or indeed on a bare-metal
host). The issue in the HPC world is that we support so many different
applications (some of which are complete black-boxes) that explicit
hugepage allocation for application memory is generally not viable, so
we are reliant on THP to avoid TLB thrashing.

> The kernel can't predict the future usage pattern of processes so it is not at
> all clear cut that evicting the entire pagecache in order to allocate more
> huge pages is going to be beneficial for system performance as a whole.

Yet the default behaviour seems to be to stall on fault, then directly
reclaim and defrag in order to allocate a hugepage if at all possible.
In my test-case there is almost no free memory, so some pagecache has
to be reclaimed for the process. What I don't understand is why the THP
allocation fails in this case but succeeds when pagecache usage is lower.

-- 
Cheers,
~Blairo
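For anyone trying to reproduce this, a minimal sketch of how the THP policy and the fault-time fallback rate might be inspected. It assumes the usual Linux interfaces (/sys/kernel/mm/transparent_hugepage/enabled and defrag, plus the thp_fault_alloc and thp_fault_fallback counters in /proc/vmstat); the helper names are made up for illustration and none of this comes from the thread itself.

```python
import re
from pathlib import Path

# Assumed location of the THP controls on a typical Linux system.
THP_DIR = Path("/sys/kernel/mm/transparent_hugepage")


def active_setting(text):
    """Return the bracketed (active) option from a sysfs line such as
    'always defer [madvise] never', or None if nothing is bracketed."""
    match = re.search(r"\[([^\]]+)\]", text)
    return match.group(1) if match else None


def thp_counters(vmstat_text):
    """Collect the thp_* counters from /proc/vmstat-style text."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].startswith("thp_"):
            counters[parts[0]] = int(parts[1])
    return counters


def fallback_ratio(counters):
    """Fraction of THP page faults that fell back to small pages."""
    alloc = counters.get("thp_fault_alloc", 0)
    fallback = counters.get("thp_fault_fallback", 0)
    total = alloc + fallback
    return fallback / total if total else 0.0


if __name__ == "__main__":
    for name in ("enabled", "defrag"):
        path = THP_DIR / name
        if path.exists():
            print(name, "=", active_setting(path.read_text()))
    vmstat = Path("/proc/vmstat")
    if vmstat.exists():
        counters = thp_counters(vmstat.read_text())
        print("fault fallback ratio:", fallback_ratio(counters))
```

Sampling the counters before and after a test run, once with a full pagecache and once after dropping it, should show whether fault-time THP allocations are actually falling back in the slow case.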
Daniel P. Berrange
2017-Nov-14 17:56 UTC
Re: [libvirt-users] dramatic performance slowdown due to THP allocation failure with full pagecache
On Tue, Nov 14, 2017 at 10:52:03AM -0700, Blair Bethwaite wrote:
> Thanks for the reply Daniel,
>
> However I think you slightly misunderstood the scenario...
>
> On 14 November 2017 at 10:32, Daniel P. Berrange <berrange@redhat.com> wrote:
> > IOW, if your application has a certain expectation of performance that can only
> > be satisfied by having the KVM guest backed by huge pages, then you should
> > really change to explicitly reserve huge pages for the guests, and not rely on
> > THP which inherently can't provide any guarantee in this area.
>
> We already do this. The problem is not hugepage backing of the guest,
> it is THP allocation inside the guest (or indeed on a bare-metal
> host). The issue in the HPC world is that we support so many different
> applications (some of which are complete black-boxes) that explicit
> hugepage allocation for application memory is generally not viable, so
> we are reliant on THP to avoid TLB thrashing.

Oh well, THP usage inside the guest is then not really anything to do with
virt, just a regular Linux question, so I'm not sure libvirt is the best
place to ask.

Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
Blair Bethwaite
2017-Nov-14 18:03 UTC
Re: [libvirt-users] dramatic performance slowdown due to THP allocation failure with full pagecache
On 14 November 2017 at 10:56, Daniel P. Berrange <berrange@redhat.com> wrote:
> Oh well THP usage inside the guest is then not really anything todo with
> virt, just a regular Linux questions, so not sure libvirt is the best
> place to ask.

True, I just hoped you or one of the other devs might have some
insight on reclaim behaviour that would provide a clue. I guess I'll
try a linux list.

Any comment on this (which is virt related):

(1) a related and possibly dumb question: in the case of a
high-performance KVM where the guest is hugepage backed and pinned in
host memory anyway, why do we still have a table based resolution for
guest physical to host virtual address translation - couldn't this
just be done by offset?

-- 
Cheers,
~Blairo
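To make the question above concrete, here is a toy model contrasting the two translation styles. It is only a sketch: the Slot layout, the addresses, and the assumption that guest RAM is split into two ranges with a hole between them are all hypothetical, and it does not describe how KVM or the hardware page tables actually work.

```python
from bisect import bisect_right
from typing import NamedTuple, Optional


class Slot(NamedTuple):
    """Hypothetical guest RAM region mapped at a host virtual address."""
    guest_base: int
    size: int
    host_base: int


# Made-up layout: two 2 GiB regions with a hole between them in guest space.
SLOTS = [
    Slot(0x0000_0000, 0x8000_0000, 0x7F00_0000_0000),
    Slot(0x1_0000_0000, 0x8000_0000, 0x7F00_8000_0000),
]


def translate_by_offset(gpa: int, offset: int) -> int:
    """Single-offset translation: only valid if guest RAM is one range."""
    return gpa + offset


def translate_by_table(gpa: int, slots: list) -> Optional[int]:
    """Table lookup: find the slot containing gpa (slots sorted by base)."""
    bases = [s.guest_base for s in slots]
    i = bisect_right(bases, gpa) - 1
    if i < 0:
        return None
    slot = slots[i]
    delta = gpa - slot.guest_base
    if delta >= slot.size:
        return None  # address falls in a hole between slots
    return slot.host_base + delta


if __name__ == "__main__":
    gpa = 0x1_0000_1000
    print(hex(translate_by_table(gpa, SLOTS)))
    print(hex(translate_by_offset(gpa, SLOTS[0].host_base)))
```

In this made-up layout the two functions agree for addresses in the first region and disagree for the second, which is the case a single offset cannot express. Whether that is the real reason for the table-based design is a question for the KVM developers.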