David Hildenbrand
2020-Mar-09 09:03 UTC
[PATCH v1 3/3] virtio-balloon: Switch back to OOM handler for VIRTIO_BALLOON_F_DEFLATE_ON_OOM
On 08.03.20 05:47, Tyler Sanderson wrote:
> Tested-by: Tyler Sanderson <tysand at google.com>
>
> Test setup: VM with 16 CPU, 64GB RAM. Running Debian 10. We have a 42
> GB file full of random bytes that we continually cat to /dev/null.
> This fills the page cache as the file is read. Meanwhile we trigger
> the balloon to inflate, with a target size of 53 GB. This setup causes
> the balloon inflation to pressure the page cache as the page cache is
> also trying to grow. Afterwards we shrink the balloon back to zero (so
> total deflate = total inflate).
>
> Without patch (kernel 4.19.0-5):
> Inflation never reaches the target until we stop the "cat file >
> /dev/null" process. Total inflation time was 542 seconds. The longest
> period that made no net forward progress was 315 seconds (see attached
> graph).
> Result of "grep balloon /proc/vmstat" after the test:
> balloon_inflate 154828377
> balloon_deflate 154828377
>
> With patch (kernel 5.6.0-rc4+):
> Total inflation duration was 63 seconds. No deflate-queue activity
> occurs when pressuring the page-cache.
> Result of "grep balloon /proc/vmstat" after the test:
> balloon_inflate 12968539
> balloon_deflate 12968539
>
> Conclusion: This patch fixes the issue. In the test it reduced
> inflate/deflate activity by 12x, and reduced inflation time by 8.6x.
> But more importantly, if we hadn't killed the "grep balloon
> /proc/vmstat" process then, without the patch, the inflation process
> would never reach the target.
>
> Attached is a png of a graph showing the problematic behavior without
> this patch. It shows deflate-queue activity increasing linearly while
> balloon size stays constant over the course of more than 8 minutes of
> the test.

Thanks a lot for the extended test!

--
Thanks,

David / dhildenb
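As a cross-check on the figures in the test report above, the following is a minimal sketch that parses the quoted "grep balloon /proc/vmstat" output and recomputes the two reduction factors. The helper name parse_balloon_counters and the constant names are hypothetical, introduced only for illustration; the counter values and the 542 s / 63 s durations are copied from the report.

```python
def parse_balloon_counters(vmstat_text):
    """Return a dict of balloon_* counters from /proc/vmstat-style text."""
    counters = {}
    for line in vmstat_text.splitlines():
        parts = line.split()
        # Each /proc/vmstat line has the form "<name> <value>".
        if len(parts) == 2 and parts[0].startswith("balloon_"):
            counters[parts[0]] = int(parts[1])
    return counters


# Values copied from the test report (hypothetical constant names).
WITHOUT_PATCH = """balloon_inflate 154828377
balloon_deflate 154828377"""
WITH_PATCH = """balloon_inflate 12968539
balloon_deflate 12968539"""

before = parse_balloon_counters(WITHOUT_PATCH)
after = parse_balloon_counters(WITH_PATCH)

# Ratio of inflate activity without the patch to activity with it.
activity_ratio = before["balloon_inflate"] / after["balloon_inflate"]
# Ratio of total inflation time: 542 s without the patch, 63 s with it.
time_ratio = 542 / 63

print(f"inflate/deflate activity reduced by {activity_ratio:.1f}x")
print(f"inflation time reduced by {time_ratio:.1f}x")
```

The activity ratio comes out just under 12 and the time ratio at about 8.6, consistent with the rounded "12x" and "8.6x" in the report.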
Michael S. Tsirkin
2020-Mar-09 10:14 UTC
[PATCH v1 3/3] virtio-balloon: Switch back to OOM handler for VIRTIO_BALLOON_F_DEFLATE_ON_OOM
On Mon, Mar 09, 2020 at 10:03:14AM +0100, David Hildenbrand wrote:
> On 08.03.20 05:47, Tyler Sanderson wrote:
> > Tested-by: Tyler Sanderson <tysand at google.com>
> >
> > [...]
>
> Thanks a lot for the extended test!

Given we shipped this for a long time, I think the best way to make
progress is to merge 1/3, 2/3 right now, and 3/3 in the next release.

> --
> Thanks,
>
> David / dhildenb
David Hildenbrand
2020-Mar-09 10:59 UTC
[PATCH v1 3/3] virtio-balloon: Switch back to OOM handler for VIRTIO_BALLOON_F_DEFLATE_ON_OOM
On 09.03.20 11:14, Michael S. Tsirkin wrote:
> On Mon, Mar 09, 2020 at 10:03:14AM +0100, David Hildenbrand wrote:
>> On 08.03.20 05:47, Tyler Sanderson wrote:
>>> Tested-by: Tyler Sanderson <tysand at google.com>
>>>
>>> [...]
>>
>> Thanks a lot for the extended test!
>
> Given we shipped this for a long time, I think the best way
> to make progress is to merge 1/3, 2/3 right now, and 3/3
> in the next release.

Agreed.

--
Thanks,

David / dhildenb