Chris Lalancette
2009-Jul-02 14:47 UTC
[Xen-devel] Latency spike during page_scrub_softirq
All,

This is a topic which has been brought up before, but is still something that plagues certain machines.  Let me describe the test scenario, the problem as I see it, and some of the data that I've collected.

TEST
----
The test is on a large AMD NUMA machine with 128GB of memory and 32 cpus (8 x quad-core), memory interleaved, running RHEL-5.4 Xen (although I believe the issue probably affects upstream Xen as well).  I install 2 RHEL-5.3 guests, one with 32GB of memory, and one with 64GB of memory.  On the first guest, I run a continuous ping (just out to the default gateway).  While that ping test is running, on the dom0 I do "xm destroy <64GB_guest>".  This takes a while to complete (as expected), but what is not expected is some huge jumps in the ping responses on the 32GB domain.  For instance, in the test I'm currently running, normal ping response time is ~0.5ms, but during the xm destroy of the other domain the ping response can jump all the way up to 3000 (or more) ms.  Once the big domain destroy is finished, everything returns to normal.

PROBLEM
-------
From what I can tell, the problem lies in page_scrub_softirq().  As a first test, I disabled page-scrubbing completely (obviously insecure, but just a test).  With no page-scrubbing at all, and direct memory freeing in free_domheap_pages(), no delays of the kind experienced in the original test were seen.  As a second test, I implemented the page scrubbing inside free_domheap_pages(), and again, no spikes at all were seen.

I then put things back like they were, and instrumented page_scrub_softirq().  Now, the serialize_lock at the top of the function makes sure only one CPU at a time comes in here.  However, when I instrumented the rest of the function, I found that when a CPU was in here doing work, it was spending 80-95% of its time waiting to get the page_scrub_lock (I have raw numbers, if you want to see them).  At first I thought this was purely contention with the other page_scrub_lock user in free_domheap_pages().  However, after changing the spin_lock(&page_scrub_lock) into a spin_trylock() inside page_scrub_softirq(), I still saw the spikes in the ping test, even though my instrumentation showed I was only waiting 20-30% of the time on the spinlock.  So I can't fully explain the rest of the spike.  Any ideas?  Other things I should probe?

SOLUTION
--------
There are a couple of solutions that I can think of:

1)  Just clear the pages inside free_domheap_pages().  I tried this with a 64GB guest as mentioned above, and I didn't see any ill effects from doing so.  It seems like this might actually be a valid way to go, although then a single CPU is doing all of the work of freeing the pages (might be a problem on UP systems).

2)  Clear the pages inside free_domheap_pages(), but do some kind of yield every once in a while (a rough sketch of this idea is below).  I don't know how feasible this would be.

3)  Do a lockless FIFO between free_domheap_pages() and page_scrub_softirq() (since that is all it really is).  While this would certainly work, it seems like a bit of overengineering for this problem.

Other ideas?  I'm happy to try to implement these, I'm just not sure what we would prefer to do.

-- 
Chris Lalancette
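To make option (2) concrete, here is a rough, untested sketch of what scrubbing in the free path with an occasional yield might look like.  scrub_one_page() and page_set_owner() are the existing helpers seen later in this thread, and process_pending_softirqs() is a real Xen helper; the batch size, the free_heap_pages(pg, order) call, and the assumption that it is safe to process softirqs at this point are illustrative guesses, not something verified here.

    /* Hypothetical sketch of option (2): scrub a dying domain's pages
     * synchronously in free_domheap_pages(), but let other softirqs run
     * every so often so one CPU does not disappear for seconds at a time.
     * The batch size (256 pages) is arbitrary, and whether calling
     * process_pending_softirqs() is safe in this context is an assumption. */
    for ( i = 0; i < (1 << order); i++ )
    {
        page_set_owner(&pg[i], NULL);
        scrub_one_page(&pg[i]);

        /* Periodically service pending softirqs (e.g. the timer) so the
         * ping-latency spike described above does not simply move here. */
        if ( (i & 0xff) == 0xff )
            process_pending_softirqs();
    }
    free_heap_pages(pg, order);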
On 02/07/2009 15:47, "Chris Lalancette" <clalance@redhat.com> wrote:

> There are a couple of solutions that I can think of:
> 1)  Just clear the pages inside free_domheap_pages().  I tried this with a
> 64GB guest as mentioned above, and I didn't see any ill effects from doing
> so.  It seems like this might actually be a valid way to go, although then
> a single CPU is doing all of the work of freeing the pages (might be a
> problem on UP systems).

Now that domain destruction is preemptible all the way back up to libxc, I think the page-scrub queue is not so much required.  And it seems it never worked very well anyway!  I will remove it.

This may make 'xm destroy' operations take a while, but actually this may be more sensibly handled by punting the destroy hypercall into another thread at dom0 userspace level, rather than doing the shonky 'scheduling' we attempt in Xen itself right now.

 -- Keir
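For reference, one way the userspace-thread idea could look.  This is purely an illustrative sketch, not code from xend or libxc: xc_domain_destroy() is the real libxc call, but the int-handle signature is assumed from the libxc of that era, and the wrapper names and structure are hypothetical.

    /* Illustrative only: run the (now preemptible) domain-destroy hypercall
     * from a worker thread in dom0 userspace, so the management tool returns
     * immediately instead of blocking for a large guest. */
    #include <pthread.h>
    #include <stdint.h>
    #include <stdlib.h>
    #include <xenctrl.h>

    struct destroy_job {
        int       xc_handle;   /* from xc_interface_open() */
        uint32_t  domid;
    };

    static void *destroy_worker(void *arg)
    {
        struct destroy_job *job = arg;

        /* The long-running, preemptible hypercall happens here. */
        xc_domain_destroy(job->xc_handle, job->domid);
        free(job);
        return NULL;
    }

    /* Fire-and-forget destroy: the caller is not held up while Xen scrubs
     * and frees the guest's memory.  Hypothetical helper name. */
    static int destroy_domain_async(int xc_handle, uint32_t domid)
    {
        struct destroy_job *job = malloc(sizeof(*job));
        pthread_t tid;

        if ( job == NULL )
            return -1;
        job->xc_handle = xc_handle;
        job->domid = domid;
        if ( pthread_create(&tid, NULL, destroy_worker, job) )
        {
            free(job);
            return -1;
        }
        pthread_detach(tid);
        return 0;
    }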
Chris Lalancette
2009-Jul-03 07:32 UTC
Re: [Xen-devel] Latency spike during page_scrub_softirq
Keir Fraser wrote:
> On 02/07/2009 15:47, "Chris Lalancette" <clalance@redhat.com> wrote:
>
>> There are a couple of solutions that I can think of:
>> 1)  Just clear the pages inside free_domheap_pages().  I tried this with a
>> 64GB guest as mentioned above, and I didn't see any ill effects from doing
>> so.  It seems like this might actually be a valid way to go, although then
>> a single CPU is doing all of the work of freeing the pages (might be a
>> problem on UP systems).
>
> Now that domain destruction is preemptible all the way back up to libxc, I
> think the page-scrub queue is not so much required. And it seems it never
> worked very well anyway! I will remove it.
>
> This may make 'xm destroy' operations take a while, but actually this may be
> more sensibly handled by punting the destroy hypercall into another thread
> at dom0 userspace level, rather than doing the shonky 'scheduling' we
> attempt in Xen itself right now.

Yep, agreed, and I see you've committed as c/s 19886.  Except...

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
...
@@ -1247,10 +1220,7 @@ void free_domheap_pages(struct page_info
         for ( i = 0; i < (1 << order); i++ )
         {
             page_set_owner(&pg[i], NULL);
-            spin_lock(&page_scrub_lock);
-            page_list_add(&pg[i], &page_scrub_list);
-            scrub_pages++;
-            spin_unlock(&page_scrub_lock);
+            scrub_one_page(&pg[i]);
         }
     }
 }

This hunk actually needs to free the page as well, with free_heap_pages().

-- 
Chris Lalancette
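For clarity, the combined result Chris is pointing at would look roughly like the following.  The free_heap_pages(pg, order) signature is assumed from the tree being patched; the actual follow-up changeset may differ.

    /* Rough sketch of the fix being described: scrub each page of the dying
     * domain's allocation, then actually return the whole order-sized block
     * to the heap rather than leaking it. */
    for ( i = 0; i < (1 << order); i++ )
    {
        page_set_owner(&pg[i], NULL);
        scrub_one_page(&pg[i]);
    }
    free_heap_pages(pg, order);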
On 03/07/2009 08:32, "Chris Lalancette" <clalance@redhat.com> wrote:

> This hunk actually needs to free the page as well, with free_heap_pages().

Yeah, oops!

 -- Keir