This patch solves the following problem. When a large VS terminates, the node locks up. The node locks up because the page_scrub_kick routine sends a softirq to all processors instructing them to run the page scrub code. There they interfere with each other as they serialize behind the page_scrub_lock.

The patch does two things:

(1) In page_scrub_kick, only a single cpu is interrupted. Some cpu other than the calling cpu is chosen (if available) because we assume the calling cpu has other higher priority work to do.

(2) In page_scrub_softirq, if more than one cpu is online, the first cpu to start scrubbing designates itself as the primary_scrubber. As such it is dedicated to scrubbing pages until the list is empty. Other cpus might call page_scrub_softirq but they spend only 1 msec scrubbing before returning to check for other higher priority work. But, with multiple cpus online, the node can afford to have one cpu dedicated to scrubbing when that work needs to be done.

Signed-off-by: Robert Phillips <rphillips@virtualiron.com>
Signed-off-by: Ben Guthro <bguthro@virtualiron.com>
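[Editorial note: for readers unfamiliar with the code paths named above, here is a minimal sketch of the behaviour the changelog describes; it is not the actual patch. pick_other_online_cpu() and scrub_one_batch() are hypothetical placeholders, while the softirq plumbing (cpu_raise_softirq(), PAGE_SCRUB_SOFTIRQ, NOW(), MILLISECS()) follows the standard Xen interfaces of the era.]

```c
/* Illustrative sketch of the behaviour described above -- not the actual
 * patch.  pick_other_online_cpu() and scrub_one_batch() are hypothetical
 * helpers standing in for the real implementation. */

static unsigned long primary_scrubber_flag;  /* bit 0: a primary scrubber exists */

/* (1) Interrupt a single CPU, preferring one other than the caller. */
static void page_scrub_kick_sketch(void)
{
    unsigned int target = pick_other_online_cpu(smp_processor_id());

    if ( !list_empty(&page_scrub_list) )
        cpu_raise_softirq(target, PAGE_SCRUB_SOFTIRQ);
}

/* (2) The first CPU in becomes the dedicated "primary scrubber"; any other
 * CPU scrubs for at most ~1ms before going back to its own work. */
static void page_scrub_softirq_sketch(void)
{
    s_time_t deadline = NOW() + MILLISECS(1);
    int primary = 0;

    if ( (num_online_cpus() > 1) &&
         !test_and_set_bit(0, &primary_scrubber_flag) )
        primary = 1;

    /* scrub_one_batch() scrubs a few pages under page_scrub_lock and
     * returns 0 once the list is empty. */
    while ( scrub_one_batch() )
    {
        if ( !primary && (NOW() > deadline) )
            break;              /* non-primary CPUs give up after ~1ms */
    }

    if ( primary )
        clear_bit(0, &primary_scrubber_flag);
}
```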
Keir Fraser
2008-May-23 16:04 UTC
Re: [Xen-devel] [PATCH] scrub pages on guest termination
The aim of the loop was to scrub enough pages in a batch that lock contention is kept tolerably low. Even if 16 pages is not sufficient for that, I'm surprised a 'node' (you mean a whole system, presumably?) would appear to lock up. Maybe pages would be scrubbed slower than we'd like, but still CPUs should be able to get the spinlock often enough to evaluate whether they have spent 1ms in the loop and hence get out of there.

What sort of system were you seeing the lockup on? Does it have very many physical CPUs?

 -- Keir

On 23/5/08 16:00, "Ben Guthro" <bguthro@virtualiron.com> wrote:

> This patch solves the following problem. When a large VS terminates, the node locks
> up. The node locks up because the page_scrub_kick routine sends a softirq to
> all processors instructing them to run the page scrub code. There they interfere
> with each other as they serialize behind the page_scrub_lock.
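[Editorial note: the loop being referred to is the batch-and-deadline structure of page_scrub_softirq() in page_alloc.c at the time: take page_scrub_lock, peel off a small batch (16 pages) of the scrub list, drop the lock, scrub the batch, and repeat until roughly 1ms has elapsed. The sketch below is simplified from memory of that code; the list surgery and page mapping are replaced by placeholder helpers.]

```c
/* Simplified sketch of the pre-patch batch/deadline loop; the real code
 * does the list splicing and clear_page() work in full. */
static void page_scrub_softirq_prepatch_sketch(void)
{
    s_time_t start = NOW();

    do {
        spin_lock(&page_scrub_lock);

        if ( list_empty(&page_scrub_list) )
        {
            spin_unlock(&page_scrub_lock);
            return;
        }

        /* Peel up to 16 pages off the global list while holding the lock. */
        peel_batch_of_16();     /* placeholder for the real list surgery */

        spin_unlock(&page_scrub_lock);

        /* Scrub the peeled pages without holding the lock. */
        scrub_batch();          /* placeholder: clear_page() each one */

    } while ( (NOW() - start) < MILLISECS(1) );   /* ~1ms per invocation */
}
```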
Yes, sorry - should have removed our terminology from the description.

Node = physical machine
VS = HVM guest w/ pv-on-hvm drivers

Looking back at the original bug report - it seems to indicate it was migrating from a system with 2 processors to one with 8.

Specifically - from

Dell Precision WorkStation 380
Processor: Intel(R) Pentium(R) D CPU 2.80GHz
# of CPUs: 2
Speed: 2.8GHz

to

Supermicro X7DB8
Processor: Genuine Intel(R) CPU @ 2.13GHz
# of CPUs: 8
Speed: 2.133 GHz

Keir Fraser wrote:

> The aim of the loop was to scrub enough pages in a batch that lock
> contention is kept tolerably low. Even if 16 pages is not sufficient
> for that, I'm surprised a 'node' (you mean a whole system,
> presumably?) would appear to lock up. Maybe pages would be scrubbed
> slower than we'd like, but still CPUs should be able to get the
> spinlock often enough to evaluate whether they have spent 1ms in the
> loop and hence get out of there.
>
> What sort of system were you seeing the lockup on? Does it have very
> many physical CPUs?
>
> -- Keir
>
> On 23/5/08 16:00, "Ben Guthro" <bguthro@virtualiron.com> wrote:
>
> This patch solves the following problem. When a large VS
> terminates, the node locks
> up. The node locks up because the page_scrub_kick routine sends a
> softirq to
> all processors instructing them to run the page scrub code. There
> they interfere
> with each other as they serialize behind the page_scrub_lock.
Keir Fraser
2008-May-23 17:19 UTC
Re: [Xen-devel] [PATCH] scrub pages on guest termination
On 23/5/08 18:01, "Ben Guthro" <bguthro@virtualiron.com> wrote:

> Yes, sorry - should have removed our terminology from the description.
> Node=physical machine
> VS=HVM guest w/ pv-on-hvm drivers
> Looking back at the original bug report - it seems to indicate it was
> migrating from a system with 2 processors to one with 8

It's very surprising that lock contention would cause such a severe lack of progress on an 8-CPU system. If the lock is that hotly contended then even the usage of it in free_domheap_pages() has to be questionable. I'm inclined to say that if we want to address this then we should do it in one or more of the following ways:

1. Count CPUs into the scrub function with an atomic_t and beyond a limit all other CPUs bail straight out after re-setting their timer.
2. Increase scrub batch size to reduce proportion of time that each loop iteration holds the lock.
3. Turn the spin_lock() into a spin_trylock() so that the timeout check can be guaranteed to execute frequently.
4. Eliminate the global lock by building a lock-free linked list, or by maintaining per-CPU hashed work queues with work stealing, or... etc.

The patch as-is at least suffers from the issue that the 'primary scrubber' should be regularly checking for softirq work. But I'm not sure such a sizeable change to the scheduling policy for scrubbing (such as it is!) is necessary or desirable. Option 4 is on the morally highest ground but is of course the most work. :-)

 -- Keir
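[Editorial note: options 1 and 3 are small, localised changes and are easy to sketch. The sketch below is an illustrative combination of the two, not a proposed patch: an atomic_t caps how many CPUs scrub concurrently, and spin_trylock() keeps the deadline check from being starved by contention. MAX_SCRUB_CPUS, peel_batch() and scrub_batch() are hypothetical names.]

```c
/* Illustrative sketch of options 1 and 3 above -- not a proposed patch. */

#define MAX_SCRUB_CPUS 2                     /* hypothetical limit */
static atomic_t scrubbing_cpus = ATOMIC_INIT(0);

static void page_scrub_softirq_options_sketch(void)
{
    s_time_t start = NOW();

    /* Option 1: cap the number of CPUs that enter the scrub loop.
     * (The real version would also re-set the per-CPU scrub timer
     * before bailing out, as the option describes.) */
    if ( atomic_read(&scrubbing_cpus) >= MAX_SCRUB_CPUS )
        return;
    atomic_inc(&scrubbing_cpus);

    do {
        /* Option 3: never block on the lock, so the 1ms deadline check
         * below is guaranteed to run frequently even under contention. */
        if ( !spin_trylock(&page_scrub_lock) )
            continue;

        if ( list_empty(&page_scrub_list) )
        {
            spin_unlock(&page_scrub_lock);
            break;
        }

        /* Peel a batch under the lock, then scrub with it released
         * (placeholders for the existing list surgery and clear_page loop). */
        peel_batch();
        spin_unlock(&page_scrub_lock);
        scrub_batch();

    } while ( (NOW() - start) < MILLISECS(1) );

    atomic_dec(&scrubbing_cpus);
}
```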