Having looked a little into the disabled SPLIT_PT_LOCK issue on Xen, I realized that is shouldn''t be too difficult to re-enable it (at least in some cases). One option would be to not use the mapping field of struct page for PageForeign but rather the index one (and to that effect was a mail I sent to Andrew Morton and Hugh Dickins that I got below answer on, basically indicating that it should be okay to use index here). This would allow removing the Xen conditional on SPLIT_PTLOCK_CPUS altogether. The other option would be along the lines of attached patch (not checked whether it would apply as-is to -unstable, but it appears to work okay), allowing the splitting at least in non-debug configs. Opinions? Thanks, Jan>>> Hugh Dickins <hugh@veritas.com> 28.02.07 22:08 >>>On Wed, 28 Feb 2007, Jan Beulich wrote:> A change early last year reordered struct page so that ptl overlaps not only > private, but also mapping. Since spinlock_t can be much larger, I''m wondering > whether there''s a reason to not also overlay the space index and lru take - > are these used for anything on page table pages?Overlaying lru is a problem for for those architectures which use kmem_cache_alloc for their pagetables: arm26, powerpc, sparc64 and perhaps others (I just grepped quickly through include/asm*, didn''t follow up those who have extern functions): since slab reuses the lru fields for its own purposes. Could perhaps be stacked somehow. Overlaying index is fairly straightforward: the index field is fair game. In my original patches I did overlay index, but Andrew was strongly averse to the way I was doing it, and scaled things back, to private alone if I remember rightly, then relaxed a little to include mapping too. Way back then I made up a patch to overlay index too (when I saw Fedora going out with CONFIG_DEBUG_SPINLOCK), but I could never get it into a form where I felt it would satisfy Andrew; and grew increasingly dissatisfied with that approach myself. I don''t think further overlaying is the right answer really. But I do think it''s a scandal that the size of struct page (in a DEBUG_SPINLOCK system) is governed by such a minority use of the struct page. Lacking a satisfying answer, I''ve just let it drift on until someone notices and complains. kmalloc a separate spinlock structure when it''s too big to fit in? Not such a good idea, since then there will tend to be false sharing of cachelines between them: simpler just to disable SPLIT_PTLOCK in that case. I''m not happy with the status quo, but I don''t know the right answer: perhaps allow pagetable pages to use an undebugged spinlock variant? Hugh _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 1/3/07 08:42, "Jan Beulich" <jbeulich@novell.com> wrote:> Having looked a little into the disabled SPLIT_PT_LOCK issue on Xen, I > realized that is shouldn''t be too difficult to re-enable it (at least in some > cases).How does pagetable locking work in modern Linux kernels? It seems that updates of ptes are protected by a per-page lock or the mm lock, and population of page directory entries is protected by the mm lock, but that there is no synchronisation with read-only pagetable walks. Does this mean that sections of pagetable hierarchy are never reaped from a process until it dies? Can we confident that the mm_pin/mm_unpin code (which walks pagetables and has to find every page to make every one read-only or writable) is safe? Presumably for this to be true we need to be sure that noone can meanwhile concurrently be populating the pagetable we are walking with extra pgds/puds/pmds/ptes... -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
>>> Keir Fraser <keir@xensource.com> 01.03.07 11:22 >>> >On 1/3/07 08:42, "Jan Beulich" <jbeulich@novell.com> wrote: > >> Having looked a little into the disabled SPLIT_PT_LOCK issue on Xen, I >> realized that is shouldn''t be too difficult to re-enable it (at least in some >> cases). > >How does pagetable locking work in modern Linux kernels? It seems that >updates of ptes are protected by a per-page lock or the mm lock, and >population of page directory entries is protected by the mm lock, but that >there is no synchronisation with read-only pagetable walks. Does this mean >that sections of pagetable hierarchy are never reaped from a process until >it dies?Yes, that''s my understanding.>Can we confident that the mm_pin/mm_unpin code (which walks pagetables and >has to find every page to make every one read-only or writable) is safe? >Presumably for this to be true we need to be sure that noone can meanwhile >concurrently be populating the pagetable we are walking with extra >pgds/puds/pmds/ptes...Since the pin/unpin walking only cares about pgd/pud/pmd entries, synchronization is guaranteed through mm->page_table_lock. The pte lock is used only for leaf entries, which are of no concern to (un)pinning. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 1/3/07 11:45, "Jan Beulich" <jbeulich@novell.com> wrote:>> Can we confident that the mm_pin/mm_unpin code (which walks pagetables and >> has to find every page to make every one read-only or writable) is safe? >> Presumably for this to be true we need to be sure that noone can meanwhile >> concurrently be populating the pagetable we are walking with extra >> pgds/puds/pmds/ptes... > > Since the pin/unpin walking only cares about pgd/pud/pmd entries, > synchronization > is guaranteed through mm->page_table_lock. The pte lock is used only for leaf > entries, which are of no concern to (un)pinning.Oh yes, of course. By the way, I think your proposed patch looks okay. However we already use page->index for foreign pages, in netback. Perhaps page->lru is also fair game? :-) K. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
>By the way, I think your proposed patch looks okay. However we already use >page->index for foreign pages, in netback. Perhaps page->lru is also fair >game? :-)Not generally, there are arch-es that use it for their own purposes (look back at the message from Hugh I originally forwarded). So even if none of Xen''s currently supported would use it (I didn''t check), I wouldn''t want to introduce a lurking issue if anyone comes up with a new port. I didn''t check what private is being used for - while it can''t be used for PageForeign''s destructor pointer, perhaps netback could use that one rather than index? Or, even more likely, netback could use mapping if PageForeign was switched over to index (while mapping overlays ptl, that is benign to netback as it''s not dealing with pte pages). But of course, this requires thorough auditing of other uses of index. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 1/3/07 12:50, "Jan Beulich" <jbeulich@novell.com> wrote:> I didn''t check what private is being used for - while it can''t be used for > PageForeign''s destructor pointer, perhaps netback could use that one > rather than index? Or, even more likely, netback could use mapping if > PageForeign was switched over to index (while mapping overlays ptl, that > is benign to netback as it''s not dealing with pte pages). But of course, > this requires thorough auditing of other uses of index.Since PageForeign is only used for pagetables (for which you checked ->index should be okay) and for skbuff data (where we already use ->index in netback) then changing PageForeign to put the callback in ->index should be fine. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel