Hi,

The patch below is needed to make my system run stably in PAE mode.
I haven't seen problems without PAE; I'm not sure whether that's just
pure luck or whether there is a bug in my PAE xenlinux kernel.
To me it looks like a generic bug, though.

I've actually run into problems with unpin only: a process
exits, somewhere in exit_mm() the page tables are unpinned, and
shortly thereafter the mappings are cleared. While doing so the
kernel oopses in zap_pte_range(), on a page table write access,
probably due to some stale TLB entry where the page is still
tagged read-only.

cheers,

  Gerd

Index: linux-2.6.11/arch/xen/i386/mm/pgtable.c
===================================================================
--- linux-2.6.11.orig/arch/xen/i386/mm/pgtable.c	2005-06-22 16:25:17.000000000 +0200
+++ linux-2.6.11/arch/xen/i386/mm/pgtable.c	2005-06-23 18:20:45.000000000 +0200
@@ -486,7 +486,8 @@ void mm_pin(struct mm_struct *mm)
     mm_walk(mm, PAGE_KERNEL_RO);
     HYPERVISOR_update_va_mapping(
         (unsigned long)mm->pgd,
-        pfn_pte(virt_to_phys(mm->pgd)>>PAGE_SHIFT, PAGE_KERNEL_RO), 0);
+        pfn_pte(virt_to_phys(mm->pgd)>>PAGE_SHIFT, PAGE_KERNEL_RO),
+        UVMF_TLB_FLUSH);
     xen_pgd_pin(__pa(mm->pgd));
     mm->context.pinned = 1;
     spin_lock(&mm_unpinned_lock);
@@ -505,6 +506,7 @@ void mm_unpin(struct mm_struct *mm)
         (unsigned long)mm->pgd,
         pfn_pte(virt_to_phys(mm->pgd)>>PAGE_SHIFT, PAGE_KERNEL), 0);
     mm_walk(mm, PAGE_KERNEL);
+    xen_tlb_flush();
     mm->context.pinned = 0;
     spin_lock(&mm_unpinned_lock);
     list_add(&mm->context.unpinned, &mm_unpinned);

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
On 23 Jun 2005, at 17:36, Gerd Knorr wrote:

> The patch below is needed to make my system run stably in PAE mode.
> I haven't seen problems without PAE; I'm not sure whether that's just
> pure luck or whether there is a bug in my PAE xenlinux kernel.
> To me it looks like a generic bug, though.
>
> I've actually run into problems with unpin only: a process
> exits, somewhere in exit_mm() the page tables are unpinned, and
> shortly thereafter the mappings are cleared. While doing so the
> kernel oopses in zap_pte_range(), on a page table write access,
> probably due to some stale TLB entry where the page is still
> tagged read-only.

That's a good catch! I thought for a long time that, when you increase
the permissions of a page mapping, a TLB flush is unnecessary (because
the processor would see the restrictive TLB entry, but walk the
pagetables before triggering a page fault). This is certainly a feature
of most architectures with hardware-walked pagetables, but it does not
occur on modern x86 CPUs because it significantly increases
demand-fault latency. I only found that out from Intel a couple of days
ago[*]. :-)

The flush on mm_pin() is actually unnecessary, because one will be
triggered by pgd_pin(). But it is harmless to add it: it makes it clear
that we want a flush at that point, and it will cause pgd_pin not to do
a superfluous extra flush (because we have flush-avoidance logic).

 -- Keir

[*]: Even more sophisticated: at the time it triggers the page fault,
the CPU will invalidate the read-only TLB entry. This means that, when
doing copy-on-write demand faults, for example, the OS does not need to
flush or invlpg to avoid another protection fault when it replays the
faulting instruction. Kind of odd behaviour, but obviously required
behaviour when you think about it...
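
The behaviour Keir describes can be sketched as a toy model in plain C. This is not Xen or Linux code, and every name in it (struct toy_mmu, toy_read, toy_write, toy_tlb_flush, toy_faults_after_upgrade) is hypothetical. It assumes a single page with a single cached TLB entry, and it models only the two points from the mail: a write that hits a cached read-only entry faults without re-walking the page table, and taking that fault invalidates the entry.

```c
#include <stdbool.h>

/* Toy model of one page and its cached translation. All names are
 * hypothetical; none of this is a Xen or Linux interface. */
struct toy_mmu {
    bool pte_writable;  /* permission stored in the in-memory page table */
    bool tlb_valid;     /* is a translation currently cached? */
    bool tlb_writable;  /* permission recorded in the cached entry */
};

/* On a TLB miss, walk the page table and cache what it says. */
static void toy_fill(struct toy_mmu *m)
{
    if (!m->tlb_valid) {
        m->tlb_writable = m->pte_writable;
        m->tlb_valid = true;
    }
}

/* A read always succeeds here; its only effect is to populate the TLB. */
static void toy_read(struct toy_mmu *m)
{
    toy_fill(m);
}

/* Drop the cached translation, as an explicit TLB flush would. */
static void toy_tlb_flush(struct toy_mmu *m)
{
    m->tlb_valid = false;
}

/* Attempt a write: true on success, false on a protection fault. */
static bool toy_write(struct toy_mmu *m)
{
    toy_fill(m);
    if (m->tlb_writable)
        return true;
    /* Modelled x86 behaviour: fault straight from the cached read-only
     * entry (no re-walk), and invalidate that entry while faulting. */
    m->tlb_valid = false;
    return false;
}

/* Cache a read-only entry, upgrade the page table to writable, then
 * count the faults taken before a write finally succeeds. */
static int toy_faults_after_upgrade(bool flush)
{
    struct toy_mmu m = { false, false, false };
    int faults = 0;

    toy_read(&m);            /* TLB now holds a read-only entry */
    m.pte_writable = true;   /* permissions increased in the page table */
    if (flush)
        toy_tlb_flush(&m);   /* the step the patch adds on unpin */

    while (!toy_write(&m))
        faults++;
    return faults;
}
```

In this model toy_faults_after_upgrade(false) returns 1 and toy_faults_after_upgrade(true) returns 0: without the flush, exactly one protection fault is taken on the stale entry, and the replayed write then succeeds. The model says nothing about what a kernel does with that one fault; in Gerd's report it surfaced as an oops in zap_pte_range().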