Konrad Rzeszutek Wilk
2011-Sep-26 13:13 UTC
[Xen-devel] [PATCH] x86/paravirt: Partially revert "remove lazy mode in interrupts"
which has git commit b8bcfe997e46150fedcc3f5b26b846400122fdd9.

The unintended consequence of removing the flushing of MMU
updates when doing kmap_atomic (or kunmap_atomic) is that we can
hit a dereference bug when processing a "fork()" under a heavily loaded
machine. Specifically we can hit:

BUG: unable to handle kernel paging request at f573fc8c
IP: [<c01abc54>] swap_count_continued+0x104/0x180
*pdpt = 000000002a3b9027 *pde = 0000000001bed067 *pte = 0000000000000000
Oops: 0000 [#1] SMP
Modules linked in:
Pid: 1638, comm: apache2 Not tainted 3.0.4-linode37 #1
EIP: 0061:[<c01abc54>] EFLAGS: 00210246 CPU: 3
EIP is at swap_count_continued+0x104/0x180
.. snip..
Call Trace:
[<c01ac222>] ? __swap_duplicate+0xc2/0x160
[<c01040f7>] ? pte_mfn_to_pfn+0x87/0xe0
[<c01ac2e4>] ? swap_duplicate+0x14/0x40
[<c01a0a6b>] ? copy_pte_range+0x45b/0x500
[<c01a0ca5>] ? copy_page_range+0x195/0x200
[<c01328c6>] ? dup_mmap+0x1c6/0x2c0
[<c0132cf8>] ? dup_mm+0xa8/0x130
[<c013376a>] ? copy_process+0x98a/0xb30
[<c013395f>] ? do_fork+0x4f/0x280
[<c01573b3>] ? getnstimeofday+0x43/0x100
[<c010f770>] ? sys_clone+0x30/0x40
[<c06c048d>] ? ptregs_clone+0x15/0x48
[<c06bfb71>] ? syscall_call+0x7/0xb

The problem looks to be that in copy_page_range we turn lazy mode on, and then
in swap_entry_free we call swap_count_continued which ends up in:

map = kmap_atomic(page, KM_USER0) + offset;

and then later touches *map.

Since we are running in batched mode (lazy) we don't actually set up the
PTE mappings and the kmap_atomic is not done synchronously and ends up
trying to dereference a page that has not been set.

Looking at kmap_atomic_prot_pfn, it uses 'arch_flush_lazy_mmu_mode' and
sprinkling that in kmap_atomic_prot and __kunmap_atomic makes the problem
go away.

CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@redhat.com>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: x86@kernel.org
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
CC: stable@kernel.org
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 arch/x86/mm/highmem_32.c | 2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/arch/x86/mm/highmem_32.c b/arch/x86/mm/highmem_32.c
index b499626..f4f29b1 100644
--- a/arch/x86/mm/highmem_32.c
+++ b/arch/x86/mm/highmem_32.c
@@ -45,6 +45,7 @@ void *kmap_atomic_prot(struct page *page, pgprot_t prot)
 	vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
 	BUG_ON(!pte_none(*(kmap_pte-idx)));
 	set_pte(kmap_pte-idx, mk_pte(page, prot));
+	arch_flush_lazy_mmu_mode();
 
 	return (void *)vaddr;
 }
@@ -88,6 +89,7 @@ void __kunmap_atomic(void *kvaddr)
 		 */
 		kpte_clear_flush(kmap_pte-idx, vaddr);
 		kmap_atomic_idx_pop();
+		arch_flush_lazy_mmu_mode();
 	}
 #ifdef CONFIG_DEBUG_HIGHMEM
 	else {
-- 
1.7.4.1

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
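Editorial sketch: the failure described in the patch above can be reproduced in a small userspace toy model. This is not kernel code and makes no claim about how Xen or paravirt_ops actually queue updates; every toy_* name, the slot and queue sizes, and the frame number 42 are assumptions made up for illustration. The only thing it tries to show is the ordering problem: while batching is on, a PTE write is only queued, so a dereference that follows immediately sees an empty entry unless the queue is flushed first.

```c
/* Userspace toy model of lazy (batched) PTE updates.
 * All names are hypothetical; nothing here is a kernel API. */
#include <assert.h>
#include <stddef.h>

#define TOY_SLOTS 4
#define TOY_QUEUE 8

static int toy_pte[TOY_SLOTS];                 /* 0 means "not present" */
static int toy_lazy;                           /* non-zero while batching */
static struct { int slot, val; } toy_q[TOY_QUEUE];
static size_t toy_qlen;

/* Apply every queued update in order.
 * Stands in for arch_flush_lazy_mmu_mode(). */
static void toy_flush(void)
{
    for (size_t i = 0; i < toy_qlen; i++)
        toy_pte[toy_q[i].slot] = toy_q[i].val;
    toy_qlen = 0;
}

static void toy_enter_lazy(void) { toy_lazy = 1; }
static void toy_leave_lazy(void) { toy_flush(); toy_lazy = 0; }

/* Stands in for set_pte(): immediate normally, queued in lazy mode. */
static void toy_set_pte(int slot, int val)
{
    if (toy_lazy && toy_qlen < TOY_QUEUE) {
        toy_q[toy_qlen].slot = slot;
        toy_q[toy_qlen].val = val;
        toy_qlen++;
    } else {
        toy_flush();            /* preserve ordering if the queue is full */
        toy_pte[slot] = val;
    }
}

/* Stands in for kmap_atomic_prot(): install a mapping and, if asked,
 * flush so the mapping is usable as soon as we return. */
static int toy_kmap(int slot, int frame, int flush)
{
    toy_set_pte(slot, frame);
    if (flush)
        toy_flush();
    return slot;
}

/* Stands in for touching *map: returns the frame seen, 0 for a fault. */
static int toy_deref(int slot) { return toy_pte[slot]; }

/* A fork-like path: batching is on, then something maps and dereferences. */
static int toy_fork_path(int flush)
{
    int seen;

    toy_enter_lazy();
    seen = toy_deref(toy_kmap(1, 42, flush));
    toy_leave_lazy();
    toy_pte[1] = 0;             /* reset the slot for the next run */
    return seen;
}
```

Calling toy_fork_path(0) models the code before the patch (the dereference sees an empty entry), and toy_fork_path(1) models the patched behaviour (the mapping is visible when kmap returns).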
Jeremy Fitzhardinge
2011-Sep-26 16:22 UTC
[Xen-devel] Re: [PATCH] x86/paravirt: Partially revert "remove lazy mode in interrupts"
On 09/26/2011 06:13 AM, Konrad Rzeszutek Wilk wrote:
> which has git commit b8bcfe997e46150fedcc3f5b26b846400122fdd9.
>
> The unintended consequence of removing the flushing of MMU
> updates when doing kmap_atomic (or kunmap_atomic) is that we can
> hit a dereference bug when processing a "fork()" under a heavily loaded
> machine. Specifically we can hit:

The patch is all OK, but I wouldn't have headlined it as a "partial
revert" - the important point is that the pte updates in k(un)map_atomic
need to be synchronous, regardless of whether we're in lazy_mmu mode.

The fact that b8bcfe997e4 introduced the problem is interesting to note,
but only somewhat relevant to the analysis of what's being fixed here.

J
Konrad Rzeszutek Wilk
2011-Sep-26 19:34 UTC
[Xen-devel] Is: [PATCH] x86/paravirt: PTE updates in k(un)map_atomic need to be synchronous, regardless of lazy_mmu mode. Was: Re: [PATCH] x86/paravirt: Partially revert "remove lazy mode in interrupts"
On Mon, Sep 26, 2011 at 09:22:21AM -0700, Jeremy Fitzhardinge wrote:
> On 09/26/2011 06:13 AM, Konrad Rzeszutek Wilk wrote:
> > which has git commit b8bcfe997e46150fedcc3f5b26b846400122fdd9.
> >
> > The unintended consequence of removing the flushing of MMU
> > updates when doing kmap_atomic (or kunmap_atomic) is that we can
> > hit a dereference bug when processing a "fork()" under a heavily loaded
> > machine. Specifically we can hit:
>
> The patch is all OK, but I wouldn't have headlined it as a "partial
> revert" - the important point is that the pte updates in k(un)map_atomic
> need to be synchronous, regardless of whether we're in lazy_mmu mode.
>
> The fact that b8bcfe997e4 introduced the problem is interesting to note,
> but only somewhat relevant to the analysis of what's being fixed here.

Good point. How about

From 09966678dd645b68a422c9bf0223b13e73387302 Mon Sep 17 00:00:00 2001
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Date: Fri, 23 Sep 2011 17:02:29 -0400
Subject: [PATCH] x86/paravirt: PTE updates in k(un)map_atomic need to be synchronous, regardless of lazy_mmu mode.

This patch fixes an outstanding issue that has been reported since 2.6.37.
Under a heavily loaded machine processing "fork()" calls could keel over with:

BUG: unable to handle kernel paging request at f573fc8c
IP: [<c01abc54>] swap_count_continued+0x104/0x180
*pdpt = 000000002a3b9027 *pde = 0000000001bed067 *pte = 0000000000000000
Oops: 0000 [#1] SMP
Modules linked in:
Pid: 1638, comm: apache2 Not tainted 3.0.4-linode37 #1
EIP: 0061:[<c01abc54>] EFLAGS: 00210246 CPU: 3
EIP is at swap_count_continued+0x104/0x180
.. snip..
Call Trace:
[<c01ac222>] ? __swap_duplicate+0xc2/0x160
[<c01040f7>] ? pte_mfn_to_pfn+0x87/0xe0
[<c01ac2e4>] ? swap_duplicate+0x14/0x40
[<c01a0a6b>] ? copy_pte_range+0x45b/0x500
[<c01a0ca5>] ? copy_page_range+0x195/0x200
[<c01328c6>] ? dup_mmap+0x1c6/0x2c0
[<c0132cf8>] ? dup_mm+0xa8/0x130
[<c013376a>] ? copy_process+0x98a/0xb30
[<c013395f>] ? do_fork+0x4f/0x280
[<c01573b3>] ? getnstimeofday+0x43/0x100
[<c010f770>] ? sys_clone+0x30/0x40
[<c06c048d>] ? ptregs_clone+0x15/0x48
[<c06bfb71>] ? syscall_call+0x7/0xb

The problem is that in copy_page_range we turn lazy mode on, and then
in swap_entry_free we call swap_count_continued which ends up in:

map = kmap_atomic(page, KM_USER0) + offset;

and then later we touch *map.

Since we are running in batched mode (lazy) we don't actually set up the
PTE mappings and the kmap_atomic is not done synchronously and ends up
trying to dereference a page that has not been set.

Looking at kmap_atomic_prot_pfn, it uses 'arch_flush_lazy_mmu_mode' and
doing the same in kmap_atomic_prot and __kunmap_atomic makes the problem
go away.

Interestingly, git commit b8bcfe997e46150fedcc3f5b26b846400122fdd9
removed part of this to fix an interrupt issue - but it went too far
and did not consider this scenario.

CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ingo Molnar <mingo@redhat.com>
CC: "H. Peter Anvin" <hpa@zytor.com>
CC: x86@kernel.org
CC: Peter Zijlstra <a.p.zijlstra@chello.nl>
CC: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
CC: stable@kernel.org
[v1: Redid the commit description per Jeremy's apt suggestion]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
---
 arch/x86/mm/highmem_32.c | 2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/arch/x86/mm/highmem_32.c b/arch/x86/mm/highmem_32.c
index b499626..f4f29b1 100644
--- a/arch/x86/mm/highmem_32.c
+++ b/arch/x86/mm/highmem_32.c
@@ -45,6 +45,7 @@ void *kmap_atomic_prot(struct page *page, pgprot_t prot)
 	vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx);
 	BUG_ON(!pte_none(*(kmap_pte-idx)));
 	set_pte(kmap_pte-idx, mk_pte(page, prot));
+	arch_flush_lazy_mmu_mode();
 
 	return (void *)vaddr;
 }
@@ -88,6 +89,7 @@ void __kunmap_atomic(void *kvaddr)
 		 */
 		kpte_clear_flush(kmap_pte-idx, vaddr);
 		kmap_atomic_idx_pop();
+		arch_flush_lazy_mmu_mode();
 	}
 #ifdef CONFIG_DEBUG_HIGHMEM
 	else {
-- 
1.7.4.1
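Editorial sketch: the patch flushes on the unmap side as well as the map side. One way to read "synchronous" for __kunmap_atomic is sketched below in a single-slot userspace toy model. It is an illustration only: the toy_* names and the frame value 42 are hypothetical, and the thread does not establish that a stale kmap slot is an observed failure in the real kernel. The model just shows that, with batching on, a queued clear leaves the old entry visible until something flushes.

```c
/* Single-slot toy model of the unmap side. All names are hypothetical
 * and this is not kernel code. */
#include <assert.h>

static int toy_pte;          /* the one kmap slot; 0 means empty */
static int toy_pending;      /* 1 if a write is queued */
static int toy_pending_val;  /* value of the queued write */
static int toy_lazy;         /* non-zero while batching */

/* Stands in for arch_flush_lazy_mmu_mode(). */
static void toy_flush(void)
{
    if (toy_pending) {
        toy_pte = toy_pending_val;
        toy_pending = 0;
    }
}

/* A PTE write: queued in lazy mode, immediate otherwise. */
static void toy_write_pte(int val)
{
    if (toy_lazy) {
        toy_pending = 1;
        toy_pending_val = val;
    } else {
        toy_pte = val;
    }
}

/* Stands in for __kunmap_atomic(): clear the slot, optionally flush. */
static void toy_kunmap(int flush)
{
    toy_write_pte(0);
    if (flush)
        toy_flush();
}

/* Returns 1 if the slot already looks empty right after the unmap,
 * which is what a pte_none()-style check on the next map would expect. */
static int toy_slot_free_after_unmap(int flush)
{
    int is_free;

    toy_lazy = 0;
    toy_pending = 0;
    toy_pte = 42;            /* slot was mapped earlier, synchronously */

    toy_lazy = 1;            /* batching is on, as in copy_page_range */
    toy_kunmap(flush);
    is_free = (toy_pte == 0);

    toy_flush();             /* leaving lazy mode applies what is left */
    toy_lazy = 0;
    return is_free;
}
```

In this model toy_slot_free_after_unmap(0) reports the slot as still occupied, while toy_slot_free_after_unmap(1) reports it empty, which matches the intent of making both halves of the k(un)map_atomic pair independent of lazy_mmu mode.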
Stefan Bader
2011-Sep-30 09:59 UTC
Re: [Xen-devel] Is: [PATCH] x86/paravirt: PTE updates in k(un)map_atomic need to be synchronous, regardless of lazy_mmu mode. Was: Re: [PATCH] x86/paravirt: Partially revert "remove lazy mode in interrupts"
On 26.09.2011 21:34, Konrad Rzeszutek Wilk wrote:
> On Mon, Sep 26, 2011 at 09:22:21AM -0700, Jeremy Fitzhardinge wrote:
>> On 09/26/2011 06:13 AM, Konrad Rzeszutek Wilk wrote:
>>> which has git commit b8bcfe997e46150fedcc3f5b26b846400122fdd9.
>>>
>>> The unintended consequence of removing the flushing of MMU
>>> updates when doing kmap_atomic (or kunmap_atomic) is that we can
>>> hit a dereference bug when processing a "fork()" under a heavily loaded
>>> machine. Specifically we can hit:
>>
>> The patch is all OK, but I wouldn't have headlined it as a "partial
>> revert" - the important point is that the pte updates in k(un)map_atomic
>> need to be synchronous, regardless of whether we're in lazy_mmu mode.
>>
>> The fact that b8bcfe997e4 introduced the problem is interesting to note,
>> but only somewhat relevant to the analysis of what's being fixed here.
>
> Good point. How about
>

Limiting the cc's for just asking about status...

> From 09966678dd645b68a422c9bf0223b13e73387302 Mon Sep 17 00:00:00 2001
> From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Date: Fri, 23 Sep 2011 17:02:29 -0400
> Subject: [PATCH] x86/paravirt: PTE updates in k(un)map_atomic need to be synchronous, regardless of lazy_mmu mode.
>
> This patch fixes an outstanding issue that has been reported since 2.6.37.
> Under a heavily loaded machine processing "fork()" calls could keel over with:
>

I wonder whether this may have some effect on older kernels too. According to
git, the patch that removed the lines that are added back happened in 2.6.31.
Probably it is not the same symptom... I would tend to have it applied all the
way back but it's always better to get some authoritative answer (maybe helps the
maintainers of longterm, too).

Anyway, since this is a somewhat painful bug to users, do you happen to know how
far this is in reaching the upstream kernel?

Thanks,
Stefan
Konrad Rzeszutek Wilk
2011-Sep-30 14:22 UTC
[Xen-devel] Re: Is: [PATCH] x86/paravirt: PTE updates in k(un)map_atomic need to be synchronous, regardless of lazy_mmu mode. Was: Re: [PATCH] x86/paravirt: Partially revert "remove lazy mode in interrupts"
> From 09966678dd645b68a422c9bf0223b13e73387302 Mon Sep 17 00:00:00 2001
> From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Date: Fri, 23 Sep 2011 17:02:29 -0400
> Subject: [PATCH] x86/paravirt: PTE updates in k(un)map_atomic need to be synchronous, regardless of lazy_mmu mode.
>
> This patch fixes an outstanding issue that has been reported since 2.6.37.
> Under a heavily loaded machine processing "fork()" calls could keel over with:

Hm, looks like I forgot to include Andrew on this.

Andrew, what is your opinion on this tiny little critical patch?
Konrad Rzeszutek Wilk
2011-Oct-03 16:50 UTC
Re: [Xen-devel] Is: [PATCH] x86/paravirt: PTE updates in k(un)map_atomic need to be synchronous, regardless of lazy_mmu mode. Was: Re: [PATCH] x86/paravirt: Partially revert "remove lazy mode in interrupts"
On Fri, Sep 30, 2011 at 11:59:46AM +0200, Stefan Bader wrote:
> On 26.09.2011 21:34, Konrad Rzeszutek Wilk wrote:
> > On Mon, Sep 26, 2011 at 09:22:21AM -0700, Jeremy Fitzhardinge wrote:
> >> The patch is all OK, but I wouldn't have headlined it as a "partial
> >> revert" - the important point is that the pte updates in k(un)map_atomic
> >> need to be synchronous, regardless of whether we're in lazy_mmu mode.
> >>
> >> The fact that b8bcfe997e4 introduced the problem is interesting to note,
> >> but only somewhat relevant to the analysis of what's being fixed here.
> >
> > Good point. How about
> >
>
> Limiting the cc's for just asking about status...

CC-ed you on my query to Andrew. If nothing happens in the next couple of
days can you ping him too please?

> > From 09966678dd645b68a422c9bf0223b13e73387302 Mon Sep 17 00:00:00 2001
> > From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> > Date: Fri, 23 Sep 2011 17:02:29 -0400
> > Subject: [PATCH] x86/paravirt: PTE updates in k(un)map_atomic need to be synchronous, regardless of lazy_mmu mode.
> >
> > This patch fixes an outstanding issue that has been reported since 2.6.37.
> > Under a heavily loaded machine processing "fork()" calls could keel over with:
> >
> I wonder whether this may have some effect on older kernels too. According to
> git, the patch that removed the lines that are added back happened in 2.6.31.
> Probably it is not the same symptom... I would tend to have it applied all the
> way back but it's always better to get some authoritative answer (maybe helps the
> maintainers of longterm, too).

I think so, but I've only gotten bug reports from 2.6.37 and on - so I am
being cautious.

>
> Anyway, since this is a somewhat painful bug to users, do you happen to know how
> far this is in reaching the upstream kernel?

Just need an Ack from either akpm, or x86 maintainers. The x86 maintainers
are busy with the kernel.org mishap so ... Andrew is our guy.
Konrad Rzeszutek Wilk
2011-Oct-03 17:04 UTC
Re: [Xen-devel] Is: [PATCH] x86/paravirt: PTE updates in k(un)map_atomic need to be synchronous, regardless of lazy_mmu mode. Was: Re: [PATCH] x86/paravirt: Partially revert "remove lazy mode in interrupts"
On Fri, Sep 30, 2011 at 11:59:46AM +0200, Stefan Bader wrote:
> Limiting the cc's for just asking about status...

Ah, got this email:

The patch titled
     Subject: x86/paravirt: PTE updates in k(un)map_atomic need to be synchronous, regardless of lazy_mmu mode
has been added to the -mm tree. Its filename is
     x86-paravirt-pte-updates-in-kunmap_atomic-need-to-be-synchronous-regardless-of-lazy_mmu-mode.patch

so it is definitely on the train.
Christopher S. Aker
2011-Oct-25 17:55 UTC
Re: [Xen-devel] Is: [PATCH] x86/paravirt: PTE updates in k(un)map_atomic need to be synchronous, regardless of lazy_mmu mode. Was: Re: [PATCH] x86/paravirt: Partially revert "remove lazy mode in interrupts"
> This patch fixes an outstanding issue that has been reported since
> 2.6.37. Under a heavily loaded machine processing "fork()" calls could
> keel over with:

I noticed this patch is not in Linux 3.1 -- was this fixed some other way,
or is it still in mainline's pipeline somewhere?

Thanks,
-Chris
Konrad Rzeszutek Wilk
2011-Oct-25 18:19 UTC
Re: [Xen-devel] Is: [PATCH] x86/paravirt: PTE updates in k(un)map_atomic need to be synchronous, regardless of lazy_mmu mode. Was: Re: [PATCH] x86/paravirt: Partially revert "remove lazy mode in interrupts"
On Tue, Oct 25, 2011 at 01:55:14PM -0400, Christopher S. Aker wrote:
> >This patch fixes an outstanding issue that has been reported since
> >2.6.37. Under a heavily loaded machine processing "fork()" calls could
> >keel over with:
>
> I noticed this patch is not in Linux 3.1 -- was this fixed some
> other way, or is it still in mainline's pipeline somewhere?

Hmm, it was in Andrew's tree, but you are right - I am not seeing it in 3.1.
Let me double check Andrew's tree.

>
> Thanks,
> -Chris
Konrad Rzeszutek Wilk
2011-Oct-25 18:26 UTC
Re: [Xen-devel] Is: [PATCH] x86/paravirt: PTE updates in k(un)map_atomic need to be synchronous, regardless of lazy_mmu mode. Was: Re: [PATCH] x86/paravirt: Partially revert "remove lazy mode in interrupts"
On Tue, Oct 25, 2011 at 01:55:14PM -0400, Christopher S. Aker wrote:
> >This patch fixes an outstanding issue that has been reported since
> >2.6.37. Under a heavily loaded machine processing "fork()" calls could
> >keel over with:
>
> I noticed this patch is not in Linux 3.1 -- was this fixed some
> other way, or is it still in mainline's pipeline somewhere?

Well, looks like it got dropped out of Andrew's tree. Not sure why, but let me
make sure it gets the proper attention.

Thanks for spotting it!