Michael Abd-El-Malek
2008-Apr-20 21:19 UTC
[Xen-devel] Vanilla Linux and has_foreign_mapping
Hello,

I'm trying to add support to Linux 2.6.25 for the "has_foreign_mappings" MMU context flag. Xen's Linux 2.6.18 tree uses this flag, so that page tables are properly disposed of when an application exits when it has foreign mappings. See:
http://lists.xensource.com/archives/html/xen-devel/2006-08/msg00038.html

Here is my attempt:

diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
index 2a054ef..3e51897 100644
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -633,8 +633,13 @@ void xen_exit_mmap(struct mm_struct *mm)
 	spin_lock(&mm->page_table_lock);
 
 	/* pgd may not be pinned in the error exit path of execve */
-	if (PagePinned(virt_to_page(mm->pgd)))
-		xen_pgd_unpin(mm->pgd);
+	if (PagePinned(virt_to_page(mm->pgd))) {
+		if (mm->context.has_foreign_mappings) {
+			printk("%s: because of has_foreign_mappings, delaying unpinning\n", __FUNCTION__);
+		} else {
+			xen_pgd_unpin(mm->pgd);
+		}
+	}
 
 	spin_unlock(&mm->page_table_lock);
 }
diff --git a/include/asm-x86/mmu.h b/include/asm-x86/mmu.h
index efa962c..7194698 100644
--- a/include/asm-x86/mmu.h
+++ b/include/asm-x86/mmu.h
@@ -18,6 +18,9 @@ typedef struct {
 	int size;
 	struct mutex lock;
 	void *vdso;
+#ifdef CONFIG_XEN
+	int has_foreign_mappings;
+#endif
 } mm_context_t;
 
 #ifdef CONFIG_SMP

Unfortunately, I got the following kernel crash on process exit:

BUG: unable to handle kernel paging request at ebdae008
IP: [<c01157f9>] pgd_mop_up_pmds+0x6a/0xd8
*pdpt = 000000007f494027
Oops: 0003 [#1] PREEMPT SMP
Modules linked in: efsvm(F) nfs lockd sunrpc dm_snapshot dm_mirror dm_mod

Pid: 5565, comm: a.out Tainted: GF (2.6.25 #9)
EIP: 0061:[<c01157f9>] EFLAGS: 00010246 CPU: 0
EIP is at pgd_mop_up_pmds+0x6a/0xd8
...
Call Trace:
 [<c01158bf>] pgd_free+0x8/0x19
 [<c011fca0>] __mmdrop+0x16/0x2a
 [<c01244bc>] do_exit+0x1b3/0x569
 [<c01248d5>] do_group_exit+0x63/0x7a
 [<c0107066>] syscall_call+0x7/0xb

Has anyone else implemented this functionality in the mainline Linux tree? Any thoughts?

Thanks,
Mike
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2008-Apr-21 11:46 UTC
[Xen-devel] Re: Vanilla Linux and has_foreign_mapping
Michael Abd-El-Malek wrote:
> I'm trying to add support to Linux 2.6.25 for the
> "has_foreign_mappings" MMU context flag. Xen's Linux 2.6.18 tree uses
> this flag, so that page tables are properly disposed of when an
> application exits when it has foreign mappings.

I was hoping to avoid having to introduce that flag, but I have to admit I haven't given it much analysis. How are you using it?

> See:
> http://lists.xensource.com/archives/html/xen-devel/2006-08/msg00038.html
>
> Here is my attempt:
> diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c
> index 2a054ef..3e51897 100644
> --- a/arch/x86/xen/mmu.c
> +++ b/arch/x86/xen/mmu.c
> @@ -633,8 +633,13 @@ void xen_exit_mmap(struct mm_struct *mm)
>  	spin_lock(&mm->page_table_lock);
>
>  	/* pgd may not be pinned in the error exit path of execve */
> -	if (PagePinned(virt_to_page(mm->pgd)))
> -		xen_pgd_unpin(mm->pgd);
> +	if (PagePinned(virt_to_page(mm->pgd))) {
> +		if (mm->context.has_foreign_mappings) {
> +			printk("%s: because of has_foreign_mappings, delaying
> unpinning\n", __FUNCTION__);
> +		} else {
> +			xen_pgd_unpin(mm->pgd);
> +		}
> +	}
>
>  	spin_unlock(&mm->page_table_lock);
>  }
> diff --git a/include/asm-x86/mmu.h b/include/asm-x86/mmu.h
> index efa962c..7194698 100644
> --- a/include/asm-x86/mmu.h
> +++ b/include/asm-x86/mmu.h
> @@ -18,6 +18,9 @@ typedef struct {
>  	int size;
>  	struct mutex lock;
>  	void *vdso;
> +#ifdef CONFIG_XEN
> +	int has_foreign_mappings;
> +#endif
>  } mm_context_t;
>
>  #ifdef CONFIG_SMP
>
> Unfortunately, I got the following kernel crash on process exit:
>
> BUG: unable to handle kernel paging request at ebdae008
> IP: [<c01157f9>] pgd_mop_up_pmds+0x6a/0xd8

Which line is that?

J
Mark McLoughlin
2008-Apr-21 16:36 UTC
[Xen-devel] Re: Vanilla Linux and has_foreign_mapping
On Mon, 2008-04-21 at 21:46 +1000, Jeremy Fitzhardinge wrote:
> Michael Abd-El-Malek wrote:
> > I'm trying to add support to Linux 2.6.25 for the
> > "has_foreign_mappings" MMU context flag. Xen's Linux 2.6.18 tree uses
> > this flag, so that page tables are properly disposed of when an
> > application exits when it has foreign mappings.
>
> I was hoping to avoid having to introduce that flag, but I have to admit
> I haven't given it much analysis. How are you using it?

I looked at this a while back, but am somewhat sparse on the details now. Attaching a commit from the dom0 tree that references this.

From my notes:

  http://lists.xensource.com/archives/html/xen-devel/2006-08/msg00014.html
  http://xenbits.xensource.com/xen-unstable.hg?rev/e351aace191e

it sounds like the scenario is thus:

  1) process with foreign mappings exits, arch_exit_mmap() called
  2) the page tables get unpinned, no other users of the foreign pages so they get returned to the xen heap
  3) dom0 balloons and the machine physical page which had been foreign mapped is now allocated to another dom0 process
  4) when the original process goes to clean up its ptes, it reverse maps the mfn in the pte to the now allocated page and drops a reference count it doesn't own

I've also a vague memory of thinking that the PAGE_IO flag recently introduced to linux-2.6.18-xen.hg could be used to avoid this condition too.

Cheers,
Mark.
Hi there,

Take a look at changeset 488 in http://xenbits.xensource.com/linux-2.6.18-xen.hg

There you will see that we now have a new page flag (_PAGE_IO) that we apply to any PTE which maps I/O pages or 'foreign' pages. We use this to avoid pseudophysical<->machine translations when getting/setting ptes, because such translations are not generally valid for such pages, but equally it may obviate the need for the has_foreign_mappings flag. This is because we can now have pte_pfn() return an invalid pfn for foreign mappings based on the _PAGE_IO flag in the pte, rather than by the roundabout logic implemented in pfn_to_local_mfn(). The latter relies on us keeping the pagetables pinned -- so if we no longer rely on it then we no longer need to forcibly keep things pinned via the has_foreign_mappings flag.

Unfortunately I had to keep the has_foreign_mappings flag for other reasons in the 2.6.18 tree: gntdev and blktap device drivers use it to forcibly keep ptes pinned which contain grant mappings. This could probably be fixed within those drivers by having them pin just the pte pages that contain grant mappings. Then even on early-unpin (which has_foreign_mappings currently defeats) we would still have the necessary pte-containing pages pinned even though the pgd-containing page gets unpinned.

It's all a bit of a pain I'm afraid. :-(

 -- Keir

On 20/4/08 22:19, "Michael Abd-El-Malek" <mabdelmalek@cmu.edu> wrote:

> Hello,
>
> I'm trying to add support to Linux 2.6.25 for the "has_foreign_mappings" MMU
> context flag. Xen's Linux 2.6.18 tree uses this flag, so that page tables are
> properly disposed of when an application exits when it has foreign mappings.
Michael Abd-El-Malek
2008-Apr-21 18:00 UTC
[Xen-devel] Re: Vanilla Linux and has_foreign_mapping
Jeremy Fitzhardinge wrote:
> Michael Abd-El-Malek wrote:
>> I'm trying to add support to Linux 2.6.25 for the
>> "has_foreign_mappings" MMU context flag. Xen's Linux 2.6.18 tree uses
>> this flag, so that page tables are properly disposed of when an
>> application exits when it has foreign mappings.
>
> I was hoping to avoid having to introduce that flag, but I have to admit
> I haven't given it much analysis. How are you using it?

A user-level application has a page table entry pointing to another domain's page. My virtual memory area's zap_pte handler (which I added to 2.6.25, a la 2.6.18) unmaps the grant. But on application exit on 2.6.25, my zap_pte function runs too late. There's a comment in gntdev.c that explains the need for has_foreign_mappings:

	/* This flag ensures that the page tables are not unpinned before the
	 * VM area is unmapped. Therefore Xen still recognises the PTE as
	 * belonging to an L1 pagetable, and the grant unmap operation will
	 * succeed, even if the process does not exit cleanly.
	 */

>> See:
>> http://lists.xensource.com/archives/html/xen-devel/2006-08/msg00038.html
>>
>> Unfortunately, I got the following kernel crash on process exit:
>>
>> BUG: unable to handle kernel paging request at ebdae008
>> IP: [<c01157f9>] pgd_mop_up_pmds+0x6a/0xd8
>
> Which line is that?

My original 2.6.25 kernel didn't have debugging support, preventing objdump -S from giving me address-to-source-line translations. I rebuilt the kernel and here is the new stack dump:

BUG: unable to handle kernel paging request at ebb11008
IP: [<c0115b7b>] pgd_mop_up_pmds+0x7b/0xe6
*pdpt = 000000007f01a027
Oops: 0003 [#1] SMP
Modules linked in: efsvm(F) nfs lockd sunrpc dm_snapshot dm_mirror dm_mod

Pid: 2607, comm: a.out Tainted: GF (2.6.25 #12)
EIP: 0061:[<c0115b7b>] EFLAGS: 00010246 CPU: 0
EIP is at pgd_mop_up_pmds+0x7b/0xe6
EAX: ebb11000 EBX: 00000000 ECX: 00000001 EDX: 00000000
ESI: 7edc3007 EDI: eb8533f4 EBP: ebaedf28 ESP: ebaedefc
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
Process a.out (pid: 2607, ti=ebaec000 task=ed3f4ed0 task.ti=ebaec000)
Stack: 2bb01007 00000000 00000000 00000000 ebb11000 00000001 00000000 2bb01007
       ebb11000 ed3f4ed0 eb8533f4 ebaedf34 c0115c44 eb8533c0 ebaedf40 c0121bad
       eb8533c0 ebaedf4c c0121c34 eb8533c0 ebaedf60 c01250f2 00000000 ed3f4ed0
Call Trace:
 [<c0115c44>] ? pgd_free+0xb/0x1e
 [<c0121bad>] ? __mmdrop+0x19/0x2f
 [<c0121c34>] ? mmput+0x71/0x74
 [<c01250f2>] ? exit_mm+0xd5/0xda
 [<c01263c4>] ? do_exit+0x1b3/0x56f
 [<c01267ed>] ? do_group_exit+0x6d/0x84
 [<c0126813>] ? sys_exit_group+0xf/0x11
 [<c0107012>] ? syscall_call+0x7/0xb
 ======================
Code: 00 00 00 89 55 dc 8b 45 dc 0b 45 d4 89 4d e0 09 c1 74 61 89 f0 89 da e8 8f d8 fe ff 90 89 45 f0 8b 4d e8 8b 45 e4 89 55 ec 89 da <c7> 04 c8 00 00 00 00 c7 44 c8 04 00 00 00 00 89 f0 e8 6a d8 fe
EIP: [<c0115b7b>] pgd_mop_up_pmds+0x7b/0xe6 SS:ESP 0069:ebaedefc
---[ end trace b8f5f274f55408cd ]---
Fixing recursive fault but reboot is needed!

pgd_free calls pgd_mop_up_pmds, where the crash is occurring at c0115b7b. Here's the relevant assembly and source code:

		pmd_t *pmd = (pmd_t *)pgd_page_vaddr(pgd);

		pgdp[i] = native_make_pgd(0);
c0115b70:	8b 4d e8             	mov    -0x18(%ebp),%ecx
c0115b73:	8b 45 e4             	mov    -0x1c(%ebp),%eax
c0115b76:	89 55 ec             	mov    %edx,-0x14(%ebp)
c0115b79:	89 da                	mov    %ebx,%edx
c0115b7b:	c7 04 c8 00 00 00 00 	movl   $0x0,(%eax,%ecx,8)
c0115b82:	c7 44 c8 04 00 00 00 	movl   $0x0,0x4(%eax,%ecx,8)
c0115b89:	00
c0115b8a:	89 f0                	mov    %esi,%eax
c0115b8c:	ff 15 a0 5b 4c c0    	call   *0xc04c5ba0
	{

Any hints?

Thanks,
Mike
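The fault address in this dump can be checked against the registers by hand. The sketch below is a minimal illustration, assuming the architectural x86 page-fault error-code bits; the helper names are hypothetical and are not kernel functions. It recomputes the effective address of the faulting store `movl $0x0,(%eax,%ecx,8)` and decodes the `Oops: 0003` error code.

```c
#include <assert.h>
#include <stdint.h>

/* x86 page-fault error code bits (architectural definitions). */
#define PF_PROT  0x1u  /* 1 = protection violation, 0 = page not present */
#define PF_WRITE 0x2u  /* 1 = write access, 0 = read access */
#define PF_USER  0x4u  /* 1 = fault raised in user mode */

/* Hypothetical helper: effective address of a base + index*scale operand. */
static uint32_t effective_address(uint32_t base, uint32_t index, uint32_t scale)
{
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return base + index * scale;
}

/* Hypothetical helper: nonzero for a kernel-mode write to a page that is
 * present but not writable. */
static int is_kernel_write_protection_fault(uint32_t error_code)
{
    return (error_code & PF_PROT) &&
           (error_code & PF_WRITE) &&
           !(error_code & PF_USER);
}
```

With EAX = ebb11000 and ECX = 00000001 from the dump, effective_address(0xebb11000, 1, 8) gives 0xebb11008, which is the reported fault address (the store to pgdp[1] with 8-byte entries). is_kernel_write_protection_fault(0x0003) is true. A write-protection fault on a present page would be consistent with the pgd page still being pinned, since pinned page-table pages are mapped read-only in a Xen PV guest, but the thread does not confirm that diagnosis at this point.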
Michael Abd-El-Malek
2008-Apr-21 18:10 UTC
Re: [Xen-devel] Vanilla Linux and has_foreign_mapping
Keir Fraser wrote:
> Hi there,
>
> Take a look at changeset 488 in
> http://xenbits.xensource.com/linux-2.6.18-xen.hg
>
> There you will see that we now have a new page flag (_PAGE_IO) that we apply
> to any PTE which maps I/O pages or 'foreign' pages. We use this to avoid
> pseudophysical<->machine translations when getting/setting ptes, because
> such translations are not generally valid for such pages, but equally it may
> obviate the need for the has_foreign_mappings flag. This is because we can
> now have pte_pfn() return an invalid pfn for foreign mappings based on the
> _PAGE_IO flag in the pte, rather than by the roundabout logic implemented in
> pfn_to_local_mfn(). The latter relies on us keeping the pagetables pinned --
> so if we no longer rely on it then we no longer need to forcibly keep things
> pinned via the has_foreign_mappings flag.
>
> Unfortunately I had to keep the has_foreign_mappings flag for other reasons
> in the 2.6.18 tree: gntdev and blktap device drivers use it to forcibly keep
> ptes pinned which contain grant mappings. This could probably be fixed
> within those drivers by having them pin just the pte pages that contain
> grant mappings. Then even on early-unpin (which has_foreign_mappings
> currently defeats) we would still have the necessary pte-containing pages
> pinned even though the pgd-containing page gets unpinned.

Yeah I'm in a similar situation as gntdev and blktap.

So on 2.6.25, which doesn't have has_foreign_mappings, you're suggesting that I pin the pte pages that contain grant mappings. How do I accomplish that? And on process exit, do I have to take any extra steps to unpin the pte page, or will those pages be freed regardless of their "pin status"?

Thanks,
Mike

> It's all a bit of a pain I'm afraid. :-(
>
> -- Keir
>
> On 20/4/08 22:19, "Michael Abd-El-Malek" <mabdelmalek@cmu.edu> wrote:
>
>> Hello,
>>
>> I'm trying to add support to Linux 2.6.25 for the "has_foreign_mappings" MMU
>> context flag.
Michael Abd-El-Malek
2008-Apr-21 18:17 UTC
Re: [Xen-devel] Vanilla Linux and has_foreign_mapping
Michael Abd-El-Malek wrote:
> Keir Fraser wrote:
>> Hi there,
>>
>> Take a look at changeset 488 in
>> http://xenbits.xensource.com/linux-2.6.18-xen.hg
>>
>> There you will see that we now have a new page flag (_PAGE_IO) that we
>> apply to any PTE which maps I/O pages or 'foreign' pages. We use this to
>> avoid pseudophysical<->machine translations when getting/setting ptes,
>> because such translations are not generally valid for such pages, but
>> equally it may obviate the need for the has_foreign_mappings flag. This
>> is because we can now have pte_pfn() return an invalid pfn for foreign
>> mappings based on the _PAGE_IO flag in the pte, rather than by the
>> roundabout logic implemented in pfn_to_local_mfn(). The latter relies on
>> us keeping the pagetables pinned -- so if we no longer rely on it then we
>> no longer need to forcibly keep things pinned via the
>> has_foreign_mappings flag.
>>
>> Unfortunately I had to keep the has_foreign_mappings flag for other
>> reasons in the 2.6.18 tree: gntdev and blktap device drivers use it to
>> forcibly keep ptes pinned which contain grant mappings. This could
>> probably be fixed within those drivers by having them pin just the pte
>> pages that contain grant mappings. Then even on early-unpin (which
>> has_foreign_mappings currently defeats) we would still have the necessary
>> pte-containing pages pinned even though the pgd-containing page gets
>> unpinned.
>
> Yeah I'm in a similar situation as gntdev and blktap.
>
> So on 2.6.25, which doesn't have has_foreign_mappings, you're suggesting
> that I pin the pte pages that contain grant mappings. How do I
> accomplish that? And on process exit, do I have to take any extra steps
> to unpin the pte page, or will those pages be freed regardless of their
> "pin status"?

OK I took a quick look at the code. On 2.6.25, xen_exit_mmap calls xen_pgd_unpin. xen_pgd_unpin walks over a pgd and calls unpin_page on any pinned pages. So it seems the current pgd_unpin code will not let us have any pte pages unpinned?

> Thanks,
> Mike
>
>> It's all a bit of a pain I'm afraid. :-(
>>
>> -- Keir
On 21/4/08 19:17, "Michael Abd-El-Malek" <mabdelmalek@cmu.edu> wrote:

>> Yeah I'm in a similar situation as gntdev and blktap.
>>
>> So on 2.6.25, which doesn't have has_foreign_mappings, you're suggesting
>> that I pin the pte pages that contain grant mappings. How do I
>> accomplish that? And on process exit, do I have to take any extra steps
>> to unpin the pte page, or will those pages be freed regardless of their
>> "pin status"?
>
> OK I took a quick look at the code. On 2.6.25, xen_exit_mmap calls
> xen_pgd_unpin. xen_pgd_unpin walks over a pgd and calls unpin_page on any
> pinned pages. So it seems the current pgd_unpin code will not let us have any
> pte pages unpinned?

I'm not really familiar with the pv_ops code I'm afraid. But thinking about this some more I've realised there's no way really to avoid making the early-unpin logic aware of gntdev mappings. This is because if we do pin pte pages, and require them to remain pinned across early-unpin, then pgd_unpin() must not attempt to make those pte pages writable. That will fail, because the pages are still pinned! You'd either need to handle the failure to make the page writable, or have a per-page flag to indicate which pte pages contain gntdev mappings. Frankly you may as well stick with the per-mm-context has_foreign_mappings flag.

Is it a pain to add a pv_ops-subtype-specific flag to mm_context? If so you could maintain a set datastructure instead, indicating which mm_contexts contain foreign mappings.

 -- Keir
Jeremy Fitzhardinge
2008-Apr-25 00:18 UTC
Re: [Xen-devel] Vanilla Linux and has_foreign_mapping
Keir Fraser wrote:
> I'm not really familiar with the pv_ops code I'm afraid. But thinking about
> this some more I've realised there's no way really to avoid making the
> early-unpin logic aware of gntdev mappings. This is because if we do pin pte
> pages, and require them to remain pinned across early-unpin, then
> pgd_unpin() must not attempt to make those pte pages writable. That will
> fail, because the pages are still pinned! You'd either need to handle the
> failure to make the page writable, or have a per-page flag to indicate which
> pte pages contain gntdev mappings. Frankly you may as well stick with the
> per-mm-context has_foreign_mappings flag.
>

So the issue is that a pte page containing a _PAGE_IO pte must remain pinned while it contains that mapping? Would shooting down the mapping allow it to be unpinned, or does that need to be deferred until some later point (if so, when?)?

I guess the downside is that we'd need to scan the pte looking for _PAGE_IO mappings, which is a bit of a pain. Skipping that would mean hiding a flag somewhere...

> Is it a pain to add a pv_ops-subtype-specific flag to mm_context? If so you
> could maintain a set datastructure instead, indicating which mm_contexts
> contain foreign mappings.
>

So, in 2.6.18-xen mm->has_foreign_mapping makes it skip early-unpin, but puts it off until pgd_free(). Presumably that works because all the vmas have been unmapped by then...

I wonder if Christoph/Andrea's mmu notifier patch is useful here... Maybe?

Christoph/Andrea: would the mmu notifier mechanism allow us to efficiently generate the set of ptes mapping granted pages (pages from other domains), so we can shoot them down during process exit?

J
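The ordering described above (skip the early unpin when the flag is set, then unpin late in pgd_free()) can be sketched as a small userspace model. This is only an illustration under stated assumptions: struct toy_mm and every toy_* function are hypothetical stand-ins, not kernel code, and the model assumes two things taken from the thread: a grant unmap only succeeds while the page tables are pinned, and writing to a still-pinned pgd faults.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for an address space; not the kernel's struct mm_struct. */
struct toy_mm {
    bool pgd_pinned;            /* a pinned pgd is read-only to the guest */
    bool has_foreign_mappings;  /* set by a driver that maps grants */
    int  live_grant_mappings;   /* grant mappings still present in this mm */
};

/* Role of arch_exit_mmap(): unpin early unless grants need pinned tables. */
static void toy_exit_mmap_hook(struct toy_mm *mm)
{
    assert(mm);
    if (mm->pgd_pinned && !mm->has_foreign_mappings)
        mm->pgd_pinned = false;
}

/* Role of unmap_vmas()/zap_pte: returns how many grant unmaps failed
 * because the tables were already unpinned. */
static int toy_unmap_vmas(struct toy_mm *mm)
{
    int failed = mm->pgd_pinned ? 0 : mm->live_grant_mappings;
    mm->live_grant_mappings = 0;
    return failed;
}

/* Role of pgd_free(): optionally perform the deferred unpin, then report
 * whether the pgd is writable (true) or would fault on write (false). */
static bool toy_pgd_free(struct toy_mm *mm, bool late_unpin)
{
    if (late_unpin && mm->pgd_pinned)
        mm->pgd_pinned = false;
    return !mm->pgd_pinned;
}

/* Whole exit sequence on a copy of the mm; true if nothing went wrong. */
static bool toy_exit(struct toy_mm mm, bool late_unpin)
{
    toy_exit_mmap_hook(&mm);
    int failed = toy_unmap_vmas(&mm);
    bool freed_ok = toy_pgd_free(&mm, late_unpin);
    return failed == 0 && freed_ok;
}
```

In this model a flagged mm exits cleanly only when the late unpin is present; skipping the early unpin without any later unpin leaves the pgd read-only when it is cleared, which is one plausible reading of the oops reported earlier in the thread. An unflagged mm that still holds grant mappings fails at the unmap step instead.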
On 25/4/08 01:18, "Jeremy Fitzhardinge" <jeremy@goop.org> wrote:

> So the issue is that a pte page containing a _PAGE_IO pte must remain
> pinned while it contains that mapping? Would shooting down the mapping
> allow it to be unpinned, or does that need to be deferred until some
> later point (if so, when?)?

If you have _PAGE_IO then only unpinning of ptes containing grant-table mappings must not be deferred. 'Ordinary' foreign mappings, of the sort that dom0 can create because it is privileged, do not have this constraint. This is because you can tell they are foreign, and hence avoid page refcounting, simply because the pte contains _PAGE_IO.

Without _PAGE_IO we relied on checking whether a machine->pseudophys->machine double lookup took us from the machine address in the pte back to the same machine address. If not, we knew the page was not ours. This doesn't work reliably if the pte page is not pinned because in that case the mapped page can be freed from the foreign domain and be reallocated into the local domain: in which case the M->P->M check could succeed! The _PAGE_IO check is more robust.

 -- Keir
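The difference between the two checks can be illustrated with a toy model. This is a sketch under assumptions: the table sizes, the TOY_PAGE_IO bit position, and all function names are invented for the example and do not match the real kernel or Xen definitions. It shows how a machine->pseudophys->machine round trip can start reporting "local" once the frame has been reallocated, while a flag stored in the pte itself does not change.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_IO   (1ull << 9)   /* hypothetical bit position */
#define TOY_NR_FRAMES 8
#define TOY_INVALID   0xffffffffu

/* Toy translation tables; the real ones are far larger. */
struct toy_tables {
    uint32_t m2p[TOY_NR_FRAMES];  /* machine frame -> pseudo-physical frame */
    uint32_t p2m[TOY_NR_FRAMES];  /* pseudo-physical frame -> machine frame */
};

/* Old-style check: the frame "looks local" if M->P->M returns to itself. */
static bool looks_local_by_round_trip(const struct toy_tables *t, uint32_t mfn)
{
    assert(mfn < TOY_NR_FRAMES);
    uint32_t pfn = t->m2p[mfn];
    return pfn < TOY_NR_FRAMES && t->p2m[pfn] == mfn;
}

/* Flag-based check: the pte itself records that the mapping is foreign. */
static bool is_foreign_by_flag(uint64_t pte)
{
    return (pte & TOY_PAGE_IO) != 0;
}

/* Build a toy pte: frame number in the upper bits, flag bit below it. */
static uint64_t toy_make_pte(uint32_t mfn, bool foreign)
{
    return ((uint64_t)mfn << 12) | (foreign ? TOY_PAGE_IO : 0);
}
```

For example, take a foreign frame 5 mapped with the flag set. While the tables hold no local entry for frame 5, the round trip correctly reports "not local". If the frame is later freed by the other domain and reallocated locally (m2p[5] = 2, p2m[2] = 5), the round trip reports "local" for the stale pte, whereas is_foreign_by_flag() still reports the mapping as foreign.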
Michael Abd-El-Malek
2008-Apr-25 17:11 UTC
Re: [Xen-devel] Vanilla Linux and has_foreign_mapping
Jeremy Fitzhardinge wrote:
> Keir Fraser wrote:
>> I'm not really familiar with the pv_ops code I'm afraid. But thinking
>> about this some more I've realised there's no way really to avoid making
>> the early-unpin logic aware of gntdev mappings. This is because if we do
>> pin pte pages, and require them to remain pinned across early-unpin, then
>> pgd_unpin() must not attempt to make those pte pages writable. That will
>> fail, because the pages are still pinned! You'd either need to handle the
>> failure to make the page writable, or have a per-page flag to indicate
>> which pte pages contain gntdev mappings. Frankly you may as well stick
>> with the per-mm-context has_foreign_mappings flag.
>>
>
> So the issue is that a pte page containing a _PAGE_IO pte must remain
> pinned while it contains that mapping? Would shooting down the mapping
> allow it to be unpinned, or does that need to be deferred until some
> later point (if so, when?)?
>
> I guess the downside is that we'd need to scan the pte looking for
> _PAGE_IO mappings, which is a bit of a pain. Skipping that would mean
> hiding a flag somewhere...
>
>> Is it a pain to add a pv_ops-subtype-specific flag to mm_context? If
>> so you could maintain a set datastructure instead, indicating which
>> mm_contexts contain foreign mappings.
>>
>
> So, in 2.6.18-xen mm->has_foreign_mapping makes it skip early-unpin, but
> puts it off until pgd_free(). Presumably that works because all the
> vmas have been unmapped by then...

The following patch was sufficient for me. I delayed the arch_exit_mmap (which eventually calls into xen) until after unmap_vmas is called, which calls zap_pte (where I unmap the grant). Presumably, there is a performance overhead to always doing this delay, and hence 2.6.18 only did the delay if has_foreign_mappings is set. For macrobenchmarks like compilation, I couldn't find a difference.

Cheers,
Mike

diff --git a/mm/mmap.c b/mm/mmap.c
index a32d28c..c118b54 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2036,15 +2036,14 @@ void exit_mmap(struct mm_struct *mm)
 	unsigned long nr_accounted = 0;
 	unsigned long end;
 
-	/* mm's last user has gone, and its about to be pulled down */
-	arch_exit_mmap(mm);
-
 	lru_add_drain();
 	flush_cache_mm(mm);
 	tlb = tlb_gather_mmu(mm, 1);
 	/* Don't update_hiwater_rss(mm) here, do_exit already did */
 	/* Use -1 here to ensure all VMAs in the mm are unmapped */
 	end = unmap_vmas(&tlb, vma, 0, -1, &nr_accounted, NULL);
+	/* mm's last user has gone, and its about to be pulled down */
+	arch_exit_mmap(mm);
 	vm_unacct_memory(nr_accounted);
 	free_pgtables(&tlb, vma, FIRST_USER_ADDRESS, 0);
 	tlb_finish_mmu(tlb, 0, end);
Jeremy Fitzhardinge
2008-Apr-25 18:22 UTC
Re: [Xen-devel] Vanilla Linux and has_foreign_mapping
Michael Abd-El-Malek wrote:
> Jeremy Fitzhardinge wrote:
>> Keir Fraser wrote:
>>> I'm not really familiar with the pv_ops code I'm afraid. But
>>> thinking about this some more I've realised there's no way really to
>>> avoid making the early-unpin logic aware of gntdev mappings. This is
>>> because if we do pin pte pages, and require them to remain pinned
>>> across early-unpin, then pgd_unpin() must not attempt to make those
>>> pte pages writable. That will fail, because the pages are still
>>> pinned! You'd either need to handle the failure to make the page
>>> writable, or have a per-page flag to indicate which pte pages contain
>>> gntdev mappings. Frankly you may as well stick with the
>>> per-mm-context has_foreign_mappings flag.
>>>
>>
>> So the issue is that a pte page containing a _PAGE_IO pte must remain
>> pinned while it contains that mapping? Would shooting down the
>> mapping allow it to be unpinned, or does that need to be deferred
>> until some later point (if so, when?)?
>>
>> I guess the downside is that we'd need to scan the pte looking for
>> _PAGE_IO mappings, which is a bit of a pain. Skipping that would
>> mean hiding a flag somewhere...
>>
>>> Is it a pain to add a pv_ops-subtype-specific flag to mm_context? If
>>> so you could maintain a set datastructure instead, indicating which
>>> mm_contexts contain foreign mappings.
>>>
>>
>> So, in 2.6.18-xen mm->has_foreign_mapping makes it skip early-unpin,
>> but puts it off until pgd_free(). Presumably that works because all
>> the vmas have been unmapped by then...
> The following patch was sufficient for me. I delayed the
> arch_exit_mmap (which eventually calls into xen) until after
> unmap_vmas is called, which calls zap_pte (where I unmap the grant).
> Presumably, there is a performance overhead to always doing this
> delay, and hence 2.6.18 only did the delay if has_foreign_mappings is
> set. For macrobenchmarks like compilation, I couldn't find a difference.
>
> Cheers,
> Mike
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index a32d28c..c118b54 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -2036,15 +2036,14 @@ void exit_mmap(struct mm_struct *mm)
>  	unsigned long nr_accounted = 0;
>  	unsigned long end;
>
> -	/* mm's last user has gone, and its about to be pulled down */
> -	arch_exit_mmap(mm);
> -
>  	lru_add_drain();
>  	flush_cache_mm(mm);
>  	tlb = tlb_gather_mmu(mm, 1);
>  	/* Don't update_hiwater_rss(mm) here, do_exit already did */
>  	/* Use -1 here to ensure all VMAs in the mm are unmapped */
>  	end = unmap_vmas(&tlb, vma, 0, -1, &nr_accounted, NULL);
> +	/* mm's last user has gone, and its about to be pulled down */
> +	arch_exit_mmap(mm);

Yeah, that pretty much removes the point of "early unpin". The problem is that unmap_vmas tears down the pagetable, and if it's pinned that means lots of hypercalls. If it has already been unpinned it can be much more efficient.

It mainly affects exec/exit performance, and I'm not sure a kernel compile is all that execy/exity - most of the time is just spent in gcc crunching. Something with more shellscripts might show more of a difference.

(I have to admit I haven't benchmarked this myself.)

J
Michael Abd-El-Malek
2008-Apr-25 18:31 UTC
Re: [Xen-devel] Vanilla Linux and has_foreign_mapping
Jeremy Fitzhardinge wrote:
> Michael Abd-El-Malek wrote:
>>
>> Jeremy Fitzhardinge wrote:
>>> Keir Fraser wrote:
>>>> I'm not really familiar with the pv_ops code I'm afraid. But thinking
>>>> about this some more I've realised there's no way really to avoid
>>>> making the early-unpin logic aware of gntdev mappings. This is
>>>> because if we do pin pte pages, and require them to remain pinned
>>>> across early-unpin, then pgd_unpin() must not attempt to make those
>>>> pte pages writable. That will fail, because the pages are still
>>>> pinned! You'd either need to handle the failure to make the page
>>>> writable, or have a per-page flag to indicate which pte pages contain
>>>> gntdev mappings. Frankly you may as well stick with the
>>>> per-mm-context has_foreign_mappings flag.
>>>
>>> So the issue is that a pte page containing a _PAGE_IO pte must remain
>>> pinned while it contains that mapping? Would shooting down the mapping
>>> allow it to be unpinned, or does that need to be deferred until some
>>> later point (if so, when?)?
>>>
>>> I guess the downside is that we'd need to scan the pte looking for
>>> _PAGE_IO mappings, which is a bit of a pain. Skipping that would mean
>>> hiding a flag somewhere...
>>>
>>>> Is it a pain to add a pv_ops-subtype-specific flag to mm_context? If
>>>> so you could maintain a set datastructure instead, indicating which
>>>> mm_contexts contain foreign mappings.
>>>
>>> So, in 2.6.18-xen mm->has_foreign_mapping makes it skip early-unpin,
>>> but puts it off until pgd_free(). Presumably that works because all
>>> the vma's all been unmapped by then...
>>
>> The following patch was sufficient for me. I delayed the arch_exit_mmap
>> (which eventually calls into xen) until after unmap_vmas is called,
>> which calls zap_pte (where I unmap the grant). Presumably, there is a
>> performance overhead to always doing this delay, and hence 2.6.18 only
>> did the delay if has_foreign_mappings is set. For macrobenchmarks like
>> compilation, I couldn't find a difference.
>>
>> Cheers,
>> Mike
>>
>> diff --git a/mm/mmap.c b/mm/mmap.c
>> index a32d28c..c118b54 100644
>> --- a/mm/mmap.c
>> +++ b/mm/mmap.c
>> @@ -2036,15 +2036,14 @@ void exit_mmap(struct mm_struct *mm)
>>  	unsigned long nr_accounted = 0;
>>  	unsigned long end;
>>
>> -	/* mm's last user has gone, and its about to be pulled down */
>> -	arch_exit_mmap(mm);
>> -
>>  	lru_add_drain();
>>  	flush_cache_mm(mm);
>>  	tlb = tlb_gather_mmu(mm, 1);
>>  	/* Don't update_hiwater_rss(mm) here, do_exit already did */
>>  	/* Use -1 here to ensure all VMAs in the mm are unmapped */
>>  	end = unmap_vmas(&tlb, vma, 0, -1, &nr_accounted, NULL);
>> +	/* mm's last user has gone, and its about to be pulled down */
>> +	arch_exit_mmap(mm);
>
> Yeah, that pretty much removes the point of "early unpin". The problem
> is that unmap_vmas tears down the pagetable, and if it's pinned that
> means lots of hypercalls. If it has already been unpinned it can be
> much more efficient.
>
> It mainly affects exec/exit performance, and I'm not sure a kernel
> compile is all that execy/exity - most of the time is just spent in gcc
> crunching. Something with more shellscripts might show more of a
> difference. (I have to admit I haven't benchmarked this myself.)
>
>     J

How about we do the following:

	arch_exit_mmap_pre(mm);

	lru_add_drain();
	flush_cache_mm(mm);
	tlb = tlb_gather_mmu(mm, 1);
	/* Don't update_hiwater_rss(mm) here, do_exit already did */
	/* Use -1 here to ensure all VMAs in the mm are unmapped */
	end = unmap_vmas(&tlb, vma, 0, -1, &nr_accounted, NULL);

	arch_exit_mmap_post(mm);

We'll reintroduce has_foreign_mappings. If has_foreign_mappings is
_not_ set, then arch_exit_mmap_pre can early unpin the page tables and
arch_exit_mmap_post will do nothing. If has_foreign_mappings is set,
then arch_exit_mmap_pre won't do anything, and arch_exit_mmap_post will
do the actual xen_exit_mmap call.

What do you think?

Mike
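The pre/post split proposed in this message can be sketched as a small user-space model. This is only an illustration under stated assumptions, not kernel code: struct mm_model and every model_* function are hypothetical stand-ins for mm_struct, xen_exit_mmap, unmap_vmas and the two proposed hooks, and the foreign_ptes counter merely models grant mappings that zap_pte would remove.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the Xen-relevant state of an mm. */
struct mm_model {
	bool pgd_pinned;            /* models PagePinned(virt_to_page(mm->pgd)) */
	bool has_foreign_mappings;  /* set once a grant is mapped into this mm */
	int  foreign_ptes;          /* grant mappings still present in the page tables */
};

/* Models xen_exit_mmap(): unpin the pgd if it is still pinned. */
static void model_xen_exit_mmap(struct mm_model *mm)
{
	if (mm->pgd_pinned) {
		/* Unpinning while grant ptes remain is the case the flag avoids. */
		assert(mm->foreign_ptes == 0);
		mm->pgd_pinned = false;
	}
}

/* Proposed pre hook: early unpin only when there are no foreign mappings. */
static void model_arch_exit_mmap_pre(struct mm_model *mm)
{
	if (!mm->has_foreign_mappings)
		model_xen_exit_mmap(mm);
}

/* Models unmap_vmas(): zap_pte unmaps every remaining grant mapping. */
static void model_unmap_vmas(struct mm_model *mm)
{
	mm->foreign_ptes = 0;
}

/* Proposed post hook: deferred unpin when foreign mappings were present. */
static void model_arch_exit_mmap_post(struct mm_model *mm)
{
	if (mm->has_foreign_mappings)
		model_xen_exit_mmap(mm);
}

/*
 * Models exit_mmap() with both hooks. Returns true when the page tables
 * were already unpinned during teardown (the cheap "early unpin" path).
 */
static bool model_exit_mmap(struct mm_model *mm)
{
	bool unpinned_during_teardown;

	model_arch_exit_mmap_pre(mm);
	unpinned_during_teardown = !mm->pgd_pinned;
	model_unmap_vmas(mm);
	model_arch_exit_mmap_post(mm);
	return unpinned_during_teardown;
}
```

In this model an mm without foreign mappings keeps the early-unpin fast path, while an mm with foreign mappings tears down its page tables while still pinned and is unpinned only afterwards.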
Jeremy Fitzhardinge
2008-Apr-25 22:33 UTC
Re: [Xen-devel] Vanilla Linux and has_foreign_mapping
Michael Abd-El-Malek wrote:
> How about we do the following:
>
> 	arch_exit_mmap_pre(mm);
>
> 	lru_add_drain();
> 	flush_cache_mm(mm);
> 	tlb = tlb_gather_mmu(mm, 1);
> 	/* Don't update_hiwater_rss(mm) here, do_exit already did */
> 	/* Use -1 here to ensure all VMAs in the mm are unmapped */
> 	end = unmap_vmas(&tlb, vma, 0, -1, &nr_accounted, NULL);
>
> 	arch_exit_mmap_post(mm);
>
> We'll reintroduce has_foreign_mappings. If has_foreign_mappings is
> _not_ set, then arch_exit_mmap_pre can early unpin the page tables and
> arch_exit_mmap_post will do nothing. If has_foreign_mappings is set,
> then arch_exit_mmap_pre won't do anything, and arch_exit_mmap_post
> will do the actual xen_exit_mmap call.
>
> What do you think?

I'm thinking along the lines of:

   1. steal the "private" field in struct page for Xen pte pages
   2. if we install a grant mapping in that page, allocate a secondary
      page and point private to it. In that secondary page, keep an
      array of grant handles corresponding to the grant mappings in the
      pte page (non-grant mappings have an invalid handle).
   3. In unpin_page, if we're unpinning a pte page with a non-null
      private page, then walk the private page to tear down the grant
      mappings, and free the private page, and unpin the pte page
      normally.

I like it because it 1) avoids the need for any core kernel hooks, and
2) decouples unpinning grant pages from the mechanism used to actually
map the grant pages, 2a) the metadata for granted pages is stored with
the pagetable (effectively), so the grant driver doesn't need to do
anything special to make it work. Also it means all the information to
pull down the mapping is available for normal unmap operations (ie, we
can do it in set_pte without needing a special zap_pte hook).

No doubt I'm overlooking something important. What is it?

I guess one concern is if the per-grant-mapping data is larger than a
pte, then the private "page" will either need to be larger than a page,
or more complex a structure than a simple array. The kernel and user
handles would be stored separately, since they'd have separate ptes
anyway. Looks like it will need to be a (domid, ref, handle) tuple,
which would be 10 bytes. Are refs and/or handles really 32-bit
quantities? Hm, though it looks like GNTTABOP_unmap_grant_ref only uses
the handle, so that's quite convenient.

Would this scheme work? Does it seem reasonable? Does it solve the
problem?

    J
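The shadow-page idea and the size concern above can be sketched in user-space C. This is a rough model under explicit assumptions, not the kernel implementation: it assumes 4 KiB pages, PAE page tables with 512 ptes per pte page, and a 32-bit grant handle; the model_* names, MODEL_INVALID_HANDLE, and the use of malloc in place of a page allocation are all hypothetical. It shows that a handle-only array fits in one page (512 x 4 = 2048 bytes), while a 10-byte (domid, ref, handle) tuple per pte would not (512 x 10 = 5120 bytes).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE      4096u
#define MODEL_PTRS_PER_PTE   512u        /* assumption: PAE, 8-byte ptes */
#define MODEL_INVALID_HANDLE UINT32_MAX  /* hypothetical "no grant in this slot" marker */

/* Size of the (domid, ref, handle) tuple from the message: 2 + 4 + 4 bytes. */
enum { MODEL_TUPLE_BYTES = sizeof(uint16_t) + 2 * sizeof(uint32_t) };

/* Bytes needed to keep one metadata entry per pte slot. */
static size_t model_shadow_bytes(size_t entry_size)
{
	return (size_t)MODEL_PTRS_PER_PTE * entry_size;
}

/* Models a pte page whose page->private points at the shadow array. */
struct model_pte_page {
	uint32_t *priv;  /* NULL until the first grant mapping is installed */
};

/* Step 2: record a grant handle, allocating the shadow array on first use. */
static int model_install_grant(struct model_pte_page *pt, unsigned idx,
			       uint32_t handle)
{
	unsigned i;

	if (idx >= MODEL_PTRS_PER_PTE)
		return -1;
	if (!pt->priv) {
		/* A handle-only array must fit in a single page. */
		assert(model_shadow_bytes(sizeof(uint32_t)) <= MODEL_PAGE_SIZE);
		pt->priv = malloc(model_shadow_bytes(sizeof(uint32_t)));
		if (!pt->priv)
			return -1;
		for (i = 0; i < MODEL_PTRS_PER_PTE; i++)
			pt->priv[i] = MODEL_INVALID_HANDLE;
	}
	pt->priv[idx] = handle;
	return 0;
}

/*
 * Step 3: on unpin, walk the shadow array, tear down each grant mapping,
 * then free the array. Returns the number of mappings torn down.
 */
static unsigned model_unpin_pte_page(struct model_pte_page *pt)
{
	unsigned i, torn_down = 0;

	if (pt->priv) {
		for (i = 0; i < MODEL_PTRS_PER_PTE; i++) {
			if (pt->priv[i] != MODEL_INVALID_HANDLE) {
				/* Real code would unmap the grant using this handle. */
				torn_down++;
			}
		}
		free(pt->priv);
		pt->priv = NULL;
	}
	/* The pte page itself would be unpinned normally at this point. */
	return torn_down;
}
```

A pte page that never received a grant mapping keeps a NULL private pointer, so the common unpin path pays only for one pointer check.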
Jeremy Fitzhardinge
2008-Apr-26 00:14 UTC
Re: [Xen-devel] Vanilla Linux and has_foreign_mapping
Andrea Arcangeli wrote:
> Hello everyone,
>
> On Fri, Apr 25, 2008 at 01:11:28PM -0400, Michael Abd-El-Malek wrote:
>
>> -	/* mm's last user has gone, and its about to be pulled down */
>> -	arch_exit_mmap(mm);
>> -
>>  	lru_add_drain();
>>  	flush_cache_mm(mm);
>>  	tlb = tlb_gather_mmu(mm, 1);
>>  	/* Don't update_hiwater_rss(mm) here, do_exit already did */
>>  	/* Use -1 here to ensure all VMAs in the mm are unmapped */
>>  	end = unmap_vmas(&tlb, vma, 0, -1, &nr_accounted, NULL);
>> +	/* mm's last user has gone, and its about to be pulled down */
>> +	arch_exit_mmap(mm);
>
> If that's what you need I doubt mmu notifiers can help. mmu notifiers
> allows to keep secondary mmu mappings (like vmx/svm/npt/ept sptes) in
> total synchrony with the primary mmu mappings established by the linux
> VM. All secondary mmu mappings must be zapped and the secondary mmu
> must be freezed before the pages are freed, hence the last mmu
> notifier call is ->release and it's done _before_ the above
> unmap_vmas.

For Xen, arch_exit_mmap() needs to be done before unmap_vmas since the
whole point is to switch to init_mm before tearing down the pagetable to
avoid lots of hypercalls. The trouble is that mappings of foreign pages
need to be dealt with specially because Xen requires that they be
unmapped with a special mechanism.

What I was wondering is whether, rather than getting a callback, we
could call into the mmu notifier machinery to get a list of mapped
foreign pages and their corresponding pte pointers so that they can be
"manually" unmapped early in Xen's arch_exit_mmap().

On the other hand, I think we can just hang a shadow page off the pte
page's struct page to store all the extra metadata we need...

    J
Michael Abd-El-Malek
2008-Apr-29 16:39 UTC
Re: [Xen-devel] Vanilla Linux and has_foreign_mapping
On Apr 25, 2008, at 6:33 PM, Jeremy Fitzhardinge wrote:

> Michael Abd-El-Malek wrote:
>> How about we do the following:
>>
>> 	arch_exit_mmap_pre(mm);
>>
>> 	lru_add_drain();
>> 	flush_cache_mm(mm);
>> 	tlb = tlb_gather_mmu(mm, 1);
>> 	/* Don't update_hiwater_rss(mm) here, do_exit already did */
>> 	/* Use -1 here to ensure all VMAs in the mm are unmapped */
>> 	end = unmap_vmas(&tlb, vma, 0, -1, &nr_accounted, NULL);
>>
>> 	arch_exit_mmap_post(mm);
>>
>> We'll reintroduce has_foreign_mappings. If has_foreign_mappings is
>> _not_ set, then arch_exit_mmap_pre can early unpin the page tables
>> and arch_exit_mmap_post will do nothing. If has_foreign_mappings
>> is set, then arch_exit_mmap_pre won't do anything, and
>> arch_exit_mmap_post will do the actual xen_exit_mmap call.
>>
>> What do you think?
>
> I'm thinking along the lines of:
>
>   1. steal the "private" field in struct page for Xen pte pages
>   2. if we install a grant mapping in that page, allocate a secondary
>      page and point private to it. In that secondary page, keep an
>      array of grant handles corresponding to the grant mappings in the
>      pte page (non-grant mappings have an invalid handle).
>   3. In unpin_page, if we're unpinning a pte page with a non-null
>      private page, then walk the private page to tear down the grant
>      mappings, and free the private page, and unpin the pte page
>      normally.
>
> I like it because it 1) avoids the need for any core kernel hooks,
> and 2) decouples unpinning grant pages from the mechanism used to
> actually map the grant pages, 2a) the metadata for granted pages is
> stored with the pagetable (effectively), so the grant driver doesn't
> need to do anything special to make it work. Also it means all the
> information to pull down the mapping is available for normal unmap
> operations (ie, we can do it in set_pte without needing a special
> zap_pte hook).

I like this approach!

> No doubt I'm overlooking something important. What is it?

Some drivers may need to do additional tasks besides just clearing the
PTE. For example, when unzapping my kernel PTE, I need to restore the
physical mapping of the page. (On the initial set_pte (which I've
overridden), I removed the physical backing of the page.)

If we really want to avoid a zap_pte hook, I suppose we can add flags
to the page/PTE that indicate things like "this page needs to have its
physical backing restored".

> I guess one concern is if the per-grant-mapping data is larger than
> a pte, then the private "page" will either need to be larger than a
> page, or more complex a structure than a simple array. The kernel
> and user handles would be stored separately, since they'd have
> separate ptes anyway. Looks like it will need to be a (domid, ref,
> handle) tuple, which would be 10 bytes. Are refs and/or handles
> really 32-bit quantities? Hm, though it looks like
> GNTTABOP_unmap_grant_ref only uses the handle, so that's quite
> convenient.

Do we even need to store a domid? The grant handle is all you need to
unmap the grant. And that's 32 bits.

> Would this scheme work? Does it seem reasonable? Does it solve the
> problem?

It's definitely reasonable, clean, and would solve the problem. My only
concern is stated above. If you think that having a "restore physical
backing" page/PTE flag is OK, then I'm willing to make a 2.6.25 patch
for this. The next couple of weeks are a bit hectic, but I can have it
done by mid-May.

Cheers,
Mike
Jeremy Fitzhardinge
2008-Apr-29 17:32 UTC
Re: [Xen-devel] Vanilla Linux and has_foreign_mapping
Michael Abd-El-Malek wrote:
> Some drivers may need to do additional tasks besides just clearing the
> PTE. For example, when unzapping my kernel PTE, I need to restore
> the physical mapping of the page. (On the initial set_pte (which I've
> overridden), I removed the physical backing of the page.)
>
> If we really want to avoid a zap_pte hook, I suppose we can add flags
> to the page/PTE that indicate things like "this page needs to have its
> physical backing restored".

Is that something that needs to happen synchronously with the unmap, or
could the driver do it in its vma callback? What do you mean by
removing/restoring the physical backing? Do any of the in-tree drivers
do this?

In the PAE case, which I think we can regard as usual these days, we
have 64 bits to store per-mapping info, so stashing a callback or
callback index seems pretty straightforward if we need full generality.

    J
Michael Abd-El-Malek
2008-Apr-30 16:31 UTC
Re: [Xen-devel] Vanilla Linux and has_foreign_mapping
On Apr 29, 2008, at 1:32 PM, Jeremy Fitzhardinge wrote:

> Michael Abd-El-Malek wrote:
>> Some drivers may need to do additional tasks besides just clearing
>> the PTE. For example, when unzapping my kernel PTE, I need to
>> restore the physical mapping of the page. (On the initial set_pte
>> (which I've overridden), I removed the physical backing of the page.)
>>
>> If we really want to avoid a zap_pte hook, I suppose we can add
>> flags to the page/PTE that indicate things like "this page needs to
>> have its physical backing restored".
>
> Is that something that needs to happen synchronously with the unmap,
> or could the driver do it in its vma callback? What do you mean by
> remove/restoring the physical backing? Do any of the in-tree
> drivers do this?

Good point. This extra work can happen later in the vma's callback.

By removing/restoring the physical backing, I mean doing the work in
balloon.c. Removing the physical backing is done by doing a
XENMEM_decrease_reservation hypercall, while restoring the physical
backing is done through XENMEM_populate_physmap.

> In the PAE case, which I think we can regard as usual these days, we
> have 64-bits to store per-mapping info, so stashing a callback or
> callback index seems pretty straightforward if we need full
> generality.

I don't think there's a need for this, either in the current
gntdev/blktap or in my work. The generic xen code can undo the
user-space grant, and the vma callback can later do any extra work.

Cheers,
Mike