Ian Campbell
2010-Oct-12 10:14 UTC
[Xen-devel] [PATCH] xen: correctly rebuild mfn list list after migration.
Otherwise the second migration attempt fails because the mfn_list_list still refers to all the old mfns. We need to update the entires in both p2m_top_mfn and the mid_mfn pages which p2m_top_mfn refers to. In order to do this we need to keep track of the virtual addresses mapping the p2m_mid_mfn pages since we cannot rely on mfn_to_virt(p2m_top_mfn[idx]) since p2m_top_mfn[idx] will still contain the old MFN after a migration, which may now belong to another domain and hence have a different mapping in the m2p. Therefore add and maintain a third top level page, p2m_mid_mfn_p[], which tracks the virtual addresses of the mfns contained in p2m_top_mfn[]. We also need to update the content of the p2m_mid_missing_mfn page on resume to refer to the page''s new mfn. p2m_missing does not need updating since the migration process takes care of the leaf p2m pages for us. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> --- arch/x86/xen/mmu.c | 49 ++++++++++++++++++++++++++++++++++++++----------- 1 files changed, 38 insertions(+), 11 deletions(-) diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c index 16a8e25..8788064 100644 --- a/arch/x86/xen/mmu.c +++ b/arch/x86/xen/mmu.c @@ -185,6 +185,8 @@ DEFINE_PER_CPU(unsigned long, xen_current_cr3); /* actual vcpu cr3 */ * / \ / \ / / * p2m p2m p2m p2m p2m p2m p2m ... * + * The p2m_mid_mfn pages are mapped by p2m_mid_mfn_p. + * * The p2m_top and p2m_top_mfn levels are limited to 1 page, so the * maximum representable pseudo-physical address space is: * P2M_TOP_PER_PAGE * P2M_MID_PER_PAGE * P2M_PER_PAGE pages @@ -209,6 +211,7 @@ static RESERVE_BRK_ARRAY(unsigned long, p2m_mid_missing_mfn, P2M_MID_PER_PAGE); static RESERVE_BRK_ARRAY(unsigned long **, p2m_top, P2M_TOP_PER_PAGE); static RESERVE_BRK_ARRAY(unsigned long, p2m_top_mfn, P2M_TOP_PER_PAGE); +static RESERVE_BRK_ARRAY(unsigned long *, p2m_mid_mfn_p, P2M_TOP_PER_PAGE); RESERVE_BRK(p2m_mid, PAGE_SIZE * (MAX_DOMAIN_PAGES / (P2M_PER_PAGE * P2M_MID_PER_PAGE))); RESERVE_BRK(p2m_mid_mfn, PAGE_SIZE * (MAX_DOMAIN_PAGES / (P2M_PER_PAGE * P2M_MID_PER_PAGE))); @@ -245,6 +248,14 @@ static void p2m_top_mfn_init(unsigned long *top) top[i] = virt_to_mfn(p2m_mid_missing_mfn); } +static void p2m_mid_mfn_p_init(unsigned long **top) +{ + unsigned i; + + for (i = 0; i < P2M_TOP_PER_PAGE; i++) + top[i] = p2m_mid_missing_mfn; +} + static void p2m_mid_init(unsigned long **mid) { unsigned i; @@ -301,15 +312,21 @@ EXPORT_SYMBOL(create_lookup_pte_addr); */ void xen_build_mfn_list_list(void) { - unsigned pfn; + unsigned long pfn; /* Pre-initialize p2m_top_mfn to be completely missing */ if (p2m_top_mfn == NULL) { p2m_mid_missing_mfn = extend_brk(PAGE_SIZE, PAGE_SIZE); p2m_mid_mfn_init(p2m_mid_missing_mfn); + p2m_mid_mfn_p = extend_brk(PAGE_SIZE, PAGE_SIZE); + p2m_mid_mfn_p_init(p2m_mid_mfn_p); + p2m_top_mfn = extend_brk(PAGE_SIZE, PAGE_SIZE); p2m_top_mfn_init(p2m_top_mfn); + } else { + /* Reinitialise, mfn''s all change after migration */ + p2m_mid_mfn_init(p2m_mid_missing_mfn); } for (pfn = 0; pfn < xen_max_p2m_pfn; pfn += P2M_PER_PAGE) { @@ -322,14 +339,19 @@ void xen_build_mfn_list_list(void) mid = p2m_top[topidx]; /* Don''t bother allocating any mfn mid levels if - they''re just missing */ - if (mid[mididx] == p2m_missing) + * they''re just missing, just update the stored mfn, + * since all could have changed over a migrate. + */ + if (mid == p2m_mid_missing) { + p2m_top_mfn[topidx] = virt_to_mfn(p2m_mid_missing); + pfn += P2M_MID_PER_PAGE - 1; continue; + } - mid_mfn = p2m_top_mfn[topidx]; - mid_mfn_p = mfn_to_virt(mid_mfn); + mid_mfn_p = p2m_mid_mfn_p[topidx]; + mid_mfn = virt_to_mfn(mid_mfn_p); - if (mid_mfn_p == p2m_mid_missing_mfn) { + if (mid_mfn_p == p2m_missing) { /* * XXX boot-time only! We should never find * missing parts of the mfn tree after @@ -340,10 +362,11 @@ void xen_build_mfn_list_list(void) p2m_mid_mfn_init(mid_mfn_p); mid_mfn = virt_to_mfn(mid_mfn_p); - - p2m_top_mfn[topidx] = mid_mfn; + p2m_mid_mfn_p[topidx] = mid_mfn_p; } + p2m_top_mfn[topidx] = mid_mfn; + mid_mfn_p[mididx] = virt_to_mfn(mid[mididx]); } } @@ -362,7 +385,7 @@ void __init xen_build_dynamic_phys_to_machine(void) { unsigned long *mfn_list = (unsigned long *)xen_start_info->mfn_list; unsigned long max_pfn = min(MAX_DOMAIN_PAGES, xen_start_info->nr_pages); - unsigned pfn; + unsigned long pfn; xen_max_p2m_pfn = max_pfn; @@ -452,7 +475,9 @@ static bool alloc_p2m(unsigned long pfn) } top_mfn_p = &p2m_top_mfn[topidx]; - mid_mfn = mfn_to_virt(*top_mfn_p); + mid_mfn = p2m_mid_mfn_p[topidx]; + + BUG_ON(mid_mfn != mfn_to_virt(*top_mfn_p)); if (mid_mfn == p2m_mid_missing_mfn) { /* Separately check the mid mfn level */ @@ -464,11 +489,13 @@ static bool alloc_p2m(unsigned long pfn) return false; p2m_mid_mfn_init(mid_mfn); - + missing_mfn = virt_to_mfn(p2m_mid_missing_mfn); mid_mfn_mfn = virt_to_mfn(mid_mfn); if (cmpxchg(top_mfn_p, missing_mfn, mid_mfn_mfn) != missing_mfn) free_p2m_page(mid_mfn); + else + p2m_mid_mfn_p[topidx] = mid_mfn; } if (p2m_top[topidx][mididx] == p2m_missing) { -- 1.5.6.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2010-Oct-14 00:37 UTC
[Xen-devel] Re: [PATCH] xen: correctly rebuild mfn list list after migration.
On 10/12/2010 03:14 AM, Ian Campbell wrote:> Otherwise the second migration attempt fails because the mfn_list_list > still refers to all the old mfns. > > We need to update the entires in both p2m_top_mfn and the mid_mfn > pages which p2m_top_mfn refers to. > > In order to do this we need to keep track of the virtual addresses > mapping the p2m_mid_mfn pages since we cannot rely on > mfn_to_virt(p2m_top_mfn[idx]) since p2m_top_mfn[idx] will still > contain the old MFN after a migration, which may now belong to another > domain and hence have a different mapping in the m2p. > > Therefore add and maintain a third top level page, p2m_mid_mfn_p[], > which tracks the virtual addresses of the mfns contained in > p2m_top_mfn[]. > > We also need to update the content of the p2m_mid_missing_mfn page on > resume to refer to the page''s new mfn. > > p2m_missing does not need updating since the migration process takes > care of the leaf p2m pages for us. > > Signed-off-by: Ian Campbell <ian.campbell@citrix.com> > --- > arch/x86/xen/mmu.c | 49 ++++++++++++++++++++++++++++++++++++++----------- > 1 files changed, 38 insertions(+), 11 deletions(-) > > diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c > index 16a8e25..8788064 100644 > --- a/arch/x86/xen/mmu.c > +++ b/arch/x86/xen/mmu.c > @@ -185,6 +185,8 @@ DEFINE_PER_CPU(unsigned long, xen_current_cr3); /* actual vcpu cr3 */ > * / \ / \ / / > * p2m p2m p2m p2m p2m p2m p2m ... > * > + * The p2m_mid_mfn pages are mapped by p2m_mid_mfn_p. > + * > * The p2m_top and p2m_top_mfn levels are limited to 1 page, so the > * maximum representable pseudo-physical address space is: > * P2M_TOP_PER_PAGE * P2M_MID_PER_PAGE * P2M_PER_PAGE pages > @@ -209,6 +211,7 @@ static RESERVE_BRK_ARRAY(unsigned long, p2m_mid_missing_mfn, P2M_MID_PER_PAGE); > > static RESERVE_BRK_ARRAY(unsigned long **, p2m_top, P2M_TOP_PER_PAGE); > static RESERVE_BRK_ARRAY(unsigned long, p2m_top_mfn, P2M_TOP_PER_PAGE); > +static RESERVE_BRK_ARRAY(unsigned long *, p2m_mid_mfn_p, P2M_TOP_PER_PAGE); > > RESERVE_BRK(p2m_mid, PAGE_SIZE * (MAX_DOMAIN_PAGES / (P2M_PER_PAGE * P2M_MID_PER_PAGE))); > RESERVE_BRK(p2m_mid_mfn, PAGE_SIZE * (MAX_DOMAIN_PAGES / (P2M_PER_PAGE * P2M_MID_PER_PAGE))); > @@ -245,6 +248,14 @@ static void p2m_top_mfn_init(unsigned long *top) > top[i] = virt_to_mfn(p2m_mid_missing_mfn); > } > > +static void p2m_mid_mfn_p_init(unsigned long **top) > +{ > + unsigned i; > + > + for (i = 0; i < P2M_TOP_PER_PAGE; i++) > + top[i] = p2m_mid_missing_mfn; > +} > + > static void p2m_mid_init(unsigned long **mid) > { > unsigned i; > @@ -301,15 +312,21 @@ EXPORT_SYMBOL(create_lookup_pte_addr); > */ > void xen_build_mfn_list_list(void) > { > - unsigned pfn; > + unsigned long pfn; > > /* Pre-initialize p2m_top_mfn to be completely missing */ > if (p2m_top_mfn == NULL) { > p2m_mid_missing_mfn = extend_brk(PAGE_SIZE, PAGE_SIZE); > p2m_mid_mfn_init(p2m_mid_missing_mfn); > > + p2m_mid_mfn_p = extend_brk(PAGE_SIZE, PAGE_SIZE); > + p2m_mid_mfn_p_init(p2m_mid_mfn_p); > + > p2m_top_mfn = extend_brk(PAGE_SIZE, PAGE_SIZE); > p2m_top_mfn_init(p2m_top_mfn); > + } else { > + /* Reinitialise, mfn''s all change after migration */ > + p2m_mid_mfn_init(p2m_mid_missing_mfn); > } > > for (pfn = 0; pfn < xen_max_p2m_pfn; pfn += P2M_PER_PAGE) { > @@ -322,14 +339,19 @@ void xen_build_mfn_list_list(void) > mid = p2m_top[topidx]; > > /* Don''t bother allocating any mfn mid levels if > - they''re just missing */ > - if (mid[mididx] == p2m_missing) > + * they''re just missing, just update the stored mfn, > + * since all could have changed over a migrate. > + */ > + if (mid == p2m_mid_missing) { > + p2m_top_mfn[topidx] = virt_to_mfn(p2m_mid_missing); > + pfn += P2M_MID_PER_PAGE - 1; > continue; > + } > > - mid_mfn = p2m_top_mfn[topidx]; > - mid_mfn_p = mfn_to_virt(mid_mfn); > + mid_mfn_p = p2m_mid_mfn_p[topidx]; > + mid_mfn = virt_to_mfn(mid_mfn_p); > > - if (mid_mfn_p == p2m_mid_missing_mfn) { > + if (mid_mfn_p == p2m_missing) { > /* > * XXX boot-time only! We should never find > * missing parts of the mfn tree after > @@ -340,10 +362,11 @@ void xen_build_mfn_list_list(void) > p2m_mid_mfn_init(mid_mfn_p); > > mid_mfn = virt_to_mfn(mid_mfn_p); > - > - p2m_top_mfn[topidx] = mid_mfn; > + p2m_mid_mfn_p[topidx] = mid_mfn_p; > } > > + p2m_top_mfn[topidx] = mid_mfn; > + > mid_mfn_p[mididx] = virt_to_mfn(mid[mididx]); > } > } > @@ -362,7 +385,7 @@ void __init xen_build_dynamic_phys_to_machine(void) > { > unsigned long *mfn_list = (unsigned long *)xen_start_info->mfn_list; > unsigned long max_pfn = min(MAX_DOMAIN_PAGES, xen_start_info->nr_pages); > - unsigned pfn; > + unsigned long pfn; > > xen_max_p2m_pfn = max_pfn; > > @@ -452,7 +475,9 @@ static bool alloc_p2m(unsigned long pfn) > } > > top_mfn_p = &p2m_top_mfn[topidx]; > - mid_mfn = mfn_to_virt(*top_mfn_p); > + mid_mfn = p2m_mid_mfn_p[topidx]; > + > + BUG_ON(mid_mfn != mfn_to_virt(*top_mfn_p));I''m getting this triggering at boot: PM: Adding info for No Bus:xen!gntdev ------------[ cut here ]------------ kernel BUG at /home/jeremy/git/linux/arch/x86/xen/mmu.c:480! invalid opcode: 0000 [#1] SMP last sysfs file: CPU 0 Modules linked in: Pid: 6, comm: events/0 Not tainted 2.6.32.24-next #220 RIP: e030:[<ffffffff8100d715>] [<ffffffff8100d715>] set_phys_to_machine+0x12f/0x2d9 RSP: e02b:ffff88001fd8bca0 EFLAGS: 00010206 RAX: ffff88000227d000 RBX: ffffffff8227d000 RCX: 0000000000000016 RDX: ffff880000000000 RSI: 0000000000195a4e RDI: 00000000000207ff RBP: ffff88001fd8bcf0 R08: 0000000000000003 R09: 00000003981ce0e5 R10: 0000000000000000 R11: 0000000000000003 R12: 00000000000207ff R13: 0000000000195a4d R14: 0000000000000000 R15: ffffffff8227f000 FS: 0000000000000000(0000) GS:ffff8800051bd000(0000) knlGS:0000000000000000 CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000000000000 CR3: 0000000001001000 CR4: 0000000000002660 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Process events/0 (pid: 6, threadinfo ffff88001fd8a000, task ffff88001fd88180) Stack: 2222222222222222 2222222222222222 2222222222222222 ffffffff82278000 <0> 0000000000000001 ffffea0000b2bfa8 00000000000207ff 0000000000000000 <0> 0000000000000200 0000000000000200 ffff88001fd8bdc0 ffffffff812a128e Call Trace: [<ffffffff812a128e>] balloon_process+0x20a/0x626 [<ffffffff81069cfb>] ? worker_thread+0x1fc/0x347 [<ffffffff81069d50>] worker_thread+0x251/0x347 [<ffffffff81069cfb>] ? worker_thread+0x1fc/0x347 [<ffffffff812a1084>] ? balloon_process+0x0/0x626 [<ffffffff8106e8ce>] ? autoremove_wake_function+0x0/0x39 [<ffffffff81069aff>] ? worker_thread+0x0/0x347 [<ffffffff8106e5f4>] kthread+0x7f/0x87 [<ffffffff81013e4a>] child_rip+0xa/0x20 [<ffffffff810137d0>] ? restore_args+0x0/0x30 [<ffffffff81013e40>] ? child_rip+0x0/0x20 Code: 83 c8 ff eb 10 48 c1 e0 03 31 d2 48 03 05 c4 c3 75 00 48 8b 00 48 c1 e0 0c 48 ba 00 00 00 00 00 88 ff ff 48 01 d0 48 39 c3 74 04 <0f> 0b eb fe 48 3b 1d 80 68 94 00 0f 85 bd 00 00 00 31 f6 bf d0 RIP [<ffffffff8100d715>] set_phys_to_machine+0x12f/0x2d9 RSP <ffff88001fd8bca0> ---[ end trace 93d72a36b9146f22 ]---> > if (mid_mfn == p2m_mid_missing_mfn) { > /* Separately check the mid mfn level */ > @@ -464,11 +489,13 @@ static bool alloc_p2m(unsigned long pfn) > return false; > > p2m_mid_mfn_init(mid_mfn); > - > + > missing_mfn = virt_to_mfn(p2m_mid_missing_mfn); > mid_mfn_mfn = virt_to_mfn(mid_mfn); > if (cmpxchg(top_mfn_p, missing_mfn, mid_mfn_mfn) != missing_mfn) > free_p2m_page(mid_mfn); > + else > + p2m_mid_mfn_p[topidx] = mid_mfn; > } > > if (p2m_top[topidx][mididx] == p2m_missing) {_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2010-Oct-14 01:43 UTC
Re: [Xen-devel] [PATCH] xen: correctly rebuild mfn list list after migration.
On 10/12/2010 03:14 AM, Ian Campbell wrote:> Otherwise the second migration attempt fails because the mfn_list_list > still refers to all the old mfns. > > We need to update the entires in both p2m_top_mfn and the mid_mfn > pages which p2m_top_mfn refers to. > > In order to do this we need to keep track of the virtual addresses > mapping the p2m_mid_mfn pages since we cannot rely on > mfn_to_virt(p2m_top_mfn[idx]) since p2m_top_mfn[idx] will still > contain the old MFN after a migration, which may now belong to another > domain and hence have a different mapping in the m2p. > > Therefore add and maintain a third top level page, p2m_mid_mfn_p[], > which tracks the virtual addresses of the mfns contained in > p2m_top_mfn[]. > > We also need to update the content of the p2m_mid_missing_mfn page on > resume to refer to the page''s new mfn. > > p2m_missing does not need updating since the migration process takes > care of the leaf p2m pages for us. > > Signed-off-by: Ian Campbell <ian.campbell@citrix.com> > --- > arch/x86/xen/mmu.c | 49 ++++++++++++++++++++++++++++++++++++++----------- > 1 files changed, 38 insertions(+), 11 deletions(-) > > diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c > index 16a8e25..8788064 100644 > --- a/arch/x86/xen/mmu.c > +++ b/arch/x86/xen/mmu.c > @@ -185,6 +185,8 @@ DEFINE_PER_CPU(unsigned long, xen_current_cr3); /* actual vcpu cr3 */ > * / \ / \ / / > * p2m p2m p2m p2m p2m p2m p2m ... > * > + * The p2m_mid_mfn pages are mapped by p2m_mid_mfn_p. > + * > * The p2m_top and p2m_top_mfn levels are limited to 1 page, so the > * maximum representable pseudo-physical address space is: > * P2M_TOP_PER_PAGE * P2M_MID_PER_PAGE * P2M_PER_PAGE pages > @@ -209,6 +211,7 @@ static RESERVE_BRK_ARRAY(unsigned long, p2m_mid_missing_mfn, P2M_MID_PER_PAGE); > > static RESERVE_BRK_ARRAY(unsigned long **, p2m_top, P2M_TOP_PER_PAGE); > static RESERVE_BRK_ARRAY(unsigned long, p2m_top_mfn, P2M_TOP_PER_PAGE); > +static RESERVE_BRK_ARRAY(unsigned long *, p2m_mid_mfn_p, P2M_TOP_PER_PAGE);This should be called "p2m_top_mfn_p" to have consistent naming, since its at the top of the hierarchy and is indexed by topidx. The "mid" in the name makes it look like it should be indexed by mididx, but then it wouldn''t make sense to have just one of these (since there needs to be a mid for each entry in top). The fact that it points to mids is irrelevant (or at least implied by the fact that its a top).> > RESERVE_BRK(p2m_mid, PAGE_SIZE * (MAX_DOMAIN_PAGES / (P2M_PER_PAGE * P2M_MID_PER_PAGE))); > RESERVE_BRK(p2m_mid_mfn, PAGE_SIZE * (MAX_DOMAIN_PAGES / (P2M_PER_PAGE * P2M_MID_PER_PAGE))); > @@ -245,6 +248,14 @@ static void p2m_top_mfn_init(unsigned long *top) > top[i] = virt_to_mfn(p2m_mid_missing_mfn); > } > > +static void p2m_mid_mfn_p_init(unsigned long **top) > +{ > + unsigned i; > + > + for (i = 0; i < P2M_TOP_PER_PAGE; i++) > + top[i] = p2m_mid_missing_mfn; > +} > + > static void p2m_mid_init(unsigned long **mid) > { > unsigned i; > @@ -301,15 +312,21 @@ EXPORT_SYMBOL(create_lookup_pte_addr); > */ > void xen_build_mfn_list_list(void) > { > - unsigned pfn; > + unsigned long pfn; > > /* Pre-initialize p2m_top_mfn to be completely missing */ > if (p2m_top_mfn == NULL) { > p2m_mid_missing_mfn = extend_brk(PAGE_SIZE, PAGE_SIZE); > p2m_mid_mfn_init(p2m_mid_missing_mfn); > > + p2m_mid_mfn_p = extend_brk(PAGE_SIZE, PAGE_SIZE); > + p2m_mid_mfn_p_init(p2m_mid_mfn_p); > + > p2m_top_mfn = extend_brk(PAGE_SIZE, PAGE_SIZE); > p2m_top_mfn_init(p2m_top_mfn); > + } else { > + /* Reinitialise, mfn''s all change after migration */ > + p2m_mid_mfn_init(p2m_mid_missing_mfn); > } > > for (pfn = 0; pfn < xen_max_p2m_pfn; pfn += P2M_PER_PAGE) { > @@ -322,14 +339,19 @@ void xen_build_mfn_list_list(void) > mid = p2m_top[topidx]; > > /* Don''t bother allocating any mfn mid levels if > - they''re just missing */ > - if (mid[mididx] == p2m_missing) > + * they''re just missing, just update the stored mfn, > + * since all could have changed over a migrate. > + */ > + if (mid == p2m_mid_missing) { > + p2m_top_mfn[topidx] = virt_to_mfn(p2m_mid_missing); > + pfn += P2M_MID_PER_PAGE - 1;Why -1? How does this interact with the "pfn += P2M_PER_PAGE" in the for loop? Should it be "(P2M_MID_PER_PAGE - 1) * P2M_PER_PAGE", with the pfn += P2M_PER_PAGE moved to the bottom of the loop rather than in the header? Is this test even necessary? Is it to save redundant re-testing of the mid level? J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2010-Oct-14 06:49 UTC
[Xen-devel] Re: [PATCH] xen: correctly rebuild mfn list list after migration.
On Thu, 2010-10-14 at 01:37 +0100, Jeremy Fitzhardinge wrote:> > > I''m getting this triggering at boot: > > PM: Adding info for No Bus:xen!gntdev > ------------[ cut here ]------------ > kernel BUG at /home/jeremy/git/linux/arch/x86/xen/mmu.c:480!Probably a consequence of the bogus attempt to skip over completely empty mid levels which you pointed out in your next mail. I guess I should test 64 bit guests and not just PAE ones and I guess pre-ballooning is likely to interact as well. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2010-Oct-14 06:55 UTC
Re: [Xen-devel] [PATCH] xen: correctly rebuild mfn list list after migration.
On Thu, 2010-10-14 at 02:43 +0100, Jeremy Fitzhardinge wrote:> On 10/12/2010 03:14 AM, Ian Campbell wrote: > > static RESERVE_BRK_ARRAY(unsigned long **, p2m_top, P2M_TOP_PER_PAGE); > > static RESERVE_BRK_ARRAY(unsigned long, p2m_top_mfn, P2M_TOP_PER_PAGE); > > +static RESERVE_BRK_ARRAY(unsigned long *, p2m_mid_mfn_p, P2M_TOP_PER_PAGE); > > This should be called "p2m_top_mfn_p" to have consistent naming, since > its at the top of the hierarchy and is indexed by topidx. The "mid" in > the name makes it look like it should be indexed by mididx, but then it > wouldn''t make sense to have just one of these (since there needs to be a > mid for each entry in top). The fact that it points to mids is > irrelevant (or at least implied by the fact that its a top).OK.> > @@ -322,14 +339,19 @@ void xen_build_mfn_list_list(void) > > mid = p2m_top[topidx]; > > > > /* Don''t bother allocating any mfn mid levels if > > - they''re just missing */ > > - if (mid[mididx] == p2m_missing) > > + * they''re just missing, just update the stored mfn, > > + * since all could have changed over a migrate. > > + */ > > + if (mid == p2m_mid_missing) { > > + p2m_top_mfn[topidx] = virt_to_mfn(p2m_mid_missing); > > + pfn += P2M_MID_PER_PAGE - 1; > > Why -1? How does this interact with the "pfn += P2M_PER_PAGE" in the > for loop? Should it be "(P2M_MID_PER_PAGE - 1) * P2M_PER_PAGE", with > the pfn += P2M_PER_PAGE moved to the bottom of the loop rather than in > the header?The -1 was supposed to offset the ++ in the for header, except as you point out its actually a +=. I''ll figure out what the correct increment should be.> Is this test even necessary? Is it to save redundant re-testing of the > mid level?It avoids descending P2M_MID_PER_PAGE times per p2m_mis_missing top level entry into leaf entries which we know are going to be p2m_missing because they are referred to by p2m_mid_missing. Perhaps its a premature optimisation, if the test and increment end up too complex once I''ve fixed them I may just drop it. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2010-Oct-14 08:51 UTC
Re: [Xen-devel] Re: [PATCH] xen: correctly rebuild mfn list list after migration.
On Thu, 2010-10-14 at 07:49 +0100, Ian Campbell wrote:> On Thu, 2010-10-14 at 01:37 +0100, Jeremy Fitzhardinge wrote: > > > > > > I''m getting this triggering at boot: > > > > PM: Adding info for No Bus:xen!gntdev > > ------------[ cut here ]------------ > > kernel BUG at /home/jeremy/git/linux/arch/x86/xen/mmu.c:480! > > Probably a consequence of the bogus attempt to skip over completely > empty mid levels which you pointed out in your next mail. > > I guess I should test 64 bit guests and not just PAE ones and I guess > pre-ballooning is likely to interact as well.I''m only able to reproduce this by booting ballooned and then ballooning up, are you doing that or were you seeing it just by booting? What are your memory and maxmem settings? I''ll obviously fix the issue I''m seeing and hope it also fixes yours but even better would be to reproduce your failure here. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2010-Oct-14 18:27 UTC
Re: [Xen-devel] Re: [PATCH] xen: correctly rebuild mfn list list after migration.
On 10/14/2010 01:51 AM, Ian Campbell wrote:> On Thu, 2010-10-14 at 07:49 +0100, Ian Campbell wrote: >> On Thu, 2010-10-14 at 01:37 +0100, Jeremy Fitzhardinge wrote: >>> >>> I''m getting this triggering at boot: >>> >>> PM: Adding info for No Bus:xen!gntdev >>> ------------[ cut here ]------------ >>> kernel BUG at /home/jeremy/git/linux/arch/x86/xen/mmu.c:480! >> Probably a consequence of the bogus attempt to skip over completely >> empty mid levels which you pointed out in your next mail. >> >> I guess I should test 64 bit guests and not just PAE ones and I guess >> pre-ballooning is likely to interact as well. > I''m only able to reproduce this by booting ballooned and then ballooning > up, are you doing that or were you seeing it just by booting? > > What are your memory and maxmem settings?memory=512, no maxmem. J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2010-Oct-21 10:10 UTC
Re: [Xen-devel] Re: [PATCH] xen: correctly rebuild mfn list list after migration.
On Thu, 2010-10-14 at 19:27 +0100, Jeremy Fitzhardinge wrote:> On 10/14/2010 01:51 AM, Ian Campbell wrote: > > On Thu, 2010-10-14 at 07:49 +0100, Ian Campbell wrote: > >> On Thu, 2010-10-14 at 01:37 +0100, Jeremy Fitzhardinge wrote: > >>> > >>> I''m getting this triggering at boot: > >>> > >>> PM: Adding info for No Bus:xen!gntdev > >>> ------------[ cut here ]------------ > >>> kernel BUG at /home/jeremy/git/linux/arch/x86/xen/mmu.c:480! > >> Probably a consequence of the bogus attempt to skip over completely > >> empty mid levels which you pointed out in your next mail. > >> > >> I guess I should test 64 bit guests and not just PAE ones and I guess > >> pre-ballooning is likely to interact as well. > > I''m only able to reproduce this by booting ballooned and then ballooning > > up, are you doing that or were you seeing it just by booting? > > > > What are your memory and maxmem settings? > > memory=512, no maxmem.The BUG_ON I added to alloc_p2m was wrong (converted an mfn to an address in the direct map and then compared it with a virtual address in kernel map, IIRC). Here''s an update patch which fixes all your previous comments, this issue and a few other bits and bobs * s/p2m_mid_mfn_p/p2m_top_mfn_p/g * Fix BUG_ON in alloc_p2m * Skip correct number of pfns when mid == p2m_mid_missing * Use correct value p2m_top_mfn[topidx] when mid == p2m_mid_missing Ian. 8<-------->From 0a771251cf7e31ff056e24feb3507b236be2a827 Mon Sep 17 00:00:00 2001From: Ian Campbell <ian.campbell@citrix.com> Date: Thu, 21 Oct 2010 11:00:46 +0100 Subject: [PATCH] xen: correctly rebuild mfn list list after migration. Otherwise the second migration attempt fails because the mfn_list_list still refers to all the old mfns. We need to update the entires in both p2m_top_mfn and the mid_mfn pages which p2m_top_mfn refers to. In order to do this we need to keep track of the virtual addresses mapping the p2m_mid_mfn pages since we cannot rely on mfn_to_virt(p2m_top_mfn[idx]) since p2m_top_mfn[idx] will still contain the old MFN after a migration, which may now belong to another domain and hence have a different mapping in the m2p. Therefore add and maintain a third top level page, p2m_top_mfn_p[], which tracks the virtual addresses of the mfns contained in p2m_top_mfn[]. We also need to update the content of the p2m_mid_missing_mfn page on resume to refer to the page''s new mfn. p2m_missing does not need updating since the migration process takes care of the leaf p2m pages for us. --- arch/x86/xen/mmu.c | 50 +++++++++++++++++++++++++++++++++++++------------- 1 files changed, 37 insertions(+), 13 deletions(-) diff --git a/arch/x86/xen/mmu.c b/arch/x86/xen/mmu.c index 15bbccd..ebb74ec 100644 --- a/arch/x86/xen/mmu.c +++ b/arch/x86/xen/mmu.c @@ -186,6 +186,8 @@ DEFINE_PER_CPU(unsigned long, xen_current_cr3); /* actual vcpu cr3 */ * / \ / \ / / * p2m p2m p2m p2m p2m p2m p2m ... * + * The p2m_mid_mfn pages are mapped by p2m_top_mfn_p. + * * The p2m_top and p2m_top_mfn levels are limited to 1 page, so the * maximum representable pseudo-physical address space is: * P2M_TOP_PER_PAGE * P2M_MID_PER_PAGE * P2M_PER_PAGE pages @@ -210,6 +212,7 @@ static RESERVE_BRK_ARRAY(unsigned long, p2m_mid_missing_mfn, P2M_MID_PER_PAGE); static RESERVE_BRK_ARRAY(unsigned long **, p2m_top, P2M_TOP_PER_PAGE); static RESERVE_BRK_ARRAY(unsigned long, p2m_top_mfn, P2M_TOP_PER_PAGE); +static RESERVE_BRK_ARRAY(unsigned long *, p2m_top_mfn_p, P2M_TOP_PER_PAGE); RESERVE_BRK(p2m_mid, PAGE_SIZE * (MAX_DOMAIN_PAGES / (P2M_PER_PAGE * P2M_MID_PER_PAGE))); RESERVE_BRK(p2m_mid_mfn, PAGE_SIZE * (MAX_DOMAIN_PAGES / (P2M_PER_PAGE * P2M_MID_PER_PAGE))); @@ -246,6 +249,14 @@ static void p2m_top_mfn_init(unsigned long *top) top[i] = virt_to_mfn(p2m_mid_missing_mfn); } +static void p2m_top_mfn_p_init(unsigned long **top) +{ + unsigned i; + + for (i = 0; i < P2M_TOP_PER_PAGE; i++) + top[i] = p2m_mid_missing_mfn; +} + static void p2m_mid_init(unsigned long **mid) { unsigned i; @@ -302,33 +313,43 @@ EXPORT_SYMBOL(create_lookup_pte_addr); */ void xen_build_mfn_list_list(void) { - unsigned pfn; + unsigned long pfn; /* Pre-initialize p2m_top_mfn to be completely missing */ if (p2m_top_mfn == NULL) { p2m_mid_missing_mfn = extend_brk(PAGE_SIZE, PAGE_SIZE); p2m_mid_mfn_init(p2m_mid_missing_mfn); + p2m_top_mfn_p = extend_brk(PAGE_SIZE, PAGE_SIZE); + p2m_top_mfn_p_init(p2m_top_mfn_p); + p2m_top_mfn = extend_brk(PAGE_SIZE, PAGE_SIZE); p2m_top_mfn_init(p2m_top_mfn); + } else { + /* Reinitialise, mfn''s all change after migration */ + p2m_mid_mfn_init(p2m_mid_missing_mfn); } for (pfn = 0; pfn < xen_max_p2m_pfn; pfn += P2M_PER_PAGE) { unsigned topidx = p2m_top_index(pfn); unsigned mididx = p2m_mid_index(pfn); unsigned long **mid; - unsigned long mid_mfn; unsigned long *mid_mfn_p; mid = p2m_top[topidx]; + mid_mfn_p = p2m_top_mfn_p[topidx]; /* Don''t bother allocating any mfn mid levels if - they''re just missing */ - if (mid[mididx] == p2m_missing) + * they''re just missing, just update the stored mfn, + * since all could have changed over a migrate. + */ + if (mid == p2m_mid_missing) { + BUG_ON(mididx); + BUG_ON(mid_mfn_p != p2m_mid_missing_mfn); + p2m_top_mfn[topidx] = virt_to_mfn(p2m_mid_missing_mfn); + pfn += (P2M_MID_PER_PAGE - 1) * P2M_PER_PAGE; continue; - - mid_mfn = p2m_top_mfn[topidx]; - mid_mfn_p = mfn_to_virt(mid_mfn); + } if (mid_mfn_p == p2m_mid_missing_mfn) { /* @@ -340,11 +361,10 @@ void xen_build_mfn_list_list(void) mid_mfn_p = extend_brk(PAGE_SIZE, PAGE_SIZE); p2m_mid_mfn_init(mid_mfn_p); - mid_mfn = virt_to_mfn(mid_mfn_p); - - p2m_top_mfn[topidx] = mid_mfn; + p2m_top_mfn_p[topidx] = mid_mfn_p; } + p2m_top_mfn[topidx] = virt_to_mfn(mid_mfn_p); mid_mfn_p[mididx] = virt_to_mfn(mid[mididx]); } } @@ -363,7 +383,7 @@ void __init xen_build_dynamic_phys_to_machine(void) { unsigned long *mfn_list = (unsigned long *)xen_start_info->mfn_list; unsigned long max_pfn = min(MAX_DOMAIN_PAGES, xen_start_info->nr_pages); - unsigned pfn; + unsigned long pfn; xen_max_p2m_pfn = max_pfn; @@ -453,7 +473,9 @@ static bool alloc_p2m(unsigned long pfn) } top_mfn_p = &p2m_top_mfn[topidx]; - mid_mfn = mfn_to_virt(*top_mfn_p); + mid_mfn = p2m_top_mfn_p[topidx]; + + BUG_ON(virt_to_mfn(mid_mfn) != *top_mfn_p); if (mid_mfn == p2m_mid_missing_mfn) { /* Separately check the mid mfn level */ @@ -465,11 +487,13 @@ static bool alloc_p2m(unsigned long pfn) return false; p2m_mid_mfn_init(mid_mfn); - + missing_mfn = virt_to_mfn(p2m_mid_missing_mfn); mid_mfn_mfn = virt_to_mfn(mid_mfn); if (cmpxchg(top_mfn_p, missing_mfn, mid_mfn_mfn) != missing_mfn) free_p2m_page(mid_mfn); + else + p2m_top_mfn_p[topidx] = mid_mfn; } if (p2m_top[topidx][mididx] == p2m_missing) { -- 1.5.6.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jeremy Fitzhardinge
2010-Oct-21 17:10 UTC
Re: [Xen-devel] Re: [PATCH] xen: correctly rebuild mfn list list after migration.
On 10/21/2010 03:10 AM, Ian Campbell wrote:> On Thu, 2010-10-14 at 19:27 +0100, Jeremy Fitzhardinge wrote: >> On 10/14/2010 01:51 AM, Ian Campbell wrote: >>> On Thu, 2010-10-14 at 07:49 +0100, Ian Campbell wrote: >>>> On Thu, 2010-10-14 at 01:37 +0100, Jeremy Fitzhardinge wrote: >>>>> I''m getting this triggering at boot: >>>>> >>>>> PM: Adding info for No Bus:xen!gntdev >>>>> ------------[ cut here ]------------ >>>>> kernel BUG at /home/jeremy/git/linux/arch/x86/xen/mmu.c:480! >>>> Probably a consequence of the bogus attempt to skip over completely >>>> empty mid levels which you pointed out in your next mail. >>>> >>>> I guess I should test 64 bit guests and not just PAE ones and I guess >>>> pre-ballooning is likely to interact as well. >>> I''m only able to reproduce this by booting ballooned and then ballooning >>> up, are you doing that or were you seeing it just by booting? >>> >>> What are your memory and maxmem settings? >> memory=512, no maxmem. > The BUG_ON I added to alloc_p2m was wrong (converted an mfn to an > address in the direct map and then compared it with a virtual address in > kernel map, IIRC). Here''s an update patch which fixes all your previous > comments, this issue and a few other bits and bobs > > * s/p2m_mid_mfn_p/p2m_top_mfn_p/g > * Fix BUG_ON in alloc_p2m > * Skip correct number of pfns when mid == p2m_mid_missing > * Use correct value p2m_top_mfn[topidx] when mid == p2m_mid_missingOK, it passed my sniff test - it boots and save/restored a few times. (I added a S-O-B for you.) Thanks, J _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2010-Oct-21 17:28 UTC
Re: [Xen-devel] Re: [PATCH] xen: correctly rebuild mfn list list after migration.
On Thu, 2010-10-21 at 18:10 +0100, Jeremy Fitzhardinge wrote:> On 10/21/2010 03:10 AM, Ian Campbell wrote: > > On Thu, 2010-10-14 at 19:27 +0100, Jeremy Fitzhardinge wrote: > >> On 10/14/2010 01:51 AM, Ian Campbell wrote: > >>> On Thu, 2010-10-14 at 07:49 +0100, Ian Campbell wrote: > >>>> On Thu, 2010-10-14 at 01:37 +0100, Jeremy Fitzhardinge wrote: > >>>>> I''m getting this triggering at boot: > >>>>> > >>>>> PM: Adding info for No Bus:xen!gntdev > >>>>> ------------[ cut here ]------------ > >>>>> kernel BUG at /home/jeremy/git/linux/arch/x86/xen/mmu.c:480! > >>>> Probably a consequence of the bogus attempt to skip over completely > >>>> empty mid levels which you pointed out in your next mail. > >>>> > >>>> I guess I should test 64 bit guests and not just PAE ones and I guess > >>>> pre-ballooning is likely to interact as well. > >>> I''m only able to reproduce this by booting ballooned and then ballooning > >>> up, are you doing that or were you seeing it just by booting? > >>> > >>> What are your memory and maxmem settings? > >> memory=512, no maxmem. > > The BUG_ON I added to alloc_p2m was wrong (converted an mfn to an > > address in the direct map and then compared it with a virtual address in > > kernel map, IIRC). Here''s an update patch which fixes all your previous > > comments, this issue and a few other bits and bobs > > > > * s/p2m_mid_mfn_p/p2m_top_mfn_p/g > > * Fix BUG_ON in alloc_p2m > > * Skip correct number of pfns when mid == p2m_mid_missing > > * Use correct value p2m_top_mfn[topidx] when mid == p2m_mid_missing > > OK, it passed my sniff test - it boots and save/restored a few times.Great.> (I added a S-O-B for you.)Oops, thanks. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel