Olaf Hering
2010-Nov-23 21:01 UTC
[Xen-devel] bogus gfn - mfn - gfn - mfn checks in guest_physmap_add_entry
Hello, what is the purpose of the mfn_to_gfn() check in guest_physmap_add_entry()? This function gets a fresh mfn and its gfn passed to update the global p2m state. But before doing that, it checks wether that fresh mfn maps still to some gfn. If it does, it checks if that gfn maps to any mfn. If it doesnt, Xen crashes with an assert. Now, if that mfn was part of an old guest, should that old guest cleanup all of its mfns and update the machine_to_phys_mapping[]? Appearently that rarely happens. And if there is still some random gfn number in that array, the function tries to see what happens with this number in the current guests context. IF that number happens to be a gfn in paged-out state, there will be no mfn. So the ASSERT triggers. I would guess that if guest_physmap_add_entry() gets a page with type p2m_ram_rw, nothing else can own that page. Is that right? If so, this ASSERT or most of the loop can be removed. Olaf _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Tim Deegan
2010-Nov-24 10:22 UTC
Re: [Xen-devel] bogus gfn - mfn - gfn - mfn checks in guest_physmap_add_entry
Hi, At 21:01 +0000 on 23 Nov (1290546118), Olaf Hering wrote:> what is the purpose of the mfn_to_gfn() check in > guest_physmap_add_entry()?It''s intended to stop a VM aliasing two PFNs to the same MFN.> This function gets a fresh mfn and its gfn passed to update the global > p2m state. But before doing that, it checks wether that fresh mfn maps > still to some gfn. If it does, it checks if that gfn maps to any mfn. If > it doesnt, Xen crashes with an assert. > > Now, if that mfn was part of an old guest, should that old guest cleanup > all of its mfns and update the machine_to_phys_mapping[]? Appearently > that rarely happens. > And if there is still some random gfn number in that array, the function > tries to see what happens with this number in the current guests > context. IF that number happens to be a gfn in paged-out state, there > will be no mfn. So the ASSERT triggers.The assert is guarded by p2m_is_ram(), which ought to imply that there was actual RAM behind it. There are other places in the code where p2m_is_ram() is used to check that the associated mfn is valid (e.g. when loading CR3). I think that adding the paging types (in particular p2m_ram_paged) to P2M_RAM_TYPES is a mistake, unless gfn_to_mfn() guarantees to get the pfn into a state where it''s backed by an MFN before it returns (which it probably can''t). Another option would be to audit all callers of p2m_is_ram() and check whether they can handle paged-out PFNs (which I though had already been done but seems not to be). Keir''s VCPU yield patch might be useful in some of those places too. Cc''ing Patrick for comments.> I would guess that if guest_physmap_add_entry() gets a page with type > p2m_ram_rw, nothing else can own that page. Is that right? > If so, this ASSERT or most of the loop can be removed.The loop shouldn''t be removed without some more thought about aliasing PFNs, and I think that removing the ASSERT plasters over a deeper problem. The BUG_ON() just above it, on the other hand, is not correct, and I''ll remove it. Cheers, Tim. -- Tim Deegan <Tim.Deegan@citrix.com> Principal Software Engineer, Xen Platform Team Citrix Systems UK Ltd. (Company #02937203, SL9 0BG) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Tim Deegan
2010-Nov-24 10:26 UTC
Re: [Xen-devel] bogus gfn - mfn - gfn - mfn checks in guest_physmap_add_entry
Actually CCing Patrick this time... Tim. At 10:22 +0000 on 24 Nov (1290594122), Tim Deegan wrote:> Hi, > > At 21:01 +0000 on 23 Nov (1290546118), Olaf Hering wrote: > > what is the purpose of the mfn_to_gfn() check in > > guest_physmap_add_entry()? > > It''s intended to stop a VM aliasing two PFNs to the same MFN. > > > This function gets a fresh mfn and its gfn passed to update the global > > p2m state. But before doing that, it checks wether that fresh mfn maps > > still to some gfn. If it does, it checks if that gfn maps to any mfn. If > > it doesnt, Xen crashes with an assert. > > > > Now, if that mfn was part of an old guest, should that old guest cleanup > > all of its mfns and update the machine_to_phys_mapping[]? Appearently > > that rarely happens. > > And if there is still some random gfn number in that array, the function > > tries to see what happens with this number in the current guests > > context. IF that number happens to be a gfn in paged-out state, there > > will be no mfn. So the ASSERT triggers. > > The assert is guarded by p2m_is_ram(), which ought to imply that there > was actual RAM behind it. There are other places in the code where > p2m_is_ram() is used to check that the associated mfn is valid > (e.g. when loading CR3). > > I think that adding the paging types (in particular p2m_ram_paged) to > P2M_RAM_TYPES is a mistake, unless gfn_to_mfn() guarantees to get the > pfn into a state where it''s backed by an MFN before it returns (which it > probably can''t). > > Another option would be to audit all callers of p2m_is_ram() and check > whether they can handle paged-out PFNs (which I though had already been > done but seems not to be). Keir''s VCPU yield patch might be useful in > some of those places too. > > Cc''ing Patrick for comments. > > > I would guess that if guest_physmap_add_entry() gets a page with type > > p2m_ram_rw, nothing else can own that page. Is that right? > > If so, this ASSERT or most of the loop can be removed. > > The loop shouldn''t be removed without some more thought about aliasing > PFNs, and I think that removing the ASSERT plasters over a deeper > problem. > > The BUG_ON() just above it, on the other hand, is not correct, and I''ll > remove it. > > Cheers, > > Tim. > > -- > Tim Deegan <Tim.Deegan@citrix.com> > Principal Software Engineer, Xen Platform Team > Citrix Systems UK Ltd. (Company #02937203, SL9 0BG)> # HG changeset patch > # User Tim Deegan <Tim.Deegan@citrix.com> > # Date 1290594003 0 > # Node ID 79b71c77907b80772ee8cba0c5bbf8e444e61226 > # Parent e5c4e925e1bd15baeadc0817dcceb5fff54b8a74 > x86/mm: remove incorrect BUG_ON. > This BUG_ON tests a property of an effectively random PFN in the guest, > and is explicitly _not_ seeing the MFN that''s known to be owned. > > Signed-off-by: Tim Deegan <Tim.Deegan@citrix.com> > > diff -r e5c4e925e1bd -r 79b71c77907b xen/arch/x86/mm/p2m.c > --- a/xen/arch/x86/mm/p2m.c Tue Nov 23 19:36:14 2010 +0000 > +++ b/xen/arch/x86/mm/p2m.c Wed Nov 24 10:20:03 2010 +0000 > @@ -2380,9 +2380,6 @@ guest_physmap_add_entry(struct p2m_domai > P2M_DEBUG("aliased! mfn=%#lx, old gfn=%#lx, new gfn=%#lx\n", > mfn + i, ogfn, gfn + i); > omfn = gfn_to_mfn_query(p2m, ogfn, &ot); > - /* If we get here, we know the local domain owns the page, > - so it can''t have been grant mapped in. */ > - BUG_ON( p2m_is_grant(ot) ); > if ( p2m_is_ram(ot) ) > { > ASSERT(mfn_valid(omfn));-- Tim Deegan <Tim.Deegan@citrix.com> Principal Software Engineer, Xen Platform Team Citrix Systems UK Ltd. (Company #02937203, SL9 0BG) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Olaf Hering
2010-Nov-24 14:41 UTC
Re: [Xen-devel] bogus gfn - mfn - gfn - mfn checks in guest_physmap_add_entry
On Wed, Nov 24, Tim Deegan wrote:> I think that adding the paging types (in particular p2m_ram_paged) to > P2M_RAM_TYPES is a mistake, unless gfn_to_mfn() guarantees to get the > pfn into a state where it''s backed by an MFN before it returns (which it > probably can''t).Do you mean p2m_mem_paging_evict() should invalidate the mfn, and p2m_mem_paging_resume() should update the mfn to the current gfn? My patches do that.> Another option would be to audit all callers of p2m_is_ram() and check > whether they can handle paged-out PFNs (which I though had already been > done but seems not to be). Keir''s VCPU yield patch might be useful in > some of those places too.I think most if not all is caught by my changes already.> > I would guess that if guest_physmap_add_entry() gets a page with type > > p2m_ram_rw, nothing else can own that page. Is that right? > > If so, this ASSERT or most of the loop can be removed. > > The loop shouldn''t be removed without some more thought about aliasing > PFNs, and I think that removing the ASSERT plasters over a deeper > problem.What is supposed to happen when building a guest? I think at some point all (or most) mfns are passed to dom0 and the machine_to_phys_mapping array is in a state to reflect that. Then the first guest is created. How is memory for this guest freed from dom0 and assigned to the guest? Why do these mfns are not invalidated in machine_to_phys_mapping[]? For a guest shutdown p2m_teardown() seems to be the place to do invalidate the mfns in machine_to_phys_mapping[]. Olaf _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Tim Deegan
2010-Nov-24 14:53 UTC
Re: [Xen-devel] bogus gfn - mfn - gfn - mfn checks in guest_physmap_add_entry
At 14:41 +0000 on 24 Nov (1290609698), Olaf Hering wrote:> > Another option would be to audit all callers of p2m_is_ram() and check > > whether they can handle paged-out PFNs (which I though had already been > > done but seems not to be). Keir''s VCPU yield patch might be useful in > > some of those places too. > > I think most if not all is caught by my changes already.In that case, maybe removing the p2m_paging types (at least those where using the mfn immediately isn''t sensible) from p2m_is_ram() and chasing the last few users would be the right thing to do.> What is supposed to happen when building a guest? > > I think at some point all (or most) mfns are passed to dom0 and the > machine_to_phys_mapping array is in a state to reflect that.That''s not necessarily the case - XenServer has a fixed-size dom0 and leaves all other RAM in the free pools.> Then the first guest is created. > How is memory for this guest freed from dom0 and assigned to the guest? > Why do these mfns are not invalidated in machine_to_phys_mapping[]? > > For a guest shutdown p2m_teardown() seems to be the place to do > invalidate the mfns in machine_to_phys_mapping[].The problem is that PV guests set their own m2p entries and can''t be relied on to tear them down. The guest_physmap_add_entry code, and the p2m audit code, would be made more reliable if, say, alloc_domheap_pages and/or free_domheap_pages zapped the m2p entries for MFNs they touched. I think originally that wasn''t done because the alloc is quickly followed by another write of the m2p but that''s probably over-keen optimization. Tim. -- Tim Deegan <Tim.Deegan@citrix.com> Principal Software Engineer, Xen Platform Team Citrix Systems UK Ltd. (Company #02937203, SL9 0BG) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Olaf Hering
2010-Nov-24 15:00 UTC
Re: [Xen-devel] bogus gfn - mfn - gfn - mfn checks in guest_physmap_add_entry
On Wed, Nov 24, Tim Deegan wrote:> The guest_physmap_add_entry code, and the p2m audit code, would be made > more reliable if, say, alloc_domheap_pages and/or free_domheap_pages > zapped the m2p entries for MFNs they touched.Ok, if thats the ultimate place where mfns come and go, I will make my changes there. Olaf _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Olaf Hering
2010-Nov-24 19:58 UTC
Re: [Xen-devel] bogus gfn - mfn - gfn - mfn checks in guest_physmap_add_entry
On Wed, Nov 24, Olaf Hering wrote:> On Wed, Nov 24, Tim Deegan wrote:> > Another option would be to audit all callers of p2m_is_ram() and check > > whether they can handle paged-out PFNs (which I though had already been > > done but seems not to be). Keir''s VCPU yield patch might be useful in > > some of those places too. > > I think most if not all is caught by my changes already.A quick grep shows there are a few places that need an audit wether or not the paging types can be removed from p2m_is_ram(). Olaf _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Patrick Colp
2010-Nov-24 20:25 UTC
Re: [Xen-devel] bogus gfn - mfn - gfn - mfn checks in guest_physmap_add_entry
On 24 November 2010 11:58, Olaf Hering <olaf@aepfle.de> wrote:> On Wed, Nov 24, Olaf Hering wrote: > >> On Wed, Nov 24, Tim Deegan wrote: > >> > Another option would be to audit all callers of p2m_is_ram() and check >> > whether they can handle paged-out PFNs (which I though had already been >> > done but seems not to be). Keir''s VCPU yield patch might be useful in >> > some of those places too. >> >> I think most if not all is caught by my changes already. > > A quick grep shows there are a few places that need an audit wether or > not the paging types can be removed from p2m_is_ram().If I recall correctly, I made the paging types part of p2m_is_ram() because the idea was to have paged out pages appear to be normal pages. But I think this was done without deep analysis of other bits of the Xen code that might be confused (and as to whether it would make Xen explode if they weren''t part of p2m_is_ram()). But there could be places that do a check against p2m_is_ram() where a paged out page should be returning true. I''m sure those calls can be tracked down and an extra check added for paged out pages too. Doing a quick scan, it seems like there are some places that checks against p2m_is_ram() will have to be update, some where having paged types might be an issue, and others where above it is a check to p2m_is_paging(), so it should be fine if paging types aren''t ram types. It doesn''t seem like there''s too many places to check, though, so hopefully it''s not too horrible to fix this up. Patrick _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Olaf Hering
2010-Nov-25 15:03 UTC
Re: [Xen-devel] bogus gfn - mfn - gfn - mfn checks in guest_physmap_add_entry
On Wed, Nov 24, Tim Deegan wrote:> The problem is that PV guests set their own m2p entries and can''t be > relied on to tear them down.What needs to happen for PV guests? Dont they use the machine_to_phys_mapping[] array like HVM guests?> The guest_physmap_add_entry code, and the p2m audit code, would be made > more reliable if, say, alloc_domheap_pages and/or free_domheap_pages > zapped the m2p entries for MFNs they touched. > > I think originally that wasn''t done because the alloc is quickly > followed by another write of the m2p but that''s probably over-keen > optimization.Could it be done like that? (not yet compile-tested) The mfn is probably always valid. I see memory_exchange uses assign_pages() to move mfns from one domain to another (havent studied the whole function yet). I think thats another place that needs an audit wether the machine_to_phys_mapping[] array is maintained properly. --- xen-4.0.1-testing.orig/xen/common/page_alloc.c +++ xen-4.0.1-testing/xen/common/page_alloc.c @@ -1146,6 +1146,8 @@ struct page_info *alloc_domheap_pages( struct page_info *pg = NULL; unsigned int bits = memflags >> _MEMF_bits, zone_hi = NR_ZONES - 1; unsigned int node = (uint8_t)((memflags >> _MEMF_node) - 1), dma_zone; + int i; + mfn_t mfn; ASSERT(!in_irq()); @@ -1170,6 +1172,13 @@ struct page_info *alloc_domheap_pages( free_heap_pages(pg, order); return NULL; } + /* this page is not yet a gfn */ + mfn = page_to_mfn(pg); + if (mfn_valid(mfn)) + { + for ( i = 0; i < (1 << order); i++ ) + set_gpfn_from_mfn(mfn_x(mfn) + j, INVALID_M2P_ENTRY); + } return pg; } @@ -1178,9 +1187,18 @@ void free_domheap_pages(struct page_info { int i, drop_dom_ref; struct domain *d = page_get_owner(pg); + mfn_t mfn; ASSERT(!in_irq()); + /* this page is not a gfn anymore */ + mfn = page_to_mfn(pg); + if (mfn_valid(mfn)) + { + for ( i = 0; i < (1 << order); i++ ) + set_gpfn_from_mfn(mfn_x(mfn) + j, INVALID_M2P_ENTRY); + } + if ( unlikely(is_xen_heap_page(pg)) ) { /* NB. May recursively lock from relinquish_memory(). */ _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Tim Deegan
2010-Nov-25 15:32 UTC
Re: [Xen-devel] bogus gfn - mfn - gfn - mfn checks in guest_physmap_add_entry
At 15:03 +0000 on 25 Nov (1290697390), Olaf Hering wrote:> On Wed, Nov 24, Tim Deegan wrote: > > > The problem is that PV guests set their own m2p entries and can''t be > > relied on to tear them down. > > What needs to happen for PV guests? > Dont they use the machine_to_phys_mapping[] array like HVM guests?They do, but they can put whatever they like in it, and don''t have to clean up afterwards.> I see memory_exchange uses assign_pages() to move mfns from one domain > to another (havent studied the whole function yet). I think thats > another place that needs an audit wether the machine_to_phys_mapping[] > array is maintained properly.Good catch. It might be better to hook page_set_owner(), which ought to imply that the old m2p info is stale. I had a quick look at all its callers and I _think_ it would be OK, but I haven''t tested that at all. :) Tim. -- Tim Deegan <Tim.Deegan@citrix.com> Principal Software Engineer, Xen Platform Team Citrix Systems UK Ltd. (Company #02937203, SL9 0BG) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2010-Nov-25 17:16 UTC
Re: [Xen-devel] bogus gfn - mfn - gfn - mfn checks in guest_physmap_add_entry
On 25/11/2010 15:03, "Olaf Hering" <olaf@aepfle.de> wrote:>> The guest_physmap_add_entry code, and the p2m audit code, would be made >> more reliable if, say, alloc_domheap_pages and/or free_domheap_pages >> zapped the m2p entries for MFNs they touched. >> >> I think originally that wasn''t done because the alloc is quickly >> followed by another write of the m2p but that''s probably over-keen >> optimization. > > Could it be done like that? (not yet compile-tested) > The mfn is probably always valid.If you xap m2p in free_domheap_pages(), you shouldn''t need to do it again in alloc_domheap_pages(). -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Olaf Hering
2010-Nov-25 20:53 UTC
Re: [Xen-devel] bogus gfn - mfn - gfn - mfn checks in guest_physmap_add_entry
On Thu, Nov 25, Keir Fraser wrote:> On 25/11/2010 15:03, "Olaf Hering" <olaf@aepfle.de> wrote: > > >> The guest_physmap_add_entry code, and the p2m audit code, would be made > >> more reliable if, say, alloc_domheap_pages and/or free_domheap_pages > >> zapped the m2p entries for MFNs they touched. > >> > >> I think originally that wasn''t done because the alloc is quickly > >> followed by another write of the m2p but that''s probably over-keen > >> optimization. > > > > Could it be done like that? (not yet compile-tested) > > The mfn is probably always valid. > > If you xap m2p in free_domheap_pages(), you shouldn''t need to do it again in > alloc_domheap_pages().What about the initial allocation? I have to double check, but I think the array does already contain gfn numbers. ... checking ... It seems the change below is indeed enough to invalidate the entries. Olaf #Subject: xenpaging: update machine_to_phys_mapping during page-in and page deallocation The machine_to_phys_mapping array needs updating during page-in, and during page deallocation. If a page is gone, a call to get_gpfn_from_mfn will still return the old gfn for an already paged-out page. This happens when the entire guest ram is paged-out before xen_vga_populate_vram() runs. Then XENMEM_populate_physmap is called with gfn 0xff000. A new page is allocated with alloc_domheap_pages. This new page does not have a gfn yet. However, in guest_physmap_add_entry() the passed mfn maps still to an old gfn (perhaps from another old guest). This old gfn is in paged-out state in this guests context and has no mfn anymore. As a result, the ASSERT() triggers because p2m_is_ram() is true for p2m_ram_paging* types. If the machine_to_phys_mapping array is updated properly, both loops in guest_physmap_add_entry() turn into no-ops for the new page and the mfn/gfn mapping will be done at the end of the function. The same thing needs to happen dring a page-in, the current gfn must be written for the page. If XENMEM_add_to_physmap is used with XENMAPSPACE_gmfn, get_gpfn_from_mfn() will return an appearently valid gfn. As a result, guest_physmap_remove_page() is called. The ASSERT in p2m_remove_page triggers because the passed mfn does not match the old mfn for the passed gfn. Signed-off-by: Olaf Hering <olaf@aepfle.de> --- v3: invalidate machine_to_phys_mapping[] during page deallocation v2: call set_gpfn_from_mfn only if mfn is valid xen/arch/x86/mm/p2m.c | 13 ++++++++++--- xen/common/page_alloc.c | 9 +++++++++ 2 files changed, 19 insertions(+), 3 deletions(-) --- xen-4.0.1-testing.orig/xen/arch/x86/mm/p2m.c +++ xen-4.0.1-testing/xen/arch/x86/mm/p2m.c @@ -2598,9 +2598,16 @@ void p2m_mem_paging_resume(struct domain /* Fix p2m entry */ mfn = gfn_to_mfn(d, rsp.gfn, &p2mt); - p2m_lock(d->arch.p2m); - set_p2m_entry(d, rsp.gfn, mfn, 0, p2m_ram_rw); - p2m_unlock(d->arch.p2m); + if (mfn_valid(mfn)) + { + p2m_lock(d->arch.p2m); + set_p2m_entry(d, rsp.gfn, mfn, 0, p2m_ram_rw); + set_gpfn_from_mfn(mfn_x(mfn), rsp.gfn); + p2m_unlock(d->arch.p2m); + } else { + gdprintk(XENLOG_ERR, "invalid mfn %lx for gfn %lx p2mt %x flags %lx\n", + mfn_x(mfn), rsp.gfn, p2mt, (unsigned long)rsp.flags); + } /* Unpause domain */ if ( rsp.flags & MEM_EVENT_FLAG_VCPU_PAUSED ) --- xen-4.0.1-testing.orig/xen/common/page_alloc.c +++ xen-4.0.1-testing/xen/common/page_alloc.c @@ -1178,9 +1178,18 @@ void free_domheap_pages(struct page_info { int i, drop_dom_ref; struct domain *d = page_get_owner(pg); + unsigned long mfn; ASSERT(!in_irq()); + /* this page is not a gfn anymore */ + mfn = page_to_mfn(pg); + if (mfn_valid(mfn)) + { + for ( i = 0; i < (1 << order); i++ ) + set_gpfn_from_mfn(mfn + i, INVALID_M2P_ENTRY); + } + if ( unlikely(is_xen_heap_page(pg)) ) { /* NB. May recursively lock from relinquish_memory(). */ _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Olaf Hering
2010-Nov-25 20:56 UTC
Re: [Xen-devel] bogus gfn - mfn - gfn - mfn checks in guest_physmap_add_entry
On Thu, Nov 25, Tim Deegan wrote:> Good catch. It might be better to hook page_set_owner(), which ought to > imply that the old m2p info is stale. I had a quick look at all its > callers and I _think_ it would be OK, but I haven''t tested that at > all. :)I tried page_set_owner(). Blindly adding set_gpfn_from_mfn there leads to compile errors, thanks to all the redefinitions of mfn_t related macros. Also, the page_set_owner() in alloc_heap_pages would cause Xen to crash early. It appears that maintaing machine_to_phys_mapping[] in free_domheap_pages() is enough to avoid the ASSERTS. Olaf _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2010-Nov-25 22:30 UTC
Re: [Xen-devel] bogus gfn - mfn - gfn - mfn checks in guest_physmap_add_entry
On 25/11/2010 20:53, "Olaf Hering" <olaf@aepfle.de> wrote:> On Thu, Nov 25, Keir Fraser wrote: > >> On 25/11/2010 15:03, "Olaf Hering" <olaf@aepfle.de> wrote: >> >>>> The guest_physmap_add_entry code, and the p2m audit code, would be made >>>> more reliable if, say, alloc_domheap_pages and/or free_domheap_pages >>>> zapped the m2p entries for MFNs they touched. >>>> >>>> I think originally that wasn''t done because the alloc is quickly >>>> followed by another write of the m2p but that''s probably over-keen >>>> optimization. >>> >>> Could it be done like that? (not yet compile-tested) >>> The mfn is probably always valid. >> >> If you xap m2p in free_domheap_pages(), you shouldn''t need to do it again in >> alloc_domheap_pages(). > > What about the initial allocation? I have to double check, but I think > the array does already contain gfn numbers. > > ... checking ... > > It seems the change below is indeed enough to invalidate the entries.This patch is still RFC I assume? Quite apart from anything else, the patch will need to be made against xen-unstable. At this point, given the need for the waitqueue infrastructure to get xenpaging working reliably, I don''t think fixing xenpaging for stable 4.0.x upstream is going to fly. It''s going to be tough enough getting it stable for 4.1. -- Keir> Olaf > > #Subject: xenpaging: update machine_to_phys_mapping during page-in and page > deallocation > > The machine_to_phys_mapping array needs updating during page-in, and > during page deallocation. If a page is gone, a call to > get_gpfn_from_mfn will still return the old gfn for an already paged-out > page. This happens when the entire guest ram is paged-out before > xen_vga_populate_vram() runs. Then XENMEM_populate_physmap is called > with gfn 0xff000. A new page is allocated with alloc_domheap_pages. > This new page does not have a gfn yet. However, in > guest_physmap_add_entry() the passed mfn maps still to an old gfn > (perhaps from another old guest). This old gfn is in paged-out state in > this guests context and has no mfn anymore. As a result, the ASSERT() > triggers because p2m_is_ram() is true for p2m_ram_paging* types. > > If the machine_to_phys_mapping array is updated properly, both loops in > guest_physmap_add_entry() turn into no-ops for the new page and the > mfn/gfn mapping will be done at the end of the function. > > The same thing needs to happen dring a page-in, the current gfn must be > written for the page. > > If XENMEM_add_to_physmap is used with XENMAPSPACE_gmfn, > get_gpfn_from_mfn() will return an appearently valid gfn. As a result, > guest_physmap_remove_page() is called. The ASSERT in p2m_remove_page > triggers because the passed mfn does not match the old mfn for the > passed gfn. > > Signed-off-by: Olaf Hering <olaf@aepfle.de> > > --- > v3: > invalidate machine_to_phys_mapping[] during page deallocation > v2: > call set_gpfn_from_mfn only if mfn is valid > > xen/arch/x86/mm/p2m.c | 13 ++++++++++--- > xen/common/page_alloc.c | 9 +++++++++ > 2 files changed, 19 insertions(+), 3 deletions(-) > > --- xen-4.0.1-testing.orig/xen/arch/x86/mm/p2m.c > +++ xen-4.0.1-testing/xen/arch/x86/mm/p2m.c > @@ -2598,9 +2598,16 @@ void p2m_mem_paging_resume(struct domain > > /* Fix p2m entry */ > mfn = gfn_to_mfn(d, rsp.gfn, &p2mt); > - p2m_lock(d->arch.p2m); > - set_p2m_entry(d, rsp.gfn, mfn, 0, p2m_ram_rw); > - p2m_unlock(d->arch.p2m); > + if (mfn_valid(mfn)) > + { > + p2m_lock(d->arch.p2m); > + set_p2m_entry(d, rsp.gfn, mfn, 0, p2m_ram_rw); > + set_gpfn_from_mfn(mfn_x(mfn), rsp.gfn); > + p2m_unlock(d->arch.p2m); > + } else { > + gdprintk(XENLOG_ERR, "invalid mfn %lx for gfn %lx p2mt %x flags > %lx\n", > + mfn_x(mfn), rsp.gfn, p2mt, (unsigned long)rsp.flags); > + } > > /* Unpause domain */ > if ( rsp.flags & MEM_EVENT_FLAG_VCPU_PAUSED ) > --- xen-4.0.1-testing.orig/xen/common/page_alloc.c > +++ xen-4.0.1-testing/xen/common/page_alloc.c > @@ -1178,9 +1178,18 @@ void free_domheap_pages(struct page_info > { > int i, drop_dom_ref; > struct domain *d = page_get_owner(pg); > + unsigned long mfn; > > ASSERT(!in_irq()); > > + /* this page is not a gfn anymore */ > + mfn = page_to_mfn(pg); > + if (mfn_valid(mfn)) > + { > + for ( i = 0; i < (1 << order); i++ ) > + set_gpfn_from_mfn(mfn + i, INVALID_M2P_ENTRY); > + } > + > if ( unlikely(is_xen_heap_page(pg)) ) > { > /* NB. May recursively lock from relinquish_memory(). */_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Olaf Hering
2010-Nov-26 07:27 UTC
Re: [Xen-devel] bogus gfn - mfn - gfn - mfn checks in guest_physmap_add_entry
On Thu, Nov 25, Keir Fraser wrote:> This patch is still RFC I assume? Quite apart from anything else, the patch > will need to be made against xen-unstable. At this point, given the need for > the waitqueue infrastructure to get xenpaging working reliably, I don''t > think fixing xenpaging for stable 4.0.x upstream is going to fly. It''s going > to be tough enough getting it stable for 4.1.Sure, I will finally port all these changes to xen-unstable, today. Olaf _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel