This patch fixes memory exchange hypercall which has been broken on ia64. Especially the error recovery path. Dropping _PGC_allocated bit and guest_physmap_remove_page() must be done before stealing page because The ia64 p2m table increments page refcount and guest_physmap_remove_page() behaves depending on _PGC_allocated. So far there is work around code in ia64, however the c/s 13366:ed73ff8440d8 of xen-unstable.hg revealed that the work around is broken. The c/s passes dma bit argument of memory exchange so that it results in hypercall error. On x86 platform, memory_exchange() is used only for paravirtualized domain and guest_physmap_remove_page() and guest_phsymap_add_page() are nop for paravirtualized domain. So reordering them is safe. -- yamahata _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 15/3/07 09:53, "Isaku Yamahata" <yamahata@valinux.co.jp> wrote:> This patch fixes memory exchange hypercall which has been broken on ia64. > Especially the error recovery path. > Dropping _PGC_allocated bit and guest_physmap_remove_page() must be done > before stealing page because The ia64 p2m table increments page > refcount and guest_physmap_remove_page() behaves depending on _PGC_allocated.Do the p2m semantics of the two architectures really need to differ? Can''t ia64 match x86 semantics (or vice versa)? Otherwise you''re going to be continually getting broken and having to implement skanky workarounds. -- Keir> So far there is work around code in ia64, however the c/s 13366:ed73ff8440d8 > of xen-unstable.hg revealed that the work around is broken. > The c/s passes dma bit argument of memory exchange so that it results > in hypercall error._______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Isaku Yamahata
2007-Mar-15  11:42 UTC
Re: [Xen-devel] [PATCH] fix memory exchange hypercall.
On Thu, Mar 15, 2007 at 10:59:33AM +0000, Keir Fraser wrote:> Do the p2m semantics of the two architectures really need to differ? Can''t > ia64 match x86 semantics (or vice versa)? Otherwise you''re going to be > continually getting broken and having to implement skanky workarounds.I fully agree with you. I''ve tried, but I haven''t found any good way yet. :-(. The big differrence is that the x86 p2m table (i.e. hvm domain) doesn''t support foreign domain page mapping and grant table page mapping. On the other hand, the ia64 p2m supports them. (NOTE:The ia64 p2m is used for both para/full virtualized domain. Even for dom0. So they must be supported.) For that purpose, the ia64 p2m table semantic is something similar to (the x86 p2m table) + (the x86 page table reference counting). (NOTE:those page mapping are based on virtual address with x86 paravirtualized domain. On the other hand they are based on pseudo physical address with auto translated physmap mode i.e. hvm domain or ia64 domain. This is the root cause of the difference.) If the x86 p2m table supported those page mapping, the similar semantics would be necessary, I suppose. But it won''t happen in the near future. -- yamahata _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 15/3/07 11:42, "Isaku Yamahata" <yamahata@valinux.co.jp> wrote:> The big differrence is that the x86 p2m table (i.e. hvm domain) > doesn''t support foreign domain page mapping and grant table page mapping. > On the other hand, the ia64 p2m supports them. > (NOTE:The ia64 p2m is used for both para/full virtualized domain. > Even for dom0. So they must be supported.)Could you take references for the foreign mappings and grant mappings only? Presumably you have some kind of metadata associated with each p2m entry to tell you what kind of thing is mapped at each pfn? Alternatively we could change x86 to take a reference for anything in its physmap. It seems a bit unnecessary though. I think when we come to add other types of things to a physmap (grant references in particular) we''ll do reference counting dependent on the type of thing that is mapped. After all, we also sometime want to be able to assign real mmio ranges to HVM guests, and there are no page_info structures associated with such ranges, and so no reference count to increment! Whichever, I think we should sort it out one way or the other and change ia64 and/or x86 now. Or it''ll never happen! -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Isaku Yamahata
2007-Mar-15  13:38 UTC
Re: [Xen-devel] [PATCH] fix memory exchange hypercall.
On Thu, Mar 15, 2007 at 12:23:20PM +0000, Keir Fraser wrote:> Whichever, I think we should sort it out one way or the other and change > ia64 and/or x86 now. Or it''ll never happen!I see. To be honest I expected this kind of response. I''ll give it a try again, but it may take a while. Could you explain about PGC_allocated and balloon? What''s the rationale when to clear/set PGC_allocate bit? My understanding is as follows. - PGC_allocated means +1 page reference count. - It is neccessary for x86 paravirtualized domain to keep page from freeing. - It doesn''t make sense for x86 hvm domain so that it is only cleared by relinquish_memory(). - If the p2m entry is cleared which isn''t followed by clearing PGC_allocaed and put_page() without any recording, there is no easy way to free the page until domain destruction. thnaks, -- yamahata _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
> My understanding is as follows. > - PGC_allocated means +1 page reference count. > - It is neccessary for x86 paravirtualized domain to keep > page from freeing.More generally it provides the semantics that an allocated page is not freed until or after it has been explicitly deallocated (so the deallocation routine must test-and-clear that flag and -1 the refcnt if it was set).> - It doesn''t make sense for x86 hvm domain so that it is only cleared > by relinquish_memory().It will make sense as soon as x86 hvm guests can execute the decrease_reservation memory_op. Intel have prototype code for this already. It may not make 3.0.5-0, but it''s not *that* far out.> - If the p2m entry is cleared which isn''t followed by > clearing PGC_allocaed and put_page() without any recording, > there is no easy way to free the page until domain destruction.This is true. Arguably guest_physmap_add_page() should test-and-clear PGC_allocated if there was already a page mapped at that address. Or it should fail. Or we should expect the guest not to do something this silly. Really the correct answer is not immediately clear. I guess the first option makes most sense though, perhaps with a gdprintk(XENLOG_WARNING). -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Isaku Yamahata
2007-Mar-16  03:25 UTC
Re: [Xen-devel] [PATCH] fix memory exchange hypercall.
On Thu, Mar 15, 2007 at 01:58:56PM +0000, Keir Fraser wrote:> > - If the p2m entry is cleared which isn''t followed by > > clearing PGC_allocaed and put_page() without any recording, > > there is no easy way to free the page until domain destruction. > > This is true. Arguably guest_physmap_add_page() should test-and-clear > PGC_allocated if there was already a page mapped at that address. Or it > should fail. Or we should expect the guest not to do something this silly. > > Really the correct answer is not immediately clear. I guess the first option > makes most sense though, perhaps with a gdprintk(XENLOG_WARNING).The ia64 p2m already adopted the first option without warning. So the guest_physmap_remove_page() depends on PGC_allocated bit. What''s the expected behavior of guset_physmap_remove_page() with PGC_allocated? It shouldn''t touch the bit? Thanks a lot for your clarification. -- yamahata _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 16/3/07 03:25, "Isaku Yamahata" <yamahata@valinux.co.jp> wrote:> The ia64 p2m already adopted the first option without warning. > So the guest_physmap_remove_page() depends on PGC_allocated bit. > What''s the expected behavior of guset_physmap_remove_page() with > PGC_allocated? It shouldn''t touch the bit? > > Thanks a lot for your clarification.I don''t think that guest_physmap_remove_page() should need to touch the PGC_allocated bit. All callers of that function can take responsibility for the page they unhook (that''s the case for all the callers from common code, at least). I''m more worried about callers of guest_physmap_add_page(): it may not be obvious to users of that function if they have to take responsibility for what used to be mapped at that pseudophysical address! -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel