Andres Lagar-Cavilla
2011-Dec-06 18:13 UTC
Re: [PATCH 2 of 5] xenpaging: map gfn before nomination
> Date: Tue, 06 Dec 2011 18:07:03 +0100 > From: Olaf Hering <olaf@aepfle.de> > To: xen-devel@lists.xensource.com > Subject: [Xen-devel] [PATCH 2 of 5] xenpaging: map gfn before > nomination > Message-ID: <96d3292797d861592a7d.1323191223@probook.site> > Content-Type: text/plain; charset="us-ascii" > > # HG changeset patch > # User Olaf Hering <olaf@aepfle.de> > # Date 1323189147 -3600 > # Node ID 96d3292797d861592a7d2d3840f371ec719775a9 > # Parent b733498b351a8650b2d952aa56725f63d49c1889 > xenpaging: map gfn before nomination > > If the gfn is mapped before nomination, all special cases in > do_mmu_update() > for paged gfns can be removed. If a gfn is actually in any of the paging > states the caller has to try again. > > Bump interface age.Ouch! You''re absolutely tying the user space pager with the underlying xen paging code. Most of your patches change the tools and the hypervisor in lockstep. These things are independent. Let me clarify this: I''m working on a pager. I rely on the current interface. I''m not opposed to changes that optimize the paging code in Xen, but I am not in favor of changes that change the interface in a rather gratuitous manner. Patch 4 in your series is one such case. Short-cutting the state machine: great. But what is the gain for the hypervisor in *not* sending the EVICT_FAIL event. It''s a good thing. It keeps the same interface to user-space. Xenpaging may not need it, but the Xen paging code does not exist solely for xenpaging. Patch 5 also ties xenpaging and xen code together. I don''t mind this patch, but it should definitely be split in two. Patch 1 is also a problem. I don''t buy that we can number interfaces, and then only support the tip ("Sorry, ENOEXEC, dig out an older hypervisor"). Either we bite the bullet of some level of backwards compatibility with all its messiness, or we just keep it in flux until there is convergence on a major "barrier". Patch 3 makes a lot of sense. This current patch 2 is also rather gratuitous. The need to map before nominate feels superfluous. If Xen cannot deal with nomination requests on pages that have not been mapped, then we''ve broken Xen. Anyway, I hope my message gets across. I''m not trying to organize a blockade on your code. I''m trying to get to the best possible hypervisor paging support we can. Thanks, Andres> > Signed-off-by: Olaf Hering <olaf@aepfle.de> > > diff -r b733498b351a -r 96d3292797d8 tools/xenpaging/xenpaging.c > --- a/tools/xenpaging/xenpaging.c > +++ b/tools/xenpaging/xenpaging.c > @@ -576,7 +576,7 @@ static int xenpaging_evict_page(xenpagin > > DECLARE_DOMCTL; > > - /* Map page */ > + /* Map page to get a handle */ > gfn = victim->gfn; > ret = -EFAULT; > page = xc_map_foreign_pages(xch, paging->mem_event.domain_id, > @@ -587,16 +587,21 @@ static int xenpaging_evict_page(xenpagin > goto out; > } > > + /* Nominate the page */ > + ret = xc_mem_paging_nominate(xch, paging->mem_event.domain_id, gfn); > + if ( ret != 0 ) > + goto out; > + > /* Copy page */ > ret = write_page(fd, page, i); > if ( ret != 0 ) > { > PERROR("Error copying page %lx", victim->gfn); > - munmap(page, PAGE_SIZE); > goto out; > } > > munmap(page, PAGE_SIZE); > + page = NULL; > > /* Tell Xen to evict page */ > ret = xc_mem_paging_evict(xch, paging->mem_event.domain_id, > @@ -615,6 +620,8 @@ static int xenpaging_evict_page(xenpagin > paging->num_paged_out++; > > out: > + if (page) > + munmap(page, PAGE_SIZE); > return ret; > } > > @@ -738,14 +745,11 @@ static int evict_victim(xenpaging_t *pag > ret = -EINTR; > goto out; > } > - ret = xc_mem_paging_nominate(xch, paging->mem_event.domain_id, > victim->gfn); > - if ( ret == 0 ) > - ret = xenpaging_evict_page(paging, victim, fd, i); > - else > + ret = xenpaging_evict_page(paging, victim, fd, i); > + if ( ret && j++ % 1000 == 0 ) > { > - if ( j++ % 1000 == 0 ) > - if ( xenpaging_mem_paging_flush_ioemu_cache(paging) ) > - PERROR("Error flushing ioemu cache"); > + if ( xenpaging_mem_paging_flush_ioemu_cache(paging) ) > + PERROR("Error flushing ioemu cache"); > } > } > while ( ret ); > diff -r b733498b351a -r 96d3292797d8 xen/arch/x86/mm/p2m.c > --- a/xen/arch/x86/mm/p2m.c > +++ b/xen/arch/x86/mm/p2m.c > @@ -723,7 +723,7 @@ set_shared_p2m_entry(struct domain *d, u > * - the gfn is backed by a mfn > * - the p2mt of the gfn is pageable > * - the mfn is not used for IO > - * - the mfn has exactly one user and has no special meaning > + * - the mfn has exactly two users (guest+pager) and has no special > meaning > * > * Once the p2mt is changed the page is readonly for the guest. On > success the > * pager can write the page contents to disk and later evict the page. > @@ -758,7 +758,7 @@ int p2m_mem_paging_nominate(struct domai > /* Check page count and type */ > page = mfn_to_page(mfn); > if ( (page->count_info & (PGC_count_mask | PGC_allocated)) !> - (1 | PGC_allocated) ) > + (2 | PGC_allocated) ) > goto out; > > if ( (page->u.inuse.type_info & PGT_type_mask) != PGT_none ) > @@ -785,7 +785,7 @@ int p2m_mem_paging_nominate(struct domai > * freed: > * - the gfn is backed by a mfn > * - the gfn was nominated > - * - the mfn has still exactly one user and has no special meaning > + * - the mfn has still exactly one user (the guest) and has no special > meaning > * > * After successful nomination some other process could have mapped the > page. In > * this case eviction can not be done. If the gfn was populated before > the pager > diff -r b733498b351a -r 96d3292797d8 xen/include/public/mem_event.h > --- a/xen/include/public/mem_event.h > +++ b/xen/include/public/mem_event.h > @@ -49,7 +49,7 @@ > #define MEM_EVENT_REASON_INT3 5 /* int3 was hit: gla/gfn are > RIP */ > #define MEM_EVENT_REASON_SINGLESTEP 6 /* single step was invoked: > gla/gfn are RIP */ > > -#define MEM_EVENT_PAGING_AGE 1UL /* Number distinguish the > mem_paging <-> pager interface */ > +#define MEM_EVENT_PAGING_AGE 2UL /* Number distinguish the > mem_paging <-> pager interface */ > > typedef struct mem_event_shared_page { > uint32_t port; > > > > ------------------------------ > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel > > > End of Xen-devel Digest, Vol 82, Issue 93 > ***************************************** >
On Tue, Dec 06, Andres Lagar-Cavilla wrote:> Ouch! You''re absolutely tying the user space pager with the underlying xen > paging code. Most of your patches change the tools and the hypervisor in > lockstep.Yes, pager and hypervisor are bound closely together.> Patch 4 in your series is one such case. Short-cutting the state machine: > great. But what is the gain for the hypervisor in *not* sending the > EVICT_FAIL event. It''s a good thing. It keeps the same interface to > user-space. Xenpaging may not need it, but the Xen paging code does not > exist solely for xenpaging.What IS the need to send yet another request? It adds just overhead for no obvious need. Please show the code that will benefit from the extra EVICT_FAIL message.> Patch 1 is also a problem. I don''t buy that we can number interfaces, and > then only support the tip ("Sorry, ENOEXEC, dig out an older hypervisor"). > Either we bite the bullet of some level of backwards compatibility with > all its messiness, or we just keep it in flux until there is convergence > on a major "barrier".An out-of-tree pager can very well try to support a number of hypervisors, perhaps patch #1 can be extended to report the age in some way. But why would the upstream pager care about (development!) state from the past?> This current patch 2 is also rather gratuitous. The need to map before > nominate feels superfluous. If Xen cannot deal with nomination requests on > pages that have not been mapped, then we''ve broken Xen.Why? It was broken (or rather: not optimal) in the first place. Any attempt to map a gfn in any paging state has to return -ENOENT right away. The pager is certainly a special case, and thanks to this change, and your change for the page-in part, the special handling can now get removed.> Anyway, I hope my message gets across. I''m not trying to organize a > blockade on your code. I''m trying to get to the best possible hypervisor > paging support we can.Great. Then please improve tools/xenpaging/, thats the cure for NIH. Olaf
At 10:27 +0100 on 07 Dec (1323253657), Olaf Hering wrote:> On Tue, Dec 06, Andres Lagar-Cavilla wrote: > > > Ouch! You''re absolutely tying the user space pager with the underlying xen > > paging code. Most of your patches change the tools and the hypervisor in > > lockstep. > > Yes, pager and hypervisor are bound closely together. > > > Patch 4 in your series is one such case. Short-cutting the state machine: > > great. But what is the gain for the hypervisor in *not* sending the > > EVICT_FAIL event. It''s a good thing. It keeps the same interface to > > user-space. Xenpaging may not need it, but the Xen paging code does not > > exist solely for xenpaging. > > What IS the need to send yet another request? It adds just overhead for > no obvious need. Please show the code that will benefit from the extra > EVICT_FAIL message.While in general it''s good to keep backward compatibility, I don''t think this interface has ever had a stable, usable release (that didn''t involve the consumer at least carrying hypervisor patches of their own) so I think it''s OK to make fairly large changes in it. Tim.
On Wed, Dec 07, Tim Deegan wrote:> While in general it''s good to keep backward compatibility, I don''t think > this interface has ever had a stable, usable release (that didn''t > involve the consumer at least carrying hypervisor patches of their own) > so I think it''s OK to make fairly large changes in it.Yes, that what I forgot to mention. 4.0 worked somehow, 4.1 was unfortunately broken. Hopefully everything will be ready by 4.2-rc1. Olaf
Andres Lagar-Cavilla
2011-Dec-07 16:22 UTC
Re: [PATCH 2 of 5] xenpaging: map gfn before nomination
> On Tue, Dec 06, Andres Lagar-Cavilla wrote: > >> Ouch! You''re absolutely tying the user space pager with the underlying >> xen >> paging code. Most of your patches change the tools and the hypervisor in >> lockstep. > > Yes, pager and hypervisor are bound closely together. > >> Patch 4 in your series is one such case. Short-cutting the state >> machine: >> great. But what is the gain for the hypervisor in *not* sending the >> EVICT_FAIL event. It''s a good thing. It keeps the same interface to >> user-space. Xenpaging may not need it, but the Xen paging code does not >> exist solely for xenpaging. > > What IS the need to send yet another request? It adds just overhead for > no obvious need. Please show the code that will benefit from the extra > EVICT_FAIL message.I have mentioned the reasoning in a previous email. We need to look at the big picture when it comes to overhead. Sending an event is, what, nanoseconds? -- particularly when guest events are guaranteed to succeed and not perhaps go to sleep on a wait queue. The I/O involved is an overhead orders of magnitude much larger. When you have hundreds (thousands?) of VMs pounding your storage fabric, it becomes germane to avoid unnecessary I/O. A pager that performs a batch of nominations, I/O, and evictions, is not at all unreasonable. It will benefit greatly from early "do not evict" notices. The cost/benefit equation here is drastically skewed in favor of one event, versus one I/O.> >> Patch 1 is also a problem. I don''t buy that we can number interfaces, >> and >> then only support the tip ("Sorry, ENOEXEC, dig out an older >> hypervisor"). >> Either we bite the bullet of some level of backwards compatibility with >> all its messiness, or we just keep it in flux until there is convergence >> on a major "barrier". > > An out-of-tree pager can very well try to support a number of > hypervisors, perhaps patch #1 can be extended to report the age in some > way. But why would the upstream pager care about (development!) state > from the past? > > >> This current patch 2 is also rather gratuitous. The need to map before >> nominate feels superfluous. If Xen cannot deal with nomination requests >> on >> pages that have not been mapped, then we''ve broken Xen. > > Why? It was broken (or rather: not optimal) in the first place. > Any attempt to map a gfn in any paging state has to return -ENOENT > right away. The pager is certainly a special case, and thanks to this > change, and your change for the page-in part, the special handling can now > get removed.If you''re pushing for this change because you want to get rid from a few lines of (correct) error checking in the hypervisor, that''s not a good reason imho. If you have a more compelling reason, sure.> >> Anyway, I hope my message gets across. I''m not trying to organize a >> blockade on your code. I''m trying to get to the best possible hypervisor >> paging support we can. > > Great. > Then please improve tools/xenpaging/, thats the cure for NIH.Woaah. Please read my email carefully. This is not about NIH. This about changes that bring no tangible benefits and break other people''s code. Andres> > Olaf >
Andres Lagar-Cavilla
2011-Dec-07 16:23 UTC
Re: [PATCH 2 of 5] xenpaging: map gfn before nomination
> At 10:27 +0100 on 07 Dec (1323253657), Olaf Hering wrote: >> On Tue, Dec 06, Andres Lagar-Cavilla wrote: >> >> > Ouch! You''re absolutely tying the user space pager with the underlying >> xen >> > paging code. Most of your patches change the tools and the hypervisor >> in >> > lockstep. >> >> Yes, pager and hypervisor are bound closely together. >> >> > Patch 4 in your series is one such case. Short-cutting the state >> machine: >> > great. But what is the gain for the hypervisor in *not* sending the >> > EVICT_FAIL event. It''s a good thing. It keeps the same interface to >> > user-space. Xenpaging may not need it, but the Xen paging code does >> not >> > exist solely for xenpaging. >> >> What IS the need to send yet another request? It adds just overhead for >> no obvious need. Please show the code that will benefit from the extra >> EVICT_FAIL message. > > While in general it''s good to keep backward compatibility, I don''t think > this interface has ever had a stable, usable release (that didn''t > involve the consumer at least carrying hypervisor patches of their own) > so I think it''s OK to make fairly large changes in it.Would you agree that targeting 4.2 as a reasonable long-term interface is a. feasible? b. worthy? Andres> > Tim. >
At 08:23 -0800 on 07 Dec (1323246231), Andres Lagar-Cavilla wrote:> > At 10:27 +0100 on 07 Dec (1323253657), Olaf Hering wrote: > >> On Tue, Dec 06, Andres Lagar-Cavilla wrote: > >> > >> > Ouch! You''re absolutely tying the user space pager with the underlying > >> xen > >> > paging code. Most of your patches change the tools and the hypervisor > >> in > >> > lockstep. > >> > >> Yes, pager and hypervisor are bound closely together. > >> > >> > Patch 4 in your series is one such case. Short-cutting the state > >> machine: > >> > great. But what is the gain for the hypervisor in *not* sending the > >> > EVICT_FAIL event. It''s a good thing. It keeps the same interface to > >> > user-space. Xenpaging may not need it, but the Xen paging code does > >> not > >> > exist solely for xenpaging. > >> > >> What IS the need to send yet another request? It adds just overhead for > >> no obvious need. Please show the code that will benefit from the extra > >> EVICT_FAIL message. > > > > While in general it''s good to keep backward compatibility, I don''t think > > this interface has ever had a stable, usable release (that didn''t > > involve the consumer at least carrying hypervisor patches of their own) > > so I think it''s OK to make fairly large changes in it. > Would you agree that targeting 4.2 as a reasonable long-term interface is > a. feasible? b. worthy?Yes, and yes. Tim.