The attached file supports AMD-V nested paging live migration. Please comment. I will create an updated version after collecting feedbacks. arch/x86/hvm/hvm.c | 2 arch/x86/hvm/io.c | 2 arch/x86/hvm/svm/svm.c | 3 arch/x86/mm.c | 12 +- arch/x86/mm/hap/hap.c | 220 +++++++++++++++++++++++++++++++++++++++++- arch/x86/mm/p2m.c | 92 +++++++++++++++-- arch/x86/mm/paging.c | 12 ++ include/asm-x86/domain.h | 8 + include/asm-x86/grant_table.h | 2 include/asm-x86/hap.h | 1 include/asm-x86/p2m.h | 5 include/asm-x86/page.h | 2 include/asm-x86/paging.h | 2 include/asm-x86/shadow.h | 7 - 14 files changed, 341 insertions(+), 29 deletions(-) Design: 1. We handle four live migration operators as follow: * XEN_DOMCTL_SHADOW_OP_ENABLE_LOGDIRTY ** Allocates log_dirty_bitmap ** Set log dirty bit in paging mode ** Goes through the P2M table and mark all physical memory as NOT WRITABLE ** Continues to run the guest as usual * XEN_DOMCTL_SHADOW_OP_PEEK ** There is nothing special here. It is pretty similar to shadow code. Just copy dirty bitmap information to live migration handler. * XEN_DOMCTL_SHADOW_OP_CLEAN ** Clean dirty bitmap to all 0''s. ** Goes through the P2M table and marks all physical memory as NOT WRITABLE ** Continues to run the guest as usual * XEN_DOMCTL_SHADOW_OP_OFF ** Fix P2M table and mark all physical memory as WRITABLE ** De-allocate dirty bitmap resources ** Clear log dirty bit in paging mode 2. We handle nested page fault as follow: * Nested Paging Fault ** If it is MMIO space, call handle_mmio() ** Otherwise, call p2m_fix_table() to mark a specific page as WRITABLE. Additionally, we call paging_mark_dirty() to update dirty bitmap. By doing this, we only receive one NPF for each dirty page (in each cycle). The following areas require special attention: 1. paging_mark_dirty() Currently, paging_mark_dirty() dispatches to sh_mark_dirty() or hap_mark_dirty() based on paging support. I personally prefer a function pointer. However, current paging interface only provides a function pointer for vcpu-level functions, not for domain-level functions. This is a bit annoying. 2. locking in p2m_set_l1e_flags() p2m_set_l1e_flags(), which is invoked by hap.c, calls hap_write_p2m_entry(). hap_lock() is called twice. I currently remove hap_lock() in hap_write_p2m_entry(). A better solution is needed here. Thanks, -Wei _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Hi, Thanks for this patch. At 10:05 -0500 on 01 Jun (1180692316), Huang2, Wei wrote:> The attached file supports AMD-V nested paging live migration. Please > comment. I will create an updated version after collecting feedbacks.Can a lot more log-dirty code (bitmap allocation, clearing, reporting) be made common? E.g.: hap_mark_dirty() is virtually identical to sh_mark_dirty() -- including some recursive locking and associated comments that are not true in HAP modes. Maybe give it its own lock to cover bit-setting? Probably only the code for clearing the bitmap (i.e., resetting the trap that will cause us to mark pages dirty) needs to be split out.> The following areas require special attention: > 1. paging_mark_dirty() > Currently, paging_mark_dirty() dispatches to sh_mark_dirty() or > hap_mark_dirty() based on paging support. I personally prefer a function > pointer. However, current paging interface only provides a function > pointer for vcpu-level functions, not for domain-level functions. This > is a bit annoying.Make it a common function and that should go away.> 2. locking in p2m_set_l1e_flags() > p2m_set_l1e_flags(), which is invoked by hap.c, calls > hap_write_p2m_entry(). hap_lock() is called twice. I currently remove > hap_lock() in hap_write_p2m_entry(). A better solution is needed here.Hmm. Since you don''t ever change the monitor table of a HAP domain, it might be possible to make hap_write_p2m_entry (and hap.c:p2m_install_entry_in_monitors()) safe without locking. It is worth noting that this would be a different locking discipline from the one used in shadow code -- code paths that take both the p2m lock and the shadow lock always take the p2m lock first (there are some convolutions in shadow init routines etc to make sure this is true). If the hap lock is to be taken before the p2m lock that will need some care and attention in the rest of the code.> +int p2m_set_l1e_flags(struct domain *d, u32 l1e_flags) > +{[...]> + for ( entry = d->page_list.next; > + entry != &d->page_list; > + entry = entry->next ) > + {Why not just walk the p2m? It shouldn''t be very sparse.> +/* This function handles P2M page faults by fixing l1e flags with correct > + * values. It also calls paging_mark_dirty() function to record the dirty > + * pages. > + */ > +int p2m_fix_table(struct domain *d, paddr_t gpa)Can this have a better name? It''s not really fixing anything. Maybe have this be p2m_set_flags() and the previous function be p2m_set_flags_global()? Also maybe the call to mark_dirty could be made from the SVM code, which is where we''re really handling the write? Cheers, Tim. -- Tim Deegan <Tim.Deegan@xensource.com>, XenSource UK Limited Registered office c/o EC2Y 5EB, UK; company number 05334508 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Retry. 1. Most common code are moved from shadow to paging: * log dirty related fields (dirty_count ...) are moved to paging_domain * log_dirty_bitmap allocation, free, peek, and clean * mark_dirty_page becomes a common function too * a new lock dirty lock is created to guard these code 2. shadow/hap_log_dirty_enable() and shadow/hap_log_dirty_disable() These four functions were not changed. However, I really want to create two common functions (paging_log_dirty_disable() and paging_log_dirty_enable()) for them. To do this, it requires two function pointers (*log_dirty_enable() and *log_dirty_disable()), which point to shadow-specific code or hap-specific code. For example, *log_dirty_enable() points to shadow_log_dirty_enable(). Tim, let me know if you like this approach. 3. p2m_set_l1e_flags() is renamed to p2m_set_flags_global() as requested. It does NOT walk P2M. Instead, it still relies on set_p2m_entry() to walk P2M table. The reason: I feel uncomfortable to duplicate the code of set_p2m_entry() in this method. Most of them will be same as set_p2m_entry() and p2m_next_level(). What is your opinion? Any comments is welcome. I will create a new patch after collecting them. Thanks, -Wei Tim Deegan wrote:> Hi, > > Thanks for this patch. > > At 10:05 -0500 on 01 Jun (1180692316), Huang2, Wei wrote: > > The attached file supports AMD-V nested paging live migration. Please > > comment. I will create an updated version after collecting feedbacks. > > Can a lot more log-dirty code (bitmap allocation, clearing, reporting) > be made common? E.g.: hap_mark_dirty() is virtually identical to > sh_mark_dirty() -- including some recursive locking and associated > comments that are not true in HAP modes. Maybe give it its own lock to > cover bit-setting? Probably only the code for clearing the bitmap > (i.e., resetting the trap that will cause us to mark pages dirty) needs > to be split out. > > > The following areas require special attention: > > 1. paging_mark_dirty() > > Currently, paging_mark_dirty() dispatches to sh_mark_dirty() or > > hap_mark_dirty() based on paging support. I personally prefer a function > > pointer. However, current paging interface only provides a function > > pointer for vcpu-level functions, not for domain-level functions. This > > is a bit annoying. > > Make it a common function and that should go away. > > > 2. locking in p2m_set_l1e_flags() > > p2m_set_l1e_flags(), which is invoked by hap.c, calls > > hap_write_p2m_entry(). hap_lock() is called twice. I currently remove > > hap_lock() in hap_write_p2m_entry(). A better solution is needed here. > > Hmm. Since you don''t ever change the monitor table of a HAP domain, it > might be possible to make hap_write_p2m_entry (and > hap.c:p2m_install_entry_in_monitors()) safe without locking. > > It is worth noting that this would be a different locking discipline > from the one used in shadow code -- code paths that take both the p2m > lock and the shadow lock always take the p2m lock first (there are some > convolutions in shadow init routines etc to make sure this is true). > If the hap lock is to be taken before the p2m lock that will need some > care and attention in the rest of the code. > > > > +/* This function handles P2M page faults by fixing l1e flags with > correct > > + * values. It also calls paging_mark_dirty() function to record the > dirty > > + * pages. > > + */ > > +int p2m_fix_table(struct domain *d, paddr_t gpa) > > Can this have a better name? It''s not really fixing anything. Maybe > have this be p2m_set_flags() and the previous function be > p2m_set_flags_global()? > > Also maybe the call to mark_dirty could be made from the SVM code, which > is where we''re really handling the write? > > Cheers, > > Tim. > > -- > Tim Deegan <Tim.Deegan@xensource.com>, XenSource UK Limited > Registered office c/o EC2Y 5EB, UK; company number 05334508 > >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Hi, At 23:29 -0500 on 05 Jun (1181086164), Wei Huang wrote:> 2. shadow/hap_log_dirty_enable() and shadow/hap_log_dirty_disable() > These four functions were not changed. However, I really want to create > two common functions (paging_log_dirty_disable() and > paging_log_dirty_enable()) for them. To do this, it requires two > function pointers (*log_dirty_enable() and *log_dirty_disable()), which > point to shadow-specific code or hap-specific code. For example, > *log_dirty_enable() points to shadow_log_dirty_enable(). > > Tim, let me know if you like this approach.Yep, that seems fine.> 3. p2m_set_l1e_flags() is renamed to p2m_set_flags_global() as > requested. It does NOT walk P2M. Instead, it still relies on > set_p2m_entry() to walk P2M table. > > The reason: I feel uncomfortable to duplicate the code of > set_p2m_entry() in this method. Most of them will be same as > set_p2m_entry() and p2m_next_level(). What is your opinion?I think it''d be fairly easy to do with a few nested loops since it doesn''t need to care about contents or changing the shape of the tree, or have to handle different PT layouts at run-time. I was worried about the cost of reading the struct page-info and the m2p and doing _two_ traverses of the p2m for every frame in the domain; but I don''t suppose that enabling log-dirty mode is too time-critical an operation. :) Cheers, Tim. -- Tim Deegan <Tim.Deegan@xensource.com>, XenSource UK Limited Registered office c/o EC2Y 5EB, UK; company number 05334508 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Retry. All common functions were extracted to paging level. Plus, p2m_set_flags_global() does NOT rely on set_p2m_entry() anymore. Live_migrate_patch_all.txt is the complete patch. To make it clear, I further splitted it into two small ones (1: live_migrate_interface_patch.txt, 2: live_migrate_npt_patch.txt). Please comment. Thanks. -Wei Tim Deegan wrote:> Hi, > > At 23:29 -0500 on 05 Jun (1181086164), Wei Huang wrote: >> 2. shadow/hap_log_dirty_enable() and shadow/hap_log_dirty_disable() >> These four functions were not changed. However, I really want to >> create two common functions (paging_log_dirty_disable() and >> paging_log_dirty_enable()) for them. To do this, it requires two >> function pointers (*log_dirty_enable() and *log_dirty_disable()), >> which point to shadow-specific code or hap-specific code. For >> example, *log_dirty_enable() points to shadow_log_dirty_enable(). >> >> Tim, let me know if you like this approach. > > Yep, that seems fine. > >> 3. p2m_set_l1e_flags() is renamed to p2m_set_flags_global() as >> requested. It does NOT walk P2M. Instead, it still relies on >> set_p2m_entry() to walk P2M table. >> >> The reason: I feel uncomfortable to duplicate the code of >> set_p2m_entry() in this method. Most of them will be same as >> set_p2m_entry() and p2m_next_level(). What is your opinion? > > I think it''d be fairly easy to do with a few nested loops since it > doesn''t need to care about contents or changing the shape of the > tree, or have to handle different PT layouts at run-time. > > I was worried about the cost of reading the struct page-info and the > m2p and doing _two_ traverses of the p2m for every frame in the > domain; but I don''t suppose that enabling log-dirty mode is too > time-critical an operation. :) > > Cheers, > > Tim._______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Hi, This patch is much nicer. One or two more nits below, and a Signed-off-by: line, please. :) At 16:58 -0500 on 07 Jun (1181235517), Huang2, Wei wrote:> +int hap_enable_log_dirty(struct domain *d) > +{ > + /* turn on PG_log_dirty bit in paging mode */ > + d->arch.paging.mode |= PG_log_dirty; > + p2m_set_flags_global(d, (_PAGE_PRESENT|_PAGE_USER)); > + flush_tlb_all_pge(); > + > + return 0; > +} > + > +int hap_disable_log_dirty(struct domain *d) > +{ > + /* log dirty already accquired lock to guard this code */ > + d->arch.paging.mode &= ~PG_log_dirty; > + p2m_set_flags_global(d, __PAGE_HYPERVISOR|_PAGE_USER); > + > + return 1; > +}The log-dirty lock doesn''t guard against concurrent updates of d->arch.paging.mode! You need the HAP lock here.> int paging_domctl(struct domain *d, xen_domctl_shadow_op_t *sc, > XEN_GUEST_HANDLE(void) u_domctl) > { > + int rc; > + > + if ( unlikely(d == current->domain) ) > + { > + gdprintk(XENLOG_INFO, "Dom %u tried to do a shadow op on itself.\n",(and subsequently) s/shadow/paging/ here?> @@ -2565,7 +2568,7 @@ void shadow_teardown(struct domain *d) > if (d->arch.paging.shadow.hash_table) > shadow_hash_teardown(d); > /* Release the log-dirty bitmap of dirtied pages */ > - sh_free_log_dirty_bitmap(d); > + paging_free_log_dirty_bitmap(d);Shouldn''t this be handled in paging.c? Otherwise we''d need to acquire the log-dirty lock with the shadow lock held. Cheers, Tim. -- Tim Deegan <Tim.Deegan@xensource.com>, XenSource UK Limited Registered office c/o EC2Y 5EB, UK; company number 05334508 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
This patch creates a common interface for live migration. It also supports nested paging live migration. Signed-off-by: Wei Huang <wei.huang2@amd.com> arch/x86/hvm/hvm.c | 2 arch/x86/hvm/io.c | 2 arch/x86/hvm/svm/svm.c | 4 arch/x86/mm.c | 12 - arch/x86/mm/hap/hap.c | 58 ++++++- arch/x86/mm/p2m.c | 136 ++++++++++++++++- arch/x86/mm/paging.c | 323 ++++++++++++++++++++++++++++++++++++++++- arch/x86/mm/shadow/common.c | 330 +++++++----------------------------------- arch/x86/mm/shadow/multi.c | 12 - arch/x86/mm/shadow/private.h | 6 include/asm-x86/domain.h | 45 +++-- include/asm-x86/grant_table.h | 2 include/asm-x86/p2m.h | 5 include/asm-x86/paging.h | 26 +++ include/asm-x86/shadow.h | 18 +- 15 files changed, 642 insertions(+), 339 deletions(-) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
This patch creates a common interface for live migration. It also supports nested paging live migration. Signed-off-by: Wei Huang <wei.huang2@amd.com>>> @@ -2565,7 +2568,7 @@ void shadow_teardown(struct domain *d) >> if (d->arch.paging.shadow.hash_table) >> shadow_hash_teardown(d); >> /* Release the log-dirty bitmap of dirtied pages */ >> - sh_free_log_dirty_bitmap(d); >> + paging_free_log_dirty_bitmap(d); > > Shouldn''t this be handled in paging.c? Otherwise we''d need to > acquire the log-dirty lock with the shadow lock held. >Please ignore my previous patch. Most of time, log dirty will be turned on and off together. Under this assumption, my previous patch only removes paging_free_log_dirty_bitmap() from shadow.c and hap.c. But it does not call paging_free_log_dirty_bitmap() in paging_teardown(). The attached file is same as previous one, expect that it adds a paging_log_dirty_teardown() function to paging.c (see below). Thanks, -Wei ==============diff -r f270fef2fb60 -r 6323c8beb60c xen/arch/x86/mm/paging.c --- a/xen/arch/x86/mm/paging.c Fri Jun 08 05:10:28 2007 -0500 +++ b/xen/arch/x86/mm/paging.c Fri Jun 08 08:05:56 2007 -0500 @@ -305,6 +305,13 @@ void paging_log_dirty_init(struct domain d->arch.paging.log_dirty.clean_dirty_bitmap = clean_dirty_bitmap; } +/* This function fress log dirty bitmap resources. */ +void paging_log_dirty_teardown(struct domain*d) +{ + log_dirty_lock(d); + paging_free_log_dirty_bitmap(d); + log_dirty_unlock(d); +} /************************************************/ /* CODE FOR PAGING SUPPORT */ /************************************************/ @@ -390,6 +397,9 @@ int paging_domctl(struct domain *d, xen_ /* Call when destroying a domain */ void paging_teardown(struct domain *d) { + /* clean up log dirty resources. */ + paging_log_dirty_teardown(d); + if ( opt_hap_enabled && is_hvm_domain(d) ) hap_teardown(d); else _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel