Hi,

I've been working on a project to minimize the overhead of page faults when a PV guest is in log_dirty mode. I wrote a function, sh_prebuild(), which is very similar to sh_prefetch(). The major difference between them is that I pass "ft" as ft_demand_write rather than ft_prefetch when I call l1e_propagate_from_guest(). I just want to make the faulted page and the pages following it writable and marked as dirty after a write-protect page fault.

sh_prebuild() is called in sh_page_fault(), right before sh_prefetch():

---------------sh_page_fault()-----------------

    if ( ( shadow_mode_log_dirty(v->domain) )
         && ( ft == ft_demand_write ) )
        sh_prebuild(.......);

    ..............

I have a problem when I migrate. The code can successfully migrate the domU to another UNMODIFIED Xen. But when I migrate a domU from the UNMODIFIED Xen back to this MODIFIED Xen, the domU's clock is frozen. I used "xm console" to get into the domU's console, but I just got about 6 TSC error messages and no other response.

I am using a 32-bit PV domain, no PSE.
Does anyone know a possible reason? Thanks.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
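As an illustration of the idea in this post, here is a minimal toy model in C. It is not Xen code: `toy_replay`, `struct toy_result`, `TOY_NR_PAGES` and the `window` parameter are all hypothetical names made up for this sketch. It assumes every page starts read-only and clean, that a write to a read-only page costs one fault, and that the fault handler then marks the faulting page plus the next `window` pages writable and dirty.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_PAGES 64   /* size of the toy address space, in pages */

struct toy_result {
    unsigned int faults;       /* write faults taken */
    unsigned int dirty_pages;  /* pages that would have to be resent */
};

/*
 * Hypothetical model, not the real shadow code.  Replays a trace of
 * written page numbers and counts faults and dirtied pages.
 */
struct toy_result toy_replay(const unsigned int *writes, size_t n,
                             unsigned int window)
{
    bool writable[TOY_NR_PAGES] = { false };
    struct toy_result r = { 0, 0 };

    for ( size_t i = 0; i < n; i++ )
    {
        unsigned int pfn = writes[i];

        assert(pfn < TOY_NR_PAGES);
        if ( writable[pfn] )
            continue;                   /* already writable: no fault */

        r.faults++;
        /* Open the faulting page and up to `window` following pages. */
        for ( unsigned int k = 0; k <= window && pfn + k < TOY_NR_PAGES; k++ )
        {
            if ( !writable[pfn + k] )
            {
                writable[pfn + k] = true;
                r.dirty_pages++;        /* each page is counted once */
            }
        }
    }
    return r;
}
```

For a trace that writes pages 10, 11, 12 and 40, a window of 0 gives 4 faults and 4 dirty pages, while a window of 3 gives 2 faults and 8 dirty pages. Whether that exchange pays off depends on how local the guest's writes really are.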
Tim Deegan
2009-Jun-24 10:22 UTC
Re: [Xen-devel] Questions about sh_prefetch and log_dirty
Hi,

At 17:44 +0100 on 23 Jun (1245779059), David Knight wrote:
> I've been working on a project which is to minimize the
> overhead of page_fault when PV is in log_dirty mode. I
> wrote a function "sh_prebuild", it's very much similar
> to sh_prefetch(), the major difference between them is:
> I set the "ft" as "ft_demand_write" rather than
> "ft_prefetch" when I call l1e_propagate_from_guest().

I'm not sure that's a good idea. The point of log-dirty is that it lets the migration tool resend only the dirtied pages, and although your change will avoid some page faults, it will increase the number of frames that have to be retransmitted.

Also, just setting ft_demand_write won't always do what you want: in _sh_propagate, if the guest PTE's _PAGE_DIRTY bit isn't set it will mask out _PAGE_RW anyway.

That said, I don't understand why your live migrations are failing, especially not on the migration _to_ the modified Xen, since the idea you describe should be safe.

Cheers,

Tim.

-- 
Tim Deegan <Tim.Deegan@citrix.com>
Principal Software Engineer, Citrix Systems (R&D) Ltd.
[Company #02300071, SL9 0DZ, UK.]
David Knight
2009-Jun-24 13:08 UTC
Re: [Xen-devel] Questions about sh_prefetch and log_dirty
Thank you for your reply.

> I'm not sure that's a good idea. The point of log-dirty is that it lets
> the migration tool resend only the dirtied pages, and although your
> change will avoid some page faults, it will increase the number of
> frames that have to be retransmitted.

Yes, but according to our locality statistics, marking the nearest 3 or 4 pages as writable will not be a big problem.

> Also, just setting ft_demand_write won't always do what you want: in
> _sh_propagate, if the guest PTE's _PAGE_DIRTY bit isn't set it will mask
> out _PAGE_RW anyway.
>
> That said, I don't understand why your live migrations are failing,
> especially not on the migration _to_ the modified Xen, since the idea
> you describe should be safe.

Could it be possible that Xen is in log_dirty mode when a domU is migrating into it?
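The _sh_propagate behaviour quoted in this message can be pictured with a small sketch. This is an assumption-laden toy, not the real function: `toy_shadow_l1_flags` is a hypothetical name, the flag constants are the architectural x86 page-table bits, and only the single rule Tim describes is modelled (a guest PTE whose dirty bit is clear never yields a writable shadow entry, whatever fetch type is passed in).

```c
#include <assert.h>
#include <stdint.h>

/* x86 page-table entry flag bits (architectural values). */
#define PAGE_PRESENT  0x001u
#define PAGE_RW       0x002u
#define PAGE_ACCESSED 0x020u
#define PAGE_DIRTY    0x040u

/*
 * Hypothetical helper, NOT the real _sh_propagate(): derive shadow L1
 * flags from guest L1 flags, modelling only the dirty-bit rule.
 */
uint32_t toy_shadow_l1_flags(uint32_t guest_flags)
{
    uint32_t shadow_flags = guest_flags;

    /* Guest dirty bit clear: the first write must still fault. */
    if ( !(guest_flags & PAGE_DIRTY) )
        shadow_flags &= ~PAGE_RW;

    /* The shadow entry never gains a permission the guest lacks. */
    assert((shadow_flags & ~guest_flags) == 0);
    return shadow_flags;
}
```

Under this rule, a prebuilt neighbour whose guest PTE is writable but not yet dirty would still come out read-only, so asking for ft_demand_write alone does not make it writable.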
Tim Deegan
2009-Jun-24 13:11 UTC
Re: [Xen-devel] Questions about sh_prefetch and log_dirty
At 14:08 +0100 on 24 Jun (1245852490), David Knight wrote:
> Could it be possible that xen is in log_dirty_mode when a DomU is
> immigrating into it ?

No; and if it were it wouldn't matter since the domain being migrated into isn't running.

Cheers,

Tim.
David Knight
2009-Jun-24 13:20 UTC
Re: [Xen-devel] Questions about sh_prefetch and log_dirty
Thanks a lot.
David Knight
2009-Jun-25 12:56 UTC
Re: [Xen-devel] Questions about sh_prefetch and log_dirty
Hi,

Sorry to bother you again.

I made a mistake when I explained my problem. My code can successfully migrate the domU TO the MODIFIED Xen. But when a domU is migrated from the MODIFIED Xen to an UNMODIFIED Xen, the clock is frozen. I got into the console of that domU with "xm console" and got these error messages:

Timer ISR/0: Time went backwards: delta=-33260965833 delta_cpu=-33250965833 shadow=541029605945 off=567446658 processed=574858018252 cpu_processed=574848018252
 0: 574848018252
Timer ISR/0: Time went backwards: delta=-31534624494 delta_cpu=-31524624494 shadow=543029632891 off=293761045 processed=574858018252 cpu_processed=574848018252
 0: 574848018252
Timer ISR/0: Time went backwards: delta=-30670414300 delta_cpu=-30660414300 shadow=544029644694 off=157959445 processed=574858018252 cpu_processed=574848018252
 0: 574848018252
Timer ISR/0: Time went backwards: delta=-29118214802 delta_cpu=-29108214802 shadow=545029657756 off=710145854 processed=574858018252 cpu_processed=574848018252
 0: 574848018252
Timer ISR/0: Time went backwards: delta=-28974426202 delta_cpu=-28964426202 shadow=545029657756 off=853934423 processed=574858018252 cpu_processed=574848018252
 0: 574848018252
Timer ISR/0: Time went backwards: delta=-28702177774 delta_cpu=-28692177774 shadow=546029671231 off=126169401 processed=574858018252 cpu_processed=574848018252
 0: 574848018252
Timer ISR/0: Time went backwards: delta=-28542229390 delta_cpu=-28532229390 shadow=546029671231 off=286117763 processed=574858018252 cpu_processed=574848018252
 0: 574848018252
Timer ISR/0: Time went backwards: delta=-28413103192 delta_cpu=-28403103192 shadow=546029671231 off=415243991 processed=574858018252 cpu_processed=574848018252
 0: 574848018252

I could still log in to that domU with SSH, and I found the clock was not advancing.
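A quick arithmetic check helps in reading these log lines. The sketch below assumes, without confirmation from the thread, that the fields relate roughly as delta = shadow + off - processed, all in nanoseconds. The helper name `toy_time_delta_ns` is hypothetical; the parameter names simply mirror the log fields.

```c
#include <stdint.h>

/*
 * Hypothetical helper: how far the time the guest is now being told
 * (shadow + off) lies ahead of the time it has already accounted for
 * (processed).  A negative result means "time went backwards".
 * The relation between the fields is an assumption, not a quoted rule.
 */
int64_t toy_time_delta_ns(int64_t shadow, int64_t off, int64_t processed)
{
    return shadow + off - processed;
}
```

Fed the first log line (shadow=541029605945, off=567446658, processed=574858018252), this gives -33260965649, within 184 ns of the logged delta=-33260965833. Read that way, the guest had already accounted for about 33 seconds more than the time it was being told after the migration. The `processed` value is identical in every line, which matches the observation that the clock is not advancing.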
Here is my code; most of it is just a copy of sh_prefetch():

static void sh_prebuild(struct vcpu *v, walk_t gw,
        shadow_l1e_t *ptr_sl1e, mfn_t sl1mfn, unsigned int countl, unsigned int countr)
{
    int i, distr,distl;
    gfn_t gfn;
    mfn_t gmfn;
    guest_l1e_t *gl1p = NULL, gl1e;
    shadow_l1e_t sl1e;
    u32 gflags;
    p2m_type_t p2mt;
    struct page_info *pg;

    distl = ((unsigned long)ptr_sl1e & ~PAGE_MASK) / sizeof sl1e;
    distr = (PAGE_SIZE - ((unsigned long)ptr_sl1e & ~PAGE_MASK)) / sizeof sl1e;

    if ( distl > countl )
        distl = countl;
    if ( distr > countr + 1 )
        distr = countr + 1;

    if ( !mfn_valid(gw.l1mfn) )
        return;

    gl1p = sh_map_domain_page(gw.l1mfn);
    gl1p += guest_l1_table_offset(gw.va);

    ptr_sl1e -= distl;
    distl = 0-distl;

    for ( i = distl; i < distr ; i++ )
    {
        if ( i == 0 )
        {
            ptr_sl1e += 1;
            continue;
        }

        if ( ptr_sl1e->l1 != 0 )
            break;

        if ( mfn_valid(gw.l1mfn) )
        {
            gl1e = *(gl1p + i);

            gflags = guest_l1e_get_flags(gl1e);
            if ( (gflags & _PAGE_PRESENT)
                 && (!(gflags & _PAGE_ACCESSED)
                     || ((gflags & _PAGE_RW) && !(gflags & _PAGE_DIRTY))) )
                break;
        }
        else
        {
            ASSERT(guest_l2e_get_flags(gw.l2e) & _PAGE_PSE);
            gl1e = guest_l1e_from_gfn(
                _gfn(gfn_x(guest_l1e_get_gfn(gw.l1e)) + i),
                guest_l1e_get_flags(gw.l1e));
        }

        gfn = guest_l1e_get_gfn(gl1e);
        gmfn = gfn_to_mfn(v->domain, gfn, &p2mt);

        pg = mfn_to_page(gmfn);

        if ( mfn_valid(gmfn)
             && ((pg->u.inuse.type_info & PGT_type_mask)==PGT_writable_page) )
        {
            l1e_propagate_from_guest(v, gl1e, gmfn, &sl1e, ft_demand_write, p2mt);
            (void) shadow_set_l1e(v, ptr_sl1e, sl1e, sl1mfn);
        }

        ptr_sl1e += 1;
    }
    if ( gl1p != NULL )
        sh_unmap_domain_page(gl1p);
}
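The window arithmetic at the top of sh_prebuild() (the distl/distr computation and clamping) can be tried in isolation. The following is a stand-alone sketch, not part of the posted patch: `toy_prebuild_window`, `struct toy_window` and `TOY_PAGE_SIZE` are hypothetical names, a 4096-byte page is assumed, and the entry size is passed in as a parameter (8 bytes is used in the example below purely as an assumption).

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u   /* assumed page size */

struct toy_window {
    int first;  /* first candidate slot, relative to the faulting one (<= 0) */
    int end;    /* one past the last candidate slot, relative (>= 1) */
};

/*
 * Hypothetical stand-alone version of the clamping in sh_prebuild().
 * entry_addr: address of the faulting shadow L1 entry
 * entry_size: size in bytes of one entry
 * countl/countr: requested neighbours to the left and right
 */
struct toy_window toy_prebuild_window(uintptr_t entry_addr,
                                      unsigned int entry_size,
                                      unsigned int countl,
                                      unsigned int countr)
{
    unsigned int offset = (unsigned int)(entry_addr & (TOY_PAGE_SIZE - 1));
    unsigned int left, right;
    struct toy_window w;

    assert(entry_size != 0 && offset % entry_size == 0);

    left  = offset / entry_size;                    /* slots before ours */
    right = (TOY_PAGE_SIZE - offset) / entry_size;  /* ours plus slots after */

    /* Never walk past the page boundary or the requested counts. */
    if ( left > countl )
        left = countl;
    if ( right > countr + 1 )
        right = countr + 1;

    w.first = -(int)left;
    w.end   = (int)right;
    return w;
}
```

With 8-byte entries and countl = countr = 3, an entry in slot 2 of its page gets the window [-2, 4): only two slots exist to its left. An entry in slot 510 gets [-3, 2): only one slot follows it before the page ends.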
Tim Deegan
2009-Jun-25 13:23 UTC
Re: [Xen-devel] Questions about sh_prefetch and log_dirty
Hi,

At 13:56 +0100 on 25 Jun (1245938205), David Knight wrote:
> I made a mistake when I explain my problem. My code can successfully
> migrate the domU TO MODIFIED Xen. But when a domU is migrated
> from Modified Xen to a Unmodified Xen. The clock is frozen.

OK, that makes much more sense. It's the log-dirty code in the Xen you're migrating from that will cause the problems.

> Here is my code, most of it is just a copy of sh_prefetch();

You've done some pretty odd things to it. Why did you remove all the comments, for example? Why did you replace all the array indirections with pointer arithmetic?

Anyway, the main thing that stands out is that you don't handle the OOS optimization at all. Are you running a very old version of Xen?

Cheers,

Tim.
David Knight
2009-Jun-25 14:37 UTC
Re: [Xen-devel] Questions about sh_prefetch and log_dirty
Tim Deegan wrote:
> You've done some pretty odd things to it. Why did you remove all the
> comments, for example? Why did you replace all the array indirections
> with pointer arithmetic?

Would pointer arithmetic be a problem? Anyway, I didn't get any warnings during compilation.

> Anyway, the main thing that stands out is that you don't handle the OOS
> optimization at all. Are you running a very old version of Xen?

I am running Xen 3.3.1. Our optimization is mainly for PV guests, so I removed the OOS code. I know there will be problems when using HVM.
Tim Deegan
2009-Jun-25 14:51 UTC
Re: [Xen-devel] Questions about sh_prefetch and log_dirty
At 15:37 +0100 on 25 Jun (1245944264), David Knight wrote:
>> You've done some pretty odd things to it. Why did you remove all the
>> comments, for example? Why did you replace all the array indirections
>> with pointer arithmetic?
>
> Would pointer arithmetic become a problem? anyway, I didn't get any
> warning during compilation.

No, it just struck me as odd that you would bother to change it.

> I am running XEN 3.3.1, our optimization is mainly for PV guest, so I
> remove OOS code. I know there will be problem when using HVM.

In that case I don't see anything obvious that would cause the crash.

Tim.
Thanks a lot.