James Song
2008-Nov-27  10:21 UTC
RE: RE: [Xen-devel] when timer go back in dom0 save and restore or migrate, PV domain hung
Hi,
     OK, take two machines, A and B, where the system time of A is ahead of B,
so the wc_sec of A is also larger than that of B. When a PV domain on A
migrates to B, we don't update that PV domain's wc_sec to match B. Now look at
the PV domain's kernel:
    xen_sched_clock() in arch/x86/xen/time.c and xen_clocksource_read() in
arch/x86/kernel/time_32-xen.c
  You will find that the state_entry_time of its vcpu was initialized on
machine A, so after migration it is larger than "now" on machine B. So there
is no scheduling and no system update in the guest OS.
I don't know whether I have described it clearly.
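The hazard described above can be illustrated with a small standalone sketch. This is not the kernel's xen_sched_clock(); the helper names and sample values are assumptions made purely for illustration. It shows what an elapsed-time computation yields when the entry timestamp was taken on machine A and "now" is read on machine B, whose clock is behind.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: elapsed nanoseconds using plain unsigned
 * subtraction. If entry_ns is later than now_ns, the result wraps
 * around to a value close to 2^64. */
static uint64_t elapsed_unsigned(uint64_t now_ns, uint64_t entry_ns)
{
    return now_ns - entry_ns;
}

/* Hypothetical helper: same computation, but a timestamp that lies in
 * the future is treated as "no time elapsed". */
static uint64_t elapsed_clamped(uint64_t now_ns, uint64_t entry_ns)
{
    if (now_ns < entry_ns)
        return 0;
    return now_ns - entry_ns;
}

/* Usage example with made-up values: the stamp from machine A is 3 s
 * ahead of "now" on machine B. */
static void self_check(void)
{
    const uint64_t entry_on_a = 5000000000ULL;
    const uint64_t now_on_b   = 2000000000ULL;

    assert(elapsed_unsigned(now_on_b, entry_on_a) > UINT64_MAX / 2);
    assert(elapsed_clamped(now_on_b, entry_on_a) == 0);
}
```

Either result is wrong from the guest's point of view: the wrapped value looks like an enormous interval, and the clamped value looks like time standing still until machine B's clock passes the old stamp.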
>>> "Tian, Kevin" <kevin.tian@intel.com> 08/11/27 PM
9:18 >>>
there's a clock_was_set called for each settimeofday. In the latest kernel,
clock_was_set will adjust the CLOCK_REALTIME queue accordingly, while in 2.6.18
it's defined as a nop. That is to say, the current domU would be unable to
handle a wallclock change, but a newer kernel with pvops could.
---- yes, it works for FV, but for a modified PV domain, maybe not.

for the issue reported in the original thread, I agree that James should dig
into the hang and explain the exact reason first.

Thanks
Kevin
From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
Sent: Wednesday, November 26, 2008 10:58 PM
To: Tian, Kevin; 'James Song'; xen-devel@lists.xensource.com
Subject: Re: [Xen-devel] when timer go back in dom0 save and restore or
migrate, PV domain hung

So what happens if someone changes wallclock using 'date'?
That's basically kind of what will appear to happen when s/r occurs.

 -- Keir
On 26/11/08 14:32, "Tian, Kevin" <kevin.tian@intel.com> wrote:

hrtimer supports two timer bases: CLOCK_MONOTONIC and CLOCK_REALTIME.
wall_to_monotonic is only added in the former case; for the latter, TOD is
used directly, per my reading. I did a quick search, and it looks like futex
and ntp are using CLOCK_REALTIME. Also, there's one vsyscall gate which can
pass CLOCK_REALTIME from the caller too.

Thanks,
Kevin
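The difference between the two clock bases is also visible from user space through the POSIX clock_gettime() interface. The sketch below is only an illustration, not hrtimer code; the helper names are hypothetical. It reads both clocks and computes the time remaining until an absolute deadline on a chosen clock.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Read the given POSIX clock and return it in nanoseconds,
 * or -1 if clock_gettime() fails. */
static int64_t read_clock_ns(clockid_t id)
{
    struct timespec ts;

    if (clock_gettime(id, &ts) != 0)
        return -1;
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Hypothetical helper: nanoseconds left until an absolute deadline
 * expressed on the given clock. With CLOCK_REALTIME the answer jumps
 * when the wall clock is stepped; with CLOCK_MONOTONIC it does not. */
static int64_t ns_until(clockid_t id, int64_t deadline_ns)
{
    return deadline_ns - read_clock_ns(id);
}

/* Usage example: two successive monotonic readings never go backwards. */
static void self_check(void)
{
    int64_t first  = read_clock_ns(CLOCK_MONOTONIC);
    int64_t second = read_clock_ns(CLOCK_MONOTONIC);

    assert(first >= 0);
    assert(second >= first);
}
```

A deadline computed against CLOCK_REALTIME before the wall clock is set backwards ends up further in the future afterwards, which is the situation discussed in this thread.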
    
 
From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
Sent: Wednesday, November 26, 2008 10:26 PM
To: Tian, Kevin; 'James Song'; xen-devel@lists.xensource.com
Subject: Re: [Xen-devel] when timer go back in dom0 save and restore or
migrate, PV domain hung

hrtimers add wall_to_monotonic to xtime to get a timesource that doesn't (or
shouldn't!) warp.

 -- Keir
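The idea behind wall_to_monotonic can be shown with a simplified model. This is a sketch under stated assumptions, not the kernel's timekeeping code: times are plain nanosecond counters rather than struct timespec, and the struct and function names are invented for illustration. Only xtime and wall_to_monotonic are names taken from the discussion.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model (assumption): nanosecond counters instead of the
 * kernel's timespec values. */
struct timekeeper_model {
    int64_t xtime_ns;             /* wall-clock time */
    int64_t wall_to_monotonic_ns; /* offset added to xtime */
};

/* Monotonic time is the wall clock plus the offset. */
static int64_t model_monotonic(const struct timekeeper_model *tk)
{
    return tk->xtime_ns + tk->wall_to_monotonic_ns;
}

/* Stepping the wall clock moves xtime by delta and the offset by
 * -delta, so the sum seen by monotonic users is unchanged. */
static void model_settimeofday(struct timekeeper_model *tk, int64_t new_wall_ns)
{
    int64_t delta = new_wall_ns - tk->xtime_ns;

    tk->xtime_ns += delta;
    tk->wall_to_monotonic_ns -= delta;
}

/* Usage example: set the wall clock backwards and check that the
 * monotonic value did not move. */
static void self_check(void)
{
    struct timekeeper_model tk = { 1000, -400 };
    int64_t before = model_monotonic(&tk);

    model_settimeofday(&tk, 250);
    assert(tk.xtime_ns == 250);
    assert(model_monotonic(&tk) == before);
}
```

Timers queued on the monotonic base are therefore unaffected by a wall-clock step, while timers queued against the wall clock itself are not protected this way.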
On 26/11/08 14:20, "Tian, Kevin" <kevin.tian@intel.com> wrote:

How about hrtimers? One mode is CLOCK_REALTIME, which uses getnstimeofday
for expiration. Once system time is changed, either on the local machine or
on a new machine, that expiration can't be adjusted. But I'm not sure whether
it still makes sense to try hrtimers in a guest.

Thanks
Kevin
 
        
 
 
From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
Sent: Wednesday, November 26, 2008 10:11 PM
To: Tian, Kevin; 'James Song'; xen-devel@lists.xensource.com
Subject: Re: [Xen-devel] when timer go back in dom0 save and restore or
migrate, PV domain hung

The problem hasn't been fully explained, but I can say that PV guests expect
system time to jump across s/r and deal with that. For example, Linux doesn't
use Xen system time internally, but uses its progress to periodically update
jiffies, which does not warp across s/r.

We have had problems corrupting wc_sec/wc_nsec in xc_domain_restore.c, but
that was fixed some time ago.

 -- Keir
On 26/11/08 14:00, "Tian, Kevin" <kevin.tian@intel.com> wrote:

This is not an s/r or lm specific issue. For example, system time can be
changed even while a PV guest is running. Your patch only hacks the restore
point once, and wc_sec can still be changed later when system time is changed
on the fly again.

IIRC, a PV guest can catch up with a wall clock change in the timer interrupt,
and time_resume will sync the internally processed system time with the new
system time after restore. But I'm not sure whether that's enough. Actually,
the more interesting issue is the uptime difference. For example, a timer
with an expiration calculated from the previous system time may wait nearly
forever if the uptimes of the two boxes differ a lot. But I think such an
issue should have been considered already, e.g. with some user tool
assistance. I think Keir can comment better here.

BTW, do you happen to know what exactly dom0 hangs on? Some busy loop to
catch up time, or a long delay until some critical timer expiration?

Thanks,
Kevin
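The catch-up and resync behaviour mentioned above can be modelled in a few lines. This is a rough sketch under assumptions, not the actual guest kernel code: the struct, the function names, and the 100 Hz tick are all hypothetical, and only the terms "processed system time" and time_resume come from the message.

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_TICK 10000000LL /* assumed 100 Hz tick */

/* Hypothetical model of the guest's tick accounting. */
struct guest_time_model {
    int64_t  processed_system_time; /* system time already accounted, ns */
    uint64_t jiffies;
};

/* Timer interrupt: turn system-time progress into whole ticks. If the
 * system time is behind what was already processed, nothing advances. */
static void model_timer_interrupt(struct guest_time_model *g, int64_t system_time_ns)
{
    int64_t delta = system_time_ns - g->processed_system_time;

    while (delta >= NS_PER_TICK) {
        delta -= NS_PER_TICK;
        g->processed_system_time += NS_PER_TICK;
        g->jiffies++;
    }
}

/* Resume: restart accounting from the new host's system time while
 * leaving jiffies untouched, so jiffies does not warp. */
static void model_time_resume(struct guest_time_model *g, int64_t system_time_ns)
{
    g->processed_system_time = system_time_ns;
}

/* Usage example: 35 ms of progress yields 3 ticks; after a resume on a
 * host whose system time is far behind, ticks continue normally. */
static void self_check(void)
{
    struct guest_time_model g = { 0, 0 };

    model_timer_interrupt(&g, 35000000LL);
    assert(g.jiffies == 3);

    model_time_resume(&g, 1000000LL);
    model_timer_interrupt(&g, 21000000LL);
    assert(g.jiffies == 5);
}
```

In this model, skipping the resync step leaves the guest with no tick progress until the new host's system time overtakes the old processed value, which is one way a guest could appear hung.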
 
 
            
 
 
 
From: xen-devel-bounces@lists.xensource.com
[mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of James Song
Sent: Tuesday, November 25, 2008 4:02 PM
To: xen-devel@lists.xensource.com
Subject: [Xen-devel] when timer go back in dom0 save and restore or migrate,
PV domain hung

Hi,
   I find that a PV domain hangs when we take these steps:
         1, save a PV domain
         2, change the system time of the PV domain back
         3, restore the PV domain
        or
         1, migrate a PV domain from Machine A to Machine B
         2, the system time of Machine B is behind that of Machine A.
   The problem is that wc_sec changes when the system time is changed in dom0,
or when restoring on a machine with a slower system time, but when restoring,
Xen doesn't restore the wc_sec of shared_info from xenstore and uses the
native one instead. So the guest OS hangs.
This patch works for this issue.
 Thanks
 -- Song Wei
diff -r a5ed0dbc829f tools/libxc/xc_domain_restore.c
--- a/tools/libxc/xc_domain_restore.c Tue Nov 18 14:34:14 2008 +0800
+++ b/tools/libxc/xc_domain_restore.c Fri Nov 21 17:34:15 2008 +0800
@@ -328,6 +328,16 @@
 
     /* For info only */
     nr_pfns = 0;
+    //jsong@novell.com, james song
+    memset(&domctl, 0, sizeof(domctl));
+    domctl.domain = dom;
+    domctl.cmd    = XEN_DOMCTL_restoredomain;
+    frc = do_domctl(xc_handle, &domctl);
+    if ( frc != 0 )
+    {
+        ERROR("Unable to set flag of restore.");
+        goto out;
+    }
 
     if ( read_exact(io_fd, &p2m_size, sizeof(unsigned long)) )
     {
@@ -1120,6 +1130,8 @@
 
     /* restore saved vcpu_info and arch specific info */
     MEMCPY_FIELD(new_shared_info, old_shared_info, vcpu_info);
+    MEMCPY_FIELD(new_shared_info, old_shared_info, wc_nsec);
+    MEMCPY_FIELD(new_shared_info, old_shared_info, wc_sec);
     MEMCPY_FIELD(new_shared_info, old_shared_info, arch);
 
     /* clear any pending events and the selector */
diff -r a5ed0dbc829f xen/arch/x86/time.c
--- a/xen/arch/x86/time.c Tue Nov 18 14:34:14 2008 +0800
+++ b/xen/arch/x86/time.c Fri Nov 21 17:34:15 2008 +0800
@@ -689,7 +689,6 @@
     wmb();
     (*version)++;
 }
-
 void update_vcpu_system_time(struct vcpu *v)
 {
     struct cpu_time *t;
@@ -703,7 +702,6 @@
 
     if ( u->tsc_timestamp == t->local_tsc_stamp )
         return;
-
     version_update_begin(&u->version);
 
     u->tsc_timestamp = t->local_tsc_stamp;
@@ -713,14 +711,19 @@
 
     version_update_end(&u->version);
 }
-
 void update_domain_wallclock_time(struct domain *d)
 {
     spin_lock(&wc_lock);
+    if(d->after_restore )
+    {
+        d->after_restore = 0;
+        goto out; //jsong@novell.com
+    }
     version_update_begin(&shared_info(d, wc_version));
     shared_info(d, wc_sec)  = wc_sec + d->time_offset_seconds;
     shared_info(d, wc_nsec) = wc_nsec;
     version_update_end(&shared_info(d, wc_version));
+out:
     spin_unlock(&wc_lock);
 }
 
@@ -751,7 +754,6 @@
     u64 x;
     u32 y, _wc_sec, _wc_nsec;
     struct domain *d;
-
     x = (secs * 1000000000ULL) + (u64)nsecs - system_time_base;
     y = do_div(x, 1000000000);
 
@@ -1050,7 +1052,6 @@
 struct tm wallclock_time(void)
 {
     uint64_t seconds;
-
     if ( !wc_sec )
         return (struct tm) { 0 };
 
diff -r a5ed0dbc829f xen/common/domctl.c
--- a/xen/common/domctl.c Tue Nov 18 14:34:14 2008 +0800
+++ b/xen/common/domctl.c Fri Nov 21 17:34:15 2008 +0800
@@ -24,7 +24,6 @@
 #include <asm/current.h>
 #include <public/domctl.h>
 #include <xsm/xsm.h>
-
 extern long arch_do_domctl(
     struct xen_domctl *op, XEN_GUEST_HANDLE(xen_domctl_t) u_domctl);
 
@@ -315,6 +314,16 @@
         ret = 0;
     }
     break;
+    case XEN_DOMCTL_restoredomain:
+    {
+        struct domain *d;
+        if ( (d = rcu_lock_domain_by_id(op->domain)) == NULL )
+            break;
+
+        d->after_restore = 1;
+        rcu_unlock_domain(d);
+        break;
+    }
 
     case XEN_DOMCTL_createdomain:
     {
diff -r a5ed0dbc829f xen/include/public/domctl.h
--- a/xen/include/public/domctl.h Tue Nov 18 14:34:14 2008 +0800
+++ b/xen/include/public/domctl.h Fri Nov 21 17:34:15 2008 +0800
@@ -61,6 +61,7 @@
 #define XEN_DOMCTL_destroydomain      2
 #define XEN_DOMCTL_pausedomain        3
 #define XEN_DOMCTL_unpausedomain      4
+#define XEN_DOMCTL_restoredomain     51
 #define XEN_DOMCTL_resumedomain      27
 
 #define XEN_DOMCTL_getdomaininfo      5
diff -r a5ed0dbc829f xen/include/xen/sched.h
--- a/xen/include/xen/sched.h Tue Nov 18 14:34:14 2008 +0800
+++ b/xen/include/xen/sched.h Fri Nov 21 17:34:15 2008 +0800
@@ -231,6 +231,7 @@
      * cause a deadlock. Acquirers don't spin waiting; they preempt.
      */
     spinlock_t hypercall_deadlock_mutex;
+    int after_restore; //jsong@novell.com
 };
 
 struct domain_setup_info
---------------------------------------------------------------------------------------------
 Thanks
--Song wei
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
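For readers unfamiliar with the version_update_begin()/version_update_end() pair that brackets the wc_sec and wc_nsec writes in the patch, the sketch below models that style of versioned update and a matching reader. It is an illustration under assumptions, not Xen's implementation: the struct is a stand-in for the wallclock fields of the shared info page, the function names are invented, the odd-while-updating convention is assumed, and the barrier is a compiler-only barrier in GCC/Clang syntax where real code would use the platform's memory barriers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the wallclock fields in shared info. */
struct wallclock_model {
    volatile uint32_t wc_version;
    volatile uint32_t wc_sec;
    volatile uint32_t wc_nsec;
};

/* Compiler barrier only (GCC/Clang syntax assumed). */
#define model_barrier() __asm__ __volatile__("" ::: "memory")

/* Writer: the version is odd while the fields are being changed and
 * even again once the update is complete. */
static void model_writer_update(struct wallclock_model *w, uint32_t sec, uint32_t nsec)
{
    w->wc_version++;
    model_barrier();
    w->wc_sec  = sec;
    w->wc_nsec = nsec;
    model_barrier();
    w->wc_version++;
}

/* Reader: retry until a snapshot was taken under an even version that
 * did not change while the fields were being read. */
static void model_reader_snapshot(const struct wallclock_model *w,
                                  uint32_t *sec, uint32_t *nsec)
{
    uint32_t version;

    do {
        version = w->wc_version;
        model_barrier();
        *sec  = w->wc_sec;
        *nsec = w->wc_nsec;
        model_barrier();
    } while ((version & 1) || version != w->wc_version);
}

/* Usage example (single-threaded, so the reader never has to retry). */
static void self_check(void)
{
    struct wallclock_model w = { 0, 0, 0 };
    uint32_t sec = 0, nsec = 0;

    model_writer_update(&w, 1227781275u, 500u);
    model_reader_snapshot(&w, &sec, &nsec);
    assert(w.wc_version == 2);
    assert(sec == 1227781275u && nsec == 500u);
}
```

Seen this way, the `goto out` added to update_domain_wallclock_time() skips one such update after a restore, so the guest keeps the wc_sec and wc_nsec values copied from the saved image until the next wallclock update.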
Keir Fraser
2008-Nov-27  10:51 UTC
Re: RE: RE: [Xen-devel] when timer go back in dom0 save and restore or migrate, PV domain hung
Might this be a pv_ops bug in newer Linux kernels? I don't really get what
you're describing though.

 -- Keir

On 27/11/08 10:21, "James Song" <jsong@novell.com> wrote:

> Hi,
>      OK, take two machines, A and B, where the system time of A is ahead of
> B, so the wc_sec of A is also larger than that of B. When a PV domain on A
> migrates to B, we don't update that PV domain's wc_sec to match B. Now look
> at the PV domain's kernel:
>     xen_sched_clock() in arch/x86/xen/time.c and xen_clocksource_read() in
> arch/x86/kernel/time_32-xen.c
>   You will find that the state_entry_time of its vcpu was initialized on
> machine A, so after migration it is larger than "now" on machine B. So there
> is no scheduling and no system update in the guest OS.
> I don't know whether I have described it clearly.
Jeremy Fitzhardinge
2008-Nov-27  17:51 UTC
Re: RE: RE: [Xen-devel] when timer go back in dom0 save and restore or migrate, PV domain hung
Keir Fraser wrote:
> Might this be a pv_ops bug in newer Linux kernels? I don't really get
> what you're describing though.
>
>  -- Keir
>
> On 27/11/08 10:21, "James Song" <jsong@novell.com> wrote:
>
>     Hi,
>     OK, take two machines, A and B, where the system time of A is ahead
>     of B, so the wc_sec of A is also larger than that of B. When a PV
>     domain on A migrates to B, we don't update that PV domain's wc_sec to
>     match B. Now look at the PV domain's kernel:
>         xen_sched_clock() in arch/x86/xen/time.c and
>     xen_clocksource_read() in arch/x86/kernel/time_32-xen.c
>     You will find that the state_entry_time of its vcpu was initialized
>     on machine A, so after migration it is larger than "now" on machine B.
>     So there is no scheduling and no system update in the guest OS.
>     I don't know whether I have described it clearly.
>

At one point I had some code in there to work out the delta between the
system timestamps before and after save/restore, but I think I ended up
deciding it wasn't necessary because the clocksource and clockevents get
reinitialized from scratch by the core clock code on resume.

I don't understand your mention of wc_sec, since the wallclock is only used
very occasionally, and never for scheduling. If this is in relation to the
Novell forward-port kernel, perhaps you should look at what the mainline
pvops Xen code does in this area.

    J