(Moving from offlist discussion.)

I'm interested in opinions... Assume there are four single-vcpu domains
A, B, C, D, running on a 2-CPU physical machine. We wish to test for
time skew on domain A. Assume B, C, and D are all running some workload
that attempts to fully saturate the (single) cpu.

1) Should the effect on domain A be essentially the same regardless of
   what load (e.g., compile, lmbench, or just "while(1);") is running
   in B, C, and D?
2) Should "xm sched-credit -d A -c 50" have the same result (i.e., no
   other domains need be run)?

If the load on other domains can affect time skew on domain A, this
raises isolation questions. And it makes time skew testing much harder
(What loads and real-customer situations can cause more skew?)

If the load on other domains canNOT affect time skew on domain A,
testing for time skew becomes a lot easier. (Use sched-credit instead
of launching multiple domains.)

Comments? My preliminary testing has been inconclusive.

Dan

-----Original Message-----
From: Dave Winchell [mailto:dwinchell@virtualiron.com]
Sent: Thursday, June 12, 2008 4:14 PM
To: dan.magenheimer@oracle.com
Cc: Dave Winchell
Subject: RE: xen hpet patch

Dan,

Usually forcing "out-of-context" is more stressful. I think doing it
with real domains under load is more realistic. However, the scheduling
thing may be equivalent - I just haven't looked into it or thought
about it.

thanks,
Dave

-----Original Message-----
From: Dan Magenheimer [mailto:dan.magenheimer@oracle.com]
Sent: Thu 6/12/2008 5:13 PM
To: Dave Winchell
Subject: RE: xen hpet patch

One more thought while waiting for compile and reboot:

Am I right that all of the policies are correcting for when a domain
"A" is out-of-context? There's nothing in any other domain "B" that can
account for any timer loss/gain in domain "A". The only reason we are
running other domains is to ensure that domain "A" is sometimes
out-of-context, and the more it is out-of-context, the more likely we
will observe a problem, correct?

If this is true, it doesn't matter what workload is run in the non-A
domains... as long as it is loading the CPU(s), thus ensuring that
domain A is sometimes not scheduled on any CPU.

And if all this is true, we may not need to run other domains at all...
running "xm sched-credit -d A -c 50" should result in domain A being
out-of-context at least half the time.

Dan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
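For concreteness, a minimal sketch of the two setups being compared (illustration, not from the original thread; domain name A and the CPU-hog one-liner are placeholders):

# Setup (1): run a CPU hog inside each of B, C and D, e.g. from a guest shell:
#   sh -c 'while true; do :; done'
#
# Setup (2): no other domains needed; just cap A at 50% of a physical CPU:
xm sched-credit -d A -c 50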
<<Roger Cruz.vcf>> Hello fellow developers,

I ran across a curious observation this week when trying to analyze the
Debian 4.0 VM kernel's printk timestamps against walltime. It looks as
if the kernel's timestamps are running slow.

On a reference debian 4.0 VM (created from Citrix templates), I ran the
following script after booting the OS with the "time" option in the
grub boot line.

You can see that the kernel time in the brackets (host up time in
seconds) should have given us a difference of 3600 seconds when run
over an hour period. I, however, got 3592.770286, a slip of 7+ seconds.
I've run a similar experiment and over the course of 12hrs, the kernel
timestamps lose 97 seconds!

So unless I'm misunderstanding the printk times, the problem is either
with the way the Linux OS converts TSC cycles to seconds not being very
accurate OR how the hypervisor feeds the VM-ified TSC counts to the
guest VM.

Note that I'm sure the walltime was correct as I could monitor the
printed walltime against an atomic clock in another window and it
always matched (within the precision of my eyes moving back and forth,
of course).

Have any of you seen something similar or have an explanation?

grub line:
title Debian GNU/Linux, kernel xenu-2.6.18.xs4.1.0.1168.6013
root (hd0,0)
kernel /boot/vmlinuz-xenu-2.6.18.xs4.1.0.1168.6013 root=/dev/xvda1 ro time
savedefault

#!/bin/bash
i="0"
while [ $i -lt 10000 ]
do
    date -u "+secs since 1970: %s, time:%c" > /dev/kmsg
    sleep 60
    i=$[$i + 1]
done

Reference Debian VM:
end time 91158.761414 - start time 87565.991128 = 3592.770286. It
should have been 3600 seconds.
Epoch time: 1213306731 - 1213303130 = 3601, close enough to 3600.

p20deb:~# ./timetest
[87565.991128] secs since 1970: 1213303130, time:Thu Jun 12 20:38:50 2008
[87625.865258] secs since 1970: 1213303190, time:Thu Jun 12 20:39:50 2008
[87685.745109] secs since 1970: 1213303250, time:Thu Jun 12 20:40:50 2008
[87745.624397] secs since 1970: 1213303310, time:Thu Jun 12 20:41:50 2008
[87805.504331] secs since 1970: 1213303370, time:Thu Jun 12 20:42:50 2008
[87865.383613] secs since 1970: 1213303430, time:Thu Jun 12 20:43:50 2008
[87925.263261] secs since 1970: 1213303490, time:Thu Jun 12 20:44:50 2008
[87985.142794] secs since 1970: 1213303550, time:Thu Jun 12 20:45:50 2008
[88045.022465] secs since 1970: 1213303610, time:Thu Jun 12 20:46:50 2008
[88104.901993] secs since 1970: 1213303670, time:Thu Jun 12 20:47:50 2008
[88164.781555] secs since 1970: 1213303730, time:Thu Jun 12 20:48:50 2008
[88224.661193] secs since 1970: 1213303790, time:Thu Jun 12 20:49:50 2008
[88284.540777] secs since 1970: 1213303850, time:Thu Jun 12 20:50:50 2008
[88344.420448] secs since 1970: 1213303910, time:Thu Jun 12 20:51:50 2008
[88404.299957] secs since 1970: 1213303970, time:Thu Jun 12 20:52:50 2008
[88464.179722] secs since 1970: 1213304030, time:Thu Jun 12 20:53:50 2008
[88524.059229] secs since 1970: 1213304090, time:Thu Jun 12 20:54:50 2008
[88583.938799] secs since 1970: 1213304150, time:Thu Jun 12 20:55:50 2008
[88643.818430] secs since 1970: 1213304210, time:Thu Jun 12 20:56:50 2008
[88703.698102] secs since 1970: 1213304270, time:Thu Jun 12 20:57:50 2008
[88763.577552] secs since 1970: 1213304330, time:Thu Jun 12 20:58:50 2008
[88823.457180] secs since 1970: 1213304390, time:Thu Jun 12 20:59:50 2008
[88883.336805] secs since 1970: 1213304450, time:Thu Jun 12 21:00:50 2008
[88943.216354] secs since 1970: 1213304510, time:Thu Jun 12 21:01:50 2008
[89003.095940] secs since 1970: 1213304570, time:Thu Jun 12 21:02:50 2008
[89062.975550] secs since 1970: 1213304630, time:Thu Jun 12 21:03:50 2008
[89122.855184] secs since 1970: 1213304690, time:Thu Jun 12 21:04:50 2008
[89182.734796] secs since 1970: 1213304750, time:Thu Jun 12 21:05:50 2008
[89242.614451] secs since 1970: 1213304810, time:Thu Jun 12 21:06:50 2008
[89302.494083] secs since 1970: 1213304870, time:Thu Jun 12 21:07:50 2008
[89362.373748] secs since 1970: 1213304930, time:Thu Jun 12 21:08:50 2008
[89422.253149] secs since 1970: 1213304990, time:Thu Jun 12 21:09:50 2008
[89482.132723] secs since 1970: 1213305050, time:Thu Jun 12 21:10:50 2008
[89542.012413] secs since 1970: 1213305110, time:Thu Jun 12 21:11:50 2008
[89601.892321] secs since 1970: 1213305171, time:Thu Jun 12 21:12:51 2008
[89661.771534] secs since 1970: 1213305231, time:Thu Jun 12 21:13:51 2008
[89721.651093] secs since 1970: 1213305291, time:Thu Jun 12 21:14:51 2008
[89781.530707] secs since 1970: 1213305351, time:Thu Jun 12 21:15:51 2008
[89841.410355] secs since 1970: 1213305411, time:Thu Jun 12 21:16:51 2008
[89901.289916] secs since 1970: 1213305471, time:Thu Jun 12 21:17:51 2008
[89961.169452] secs since 1970: 1213305531, time:Thu Jun 12 21:18:51 2008
[90021.049277] secs since 1970: 1213305591, time:Thu Jun 12 21:19:51 2008
[90080.928731] secs since 1970: 1213305651, time:Thu Jun 12 21:20:51 2008
[90140.808362] secs since 1970: 1213305711, time:Thu Jun 12 21:21:51 2008
[90200.687903] secs since 1970: 1213305771, time:Thu Jun 12 21:22:51 2008
[90260.567678] secs since 1970: 1213305831, time:Thu Jun 12 21:23:51 2008
[90320.447062] secs since 1970: 1213305891, time:Thu Jun 12 21:24:51 2008
[90380.326679] secs since 1970: 1213305951, time:Thu Jun 12 21:25:51 2008
[90440.206252] secs since 1970: 1213306011, time:Thu Jun 12 21:26:51 2008
[90500.085917] secs since 1970: 1213306071, time:Thu Jun 12 21:27:51 2008
[90559.965484] secs since 1970: 1213306131, time:Thu Jun 12 21:28:51 2008
[90619.845052] secs since 1970: 1213306191, time:Thu Jun 12 21:29:51 2008
[90679.724900] secs since 1970: 1213306251, time:Thu Jun 12 21:30:51 2008
[90739.604366] secs since 1970: 1213306311, time:Thu Jun 12 21:31:51 2008
[90799.483803] secs since 1970: 1213306371, time:Thu Jun 12 21:32:51 2008
[90859.363649] secs since 1970: 1213306431, time:Thu Jun 12 21:33:51 2008
[90919.243037] secs since 1970: 1213306491, time:Thu Jun 12 21:34:51 2008
[90979.122646] secs since 1970: 1213306551, time:Thu Jun 12 21:35:51 2008
[91039.002174] secs since 1970: 1213306611, time:Thu Jun 12 21:36:51 2008
[91098.881855] secs since 1970: 1213306671, time:Thu Jun 12 21:37:51 2008
[91158.761414] secs since 1970: 1213306731, time:Thu Jun 12 21:38:51 2008

Roger Cruz
Principal SW Engineer
Marathon Technologies Corp.
978-489-1153

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Whether machine load can affect an HVM guest's time synchronisation
really depends on how the guest OS manages time. If it depends on
'timely' timer ticks on a specific VCPU, for example, then there's
probably a limit to what we can do. Luckily it seems that many kernels
are fairly robust to missing ticks however -- using ticks just to kick
off queued work and to update system/wall-time stamps. If you want to
do a really thorough testing job I don't think you can ignore the
multi-domain case.

(1) and (2) below should probably behave similarly, but I'd expect that
(2) might result in more regular scheduling/descheduling of the domain
under test than (1) where the other domains run I/O workloads (and
hence have irregular CPU demand).

-- Keir

On 13/6/08 18:53, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:

> (Moving from offlist discussion.)
>
> I'm interested in opinions... Assume there are four single-vcpu
> domains A, B, C, D, running on a 2-CPU physical machine. We wish to
> test for time skew on domain A. Assume B, C, and D are all running
> some workload that attempts to fully saturate the (single) cpu.
>
> 1) Should the effect on domain A be essentially the same regardless
>    of what load (e.g., compile, lmbench, or just "while(1);") is
>    running in B, C, and D?
> 2) Should "xm sched-credit -d A -c 50" have the same result (i.e., no
>    other domains need be run)?
>
> If the load on other domains can affect time skew on domain A, this
> raises isolation questions. And it makes time skew testing much
> harder (What loads and real-customer situations can cause more skew?)
>
> If the load on other domains canNOT affect time skew on domain A,
> testing for time skew becomes a lot easier. (Use sched-credit instead
> of launching multiple domains.)
>
> Comments? My preliminary testing has been inconclusive.
>
> Dan
>
> [...]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Keir Fraser
2008-Jun-13 19:47 UTC
Re: [Xen-devel] Kernel printk timestamps and walltime drift
On 13/6/08 19:49, "Roger Cruz" <rcruz@marathontechnologies.com> wrote:

> You can see that the kernel time in the brackets (host up time in
> seconds) should have given us a difference of 3600 seconds when run
> over an hour period. I, however, got 3592.770286, a slip of 7+
> seconds. I've run a similar experiment and over the course of 12hrs,
> the kernel timestamps lose 97 seconds!
>
> So unless I'm misunderstanding the printk times, the problem is
> either with the way the Linux OS converts TSC cycles to seconds not
> being very accurate OR how the hypervisor feeds the VM-ified TSC
> counts to the guest VM.
>
> Note that I'm sure the walltime was correct as I could monitor the
> printed walltime against an atomic clock in another window and it
> always matched (within the precision of my eyes moving back and
> forth, of course).

Do you know how the Debian kernel is computing its kernel-log
timestamps?

One possibility here is that Xen system time is running at a slightly
different rate from wallclock (because the platform timer (PIT, HPET,
etc) is not running at exactly its stated frequency) and ntp running in
dom0 only keeps Xen's wallclock reference in sync -- it does not affect
Xen's system time. If the Debian kernel were perhaps based on jiffies
rather than xtime, I think that might explain the drift.

This hypothesis would require Xen's system time to be drifting at
around 2000ppm, which is not great. I would expect a platform timer to
be within a few 100ppm of its advertised frequency, really (and with
jitter of much less than 10ppm).

-- Keir

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
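For reference, the reported slip does work out to roughly the 2000ppm figure mentioned above; a quick back-of-the-envelope check (an awk one-liner added for illustration, not from the original thread):

awk 'BEGIN { printf "drift = %.0f ppm\n", (3600 - 3592.770286) / 3600 * 1e6 }'
# prints: drift = 2008 ppm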
Dan Magenheimer
2008-Jun-13 20:06 UTC
RE: [Xen-devel] Kernel printk timestamps and walltime drift
Hi Roger --

I'm assuming this is an hvm 32-bit 2.6.18 domain and you are running on
Xen 3.1.3 or newer?

If so, try setting clock=pit in your grub.conf file then after boot
test that it "took" with:

cat /sys/devices/system/clocksource/clocksource0/current_clocksource

If it is pit, try your test again.

Note that there's a lot of work going on trying to fix hvm timer
problems right now. There are many many different cases because Linux
has changed algorithms many times over the history of 2.6.

> -----Original Message-----
> From: xen-devel-bounces@lists.xensource.com
> [mailto:xen-devel-bounces@lists.xensource.com]On Behalf Of Roger Cruz
> Sent: Friday, June 13, 2008 12:49 PM
> To: xen-devel
> Subject: [Xen-devel] Kernel printk timestamps and walltime drift
>
> [...]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Roger Cruz
2008-Jun-13 20:10 UTC
RE: [Xen-devel] Kernel printk timestamps and walltime drift
Hi Keir,

Thanks for taking the time to answer my question. I do not know how the
kernel times are getting computed exactly. This is an area of Linux
which I'm not very familiar with.

Prior to my test on the reference debian VM, I performed exactly the
same test on a Linux VM we build based on the Citrix-provided Linux
2.6.18 sources (my understanding is that these are the sources dom0 is
using, correct me if wrong) and saw the same drift. That's why I
decided to do a test on a reference VM to make sure it was not caused
by our own VM.

When I looked at the Linux 2.6.18 sources to try to track down how
these times are computed, I found the following for an x86, 32-bit
platform...

MtcAbsDbgPrint -> vprintk -> printk_clock -> sched_clock -> rdtscll

MtcAbsTimeGetTimestampGmt -> current_kernel_time -> return timespec
variable "xtime"

"xtime" is in UTC and I believe it's updated as a result of the timer
interrupt in the sequence below:

do_timer_interrupt_hook -> do_timer -> update_times ->
update_wall_time -> updates "xtime"

Note that I'm not 100% sure of these call sequences since they were
derived by looking at the sources and not tracing it with any debugger.

This is how I came to conclude that the kernel timestamps (generated as
a result of us calling vprintk) are derived from the RDTSC instruction.
It is my understanding that the TSC values are derived by the
hypervisor from the actual CPU register and some computed offset. I
have yet to examine this code myself.

The walltime appears to be generated from periodic timer ticks that
update "xtime". These values have always been correct. Only the ones
that get generated with the vprintk appear to slip when compared to
this wall time.

Roger

-----Original Message-----
From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
Sent: Friday, June 13, 2008 3:48 PM
To: Roger Cruz; xen-devel
Subject: Re: [Xen-devel] Kernel printk timestamps and walltime drift

[...]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Keir Fraser
2008-Jun-13 20:31 UTC
Re: [Xen-devel] Kernel printk timestamps and walltime drift
On 13/6/08 21:10, "Roger Cruz" <rcruz@marathontechnologies.com> wrote:

> The walltime appears to be generated from periodic timer ticks that
> update "xtime". These values have always been correct. Only the ones
> that get generated with the vprintk appear to slip when compared to
> this wall time.

Yes, sched_clock() can drift relative to wall time. It's an internal
monotonic timesource for the kernel, and if it drifts it doesn't
usually matter. It's not visible to user space. So, that explains your
vprintk timestamp drift. I hope it doesn't bother you too much. :-)

-- Keir

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Roger Cruz
2008-Jun-13 21:05 UTC
RE: [Xen-devel] Kernel printk timestamps and walltime drift
I've performed this test on two VMs, which it's my understanding are
both PV, 32-bit, and compiled from the 2.6.18 sources. I am using
Citrix's XenEnterprise 4.1.0 software and you can see the VM's
configuration below, giving the version and the PV bootloader.

I applied your changes to the grub line but I don't believe it is
taking:

Prior to application:

p20deb:~# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
jiffies

after "clock=pit" added to my grub line:

title Debian GNU/Linux, kernel xenu-2.6.18.xs4.1.0.1168.6013
root (hd0,0)
kernel /boot/vmlinuz-xenu-2.6.18.xs4.1.0.1168.6013 root=/dev/xvda1 ro time clock=pit
savedefault

p20deb:~# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
jiffies

I ran the test with clock=pit and it still drifted.

[root@p20xen1 ~]# xe vm-list name-label=Debian\ Etch\ 4.0\ \(1\) params
uuid ( RO) : bc40e67c-2f85-5be1-66ef-31ae3b2ab8b3
name-label ( RW): Debian Etch 4.0 (1)
name-description ( RW):
user-version ( RW): 1
is-a-template ( RW): false
is-control-domain ( RO): false
power-state ( RO): halted
memory-dynamic-max ( RW): 268435456
memory-dynamic-min ( RW): 268435456
memory-static-max ( RW): 268435456
memory-static-min ( RW): 16777216
suspend-VDI-uuid ( RO): <not in database>
VCPUs-params (MRW):
VCPUs-max ( RW): 1
VCPUs-at-startup ( RW): 1
actions-after-shutdown ( RW): Destroy
actions-after-reboot ( RW): Restart
actions-after-crash ( RW): Restart
console-uuids (SRO): ecf5b121-6a0b-9a6d-fb98-62c608afaa24
platform (MRW): nx: false; acpi: true; apic: true; pae: true
allowed-operations (SRO): clone; start; start_on; export; destroy; make_into_template
current-operations (SRO):
allowed-VBD-devices (SRO): <expensive field>
allowed-VIF-devices (SRO): <expensive field>
possible-hosts ( RO): <expensive field>
HVM-boot-policy ( RW):
HVM-boot-params (MRW):
HVM-shadow-multiplier ( RW): 1.000
PV-kernel ( RW):
PV-ramdisk ( RW):
PV-args ( RW):
PV-legacy-args ( RW):
PV-bootloader ( RW): pygrub
PV-bootloader-args ( RW):
last-boot-CPU-flags ( RO):
resident-on ( RO): <not in database>
affinity ( RW): 1c29b960-78b2-46cc-a808-8cb454a3cece
other-config (MRW): last_shutdown_time: 20080612T20:16:56Z; last_shutdown_action: Restart; last_shutdown_initiator: internal; last_shutdown_reason: rebooted; linux_template: true; mac_seed: e381c9e3-0346-09d6-d92e-815f5267bbf8
dom-id ( RO): -1
recommendations ( RO): <restrictions><restriction field="memory-static-max" max="34359738368" /><restriction field="vcpus-max" max="8" /><restriction property="number-of-vbds" max="7" /><restriction property="number-of-vifs" max="7" /></restrictions>
xenstore-data (MRW):
start-time ( RO): <unknown time>
install-time ( RO): <unknown time>
memory-actual ( RO): <not in database>
VCPUs-number ( RO): <not in database>
VCPUs-utilisation (MRO): <not in database>
vm-metrics-last-updated ( RO): <not in database>
os-version (MRO): name: Debian 4.0; uname: 2.6.18.xs4.1.0.1168.6013; distro: debian; major: 4; minor: 0
PV-drivers-version (MRO): major: 4; minor: 1; micro: 0
PV-drivers-up-to-date ( RO): true
memory (MRO):
disks (MRO):
networks (MRO):
other (MRO):
guest-metrics-last-updated ( RO): 19700101T00:00:00Z

-----Original Message-----
From: Dan Magenheimer [mailto:dan.magenheimer@oracle.com]
Sent: Friday, June 13, 2008 4:06 PM
To: Roger Cruz; xen-devel
Subject: RE: [Xen-devel] Kernel printk timestamps and walltime drift

Hi Roger --

I'm assuming this is an hvm 32-bit 2.6.18 domain and you are running on
Xen 3.1.3 or newer?
If so, try setting clock=pit in your grub.conf file then after boot
test that it "took" with:

cat /sys/devices/system/clocksource/clocksource0/current_clocksource

If it is pit, try your test again.

Note that there's a lot of work going on trying to fix hvm timer
problems right now. There are many many different cases because Linux
has changed algorithms many times over the history of 2.6.

> -----Original Message-----
> From: xen-devel-bounces@lists.xensource.com
> [mailto:xen-devel-bounces@lists.xensource.com]On Behalf Of Roger Cruz
> Sent: Friday, June 13, 2008 12:49 PM
> To: xen-devel
> Subject: [Xen-devel] Kernel printk timestamps and walltime drift
>
> [...]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
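As the next message notes, the clock=pit suggestion applies to HVM guests, and the listing above already shows this one is PV (empty HVM-boot-policy, PV-bootloader pygrub). A quick way to query just that field with the xe CLI (sketch only; the uuid is the one from the listing above):

# An empty HVM-boot-policy indicates a PV guest
xe vm-list uuid=bc40e67c-2f85-5be1-66ef-31ae3b2ab8b3 params=HVM-boot-policy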
Dan Magenheimer
2008-Jun-13 21:21 UTC
RE: [Xen-devel] Kernel printk timestamps and walltime drift
Hi Roger --

Sorry, I made a bad assumption... the solution I provided works for hvm
domains. For pvm domains, the guest clock will generally be determined
by xen system time, and as Keir said, if the underlying clock xen is
using skews from wallclock time, then xen system time will skew also.

I think the solution for this situation is to ensure that
/proc/sys/xen/independent_wallclock is set to 0 for each of your pvm
domains and run ntpd on domain0.

Dan

> -----Original Message-----
> From: Roger Cruz [mailto:rcruz@marathontechnologies.com]
> Sent: Friday, June 13, 2008 3:05 PM
> To: dan.magenheimer@oracle.com; xen-devel
> Subject: RE: [Xen-devel] Kernel printk timestamps and walltime drift
>
> [...]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
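A minimal sketch of applying the suggestion above from inside a PV guest (illustration only; the /proc path is the one named in the message):

# Inside each PV guest: make the guest's wallclock follow Xen's,
# which ntpd running in domain0 keeps correct (0 = follow Xen, 1 = independent).
cat /proc/sys/xen/independent_wallclock
echo 0 > /proc/sys/xen/independent_wallclock
# ...and make sure ntpd is running in domain0.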
Keir Fraser
2008-Jun-13 21:36 UTC
Re: [Xen-devel] Kernel printk timestamps and walltime drift
On 13/6/08 22:21, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:

> Hi Roger --
>
> Sorry, I made a bad assumption... the solution I provided works for
> hvm domains. For pvm domains, the guest clock will generally be
> determined by xen system time, and as Keir said, if the underlying
> clock xen is using skews from wallclock time, then xen system time
> will skew also.
>
> I think the solution for this situation is to ensure that
> /proc/sys/xen/independent_wallclock is set to 0 for each of your pvm
> domains and run ntpd on domain0.

Since sched_clock() is not built on top of xtime, this won't help.
sched_clock()'s implementation is tightly bound to Xen system time in
our Linux patchset. It could be changed, but really I think these
timestamps are the only noticeable artefact.

-- Keir

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Roger Cruz
2008-Jun-13 22:02 UTC
RE: [Xen-devel] Kernel printk timestamps and walltime drift
Keir,

Was I correct in my understanding on how the timestamp is being
obtained via this call sequence?

-> vprintk -> printk_clock -> sched_clock -> rdtscll

In other words, does your patchset use the rdtscll instruction on an
i386, pv, 32-bit linux to determine that time? If not, how is it
derived (what file/function should I look at for sched_clock?).

While you are right that the only artifact that we have observed is the
drifting timestamps, a future product of ours may need to have an
accurate TSC presentation to the VM, and if the time is being derived
from the TSC as I'm conjecturing, then this drift is something we're
going to have to worry about then.

Thank you Keir and Dan

-----Original Message-----
From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
Sent: Friday, June 13, 2008 5:36 PM
To: dan.magenheimer@oracle.com; Roger Cruz
Cc: xen-devel
Subject: Re: [Xen-devel] Kernel printk timestamps and walltime drift

[...]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Keir Fraser
2008-Jun-13 22:09 UTC
Re: [Xen-devel] Kernel printk timestamps and walltime drift
On 13/6/08 23:02, "Roger Cruz" <rcruz@marathontechnologies.com> wrote:

> Was I correct in my understanding on how the timestamp is being
> obtained via this call sequence?
>
> -> vprintk -> printk_clock -> sched_clock -> rdtscll
>
> In other words, does your patchset use the rdtscll instruction on an
> i386, pv, 32-bit linux to determine that time? If not, how is it
> derived (what file/function should I look at for sched_clock?).

The guest calculates current Xen system time based on extrapolating
from a Xen-supplied timestamp using current TSC:

sys_time = sys_timestamp + (RDTSC - tsc_timestamp) * scale_factor

where three terms on the RHS of the formula are supplied by Xen in the
shared_info page, and the fourth is returned by the RDTSC CPU
instruction.

> While you are right that the only artifact that we have observed is
> the drifting timestamps, a future product of ours may need to have an
> accurate TSC presentation to the VM, and if the time is being derived
> from the TSC as I'm conjecturing, then this drift is something we're
> going to have to worry about then.

Depends what you mean by accurate. PV guests see the underlying host
TSC directly. In that sense what they see is 100% accurate!

-- Keir

> Thank you Keir and Dan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
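A toy numeric illustration of that extrapolation (all numbers are made up for illustration and are not real shared_info values; a 2.4GHz TSC is assumed, so the scale factor is about 1e9/2.4e9 ns per tick):

# sys_timestamp = 1e9 ns, tsc_timestamp = 2.4e9 ticks, RDTSC now reads 2.64e9
awk 'BEGIN { printf "sys_time = %.0f ns\n", 1e9 + (2.64e9 - 2.4e9) * (1e9 / 2.4e9) }'
# prints: sys_time = 1100000000 ns (the Xen-supplied timestamp plus 0.1 s)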
OK, I guess I can buy that.

Running real domains will cause more scheduling "artifacts": domain A
will likely run more frequently for shorter periods of time than if
sched-credit-capped with no other domains running. And more frequent
"spurts" could result in more fractional ticks in domain A that
potentially could get lost or miscounted.

But is there anything else?

Suppose the credit scheduler were modified to optionally schedule
random "spurts" when the sum of caps was less than the total available
CPU. Would you then expect the results to be essentially the same?

And, on an N pcpu machine, if a single "interference domain" was run
with N threads that randomly absorb much/most of each VCPU (but do no
I/O), would that be just as good as running multiple interference
domains?

I'm not just wondering this for curiosity. All of my recent hpet
testing used multiple domains that ate CPU, but did no I/O.

Thanks,
Dan

> -----Original Message-----
> From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
> Sent: Friday, June 13, 2008 1:39 PM
> To: dan.magenheimer@oracle.com; Dave Winchell; Ben Guthro; xen-devel
> Subject: Re: Isolation and time
>
> Whether machine load can affect an HVM guest's time synchronisation
> really depends on how the guest OS manages time. If it depends on
> 'timely' timer ticks on a specific VCPU, for example, then there's
> probably a limit to what we can do. Luckily it seems that many
> kernels are fairly robust to missing ticks however -- using ticks
> just to kick off queued work and to update system/wall-time stamps.
> If you want to do a really thorough testing job I don't think you can
> ignore the multi-domain case.
>
> (1) and (2) below should probably behave similarly, but I'd expect
> that (2) might result in more regular scheduling/descheduling of the
> domain under test than (1) where the other domains run I/O workloads
> (and hence have irregular CPU demand).
>
> -- Keir
>
> On 13/6/08 18:53, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:
>
> > [...]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
On 14/6/08 03:20, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:

> But is there anything else?
>
> Suppose the credit scheduler were modified to optionally schedule
> random "spurts" when the sum of caps was less than the total
> available CPU. Would you then expect the results to be essentially
> the same?

I wouldn't expect another domain's workload to affect the test domain's
time synchronisation except so far as the workload affects the domain's
CPU demand over time. I imagine you could therefore simulate that CPU
demand process inside the scheduler. How hard that is presumably
depends on how accurate you want the simulation to be.

-- Keir

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
That's what I thought. Sorry to belabour, but that leads to one more
question:

If one were to put an appropriately random CPU-only load on every
processor on domain0 (assuming domain0 runs on all physical
processors), then this would presumably be sufficient to exercise a
domain-under-test's time synchronization, correct?

Thanks,
Dan

> -----Original Message-----
> From: xen-devel-bounces@lists.xensource.com
> [mailto:xen-devel-bounces@lists.xensource.com]On Behalf Of Keir Fraser
> Sent: Saturday, June 14, 2008 2:59 AM
> To: dan.magenheimer@oracle.com; Dave Winchell; Ben Guthro; xen-devel
> Subject: [Xen-devel] Re: Isolation and time
>
> [...]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
On 14/6/08 16:16, "Dan Magenheimer" <dan.magenheimer@oracle.com> wrote:

> That's what I thought. Sorry to belabour, but that leads to one more
> question:
>
> If one were to put an appropriately random CPU-only load on every
> processor on domain0 (assuming domain0 runs on all physical
> processors), then this would presumably be sufficient to exercise a
> domain-under-test's time synchronization, correct?

Yes, I reckon that could work pretty well.

-- Keir

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
> Yes, I reckon that could work pretty well.
>
> -- Keir

I'm pretty sure the attached script when run in domain0 will generate
an appropriately random CPU load on every domain0 vcpu. For each vcpu,
it randomly eats CPU for a fraction of a second, then sleeps for the
same amount of time, then repeats. Thus any timer-under-test domain
should see random unscheduled intervals which hopefully occur at random
fractions of its timer tick.

I'll probably use this for future timer testing, so let me know if you
see problems with this approach.

Thanks,
Dan

P.S. May be dependent on RH-based dom0.

> -----Original Message-----
> From: Keir Fraser [mailto:keir.fraser@eu.citrix.com]
> Sent: Saturday, June 14, 2008 9:25 AM
>
> [...]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
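The attached script itself is not preserved in this archive. A minimal sketch of the behaviour described above (per-vcpu random burn/sleep cycles), assuming bash, GNU date and taskset are available in dom0:

#!/bin/bash
# Sketch only, not the original attachment: on each dom0 VCPU, alternately
# busy-wait and then sleep for the same random fraction of a second, forever.
NCPUS=$(grep -c ^processor /proc/cpuinfo)

burn_and_sleep() {
    while true; do
        ms=$(( (RANDOM % 900) + 100 ))                 # 100..999 ms
        end=$(( $(date +%s%N) + ms * 1000000 ))
        while (( $(date +%s%N) < end )); do :; done    # eat CPU for ms milliseconds
        sleep "0.$(printf '%03d' "$ms")"               # then sleep the same amount
    done
}
export -f burn_and_sleep

for (( cpu = 0; cpu < NCPUS; cpu++ )); do
    taskset -c "$cpu" bash -c burn_and_sleep &         # one pinned instance per VCPU
done
wait

Each pinned instance keeps its vcpu busy roughly half the time on a random duty cycle, which is the property the test above relies on.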