Atsushi SAKAI
2007-Jun-15 09:47 UTC
[Xen-devel] [PATCH] Fix clock_gettime to increment monotonically on PV-domain/x86
Hi, Keir This patch intends to increment do_gettimeofday monotonically. This is for PV-domain/x86. Signed-off-by: Atsushi SAKAI <sakaia@jp.fujitsu.com> By applying patch, the time rollback for SMP is fixed. This is important fixes for database software. Thanks Atsushi SAKAI _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2007-Jun-15 10:19 UTC
Re: [Xen-devel] [PATCH] Fix clock_gettime to increment monotonically onPV-domain/x86
I''m afraid this isn''t MP-safe - you''re in a (pseudo-)read-locked section of code, yet write a global variable (and even in two pieces). Jan>>> Atsushi SAKAI <sakaia@jp.fujitsu.com> 15.06.07 11:47 >>>Hi, Keir This patch intends to increment do_gettimeofday monotonically. This is for PV-domain/x86. Signed-off-by: Atsushi SAKAI <sakaia@jp.fujitsu.com> By applying patch, the time rollback for SMP is fixed. This is important fixes for database software. Thanks Atsushi SAKAI _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2007-Jun-15 10:26 UTC
Re: [Xen-devel] [PATCH] Fix clock_gettime to increment monotonically onPV-domain/x86
Yeah, it needs a spinlock. Which is a shame, but it''s still going to be much faster than trapping into the hypervisor, which is the only other way we''re going to be able to achieve guaranteed monotonic time. We could avoid the lock if we guaranteed monotonic time only per task (thread). Or is that not really good enough? :-) -- Keir On 15/6/07 11:19, "Jan Beulich" <jbeulich@novell.com> wrote:> I''m afraid this isn''t MP-safe - you''re in a (pseudo-)read-locked section of > code, > yet write a global variable (and even in two pieces). Jan > >>>> Atsushi SAKAI <sakaia@jp.fujitsu.com> 15.06.07 11:47 >>> > Hi, Keir > > This patch intends to increment do_gettimeofday monotonically. > This is for PV-domain/x86. > > Signed-off-by: Atsushi SAKAI <sakaia@jp.fujitsu.com> > > By applying patch, the time rollback for SMP is fixed. > This is important fixes for database software. > > Thanks > Atsushi SAKAI > > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2007-Jun-15 11:12 UTC
Re: [Xen-devel] [PATCH] Fix clock_gettime to increment monotonically onPV-domain/x86
An alternative to a spin lock would be to use a 64-bit variable storing the raw nanosecond value, and use cmpxchg to check/update it. I did it this way for the clocksource monotonicity: static cycle_t xen_clocksource_read(void) { int cpu = get_cpu(); struct shadow_time_info *shadow = &per_cpu(shadow_time, cpu); cycle_t ret; get_time_values_from_xen(cpu); ret = shadow->system_timestamp + get_nsec_offset(shadow); put_cpu(); #ifdef CONFIG_SMP for (;;) { static cycle_t last_ret; #ifndef CONFIG_64BIT cycle_t last = cmpxchg64(&last_ret, 0, 0); #else cycle_t last = last_ret; #define cmpxchg64 cmpxchg #endif if ((s64)(ret - last) < 0) { if (last - ret > permitted_clock_jitter && printk_ratelimit()) printk(KERN_WARNING "clocksource/%d: " "Time went backwards: " "delta=%Ld shadow=%Lu offset=%Lu\n", cpu, ret - last, shadow->system_timestamp, get_nsec_offset(shadow)); ret = last; } if (cmpxchg64(&last_ret, last, ret) == last) break; } #endif return ret; } Jan>>> Keir Fraser <keir@xensource.com> 15.06.07 12:26 >>>Yeah, it needs a spinlock. Which is a shame, but it''s still going to be much faster than trapping into the hypervisor, which is the only other way we''re going to be able to achieve guaranteed monotonic time. We could avoid the lock if we guaranteed monotonic time only per task (thread). Or is that not really good enough? :-) -- Keir On 15/6/07 11:19, "Jan Beulich" <jbeulich@novell.com> wrote:> I''m afraid this isn''t MP-safe - you''re in a (pseudo-)read-locked section of > code, > yet write a global variable (and even in two pieces). Jan > >>>> Atsushi SAKAI <sakaia@jp.fujitsu.com> 15.06.07 11:47 >>> > Hi, Keir > > This patch intends to increment do_gettimeofday monotonically. > This is for PV-domain/x86. > > Signed-off-by: Atsushi SAKAI <sakaia@jp.fujitsu.com> > > By applying patch, the time rollback for SMP is fixed. > This is important fixes for database software. > > Thanks > Atsushi SAKAI > > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2007-Jun-15 11:19 UTC
Re: [Xen-devel] [PATCH] Fix clock_gettime to increment monotonically onPV-domain/x86
A spinlock is almost certainly no slower. Since the critical region is so tiny the likelihood of spinning is very small. And the spin_lock() fast path is probably faster than a CMPXCHG8B. -- Keir On 15/6/07 12:12, "Jan Beulich" <jbeulich@novell.com> wrote:> An alternative to a spin lock would be to use a 64-bit variable storing the > raw > nanosecond value, and use cmpxchg to check/update it. I did it this way for > the clocksource monotonicity: > > static cycle_t xen_clocksource_read(void) > { > int cpu = get_cpu(); > struct shadow_time_info *shadow = &per_cpu(shadow_time, cpu); > cycle_t ret; > > get_time_values_from_xen(cpu); > > ret = shadow->system_timestamp + get_nsec_offset(shadow); > > put_cpu(); > > #ifdef CONFIG_SMP > for (;;) { > static cycle_t last_ret; > #ifndef CONFIG_64BIT > cycle_t last = cmpxchg64(&last_ret, 0, 0); > #else > cycle_t last = last_ret; > #define cmpxchg64 cmpxchg > #endif > > if ((s64)(ret - last) < 0) { > if (last - ret > permitted_clock_jitter > && printk_ratelimit()) > printk(KERN_WARNING "clocksource/%d: " > "Time went backwards: " > "delta=%Ld shadow=%Lu offset=%Lu\n", > cpu, ret - last, > shadow->system_timestamp, > get_nsec_offset(shadow)); > ret = last; > } > if (cmpxchg64(&last_ret, last, ret) == last) > break; > } > #endif > > return ret; > } > > Jan > >>>> Keir Fraser <keir@xensource.com> 15.06.07 12:26 >>> > Yeah, it needs a spinlock. Which is a shame, but it''s still going to be much > faster than trapping into the hypervisor, which is the only other way we''re > going to be able to achieve guaranteed monotonic time. > > We could avoid the lock if we guaranteed monotonic time only per task > (thread). Or is that not really good enough? :-) > > -- Keir > > On 15/6/07 11:19, "Jan Beulich" <jbeulich@novell.com> wrote: > >> I''m afraid this isn''t MP-safe - you''re in a (pseudo-)read-locked section of >> code, >> yet write a global variable (and even in two pieces). Jan >> >>>>> Atsushi SAKAI <sakaia@jp.fujitsu.com> 15.06.07 11:47 >>> >> Hi, Keir >> >> This patch intends to increment do_gettimeofday monotonically. >> This is for PV-domain/x86. >> >> Signed-off-by: Atsushi SAKAI <sakaia@jp.fujitsu.com> >> >> By applying patch, the time rollback for SMP is fixed. >> This is important fixes for database software. >> >> Thanks >> Atsushi SAKAI >> >> >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xensource.com >> http://lists.xensource.com/xen-devel > > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Atsushi SAKAI
2007-Jun-15 11:44 UTC
Re: [Xen-devel] [PATCH] Fix clock_gettime to incrementmonotonically onPV-domain/x86
Hi, Keir and Jan Thank you for various discussions. I rewrite it to spin_lock version. Database software uses this kind of information. And problematic is running OK on Native and Not on Xen Thanks Atsushi SAKAI Keir Fraser <keir@xensource.com> wrote:> Yeah, it needs a spinlock. Which is a shame, but it''s still going to be much > faster than trapping into the hypervisor, which is the only other way we''re > going to be able to achieve guaranteed monotonic time. > > We could avoid the lock if we guaranteed monotonic time only per task > (thread). Or is that not really good enough? :-) > > -- Keir > > On 15/6/07 11:19, "Jan Beulich" <jbeulich@novell.com> wrote: > > > I''m afraid this isn''t MP-safe - you''re in a (pseudo-)read-locked section of > > code, > > yet write a global variable (and even in two pieces). Jan > > > >>>> Atsushi SAKAI <sakaia@jp.fujitsu.com> 15.06.07 11:47 >>> > > Hi, Keir > > > > This patch intends to increment do_gettimeofday monotonically. > > This is for PV-domain/x86. > > > > Signed-off-by: Atsushi SAKAI <sakaia@jp.fujitsu.com> > > > > By applying patch, the time rollback for SMP is fixed. > > This is important fixes for database software. > > > > Thanks > > Atsushi SAKAI > > > > > > > > _______________________________________________ > > Xen-devel mailing list > > Xen-devel@lists.xensource.com > > http://lists.xensource.com/xen-devel >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
John Levon
2007-Jun-15 12:47 UTC
Re: [Xen-devel] [PATCH] Fix clock_gettime to increment monotonically onPV-domain/x86
On Fri, Jun 15, 2007 at 01:12:21PM +0200, Jan Beulich wrote:> An alternative to a spin lock would be to use a 64-bit variable storing the raw > nanosecond value, and use cmpxchg to check/update it. I did it this way for > the clocksource monotonicity:This is essentially what we''ve got now in Solaris. It seems like a terrible shame not to just fix it in Xen, especially given all that traffic from all CPUs to ''last_ret''. regards john _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2007-Jun-15 12:59 UTC
Re: [Xen-devel] [PATCH] Fix clock_gettime to increment monotonically onPV-domain/x86
On 15/6/07 13:47, "John Levon" <levon@movementarian.org> wrote:> On Fri, Jun 15, 2007 at 01:12:21PM +0200, Jan Beulich wrote: > >> An alternative to a spin lock would be to use a 64-bit variable storing the >> raw >> nanosecond value, and use cmpxchg to check/update it. I did it this way for >> the clocksource monotonicity: > > This is essentially what we''ve got now in Solaris. It seems like a > terrible shame not to just fix it in Xen, especially given all that > traffic from all CPUs to ''last_ret''.How would we fix it in Xen in a way that is faster and more scalable? -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
John Levon
2007-Jun-15 13:08 UTC
Re: [Xen-devel] [PATCH] Fix clock_gettime to increment monotonically onPV-domain/x86
On Fri, Jun 15, 2007 at 01:59:36PM +0100, Keir Fraser wrote:> >> An alternative to a spin lock would be to use a 64-bit variable storing the > >> raw > >> nanosecond value, and use cmpxchg to check/update it. I did it this way for > >> the clocksource monotonicity: > > > > This is essentially what we''ve got now in Solaris. It seems like a > > terrible shame not to just fix it in Xen, especially given all that > > traffic from all CPUs to ''last_ret''. > > How would we fix it in Xen in a way that is faster and more scalable?A good question :) One thing we''ve considered is losing some precision based upon how much of a delta there is between the real CPUs (i.e. drop lower bits and round up). But we''re still (slowly) looking into the problem. regards john _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2007-Jun-15 13:21 UTC
Re: [Xen-devel] [PATCH] Fix clock_gettime to increment monotonically onPV-domain/x86
On 15/6/07 14:08, "John Levon" <levon@movementarian.org> wrote:>>> This is essentially what we''ve got now in Solaris. It seems like a >>> terrible shame not to just fix it in Xen, especially given all that >>> traffic from all CPUs to ''last_ret''. >> >> How would we fix it in Xen in a way that is faster and more scalable? > > A good question :) > > One thing we''ve considered is losing some precision based upon how much > of a delta there is between the real CPUs (i.e. drop lower bits and > round up). But we''re still (slowly) looking into the problem.IIUC this would make it less likely to see time going backwards, but when you do it''ll be by a lot more (the size of your precision granularity), and it''ll occur when your time values are unlucky enough to be just on the wrong sides of a boundary between two time intervals which map to different lower-precision time values. I believe it''s a pretty fundamental property of a monotonically-increasing counter in a distributed system that communication is required to implement it. Time is a funny thing. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Atsushi SAKAI
2007-Jun-15 13:58 UTC
Re: [Xen-devel] [PATCH] Fix clock_gettime to increment monotonicallyonPV-domain/x86
Hi, John Is there any plan to use vsyscall for clock_gettime on PV-dom? I think this is the best solution for this. But this kind of change is very large. In this meaning, my patch is temporaly solution like within 4CPU. Thanks Atsushi SAKAI John Levon <levon@movementarian.org> wrote:> On Fri, Jun 15, 2007 at 01:59:36PM +0100, Keir Fraser wrote: > > > >> An alternative to a spin lock would be to use a 64-bit variable storing the > > >> raw > > >> nanosecond value, and use cmpxchg to check/update it. I did it this way for > > >> the clocksource monotonicity: > > > > > > This is essentially what we''ve got now in Solaris. It seems like a > > > terrible shame not to just fix it in Xen, especially given all that > > > traffic from all CPUs to ''last_ret''. > > > > How would we fix it in Xen in a way that is faster and more scalable? > > A good question :) > > One thing we''ve considered is losing some precision based upon how much > of a delta there is between the real CPUs (i.e. drop lower bits and > round up). But we''re still (slowly) looking into the problem. > > regards > john > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
John Levon
2007-Jun-15 14:00 UTC
Re: [Xen-devel] [PATCH] Fix clock_gettime to increment monotonically onPV-domain/x86
On Fri, Jun 15, 2007 at 02:21:12PM +0100, Keir Fraser wrote:> >>> This is essentially what we''ve got now in Solaris. It seems like a > >>> terrible shame not to just fix it in Xen, especially given all that > >>> traffic from all CPUs to ''last_ret''. > >> > >> How would we fix it in Xen in a way that is faster and more scalable? > > > > A good question :) > > > > One thing we''ve considered is losing some precision based upon how much > > of a delta there is between the real CPUs (i.e. drop lower bits and > > round up). But we''re still (slowly) looking into the problem. > > IIUC this would make it less likely to see time going backwards, but when > you do it''ll be by a lot more (the size of your precision granularity), and > it''ll occur when your time values are unlucky enough to be just on the wrong > sides of a boundary between two time intervals which map to different > lower-precision time values.The idea is that it would never be wrong enough to hit that case (taken to extremes we wouldn''t expect to see it go backwards by a minute!)> I believe it''s a pretty fundamental property of a monotonically-increasing > counter in a distributed system that communication is required to implement > it. Time is a funny thing.Well. It''s proven a reasonable assumption for us on the x86 systems we''re supporting that the TSC difference between CPUs is stable enough for us to provide effective monotonicity. This is much harder to do from a virtual CPU for obvious reasons :) regards john _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2007-Jun-15 14:18 UTC
Re: [Xen-devel] [PATCH] Fix clock_gettime to increment monotonically onPV-domain/x86
On 15/6/07 15:00, "John Levon" <levon@movementarian.org> wrote:>> IIUC this would make it less likely to see time going backwards, but when >> you do it''ll be by a lot more (the size of your precision granularity), and >> it''ll occur when your time values are unlucky enough to be just on the wrong >> sides of a boundary between two time intervals which map to different >> lower-precision time values. > > The idea is that it would never be wrong enough to hit that case (taken > to extremes we wouldn''t expect to see it go backwards by a minute!)Then I don''t understand what you''re proposing (I thought you were just shaving off least-significant bits, but it must be smarter than that). -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
John Levon
2007-Jun-15 14:32 UTC
Re: [Xen-devel] [PATCH] Fix clock_gettime to increment monotonically onPV-domain/x86
On Fri, Jun 15, 2007 at 03:18:16PM +0100, Keir Fraser wrote:> >> IIUC this would make it less likely to see time going backwards, but when > >> you do it''ll be by a lot more (the size of your precision granularity), and > >> it''ll occur when your time values are unlucky enough to be just on the wrong > >> sides of a boundary between two time intervals which map to different > >> lower-precision time values. > > > > The idea is that it would never be wrong enough to hit that case (taken > > to extremes we wouldn''t expect to see it go backwards by a minute!) > > Then I don''t understand what you''re proposing (I thought you were just > shaving off least-significant bits, but it must be smarter than that).Sorry, I misunderstood you. Yes, you''re right, this doesn''t seem workable. Bah. john _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Atsushi SAKAI
2007-Jun-19 09:21 UTC
Re: [Xen-devel] [PATCH] Fix clock_gettime to incrementmonotonicallyonPV-domain/x86
Hi, Keir I accidentally lent Xeon E5310(Quad-core) x 2. And I test it with 8thread 1hour running.(8thread version of pvx86test) It works fine without seeing softlockup with attached patch. In 8thread running, top shows 300-400 % of CPU consumption. and it''s ratio is 3.0%us 32.9%sy 0.0%ni 60.5%id 0.0%wa 0.0%hi 0.0%si 3.6%st xentop shows 600 - 700 % of CPU consumption. Signed-off-by: Atsushi SAKAI <sakaia@jp.fujitsu.com> Thanks Atsushi SAKAI _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel