Yu, Ke
2008-Apr-07 09:18 UTC
[Xen-devel] [RFC] Remove the x86 Periodic 100HZ PIT Timer Interrupt
On x86 there is a 100Hz 8254 PIT timer interrupt (timer_interrupt() in xen/arch/x86/time.c). This interrupt sets a theoretical upper limit of 10ms on C-state residency: every 10ms the processor must wake from its C state to service the interrupt. Since ACPI Cx support is under development, it is time to consider removing this upper limit.

This email presents a proposal to address the issue and allow Xen to stay in a C state for longer, and thus make Xen more power friendly. The proposal first examines the functionality of the PIT timer interrupt, and then discusses how to replace that functionality.

* The PIT timer interrupt handler, timer_interrupt() in xen/arch/x86/time.c, has three functions:
- increment jiffies: jiffies++
- if the CPU has no APIC support, raise TIMER_SOFTIRQ
- if the platform timer counter overflows, call plt_overflow() to fold the platform timer to 64 bits.

* How to handle jiffies
Since only a few places use jiffies (see below), it is easy to replace jiffies with the NOW() API:
xen\drivers\passthrough\vtd\intremap.c
xen\drivers\passthrough\vtd\iommu.c
xen\arch\x86\io_apic.c

* How to raise TIMER_SOFTIRQ if the CPU has no APIC
Since it is rare for a CPU to have no APIC, this case will not be optimized. If the CPU has no APIC support, we will still use the PIT timer to raise TIMER_SOFTIRQ.

* How to handle platform timer counter overflow
Since the IBM Cyclone timer, HPET timer and ACPI PM timer have large overflow periods (> 1s), it is safe to use an AC timer to handle the overflow. The PIT timer counter has a pretty small overflow period (0.055s), so we still use the PIT timer interrupt to handle PIT timer counter overflow, to guarantee its accuracy.

P.S. The platform timer counter overflow periods are as follows:
- IBM Cyclone timer: 32-bit counter overflows every 42.9s
- HPET timer: 32-bit counter overflows every ~300s (in ICH7/8/9)
- ACPI PM timer: 24-bit counter overflows every 4.6s
- PIT timer: 16-bit counter overflows every 0.055s

In summary, to remove the PIT timer interrupt, we can
- replace jiffies usage with NOW()
- if Cyclone/HPET/ACPI is used as the platform timer source, use an AC timer to handle overflow.
- if the CPU has no APIC support, or the PIT is used as the platform timer source, still enable and set up the PIT timer interrupt. The timer_interrupt() handler will then look like this:

    if ( !cpu_has_apic )
        raise_softirq(TIMER_SOFTIRQ);
    if ( using_pit && --plt_overflow_jiffies == 0 )
        plt_overflow();

Any comment is welcome.

Best Regards
Ke

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
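The overflow periods listed in the P.S. follow from counter width divided by counter frequency. The sketch below reproduces that arithmetic. The frequencies are assumptions based on commonly documented nominal values (they are not stated in the thread), and wrap_period_seconds() is a hypothetical helper, not a Xen function.

```c
#include <assert.h>
#include <stdint.h>

/* Nominal counter frequencies in Hz. These are assumed values from
 * common hardware documentation, not figures given in the thread. */
#define CYCLONE_HZ  100000000.0   /* IBM Cyclone timer: 100 MHz      */
#define HPET_HZ      14318180.0   /* HPET on ICH7/8/9: ~14.318 MHz   */
#define ACPI_PM_HZ    3579545.0   /* ACPI PM timer: ~3.58 MHz        */
#define PIT_HZ        1193182.0   /* 8254 PIT: ~1.193 MHz            */

/* Seconds until a free-running counter of the given width wraps
 * (hypothetical helper, for illustration only). */
double wrap_period_seconds(unsigned int bits, double hz)
{
    assert(bits >= 1 && bits <= 63 && hz > 0.0);
    return (double)(UINT64_C(1) << bits) / hz;
}
```

With these assumed frequencies, wrap_period_seconds(32, CYCLONE_HZ) is a little under 43s, the 32-bit HPET comes out at roughly 300s, the 24-bit ACPI PM timer between 4.6s and 4.7s, and the 16-bit PIT at about 0.055s, which is in line with the list above.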
Keir Fraser
2008-Apr-07 10:03 UTC
Re: [Xen-devel] [RFC] Remove the x86 Periodic 100HZ PIT Timer Interrupt
On 7/4/08 10:18, "Yu, Ke" <ke.yu@intel.com> wrote:

> * How to handle jiffies
> Since only a few places use jiffies (see below), it is easy to
> replace jiffies with the NOW() API:
> xen\drivers\passthrough\vtd\intremap.c
> xen\drivers\passthrough\vtd\iommu.c

Yes, they should have been implemented that way in the first place.

> xen\arch\x86\io_apic.c

No, since the whole point of using jiffies there is to check that irq0 is working. My advice is to leave PIT (irq0) setup as-is during boot, and then disable PIT channel 0 at end of boot if you detect that it is not required (i.e., you have a local APIC, and you are not using the PIT as platform time source).

> * How to raise TIMER_SOFTIRQ if the CPU has no APIC
> Since it is rare for a CPU to have no APIC, this case will not be
> optimized. If the CPU has no APIC support, we will still use the PIT
> timer to raise TIMER_SOFTIRQ.

Agreed.

> * How to handle platform timer counter overflow
> Since the IBM Cyclone timer, HPET timer and ACPI PM timer have large
> overflow periods (> 1s), it is safe to use an AC timer to handle the
> overflow. The PIT timer counter has a pretty small overflow period
> (0.055s), so we still use the PIT timer interrupt to handle PIT timer
> counter overflow, to guarantee its accuracy.

Agreed.

> P.S. The platform timer counter overflow periods are as follows:
> - IBM Cyclone timer: 32-bit counter overflows every 42.9s
> - HPET timer: 32-bit counter overflows every ~300s (in ICH7/8/9)
> - ACPI PM timer: 24-bit counter overflows every 4.6s
> - PIT timer: 16-bit counter overflows every 0.055s

Well, bear in mind that we actually schedule the overflow handler when the counter is halfway to being wrapped. So actually we schedule overflow handling for the above timers every 21.5s, 150s, 2.3s and 0.028s, respectively.

It is okay to use ac_timer logic, triggered off softirq, if we are sure that softirq work is guaranteed to be done often enough that the overflow handler will run before the counter actually overflows. Ignoring the PIT, this means we need a worst-case softirq schedule-to-execute latency of 2.3s, which seems pretty reasonable to expect!

> In summary, to remove the PIT timer interrupt, we can
> - replace jiffies usage with NOW()
> - if Cyclone/HPET/ACPI is used as the platform timer source, use an
>   AC timer to handle overflow.
> - if the CPU has no APIC support, or the PIT is used as the platform
>   timer source, still enable and set up the PIT timer interrupt. The
>   timer_interrupt() handler will then look like this:
>     if ( !cpu_has_apic )
>         raise_softirq(TIMER_SOFTIRQ);
>     if ( using_pit && --plt_overflow_jiffies == 0 )
>         plt_overflow();

Yep, something like that. However, you should always do PIT setup, and then *disable* it if you don't need it after boot.

Also, I'd handle PIT overflow a bit differently. The generic overflow handling we have in time.c can be changed to use ac timers. I would then disable that (effectively) for PIT by having init_pit() expose a 32-bit counter, then have the overflow handling for turning a 16-bit counter into a 32-bit counter happening behind the scenes. Don't worry about this aspect too much though: I'm happy to clean up this part myself if you handle the rest of the patch.

 -- Keir
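The half-wrap scheduling described above can be illustrated with a small sketch of how a narrow hardware counter is commonly folded into a 64-bit count. This is an assumption about the general technique, not a copy of Xen's plt_overflow(); the struct and function names are hypothetical. The masked delta is only correct if the update runs before a full wrap has elapsed, which is why running it every half wrap leaves a comfortable margin.

```c
#include <stdint.h>

/* Hypothetical state for extending a narrow hardware counter to 64 bits. */
struct wide_counter {
    uint64_t mask;     /* (1 << bits) - 1 for the hardware counter */
    uint64_t last;     /* last raw hardware value seen */
    uint64_t stamp64;  /* accumulated 64-bit tick count */
};

void wide_counter_init(struct wide_counter *wc, unsigned int bits, uint64_t raw)
{
    wc->mask = (bits >= 64) ? ~UINT64_C(0) : (UINT64_C(1) << bits) - 1;
    wc->last = raw & wc->mask;
    wc->stamp64 = 0;
}

/* Fold a new raw reading into the 64-bit count. The masked subtraction
 * handles one wrap; it is wrong if 2^bits or more ticks have elapsed
 * since the previous call. */
uint64_t wide_counter_update(struct wide_counter *wc, uint64_t raw)
{
    raw &= wc->mask;
    wc->stamp64 += (raw - wc->last) & wc->mask;
    wc->last = raw;
    return wc->stamp64;
}

/* Interval between updates, in ticks: half of the wrap period. */
uint64_t wide_counter_update_interval_ticks(const struct wide_counter *wc)
{
    return (wc->mask >> 1) + 1;
}
```

For a 24-bit counter the update interval is 2^23 ticks, i.e. half of the wrap period, matching the "halfway to being wrapped" figures quoted above.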
Yu, Ke
2008-Apr-08 01:29 UTC
RE: [Xen-devel] [RFC] Remove the x86 Periodic 100HZ PIT Timer Interrupt
Thanks for the comments.

Keir Fraser wrote:
> On 7/4/08 10:18, "Yu, Ke" <ke.yu@intel.com> wrote:
>
>> xen\arch\x86\io_apic.c
>
> No, since the whole point of using jiffies is to check that irq0 is
> working. My advice is to leave PIT (irq0) setup as-is during boot,
> and then disable PIT channel 0 at end of boot if you detect that it
> is not required (i.e., you have local APIC, and you are not using PIT
> as platform time source).

Agreed. I will do it this way.

>> P.S. The platform timer counter overflow periods are as follows:
>> - IBM Cyclone timer: 32-bit counter overflows every 42.9s
>> - HPET timer: 32-bit counter overflows every ~300s (in ICH7/8/9)
>> - ACPI PM timer: 24-bit counter overflows every 4.6s
>> - PIT timer: 16-bit counter overflows every 0.055s
>
> Well, bear in mind that we actually schedule the overflow handler
> when the counter is halfway to being wrapped. So actually we schedule
> overflow handling for the above timers every 21.5s, 150s, 2.3s and
> 0.028s, respectively.
>
> It is okay to use ac_timer logic, triggered off softirq, if we are
> sure that softirq work is guaranteed to be done often enough that the
> overflow handler will run before the counter actually overflows.
> Ignoring the PIT, this means we need a worst-case softirq
> schedule-to-execute latency of 2.3s, which seems pretty reasonable to
> expect!

Yes. The AC timer expiry period will be half of the actual overflow period. Thanks for the reminder.

>> In summary, to remove the PIT timer interrupt, we can
>> - replace jiffies usage with NOW()
>> - if Cyclone/HPET/ACPI is used as the platform timer source, use an
>>   AC timer to handle overflow.
>> - if the CPU has no APIC support, or the PIT is used as the platform
>>   timer source, still enable and set up the PIT timer interrupt. The
>>   timer_interrupt() handler will then look like this:
>>     if ( !cpu_has_apic )
>>         raise_softirq(TIMER_SOFTIRQ);
>>     if ( using_pit && --plt_overflow_jiffies == 0 )
>>         plt_overflow();
>
> Yep, something like that. However, you should always do PIT setup,
> and then *disable* it if you don't need it after boot.
>
> Also, I'd handle PIT overflow a bit differently. The generic overflow
> handling we have in time.c can be changed to use ac timers. I would
> then disable that (effectively) for PIT by having init_pit() expose a
> 32-bit counter, then have the overflow handling for turning a 16-bit
> counter into a 32-bit counter happening behind the scenes. Don't
> worry about this aspect too much though: I'm happy to clean up this
> part myself if you handle the rest of the patch.
>
> -- Keir

I don't quite understand this point; could you please elaborate? And of course I am happy to leave this for you to clean up :)

The logic I have in mind is as follows. I just want to make sure it leaves enough room for you to refine the PIT handling logic.
1. When Xen is booting, set up the PIT timer interrupt.
2. If the Cyclone/HPET/ACPI PM timer is the clock source, set up an AC timer for the overflow handling.
3. If not using the PIT and the CPU has an APIC, then disable the PIT timer interrupt.

Best Regards
Ke
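The three-step plan above boils down to two decisions made once the platform timer source is known. The following is a minimal sketch of those decisions only; the enum and function names are hypothetical and are not taken from the Xen source.

```c
#include <stdbool.h>

/* Hypothetical identifiers for the platform timer sources in the thread. */
enum plt_source { PLT_PIT, PLT_CYCLONE, PLT_HPET, PLT_ACPI_PM };

/* The PIT interrupt must stay enabled after boot if the PIT is the
 * platform timer, or if there is no local APIC to drive TIMER_SOFTIRQ. */
bool pit_irq_required(enum plt_source src, bool cpu_has_apic)
{
    return src == PLT_PIT || !cpu_has_apic;
}

/* Sources with long wrap periods have their overflow handled by an
 * AC timer instead of the PIT interrupt. */
bool use_ac_timer_for_overflow(enum plt_source src)
{
    return src != PLT_PIT;
}
```

In this sketch the PIT is always set up at boot (step 1), and it would be disabled at the end of boot only when pit_irq_required() returns false (step 3).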
Yu, Ke
2008-Apr-08 01:42 UTC
[Xen-devel] RE: [RFC] Remove the x86 Periodic 100HZ PIT Timer Interrupt
Andi Kleen wrote:
> "Yu, Ke" <ke.yu@intel.com> writes:
>>
>> * How to handle platform timer counter overflow
>> Since IBM cyclone timer/HPET timer/ACPI PM timer has large overflow
>> period (> 1s),
>
> ACPI PM timers are often only 24 bit and overflow in less than 5
> seconds. If you aim for longer sleep times using ACPI PM is not a
> good idea
>
> -Andi

This proposal aims for longer processor C-state residency, which is usually measured in milliseconds. Take the native Linux kernel as an example: when tickless idle is not enabled, C-state residency is usually less than 10ms. When tickless idle is enabled, I sometimes see C-state residency as long as 50ms. So a 5s overflow period is long enough in terms of C-state residency.

Best Regards
Ke
Keir Fraser
2008-Apr-08 06:42 UTC
Re: [Xen-devel] [RFC] Remove the x86 Periodic 100HZ PIT Timer Interrupt
On 8/4/08 02:29, "Yu, Ke" <ke.yu@intel.com> wrote:

> The logic in my mind is as follow, just want to make sure it leave
> enough space for you to refine PIT handling logic.
> 1. when xen is booting, setup PIT timer interrupt
> 2. if Cyclone/HPET/ACPI PM timer is clock source, setup AC timer for the
> overflow handling
> 3. if not using PIT and CPU has APIC, then disable PIT timer interrupt.

This is close enough for me to tweak, so I'll be happy if you could make a patch for this.

 Thanks,
 Keir
Yu, Ke
2008-Apr-10 08:38 UTC
RE: [Xen-devel] [RFC] Remove the x86 Periodic 100HZ PIT Timer Interrupt
Hi Keir,

Attached are the patches for this RFC. They have been tested against changeset 17427. Please feel free to change the PIT handling logic.

Best Regards
Ke

Keir Fraser wrote:
> On 8/4/08 02:29, "Yu, Ke" <ke.yu@intel.com> wrote:
>
>> The logic in my mind is as follow, just want to make sure it leave
>> enough space for you to refine PIT handling logic.
>> 1. when xen is booting, setup PIT timer interrupt
>> 2. if Cyclone/HPET/ACPI PM timer is clock source, setup AC timer for
>> the overflow handling
>> 3. if not using PIT and CPU has APIC, then disable PIT timer
>> interrupt.
>
> This is close enough for me to tweak, so I'll be happy if you could
> make a patch for this.
>
> Thanks,
> Keir
Keir Fraser
2008-Apr-10 10:16 UTC
Re: [Xen-devel] [RFC] Remove the x86 Periodic 100HZ PIT Timer Interrupt
Applied as changesets 17432 and 17433. I suspect that right now the PIT will be re-enabled after host S3 resume, so you might want to look into that.

 -- Keir

On 10/4/08 09:38, "Yu, Ke" <ke.yu@intel.com> wrote:

> Hi Keir,
>
> The attached are patches for this RFC. It has been tested again
> changeset 17427. Please feel free to change the PIT handling logic.
>
> Best Regards
> Ke
>
> Keir Fraser wrote:
>> On 8/4/08 02:29, "Yu, Ke" <ke.yu@intel.com> wrote:
>>
>>> The logic in my mind is as follow, just want to make sure it leave
>>> enough space for you to refine PIT handling logic.
>>> 1. when xen is booting, setup PIT timer interrupt
>>> 2. if Cyclone/HPET/ACPI PM timer is clock source, setup AC timer for
>>> the overflow handling
>>> 3. if not using PIT and CPU has APIC, then disable PIT timer
>>> interrupt.
>>
>> This is close enough for me to tweak, so I'll be happy if you could
>> make a patch for this.
>>
>> Thanks,
>> Keir
Yu, Ke
2008-Apr-10 14:11 UTC
RE: [Xen-devel] [RFC] Remove the x86 Periodic 100HZ PIT Timer Interrupt
Keir Fraser wrote:
> Applied as changesets 17432 and 17433. I suspect that right now the
> PIT will be re-enabled after host S3 resume, so you might want to
> look into that.
>
> -- Keir

Good point. I will look into that.

At first glance, calling late_time_init() in time_resume() should work. I will test this and send out the patch later.

Best Regards
Ke
Keir Fraser
2008-Apr-10 14:24 UTC
Re: [Xen-devel] [RFC] Remove the x86 Periodic 100HZ PIT Timer Interrupt
On 10/4/08 15:11, "Yu, Ke" <ke.yu@intel.com> wrote:

> One glance showes calling late_time_init() in time_resume() should work.
> I will test this and send out the patch later.

Looks like it would work. While you're at it rename late_time_init() to maybe_disable_pit_irq() or something similarly informative.

 -- Keir
Yu, Ke
2008-Apr-11 07:46 UTC
RE: [Xen-devel] [RFC] Remove the x86 Periodic 100HZ PIT Timer Interrupt
Keir Fraser wrote:
> On 10/4/08 15:11, "Yu, Ke" <ke.yu@intel.com> wrote:
>
>> One glance showes calling late_time_init() in time_resume() should
>> work. I will test this and send out the patch later.
>
> Looks like it would work. While you're at it rename late_time_init()
> to maybe_disable_pit_irq() or something similarly informative.
>
> -- Keir

Attached is the patch for this. It has been tested against changeset 17433. Wall clock time works fine after host S3 suspend/resume.

diff -r 8d750b7acfa3 -r c05b191c8e2a xen/arch/x86/time.c
--- a/xen/arch/x86/time.c Thu Apr 10 11:11:25 2008 +0100
+++ b/xen/arch/x86/time.c Fri Apr 11 15:18:45 2008 +0800
@@ -990,7 +990,7 @@ void __init early_time_init(void)
     setup_irq(0, &irq0);
 }

-static int __init late_time_init(void)
+static int __init disable_pit_irq(void)
 {
     if ( !using_pit && cpu_has_apic )
     {
@@ -1001,7 +1001,7 @@ static int __init late_time_init(void)
     }
     return 0;
 }
-__initcall(late_time_init);
+__initcall(disable_pit_irq);

 void send_timer_event(struct vcpu *v)
 {
@@ -1035,6 +1035,8 @@ int time_resume(void)
 int time_resume(void)
 {
     u64 tmp = init_pit_and_calibrate_tsc();
+
+    disable_pit_irq();

     set_time_scale(&this_cpu(cpu_time).tsc_scale, tmp);

Best Regards
Ke