On 23/02/2009 07:27, "Guanqun Lu" <guanqun.lu@intel.com> wrote:> fix S3 suspend error. > > invoke ioapic_pm_state_alloc() eariler, > thus avoiding check_lock() BUG_ON() in spin_lock(). > > Signed-off-by: Guanqun Lu <guanqun.lu@intel.com>This one I think is a good idea anyway. The other two -- I wonder why we do device suspend/resume with IRQs disabled? As far as I can tell Linux does not do that. So would it be cleaner to move the device suspend/resume hooks outside the IRQs-disabled region? -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
>From: Keir Fraser >Sent: Monday, February 23, 2009 8:39 PM > >On 23/02/2009 07:27, "Guanqun Lu" <guanqun.lu@intel.com> wrote: > >> fix S3 suspend error. >> >> invoke ioapic_pm_state_alloc() eariler, >> thus avoiding check_lock() BUG_ON() in spin_lock(). >> >> Signed-off-by: Guanqun Lu <guanqun.lu@intel.com> > >This one I think is a good idea anyway. The other two -- I >wonder why we do >device suspend/resume with IRQs disabled? As far as I can tell >Linux does >not do that. So would it be cleaner to move the device >suspend/resume hooks >outside the IRQs-disabled region? >I can''t ensure my understanding exactly matching latest Linux S3 implementation which evolves quickly. But generally device drivers typically are allowed with two callbacks, one being invoked with irq enabled and the other with irq disabled, since some actions have such requirement. On the other side, system device class, such as ioapic, lapic, etc. are saved with irq disabled after other external devices. Xen owns a few devices with most as system class, so for simplicity interrupt is disabled from start. It''s not worthy of touching this flow for this simple spinlock issue, imo. :-) Thanks, Kevin _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 23/02/2009 04:46, "Tian, Kevin" <kevin.tian@intel.com> wrote:> I can''t ensure my understanding exactly matching latest Linux > S3 implementation which evolves quickly. But generally device > drivers typically are allowed with two callbacks, one being invoked > with irq enabled and the other with irq disabled, since some actions > have such requirement. On the other side, system device class, > such as ioapic, lapic, etc. are saved with irq disabled after other > external devices. > > Xen owns a few devices with most as system class, so for > simplicity interrupt is disabled from start. It''s not worthy of touching > this flow for this simple spinlock issue, imo. :-)Can you point out any core devices in modern Linux 2.6 which implement the suspend_late() or *_noirq() callbacks (those appear to be the ones handled in device_power_down() with IRQs disabled)? They seem very obsolete/legacy, and looks to me like we could unconditionally *not* disable IRQs rather than unconditionally disabling them. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 23/02/2009 05:15, "Keir Fraser" <keir.fraser@eu.citrix.com> wrote:>> Xen owns a few devices with most as system class, so for >> simplicity interrupt is disabled from start. It''s not worthy of touching >> this flow for this simple spinlock issue, imo. :-) > > Can you point out any core devices in modern Linux 2.6 which implement the > suspend_late() or *_noirq() callbacks (those appear to be the ones handled > in device_power_down() with IRQs disabled)? They seem very obsolete/legacy, > and looks to me like we could unconditionally *not* disable IRQs rather than > unconditionally disabling them.Oh, yes, now I see the sysdev_suspend() stuff. Okay, I agree with the patches then. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Guanqun Lu
2009-Feb-23 15:27 UTC
[Xen-devel] [PATCH 0 of 3] fix S3 suspend and resume error
This series fix S3 suspend and resume error, which is mainly caused by inconsistent spinlock usage. Signed-off-by: Guanqun Lu <guanqun.lu@intel.com> _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
# HG changeset patch # User Guanqun Lu <guanqun.lu@intel.com> # Date 1235401592 -28800 # Node ID b713518a241b6f6e07edb10c3fc38143138f82ea # Parent 84af3ded5b02ba0b3f8647e3bfa993725428633b fix S3 suspend error. invoke ioapic_pm_state_alloc() eariler, thus avoiding check_lock() BUG_ON() in spin_lock(). Signed-off-by: Guanqun Lu <guanqun.lu@intel.com> diff -r 84af3ded5b02 -r b713518a241b xen/arch/x86/io_apic.c --- a/xen/arch/x86/io_apic.c Thu Feb 19 11:07:33 2009 +0000 +++ b/xen/arch/x86/io_apic.c Mon Feb 23 23:06:32 2009 +0800 @@ -1779,6 +1779,22 @@ static inline void check_timer(void) */ #define PIC_IRQS (1 << PIC_CASCADE_IR) +struct IO_APIC_route_entry *ioapic_pm_state=NULL; + +void ioapic_pm_state_alloc(void) +{ + int i, nr_entry = 0; + + if (ioapic_pm_state != NULL) + return; + + for (i = 0; i < nr_ioapics; i++) + nr_entry += nr_ioapic_registers[i]; + + ioapic_pm_state = _xmalloc(sizeof(struct IO_APIC_route_entry)*nr_entry, + sizeof(struct IO_APIC_route_entry)); +} + void __init setup_IO_APIC(void) { enable_IO_APIC(); @@ -1801,24 +1817,9 @@ void __init setup_IO_APIC(void) init_IO_APIC_traps(); check_timer(); print_IO_APIC(); + ioapic_pm_state_alloc(); register_keyhandler(''z'', print_IO_APIC_keyhandler, "print ioapic info"); -} - -struct IO_APIC_route_entry *ioapic_pm_state=NULL; - -void ioapic_pm_state_alloc(void) -{ - int i, nr_entry = 0; - - if (ioapic_pm_state != NULL) - return; - - for (i = 0; i < nr_ioapics; i++) - nr_entry += nr_ioapic_registers[i]; - - ioapic_pm_state = _xmalloc(sizeof(struct IO_APIC_route_entry)*nr_entry, - sizeof(struct IO_APIC_route_entry)); } int ioapic_suspend(void) @@ -1826,8 +1827,6 @@ int ioapic_suspend(void) struct IO_APIC_route_entry *entry; unsigned long flags; int apic,i; - - ioapic_pm_state_alloc(); if (ioapic_pm_state == NULL) { printk("Cannot suspend ioapic due to lack of memory\n"); _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
# HG changeset patch # User Guanqun Lu <guanqun.lu@intel.com> # Date 1235401621 -28800 # Node ID 0e17f070db2aa92bc74f29245cdb8528aa72bbe3 # Parent b713518a241b6f6e07edb10c3fc38143138f82ea fix S3 suspend error platform_time_calibration() is invoked in S3 when irq is disabled, which causes ASSERT() error in spin_lock_irq(). spin_lock_irqsave() saves us. Signed-off-by: Guanqun Lu <guanqun.lu@intel.com> diff -r b713518a241b -r 0e17f070db2a xen/arch/x86/time.c --- a/xen/arch/x86/time.c Mon Feb 23 23:06:32 2009 +0800 +++ b/xen/arch/x86/time.c Mon Feb 23 23:07:01 2009 +0800 @@ -607,13 +607,14 @@ static void platform_time_calibration(vo { u64 count; s_time_t stamp; - - spin_lock_irq(&platform_timer_lock); + unsigned long flags; + + spin_lock_irqsave(&platform_timer_lock, flags); count = plt_stamp64 + ((plt_src.read_counter() - plt_stamp) & plt_mask); stamp = __read_platform_stime(count); stime_platform_stamp = stamp; platform_timer_stamp = count; - spin_unlock_irq(&platform_timer_lock); + spin_unlock_irqrestore(&platform_timer_lock, flags); } static void resume_platform_timer(void) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
# HG changeset patch # User Guanqun Lu <guanqun.lu@intel.com> # Date 1235401627 -28800 # Node ID 04e7e4e8519fbbd2ec141875d466ea8ca038d553 # Parent 0e17f070db2aa92bc74f29245cdb8528aa72bbe3 fix S3 resume error spin_lock() usage in do_settime() in this situation is invoked when irq is disabled, which causes check_lock() BUG_ON(). We bypass the check by using spin_debug_disable(). It''s safe since no other CPUs are online yet at this moment. Signed-off-by: Guanqun Lu <guanqun.lu@intel.com> diff -r 0e17f070db2a -r 04e7e4e8519f xen/arch/x86/time.c --- a/xen/arch/x86/time.c Mon Feb 23 23:07:01 2009 +0800 +++ b/xen/arch/x86/time.c Mon Feb 23 23:07:07 2009 +0800 @@ -1300,7 +1300,9 @@ int time_resume(void) init_percpu_time(); + spin_debug_disable(); do_settime(get_cmos_time() + cmos_utc_offset, 0, NOW()); + spin_debug_enable(); update_vcpu_system_time(current); _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel