Ben Guthro
2013-Jan-16 20:20 UTC
[PATCH] ns16550: delay resume until dom0 ACPI has a chance to run
Check for ioport access, before fully resuming operation, to avoid spinning in __ns16550_poll when reading the LSR register returns 0xFF on failing ioport access. On some systems, there is a SuperIO card that provides this legacy ioport on the LPC bus. In this case, we need to wait for dom0''s ACPI processing to run the proper AML to re-initialize the chip, before we can use the card again. This may cause a small amount of garbage to be written to the serial log while we wait patiently for that AML to be executed. It limits the number of retries, to avoid a situation where we keep trying over and over again, in the case of some other failure on the ioport. Signed-Off-By: Ben Guthro <benjamin.guthro@citrix.com> --- xen/drivers/char/ns16550.c | 56 +++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 55 insertions(+), 1 deletion(-) diff --git a/xen/drivers/char/ns16550.c b/xen/drivers/char/ns16550.c index d77042e..27555c7 100644 --- a/xen/drivers/char/ns16550.c +++ b/xen/drivers/char/ns16550.c @@ -42,6 +42,7 @@ static struct ns16550 { struct irqaction irqaction; /* UART with no IRQ line: periodically-polled I/O. */ struct timer timer; + struct timer resume_timer; unsigned int timeout_ms; bool_t intr_works; /* PCI card parameters. */ @@ -120,6 +121,10 @@ static struct ns16550 { /* Frequency of external clock source. This definition assumes PC platform. */ #define UART_CLOCK_HZ 1843200 +/* Resume retry settings */ +#define RESUME_DELAY MILLISECS(10) +#define RESUME_RETRIES 100 + static char ns_read_reg(struct ns16550 *uart, int reg) { if ( uart->remapped_io_base == NULL ) @@ -330,7 +335,7 @@ static void ns16550_suspend(struct serial_port *port) uart->ps_bdf[2], PCI_COMMAND); } -static void ns16550_resume(struct serial_port *port) +static void __ns16550_resume(struct serial_port *port) { struct ns16550 *uart = port->uart; @@ -346,6 +351,55 @@ static void ns16550_resume(struct serial_port *port) ns16550_setup_postirq(port->uart); } +static int ns16550_ioport_invalid(struct ns16550 *uart) +{ + return ((((unsigned char)ns_read_reg(uart, LSR)) == 0xff) && + (((unsigned char)ns_read_reg(uart, MCR)) == 0xff) && + (((unsigned char)ns_read_reg(uart, IER)) == 0xff) && + (((unsigned char)ns_read_reg(uart, IIR)) == 0xff) && + (((unsigned char)ns_read_reg(uart, LCR)) == 0xff)); +} + +static int delayed_resume_tries; +static void ns16550_delayed_resume(void *data) +{ + struct serial_port *port = data; + struct ns16550 *uart = port->uart; + + if (ns16550_ioport_invalid(port->uart) && delayed_resume_tries--) { + set_timer(&uart->resume_timer, NOW() + RESUME_DELAY); + return; + } + + __ns16550_resume(port); +} + +static void ns16550_resume(struct serial_port *port) +{ + struct ns16550 *uart = port->uart; + + /* + * Check for ioport access, before fully resuming operation. + * On some systems, there is a SuperIO card that provides + * this legacy ioport on the LPC bus. + * + * We need to wait for dom0''s ACPI processing to run the proper + * AML to re-initialize the chip, before we can use the card again. + * + * This may cause a small amount of garbage to be written + * to the serial log while we wait patiently for that AML to + * be executed. + */ + if (ns16550_ioport_invalid(uart)) { + delayed_resume_tries = RESUME_RETRIES; + init_timer(&uart->resume_timer, ns16550_delayed_resume, port, 0); + set_timer(&uart->resume_timer, NOW() + RESUME_DELAY); + return; + } + + __ns16550_resume(port); +} + #ifdef CONFIG_X86 static void __init ns16550_endboot(struct serial_port *port) { -- 1.7.9.5
Pasi Kärkkäinen
2013-Jan-16 21:24 UTC
Re: [PATCH] ns16550: delay resume until dom0 ACPI has a chance to run
On Wed, Jan 16, 2013 at 03:20:56PM -0500, Ben Guthro wrote:> Check for ioport access, before fully resuming operation, to avoid > spinning in __ns16550_poll when reading the LSR register returns 0xFF > on failing ioport access. > > On some systems, there is a SuperIO card that provides this legacy ioport > on the LPC bus. > > In this case, we need to wait for dom0''s ACPI processing to run the proper > AML to re-initialize the chip, before we can use the card again. > > This may cause a small amount of garbage to be written > to the serial log while we wait patiently for that AML to > be executed. > > It limits the number of retries, to avoid a situation where we keep trying > over and over again, in the case of some other failure on the ioport. >So I assume this patch fixes the ACPI S3 suspend/resume on the Lenovo T430/T530 laptops? It might be good to mention that aswell.. -- Pasi> Signed-Off-By: Ben Guthro <benjamin.guthro@citrix.com> > --- > xen/drivers/char/ns16550.c | 56 +++++++++++++++++++++++++++++++++++++++++++- > 1 file changed, 55 insertions(+), 1 deletion(-) > > diff --git a/xen/drivers/char/ns16550.c b/xen/drivers/char/ns16550.c > index d77042e..27555c7 100644 > --- a/xen/drivers/char/ns16550.c > +++ b/xen/drivers/char/ns16550.c > @@ -42,6 +42,7 @@ static struct ns16550 { > struct irqaction irqaction; > /* UART with no IRQ line: periodically-polled I/O. */ > struct timer timer; > + struct timer resume_timer; > unsigned int timeout_ms; > bool_t intr_works; > /* PCI card parameters. */ > @@ -120,6 +121,10 @@ static struct ns16550 { > /* Frequency of external clock source. This definition assumes PC platform. */ > #define UART_CLOCK_HZ 1843200 > > +/* Resume retry settings */ > +#define RESUME_DELAY MILLISECS(10) > +#define RESUME_RETRIES 100 > + > static char ns_read_reg(struct ns16550 *uart, int reg) > { > if ( uart->remapped_io_base == NULL ) > @@ -330,7 +335,7 @@ static void ns16550_suspend(struct serial_port *port) > uart->ps_bdf[2], PCI_COMMAND); > } > > -static void ns16550_resume(struct serial_port *port) > +static void __ns16550_resume(struct serial_port *port) > { > struct ns16550 *uart = port->uart; > > @@ -346,6 +351,55 @@ static void ns16550_resume(struct serial_port *port) > ns16550_setup_postirq(port->uart); > } > > +static int ns16550_ioport_invalid(struct ns16550 *uart) > +{ > + return ((((unsigned char)ns_read_reg(uart, LSR)) == 0xff) && > + (((unsigned char)ns_read_reg(uart, MCR)) == 0xff) && > + (((unsigned char)ns_read_reg(uart, IER)) == 0xff) && > + (((unsigned char)ns_read_reg(uart, IIR)) == 0xff) && > + (((unsigned char)ns_read_reg(uart, LCR)) == 0xff)); > +} > + > +static int delayed_resume_tries; > +static void ns16550_delayed_resume(void *data) > +{ > + struct serial_port *port = data; > + struct ns16550 *uart = port->uart; > + > + if (ns16550_ioport_invalid(port->uart) && delayed_resume_tries--) { > + set_timer(&uart->resume_timer, NOW() + RESUME_DELAY); > + return; > + } > + > + __ns16550_resume(port); > +} > + > +static void ns16550_resume(struct serial_port *port) > +{ > + struct ns16550 *uart = port->uart; > + > + /* > + * Check for ioport access, before fully resuming operation. > + * On some systems, there is a SuperIO card that provides > + * this legacy ioport on the LPC bus. > + * > + * We need to wait for dom0''s ACPI processing to run the proper > + * AML to re-initialize the chip, before we can use the card again. > + * > + * This may cause a small amount of garbage to be written > + * to the serial log while we wait patiently for that AML to > + * be executed. > + */ > + if (ns16550_ioport_invalid(uart)) { > + delayed_resume_tries = RESUME_RETRIES; > + init_timer(&uart->resume_timer, ns16550_delayed_resume, port, 0); > + set_timer(&uart->resume_timer, NOW() + RESUME_DELAY); > + return; > + } > + > + __ns16550_resume(port); > +} > + > #ifdef CONFIG_X86 > static void __init ns16550_endboot(struct serial_port *port) > { > -- > 1.7.9.5 > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel
Ben Guthro
2013-Jan-16 21:31 UTC
Re: [PATCH] ns16550: delay resume until dom0 ACPI has a chance to run
On 01/16/2013 04:24 PM, Pasi Kärkkäinen wrote:> On Wed, Jan 16, 2013 at 03:20:56PM -0500, Ben Guthro wrote: >> Check for ioport access, before fully resuming operation, to avoid >> spinning in __ns16550_poll when reading the LSR register returns 0xFF >> on failing ioport access. >> >> On some systems, there is a SuperIO card that provides this legacy ioport >> on the LPC bus. >> >> In this case, we need to wait for dom0''s ACPI processing to run the proper >> AML to re-initialize the chip, before we can use the card again. >> >> This may cause a small amount of garbage to be written >> to the serial log while we wait patiently for that AML to >> be executed. >> >> It limits the number of retries, to avoid a situation where we keep trying >> over and over again, in the case of some other failure on the ioport. >> > > So I assume this patch fixes the ACPI S3 suspend/resume on the Lenovo T430/T530 laptops? > It might be good to mention that aswell..Yes, it seems to resolve the issues seen on T430, T530, and Intel Ivybridge mobile SDP. It likely fixes others, as the Intel mobile SDP is the reference platform for all the OEMs. If I need to re-submit this patch for any other reason, I will be sure to mention this in the comment. That said - there seems to still be an issue on some older Sandy Bridge machines, like a Lenovo T520, and X220 - but I don''t yet know if it is related. Ben> > -- Pasi > > >> Signed-Off-By: Ben Guthro <benjamin.guthro@citrix.com> >> --- >> xen/drivers/char/ns16550.c | 56 +++++++++++++++++++++++++++++++++++++++++++- >> 1 file changed, 55 insertions(+), 1 deletion(-) >> >> diff --git a/xen/drivers/char/ns16550.c b/xen/drivers/char/ns16550.c >> index d77042e..27555c7 100644 >> --- a/xen/drivers/char/ns16550.c >> +++ b/xen/drivers/char/ns16550.c >> @@ -42,6 +42,7 @@ static struct ns16550 { >> struct irqaction irqaction; >> /* UART with no IRQ line: periodically-polled I/O. */ >> struct timer timer; >> + struct timer resume_timer; >> unsigned int timeout_ms; >> bool_t intr_works; >> /* PCI card parameters. */ >> @@ -120,6 +121,10 @@ static struct ns16550 { >> /* Frequency of external clock source. This definition assumes PC platform. */ >> #define UART_CLOCK_HZ 1843200 >> >> +/* Resume retry settings */ >> +#define RESUME_DELAY MILLISECS(10) >> +#define RESUME_RETRIES 100 >> + >> static char ns_read_reg(struct ns16550 *uart, int reg) >> { >> if ( uart->remapped_io_base == NULL ) >> @@ -330,7 +335,7 @@ static void ns16550_suspend(struct serial_port *port) >> uart->ps_bdf[2], PCI_COMMAND); >> } >> >> -static void ns16550_resume(struct serial_port *port) >> +static void __ns16550_resume(struct serial_port *port) >> { >> struct ns16550 *uart = port->uart; >> >> @@ -346,6 +351,55 @@ static void ns16550_resume(struct serial_port *port) >> ns16550_setup_postirq(port->uart); >> } >> >> +static int ns16550_ioport_invalid(struct ns16550 *uart) >> +{ >> + return ((((unsigned char)ns_read_reg(uart, LSR)) == 0xff) && >> + (((unsigned char)ns_read_reg(uart, MCR)) == 0xff) && >> + (((unsigned char)ns_read_reg(uart, IER)) == 0xff) && >> + (((unsigned char)ns_read_reg(uart, IIR)) == 0xff) && >> + (((unsigned char)ns_read_reg(uart, LCR)) == 0xff)); >> +} >> + >> +static int delayed_resume_tries; >> +static void ns16550_delayed_resume(void *data) >> +{ >> + struct serial_port *port = data; >> + struct ns16550 *uart = port->uart; >> + >> + if (ns16550_ioport_invalid(port->uart) && delayed_resume_tries--) { >> + set_timer(&uart->resume_timer, NOW() + RESUME_DELAY); >> + return; >> + } >> + >> + __ns16550_resume(port); >> +} >> + >> +static void ns16550_resume(struct serial_port *port) >> +{ >> + struct ns16550 *uart = port->uart; >> + >> + /* >> + * Check for ioport access, before fully resuming operation. >> + * On some systems, there is a SuperIO card that provides >> + * this legacy ioport on the LPC bus. >> + * >> + * We need to wait for dom0''s ACPI processing to run the proper >> + * AML to re-initialize the chip, before we can use the card again. >> + * >> + * This may cause a small amount of garbage to be written >> + * to the serial log while we wait patiently for that AML to >> + * be executed. >> + */ >> + if (ns16550_ioport_invalid(uart)) { >> + delayed_resume_tries = RESUME_RETRIES; >> + init_timer(&uart->resume_timer, ns16550_delayed_resume, port, 0); >> + set_timer(&uart->resume_timer, NOW() + RESUME_DELAY); >> + return; >> + } >> + >> + __ns16550_resume(port); >> +} >> + >> #ifdef CONFIG_X86 >> static void __init ns16550_endboot(struct serial_port *port) >> { >> -- >> 1.7.9.5 >> >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xen.org >> http://lists.xen.org/xen-devel
Malcolm Crossley
2013-Jan-16 21:40 UTC
Re: [PATCH] ns16550: delay resume until dom0 ACPI has a chance to run
On 16/01/13 21:31, Ben Guthro wrote:> > On 01/16/2013 04:24 PM, Pasi Kärkkäinen wrote: >> On Wed, Jan 16, 2013 at 03:20:56PM -0500, Ben Guthro wrote: >>> Check for ioport access, before fully resuming operation, to avoid >>> spinning in __ns16550_poll when reading the LSR register returns 0xFF >>> on failing ioport access. >>> >>> On some systems, there is a SuperIO card that provides this legacy ioport >>> on the LPC bus. >>> >>> In this case, we need to wait for dom0''s ACPI processing to run the proper >>> AML to re-initialize the chip, before we can use the card again. >>> >>> This may cause a small amount of garbage to be written >>> to the serial log while we wait patiently for that AML to >>> be executed. >>> >>> It limits the number of retries, to avoid a situation where we keep trying >>> over and over again, in the case of some other failure on the ioport. >>> >> So I assume this patch fixes the ACPI S3 suspend/resume on the Lenovo T430/T530 laptops? >> It might be good to mention that aswell.. > Yes, it seems to resolve the issues seen on T430, T530, and Intel > Ivybridge mobile SDP. It likely fixes others, as the Intel mobile SDP is > the reference platform for all the OEMs. > > If I need to re-submit this patch for any other reason, I will be sure > to mention this in the comment. > > That said - there seems to still be an issue on some older Sandy Bridge > machines, like a Lenovo T520, and X220 - but I don''t yet know if it is > related.Do these laptops (T430/T530) have built in serial? Or is the hang fixed because you were using a serial Express card to debug? BTW, nearly all PC''s have external SUPERIO devices, it just seems we have managed to be lucky that most platforms seem to default to using the BIOS to enable the serial port after resume instead of the OS. Malcolm> Ben > > > >> -- Pasi >> >> >>> Signed-Off-By: Ben Guthro <benjamin.guthro@citrix.com> >>> --- >>> xen/drivers/char/ns16550.c | 56 +++++++++++++++++++++++++++++++++++++++++++- >>> 1 file changed, 55 insertions(+), 1 deletion(-) >>> >>> diff --git a/xen/drivers/char/ns16550.c b/xen/drivers/char/ns16550.c >>> index d77042e..27555c7 100644 >>> --- a/xen/drivers/char/ns16550.c >>> +++ b/xen/drivers/char/ns16550.c >>> @@ -42,6 +42,7 @@ static struct ns16550 { >>> struct irqaction irqaction; >>> /* UART with no IRQ line: periodically-polled I/O. */ >>> struct timer timer; >>> + struct timer resume_timer; >>> unsigned int timeout_ms; >>> bool_t intr_works; >>> /* PCI card parameters. */ >>> @@ -120,6 +121,10 @@ static struct ns16550 { >>> /* Frequency of external clock source. This definition assumes PC platform. */ >>> #define UART_CLOCK_HZ 1843200 >>> >>> +/* Resume retry settings */ >>> +#define RESUME_DELAY MILLISECS(10) >>> +#define RESUME_RETRIES 100 >>> + >>> static char ns_read_reg(struct ns16550 *uart, int reg) >>> { >>> if ( uart->remapped_io_base == NULL ) >>> @@ -330,7 +335,7 @@ static void ns16550_suspend(struct serial_port *port) >>> uart->ps_bdf[2], PCI_COMMAND); >>> } >>> >>> -static void ns16550_resume(struct serial_port *port) >>> +static void __ns16550_resume(struct serial_port *port) >>> { >>> struct ns16550 *uart = port->uart; >>> >>> @@ -346,6 +351,55 @@ static void ns16550_resume(struct serial_port *port) >>> ns16550_setup_postirq(port->uart); >>> } >>> >>> +static int ns16550_ioport_invalid(struct ns16550 *uart) >>> +{ >>> + return ((((unsigned char)ns_read_reg(uart, LSR)) == 0xff) && >>> + (((unsigned char)ns_read_reg(uart, MCR)) == 0xff) && >>> + (((unsigned char)ns_read_reg(uart, IER)) == 0xff) && >>> + (((unsigned char)ns_read_reg(uart, IIR)) == 0xff) && >>> + (((unsigned char)ns_read_reg(uart, LCR)) == 0xff)); >>> +} >>> + >>> +static int delayed_resume_tries; >>> +static void ns16550_delayed_resume(void *data) >>> +{ >>> + struct serial_port *port = data; >>> + struct ns16550 *uart = port->uart; >>> + >>> + if (ns16550_ioport_invalid(port->uart) && delayed_resume_tries--) { >>> + set_timer(&uart->resume_timer, NOW() + RESUME_DELAY); >>> + return; >>> + } >>> + >>> + __ns16550_resume(port); >>> +} >>> + >>> +static void ns16550_resume(struct serial_port *port) >>> +{ >>> + struct ns16550 *uart = port->uart; >>> + >>> + /* >>> + * Check for ioport access, before fully resuming operation. >>> + * On some systems, there is a SuperIO card that provides >>> + * this legacy ioport on the LPC bus. >>> + * >>> + * We need to wait for dom0''s ACPI processing to run the proper >>> + * AML to re-initialize the chip, before we can use the card again. >>> + * >>> + * This may cause a small amount of garbage to be written >>> + * to the serial log while we wait patiently for that AML to >>> + * be executed. >>> + */ >>> + if (ns16550_ioport_invalid(uart)) { >>> + delayed_resume_tries = RESUME_RETRIES; >>> + init_timer(&uart->resume_timer, ns16550_delayed_resume, port, 0); >>> + set_timer(&uart->resume_timer, NOW() + RESUME_DELAY); >>> + return; >>> + } >>> + >>> + __ns16550_resume(port); >>> +} >>> + >>> #ifdef CONFIG_X86 >>> static void __init ns16550_endboot(struct serial_port *port) >>> { >>> -- >>> 1.7.9.5 >>> >>> >>> _______________________________________________ >>> Xen-devel mailing list >>> Xen-devel@lists.xen.org >>> http://lists.xen.org/xen-devel > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel
Ben Guthro
2013-Jan-16 21:48 UTC
Re: [PATCH] ns16550: delay resume until dom0 ACPI has a chance to run
On Wed, Jan 16, 2013 at 4:40 PM, Malcolm Crossley <malcolm.crossley@citrix.com> wrote:> Do these laptops (T430/T530) have built in serial?They seem to have the hardware for it, but no actual serial connector out of the machine. This hardware provides the legacy port that Xen initializes in xen/arch/x86/setup.c __start_xen() When the resume happened, it was getting stuck in __ns16550_poll() because it thought that the LSR register was 0xFF - and had lots of data to read. It got stuck in that while loop, and never exited.> > Or is the hang fixed because you were using a serial Express card to debug? > > BTW, nearly all PC''s have external SUPERIO devices, it just seems we have > managed to be lucky that most platforms seem to default to using the BIOS to > enable the serial port after resume instead of the OS. > > Malcolm > >> Ben >> >> >> >>> -- Pasi >>> >>> >>>> Signed-Off-By: Ben Guthro <benjamin.guthro@citrix.com> >>>> --- >>>> xen/drivers/char/ns16550.c | 56 >>>> +++++++++++++++++++++++++++++++++++++++++++- >>>> 1 file changed, 55 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/xen/drivers/char/ns16550.c b/xen/drivers/char/ns16550.c >>>> index d77042e..27555c7 100644 >>>> --- a/xen/drivers/char/ns16550.c >>>> +++ b/xen/drivers/char/ns16550.c >>>> @@ -42,6 +42,7 @@ static struct ns16550 { >>>> struct irqaction irqaction; >>>> /* UART with no IRQ line: periodically-polled I/O. */ >>>> struct timer timer; >>>> + struct timer resume_timer; >>>> unsigned int timeout_ms; >>>> bool_t intr_works; >>>> /* PCI card parameters. */ >>>> @@ -120,6 +121,10 @@ static struct ns16550 { >>>> /* Frequency of external clock source. This definition assumes PC >>>> platform. */ >>>> #define UART_CLOCK_HZ 1843200 >>>> >>>> +/* Resume retry settings */ >>>> +#define RESUME_DELAY MILLISECS(10) >>>> +#define RESUME_RETRIES 100 >>>> + >>>> static char ns_read_reg(struct ns16550 *uart, int reg) >>>> { >>>> if ( uart->remapped_io_base == NULL ) >>>> @@ -330,7 +335,7 @@ static void ns16550_suspend(struct serial_port >>>> *port) >>>> uart->ps_bdf[2], PCI_COMMAND); >>>> } >>>> >>>> -static void ns16550_resume(struct serial_port *port) >>>> +static void __ns16550_resume(struct serial_port *port) >>>> { >>>> struct ns16550 *uart = port->uart; >>>> >>>> @@ -346,6 +351,55 @@ static void ns16550_resume(struct serial_port >>>> *port) >>>> ns16550_setup_postirq(port->uart); >>>> } >>>> >>>> +static int ns16550_ioport_invalid(struct ns16550 *uart) >>>> +{ >>>> + return ((((unsigned char)ns_read_reg(uart, LSR)) == 0xff) && >>>> + (((unsigned char)ns_read_reg(uart, MCR)) == 0xff) && >>>> + (((unsigned char)ns_read_reg(uart, IER)) == 0xff) && >>>> + (((unsigned char)ns_read_reg(uart, IIR)) == 0xff) && >>>> + (((unsigned char)ns_read_reg(uart, LCR)) == 0xff)); >>>> +} >>>> + >>>> +static int delayed_resume_tries; >>>> +static void ns16550_delayed_resume(void *data) >>>> +{ >>>> + struct serial_port *port = data; >>>> + struct ns16550 *uart = port->uart; >>>> + >>>> + if (ns16550_ioport_invalid(port->uart) && delayed_resume_tries--) { >>>> + set_timer(&uart->resume_timer, NOW() + RESUME_DELAY); >>>> + return; >>>> + } >>>> + >>>> + __ns16550_resume(port); >>>> +} >>>> + >>>> +static void ns16550_resume(struct serial_port *port) >>>> +{ >>>> + struct ns16550 *uart = port->uart; >>>> + >>>> + /* >>>> + * Check for ioport access, before fully resuming operation. >>>> + * On some systems, there is a SuperIO card that provides >>>> + * this legacy ioport on the LPC bus. >>>> + * >>>> + * We need to wait for dom0''s ACPI processing to run the proper >>>> + * AML to re-initialize the chip, before we can use the card again. >>>> + * >>>> + * This may cause a small amount of garbage to be written >>>> + * to the serial log while we wait patiently for that AML to >>>> + * be executed. >>>> + */ >>>> + if (ns16550_ioport_invalid(uart)) { >>>> + delayed_resume_tries = RESUME_RETRIES; >>>> + init_timer(&uart->resume_timer, ns16550_delayed_resume, port, >>>> 0); >>>> + set_timer(&uart->resume_timer, NOW() + RESUME_DELAY); >>>> + return; >>>> + } >>>> + >>>> + __ns16550_resume(port); >>>> +} >>>> + >>>> #ifdef CONFIG_X86 >>>> static void __init ns16550_endboot(struct serial_port *port) >>>> { >>>> -- >>>> 1.7.9.5 >>>> >>>> >>>> _______________________________________________ >>>> Xen-devel mailing list >>>> Xen-devel@lists.xen.org >>>> http://lists.xen.org/xen-devel >> >> _______________________________________________ >> Xen-devel mailing list >> Xen-devel@lists.xen.org >> http://lists.xen.org/xen-devel > > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel
Pasi Kärkkäinen
2013-Jan-16 21:52 UTC
Re: [PATCH] ns16550: delay resume until dom0 ACPI has a chance to run
On Wed, Jan 16, 2013 at 09:40:40PM +0000, Malcolm Crossley wrote:> On 16/01/13 21:31, Ben Guthro wrote: > > > >On 01/16/2013 04:24 PM, Pasi Kärkkäinen wrote: > >>On Wed, Jan 16, 2013 at 03:20:56PM -0500, Ben Guthro wrote: > >>>Check for ioport access, before fully resuming operation, to avoid > >>>spinning in __ns16550_poll when reading the LSR register returns 0xFF > >>>on failing ioport access. > >>> > >>>On some systems, there is a SuperIO card that provides this legacy ioport > >>>on the LPC bus. > >>> > >>>In this case, we need to wait for dom0''s ACPI processing to run the proper > >>>AML to re-initialize the chip, before we can use the card again. > >>> > >>>This may cause a small amount of garbage to be written > >>>to the serial log while we wait patiently for that AML to > >>>be executed. > >>> > >>>It limits the number of retries, to avoid a situation where we keep trying > >>>over and over again, in the case of some other failure on the ioport. > >>> > >>So I assume this patch fixes the ACPI S3 suspend/resume on the Lenovo T430/T530 laptops? > >>It might be good to mention that aswell.. > >Yes, it seems to resolve the issues seen on T430, T530, and Intel > >Ivybridge mobile SDP. It likely fixes others, as the Intel mobile SDP is > >the reference platform for all the OEMs. > > > >If I need to re-submit this patch for any other reason, I will be sure > >to mention this in the comment. > > > >That said - there seems to still be an issue on some older Sandy Bridge > >machines, like a Lenovo T520, and X220 - but I don''t yet know if it is > >related. > > Do these laptops (T430/T530) have built in serial? > > Or is the hang fixed because you were using a serial Express card to debug? >At least my T430 laptop is "Intel Core i7 vPro" model, so it has the Intel AMT (Active Management Technology), which provides Serial Over LAN (SOL) port. -- Pasi> BTW, nearly all PC''s have external SUPERIO devices, it just seems we > have managed to be lucky that most platforms seem to default to > using the BIOS to enable the serial port after resume instead of the > OS. > > Malcolm > >Ben > > > > > > > >>-- Pasi > >> > >> > >>>Signed-Off-By: Ben Guthro <benjamin.guthro@citrix.com> > >>>--- > >>> xen/drivers/char/ns16550.c | 56 +++++++++++++++++++++++++++++++++++++++++++- > >>> 1 file changed, 55 insertions(+), 1 deletion(-) > >>> > >>>diff --git a/xen/drivers/char/ns16550.c b/xen/drivers/char/ns16550.c > >>>index d77042e..27555c7 100644 > >>>--- a/xen/drivers/char/ns16550.c > >>>+++ b/xen/drivers/char/ns16550.c > >>>@@ -42,6 +42,7 @@ static struct ns16550 { > >>> struct irqaction irqaction; > >>> /* UART with no IRQ line: periodically-polled I/O. */ > >>> struct timer timer; > >>>+ struct timer resume_timer; > >>> unsigned int timeout_ms; > >>> bool_t intr_works; > >>> /* PCI card parameters. */ > >>>@@ -120,6 +121,10 @@ static struct ns16550 { > >>> /* Frequency of external clock source. This definition assumes PC platform. */ > >>> #define UART_CLOCK_HZ 1843200 > >>> > >>>+/* Resume retry settings */ > >>>+#define RESUME_DELAY MILLISECS(10) > >>>+#define RESUME_RETRIES 100 > >>>+ > >>> static char ns_read_reg(struct ns16550 *uart, int reg) > >>> { > >>> if ( uart->remapped_io_base == NULL ) > >>>@@ -330,7 +335,7 @@ static void ns16550_suspend(struct serial_port *port) > >>> uart->ps_bdf[2], PCI_COMMAND); > >>> } > >>> > >>>-static void ns16550_resume(struct serial_port *port) > >>>+static void __ns16550_resume(struct serial_port *port) > >>> { > >>> struct ns16550 *uart = port->uart; > >>> > >>>@@ -346,6 +351,55 @@ static void ns16550_resume(struct serial_port *port) > >>> ns16550_setup_postirq(port->uart); > >>> } > >>> > >>>+static int ns16550_ioport_invalid(struct ns16550 *uart) > >>>+{ > >>>+ return ((((unsigned char)ns_read_reg(uart, LSR)) == 0xff) && > >>>+ (((unsigned char)ns_read_reg(uart, MCR)) == 0xff) && > >>>+ (((unsigned char)ns_read_reg(uart, IER)) == 0xff) && > >>>+ (((unsigned char)ns_read_reg(uart, IIR)) == 0xff) && > >>>+ (((unsigned char)ns_read_reg(uart, LCR)) == 0xff)); > >>>+} > >>>+ > >>>+static int delayed_resume_tries; > >>>+static void ns16550_delayed_resume(void *data) > >>>+{ > >>>+ struct serial_port *port = data; > >>>+ struct ns16550 *uart = port->uart; > >>>+ > >>>+ if (ns16550_ioport_invalid(port->uart) && delayed_resume_tries--) { > >>>+ set_timer(&uart->resume_timer, NOW() + RESUME_DELAY); > >>>+ return; > >>>+ } > >>>+ > >>>+ __ns16550_resume(port); > >>>+} > >>>+ > >>>+static void ns16550_resume(struct serial_port *port) > >>>+{ > >>>+ struct ns16550 *uart = port->uart; > >>>+ > >>>+ /* > >>>+ * Check for ioport access, before fully resuming operation. > >>>+ * On some systems, there is a SuperIO card that provides > >>>+ * this legacy ioport on the LPC bus. > >>>+ * > >>>+ * We need to wait for dom0''s ACPI processing to run the proper > >>>+ * AML to re-initialize the chip, before we can use the card again. > >>>+ * > >>>+ * This may cause a small amount of garbage to be written > >>>+ * to the serial log while we wait patiently for that AML to > >>>+ * be executed. > >>>+ */ > >>>+ if (ns16550_ioport_invalid(uart)) { > >>>+ delayed_resume_tries = RESUME_RETRIES; > >>>+ init_timer(&uart->resume_timer, ns16550_delayed_resume, port, 0); > >>>+ set_timer(&uart->resume_timer, NOW() + RESUME_DELAY); > >>>+ return; > >>>+ } > >>>+ > >>>+ __ns16550_resume(port); > >>>+} > >>>+ > >>> #ifdef CONFIG_X86 > >>> static void __init ns16550_endboot(struct serial_port *port) > >>> { > >>>-- > >>>1.7.9.5 > >>> > >>> > >>>_______________________________________________ > >>>Xen-devel mailing list > >>>Xen-devel@lists.xen.org > >>>http://lists.xen.org/xen-devel > >_______________________________________________ > >Xen-devel mailing list > >Xen-devel@lists.xen.org > >http://lists.xen.org/xen-devel > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel
Pasi Kärkkäinen
2013-Jan-16 21:54 UTC
Re: [PATCH] ns16550: delay resume until dom0 ACPI has a chance to run
On Wed, Jan 16, 2013 at 04:48:41PM -0500, Ben Guthro wrote:> On Wed, Jan 16, 2013 at 4:40 PM, Malcolm Crossley > <malcolm.crossley@citrix.com> wrote: > > > Do these laptops (T430/T530) have built in serial? > > They seem to have the hardware for it, but no actual serial connector > out of the machine. >I believe you can access the serial port over the network, thru the AMT management feature. -- Pasi> This hardware provides the legacy port that Xen initializes in > xen/arch/x86/setup.c __start_xen() > > When the resume happened, it was getting stuck in __ns16550_poll() > because it thought that the > LSR register was 0xFF - and had lots of data to read. It got stuck in > that while loop, and never > exited. > > > > > > Or is the hang fixed because you were using a serial Express card to debug? > > > > BTW, nearly all PC''s have external SUPERIO devices, it just seems we have > > managed to be lucky that most platforms seem to default to using the BIOS to > > enable the serial port after resume instead of the OS. > > > > Malcolm > > > >> Ben > >> > >> > >> > >>> -- Pasi > >>> > >>> > >>>> Signed-Off-By: Ben Guthro <benjamin.guthro@citrix.com> > >>>> --- > >>>> xen/drivers/char/ns16550.c | 56 > >>>> +++++++++++++++++++++++++++++++++++++++++++- > >>>> 1 file changed, 55 insertions(+), 1 deletion(-) > >>>> > >>>> diff --git a/xen/drivers/char/ns16550.c b/xen/drivers/char/ns16550.c > >>>> index d77042e..27555c7 100644 > >>>> --- a/xen/drivers/char/ns16550.c > >>>> +++ b/xen/drivers/char/ns16550.c > >>>> @@ -42,6 +42,7 @@ static struct ns16550 { > >>>> struct irqaction irqaction; > >>>> /* UART with no IRQ line: periodically-polled I/O. */ > >>>> struct timer timer; > >>>> + struct timer resume_timer; > >>>> unsigned int timeout_ms; > >>>> bool_t intr_works; > >>>> /* PCI card parameters. */ > >>>> @@ -120,6 +121,10 @@ static struct ns16550 { > >>>> /* Frequency of external clock source. This definition assumes PC > >>>> platform. */ > >>>> #define UART_CLOCK_HZ 1843200 > >>>> > >>>> +/* Resume retry settings */ > >>>> +#define RESUME_DELAY MILLISECS(10) > >>>> +#define RESUME_RETRIES 100 > >>>> + > >>>> static char ns_read_reg(struct ns16550 *uart, int reg) > >>>> { > >>>> if ( uart->remapped_io_base == NULL ) > >>>> @@ -330,7 +335,7 @@ static void ns16550_suspend(struct serial_port > >>>> *port) > >>>> uart->ps_bdf[2], PCI_COMMAND); > >>>> } > >>>> > >>>> -static void ns16550_resume(struct serial_port *port) > >>>> +static void __ns16550_resume(struct serial_port *port) > >>>> { > >>>> struct ns16550 *uart = port->uart; > >>>> > >>>> @@ -346,6 +351,55 @@ static void ns16550_resume(struct serial_port > >>>> *port) > >>>> ns16550_setup_postirq(port->uart); > >>>> } > >>>> > >>>> +static int ns16550_ioport_invalid(struct ns16550 *uart) > >>>> +{ > >>>> + return ((((unsigned char)ns_read_reg(uart, LSR)) == 0xff) && > >>>> + (((unsigned char)ns_read_reg(uart, MCR)) == 0xff) && > >>>> + (((unsigned char)ns_read_reg(uart, IER)) == 0xff) && > >>>> + (((unsigned char)ns_read_reg(uart, IIR)) == 0xff) && > >>>> + (((unsigned char)ns_read_reg(uart, LCR)) == 0xff)); > >>>> +} > >>>> + > >>>> +static int delayed_resume_tries; > >>>> +static void ns16550_delayed_resume(void *data) > >>>> +{ > >>>> + struct serial_port *port = data; > >>>> + struct ns16550 *uart = port->uart; > >>>> + > >>>> + if (ns16550_ioport_invalid(port->uart) && delayed_resume_tries--) { > >>>> + set_timer(&uart->resume_timer, NOW() + RESUME_DELAY); > >>>> + return; > >>>> + } > >>>> + > >>>> + __ns16550_resume(port); > >>>> +} > >>>> + > >>>> +static void ns16550_resume(struct serial_port *port) > >>>> +{ > >>>> + struct ns16550 *uart = port->uart; > >>>> + > >>>> + /* > >>>> + * Check for ioport access, before fully resuming operation. > >>>> + * On some systems, there is a SuperIO card that provides > >>>> + * this legacy ioport on the LPC bus. > >>>> + * > >>>> + * We need to wait for dom0''s ACPI processing to run the proper > >>>> + * AML to re-initialize the chip, before we can use the card again. > >>>> + * > >>>> + * This may cause a small amount of garbage to be written > >>>> + * to the serial log while we wait patiently for that AML to > >>>> + * be executed. > >>>> + */ > >>>> + if (ns16550_ioport_invalid(uart)) { > >>>> + delayed_resume_tries = RESUME_RETRIES; > >>>> + init_timer(&uart->resume_timer, ns16550_delayed_resume, port, > >>>> 0); > >>>> + set_timer(&uart->resume_timer, NOW() + RESUME_DELAY); > >>>> + return; > >>>> + } > >>>> + > >>>> + __ns16550_resume(port); > >>>> +} > >>>> + > >>>> #ifdef CONFIG_X86 > >>>> static void __init ns16550_endboot(struct serial_port *port) > >>>> { > >>>> -- > >>>> 1.7.9.5 > >>>> > >>>> > >>>> _______________________________________________ > >>>> Xen-devel mailing list > >>>> Xen-devel@lists.xen.org > >>>> http://lists.xen.org/xen-devel > >> > >> _______________________________________________ > >> Xen-devel mailing list > >> Xen-devel@lists.xen.org > >> http://lists.xen.org/xen-devel > > > > > > > > _______________________________________________ > > Xen-devel mailing list > > Xen-devel@lists.xen.org > > http://lists.xen.org/xen-devel > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel
Jan Beulich
2013-Jan-17 11:09 UTC
Re: [PATCH] ns16550: delay resume until dom0 ACPI has a chance to run
>>> On 16.01.13 at 22:48, Ben Guthro <ben@guthro.net> wrote: > On Wed, Jan 16, 2013 at 4:40 PM, Malcolm Crossley > <malcolm.crossley@citrix.com> wrote: > >> Do these laptops (T430/T530) have built in serial? > > They seem to have the hardware for it, but no actual serial connector > out of the machine. > This hardware provides the legacy port that Xen initializes in > xen/arch/x86/setup.c __start_xen() > > When the resume happened, it was getting stuck in __ns16550_poll() > because it thought that the > LSR register was 0xFF - and had lots of data to read. It got stuck in > that while loop, and never > exited.So before acking the patch I''d like to understand how we end up in that loop even when no serial console is in use. Assuming that''s because the post-IRQ initialization (mostly) unconditionally inserts the timer, that shouldn''t be an issue on -unstable (as post-IRQ init of the individual drivers doesn''t get called anymore when no respective command line option was present, and likewise their suspend/resume handlers don''t get called anymore in that case). In which case backporting from -unstable would be preferable over putting custom stuff on the 4.x branches (albeit we likely still want the change here to have a way to resume with serial console, but the impact would be quite different). Jan
Ben Guthro
2013-Jan-17 12:04 UTC
Re: [PATCH] ns16550: delay resume until dom0 ACPI has a chance to run
On Jan 17, 2013, at 6:09 AM, Jan Beulich <JBeulich@suse.com> wrote:>>>> On 16.01.13 at 22:48, Ben Guthro <ben@guthro.net> wrote: >> On Wed, Jan 16, 2013 at 4:40 PM, Malcolm Crossley >> <malcolm.crossley@citrix.com> wrote: >> >>> Do these laptops (T430/T530) have built in serial? >> >> They seem to have the hardware for it, but no actual serial connector >> out of the machine. >> This hardware provides the legacy port that Xen initializes in >> xen/arch/x86/setup.c __start_xen() >> >> When the resume happened, it was getting stuck in __ns16550_poll() >> because it thought that the >> LSR register was 0xFF - and had lots of data to read. It got stuck in >> that while loop, and never >> exited. > > So before acking the patch I''d like to understand how we end up > in that loop even when no serial console is in use. Assuming that''s > because the post-IRQ initialization (mostly) unconditionally inserts > the timer, that shouldn''t be an issue on -unstable (as post-IRQ > init of the individual drivers doesn''t get called anymore when no > respective command line option was present, and likewise their > suspend/resume handlers don''t get called anymore in that case). > In which case backporting from -unstable would be preferable > over putting custom stuff on the 4.x branches (albeit we likely > still want the change here to have a way to resume with serial > console, but the impact would be quite different). > > Jan >Admittedly, I have been doing my testing on 4.2.y I can try unstable today to see if it makes a difference in this path.
Ben Guthro
2013-Jan-17 13:37 UTC
Re: [PATCH] ns16550: delay resume until dom0 ACPI has a chance to run
On Thu, Jan 17, 2013 at 7:04 AM, Ben Guthro <ben.guthro@gmail.com> wrote:> On Jan 17, 2013, at 6:09 AM, Jan Beulich <JBeulich@suse.com> wrote: > >>>>> On 16.01.13 at 22:48, Ben Guthro <ben@guthro.net> wrote: >>> On Wed, Jan 16, 2013 at 4:40 PM, Malcolm Crossley >>> <malcolm.crossley@citrix.com> wrote: >>> >>>> Do these laptops (T430/T530) have built in serial? >>> >>> They seem to have the hardware for it, but no actual serial connector >>> out of the machine. >>> This hardware provides the legacy port that Xen initializes in >>> xen/arch/x86/setup.c __start_xen() >>> >>> When the resume happened, it was getting stuck in __ns16550_poll() >>> because it thought that the >>> LSR register was 0xFF - and had lots of data to read. It got stuck in >>> that while loop, and never >>> exited. >> >> So before acking the patch I''d like to understand how we end up >> in that loop even when no serial console is in use. Assuming that''s >> because the post-IRQ initialization (mostly) unconditionally inserts >> the timer, that shouldn''t be an issue on -unstable (as post-IRQ >> init of the individual drivers doesn''t get called anymore when no >> respective command line option was present, and likewise their >> suspend/resume handlers don''t get called anymore in that case). >> In which case backporting from -unstable would be preferable >> over putting custom stuff on the 4.x branches (albeit we likely >> still want the change here to have a way to resume with serial >> console, but the impact would be quite different). >> >> Jan >> > > Admittedly, I have been doing my testing on 4.2.y > > I can try unstable today to see if it makes a difference in this path.Further testing on the Lenovo T430 shows that this fix is insufficient to fully resolve the S3 failures, (even on 4.2.y) The first one seems to work, but the second fails - which is a different behavior than I am seeing on the Intel Mobile SDP. So while I think this is a valuable patch to have for debugging S3, it is not the root cause of the failures as I had previously believed. ...back to the drawing board, I guess.
Konrad Rzeszutek Wilk
2013-Jan-18 15:53 UTC
Re: [PATCH] ns16550: delay resume until dom0 ACPI has a chance to run
On Thu, Jan 17, 2013 at 08:37:40AM -0500, Ben Guthro wrote:> On Thu, Jan 17, 2013 at 7:04 AM, Ben Guthro <ben.guthro@gmail.com> wrote: > > On Jan 17, 2013, at 6:09 AM, Jan Beulich <JBeulich@suse.com> wrote: > > > >>>>> On 16.01.13 at 22:48, Ben Guthro <ben@guthro.net> wrote: > >>> On Wed, Jan 16, 2013 at 4:40 PM, Malcolm Crossley > >>> <malcolm.crossley@citrix.com> wrote: > >>> > >>>> Do these laptops (T430/T530) have built in serial? > >>> > >>> They seem to have the hardware for it, but no actual serial connector > >>> out of the machine. > >>> This hardware provides the legacy port that Xen initializes in > >>> xen/arch/x86/setup.c __start_xen() > >>> > >>> When the resume happened, it was getting stuck in __ns16550_poll() > >>> because it thought that the > >>> LSR register was 0xFF - and had lots of data to read. It got stuck in > >>> that while loop, and never > >>> exited. > >> > >> So before acking the patch I''d like to understand how we end up > >> in that loop even when no serial console is in use. Assuming that''s > >> because the post-IRQ initialization (mostly) unconditionally inserts > >> the timer, that shouldn''t be an issue on -unstable (as post-IRQ > >> init of the individual drivers doesn''t get called anymore when no > >> respective command line option was present, and likewise their > >> suspend/resume handlers don''t get called anymore in that case). > >> In which case backporting from -unstable would be preferable > >> over putting custom stuff on the 4.x branches (albeit we likely > >> still want the change here to have a way to resume with serial > >> console, but the impact would be quite different). > >> > >> Jan > >> > > > > Admittedly, I have been doing my testing on 4.2.y > > > > I can try unstable today to see if it makes a difference in this path. > > Further testing on the Lenovo T430 shows that this fix is insufficient > to fully resolve the S3 failures, (even on 4.2.y) > The first one seems to work, but the second fails - which is a > different behavior than I am seeing on the Intel Mobile SDP. > > So while I think this is a valuable patch to have for debugging S3, it > is not the root cause of the failures as I had previously believed. > > ...back to the drawing board, I guess.I just tried Xen 4.3 without trying to use any serial console, and found out that it succesfully came back from resume. But only with one CPU active. Doing ''xl debug-keys r'' before shows: (XEN) sched_smt_power_savings: disabled (XEN) NOW=0x000000099F1C9611 (XEN) Idle cpupool: (XEN) Scheduler: SMP Credit Scheduler (credit) (XEN) info: (XEN) ncpus = 2 (XEN) master = 0 (XEN) credit = 600 (XEN) credit balance = 0 (XEN) weight = 0 (XEN) runq_sort = 380 (XEN) default-weight = 256 (XEN) tslice = 30ms (XEN) ratelimit = 1000us (XEN) credits per msec = 10 (XEN) ticks per tslice = 3 (XEN) migration delay = 0us (XEN) idlers: 02 (XEN) active vcpus: (XEN) Cpupool 0: (XEN) Scheduler: SMP Credit Scheduler (credit) (XEN) info: (XEN) ncpus = 2 (XEN) master = 0 (XEN) credit = 600 (XEN) credit balance = 0 (XEN) weight = 0 (XEN) runq_sort = 380 (XEN) default-weight = 256 (XEN) tslice = 30ms (XEN) ratelimit = 1000us (XEN) credits per msec = 10 (XEN) ticks per tslice = 3 (XEN) migration delay = 0us (XEN) idlers: 02 (XEN) active vcpus: (XEN) CPU[00] sort=380, sibling=01, core=03 (XEN) run: [0.0] pri=0 flags=0 cpu=0 credit=172 [w=256] (XEN) 1: [32767.0] pri=-64 flags=0 cpu=0 (XEN) CPU[01] sort=380, sibling=02, core=03 (XEN) run: [32767.1] pri=-64 flags=0 cpu=1 after ACPI S3 resume: (XEN) Idle cpupool: (XEN) Scheduler: SMP Credit Scheduler (credit) (XEN) info: (XEN) ncpus = 2 (XEN) master = 0 (XEN) credit = 600 (XEN) credit balance = 0 (XEN) weight = 0 (XEN) runq_sort = 471 (XEN) default-weight = 256 (XEN) tslice = 30ms (XEN) ratelimit = 1000us (XEN) credits per msec = 10 (XEN) ticks per tslice = 3 (XEN) migration delay = 0us (XEN) idlers: 02 (XEN) active vcpus: (XEN) Cpupool 0: (XEN) Scheduler: SMP Credit Scheduler (credit) (XEN) info: (XEN) ncpus = 2 (XEN) master = 0 (XEN) credit = 600 (XEN) credit balance = 0 (XEN) weight = 0 (XEN) runq_sort = 471 (XEN) default-weight = 256 (XEN) tslice = 30ms (XEN) ratelimit = 1000us (XEN) credits per msec = 10 (XEN) ticks per tslice = 3 (XEN) migration delay = 0us (XEN) idlers: 02 (XEN) active vcpus: (XEN) CPU[00] sort=471, sibling=01, core=03 (XEN) run: [0.1] pri=0 flags=0 cpu=0 credit=0 [w=256] (XEN) 1: [0.0] pri=0 flags=0 cpu=0 credit=-120 [w=256] (XEN) 2: [32767.0] pri=-64 flags=0 cpu=0 Where did the other CPU go!?
Ben Guthro
2013-Jan-18 20:07 UTC
Re: [PATCH] ns16550: delay resume until dom0 ACPI has a chance to run
On Fri, Jan 18, 2013 at 10:53 AM, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:> Where did the other CPU go!?strange. We aren''t seeing that particular failure.
Ben Guthro
2013-Apr-22 13:51 UTC
Re: [PATCH] ns16550: delay resume until dom0 ACPI has a chance to run
On Thu, Jan 17, 2013 at 11:09 AM, Jan Beulich <JBeulich@suse.com> wrote:>>>> On 16.01.13 at 22:48, Ben Guthro <ben@guthro.net> wrote: >> On Wed, Jan 16, 2013 at 4:40 PM, Malcolm Crossley >> <malcolm.crossley@citrix.com> wrote: >> >>> Do these laptops (T430/T530) have built in serial? >> >> They seem to have the hardware for it, but no actual serial connector >> out of the machine. >> This hardware provides the legacy port that Xen initializes in >> xen/arch/x86/setup.c __start_xen() >> >> When the resume happened, it was getting stuck in __ns16550_poll() >> because it thought that the >> LSR register was 0xFF - and had lots of data to read. It got stuck in >> that while loop, and never >> exited. > > So before acking the patch I''d like to understand how we end up > in that loop even when no serial console is in use. Assuming that''s > because the post-IRQ initialization (mostly) unconditionally inserts > the timer, that shouldn''t be an issue on -unstable (as post-IRQ > init of the individual drivers doesn''t get called anymore when no > respective command line option was present, and likewise their > suspend/resume handlers don''t get called anymore in that case). > In which case backporting from -unstable would be preferable > over putting custom stuff on the 4.x branches (albeit we likely > still want the change here to have a way to resume with serial > console, but the impact would be quite different). > > JanI''m dragging this thread back up from the archives, because I think it is still applicable, and now that some of the other S3 scheduling things are explained, this is a good fix. My assertion in this thread about getting into this loop when no serial is in use turned out to be false. However, if you are using serial with one of these SuperIO controllers on the LPC bus, you will get into an infinite loop at resume time. Ben
Jan Beulich
2013-Apr-23 06:40 UTC
Re: [PATCH] ns16550: delay resume until dom0 ACPI has a chance to run
>>> On 22.04.13 at 15:51, Ben Guthro <ben@guthro.net> wrote: > On Thu, Jan 17, 2013 at 11:09 AM, Jan Beulich <JBeulich@suse.com> wrote: >>>>> On 16.01.13 at 22:48, Ben Guthro <ben@guthro.net> wrote: >>> On Wed, Jan 16, 2013 at 4:40 PM, Malcolm Crossley >>> <malcolm.crossley@citrix.com> wrote: >>> >>>> Do these laptops (T430/T530) have built in serial? >>> >>> They seem to have the hardware for it, but no actual serial connector >>> out of the machine. >>> This hardware provides the legacy port that Xen initializes in >>> xen/arch/x86/setup.c __start_xen() >>> >>> When the resume happened, it was getting stuck in __ns16550_poll() >>> because it thought that the >>> LSR register was 0xFF - and had lots of data to read. It got stuck in >>> that while loop, and never >>> exited. >> >> So before acking the patch I''d like to understand how we end up >> in that loop even when no serial console is in use. Assuming that''s >> because the post-IRQ initialization (mostly) unconditionally inserts >> the timer, that shouldn''t be an issue on -unstable (as post-IRQ >> init of the individual drivers doesn''t get called anymore when no >> respective command line option was present, and likewise their >> suspend/resume handlers don''t get called anymore in that case). >> In which case backporting from -unstable would be preferable >> over putting custom stuff on the 4.x branches (albeit we likely >> still want the change here to have a way to resume with serial >> console, but the impact would be quite different). > > I''m dragging this thread back up from the archives, because I think it is > still applicable, and now that some of the other S3 scheduling things are > explained, this is a good fix. > > My assertion in this thread about getting into this loop when no > serial is in use turned out to be false.Good.> However, if you are using serial with one of these SuperIO controllers on > the LPC bus, you will get into an infinite loop at resume time.Yes, I can see how that can happen. However, there''s another loose end on that thread - see your response http://lists.xen.org/archives/html/xen-devel/2013-01/msg01223.html And I''d expect you to resubmit anyway, in order to - fix the description (as you had promised to Pasi) - fix various coding style issues - Cc Keir (who will have to ack the patch in order for it to go in) - make sure we''re not applying something stale after this long a time Jan
Ben Guthro
2013-Apr-23 18:56 UTC
Re: [PATCH] ns16550: delay resume until dom0 ACPI has a chance to run
On Tue, Apr 23, 2013 at 7:40 AM, Jan Beulich <JBeulich@suse.com> wrote:>>>> On 22.04.13 at 15:51, Ben Guthro <ben@guthro.net> wrote: >> On Thu, Jan 17, 2013 at 11:09 AM, Jan Beulich <JBeulich@suse.com> wrote: >>>>>> On 16.01.13 at 22:48, Ben Guthro <ben@guthro.net> wrote: >>>> On Wed, Jan 16, 2013 at 4:40 PM, Malcolm Crossley >>>> <malcolm.crossley@citrix.com> wrote: >>>> >>>>> Do these laptops (T430/T530) have built in serial? >>>> >>>> They seem to have the hardware for it, but no actual serial connector >>>> out of the machine. >>>> This hardware provides the legacy port that Xen initializes in >>>> xen/arch/x86/setup.c __start_xen() >>>> >>>> When the resume happened, it was getting stuck in __ns16550_poll() >>>> because it thought that the >>>> LSR register was 0xFF - and had lots of data to read. It got stuck in >>>> that while loop, and never >>>> exited. >>> >>> So before acking the patch I''d like to understand how we end up >>> in that loop even when no serial console is in use. Assuming that''s >>> because the post-IRQ initialization (mostly) unconditionally inserts >>> the timer, that shouldn''t be an issue on -unstable (as post-IRQ >>> init of the individual drivers doesn''t get called anymore when no >>> respective command line option was present, and likewise their >>> suspend/resume handlers don''t get called anymore in that case). >>> In which case backporting from -unstable would be preferable >>> over putting custom stuff on the 4.x branches (albeit we likely >>> still want the change here to have a way to resume with serial >>> console, but the impact would be quite different). >> >> I''m dragging this thread back up from the archives, because I think it is >> still applicable, and now that some of the other S3 scheduling things are >> explained, this is a good fix. >> >> My assertion in this thread about getting into this loop when no >> serial is in use turned out to be false. > > Good. > >> However, if you are using serial with one of these SuperIO controllers on >> the LPC bus, you will get into an infinite loop at resume time. > > Yes, I can see how that can happen. > > However, there''s another loose end on that thread - see your > response > http://lists.xen.org/archives/html/xen-devel/2013-01/msg01223.htmlThis issue was resolved with the following changeset: commit 1868a0c5c083a53f7473067ceb871bd096c72386 Author: Jan Beulich <jbeulich@suse.com> Date: Tue Sep 11 12:33:31 2012 +0200 x86/MSI: fix 2nd S3 resume with interrupt remapping enabled The first resume from S3 was corrupting internal data structures (in that pci_restore_msi_state() updated the globally stored MSI message from traditional to interrupt remapped format, which would then be translated a second time during the second resume, breaking interrupt delivery). Signed-off-by: Jan Beulich <jbeulich@suse.com> Acked-by: Keir Fraser <keir@xen.org> xen-unstable changeset: 25834:0376c85caaf3 xen-unstable date: Fri Sep 7 15:57:10 UTC 2012> > And I''d expect you to resubmit anyway, in order to > - fix the description (as you had promised to Pasi) > - fix various coding style issues > - Cc Keir (who will have to ack the patch in order for it to go in) > - make sure we''re not applying something stale after this long a > timeOK, I''ll take a look at doing this. Thanks. Ben