I've seen "serial8250: too much work for irq4" on RHEL-5 HVM guests on a RHEL-5 host with vcpus=2 when I hammer the serial console, e.g. with "od -a /boot/vmlinuz". I wonder whether the problem exists in xen-unstable as well.

Wild guess: Unlike a real UART, the virtual UART empties as quickly as the kernel can stuff in bytes. So, while the kernel has bytes to stuff, it doesn't get around to doing much else.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Markus Armbruster writes ("[Xen-devel] serial8250: too much work for irq4"):
> Wild guess: Unlike a real UART, the virtual UART empties as quickly as
> the kernel can stuff in bytes. So, while the kernel has bytes to stuff,
> it doesn't get around to doing much else.

This is certainly true and would explain the message you see.

Are there any other adverse symptoms? In principle it would be possible to add a rate limit but it seems poor to artificially rate-limit a virtual device to the wall-clock speed of the physical object.

Ian.
Markus Armbruster
2009-Feb-18 12:13 UTC
Re: [Xen-devel] serial8250: too much work for irq4
Ian Jackson <Ian.Jackson@eu.citrix.com> writes:

> Markus Armbruster writes ("[Xen-devel] serial8250: too much work for irq4"):
>> Wild guess: Unlike a real UART, the virtual UART empties as quickly as
>> the kernel can stuff in bytes. So, while the kernel has bytes to stuff,
>> it doesn't get around to doing much else.
>
> This is certainly true and would explain the message you see.
>
> Are there any other adverse symptoms? In principle it would be
> possible to add a rate limit but it seems poor to artificially
> rate-limit a virtual device to the wall-clock speed of the physical
> object.
>
> Ian.

I see funny effects where serial output stalls until some input happens, but I don't know whether that's related, or whether xen-unstable has the same problem.

The 8250 driver makes the (reasonable) assumption that the chip operates at a limited speed. All real UARTs do. The comment next to the printk in drivers/serial/8250.c says "If we hit this, we're dead." Sounds scary, but I figure it's overstating the case. The loop executes holding a spin lock, but is limited to 256 iterations. The printk fires if we hit the limit and take the emergency exit. Still, I'm worried we hog the CPU for longer than is healthy, or that taking the emergency exit isn't as harmless as it looks to me so far.
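The bounded loop described above can be modelled roughly as follows. This is a toy Python sketch, not the code from drivers/serial/8250.c: the function and helper names are hypothetical, and the 16-byte batch per pass is an assumption standing in for a FIFO fill. Only the 256-iteration cap comes from the thread. It shows why a UART that never reports "busy" drives the handler into the emergency exit, while one that needs wire time to drain does not.

```python
# Simplified model, not kernel source; all names are illustrative.
PASS_LIMIT = 256  # iteration cap mentioned in the thread


def service_uart(pending_bytes, bytes_per_pass, fifo_ready):
    """Model one interrupt-handler invocation.

    fifo_ready(pass_no) says whether the transmitter can take more data
    on that pass. Returns (bytes_left, passes, hit_limit).
    """
    passes = 0
    while pending_bytes > 0 and fifo_ready(passes):
        if passes >= PASS_LIMIT:
            # Emergency exit: this is where the driver would print
            # "too much work for irq".
            return pending_bytes, passes, True
        pending_bytes -= min(bytes_per_pass, pending_bytes)
        passes += 1
    return pending_bytes, passes, False


def real_uart_ready(pass_no):
    # Assumption: a real UART is still busy shifting bits out after one fill.
    return pass_no == 0


def virtual_uart_ready(pass_no):
    # The emulated UART drains instantly, so it is always ready.
    return True


real = service_uart(100_000, 16, real_uart_ready)
virtual = service_uart(100_000, 16, virtual_uart_ready)
print("real:", real)
print("virtual:", virtual)
```

In this model the "real" case leaves the handler after a single pass, whereas the "virtual" case runs all 256 passes and takes the emergency exit with most of the data still pending.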
Markus Armbruster writes ("Re: [Xen-devel] serial8250: too much work for irq4"):
> I see funny effects where serial output stalls until some input happens,
> but I don't know whether that's related, or whether xen-unstable has the
> same problem.

This is probably the bug that Anders Kaseorg reported on the 5th of February which I tracked down to a pair of bugs in the Linux kernel's serial driver, and therefore unrelated. See
http://lists.xensource.com/archives/html/xen-devel/2009-02/msg00372.html
and the surrounding thread.

> The 8250 driver makes the (reasonable) assumption that the chip operates
> at a limited speed. All real UARTs do. The comment next to the printk
> in drivers/serial/8250.c says "If we hit this, we're dead." Sounds
> scary, but I figure it's overstating the case. The loop executes
> holding a spin lock, but is limited to 256 iterations. The printk fires
> if we hit the limit and take the emergency exit. Still, I'm worried we
> hog the cpu for longer than is healthy, or that taking the emergency
> exit isn't as harmless as it looks to me so far.

I don't think that the general 8250 driver can reasonably make the assumption that the chip's transfer speed is slow compared to the host CPU clock. That register interface is sometimes used for very high speed links.

The overall effect in your situation will be, I think, that:
 * serial output will take priority over other demands on the guest CPU, so long as any output is pending
 * some CPU may be wasted with other VCPUs spinning on the lock, although in modern kernels the fallback to a sleep/wakeup lock will kick in and avoid this being too much of a problem

The first of these effects isn't desirable but it's difficult to see a good alternative.

Note that on many real systems sending a lot of output to the serial port can cause CPU starvation - some years ago my (non-virtualised Linux) colo machine had timekeeping trouble until I stopped the kernel packet filter from complaining to the serial console. I imagine this has been improved by now, but even so userspace (and users) of even modern operating systems should be aware of these kinds of problems and not spew huge quantities of unacknowledged stuff to serial consoles.

We could rate limit the port to some speed according to the wall clock, but the time intervals would have to be very short. With a 115200bps serial port, a character period is just under 90us. Even limiting consecutive serial writes to a few dozen (32 say) would mean that we need to set a timer every 2.8ms.

And the result of doing that would be that when the only thing which is happening is that some data transfer is happening over an emulated serial port, the transfer would be artificially limited to some nominal speed.

Do you have any better ideas for how to trick the guest into making better use of its CPU time, which will solve your use case without imposing an artificial wall clock based speed limit?

Ian.
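The timing figures above can be checked with a few lines of Python. The sketch assumes 8N1 framing (one start bit, eight data bits, one stop bit, so 10 bits on the wire per character); the thread does not state the framing, so treat that as an assumption. The variable names are illustrative only.

```python
BAUD = 115200        # line speed from the message, in bits per second
BITS_PER_CHAR = 10   # assumption: 8N1 framing (start + 8 data + stop)
BURST = 32           # consecutive writes allowed before throttling

# Time one character occupies the line, in microseconds.
char_period_us = BITS_PER_CHAR / BAUD * 1e6

# Interval at which a timer would have to fire to release the next burst.
timer_interval_ms = BURST * char_period_us / 1000

print(f"{char_period_us:.1f} us per character")
print(f"{timer_interval_ms:.2f} ms per {BURST}-character burst")
```

Under that framing assumption the character period comes out at about 86.8us and the burst interval at about 2.78ms, consistent with "just under 90us" and "every 2.8ms".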
Markus Armbruster
2009-Feb-18 17:04 UTC
Re: [Xen-devel] serial8250: too much work for irq4
Ian Jackson <Ian.Jackson@eu.citrix.com> writes:

> Markus Armbruster writes ("Re: [Xen-devel] serial8250: too much work for irq4"):
>> I see funny effects where serial output stalls until some input happens,
>> but I don't know whether that's related, or whether xen-unstable has the
>> same problem.
>
> This is probably the bug that Anders Kaseorg reported on the 5th of
> February which I tracked down to a pair of bugs in the Linux kernel's
> serial driver, and therefore unrelated. See
> http://lists.xensource.com/archives/html/xen-devel/2009-02/msg00372.html
> and the surrounding thread.

I'll check that out, thanks.

>> The 8250 driver makes the (reasonable) assumption that the chip operates
>> at a limited speed. All real UARTs do. The comment next to the printk
>> in drivers/serial/8250.c says "If we hit this, we're dead." Sounds
>> scary, but I figure it's overstating the case. The loop executes
>> holding a spin lock, but is limited to 256 iterations. The printk fires
>> if we hit the limit and take the emergency exit. Still, I'm worried we
>> hog the cpu for longer than is healthy, or that taking the emergency
>> exit isn't as harmless as it looks to me so far.
>
> I don't think that the general 8250 driver can reasonably make the
> assumption that the chip's transfer speed is slow compared to the host
> CPU clock. That register interface is sometimes used for very high
> speed links.

For better or worse, it does.

I guess a case could be made that any sane 8250-compatible chip, real or virtual, should behave like an 8250 when treated like an 8250. I.e. if told to go nice and slow, it had better go nice and slow. If it can also go very fast, it should have some extra knob for that.

> The overall effect in your situation will be, I think, that:
>  * serial output will take priority over other demands on
>    the guest CPU, so long as any output is pending
>  * some CPU may be wasted with other VCPUs spinning on the lock
>    although in modern kernels the fallback to a sleep/wakeup
>    lock will kick in and avoid this being too much of a problem
>
> The first of these effects isn't desirable but it's difficult to see a
> good alternative.

Perhaps we should discuss this with driver maintainers, just to make sure we're not missing anything here.

> Note that on many real systems sending a lot of output to the serial
> port can cause CPU starvation - some years ago my (non-virtualised
> Linux) colo machine had timekeeping trouble until I stopped the kernel
> packet filter from complaining to the serial console. I imagine this
> has been improved by now, but even so userspace (and users) of even
> modern operating systems should be aware of these kinds of problems and
> not spew huge quantities of unacknowledged stuff to serial consoles.

Point.

> We could rate limit the port to some speed according to the wall
> clock, but the time intervals would have to be very short. With a
> 115200bps serial port, a character period is just under 90us. Even
> limiting consecutive serial writes to a few dozen (32 say) would mean
> that we need to set a timer every 2.8ms.
>
> And the result of doing that would be that when the only thing which
> is happening is that some data transfer is happening over an emulated
> serial port, the transfer would be artificially limited to some
> nominal speed.

Just like with a real machine :)

> Do you have any better ideas for how to trick the guest into making
> better use of its CPU time, which will solve your use case without
> imposing an artificial wall clock based speed limit?
>
> Ian.

I'm afraid I don't.
On 18/02/2009 17:04, "Markus Armbruster" <armbru@redhat.com> wrote:

>> Do you have any better ideas for how to trick the guest into making
>> better use of its CPU time, which will solve your use case without
>> imposing an artificial wall clock based speed limit?
>>
>> Ian.
>
> I'm afraid I don't.

Perhaps we could add a qemu command-line option to force accurate uart emulation w.r.t. programmed FIFO depth and baud rate? Then those who want this more accurate but more expensive and slower behaviour can configure it.

 -- Keir
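To illustrate what emulation that honours the programmed FIFO depth and baud rate might look like, here is a toy Python model. It is not qemu code and does not reflect hw/serial.c; the class PacedUart, its method try_write, and the 10-bits-per-character framing are all hypothetical choices made for the sketch. The idea is simply that the device refuses a byte while its FIFO is full, and the FIFO empties at the line rate rather than instantly.

```python
class PacedUart:
    """Toy model: accept a byte only when FIFO depth and baud rate allow it."""

    def __init__(self, baud, fifo_depth, bits_per_char=10):
        self.char_time = bits_per_char / baud  # seconds per character on the wire
        self.fifo_depth = fifo_depth
        self.drain_times = []  # completion time of each byte still queued

    def try_write(self, now):
        # Forget bytes that have finished transmitting by `now`.
        self.drain_times = [t for t in self.drain_times if t > now]
        if len(self.drain_times) >= self.fifo_depth:
            return False  # FIFO full: the guest would see "transmitter busy"
        # A new byte starts after the last queued one, or immediately if idle.
        start = self.drain_times[-1] if self.drain_times else now
        self.drain_times.append(start + self.char_time)
        return True


uart = PacedUart(baud=115200, fifo_depth=16)
# The guest tries to stuff 100 bytes at t = 0, then 100 more 1 ms later.
accepted_at_start = sum(uart.try_write(0.0) for _ in range(100))
accepted_later = sum(uart.try_write(0.001) for _ in range(100))
print(accepted_at_start, accepted_later)
```

With these numbers the model accepts 16 bytes at the start and 11 more after 1 ms, so a guest driver looping on the device would find it busy long before reaching a 256-pass limit. The cost is the one raised earlier in the thread: throughput is capped at the nominal line speed.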
Keir Fraser writes ("Re: [Xen-devel] serial8250: too much work for irq4"):
> Perhaps we could add a qemu command-line option to force accurate uart
> emulation w.r.t. programmed FIFO depth and baud rate? Then those who want
> this more accurate but more expensive and slower behaviour can configure it.

That would certainly be possible. Our hw/serial.c is very like upstream's, and the same issue will apply to (for example) KVM, so we should coordinate any such change with upstream. I would suggest we crosspost the patch to xen-devel and qemu-devel.

Markus: are you sufficiently bothered about this that you'd like to supply a patch to change the behaviour in the way Keir suggests :-).

Ian.
Markus Armbruster
2009-Feb-18 18:18 UTC
Re: [Xen-devel] serial8250: too much work for irq4
Ian Jackson <Ian.Jackson@eu.citrix.com> writes:

> Keir Fraser writes ("Re: [Xen-devel] serial8250: too much work for irq4"):
>> Perhaps we could add a qemu command-line option to force accurate uart
>> emulation w.r.t. programmed FIFO depth and baud rate? Then those who want
>> this more accurate but more expensive and slower behaviour can configure it.
>
> That would certainly be possible. Our hw/serial.c is very like
> upstream's, and the same issue will apply to (for example) KVM, so we
> should coordinate any such change with upstream. I would suggest we
> crosspost the patch to xen-devel and qemu-devel.
>
> Markus: are you sufficiently bothered about this that you'd like to
> supply a patch to change the behaviour in the way Keir suggests :-).
>
> Ian.

;)

I don't yet know. I need to try the other patch you mentioned, and see how that flies.

Anyway, the proposed change should most probably be routed through upstream QEMU.
Markus Armbruster
2009-Feb-19 16:45 UTC
Re: [Xen-devel] serial8250: too much work for irq4
Ian Jackson <Ian.Jackson@eu.citrix.com> writes:

> Markus Armbruster writes ("Re: [Xen-devel] serial8250: too much work for irq4"):
>> I see funny effects where serial output stalls until some input happens,
>> but I don't know whether that's related, or whether xen-unstable has the
>> same problem.
>
> This is probably the bug that Anders Kaseorg reported on the 5th of
> February which I tracked down to a pair of bugs in the Linux kernel's
> serial driver, and therefore unrelated. See
> http://lists.xensource.com/archives/html/xen-devel/2009-02/msg00372.html
> and the surrounding thread.

Yes, the patch there fixes the annoying stall.

How did upstream like it? I can't see anything on LKML, maybe I missed it.

Thanks!

[...]
Markus Armbruster writes ("Re: [Xen-devel] serial8250: too much work for irq4"):
> Yes, the patch there fixes the annoying stall.

Great.

> How did upstream like it? I can't see anything on LKML, maybe I missed
> it.

I think they were waiting for Anders to comment. I haven't had any reply myself and was about to chase it.

Ian.