I've seen $subject on RHEL-5 HVM guests in a RHEL-5 host with vcpus=2 when I hammer the serial console, e.g. with "od -a /boot/vmlinuz". I wonder whether the problem exists in xen-unstable as well.

Wild guess: Unlike a real UART, the virtual UART empties as quickly as the kernel can stuff in bytes. So, while the kernel has bytes to stuff, it doesn't get around to doing much else.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Markus Armbruster writes ("[Xen-devel] serial8250: too much work for irq4"):
> Wild guess: Unlike a real UART, the virtual UART empties as quickly as
> the kernel can stuff in bytes. So, while the kernel has bytes to stuff,
> it doesn't get around to doing much else.
This is certainly true and would explain the message you see.
Are there any other adverse symptoms ? In principle it would be
possible to add a rate limit but it seems poor to artificially
rate-limit a virtual device to the wall-clock speed of the physical
object.
Ian.
Markus Armbruster
2009-Feb-18 12:13 UTC
Re: [Xen-devel] serial8250: too much work for irq4
Ian Jackson <Ian.Jackson@eu.citrix.com> writes:

> Markus Armbruster writes ("[Xen-devel] serial8250: too much work for irq4"):
>> Wild guess: Unlike a real UART, the virtual UART empties as quickly as
>> the kernel can stuff in bytes. So, while the kernel has bytes to stuff,
>> it doesn't get around to doing much else.
>
> This is certainly true and would explain the message you see.
>
> Are there any other adverse symptoms ? In principle it would be
> possible to add a rate limit but it seems poor to artificially
> rate-limit a virtual device to the wall-clock speed of the physical
> object.
>
> Ian.

I see funny effects where serial output stalls until some input happens, but I don't know whether that's related, or whether xen-unstable has the same problem.

The 8250 driver makes the (reasonable) assumption that the chip operates at a limited speed. All real UARTs do. The comment next to the printk in drivers/serial/8250.c says "If we hit this, we're dead." Sounds scary, but I figure it's overstating the case. The loop executes holding a spin lock, but is limited to 256 iterations. The printk fires if we hit the limit and take the emergency exit. Still, I'm worried we hog the cpu for longer than is healthy, or that taking the emergency exit isn't as harmless as it looks to me so far.
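The bounded loop described in the message above can be illustrated with a toy model. This is only a sketch and not the kernel's code: the function name, the `ready_after_write` flag, and the 16-byte FIFO depth are assumptions made for illustration; only the 256-iteration cap comes from the message.

```python
PASS_LIMIT = 256  # iteration cap mentioned in the message above


def service_irq(pending_bytes, ready_after_write):
    """Toy model of a bounded interrupt-service loop (not the kernel code).

    pending_bytes: bytes the driver still wants to send.
    ready_after_write: True models a virtual UART that is ready for more
        data immediately after every write; False models a real UART that
        stays busy shifting bits out after accepting one FIFO load.
    Returns (bytes_left, hit_limit).
    """
    fifo_size = 16  # assumed FIFO depth, for illustration only
    passes = 0
    tx_ready = True
    while tx_ready and pending_bytes > 0:
        # Stuff one FIFO load into the (modelled) transmitter.
        pending_bytes -= min(fifo_size, pending_bytes)
        tx_ready = ready_after_write
        passes += 1
        if passes >= PASS_LIMIT:
            # Emergency exit: the point where a driver would log
            # "too much work for irq".
            return pending_bytes, True
    return pending_bytes, False


if __name__ == "__main__":
    big_output = 1_000_000
    print("real UART:   ", service_irq(big_output, ready_after_write=False))
    print("virtual UART:", service_irq(big_output, ready_after_write=True))
```

In this model the "real" device leaves the loop after one pass because it is busy, while the always-ready device keeps the loop spinning until the cap is reached, which matches the guess made at the start of the thread.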
Markus Armbruster writes ("Re: [Xen-devel] serial8250: too much work for irq4"):
> I see funny effects where serial output stalls until some input happens,
> but I don't know whether that's related, or whether xen-unstable has the
> same problem.

This is probably the bug that Anders Kaseorg reported on the 5th of
February which I tracked down to a pair of bugs in the Linux kernel's
serial driver, and therefore unrelated. See
http://lists.xensource.com/archives/html/xen-devel/2009-02/msg00372.html
and the surrounding thread.
> The 8250 driver makes the (reasonable) assumption that the chip operates
> at a limited speed. All real UARTs do. The comment next to the printk
> in drivers/serial/8250.c says "If we hit this, we're dead." Sounds
> scary, but I figure it's overstating the case. The loop executes
> holding a spin lock, but is limited to 256 iterations. The printk fires
> if we hit the limit and take the emergency exit. Still, I'm worried we
> hog the cpu for longer than is healthy, or that taking the emergency
> exit isn't as harmless as it looks to me so far.
I don't think that the general 8250 driver can reasonably make the
assumption that the chip's transfer speed is slow compared to the host
CPU clock. That register interface is sometimes used for very high
speed links.
The overall effect in your situation will be, I think, that:
* serial output will take priority over other demands on
the guest CPU, so long as any output is pending
* some CPU may be wasted with other VCPUs spinning on the lock
although in modern kernels the fallback to a sleep/wakeup
lock will kick in and avoid this being too much of a problem
The first of these effects isn't desirable but it's difficult to see a
good alternative.
Note that on many real systems sending a lot of output to the serial
port can cause CPU starvation - some years ago my (non-virtualised
Linux) colo machine had timekeeping trouble until I stopped the kernel
packet filter from complaining to the serial console. I imagine this
has been improved by now, but even so userspace (and users) of even
modern operating systems should be aware of these kinds of problems and
not spew huge quantities of unacknowledged stuff to serial consoles.
We could rate limit the port to some speed according to the wall
clock, but the time intervals would have to be very short. With a
115200bps serial port, a character period is just under 90us. Even
limiting consecutive serial writes to a few dozen (32 say) would mean
that we need to set a timer every 2.8ms.
And the result of doing that would be that when the only thing which
is happening is that some data transfer is happening over an emulated
serial port, the transfer would be artificially limited to some
nominal speed.
Do you have any better ideas for how to trick the guest into making
better use of its CPU time, which will solve your use case without
imposing an artificial wall clock based speed limit ?
Ian.
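The rate-limit arithmetic in the message above can be checked with a few lines. This is a worked example only: the function names are made up, and the 10 bits per character assume 8N1 framing (1 start, 8 data, 1 stop bit), which the message does not state explicitly.

```python
def char_period_us(baud, bits_per_char=10):
    """Time to send one character, in microseconds.

    bits_per_char=10 assumes 8N1 framing (1 start + 8 data + 1 stop bit).
    """
    return bits_per_char * 1_000_000 / baud


def burst_period_ms(baud, burst_chars, bits_per_char=10):
    """Wall-clock time, in milliseconds, that burst_chars characters
    would occupy on a line running at the given baud rate."""
    return burst_chars * char_period_us(baud, bits_per_char) / 1000


if __name__ == "__main__":
    print(f"character period at 115200 bps: {char_period_us(115200):.1f} us")
    print(f"32-character burst:             {burst_period_ms(115200, 32):.2f} ms")
```

Under these assumptions a character takes about 86.8us and a 32-character burst about 2.78ms, consistent with the "just under 90us" and "2.8ms" figures quoted above.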
Markus Armbruster
2009-Feb-18 17:04 UTC
Re: [Xen-devel] serial8250: too much work for irq4
Ian Jackson <Ian.Jackson@eu.citrix.com> writes:

> Markus Armbruster writes ("Re: [Xen-devel] serial8250: too much work for irq4"):
>> I see funny effects where serial output stalls until some input happens,
>> but I don't know whether that's related, or whether xen-unstable has the
>> same problem.
>
> This is probably the bug that Anders Kaseorg reported on the 5th of
> February which I tracked down to a pair of bugs in the Linux kernel's
> serial driver, and therefore unrelated. See
> http://lists.xensource.com/archives/html/xen-devel/2009-02/msg00372.html
> and the surrounding thread.

I'll check that out, thanks.

>> The 8250 driver makes the (reasonable) assumption that the chip operates
>> at a limited speed. All real UARTs do. The comment next to the printk
>> in drivers/serial/8250.c says "If we hit this, we're dead." Sounds
>> scary, but I figure it's overstating the case. The loop executes
>> holding a spin lock, but is limited to 256 iterations. The printk fires
>> if we hit the limit and take the emergency exit. Still, I'm worried we
>> hog the cpu for longer than is healthy, or that taking the emergency
>> exit isn't as harmless as it looks to me so far.
>
> I don't think that the general 8250 driver can reasonably make the
> assumption that the chip's transfer speed is slow compared to the host
> CPU clock. That register interface is sometimes used for very high
> speed links.

For better or worse, it does. I guess a case could be made that any sane 8250-compatible chip, real or virtual, should behave like an 8250 when treated like an 8250. I.e. if told to go nice and slow, it had better go nice and slow. If it can also go very fast, it should have some extra knob for that.

> The overall effect in your situation will be, I think, that:
> * serial output will take priority over other demands on
> the guest CPU, so long as any output is pending
> * some CPU may be wasted with other VCPUs spinning on the lock
> although in modern kernels the fallback to a sleep/wakeup
> lock will kick in and avoid this being too much of a problem
>
> The first of these effects isn't desirable but it's difficult to see a
> good alternative.

Perhaps we should discuss this with driver maintainers, just to make sure we're not missing anything here.

> Note that on many real systems sending a lot of output to the serial
> port can cause CPU starvation - some years ago my (non-virtualised
> Linux) colo machine had timekeeping trouble until I stopped the kernel
> packet filter from complaining to the serial console. I imagine this
> has been improved by now, but even so userspace (and users) of even
> modern operating systems should be aware of these kind of problems and
> not spew huge quantities of unacknowledged stuff to serial consoles.

Point.

> We could rate limit the port to some speed according to the wall
> clock, but the time intervals would have to be very short. With a
> 115200bps serial port, a character period is just under 90us. Even
> limiting consecutive serial writes to a few dozen (32 say) would mean
> that we need to set a timer every 2.8ms.
>
> And the result of doing that would be that when the only thing which
> is happening is that some data transfer is happening over an emulated
> serial port, the transfer would be artificially limited to some
> nominal speed.

Just like with a real machine :)

> Do you have any better ideas for how to trick the guest into making
> better use of its CPU time, which will solve your use case without
> imposing an artificial wall clock based speed limit ?
>
> Ian.

I'm afraid I don't.
On 18/02/2009 17:04, "Markus Armbruster" <armbru@redhat.com> wrote:

>> Do you have any better ideas for how to trick the guest into making
>> better use of its CPU time, which will solve your use case without
>> imposing an artificial wall clock based speed limit ?
>>
>> Ian.
>
> I'm afraid I don't.

Perhaps we could add a qemu command-line option to force accurate uart emulation w.r.t. programmed FIFO depth and baud rate? Then those who want this more accurate but more expensive and slower behaviour can configure it.

 -- Keir
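What "accurate emulation w.r.t. programmed FIFO depth and baud rate" might mean can be sketched with a small model. This is a hypothetical illustration, not qemu's hw/serial.c and not a proposed patch: the class and method names are invented, the 16-entry depth and 10 bits per character (8N1) are assumptions, and the baud rate is derived from the divisor latch using the conventional PC UART clock of 1.8432 MHz.

```python
UART_CLOCK_HZ = 1_843_200  # conventional PC UART clock; baud = clock / (16 * divisor)


def programmed_baud(divisor):
    """Baud rate selected by a given divisor-latch value."""
    return UART_CLOCK_HZ / (16 * divisor)


class PacedFifo:
    """Hypothetical transmit FIFO that drains at the programmed baud rate."""

    def __init__(self, divisor, depth=16, bits_per_char=10):
        self.depth = depth  # assumed FIFO depth
        self.char_time_us = bits_per_char * 1_000_000 / programmed_baud(divisor)
        self.done_at_us = []  # completion time of each queued character

    def try_write(self, now_us):
        """Accept one character at time now_us if the FIFO has room."""
        # Forget characters whose transmission has already finished.
        self.done_at_us = [t for t in self.done_at_us if t > now_us]
        if len(self.done_at_us) >= self.depth:
            return False  # FIFO full: the guest would see "not ready"
        # A new character starts once the previous one has left the wire.
        start = self.done_at_us[-1] if self.done_at_us else now_us
        self.done_at_us.append(start + self.char_time_us)
        return True


if __name__ == "__main__":
    fifo = PacedFifo(divisor=1)  # divisor 1 selects 115200 bps
    accepted = sum(fifo.try_write(0) for _ in range(32))
    print("characters accepted at t=0:", accepted)
```

With pacing like this the guest driver would find the transmitter busy once the FIFO fills, at the cost of the emulator needing timers on the sub-millisecond scale discussed earlier in the thread.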
Keir Fraser writes ("Re: [Xen-devel] serial8250: too much work for irq4"):
> Perhaps we could add a qemu command-line option to force accurate uart
> emulation w.r.t. programmed FIFO depth and baud rate? Then those who want
> this more accurate but more expensive and slower behaviour can configure
> it.
That would certainly be possible. Our hw/serial.c is very like
upstream's, and the same issue will apply to (for example) KVM, so we
should coordinate any such change with upstream. I would suggest we
crosspost the patch to xen-devel and qemu-devel.

Markus: are you sufficiently bothered about this that you'd like to
supply a patch to change the behaviour in the way Keir suggests? :-)
Ian.
Markus Armbruster
2009-Feb-18 18:18 UTC
Re: [Xen-devel] serial8250: too much work for irq4
Ian Jackson <Ian.Jackson@eu.citrix.com> writes:

> Keir Fraser writes ("Re: [Xen-devel] serial8250: too much work for irq4"):
>> Perhaps we could add a qemu command-line option to force accurate uart
>> emulation w.r.t. programmed FIFO depth and baud rate? Then those who want
>> this more accurate but more expensive and slower behaviour can configure it.
>
> That would certainly be possible. Our hw/serial.c is very like
> upstream's, and the same issue will apply to (for example) KVM, so we
> should coordinate any such change with upstream. I would suggest we
> crosspost the patch to xen-devel and qemu-devel.
>
> Markus: are you sufficiently bothered about this that you'd like to
> supply a patch to change the behaviour in the way Keir suggests :-).
>
> Ian.

;) I don't yet know. I need to try the other patch you mentioned, and see how that flies. Anyway, the proposed change should most probably be routed through upstream QEMU.
Markus Armbruster
2009-Feb-19 16:45 UTC
Re: [Xen-devel] serial8250: too much work for irq4
Ian Jackson <Ian.Jackson@eu.citrix.com> writes:

> Markus Armbruster writes ("Re: [Xen-devel] serial8250: too much work for irq4"):
>> I see funny effects where serial output stalls until some input happens,
>> but I don't know whether that's related, or whether xen-unstable has the
>> same problem.
>
> This is probably the bug that Anders Kaseorg reported on the 5th of
> February which I tracked down to a pair of bugs in the Linux kernel's
> serial driver, and therefore unrelated. See
> http://lists.xensource.com/archives/html/xen-devel/2009-02/msg00372.html
> and the surrounding thread.

Yes, the patch there fixes the annoying stall. How did upstream like it? I can't see anything on LKML, maybe I missed it.

Thanks!

[...]
Markus Armbruster writes ("Re: [Xen-devel] serial8250: too much work for irq4"):
> Yes, the patch there fixes the annoying stall.

Great.

> How did upstream like it? I can't see anything on LKML, maybe I missed
> it.

I think they were waiting for Anders to comment. I haven't had any
reply myself and was about to chase it.
Ian.