Daniel Stodden
2011-Jun-20 08:26 UTC
[Xen-devel] [PATCH 0 of 1] Deal with broken frontend/backend ring I/O.
Hi.
After running this blkback patch (Don't let in-flight requests defer
pending ones...)
http://lists.xensource.com/archives/html/xen-devel/2011-05/msg01968.html
for a while I guess it's mostly been verified.
Unfortunately, it also revealed a great potential to demo old guest
bugs. The 2.6.32 tree used to have a problem with lost notifications
during IRQ handler migration, due to a glitch in the dynirq handler
logic.
http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.32.y.git;a=commitdiff;h=c5783925493e315f91330241546da7915dcc46e3
Blkfront got fixed in stable/v2.6.32.y, but it looks as if at least RHEL6
didn't patch it (yet), so I suspect CentOS and derivatives suffer
too.
Xen-blkfront is particularly sensitive to this. Some people seem to
report around one or two incidents per week. Presumably more on
heavily loaded systems (to repro, manually spinning the affinity mask
under scattered I/O will trigger it almost immediately). That's going to
increase.
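The repro mentioned above ("spinning the affinity mask") could be sketched roughly as follows. This is only an illustration, not something from the patch: it assumes it runs as root inside an affected guest, that the IRQ number of the blkfront event channel has been looked up by hand in /proc/interrupts, and that the guest has at most 32 CPUs so a single hex word is a valid mask. The function names are made up for this example.

```python
import itertools
import time

def rotating_masks(ncpus):
    """Yield single-CPU affinity masks as hex strings, cycling over all CPUs."""
    for cpu in itertools.cycle(range(ncpus)):
        yield "%x" % (1 << cpu)

def spin_affinity(irq, ncpus, rounds, delay=0.01):
    """Rewrite /proc/irq/<irq>/smp_affinity repeatedly (needs root).

    'irq' is a hypothetical value the caller has to supply; nothing here
    discovers it automatically.
    """
    path = "/proc/irq/%d/smp_affinity" % irq
    masks = rotating_masks(ncpus)
    for _ in range(rounds):
        with open(path, "w") as f:
            f.write(next(masks))
        time.sleep(delay)
```

Running spin_affinity() while the guest does scattered disk I/O is the kind of load under which the lost-notification bug would be expected to show up.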
So let's learn to live with that. The main issue is that even if you know
what to blame, there's nothing in place to deal with it.
I'd like to propose toolstack support which provides people with a
workaround. With minimal kernel support, a watchdog can mostly live in
userland, is easy to do and won't need to clutter backend drivers.
This can hardly be considered a fix for what's essentially a guest
problem. But it gives hosts a chance to automate guest recovery until
there's an update.
Also, it's nice for debugging. Ring I/O and event races are a constant
source of paranoia whenever guests appear to wedge, and I believe it
might help to drastically reduce time spent on remote triage in some
cases.
It can also identify excessively blocking I/O (as opposed to a stuck
message dispatch).
Some potential use cases
- Run occasionally (cron). Alerting on production systems where
guest OSes reside in a different administrative domain with no
prospect for a quick fix. Might go into distros.
- More frequently, once the machine is known to host guests prone to
error. There shouldn't be much of a performance impact anyway. But
it might want to be tuned to not start spamming the console logs.
- Command line test. For people reporting I/O issues, wherever
suspecting front/backend problems (or to dismiss that). Or to aid
driver hacking. Might also go in xen-bugtool.
I chose to drop it into tools/misc. It's rather standalone. Takes a
sysfs patch to blkback. I didn't add netback support, but I guess that
would look very similar if it ever becomes desirable.
Cheers,
Daniel
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Daniel Stodden
2011-Jun-20 08:26 UTC
[Xen-devel] [PATCH 1 of 1] xen-backwatch: Deal with broken frontend/backend ring I/O
Adds tool support to debug backends which expose I/O ring state in
sysfs. Currently supports /sys/devices/xen-backend/vbd-*-*/io_ring
nodes for block I/O, where implemented.

Primary function is to observe ring state make progress over a period
of time, then report stuck message queue halves where pending
consumer/event are not moving. Adding --kick will re-issue event
notifications to frontends, and/or kick backends out of wait state.

Signed-off-by: Daniel Stodden <daniel.stodden@citrix.com>
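The "observe progress over time, then report what is not moving" idea can be illustrated with a small sketch. It is not the code from the patch. The Half tuple and both functions are hypothetical, and the pending rule (the event index has not been re-armed to just past the producer index) is an assumption inferred from the sample output quoted later in this thread, not a statement of how the tool actually decides.

```python
from collections import namedtuple

# One half of a ring as seen by the backend: producer index, the backend's
# private index (cons for requests, pvt for responses), and event threshold.
Half = namedtuple("Half", "prod priv event")

RING_IDX_MASK = 0xffffffff  # ring indices are assumed to be 32-bit counters

def is_pending(half):
    """Assumed rule: the consumer has not yet acknowledged everything
    produced, i.e. event was not re-armed to prod + 1."""
    return half.event != ((half.prod + 1) & RING_IDX_MASK)

def is_stuck(before, after):
    """A half counts as stuck if it was pending at both samples and none
    of its counters moved in between."""
    return is_pending(before) and is_pending(after) and before == after
```

Two samples taken some seconds apart are compared per half: a busy but healthy ring will show moving counters, while a lost notification leaves a pending half frozen.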
Ian Jackson
2011-Jun-20 16:49 UTC
Re: [Xen-devel] [PATCH 1 of 1] xen-backwatch: Deal with broken frontend/backend ring I/O
Daniel Stodden writes ("[Xen-devel] [PATCH 1 of 1] xen-backwatch: Deal with broken frontend/backend ring I/O"):
> Adds tool support to debug backends which expose I/O ring state in
> sysfs. Currently supports /sys/devices/xen-backend/vbd-*-*/io_ring
> nodes for block I/O, where implemented.
Thanks.
> Primary function is to observe ring state make progress over a period
> of time, then report stuck message queue halves where pending
> consumer/event are not moving.
This seems to have only one entry in COMMANDS, "check". Is that
right ? And it doesn't seem to provide a way to specify a particular
domain to look for ?
I'm happy to take it as-is as it seems like a better-than-nothing tool
but I just wanted to check I'd understood it, first.
Ian.
Daniel Stodden
2011-Jun-20 20:47 UTC
Re: [Xen-devel] [PATCH 1 of 1] xen-backwatch: Deal with broken frontend/backend ring I/O
On Mon, 2011-06-20 at 12:49 -0400, Ian Jackson wrote:
> Daniel Stodden writes ("[Xen-devel] [PATCH 1 of 1] xen-backwatch: Deal with broken frontend/backend ring I/O"):
> > Adds tool support to debug backends which expose I/O ring state in
> > sysfs. Currently supports /sys/devices/xen-backend/vbd-*-*/io_ring
> > nodes for block I/O, where implemented.
>
> Thanks.
>
> > Primary function is to observe ring state make progress over a period
> > of time, then report stuck message queue halves where pending
> > consumer/event are not moving.
>
> This seems to have only one entry in COMMANDS, "check". Is that
> right ?

The <command> thing should allow alternative ways to run it without
breaking existing deployments. I used to think about a 'daemon', but
then found that cron would likely do the job.

> And it doesn't seem to provide a way to specify a particular
> domain to look for ?

I briefly considered it initially, but after testing it just didn't look
so important anymore. :}

Presently, a

# xen-ringwatch check -v
RingWatch(vbd-1-51760/io_ring)[IDLE]: RingState(size=32, Req(prod=31, cons=31, event=32), Rsp(prod=31, pvt=31, event=32)): io: complete, req: complete, rsp: complete
RingWatch(vbd-1-51712/io_ring)[BUSY]: RingState(size=32, Req(prod=143236466, cons=143236466, event=143236467), Rsp(prod=143236459, pvt=143236459, event=143236460)): io: pending, req: complete, rsp: complete

will dump the entire set of running backends, independent of state.

I should point out there's not really a significant overhead involved,
except some required wait period to come to a conclusion. It's all
glob/read/write/wait and all VBDs are watched in parallel. But even
with 50 VMs, I anticipated that at some point people would rather grep
instead.

Here's a sample crontab invocation:

xen-ringwatch check -T 4 --kick | logger -p daemon.crit -t RINGWATCH-ALERT

This remains silent until it actually discovers some watched subset to
.kick(), and then outputs those exclusively.

Jun 20 13:26:59 localhost RINGWATCH-ALERT: RingWatch(vbd-1-51712/io_ring)[STCK]: RingState(size=32, Req(prod=146141561, cons=146141561, event=146141562), Rsp(prod=146141561, pvt=146141561, event=146141530)): io: complete, req: complete, rsp: pending

> I'm happy to take it as-is as it seems like a better-than-nothing tool
> but I just wanted to check I'd understood it, first.

Found that the patch I sent was missing cleanup in some spots (mainly a
program rename, and the verbose variable in __main__ ended up off by
one). Can I sneak in the update attached before you push it?

Also, I never tried the make install target. Does it look okay to you?

Cheers,
Daniel
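Since the tool has no per-domain option and the suggestion is to filter its output instead, such a filter might look like the sketch below. It only relies on the line format visible in the samples above. The function names are made up, and reading the node name as vbd-<domid>-<devid> is an assumption about the backend naming, not something the thread spells out.

```python
import re

# Matches the leading "RingWatch(vbd-<domid>-<devid>/io_ring)[STATE]" part
# of a report line, with or without a syslog prefix in front of it.
LINE_RE = re.compile(
    r"RingWatch\(vbd-(?P<domid>\d+)-(?P<devid>\d+)/io_ring\)"
    r"\[(?P<state>[A-Z]+)\]")

def parse(line):
    """Return (domid, devid, state) for a report line, or None."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return int(m.group("domid")), int(m.group("devid")), m.group("state")

def select(lines, domid=None, state=None):
    """Yield only the report lines matching the given domain and/or state."""
    for line in lines:
        rec = parse(line)
        if rec is None:
            continue
        if domid is not None and rec[0] != domid:
            continue
        if state is not None and rec[2] != state:
            continue
        yield line
```

Feeding it the lines of a verbose run and asking for state="STCK" would leave only the stuck devices, which is roughly what a grep on "[STCK]" does as well.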
Ian Jackson
2011-Jun-21 14:39 UTC
Re: [Xen-devel] [PATCH 1 of 1] xen-backwatch: Deal with broken frontend/backend ring I/O
Daniel Stodden writes ("Re: [Xen-devel] [PATCH 1 of 1] xen-backwatch: Deal with broken frontend/backend ring I/O"):
> On Mon, 2011-06-20 at 12:49 -0400, Ian Jackson wrote:
> > And it doesn't seem to provide a way to specify a particular
> > domain to look for ?
>
> I briefly considered it initially, but after testing it just didn't look
> so important anymore. :}
OK :-). Well, it's an improvement, so it should go in.
> Found that the patch I sent was missing cleanup in some spots (mainly a
> program rename, and the verbose variable in __main__ ended up off by
> one). Can I sneak in the update attached before you push it?
Sure.
> Also, I never tried the make install target. Does it look okay to you?
My build test invoked it and it DTRT.
Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian.