Frank Pan
2011-Mar-02 10:20 UTC
[Xen-devel] [PATCH] Use timeout on xenstore read_reply to avoid task hunging
Recent pvops kernel uses wait_event on waiting reply from xenstored. This may cause xenstore clients hung inside kernel when xenstored does not response correctly. The hung even happens when xenstored is not running, and may confuse the developer. The patch attached uses wait_event_timeout instead, and return -EIO to userspace if xenstored does not response in 5 seconds. Simply change wait_event to wait_event_timeout is not correct. Right after the xenstored starts working, the requests abandoned before will be processed by xenstored, and responses sent to the reply_list queue will confuse the requests later. The patch also makes use of the req_id section in struct xsd_sockmsg, as a sequence id. This avoids the confusing of responses by abandoned requests. Any suggestions? Thanks. -- Frank Pan Computer Science and Technology Tsinghua University _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2011-Mar-02 10:34 UTC
Re: [Xen-devel] [PATCH] Use timeout on xenstore read_reply to avoid task hunging
(Chris isn''t involved with Xen maintenance any more, moving him to BCC). On Wed, 2011-03-02 at 10:20 +0000, Frank Pan wrote:> Recent pvops kernel uses wait_event on waiting reply from xenstored. > This may cause xenstore clients hung inside kernel when xenstored does > not response correctly. The hung even happens when xenstored is not > running, and may confuse the developer. > > The patch attached uses wait_event_timeout instead, and return -EIO to > userspace if xenstored does not response in 5 seconds. > > Simply change wait_event to wait_event_timeout is not correct. Right > after the xenstored starts working, the requests abandoned before will > be processed by xenstored, and responses sent to the reply_list queue > will confuse the requests later. The patch also makes use of the > req_id section in struct xsd_sockmsg, as a sequence id. This avoids > the confusing of responses by abandoned requests. > > Any suggestions? Thanks.It all sounds very plausible to me but you''ve forgotten the patch ;-) Why wait_event_timeout and not wait_event_interruptible to allow users to interrupt? In particular I''m concerned about the arbitrarily chosen 5s timeout not being sufficient on a busy system. There was a recent discussion of the general issue you are solving on xen-devel which it might be worth looking over. It was a thread titled "libxl: error handling before xenstored runs" starting on 9 February. Once specific pitfall which I remember was that userspace clients are at liberty to make use of the req_id themselves (and some do). Fixing this might involve shadowing the user provided req_id with a kernel generated ID on the ring and unshadowing in the responses... Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Frank Pan
2011-Mar-02 11:35 UTC
Re: [Xen-devel] [PATCH] Use timeout on xenstore read_reply to avoid task hunging
Oh sorry. Patch attached. On Wed, Mar 2, 2011 at 6:34 PM, Ian Campbell <Ian.Campbell@citrix.com> wrote:> It all sounds very plausible to me but you''ve forgotten the patch ;-) > > Why wait_event_timeout and not wait_event_interruptible to allow users > to interrupt? In particular I''m concerned about the arbitrarily chosen > 5s timeout not being sufficient on a busy system.FP: I wait_event_interruptible is a choice. But it needs user operation such as kill command. User-level tool(xenstore-ls for example) can also set SIGALRM or something else, but it sounds not so good. The timeout parameter is something discussible. 5s may not be a good one, but I believe xenstored on a healthy system should be response in 5s. What do you think?> Once specific pitfall which I remember was that userspace clients are at > liberty to make use of the req_id themselves (and some do). Fixing this > might involve shadowing the user provided req_id with a kernel generated > ID on the ring and unshadowing in the responses...FP: Yes, that''s what I supposed to do. But I cannot find any dereference on the req_id section of the reply msg. If it exist somewhere, shadowing is surely needed. -- Frank Pan Computer Science and Technology Tsinghua University _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Frank Pan
2011-Mar-03 03:31 UTC
Re: [Xen-devel] [PATCH] Use timeout on xenstore read_reply to avoid task hunging
Also post patch as content. Sorry for gmail''s line wrapping.>From 379c83fa345444b696b5ab96f55ef55a18a9f309 Mon Sep 17 00:00:00 2001From: Frank Pan <frankpzh@gmail.com> Date: Wed, 2 Mar 2011 17:52:52 +0800 Subject: [PATCH] Use timeout on xenstore read_reply to avoid task hunging. --- linux-2.6-xen/drivers/xen/xenbus/xenbus_xs.c | 57 ++++++++++++++++++------- 1 files changed, 41 insertions(+), 16 deletions(-) diff --git a/linux-2.6-xen/drivers/xen/xenbus/xenbus_xs.c b/linux-2.6-xen/drivers/xen/xenbus/xenbus_xs.c index 5534690..4e66b30 100644 --- a/linux-2.6-xen/drivers/xen/xenbus/xenbus_xs.c +++ b/linux-2.6-xen/drivers/xen/xenbus/xenbus_xs.c @@ -73,6 +73,9 @@ struct xs_handle { spinlock_t reply_lock; wait_queue_head_t reply_waitq; + /* Sequence number of last request */ + uint32_t reply_seq; + /* * Mutex ordering: transaction_mutex -> watch_mutex -> request_mutex. * response_mutex is never taken simultaneously with the other three. @@ -136,26 +139,45 @@ static int get_error(const char *errorstring) return xsd_errors[i].errnum; } -static void *read_reply(enum xsd_sockmsg_type *type, unsigned int *len) +#define XENSTORE_TIMEOUT (5*HZ) + +static void *read_reply(enum xsd_sockmsg_type *type, unsigned int *len, uint32_t seq) { - struct xs_stored_msg *msg; + unsigned long remain_time = XENSTORE_TIMEOUT; + struct xs_stored_msg *msg = 0; char *body; - spin_lock(&xs_state.reply_lock); - - while (list_empty(&xs_state.reply_list)) { - spin_unlock(&xs_state.reply_lock); + while (remain_time && !msg) { /* XXX FIXME: Avoid synchronous wait for response here. */ - wait_event(xs_state.reply_waitq, - !list_empty(&xs_state.reply_list)); + remain_time = wait_event_timeout(xs_state.reply_waitq, + !list_empty(&xs_state.reply_list), + remain_time); + spin_lock(&xs_state.reply_lock); - } - msg = list_entry(xs_state.reply_list.next, - struct xs_stored_msg, list); - list_del(&msg->list); + while (!list_empty(&xs_state.reply_list)) { + msg = list_entry(xs_state.reply_list.next, + struct xs_stored_msg, list); + list_del(&msg->list); + + /* Check sequence number */ + if (msg->hdr.req_id == seq) + break; - spin_unlock(&xs_state.reply_lock); + if (!IS_ERR(msg->u.reply.body)) + kfree(msg->u.reply.body); + kfree(msg); + msg = 0; + } + + spin_unlock(&xs_state.reply_lock); + } + + if (!msg) { + *type = XS_ERROR; + *len = 0; + return ERR_PTR(-EIO); + } *type = msg->hdr.type; if (len) @@ -202,13 +224,14 @@ void *xenbus_dev_request_and_reply(struct xsd_sockmsg *msg) transaction_start(); mutex_lock(&xs_state.request_mutex); + msg->req_id = xs_state.reply_seq++; err = xb_write(msg, sizeof(*msg) + msg->len); if (err) { msg->type = XS_ERROR; ret = ERR_PTR(err); } else - ret = read_reply(&msg->type, &msg->len); + ret = read_reply(&msg->type, &msg->len, msg->req_id); mutex_unlock(&xs_state.request_mutex); @@ -234,13 +257,13 @@ static void *xs_talkv(struct xenbus_transaction t, int err; msg.tx_id = t.id; - msg.req_id = 0; msg.type = type; msg.len = 0; for (i = 0; i < num_vecs; i++) msg.len += iovec[i].iov_len; mutex_lock(&xs_state.request_mutex); + msg.req_id = xs_state.reply_seq++; err = xb_write(&msg, sizeof(msg)); if (err) { @@ -256,7 +279,7 @@ static void *xs_talkv(struct xenbus_transaction t, } } - ret = read_reply(&msg.type, len); + ret = read_reply(&msg.type, len, msg.req_id); mutex_unlock(&xs_state.request_mutex); @@ -872,6 +895,8 @@ int xs_init(void) int err; struct task_struct *task; + xs_state.reply_seq = 0; + INIT_LIST_HEAD(&xs_state.reply_list); spin_lock_init(&xs_state.reply_lock); init_waitqueue_head(&xs_state.reply_waitq); -- 1.7.1 On Wed, Mar 2, 2011 at 7:35 PM, Frank Pan <frankpzh@gmail.com> wrote:> Oh sorry. > Patch attached. > > On Wed, Mar 2, 2011 at 6:34 PM, Ian Campbell <Ian.Campbell@citrix.com> wrote: >> It all sounds very plausible to me but you''ve forgotten the patch ;-) >> >> Why wait_event_timeout and not wait_event_interruptible to allow users >> to interrupt? In particular I''m concerned about the arbitrarily chosen >> 5s timeout not being sufficient on a busy system. > > FP: I wait_event_interruptible is a choice. But it needs user > operation such as kill command. User-level tool(xenstore-ls for > example) can also set SIGALRM or something else, but it sounds not so > good. > > The timeout parameter is something discussible. 5s may not be a good > one, but I believe xenstored on a healthy system should be response in > 5s. What do you think? > >> Once specific pitfall which I remember was that userspace clients are at >> liberty to make use of the req_id themselves (and some do). Fixing this >> might involve shadowing the user provided req_id with a kernel generated >> ID on the ring and unshadowing in the responses... > > FP: Yes, that''s what I supposed to do. But I cannot find any > dereference on the req_id section of the reply msg. If it exist > somewhere, shadowing is surely needed. > > -- > Frank Pan > > Computer Science and Technology > Tsinghua University >-- 潘震皓, Frank Pan Computer Science and Technology Tsinghua University _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2011-Mar-04 11:31 UTC
Re: [Xen-devel] [PATCH] Use timeout on xenstore read_reply to avoid task hunging
On Wed, 2011-03-02 at 11:35 +0000, Frank Pan wrote:> Oh sorry. > Patch attached. > > On Wed, Mar 2, 2011 at 6:34 PM, Ian Campbell <Ian.Campbell@citrix.com> wrote: > > It all sounds very plausible to me but you''ve forgotten the patch ;-) > > > > Why wait_event_timeout and not wait_event_interruptible to allow users > > to interrupt? In particular I''m concerned about the arbitrarily chosen > > 5s timeout not being sufficient on a busy system. > > FP: I wait_event_interruptible is a choice. But it needs user > operation such as kill command. User-level tool(xenstore-ls for > example) can also set SIGALRM or something else, but it sounds not so > good.I think this is preferable to requiring all clients be reworked to handle a timeout event which they aren''t currently expecting. Rather than have all xenbus clients expect and handle an arbitrary timeout we should provide a new interface to in-kernel users of xenbus which includes a timeout parameter, e.g. foo_timeout. (assuming there are in kernel users who could do anything sane with a timeout, if not then we shouldn''t bother with the interface). For userspace clients I think we would probably be better off making sure that poll/select work properly than trying to find a way to expose timeouts of this sort, that combined with making the sleeps interruptible, would cover the problems we care about, I think.> The timeout parameter is something discussible. 5s may not be a good > one, but I believe xenstored on a healthy system should be response in > 5s. What do you think?I''ve definitely seen cases in heavily loaded domain 0 where it has taken longer than 5s (think of a system which is starting dozens of new domains concurrently for example)> > Once specific pitfall which I remember was that userspace clients are at > > liberty to make use of the req_id themselves (and some do). Fixing this > > might involve shadowing the user provided req_id with a kernel generated > > ID on the ring and unshadowing in the responses... > > FP: Yes, that''s what I supposed to do. But I cannot find any > dereference on the req_id section of the reply msg. If it exist > somewhere, shadowing is surely needed.Userspace clients communicating via the drivers/xen/xenfs/xenbus.c driver are at liberty to use the req_id field for their own purposes. For example, the ocaml xenbus bindings in xen-unstable.hg:tools/ocaml/libs/xb expose this field and I think (although I''m not 100% sure) that the oxenstored (tools/ocaml/xenstored) uses it. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Vincent Hanquez
2011-Mar-04 11:41 UTC
Re: [Xen-devel] [PATCH] Use timeout on xenstore read_reply to avoid task hunging
On 03/04/2011 11:31 AM, Ian Campbell wrote:> For example, the ocaml xenbus bindings in > xen-unstable.hg:tools/ocaml/libs/xb expose this field and I think > (although I''m not 100% sure) that the oxenstored (tools/ocaml/xenstored) > uses it. >oxenstored is echoing it back in the reply, and it doesn''t have any more business for it. The client bindings (ocaml/libs/xs not ocaml/libs/xb) doesn''t use it. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2011-Mar-04 11:53 UTC
Re: [Xen-devel] [PATCH] Use timeout on xenstore read_reply to avoid task hunging
On Fri, 2011-03-04 at 11:41 +0000, Vincent Hanquez wrote:> On 03/04/2011 11:31 AM, Ian Campbell wrote: > > For example, the ocaml xenbus bindings in > > xen-unstable.hg:tools/ocaml/libs/xb expose this field and I think > > (although I''m not 100% sure) that the oxenstored (tools/ocaml/xenstored) > > uses it. > > > oxenstored is echoing it back in the reply, and it doesn''t have any more > business for it.Duh, of course, it''s the server end isn''t it ;-)> The client bindings (ocaml/libs/xs not ocaml/libs/xb) doesn''t use it.Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Frank Pan
2011-Mar-10 10:54 UTC
Re: [Xen-devel] [PATCH] Use timeout on xenstore read_reply to avoid task hunging
> I think this is preferable to requiring all clients be reworked to > handle a timeout event which they aren''t currently expecting.> Rather than have all xenbus clients expect and handle an arbitrary > timeout we should provide a new interface to in-kernel users of xenbus > which includes a timeout parameter, e.g. foo_timeout. (assuming there > are in kernel users who could do anything sane with a timeout, if not > then we shouldn''t bother with the interface).> For userspace clients I think we would probably be better off making > sure that poll/select work properly than trying to find a way to expose > timeouts of this sort, that combined with making the sleeps > interruptible, would cover the problems we care about, I think.How many clients are there? Is it possible to change them all? Poll/select sounds great, but it may change the current xenbus protocol. Also, if the protocol is changing, i think design a method to determine whether xenstored is up is needed. Can we alter xenbus protocol? -- 潘震皓, Frank Pan Computer Science and Technology Tsinghua University _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel