Hi, A couple of months ago, I run into a xenstored IO ring issue ( see http://lists.xen.org/archives/html/xen-devel/2013-07/msg00134.html) and at that time we add a protection on oxenstored side. Somehow I meet this problem again. A suse11 hvm domain start and hang with error: XENBUS request ring is not quiescent (00000010:00000000)! XENBUS response ring is not quiescent (00000000:00000013): fixing up It happens when pv-driver init xenbus modules ... int xb_init_comms(void) { struct xenstore_domain_interface *intf = xen_store_interface; int err; if (intf->req_prod != intf->req_cons) pr_err("XENBUS request ring is not quiescent " "(%08x:%08x)!\n", intf->req_cons, intf->req_prod); if (intf->rsp_prod != intf->rsp_cons) { pr_warning("XENBUS response ring is not quiescent" " (%08x:%08x): fixing up\n", intf->rsp_cons, intf->rsp_prod); /* breaks kdump */ if (!reset_devices) intf->rsp_cons = intf->rsp_prod; } ... With the protection logic, oxenstored will treat this domain as " malicious domains", but apparently it''s not. And this time I review all the related source code, and found that it may be caused by a bug in hvmloader. In hvmloader after all the setup, there will be a xenbus_shutdown call which will clear up the ring, void xenbus_shutdown(void) { ... /* We zero out the whole ring -- the backend can handle this, and it''s * not going to surprise any frontends since it''s equivalent to never * having used the rings. */ memset(rings, 0, sizeof *rings); // <-- clear it here ... } And memset is implemented as this void * memset(void *s, int c, unsigned n) { uint8_t b = (uint8_t) c; uint8_t *p = (uint8_t *)s; int i; for ( i = 0; i < n; i++ ) *p++ = b; return s; } The ring structure is put in a shared page and definition is struct xenstore_domain_interface { char req[XENSTORE_RING_SIZE]; /* Requests to xenstore daemon. */ char rsp[XENSTORE_RING_SIZE]; /* Replies and async watch events. */ XENSTORE_RING_IDX req_cons, req_prod; XENSTORE_RING_IDX rsp_cons, rsp_prod; }; So here is the problem: memset can not set all the page to zero in an atomic way, and during the clear up process oxenstored may access this ring. So memset set every 8bit in this page to zero one by one and when req_cons set to zero oxenstored read req_cons and req_prod to compare, After that oxenstored will process as a normal incoming request and memset will zero out the left page. Finally oxenstored will increase req_cons but req_prod will remain as 0 and that status will be treated as a "bad connection". To fix that we need to make memset atomicly (clear up req_cons and req_prod the same time at least) or we can make oxenstored do not to access the ring when hvmloader clearing up. Any suggestion?
On Wed, 2013-12-04 at 04:25 +0000, Liuqiming (John) wrote:> memset can not set all the page to zero in an atomic way, and during > the clear up process oxenstored may access this ring.Why is oxenstored poking at the ring? Surely it should only do so when the guest (hvmloader) sends it a request. If hvmloader is clearing the page while there is a request/event outstanding then this is an hvmloader bug. Ian.
According to oxenstored source code, oxenstored will read every domain''s IO ring no matter what events happened. Here is the main loop of oxenstored: let main_loop () incr periodic_ops_counter; if !periodic_ops_counter > 20 then ( periodic_ops_counter := 0; periodic_ops (); ); let mw = Connections.has_more_work cons in let inset, outset = Connections.select cons in let timeout = if List.length mw > 0 then 0. else -1. in let rset, wset, _ try Unix.select (spec_fds @ inset) outset [] timeout <-- poll event from fd set including /dev/xen/evtchn with Unix.Unix_error(Unix.EINTR, _, _) -> [], [], [] in let sfds, cfds List.partition (fun fd -> List.mem fd spec_fds) rset in if List.length sfds > 0 then process_special_fds sfds; if List.length cfds > 0 || List.length wset > 0 then process_connection_fds store cons domains cfds wset; process_domains store cons domains <- no matter what event income, this will handle IO ring request of all domains in so when one domain''s hvmloader is clearing its IO ring, oxenstored may access this IO ring because of another domain''s event happened.> -----Original Message----- > From: Ian Campbell [mailto:Ian.Campbell@citrix.com] > Sent: Wednesday, December 04, 2013 5:50 PM > To: Liuqiming (John) > Cc: xen-devel@lists.xen.org; Yanqiangjun > Subject: Re: [Xen-devel] bug in hvmloader xenbus_shutdown logic > > On Wed, 2013-12-04 at 04:25 +0000, Liuqiming (John) wrote: > > > memset can not set all the page to zero in an atomic way, and during > > the clear up process oxenstored may access this ring. > > Why is oxenstored poking at the ring? Surely it should only do so when > the guest (hvmloader) sends it a request. If hvmloader is clearing the > page while there is a request/event outstanding then this is an > hvmloader bug. > > Ian.
On 05/12/13 02:08, Liuqiming (John) wrote:> According to oxenstored source code, oxenstored will read every domain''s IO ring no matter what events happened. > > Here is the main loop of oxenstored: > > let main_loop () ... > process_domains store cons domains <- no matter what event income, this will handle IO ring request of all domains > in > > so when one domain''s hvmloader is clearing its IO ring, oxenstored may access this IO ring because of another domain''s event happened.Yes, this version of the code checks all the interdomain rings every time it goes around the main loop. FWIW my experimental updated oxenstore uses one (user-level) thread per connection. I had thought this was just an efficiency issue... I didn''t realise it had correctness implications.>> On Wed, 2013-12-04 at 04:25 +0000, Liuqiming (John) wrote: >> >>> memset can not set all the page to zero in an atomic way, and during >>> the clear up process oxenstored may access this ring. >> >> Why is oxenstored poking at the ring? Surely it should only do so when >> the guest (hvmloader) sends it a request. If hvmloader is clearing the >> page while there is a request/event outstanding then this is an >> hvmloader bug.Ok, that makes sense. My only hesitation is that violations of this rule (like with current oxenstored) show up rarely. Presumably the xenstore ring desynchronises and the guest will never be able to boot properly with PV drivers. I don''t think I''ve ever seen this myself. Do frontends expect the xenstore ring to be in a zeroed state when they startup? Assuming hvmloader has received all the outstanding request/events, would it be safe to leave the ring contents alone? Cheers, Dave
> -----Original Message----- > From: xen-devel-bounces@lists.xen.org [mailto:xen-devel- > bounces@lists.xen.org] On Behalf Of David Scott > Sent: 05 December 2013 09:36 > To: Liuqiming (John); Ian Campbell > Cc: Yanqiangjun; xen-devel@lists.xen.org > Subject: Re: [Xen-devel] bug in hvmloader xenbus_shutdown logic > > On 05/12/13 02:08, Liuqiming (John) wrote: > > According to oxenstored source code, oxenstored will read every domain''s > IO ring no matter what events happened. > > > > Here is the main loop of oxenstored: > > > > let main_loop () > ... > > process_domains store cons domains <- no matter what event > income, this will handle IO ring request of all domains > > in > > > > so when one domain''s hvmloader is clearing its IO ring, oxenstored may > access this IO ring because of another domain''s event happened. > > Yes, this version of the code checks all the interdomain rings every > time it goes around the main loop. FWIW my experimental updated > oxenstore uses one (user-level) thread per connection. I had thought > this was just an efficiency issue... I didn''t realise it had correctness > implications. > > >> On Wed, 2013-12-04 at 04:25 +0000, Liuqiming (John) wrote: > >> > >>> memset can not set all the page to zero in an atomic way, and during > >>> the clear up process oxenstored may access this ring. > >> > >> Why is oxenstored poking at the ring? Surely it should only do so when > >> the guest (hvmloader) sends it a request. If hvmloader is clearing the > >> page while there is a request/event outstanding then this is an > >> hvmloader bug. > > Ok, that makes sense. > > My only hesitation is that violations of this rule (like with current > oxenstored) show up rarely. Presumably the xenstore ring desynchronises > and the guest will never be able to boot properly with PV drivers. I > don''t think I''ve ever seen this myself. > > Do frontends expect the xenstore ring to be in a zeroed state when they > startup? Assuming hvmloader has received all the outstanding > request/events, would it be safe to leave the ring contents alone? >I think there are probably some PV driver frontends that don''t expect the ring to have been messed with. You could probably consider that a frontend bug, but restoring the ring to pristine state seems like it''s a good thing to do. However zeroing out a shared ring with no explicit notification to the backend also seems like a very risky thing to do. Paul
On Thu, 2013-12-05 at 09:36 +0000, David Scott wrote:> On 05/12/13 02:08, Liuqiming (John) wrote: > > According to oxenstored source code, oxenstored will read every domain''s IO ring no matter what events happened. > > > > Here is the main loop of oxenstored: > > > > let main_loop () > ... > > process_domains store cons domains <- no matter what event income, this will handle IO ring request of all domains > > in > > > > so when one domain''s hvmloader is clearing its IO ring, oxenstored may access this IO ring because of another domain''s event happened. > > Yes, this version of the code checks all the interdomain rings every > time it goes around the main loop. FWIW my experimental updated > oxenstore uses one (user-level) thread per connection. I had thought > this was just an efficiency issue... I didn''t realise it had correctness > implications.TBH the protocol probably doesn''t say anything about not looking unless you''ve got a event, so I guess it is kind of implicit and I suppose it could be argued that what you do is fine (except it''s broken in practice ;-)) Is there some well defined order in which we could clear the various pointer fields safely? I''d need to get my ring macro head on to figure that out I think.> >> On Wed, 2013-12-04 at 04:25 +0000, Liuqiming (John) wrote: > >> > >>> memset can not set all the page to zero in an atomic way, and during > >>> the clear up process oxenstored may access this ring. > >> > >> Why is oxenstored poking at the ring? Surely it should only do so when > >> the guest (hvmloader) sends it a request. If hvmloader is clearing the > >> page while there is a request/event outstanding then this is an > >> hvmloader bug. > > Ok, that makes sense. > > My only hesitation is that violations of this rule (like with current > oxenstored) show up rarely. Presumably the xenstore ring desynchronises > and the guest will never be able to boot properly with PV drivers. I > don''t think I''ve ever seen this myself. > > Do frontends expect the xenstore ring to be in a zeroed state when they > startup? Assuming hvmloader has received all the outstanding > request/events, would it be safe to leave the ring contents alone?Most frontends also have private versions of various pointers which would need to be synchronised, and I suspect all of the init to zero. Ian.
How about we add a XS_RESET operation to xenstore and the semantic should be: reset the connection to initial state The work flow as: 1, Hvmloader send this request to oxenstored and then poll the event 2,Oxenstored clear up IO ring and do some other work(if any) to make sure this connect reset to initial state 3,Oxenstored send a event to notify the reset work has finished I think oxenstored should be the "owner" of this IO ring, so all the complicated operation should be done by oxenstord and other component should just send the request.> -----Original Message----- > From: Ian Campbell [mailto:Ian.Campbell@citrix.com] > Sent: Thursday, December 05, 2013 7:11 PM > To: David Scott > Cc: Liuqiming (John); Yanqiangjun; xen-devel@lists.xen.org > Subject: Re: [Xen-devel] bug in hvmloader xenbus_shutdown logic > > On Thu, 2013-12-05 at 09:36 +0000, David Scott wrote: > > On 05/12/13 02:08, Liuqiming (John) wrote: > > > According to oxenstored source code, oxenstored will read every > domain's IO ring no matter what events happened. > > > > > > Here is the main loop of oxenstored: > > > > > > let main_loop () > > ... > > > process_domains store cons domains <- no matter what event > income, this will handle IO ring request of all domains > > > in > > > > > > so when one domain's hvmloader is clearing its IO ring, oxenstored may > access this IO ring because of another domain's event happened. > > > > Yes, this version of the code checks all the interdomain rings every > > time it goes around the main loop. FWIW my experimental updated > > oxenstore uses one (user-level) thread per connection. I had thought > > this was just an efficiency issue... I didn't realise it had correctness > > implications. > > TBH the protocol probably doesn't say anything about not looking unless > you've got a event, so I guess it is kind of implicit and I suppose it > could be argued that what you do is fine (except it's broken in > practice ;-)) > > Is there some well defined order in which we could clear the various > pointer fields safely? I'd need to get my ring macro head on to figure > that out I think. > > > >> On Wed, 2013-12-04 at 04:25 +0000, Liuqiming (John) wrote: > > >> > > >>> memset can not set all the page to zero in an atomic way, and during > > >>> the clear up process oxenstored may access this ring. > > >> > > >> Why is oxenstored poking at the ring? Surely it should only do so when > > >> the guest (hvmloader) sends it a request. If hvmloader is clearing the > > >> page while there is a request/event outstanding then this is an > > >> hvmloader bug. > > > > Ok, that makes sense. > > > > My only hesitation is that violations of this rule (like with current > > oxenstored) show up rarely. Presumably the xenstore ring desynchronises > > and the guest will never be able to boot properly with PV drivers. I > > don't think I've ever seen this myself. > > > > Do frontends expect the xenstore ring to be in a zeroed state when they > > startup? Assuming hvmloader has received all the outstanding > > request/events, would it be safe to leave the ring contents alone? > > Most frontends also have private versions of various pointers which > would need to be synchronised, and I suspect all of the init to zero. > > Ian._______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel