Cui, Dexuan
2009-Jun-16 17:04 UTC
[Xen-devel] [PATCH][ioemu] Ignore the first watch fire in xenstore_process_dm_command_event()
After changeset 19679: ec2bc4b9fa32 (xend: hot-plug PCI devices at boot-time) and the related ioemu commits, I notice there is a race condition that could break the device assignment of hvm guest. The scenario is (assuming we statically assign 1 NIC to hvm guest): 1) ioemu starts up; in xenstore_parse_domain_config(), ioemu adds a watch of "device-model/<dom_id>/command" -- note: xs_watch() would fire the watch immediately and a watch event is added into xsh->watch_list -- see tools/xenstore/xenstored_watch.c:do_watch() -> add_event() and tools/xenstore/xs.c:xs_watch(); 2) in ioemu's main_loop(), ioemu invokes xenstore_record_dm_state("running"); 3) xend invokes signalDeviceModel('pci-ins', 'pci-inserted', bdf_str) thats writes "pci-ins" into "device-model/<dom_id>/command" -- note: it's the second time the watch is fired and a new watch event is added into xsh->watch_list; 4) xend is scheduled out and ioemu is scheduled in (or, both processes are running on different vcpus and ioemu runs a little faster at this time); 5) in main_loop(), ioemu invokes qemu_set_fd_handler(xenstore_fd(), xenstore_process_event, NULL, NULL) and select(); For the command 'pci-ins', in xenstore_process_event(), ioemu 5.1) fetchs the first watch event from xsh->watch_list; in xenstore_process_dm_command_event(), ioemu eventually writes the vslot into /local/domain/0/device-model/<dom_id>/parameter on success; 5.2) fetchs the second watch event and invokes xenstore_process_dm_command_event() again -- this invocation would fail due to /local/domain/0/device-model/<dom_id>/parameter is not a valid bdf string so ioemu would xenstore_record_dm("parameter", "no free hotplug slots"); 6) now xend is scheduled in; in hvm_pci_device_insert_dev(), xend raises VmError since vslot is "no free hotplug slots": raise VmError(("Cannot pass-through PCI function '%s'. " + "Device model reported an error: %s") % (bdf_str, vslot)) The issue can be reproduced on some environments every time. In ioemu's main_loop(), if we add a "sleep(3)" between xenstore_record_dm_state("running") and qemu_set_fd_handler(xenstore_fd(), xenstore_process_event, NULL, NULL), we can reproduce the issue every time. A straightforward thought is: we can ignore the first watch fire. Please see the below patch. However I'm not sure if this is the best fix. And for the other watches(like "logdirty", "hdc") set in xenstore_parse_domain_config(), I think we should also double check them even if no actual bug has been observed yet. Please comment. Signed-off-by: Dexuan Cui <dexuan.cui@intel.com> diff --git a/xenstore.c b/xenstore.c index d2f38d2..508f1c1 100644 --- a/xenstore.c +++ b/xenstore.c @@ -747,6 +747,18 @@ static void xenstore_process_dm_command_event(void) char *path = NULL, *command = NULL, *par = NULL; unsigned int len; + /* xs_watch() would fire the watch immediately -- see + * tools/xenstore/xenstored_watch.c:do_watch() -> add_event(). + * To avoid race condition here, we should ignore the first fire and + * only handle the subsequent fire(s) triggered by the actual changes to + * the xenstore node. + */ + static int first_time = 1; + if (first_time) { + first_time = 0; + goto out; + }; + if (pasprintf(&path, "/local/domain/0/device-model/%u/command", domid) == -1) { fprintf(logfile, "out of memory reading dm command\n"); _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2009-Jun-16 17:23 UTC
[Xen-devel] Re: [PATCH][ioemu] Ignore the first watch fire in xenstore_process_dm_command_event()
Cui, Dexuan writes ("[PATCH][ioemu] Ignore the first watch fire in xenstore_process_dm_command_event()"):> After changeset 19679: ec2bc4b9fa32 (xend: hot-plug PCI devices at > boot-time) and the related ioemu commits, I notice there is a race > condition that could break the device assignment of hvm guest....> A straightforward thought is: we can ignore the first watch > fire. Please see the below patch.I don''t think this can possibly be right. xenstore watches are cues to read the path at which the watch has fired, but number of watch firings is not conserved. So the actual watch processing code should be idempotent. When we start processing a command we should either (i) delete it from xenstore immediately, so that future watch triggerings either don''t see the command or actually see genuine new invocations, or (ii) make a note that we''re processing the command and arrange not to reprocess it until something (us or xend, probably) has deleted it. I''m not sure exactly how the protocol works here ... Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2009-Jun-16 17:27 UTC
[Xen-devel] Re: [PATCH][ioemu] Ignore the first watch fire in xenstore_process_dm_command_event()
On 16/06/2009 18:04, "Cui, Dexuan" <dexuan.cui@intel.com> wrote:> A straightforward thought is: we can ignore the first watch fire. Please see > the below patch. > However I''m not sure if this is the best fix. And for the other watches(like > "logdirty", "hdc") set in xenstore_parse_domain_config(), I think we should > also double check them even if no actual bug has been observed yet. > > Please comment.Can you not treat the watch firing as an indication of maybe-changed, and decide what to do based on current values you read from xenstore from within the watch handler? -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Cui, Dexuan
2009-Jun-16 17:33 UTC
[Xen-devel] RE: [PATCH][ioemu] Ignore the first watch fire in xenstore_process_dm_command_event()
Ian Jackson wrote:> Cui, Dexuan writes ("[PATCH][ioemu] Ignore the first watch fire in > xenstore_process_dm_command_event()"): >> After changeset 19679: ec2bc4b9fa32 (xend: hot-plug PCI devices at >> boot-time) and the related ioemu commits, I notice there is a race >> condition that could break the device assignment of hvm guest. ... >> A straightforward thought is: we can ignore the first watch >> fire. Please see the below patch. > > I don''t think this can possibly be right. xenstore watches are cues > to read the path at which the watch has fired, but number of watch > firings is not conserved. So the actual watch processing code should > be idempotent. > > When we start processing a command we should either (i) delete it from > xenstore immediately, so that future watch triggerings either don''t > see the command or actually see genuine new invocations,Thanks for your suggestion. I prefer this method. I''ll post a new patch.> or (ii) make > a note that we''re processing the command and arrange not to reprocess > it until something (us or xend, probably) has deleted it. > > I''m not sure exactly how the protocol works here ... >Thanks, -- Dexuan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2009-Jun-16 17:42 UTC
[Xen-devel] RE: [PATCH][ioemu] Ignore the first watch fire in xenstore_process_dm_command_event()
Cui, Dexuan writes ("RE: [PATCH][ioemu] Ignore the first watch fire in xenstore_process_dm_command_event()"):> Ian Jackson wrote: > > When we start processing a command we should either (i) delete it from > > xenstore immediately, so that future watch triggerings either don''t > > see the command or actually see genuine new invocations, > > Thanks for your suggestion. > I prefer this method. I''ll post a new patch.Right. If you do this, make sure that you don''t give xend the impression it''s safe to overwrite the parameters of the old command until you''ve read them all. Otherwise two commands in quick succession (or simultaneously) will go wrong. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Cui, Dexuan
2009-Jun-16 18:28 UTC
[Xen-devel] [PATCH][ioemu] in xenstore_process_dm_command_event(), xs_rm the command node after we read it.
As Ian Jackson pointed out: "when we start processing a command we should delete it from xenstore immediately, so that future watch triggerings either don''t see the command or actually see genuine new invocations". The patch is used to overcome a race condition that occurs after changeset 19679: ec2bc4b9fa32 (xend: hot-plug PCI devices at boot-time) and could break the device assignment of hvm guest: ioemu''s xs_watch() fires the ''command'' for the 1st time and xend''s signalDeviceModel(''pci-ins'',...) fires it for the 2nd time -- without the patch, the 2nd time watch handling would try to invoke xenstore_process_dm_command_event() again and since the ''parameter'' node has been changed to hold vslot by ioemu, the second time would fail and set ''parameter'' to "no free hotplug slots" at the end of the 2nd handling; and, if xend runs slower, xend would treat the ''parameter'' of the 2nd time as that of the 1st time and destroy the guest. Signed-off-by: Dexuan Cui <dexuan.cui@intel.com> diff --git a/xenstore.c b/xenstore.c index d2f38d2..0618c80 100644 --- a/xenstore.c +++ b/xenstore.c @@ -756,6 +756,9 @@ static void xenstore_process_dm_command_event(void) if (!command) goto out; + if (!xs_rm(xsh, XBT_NULL, path)) + fprintf(logfile, "xs_rm failed: path=%s\n", path); + if (!strncmp(command, "save", len)) { fprintf(logfile, "dm-command: pause and save state\n"); xen_pause_requested = 1; _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Jackson
2009-Jun-18 13:51 UTC
[Xen-devel] Re: [PATCH][ioemu] in xenstore_process_dm_command_event(), xs_rm the command node after we read it.
Cui, Dexuan writes ("[PATCH][ioemu] in xenstore_process_dm_command_event(), xs_rm the command node after we read it."):> + if (!xs_rm(xsh, XBT_NULL, path)) > + fprintf(logfile, "xs_rm failed: path=%s\n", path); > +Are you sure this is right ? What normally removes this path ? Should that other code be deleted ? Will bringing the removal forward have any other implications ? As I said: If you do this, make sure that you don''t give xend the impression it''s safe to overwrite the parameters of the old command until you''ve read them all. Otherwise two commands in quick succession (or simultaneously) will go wrong. Ideally the protocol would be documented somewhere, at least in a comment. Then it would be easy to see whether it was a correct protocol, and whether each side implemented it correctly. If you don''t have time to do this kind of analysis just say, and I''ll look at it all in more detail. Sorry to be sceptical. In the meantime I''ll apply your patch :-). Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Cui, Dexuan
2009-Jun-19 02:01 UTC
[Xen-devel] RE: [PATCH][ioemu] in xenstore_process_dm_command_event(), xs_rm the command node after we read it.
> Ian Jackson wrote: > > When we start processing a command we should either (i) delete it from > > xenstore immediately, so that future watch triggerings either don''t > > see the command or actually see genuine new invocations,Maybe I misunderstood this. I thought I should rm the ''path'' (namely the ''command'') as the above suggested. rm the ''command'' itself would trigger 1 more watch fire, and the watch handler in qemu-dm trying to xs_read_watch() gets a NULL and ignores this fire. The patch can fix the issue I meet with. Actually I''m not very familiar with the protocols among xenstored,qemu-dm and xend as there is no document/comment at all... Looks there are some simple sync mechanism in xend/qemu-dm, but I don''t think they are enough. I''ll appreciate it if you could help to do an analysis and ducoment the protocols by comments. :-) Thanks, -- Dexuan -----Original Message----- From: Ian Jackson [mailto:Ian.Jackson@eu.citrix.com] Sent: 2009?6?18? 21:51 To: Cui, Dexuan Cc: xen-devel@lists.xensource.com; Keir Fraser; Simon Horman Subject: Re: [PATCH][ioemu] in xenstore_process_dm_command_event(), xs_rm the command node after we read it. Cui, Dexuan writes ("[PATCH][ioemu] in xenstore_process_dm_command_event(), xs_rm the command node after we read it."):> + if (!xs_rm(xsh, XBT_NULL, path)) > + fprintf(logfile, "xs_rm failed: path=%s\n", path); > +Are you sure this is right ? What normally removes this path ? Should that other code be deleted ? Will bringing the removal forward have any other implications ? As I said: If you do this, make sure that you don''t give xend the impression it''s safe to overwrite the parameters of the old command until you''ve read them all. Otherwise two commands in quick succession (or simultaneously) will go wrong. Ideally the protocol would be documented somewhere, at least in a comment. Then it would be easy to see whether it was a correct protocol, and whether each side implemented it correctly. If you don''t have time to do this kind of analysis just say, and I''ll look at it all in more detail. Sorry to be sceptical. In the meantime I''ll apply your patch :-). Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel