Masami Watanabe
2006-Nov-07 03:46 UTC
[Xen-devel] [RFC] "xs_read(): uuid get error" of qemu-dm
Hi all,

Since c/s 11840, the qemu-dm process becomes <defunct> on guest reboot, and the qemu log says "xs_read(): uuid get error".
This happens because vncpasswd is not yet readable when qemu-dm reads it from xenstore.
(xend spawns qemu-dm before writing vncpasswd to xenstore.)

I think the following actions are necessary. What do you think?

- Change the call order of vm.initDomain() and vm.storeVmDetails() in create()@XendDomainInfo.py.
  It looks safe. Is there any problem with it?
  I tried this, but the error still occurs sometimes.
- writeVm() should guarantee that the write to xenstore has completed.
  Is that possible?
- As a temporary fix, have qemu-dm wait a few seconds until the value can be read.
  I will try this.

Please comment.

Regards,
Masami

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Keir Fraser
2006-Nov-07 08:18 UTC
[Xen-devel] Re: [RFC] "xs_read(): uuid get error" of qemu-dm
On 7/11/06 3:46 am, "Masami Watanabe" <masami.watanabe@jp.fujitsu.com> wrote:

> since c/s 11840, qemu-dm process is <defunct>, and the qemu log says
> "xs_read(): uuid get error" in guest reboot.
> This is because of being not able to read yet when qemu-dm reads
> vncpasswd from xenstore.
> (xend has spawned qemu-dm before writing vncpasswd to xenstore)

This was supposed to be fixed by c/s 12187.

If it hasn't, we need to fix xend to write the passwd before starting qemu, and/or qemu needs to treat failure of the xs_read() as an indication that there is no authentication.

What do you think is the problem? Is the passwd getting written after qemu is started and hence racing the xs_read() in xenstored?

We don't want to work around this with timeouts.

 -- Keir
Keir Fraser
2006-Nov-07 08:26 UTC
[Xen-devel] Re: [RFC] "xs_read(): uuid get error" of qemu-dm
On 7/11/06 3:46 am, "Masami Watanabe" <masami.watanabe@jp.fujitsu.com> wrote:

> - The change of the call order of vm.initDomain() and vm.storeVmDetails()
>   in create()@XendDomainInfo.py.
>   It looks safe. Isn't there problem ?
>   I tried this. Still, the error sometimes occurs.

I'm not sure. Does this change the ordering of the writeVM versus creation of the qemu-dm process?

> - writeVm() should guarantee the completion of writing of xenstore.
>   Is it possible ?

You mean storeVm? Yes, it should complete synchronously, unless it's part of a larger transaction, in which case it completes synchronously when the transaction commits (if the transaction commits successfully). Writes aren't buffered or delayed except as part of a transaction.

 -- Keir
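The visibility rule described above (a plain write is visible as soon as it returns, a write inside a transaction only once the transaction commits) can be illustrated with a toy model. This is only a sketch: ToyStore and ToyTransaction are hypothetical in-memory classes invented for illustration, not the xend or xenstore API, and the paths and values are made up. It also ignores the case where a commit fails.

```python
class ToyStore:
    """Hypothetical in-memory stand-in for a store with transactions."""

    def __init__(self):
        self._data = {}

    def read(self, path):
        # Returns None when the node does not exist (yet).
        return self._data.get(path)

    def write(self, path, value):
        # A plain write is visible to every reader as soon as it returns.
        self._data[path] = value

    def transaction(self):
        return ToyTransaction(self)


class ToyTransaction:
    """Collects writes and publishes them all at once on commit()."""

    def __init__(self, store):
        self._store = store
        self._pending = {}

    def write(self, path, value):
        # Held back: other readers cannot see this until commit().
        self._pending[path] = value

    def commit(self):
        # In this toy model a commit always succeeds.
        self._store._data.update(self._pending)
        self._pending = {}


store = ToyStore()

# Plain write: visible immediately.
store.write("/vm/example-uuid/name", "guest")
direct_visible = store.read("/vm/example-uuid/name")

# Transactional write: invisible to other readers until the commit.
t = store.transaction()
t.write("/vm/example-uuid/vncpasswd", "secret")
before_commit = store.read("/vm/example-uuid/vncpasswd")
t.commit()
after_commit = store.read("/vm/example-uuid/vncpasswd")
```

In this model a reader that runs between t.write() and t.commit() sees nothing, which is the only window in which a completed write call would not yet be readable.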
Masami Watanabe
2006-Nov-08 06:13 UTC
[Xen-devel] Re: [RFC] "xs_read(): uuid get error" of qemu-dm
Hi Keir,

My explanation was insufficient.

"xs_read(): uuid get error" happens when the uuid can't be read from xenstore in xenstore_read_vncpasswd@tools/ioemu/xenstore.c.

c/s 12187 avoided this problem on guest reboot in many environments. In my environment too, that change fixed the problem.

However, the following problem has kept happening since then, and I think it is a real problem.

[Xen-devel] VMX status report 12254:f8ffeb540ec1
http://lists.xensource.com/archives/html/xen-devel/2006-11/msg00288.html
[Xen-devel] VMX status report 12217:20204db0891b
http://lists.xensource.com/archives/html/xen-devel/2006-11/msg00183.html

> IA32/PAE/IA32E: Windows and Linux VMX domains may fail to be
> created, the qemu-dm process is <defunct>, and the qemu log says
> "xs_read(): uuid get error."

I investigated it. I was able to reproduce the problem in an environment where two or more CPUs are allocated to Dom0.
The results are as follows.

- The uuid cannot be read by xenstore_read_vncpasswd() in qemu-dm.
- The uuid can often be read if the order of vm.initDomain() and vm.storeVmDetails() in create()@XendDomainInfo.py is changed.
- When the read in qemu-dm is delayed, the uuid can always be read.

From the above, I concluded that this is a timing problem between one process writing to xenstore and another process reading from it.

> Is the passwd getting written after qemu
> is started and hence racing the xs_read() in xenstored?

Yes, maybe. I understand the order of processing in xend to be as follows. Am I misunderstanding?

create()@XendDomainInfo.py+135
  start()
    _initDomain()
      _createDevices()
        createDeviceModel(self)@image.py
          os.spawnve()      ==============> start qemu-dm process
    _storeVmDetails()
      _writeVm()            ==============> write to xenstore
      _setVmPermissions()

Masami
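The ordering problem in the call sequence above can be sketched as a small deterministic model. This is not xend code: the store is a plain dict, "vm-details" is a made-up key standing in for whatever _writeVm() writes, and spawn_device_model() is a hypothetical stand-in for a child process that reads once, immediately at start-up. That is the worst case, which a multi-CPU Dom0 makes more likely because the child can run before the parent continues.

```python
def spawn_device_model(store):
    """Stand-in for the child process: reads the store once at start-up."""
    return "started" if "vm-details" in store else "read failed"


def start_spawn_first(store):
    """Order reported in the thread: spawn the child, then write."""
    result = spawn_device_model(store)  # the child reads here
    store["vm-details"] = "written"     # too late for the child
    return result


def start_write_first(store):
    """Alternative order: write first, then spawn the child."""
    store["vm-details"] = "written"
    return spawn_device_model(store)


spawn_first_result = start_spawn_first({})
write_first_result = start_write_first({})
```

With the real processes the first ordering fails only sometimes, depending on how quickly the child gets to its read. Writing before spawning removes the dependence on timing, which is why a delay in the child merely hides the problem.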
Keir Fraser
2006-Nov-08 08:22 UTC
[Xen-devel] Re: [RFC] "xs_read(): uuid get error" of qemu-dm
On 8/11/06 6:13 am, "Masami Watanabe" <masami.watanabe@jp.fujitsu.com> wrote:

> Yes, maybe. I understand the order of processing xend as follows.
> Is it my misunderstanding ?

Actually it's not the read of the 'vncpasswd' field that is failing but the read of the 'vm' field. Maybe that happens later still.

I can't repro this right now as I can't reboot any guests at the moment. I think it's been broken by the xen-api changes.

 -- Keir
Ewan Mellor
2006-Nov-08 18:35 UTC
Re: [Xen-devel] Re: [RFC] "xs_read(): uuid get error" of qemu-dm
On Wed, Nov 08, 2006 at 03:13:43PM +0900, Masami Watanabe wrote:

> > Is the passwd getting written after qemu
> > is started and hence racing the xs_read() in xenstored?
>
> Yes, maybe. I understand the order of processing xend as follows.
> Is it my misunderstanding ?
>
> create()@XendDomainInfo.py+135
>   start()
>     _initDomain()
>       _createDevices()
>         createDeviceModel(self)@image.py
>           os.spawnve()      ==============> start qemu-dm process
>     _storeVmDetails()
>       _writeVm()            ==============> write to xenstore
>       _setVmPermissions()

I've just put a patch in that ought to help. We can't reproduce this race here, but perhaps you could give it a try for me.

diff -r 9a43cc89ae0a tools/python/xen/xend/XendDomainInfo.py
--- a/tools/python/xen/xend/XendDomainInfo.py   Wed Nov 08 18:27:31 2006 +0000
+++ b/tools/python/xen/xend/XendDomainInfo.py   Wed Nov 08 18:08:28 2006 +0000
@@ -678,6 +678,7 @@ class XendDomainInfo:
         t.remove()
         t.mkdir()
         t.set_permissions({ 'dom' : self.domid })
+        t.write('vm', self.vmpath)

     def _storeDomDetails(self):
         to_store = {

The /vm/<uuid>/vncpasswd node is written before the call to createDeviceModel, in configVNC, but you need the /local/domain/<domid>/vm node to be present too, and it's this one that isn't written until after qemu-dm is started.

HTH,

Ewan.
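The dependency between the two nodes can be sketched as a two-step lookup. This is an assumption about the shape of the read, based only on the explanation above: the reader first resolves /local/domain/<domid>/vm to the VM path and then reads vncpasswd under that path. The store is a plain dict, and read_vncpasswd(), the example uuid and the password value are hypothetical names for illustration, not the code in tools/ioemu/xenstore.c.

```python
def read_vncpasswd(store, domid):
    """Two-step lookup: domain node -> VM path -> password node."""
    vm_path = store.get("/local/domain/%d/vm" % domid)
    if vm_path is None:
        # The first step fails, so the password is never looked up.
        raise LookupError("uuid get error")
    return store.get(vm_path + "/vncpasswd")


VM_PATH = "/vm/example-uuid"

# State before the patch: the password node exists, the 'vm' node does not.
store = {VM_PATH + "/vncpasswd": "secret"}
try:
    read_vncpasswd(store, 1)
    first_attempt = "ok"
except LookupError as exc:
    first_attempt = str(exc)

# State after the patch: the 'vm' node is written early as well.
store["/local/domain/1/vm"] = VM_PATH
second_attempt = read_vncpasswd(store, 1)
```

The sketch shows why writing vncpasswd early was not enough on its own: the lookup never reaches the password node while the 'vm' node is missing.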
Masami Watanabe
2006-Nov-09 07:44 UTC
Re: [Xen-devel] Re: [RFC] "xs_read(): uuid get error" of qemu-dm
Hi Ewan,

> I've just put a patch in that ought to help. We can't reproduce this
> race here, but perhaps you could give it a try for me.

Many thanks for your patch.
It solved the "xs_read(): uuid get error" that occurred on c/s 12307 and earlier.

When will this patch be committed?

Masami
Ewan Mellor
2006-Nov-09 09:35 UTC
Re: [Xen-devel] Re: [RFC] "xs_read(): uuid get error" of qemu-dm
On Thu, Nov 09, 2006 at 04:44:43PM +0900, Masami Watanabe wrote:

> Hi Ewan,
>
> > I've just put a patch in that ought to help. We can't reproduce this
> > race here, but perhaps you could give it a try for me.
>
> Special thanks for your patch.
> Your patch splendidly solved "xs_read(): uuid get error" that occurred
> on c/s 12307 and before that.
>
> When is this patch committed ?

Great, thank you. It's on its way (I forgot to push it last night ;-)

Ewan.