Akio Takebe
2007-Aug-31 05:15 UTC
[Xen-devel] [RFC][Patch] Improvemet the responce of xend.
Hi all,

This is an idea to improve the responsiveness of xend.

When a domU panics and xend is dumping its core, "xm list" and "xm create" get no response from xend. If a domU panics and xend does not respond, users will probably think the system has hung, and they may reboot it. If the domU is allocated a lot of memory, the dump takes a long time. (e.g. If the domU has 256MB of memory, the dump may take 1 hour or more.)

I made a patch in which xend forks at dump time. But the child process of xend cannot write to xenstore. Why can't it write to xenstore? Or did I make a mistake somewhere? Do you have a better idea?

Any comments are welcome. :-)

Signed-off-by: Akio Takebe <takebe_akio@jp.fujitsu.com>

diff -r 6644d8486266 tools/python/xen/xend/XendDomainInfo.py
--- a/tools/python/xen/xend/XendDomainInfo.py	Fri Aug 24 15:09:14 2007 -0600
+++ b/tools/python/xen/xend/XendDomainInfo.py	Fri Aug 31 13:13:57 2007 +0900
@@ -1107,6 +1107,18 @@ class XendDomainInfo:
     def getRestartCount(self):
         return self._readVm('xend/restart_count')
 
+    def getIs_dumping(self):
+        return self._readVm('xend/is_dumping')
+
+    def setIs_dumping(self, status):
+        return self._writeVm('xend/is_dumping', status)
+
+    def getDumpPid(self):
+        return self._readVm('xend/dump_pid')
+
+    def setDumpPid(self, status):
+        return self._writeVm('xend/dump_pid', status)
+
     def refreshShutdown(self, xeninfo = None):
         """ Checks the domain for whether a shutdown is required.
 
@@ -1164,8 +1176,12 @@ class XendDomainInfo:
                         # we can do in this context.
                         pass
 
-                restart_reason = 'crash'
-                self._stateSet(DOM_STATE_HALTED)
+                dump_status = self.getIs_dumping()
+                if dump_status == 'dumping':
+                    return
+                else:
+                    restart_reason = 'crash'
+                    self._stateSet(DOM_STATE_HALTED)
 
             elif xeninfo['shutdown']:
                 self._stateSet(DOM_STATE_SHUTDOWN)
@@ -1368,13 +1384,43 @@ class XendDomainInfo:
             if os.path.isdir(corefile):
                 raise XendError("Cannot dump core in a directory: %s" %
                                 corefile)
-
-            xc.domain_dumpcore(self.domid, corefile)
+            dump_status = self.getIs_dumping()
+            if dump_status == 'dumped':
+                dump_pid = self.getDumpPid()
+                pid_exited, status = os.waitpid(dump_pid, 0)
+                self.setDumpPid(str(0))
+                if status != 0:
+                    log.exception("XendDomainInfo.dumpCore might failed: dump pid = %s status = %s",
+                                  dump_pid, status)
+
+            elif dump_status == 'dumping':
+                return
+            else:
+                dump_pid = os.fork()
+                if dump_pid:
+                    self.setDumpPid(str(dump_pid))
+                    self.setIs_dumping('dumping')
+                else:
+                    try:
+                        xc.domain_dumpcore(self.domid, corefile)
+                        log.warn('child xend: finish dumping')
+                        self.setIs_dumping('dumped')
+                        log.warn('child xend: exit')
+                        sys.exit(0)
+                    except RuntimeError, ex:
+                        corefile_incomp = corefile+'-incomplete'
+                        os.rename(corefile, corefile_incomp)
+                        log.exception("XendDomainInfo.dumpCore failed: id = %s name = %s",
+                                      self.domid, self.info['name_label'])
+                        self.setIs_dumping('dumped')
+                        sys.exit(1)
         except RuntimeError, ex:
             corefile_incomp = corefile+'-incomplete'
             os.rename(corefile, corefile_incomp)
             log.exception("XendDomainInfo.dumpCore failed: id = %s name = %s",
                           self.domid, self.info['name_label'])
+            if self.getDumpPID != 0:
+                self.setIs_dumping('dumped')
             raise XendError("Failed to dump core: %s" % str(ex))
 
     #
@@ -2061,6 +2107,10 @@ class XendDomainInfo:
 
         if not self._readVm('xend/restart_count'):
             to_store['xend/restart_count'] = str(0)
+        if not self._readVm('xend/is_dumping'):
+            to_store['xend/is_dumping'] = 'no'
+        if not self._readVm('xend/dump_pid'):
+            to_store['xend/dump_pid'] = str(0)
 
         log.debug("Storing VM details: %s", scrub_password(to_store))
 

Best Regards,
Akio Takebe

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
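The fork-at-dump-time idea in this message can be illustrated outside of xend. The following is only a minimal sketch in modern Python, not the patch itself: `start_background_job` and `poll_job` are hypothetical names, the slow job is an arbitrary callable standing in for the core dump, and nothing here talks to xenstore. It shows the parent staying free to serve other requests while it polls the child without blocking.

```python
import os


def start_background_job(work):
    """Fork; the child runs work() and exits, the parent gets the child pid."""
    pid = os.fork()
    if pid == 0:
        # Child: do the slow job, then exit immediately without running
        # any cleanup handlers inherited from the parent process.
        status = 0
        try:
            work()
        except Exception:
            status = 1
        os._exit(status)
    # Parent: return at once so it can keep answering other requests.
    return pid


def poll_job(pid):
    """Return None while the child is still running, else its exit code."""
    done_pid, status = os.waitpid(pid, os.WNOHANG)
    if done_pid == 0:
        return None  # child has not finished yet
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -1  # child was killed by a signal
```

A caller would invoke `poll_job(pid)` from its normal request loop and treat a non-zero result as a failed job. Note that `os.waitpid` expects an integer pid, so a pid read back from a string store would need `int(...)` first.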
Peter Teoh
2007-Aug-31 06:37 UTC
[Xen-devel] Re: [RFC][Patch] Improvemet the responce of xend.
A few comments:

a. The patch basically spawns a new process to handle additional "xm" inputs asynchronously. Correct? Since this is not currently possible, it also means that the possibility of deadlocks or race conditions through concurrent access to xend by different xm users has not been thoroughly verified. Correct? Maybe. So this is something to look out for.

b. Dumping core is slow, mainly because external hard disks are slow. Therefore, there could be four different options/variations: minidump vs. full dump, and for each a compressed vs. uncompressed option. Compression gives a smaller physical core.

c. Furthermore, there could be additional throttling parameters to specify how much to slow the coredump operation, for example by forcing a CPU rescheduling operation after every fixed number of blocks (higher task-switching overhead, traded off for responsiveness). This would also improve performance for the other domUs' operations.

Thank you very much.
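Points b and c can be sketched with an ordinary file copy standing in for the dump writer. This is an illustration under stated assumptions, not Xen code: `throttled_copy` and all of its parameters are hypothetical, a short `time.sleep` stands in for "forcing a CPU rescheduling operation", and gzip stands in for whatever compression a real dump format would use.

```python
import gzip
import time


def throttled_copy(src, dst, block_size=1 << 20, blocks_per_pause=16,
                   pause_seconds=0.01, compress=False):
    """Copy src to dst block by block, pausing every blocks_per_pause blocks.

    Returns the number of blocks written.
    """
    # Choose a plain or gzip-compressed writer for the output file.
    opener = gzip.open if compress else open
    copied = 0
    with open(src, "rb") as fin, opener(dst, "wb") as fout:
        while True:
            block = fin.read(block_size)
            if not block:
                break
            fout.write(block)
            copied += 1
            if copied % blocks_per_pause == 0:
                # Yield the CPU so other work gets a chance to run.
                time.sleep(pause_seconds)
    return copied
```

A larger `pause_seconds` or a smaller `blocks_per_pause` trades dump time for responsiveness, which is the tradeoff described in point c.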
Akio Takebe
2007-Sep-13 13:44 UTC
Re: [Xen-devel] Re: [RFC][Patch] Improvemet the responce of xend.
Hi,

Thank you for your comments.

>A few comments:
>
>a. The patch basically spawn a new process to handle the additional "xm"
>inputs asynchronously. Correct?

Yes, but currently only at the time of dump-core.

> And since currently this is not possible,
>it also means that the possibilities of deadlocks or race conditions
>through concurrent access to xend via different xm users is not thoroughly
>verified. Correct? May be. So this is something to look out for.
>

I don't completely understand what you said. Basically, xend handles requests one at a time, so while xm is requesting a dump-core, xend works only on that. I think it is difficult to make xend multi-threaded. But because the dump-core work is very simple, xend should be able to let another process do it.

When I posted my patch, I made a mistake: I did not open a socket for xenstore. We are now making another patch with a different approach. When xend gets a dump-core request, it does fork() and exec(xc_dump) (like xc_save and xc_restore). If the child process wants to write something into xenstore, it can do so by using xs_daemon_open(), xs_write() and so on. (I think) Our concern is what we should do if another xm request destroys the domain being dumped.

>b. The performance of dump core is slow, mainly because external hard disk
> are slow. Therefore, there could be four different options/variations:
>minidump vs full-dump. For each there could be a compressed vs no-
>compression option - compressed is to get a smaller physical core.
>

Yes, I think so too, and I want those options. But I have not thought about the performance yet.

>c. Furthermore, there could be an additional throttling parameters to
>specify how much to slow the coredump operation, for example, by forcing a
>CPU re-scheduling operation (higher overheads in task switching - tradeoff
>for responsiveness) after every fix number of blocks. This will also have
>the effect of improving performance for additional domU's operation.

We may be able to do that with the nice command, if xend lets another process do the dump-core work.

Best Regards,
Akio Takebe
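The fork-and-exec approach combined with the nice command can be sketched as follows. This is a hedged illustration in modern Python, not the follow-up patch: `spawn_low_priority` is a hypothetical helper, it assumes a POSIX `nice` command on the PATH, and the `xc_dump` arguments shown in the usage comment are placeholders rather than a documented interface.

```python
import subprocess


def spawn_low_priority(argv, niceness=10, **popen_kwargs):
    """Run argv as a separate process under nice and return the Popen handle.

    The caller is not blocked; use proc.poll() to check on the child later.
    """
    # Prefix the command with nice so the helper runs at lower priority.
    command = ["nice", "-n", str(niceness)] + list(argv)
    return subprocess.Popen(command, **popen_kwargs)


# Hypothetical usage; the helper name and its arguments are assumptions:
#   proc = spawn_low_priority(["xc_dump", str(domid), corefile])
#   ...
#   if proc.poll() is not None:
#       dump_ok = (proc.returncode == 0)
```

Because the helper is a fresh process created with exec, it starts without the parent's open connections and has to open its own, which is the xenstore point made in the reply above.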