Many physical PCs have a hardware watchdog. This is a good way of getting up and running again if a bug causes a deadlock.

If you run Linux in a domU you have to use a software watchdog, which the kernel provides. But could there be scenarios where the domU locks up without the software watchdog going off? Interrupts, for example?

So my question: does Xen provide a "hardware" watchdog that the user domains can use? It might be safer for the counter and trigger to reside in the Xen domain than in the user domains.

Regards,
--
Andreas Bach Aaen, System Developer, M. Sc.
Ericsson Danmark A/S
Skanderborgvej 232, 8260 Viby J, Denmark
tel: +45 89 38 51 00, fax: +45 89 38 51 01
andreas.bach.aaen@ericsson.com

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
> Many physical PCs have a hardware watchdog. This is a good
> way of getting up and running again if a bug causes a deadlock.
> If you run Linux in a domU you have to use a software
> watchdog, which the kernel provides. But could there be
> scenarios where the domU locks up without the software
> watchdog going off? Interrupts, for example?
>
> So my question: does Xen provide a "hardware" watchdog that
> the user domains can use?
> It might be safer for the counter and trigger to reside in the
> Xen domain than in the user domains.

It doesn't today, but it could easily be added.

A better approach might be to do some more sophisticated higher-level liveness monitoring in domain0, then use the tools to reboot the domain.

Ian
> So my question: does Xen provide a "hardware" watchdog that the user domains
> can use?
> It might be safer for the counter and trigger to reside in the Xen domain
> than in the user domains.

A sensible and straightforward way to do this would be to wait for the XenStore code to be fully merged, then set up an attribute in the XenStore which the domU writes to periodically to say that it's alive. A daemon in dom0 can watch this attribute and restart the domain if it isn't updated for a while. The hypervisor won't need to know about this.

Cheers,
Mark
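A minimal sketch of the dom0 side of this idea, assuming the daemon already has some way to read each domU's heartbeat attribute from the store and some way to restart a domain (neither is shown here). The class name `HeartbeatMonitor` and its methods are hypothetical and only illustrate the timeout logic; they are not part of Xen or its tools.

```python
import time
from typing import Callable, Dict, List


class HeartbeatMonitor:
    """Tracks per-domain heartbeat values and reports domains that went quiet."""

    def __init__(self, timeout: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout = timeout                    # seconds allowed without a change
        self.clock = clock                        # injectable so the logic can be tested
        self._last_value: Dict[str, str] = {}     # last heartbeat value seen per domain
        self._last_change: Dict[str, float] = {}  # time that value last changed

    def observe(self, domain: str, value: str) -> None:
        """Record the heartbeat value just read for `domain`."""
        if self._last_value.get(domain) != value:
            # The value moved, so the guest was alive recently.
            self._last_value[domain] = value
            self._last_change[domain] = self.clock()

    def stale_domains(self) -> List[str]:
        """Return the domains whose heartbeat has not changed within `timeout`."""
        now = self.clock()
        return sorted(dom for dom, changed in self._last_change.items()
                      if now - changed > self.timeout)
```

The daemon would call `observe()` on every poll of the store and restart whatever `stale_domains()` returns. Comparing values rather than trusting a timestamp written by the guest keeps the timing decision entirely in dom0.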
On Tuesday 26 July 2005 22:13, Mark Williamson wrote:
> > So my question: does Xen provide a "hardware" watchdog that the user
> > domains can use?
> > It might be safer for the counter and trigger to reside in the Xen domain
> > than in the user domains.
>
> A sensible and straightforward way to do this would be to wait for the
> XenStore code to be fully merged, then set up an attribute in the XenStore
> which the domU writes to periodically to say that it's alive. A
> daemon in dom0 can watch this attribute and restart the domain if it isn't
> updated for a while. The hypervisor won't need to know about this.

This seems like a good idea. I expect that you have atomic writes in the XenStore, so the dom0 daemon simply increments a timer that the domU needs to reset once in a while.

This could be written into the watchdog userspace daemon, which would automatically detect that it is inside a virtual machine and not directly on real hardware. What is the cleanest way to detect whether you are running in a domU or not?

Regards,
Andreas Bach Aaen
> This seems like a good idea. I expect that you have atomic writes in the
> XenStore, so the dom0 daemon simply increments a timer that the domU needs
> to reset once in a while.

Yes, something like that should be fine, although I think the intention is to have only one writer to each portion of the store. How about the domU increments or toggles the value, and the dom0 daemon notices when it hasn't been updated in a while?

> This could be written into the watchdog userspace daemon, which would
> automatically detect that it is inside a virtual machine and not
> directly on real hardware. What is the cleanest way to detect whether you
> are running in a domU or not?

There's a flag in the Xen startinfo that tells a domain if it's dom0 or not (and if it's a driver domain, etc.).

The most straightforward thing to do is probably to write a kernel driver using the Linux watchdog API and then have that talk to the XenStore. This will allow you to use the standard daemon without any changes. The kernel API for watchdogs is fairly simple, so it should be straightforward once the XenStore / XenBus stuff is all up and running.

Cheers,
Mark
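For reference, a rough sketch of what a standard userspace watchdog daemon does on top of the Linux watchdog API: it keeps writing to the watchdog device, and writes the character `V` just before closing it (the "magic close") to ask the driver to disarm the timer. The function name `pet_watchdog` and its parameters are illustrative assumptions. A Xen-backed driver as described above does not exist in this thread; the point is only that such a driver would let this unchanged loop drive the heartbeat.

```python
import io
import time
from typing import BinaryIO, Callable


def pet_watchdog(dev: BinaryIO, beats: int, interval: float,
                 sleep: Callable[[float], None] = time.sleep) -> None:
    """Write `beats` keepalives to `dev`, then request a clean disarm."""
    for _ in range(beats):
        dev.write(b"\0")   # any write counts as a keepalive
        dev.flush()
        sleep(interval)
    # Magic close: drivers that support it disarm the timer on close
    # only if the last byte written was 'V'.
    dev.write(b"V")
    dev.flush()


if __name__ == "__main__":
    # Dry run against an in-memory buffer instead of a real device.
    fake = io.BytesIO()
    pet_watchdog(fake, beats=3, interval=0.0)
    print(fake.getvalue())
```

On a real system the device would be opened unbuffered, for example `open("/dev/watchdog", "wb", buffering=0)`. Note that simply opening the device arms the timer, so this should not be tried casually on a machine you care about.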
Mark Williamson wrote:
> > So my question: does Xen provide a "hardware" watchdog that the user domains
> > can use?
> > It might be safer for the counter and trigger to reside in the Xen domain
> > than in the user domains.
>
> A sensible and straightforward way to do this would be to wait for the
> XenStore code to be fully merged, then set up an attribute in the XenStore
> which the domU writes to periodically to say that it's alive. A daemon
> in dom0 can watch this attribute and restart the domain if it isn't updated
> for a while. The hypervisor won't need to know about this.

Forgive me if I'm misunderstanding something, but it seems to me that one would want to use a hardware watchdog on dom0, so that if the system should well and truly fail, an undeniable reset could be applied to the entire system. But if dom0 is healthy, then yes, a software watchdog in dom0 that pays attention to the various domUs and decides when to reset them should be sufficient.

---eric
> Forgive me if I'm misunderstanding something, but it seems to me that one
> would want to use a hardware watchdog on dom0, so that if the system should
> well and truly fail, an undeniable reset could be applied to the entire
> system. But if dom0 is healthy, then yes, a software watchdog in dom0
> that pays attention to the various domUs and decides when to reset them
> should be sufficient.

I think you'd ideally want both:

* Hardware watchdog in dom0 in case dom0 or Xen crashes
* Software watchdog for domU, provided by dom0 via the store (dom0 can be relied on to be up because of the hardware watchdog)

In the absence of a hardware watchdog, you could also implement a software watchdog for dom0 in Xen itself, which is likely to be the most reliable piece of code in the system and shouldn't lock up even if dom0 does.

Cheers,
Mark
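The layering described here can be sketched as a single supervision pass in dom0. This is only an illustration under stated assumptions: `supervise_once` and its callback parameters are hypothetical names, and how liveness is read from the store, how a domU is restarted, and how the hardware watchdog is serviced are all left to the caller.

```python
from typing import Callable, Iterable, List


def supervise_once(domains: Iterable[str],
                   is_alive: Callable[[str], bool],
                   restart: Callable[[str], None],
                   pet_hardware_watchdog: Callable[[], None]) -> List[str]:
    """Run one pass of the dom0 supervisor and return the domains restarted."""
    restarted: List[str] = []
    for dom in domains:
        if not is_alive(dom):
            # Software watchdog layer: dom0 acts on behalf of a stuck domU.
            restart(dom)
            restarted.append(dom)
    # Hardware watchdog layer: serviced only after the pass completes, so a
    # hang anywhere above lets the hardware timer expire and reset the host.
    pet_hardware_watchdog()
    return restarted
```

Servicing the hardware watchdog last is the design choice that ties the two layers together: the domU watchdog is only trusted as long as the loop that implements it keeps making progress.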
On Wednesday 27 July 2005 21:58, Mark Williamson wrote:
> I think you'd ideally want both:
> * Hardware watchdog in dom0 in case dom0 or Xen crashes
> * Software watchdog for domU, provided by dom0 via the store (dom0 can be
>   relied on to be up because of the hardware watchdog)
>
> In the absence of a hardware watchdog, you could also implement a software
> watchdog for dom0 in Xen itself, which is likely to be the most reliable
> piece of code in the system and shouldn't lock up even if dom0 does.

Correct, Mark. This is the solution that I would prefer.

Unfortunately I won't have time to program this. I hope that others do before I really need the feature. For now it's just on my wish list.

Regards,
Andreas Bach Aaen