After a couple of weeks of uptime, my domU router (routing between 3 NICs) crashed with the following (hand-copied, as it was locked up hard).

The dom0 was fine, and there were no problems with the other domUs.

Dom0 is running 2.6.11-ac1-xen0 and domU is running 2.6.11-ac1-xenU.
Both are basically 2.6.11-xen + Alan's ac patch to get additional network driver support.

-Tupshin

Unable to handle kernel NULL pointer dereference at virtual address 000000b4
printing eip:
c02ae5da
*pda = ma 00000000 pa 55555000
[<c02aec1e>] netif_int+0x2e/0x110

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
On 28 Apr 2005, at 03:27, Tupshin Harper wrote:

> After a couple weeks of uptime, my domU router (routing between 3 nics)
> crashed with the following (hand copied as it was locked up hard.
>
> The dom0 was fine, and no problems with the other domUs.
>
> Dom0 running 2.6.11-ac1-xen0 and domU running 2.6.11-ac1-xenU
> Both are basically 2.6.11-xen + alan's ac patch to get additional
> network driver support.

Do you have the vmlinux file for your domU (i.e., non-compressed non-stripped image)?

 -- Keir
Keir Fraser wrote:

> On 28 Apr 2005, at 03:27, Tupshin Harper wrote:
>
>> After a couple weeks of uptime, my domU router (routing between 3 nics)
>> crashed with the following (hand copied as it was locked up hard.
>>
>> The dom0 was fine, and no problems with the other domUs.
>>
>> Dom0 running 2.6.11-ac1-xen0 and domU running 2.6.11-ac1-xenU
>> Both are basically 2.6.11-xen + alan's ac patch to get additional
>> network driver support.
>
> Do you have the vmlinux file for your domU (i.e., non-compressed
> non-stripped image)?
>
> -- Keir

Yup...it's here:
http://download.vexi.org/vmlinux
or let me know if you want me to do anything with it on my end.

Thanks
-Tupshin
On 28 Apr 2005, at 03:27, Tupshin Harper wrote:

> After a couple weeks of uptime, my domU router (routing between 3 nics)
> crashed with the following (hand copied as it was locked up hard.
>
> The dom0 was fine, and no problems with the other domUs.
>
> Dom0 running 2.6.11-ac1-xen0 and domU running 2.6.11-ac1-xenU
> Both are basically 2.6.11-xen + alan's ac patch to get additional
> network driver support.

domU got a response from dom0 containing an unexpected request id. domU tried to look up the skbuff corresponding to the id, but read garbage (because the id was not currently in use). It crashed when it tried to dereference the garbage skbuff pointer.

This is rather nasty -- it's not clear whether dom0 or domU is at fault (dom0 may have forgotten about the id, or corrupted its own state, or the response from dom0 could be at fault). And it must be hard to trigger, as we have had no other reports of this. :-(

 -- Keir
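The failure mode described above can be illustrated with a small model. This is only a sketch and not the actual netfront code: the class `TxSlotTable`, its methods, and the layout (unused slots hold a free-list index, in-use slots hold the buffer) are assumptions made up for illustration. It shows how a lookup that checks whether the id is really in use would reject a bogus response instead of treating a free-list entry as a buffer pointer.

```python
class TxSlotTable:
    """Toy model of a request-id table (hypothetical, not the netfront structure)."""

    def __init__(self, size):
        # Unused slots hold the index of the next free slot (an int);
        # in-use slots hold the buffer object of the outstanding request.
        # Buffers are assumed never to be plain ints in this model.
        self.size = size
        self.slots = list(range(1, size + 1))
        self.free_head = 0

    def allocate(self, buf):
        """Take a free id and associate it with buf; return the id."""
        if self.free_head >= self.size:
            raise RuntimeError("no free request ids")
        req_id = self.free_head
        self.free_head = self.slots[req_id]
        self.slots[req_id] = buf
        return req_id

    def complete(self, req_id):
        """Return the buffer for a response id, or None if the id is bogus."""
        if not isinstance(req_id, int) or not 0 <= req_id < self.size:
            return None  # id outside the table
        entry = self.slots[req_id]
        if isinstance(entry, int):
            # Slot is on the free list: the response names an id that is
            # not in use. An unchecked lookup would hand back this value
            # as if it were a buffer.
            return None
        # Valid completion: put the slot back on the free list.
        self.slots[req_id] = self.free_head
        self.free_head = req_id
        return entry
```

In this model a duplicate or stale response (`complete` called twice with the same id) returns `None` the second time, which is the point where debug code could log the offending id rather than crash.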
Keir Fraser wrote:

> domU got a response from dom0 containing an unexpected request id.
> domU tried to lookup the skbuff corresponding to the id, but read
> garbage (because the id was not currently in use). It crashed when it
> tried to dereference the garbage skbuff pointer.
>
> This is rather nasty -- it's not clear whether dom0 or domU is at
> fault (dom0 may have forgotten about the id, or corrupted its own
> state, or the response from dom0 could be at fault). And it must be
> hard to trigger as we have had no other reports of this. :-(
>
> -- Keir

Well, thanks for the analysis :)
And yes, it must be hard to trigger, because I've had multiple weeks of uptime running 8+ domUs with no problems. It might be interesting that the domU that crashed is the only one doing routing, and it was also under a moderately high traffic load at the time it crashed.

If you have any inclination to add some debug code to try to look for and analyze this problem, I'd be happy to run custom builds for either my dom0 or domU.

Thanks.

BTW, is there any recommended way of detecting a crashed domU and restarting it?

-Tupshin
On 28 Apr 2005, at 10:33, Tupshin Harper wrote:

> BTW, is there any recommended way of detecting a crashed domU and
> restarting it?

Ideally a control daemon would automatically detect and reboot for you, or execute a script of your choice. Right now your best bet is to periodically check for a heartbeat (e.g., network ping) and forcibly destroy+create if you see a problem.

 -- Keir
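The heartbeat approach suggested above could look roughly like the following, run from dom0. This is a minimal sketch under several assumptions: the function names, the failure threshold, and the polling interval are made up for illustration; `ping -c 1 -W 2` assumes Linux iputils ping; and it assumes `xm destroy` accepts the domain name and `xm create` accepts the path to the domain's config file, which should be checked against the installed Xen tools.

```python
import subprocess
import time


def is_alive(host, run=subprocess.call):
    """Return True if a single ping to host succeeds (exit status 0)."""
    # -c 1: send one echo request; -W 2: wait up to 2 seconds for a reply.
    return run(["ping", "-c", "1", "-W", "2", host]) == 0


def restart_domain(name, config, run=subprocess.call):
    """Forcibly destroy the domain, then create it again from its config."""
    run(["xm", "destroy", name])  # may fail if the domain is already gone
    return run(["xm", "create", config]) == 0


def watch(name, host, config, max_failures=3, interval=30,
          run=subprocess.call, sleep=time.sleep, rounds=None):
    """Ping host every `interval` seconds and restart the domain after
    `max_failures` consecutive missed heartbeats. Returns the number of
    restarts performed (only reachable when `rounds` is set)."""
    failures = 0
    restarts = 0
    done = 0
    while rounds is None or done < rounds:
        done += 1
        if is_alive(host, run):
            failures = 0
        else:
            failures += 1
            if failures >= max_failures:
                restart_domain(name, config, run)
                restarts += 1
                failures = 0
        sleep(interval)
    return restarts
```

A call such as `watch("router", "10.0.0.1", "/etc/xen/router")` (all three values hypothetical) would loop indefinitely. Requiring several consecutive misses before restarting is a design choice to avoid destroying a domU over a single dropped ping.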