i'm having random crashes under load (diff & pings). No messages in the logs or anything, just reboots or hangs. When I went to look at the screen after a hang to see if there was an oops or anything like it, the screen was blanked and wouldn't wake up - anyone know how to stop that behaviour?

James
> i'm having random crashes under load (diff & pings). No messages in the logs
> or anything, just reboots or hangs. When I went to look at the screen after
> a hang to see if there was an oops or anything like it, the screen was blanked
> and wouldn't wake up - anyone know how to stop that behaviour?

Not a lot to go on!

It'd be useful to know what Xen version you're using (bk changes | head), and what configuration you're running when you get the above behaviour (e.g. how many domains running, h/w spec, etc). It would also be useful to know precisely the steps you take to cause the problem, so we can see if we can reproduce it.

If you can attach a serial line to the machine (and perhaps add "noreboot" to the xen command line) you might be able to get some output...

cheers,

S.
> i'm having random crashes under load (diff & pings). No messages in the logs or anything, just reboots or hangs. When I went to look at the screen after a hang to see if there was an oops or anything like it, the screen was blanked and wouldn't wake up - anyone know how to stop that behaviour?
>
> James

Is this the scenario of pinging another domain on the same machine while diffing a pair of large files, that we were using to find the blkdev mismerge bug? If so, did this problem seem to go away for a short while after the mismerge was fixed?

 -- Keir

-------------------------------------------------------
This SF.Net email is sponsored by BEA Weblogic Workshop FREE Java Enterprise J2EE developer tools! Get your free copy of BEA WebLogic Workshop 8.1 today.
http://ads.osdn.com/?ad_id=4721&alloc_id=10040&op=click
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xen-devel
[Resending with top post as my mailer successfully hid my last response]

Is this the same 'ping other domain while diffing large files' test that we were using to trigger the blkdev mismerge bug? If so, did everything seem to work okay for a short while after that bug was fixed, and has it become broken again now?

If this is a bug that triggers after DOM0's memory map is mixed around (no longer contiguous) then it may be easier to find after I add code next week that will reverse DOM0's address space if you make a debug build of Xen. It's already found one bug (in pci_alloc_consistent()), which I'll fix on Monday; I'll check the debug code in at the same time.

 -- Keir

> i'm having random crashes under load (diff & pings). No messages in the logs or anything, just reboots or hangs. When I went to look at the screen after a hang to see if there was an oops or anything like it, the screen was blanked and wouldn't wake up - anyone know how to stop that behaviour?
Is it an SMP machine? I checked in some changes to the locking scheme, which could cause such behaviour. If it is not an SMP machine then locks have nothing to do with it.

Cheers
Gregor

> i'm having random crashes under load (diff & pings). No messages in the
> logs or anything, just reboots or hangs. When I went to look at the
> screen after a hang to see if there was an oops or anything like it, the
> screen was blanked and wouldn't wake up - anyone know how to stop that
> behaviour?
yes (it's the same test) and yes (it appeared to be fixed but is broken now).

James

> From: Keir Fraser
> Sent: Thu 29/07/2004 7:29 PM
> To: James Harper
> Cc: xen-devel@lists.sourceforge.net
> Subject: Re: [Xen-devel] crashes
>
> Is this the same 'ping other domain while diffing large files' test that we were using to trigger the blkdev mismerge bug? If so, did everything seem to work okay for a short while after that bug was fixed, but has become broken again now?
>
> If this is a bug that triggers after DOM0's memory map is mixed around (no longer contiguous) then it may be easier to find after I add code next week that will reverse DOM0's address space if you make a debug build of Xen. It's already found one bug (in pci_alloc_consistent()) that I'll fix on Monday, and check the debug code in at the same time.
>
>  -- Keir
yes it is smp. i should have thought to test it. i'm booting now with nosmp and will follow up with some results shortly if it crashes, or tomorrow if it doesn't.

james

> From: G. Milos
> Sent: Thu 29/07/2004 9:07 PM
> To: James Harper
> Cc: xen-devel@lists.sourceforge.net
> Subject: Re: [Xen-devel] crashes
>
> Is it an SMP machine? I checked in some changes to the locking scheme. Which could cause such behaviour. If it is not an SMP then locks have nothing to do with it.
>
> Cheers
> Gregor
> i'm having random crashes under load (diff & pings). No
> messages in the logs or anything, just reboots or hangs.

Are you using SCSI or IDE? What device driver?

Please can you try backing out the cset labelled 'a better fix for blkdev request merging' i.e.:

  cset -x41051ec1NERNxLF017rAWe7ljBk92w

> When I went to look at the screen after a hang to see if there
> was an oops or anything like it, the screen was blanked and
> wouldn't wake up - anyone know how to stop that behaviour?

That's Linux behaviour -- not a lot we can do about it. Collecting crash messages is much easier if you have an attached serial line...

Ian
i'm using scsi, aic7xxx to be precise. there are a couple of sym-something cards too - i could rewire and use those if it would help, but not today.

i'm just testing now with the 3rd network card module (natsemi) not loaded, i suspect it will still crash but just wanted to rule it out. i'll back out that cset as soon as i have an answer on this.

James

> From: Ian Pratt
> Sent: Fri 30/07/2004 10:39 AM
> To: James Harper
> Cc: xen-devel@lists.sourceforge.net; Ian.Pratt@cl.cam.ac.uk
> Subject: Re: [Xen-devel] crashes
>
> Are you using SCSI or IDE? What device driver?
>
> Please can you try backing out the cset labelled 'a better fix for blkdev request merging' i.e.:
>
>   cset -x41051ec1NERNxLF017rAWe7ljBk92w
On Fri, Jul 30, 2004 at 01:39:26AM +0100, Ian Pratt wrote:
> > When I went to look at the screen after a hang to see if there
> > was an oops or anything like it, the screen was blanked and
> > wouldn't wake up - anyone know how to stop that behaviour?
>
> That's Linux behaviour -- not a lot we can do about
> it. Collecting crash messages is much easier if you have an
> attached serial line...

man 1 setterm. The command you want is "setterm -blank 0", which should probably go in an initscript somewhere. I think some distributions have an /etc/ file with a switch for you to toggle, but not the ones I'm using right now...

-andy
it's just occurred to me that i don't know how to back out anything in bk. the extent of my knowledge of bk begins and ends with 'bk pull'.

i tried bk cset -x... but it asked me some questions I don't have an answer for. Can you give me a hint?

Also, if i've tinkered with my local repository at any point, how do I completely resync it?

thanks

James

> From: Ian Pratt
> Sent: Fri 30/07/2004 10:39 AM
> To: James Harper
> Cc: xen-devel@lists.sourceforge.net; Ian.Pratt@cl.cam.ac.uk
> Subject: Re: [Xen-devel] crashes
>
> Please can you try backing out the cset labelled 'a better fix for blkdev request merging' i.e.:
>
>   cset -x41051ec1NERNxLF017rAWe7ljBk92w
cool, thanks. It looks like i can just fiddle with values in /etc/console-tools/config under debian, now that I know what i'm looking for. hopefully next time I crash it I can see what's what.

I'll probably have to hook up a serial console eventually but that will involve sitting there with my laptop which I don't have time to do presently.

thanks again.

James

> From: Andy Isaacson
> Sent: Fri 30/07/2004 1:32 PM
> To: Ian Pratt
> Cc: James Harper; xen-devel@lists.sourceforge.net
> Subject: Re: [Xen-devel] crashes
>
> man 1 setterm. The command you want is "setterm -blank 0", which should probably go in an initscript somewhere. I think some distributions have an /etc/ file with a switch for you to toggle, but not the ones I'm using right now...
>
> -andy
> it's just occurred to me that i don't know how to back out anything in bk. the extent of my knowledge of bk begins and ends with 'bk pull'.
>
> i tried bk cset -x... but it asked me some questions I don't have an answer for. Can you give me a hint?

Strange -- I just tried this and it didn't ask me any questions:

  iap10 > bk cset -x41051ec1NERNxLF017rAWe7ljBk92w
  ChangeSet 1.1161 exc: 1.1128.2.1 -> 1.1162
  linux-2.4.26-xen-sparse/include/linux/blkdev.h 1.6 exc: 1.1,1.0 -> 1.7: 0 lines
  linux-2.4.26-xen-sparse/include/asm-xen/pci.h 1.9 exc: 1.4 -> 1.10: 302 lines
  ChangeSet revision 1.1162: +2 -0 = 7633
  Sending ChangeSet log ...

> Also, if i've tinkered with my local repository at any point, how do I completely resync it?

There's probably some clever way of doing it, but I'd just re-clone it with 'bk clone bk://xen.bkbits.net/xeno-unstable.bk'.

Ian
> it's just occurred to me that i don't know how to back out anything in
> bk. the extent of my knowledge of bk begins and ends with 'bk pull'.
>
> i tried bk cset -x... but it asked me some questions I don't have an
> answer for. Can you give me a hint?

There is 'bk undo', which removes the most recent changesets. More help is available by using 'bk help undo'. I am also attaching the reference card.

Cheers
Gregor
Well now, without making any changes I can't get it to break anymore. dammit.

I have done a recompile though, so maybe I had something out of whack.

it's been testing for 6 hours now with no errors. I'll leave it overnight and see what happens.

James

> From: Ian Pratt
> Sent: Fri 30/07/2004 6:11 PM
> To: James Harper
> Cc: Ian Pratt; xen-devel@lists.sourceforge.net; Ian.Pratt@cl.cam.ac.uk
> Subject: Re: [Xen-devel] crashes
>
> > i tried bk cset -x... but it asked me some questions I don't have an answer for. Can you give me a hint?
>
> Strange -- I just tried this and it didn't ask me any questions:
>
>   iap10 > bk cset -x41051ec1NERNxLF017rAWe7ljBk92w
>   ChangeSet 1.1161 exc: 1.1128.2.1 -> 1.1162
>   linux-2.4.26-xen-sparse/include/linux/blkdev.h 1.6 exc: 1.1,1.0 -> 1.7: 0 lines
>   linux-2.4.26-xen-sparse/include/asm-xen/pci.h 1.9 exc: 1.4 -> 1.10: 302 lines
>   ChangeSet revision 1.1162: +2 -0 = 7633
>   Sending ChangeSet log ...
>
> > Also, if i've tinkered with my local repository at any point, how do I completely resync it?
>
> There's probably some clever way of doing it, but I'd just re clone it with 'bk clone bk://xen.bkbits.net/xeno-unstable.bk'.
>
> Ian
It looked like everything was okay except for these messages in DOM0:

  (file=main.c, line=270) Failed MMU update transferring to DOM1
  (file=main.c, line=270) Failed MMU update transferring to DOM1

but then I tried to start another domain and got this:

  Using config file /etc/xen/mail2
  Error: Internal Server Error

so it looks like something is still wrong somewhere...

James

> From: James Harper
> Sent: Fri 30/07/2004 9:10 PM
> To: Ian Pratt
> Cc: Ian Pratt; xen-devel@lists.sourceforge.net
> Subject: RE: [Xen-devel] crashes
>
> Well now, without making any changes I can't get it to break anymore. dammit.
>
> I have done a recompile though, so maybe I had something out of whack.
>
> it's been testing for 6 hours now with no errors. I'll leave it overnight and see what happens.
>
> James
> It looked like everything was okay except for these messages in DOM0:
>
> (file=main.c, line=270) Failed MMU update transferring to DOM1
> (file=main.c, line=270) Failed MMU update transferring to DOM1
>
> but then I tried to start another domain and got this:
>
> Using config file /etc/xen/mail2
> Error: Internal Server Error
>
> so it looks like something is still wrong somewhere...

Is this with or without the 'better blk dev fix' changeset backed out?

The "Failed MMU update" messages are very interesting, and I've never seen them before -- xen is refusing to transfer the page to dom1 for some reason. Please can you try doing the same with a debug=y build of Xen. Xen should tell us a bit more about why it's refusing the request.

What workload are you running when this happens? You seem to have a real talent for provoking hard to reproduce bugs ;-)

It might be moderately interesting to see the traceback from xend to see which stage of creating the new domain failed.

Further, once it gets into this state, it would be good to try shutting down or destroying the other domains one by one, doing an 'xm list' after each stage. If one of the domains hangs around after a 'destroy' it's a sign there's been an inconsistency.

Ian
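The destroy-then-list check suggested in the message above could be scripted roughly as below. This is only a sketch under stated assumptions: the helper names `listed_domains` and `find_lingering` are hypothetical, and the parser assumes `xm list` prints one header line followed by one row per domain with the domain name in the first column, which may not match every xm version. The command runner is passed in as a parameter so that nothing here depends on a particular way of invoking xm.

```python
from typing import Callable, List, Sequence


def listed_domains(xm_list_output: str) -> List[str]:
    """Return the domain names found in `xm list` output.

    Assumption: one header line, then one row per domain with the
    name in the first whitespace-separated column.
    """
    rows = [line.split() for line in xm_list_output.splitlines()[1:]]
    return [cols[0] for cols in rows if cols]


def find_lingering(domains: Sequence[str],
                   run: Callable[[List[str]], str]) -> List[str]:
    """Destroy each domain in turn; report any that are still listed.

    `run` takes an argv list and returns the command's stdout as text.
    """
    lingering = []
    for name in domains:
        run(["xm", "destroy", name])
        # A domain that is still listed after 'destroy' hints at an
        # inconsistency, as described in the message above.
        if name in listed_domains(run(["xm", "list"])):
            lingering.append(name)
    return lingering
```

On a real dom0, `run` might be something like `lambda argv: subprocess.run(argv, capture_output=True, text=True).stdout`, and it may be worth pausing briefly between the destroy and the list, since teardown is not necessarily instantaneous.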
Hi,

looking at the 2004-xen-ols-pdf paper's Xen 2.0 Architecture fig vs. the 1.2 fig, what are the differences between the device drivers? It seems like the drivers for 1.2 are ported to function in the paravirtual environment, but what is the deal with the 2.0 Front-end and Vanilla device drivers?

Fig 2.0 also has three red arrows, one from the 2.4 kernel to the HW layer and two from domain 0. What does this mean in contrast to fig. 1.2?

cheers,

Rune J.Andresen
Last night I tried a bk pull and noticed some errors which meant nothing was applying, so I was missing a few changesets. I cloned a brand new tree and built a new set of images, and now i'm back to having it spontaneously reboot with no error messages. d'oh.

james

> From: Ian Pratt
> Sent: Sat 31/07/2004 9:48 PM
> To: James Harper
> Cc: Ian Pratt; xen-devel@lists.sourceforge.net; Ian.Pratt@cl.cam.ac.uk
> Subject: Re: [Xen-devel] crashes
>
> > so it looks like something is still wrong somewhere...
>
> Is this with or without the 'better blk dev fix' changeset backed out?
>
> The "Failed MMU update" messages are very interesting, and I've never seen them before -- xen is refusing to transfer the page to dom1 for some reason. Please can you try doing the same with a debug=y build of Xen. Xen should tell us a bit more about why it's refusing the request.
>
> What workload are you running when this happens? You seem to have a real talent for provoking hard to reproduce bugs ;-)
>
> It might be moderately interesting to see the traceback from xend to see which stage of creating the new domain failed.
>
> Further, once it gets in to this state, it would be good to try shuting down or destroying the other domains one by one, doing an 'xm list' after each stage. If one of the domains hangs around after a 'destroy' it's a sign there's been an inconsistency.
>
> Ian
> Hi, looking at the 2004-xen-ols-pdf papers's xen 2.0 Architecture fig
> vs. the 1.2 fig, what is the differences between the device drivers?
> It seems like the drivers for 1.2 is ported to function in the
> paravirtual environment, but what is the deal with the 2.0 Front-end
> and Vanilla device drivers?
>
> Fig 2.0 also has three red arrows, one from the 2.4 kernel to the HW
> layer and two from the domain 0. What does this mean in contrast to
> fig. 1.2 ?

In 1.2, the "real" device drivers were part of Xen. Device drivers in guest OSes were 'virtual' device drivers that actually just talked to Xen to get anything done.

In 2.0, the "real" device drivers run in one or more privileged guest OSes. Xen only deals with the timer (APIC) hardware, the low-level parts of interrupt dispatch, and some parts of the device probing functionality. Essentially, Linux device drivers (or BSD ones for that matter) can run unmodified in a guest OS, which, among other things, gives us a lot more device support.

Many devices (especially network and disk) are shared between guest OSes but there is only ever one "real" device driver. To make this sharing work, a privileged guest also includes a "back-end" driver for every real hardware device. All unprivileged guests wishing to share the device include a "front-end" driver. Both of these "drivers" are actually virtual; they do not directly talk to the hardware. Instead they are connected together using a device channel -- essentially a general means of communication between different virtual machines.

So if e.g. an unprivileged dom1 wants to share a real e1000 network card, the setup might be:

  dom1 front-end driver -> dom0 back-end driver -> dom0 e1000 driver

The last driver is a "vanilla" Linux device driver, meaning that its source code is identical to that in the regular Linux kernel. The front-end and back-end drivers are new.
The lines in the figure connecting domains together represent the device channels that allow domains to talk to one another. The lines in the figure going from a domain 'through' xen represent the way in which a suitably privileged domain can be given direct access to [a subset of] the real hardware.

You may find that our "Reconstructing I/O" paper gives a clearer description of all of this.

cheers,

S.
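The front-end -> back-end -> vanilla driver path described in the message above can be pictured with a toy model. This is purely illustrative and is not Xen code: the class names (`DeviceChannel`, `FrontEnd`, `BackEnd`, `VanillaDriver`) are hypothetical, and the "device channel" here is just an in-memory queue standing in for the inter-domain communication mechanism.

```python
from collections import deque
from typing import Deque, List


class DeviceChannel:
    """Toy stand-in for a device channel: a FIFO shared by two domains."""

    def __init__(self) -> None:
        self._queue: Deque[bytes] = deque()

    def send(self, packet: bytes) -> None:
        self._queue.append(packet)

    def receive_all(self) -> List[bytes]:
        packets = list(self._queue)
        self._queue.clear()
        return packets


class VanillaDriver:
    """Stands in for an unmodified driver (e.g. e1000) running in dom0."""

    def __init__(self) -> None:
        self.wire: List[bytes] = []  # what reached the "hardware"

    def transmit(self, packet: bytes) -> None:
        self.wire.append(packet)


class FrontEnd:
    """Virtual driver in an unprivileged domain; never touches hardware."""

    def __init__(self, channel: DeviceChannel) -> None:
        self._channel = channel

    def transmit(self, packet: bytes) -> None:
        self._channel.send(packet)


class BackEnd:
    """Virtual driver in the privileged domain; hands requests to the real driver."""

    def __init__(self, channel: DeviceChannel, driver: VanillaDriver) -> None:
        self._channel = channel
        self._driver = driver

    def poll(self) -> int:
        packets = self._channel.receive_all()
        for packet in packets:
            self._driver.transmit(packet)
        return len(packets)
```

In this model, a dom1 `FrontEnd.transmit()` only queues the packet; nothing reaches the `VanillaDriver` until the dom0 `BackEnd.poll()` forwards it, which mirrors the point that only the vanilla driver talks to the device.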
it's definitely rebooting fairly shortly after i start testing it with disk and network activity. I've added noreboot and will inspect the console when it crashes next.

james

> From: James Harper
> Sent: Mon 2/08/2004 9:37 AM
> To: Ian Pratt
> Cc: Ian Pratt; xen-devel@lists.sourceforge.net
> Subject: RE: [Xen-devel] crashes
>
> Last night I tried a bk pull and noticed some errors which meant nothing was applying, so I was missing a few changesets. I cloned a brand new tree and built a new set of images, and now i'm back to having it spontaneously reboot with no error messages. d'oh.
>
> james
On Jul 31, 2004, at 7:48 AM, Ian Pratt wrote:
>
> The "Failed MMU update" messages are very interesting, and I've
> never seen them before -- xen is refusing to transfer the page to
> dom1 for some reason. Please can you try doing the same with a
> debug=y build of Xen. Xen should tell us a bit more about why
> it's refusing the request.

I've gotten several "Failed MMU update" messages, with debug=y compiled Xen kernels and all debugging options enabled in the kernel. Unfortunately, there wasn't a whole lot else output in any of the cases where I would see "Failed MMU update" and then a crash.

I haven't had a chance to really touch Xen for a little over a week now, and I'd seen it for at least a week or so before that, so I'm reasonably certain none of the recent block changes were in any of the code that output the MMU errors.

I can't find the specific messages now (I'm at work, and in a meeting too. oh, but I'm paying attention to the meeting ... ;) but one of the last couple of posts I've made to the list included a ton of the debug output from Xen during the MMU crash. The only unusual thing I saw was some sort of "protection fault" or something on the Xen console a few minutes before the MMU error/crash popped up.

Also, almost always, the "failed MMU update" happens at the very end of a crash, just before the system reboots. i.e. the system becomes unstable and I reboot or it crashes and reboots itself, and the last message on the console before the reboot is usually the "failed MMU update" message.
I get nothing on the console, just a complete hang with 'noreboot' specified. It actually hangs now when idle.

james