Hello,

I get a similar BUG in both today's -testing and -unstable trees (below is the -testing dump):

(XEN) Scrubbing Free RAM: ........................................done.
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen).
(XEN) BUG at domain.c:144
(XEN) CPU: 0
(XEN) EIP: 0808:[<fc50554e>]
(XEN) EFLAGS: 00210296
(XEN) eax: 00000000 ebx: fcff4d20 ecx: 00000000 edx: 00000000
(XEN) esi: 00000000 edi: 000003ef ebp: 00000000 esp: fc503f04
(XEN) ds: 0810 es: 0810 fs: 0810 gs: 0810 ss: 0810
(XEN) Stack trace from ESP=fc503f04:
(XEN) fc52a9c0 fc52aa16 00000090 00000bac 00000000 00000000 00000000 [fc51d609]
(XEN) ffbf1000 fc503f70 00000004 fcff4d94 fcff4d20 fcff4d20 00000454 c001dbac
(XEN) fcff4d94 00000000 00000004 fcff4d20 00000000 000002eb ffbf1000 00000000
(XEN) fef00074 00000000 f7ef9063 0141d061 feffbfbc 0141d066 0141d061 fcff4d20
(XEN) fbffc001 fc503fb8 00000000 [fc52511e] 00000000 fbffa000 00000007 fcff4d20
(XEN) fcff4d20 f7efa000 00000113 [fc5290fe] fc503fb8 00000001 fbeeb000 fbeeb000
(XEN) f7efa000 00000113 00000000 fbffc000 000e0000 c0113fb2 00000819 00210286
(XEN) c03fbee4 00000821 00000821 00000821 00000000 00000000 fcff4d20
(XEN) Call Trace from ESP=fc503f04: [<fc51d609>] [<fc52511e>] [<fc5290fe>]

****************************************
CPU0 FATAL TRAP: vector = 6 (invalid operand)
[error_code=0000]
Aieee! CPU0 is toast...
****************************************

-Chris

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
2.0-testing seems to work fine for me.

Are you sure you're loading the right images? Are you building them yourself or using our precompiled ones?

Ian
> 2.0-testing seems to work fine for me.
> Are you sure you're loading the right images?

Yup. Here's the grub entry I've been using:

title xen-unstable
kernel /xen.gz dom0_mem=131072 com1=115200,8n1
module /vmlinuz-2.6.11-xen0 root=/dev/ram0 lvm2root=/dev/vg1/root elevator=cfq ramdisk_size=10240 console=ttyS0
module /initrd-lvm2.gz

[root@host27 xen-unstable]# diff xen/xen.gz /boot/xen.gz
[root@host27 xen-unstable]# diff linux-2.6.11-xen0/vmlinuz /boot/vmlinuz-2.6.11-xen0
[root@host27 xen-unstable]#

> Are you building them yourself or using our precompiled ones?

Building them myself, gcc version 3.4.3 20041212 (Red Hat 3.4.3-9.EL4)

-Chris
> > 2.0-testing seems to work fine for me.
> > Are you sure you're loading the right images?
>
> Yup. Here's the grub entry I've been using:
>
> title xen-unstable
> kernel /xen.gz dom0_mem=131072 com1=115200,8n1
> module /vmlinuz-2.6.11-xen0 root=/dev/ram0 lvm2root=/dev/vg1/root elevator=cfq ramdisk_size=10240 console=ttyS0
> module /initrd-lvm2.gz
>
> [root@host27 xen-unstable]# diff xen/xen.gz /boot/xen.gz
> [root@host27 xen-unstable]# diff linux-2.6.11-xen0/vmlinuz /boot/vmlinuz-2.6.11-xen0
> [root@host27 xen-unstable]#
>
> > Are you building them yourself or using our precompiled ones?
>
> Building them myself, gcc version 3.4.3 20041212 (Red Hat 3.4.3-9.EL4)

How recently had you had it working?

If in the last few days, please can you check that an earlier src snapshot worked OK? (e.g. X-1.tgz or X-2.tgz).

There was a big merge yesterday, but I'd be surprised if this was causing a problem.

Ian
> How recently had you had it working?

Unfortunately, I've only messed with stable + a few non-Xen patches (SKAS for instance).

> If in the last few days, please can you check that an earlier src
> snapshot worked OK?
> (e.g. X-1.tgz or X-2.tgz).

I just gave the xen-2.0-testing.2 tarball a try (dated 2005-04-04) with the same result. Default config, no other changes. I couldn't get ksymoops to provide any useful output even after stripping (XEN) and tooling around with its options.

Here's the full boot output:

http://www.theshore.net/~caker/xen/BUGdomain-dmesg.txt

Is there an easy way to grab snapshots before this date? Is there one you recommend I try next?

Thanks,
-Chris
On 6 Apr 2005, at 08:59, Christopher S. Aker wrote:

> Here's the full boot output:
>
> http://www.theshore.net/~caker/xen/BUGdomain-dmesg.txt
>
> Is there an easy way to grab snapshots before this date? Is there one
> you recommend I try next?

It may be easier to pull the BK repository from xen.bkbits.net and build it yourself. It is then very easy to undo changesets after a certain date and so travel backwards until you find a working repository.

When did it last work for you? There haven't been very many changes to 2.0-testing recently...

-- Keir
A few updates. -testing and -unstable boot fine on another machine. On the one that exhibits the problem, I tried the binary -testing install with the same result. I also (manually) downloaded diffs and went back to around changeset 1.811 or so, with the same results.

Can someone throw me some bk commands to speed up my process a little? I'm looking to either automatically restore my bk tree to a previous changeset (and removing all changesets thereafter), or a way for bk to remove a changeset at a time, and I'll just script it.

Thanks,
-Chris
Take a look at the bitkeeper reference card in the attachment.

- Bin

On Apr 7, 2005 7:53 PM, Christopher S. Aker <caker@theshore.net> wrote:
> A few updates. -testing and -unstable boot fine on another machine. On the one that
> exhibits the problem, I tried the binary -testing install with the same result. I
> also (manually) downloaded diffs and went back to around changeset 1.811 or so, with
> the same results.
>
> Can someone throw me some bk commands to speed up my process a little? I'm looking
> to either automatically restore my bk tree to a previous changeset (and removing all
> changesets thereafter), or a way for bk to remove a changeset at a time, and I'll
> just script it.
>
> Thanks,
> -Chris
On 7 Apr 2005, at 19:53, Christopher S. Aker wrote:

> A few updates. -testing and -unstable boot fine on another machine.
> On the one that exhibits the problem, I tried the binary -testing
> install with the same result. I also (manually) downloaded diffs and
> went back to around changeset 1.811 or so, with the same results.
>
> Can someone throw me some bk commands to speed up my process a little?
> I'm looking to either automatically restore my bk tree to a previous
> changeset (and removing all changesets thereafter), or a way for bk to
> remove a changeset at a time, and I'll just script it.

What type of cpu does the non-working machine have? My guess would be that it is exercising some start-of-day code path in Linux that is doing something that is invalid on Xen.

-- Keir
> What type of cpu does the non-working machine have? My guess would be
> that it is exercising some start-of-day code path in Linux that is
> doing something that is invalid on Xen.

2.66GHz Xeons. It's a pretty standard, well-supported box (SuperMicro 6013P-i). Xen-.0.5 works fine on this machine.

-Chris
On Thu, 7 Apr 2005, Christopher S. Aker wrote:

> Can someone throw me some bk commands to speed up my process a little? I'm looking
> to either automatically restore my bk tree to a previous changeset (and removing all
> changesets thereafter), or a way for bk to remove a changeset at a time, and I'll
> just script it.

How does the recent bk/linux issue affect xen?
> Take a look at the bitkeeper reference card in the attachment.

This worked great, thanks!

bk clone -r1.1775.1.7 bk://xen.bkbits.net/xen-2.0-testing.bk

Time to test.

-Chris
* Christopher S. Aker (caker@theshore.net) wrote:
> Can someone throw me some bk commands to speed up my process a little? I'm looking
> to either automatically restore my bk tree to a previous changeset (and removing all
> changesets thereafter), or a way for bk to remove a changeset at a time, and I'll
> just script it.

bk undo -a $cset does that just fine. For more info, see bk help undo.

thanks,
-chris
Under the -testing tree:

(retrieved with bk clone -r<cset> bk://xen.bkbits.net/xen-2.0-testing.bk)

1.1774 - boots correctly (creates a 2.6.10-xen0)
1.1784 - refuses to boot (creates a 2.6.11-xen0)

Anything past 1.1784 that I've tried also BUGs. It seems like any 2.6.11-dom0 kernel fails on this machine. Incidentally, I am able to run 2.6.11-domUs using the -stable 2.6.10-dom0.

All of the changesets between 1.1774 and 1.1784 are the 2.6.11 merge. I haven't bothered to try changesets in between those, since they look to me like they all need each other. If anyone wants me to try those changesets, let me know.

How can I get more information from the stack trace that Xen puts out to further narrow this down?

Thanks,
-Chris
Try 'objdump -D xen-syms | less' and navigate to the addresses printed out by the Xen stack trace. Make sure that 'xen-syms' is in sync with 'xen.gz'.

- Bin

On Apr 8, 2005 12:19 AM, Christopher S. Aker <caker@theshore.net> wrote:
> Under the -testing tree:
>
> (retrieved with bk clone -r<cset> bk://xen.bkbits.net/xen-2.0-testing.bk)
>
> 1.1774 - boots correctly (creates a 2.6.10-xen0)
> 1.1784 - refuses to boot (creates a 2.6.11-xen0)
>
> Anything past 1.1784 that I've tried also BUGs. It seems like any 2.6.11-dom0 kernel
> fails on this machine. Incidentally, I am able to run 2.6.11-domUs using the -stable
> 2.6.10-dom0.
>
> All of the changesets between 1.1774 and 1.1784 are the 2.6.11 merge. I haven't
> bothered to try changesets in between those, since they look to me like they all need
> each other. If anyone wants me to try those changesets, let me know.
>
> How can I get more information from the stack trace that Xen puts out to further
> narrow this down?
>
> Thanks,
> -Chris
> All of the changesets between 1.1774 and 1.1784 are the 2.6.11 merge. I haven't
> bothered to try changesets in between those, since they look to me like they all need
> each other. If anyone wants me to try those changesets, let me know.

I decided to give them a go. Here are the results:

1.1784 - BUGs
1.1779 - BUGs
1.1777 - BUGs
1.1774 - boots correctly

There are only four changesets between 1.1774 and 1.1777, and of course one of them is huge -- the 2.6.11 merge. Also, the objdump didn't tell me anything useful. I have the objdump and the stack trace if anyone wants them. I think I'm at the end of my abilities to narrow it down any further.

Thanks,
-Chris
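The narrowing-down described in this message is a manual bisection over changesets. A minimal Python sketch of that search follows, assuming the changesets can be put in a single oldest-to-newest list and that every changeset after the first bad one is also bad. The names first_bad and is_bad are hypothetical; in practice is_bad would mean building and booting that changeset by hand. The dictionary holds only the four outcomes reported above.

```python
from typing import Callable, Sequence


def first_bad(changesets: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest changeset for which is_bad() is true.

    Assumes the list is ordered oldest to newest, the first entry is known
    good, the last entry is known bad, and badness never goes away again.
    """
    lo, hi = 0, len(changesets) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(changesets[mid]):
            hi = mid
        else:
            lo = mid
    return changesets[hi]


# Outcomes reported in the thread: True means the dom0 kernel hit the BUG.
reported = {"1.1774": False, "1.1777": True, "1.1779": True, "1.1784": True}
culprit = first_bad(list(reported), lambda cset: reported[cset])
print(culprit)
```

With only these four data points the search stops at 1.1777, which is as far as the reported results can narrow things: the break lies somewhere after 1.1774 and no later than 1.1777.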
> try 'objdump -D xen-syms | less' and navigate to the addresses printed
> out by the Xen stack trace. Make sure that the 'xen-syms' is sync'ed
> with 'xen.gz'.

There's also tools/misc/xensymoops (it has its own help message regarding usage), which will help you out here.

Cheers,
Mark

> - Bin
>
> On Apr 8, 2005 12:19 AM, Christopher S. Aker <caker@theshore.net> wrote:
> > Under the -testing tree:
> >
> > (retrieved with bk clone -r<cset>
> > bk://xen.bkbits.net/xen-2.0-testing.bk)
> >
> > 1.1774 - boots correctly (creates a 2.6.10-xen0)
> > 1.1784 - refuses to boot (creates a 2.6.11-xen0)
> >
> > Anything past 1.1784 that I've tried also BUGs. It seems like any
> > 2.6.11-dom0 kernel fails on this machine. Incidentally, I am able to
> > run 2.6.11-domUs using the -stable 2.6.10-dom0.
> >
> > All of the changesets between 1.1774 and 1.1784 are the 2.6.11 merge.
> > I haven't bothered to try changesets in between those, since they look
> > to me like they all need each other. If anyone wants me to try those
> > changesets, let me know.
> >
> > How can I get more information from the stack trace that Xen puts out
> > to further narrow this down?
> >
> > Thanks,
> > -Chris
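Before looking addresses up in the objdump output or handing them to a symbol tool, it can help to pull the bracketed code addresses out of the dump mechanically. A rough Python sketch, assuming the dump text looks like the one at the top of this thread; call_trace_addresses is a hypothetical helper and not part of the Xen tools.

```python
import re
from typing import List

# Matches both "[fc51d609]" (stack dump) and "[<fc51d609>]" (EIP, call trace).
BRACKETED = re.compile(r"\[<?([0-9a-fA-F]{8})>?\]")


def call_trace_addresses(dump: str) -> List[int]:
    """Return the bracketed addresses in a Xen dump, in order, without repeats."""
    seen: List[int] = []
    for match in BRACKETED.finditer(dump):
        addr = int(match.group(1), 16)
        if addr not in seen:
            seen.append(addr)
    return seen


# Excerpt of the first dump posted in this thread.
sample = """\
(XEN) EIP: 0808:[<fc50554e>]
(XEN) fc52a9c0 fc52aa16 00000090 00000bac 00000000 00000000 00000000 [fc51d609]
(XEN) Call Trace from ESP=fc503f04: [<fc51d609>] [<fc52511e>] [<fc5290fe>]
"""

for addr in call_trace_addresses(sample):
    print(f"{addr:08x}")
```

Each printed address can then be searched for in the disassembly of a xen-syms that matches the booted xen.gz.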
> 1.1784 - BUGs
> 1.1779 - BUGs
> 1.1777 - BUGs
> 1.1774 - boots correctly
>
> There are only four changesets between 1.1774 and 1.1777, and
> of course one of them is huge -- the 2.6.11 merge. Also, the
> objdump didn't tell me anything useful. I have the objdump
> and the stack trace if anyone wants them. I think I'm at the
> end of my abilities to narrow it down any further.

Thanks Chris, this is very useful.

It would be helpful if you could use a debug=y build of Xen and add a show_guest_stack() just before the BUG(). Further, it might be revealing to hack the following into linux-2.6.11-xen0/kernel/printk.c:vprintk to see how far the dom0 boot is getting.

  /* Emit the output into the temporary buffer */
  printed_len = vscnprintf(printk_buf, sizeof(printk_buf), fmt, args);
+ HYPERVISOR_console_io(CONSOLEIO_write, strlen(printk_buf), printk_buf);

Is there anything 'unusual' about the machine you're using?

Thanks,
Ian
On 8 Apr 2005, at 06:12, Christopher S. Aker wrote:

> There are only four changesets between 1.1774 and 1.1777, and of
> course one of them is huge -- the 2.6.11 merge. Also, the objdump
> didn't tell me anything useful. I have the objdump and the stack
> trace if anyone wants them. I think I'm at the end of my abilities
> to narrow it down any further.

See Ian's email also, but this definitely looks like new start-of-day code in 2.6.11 that needs fixing for running on Xen. If you can get a stack backtrace for Linux, or some printk output (Ian's email explains how to do both) then we can try narrowing down further.

I'm surprised it fails on your Xeon system but not ours. :-(

-- Keir
From: "Ian Pratt" <m+Ian.Pratt@cl.cam.ac.uk>
> Thanks Chris, this is very useful.
>
> It would be helpful if you could use a debug=y build of Xen and add a
> show_guest_stack() just before the BUG(). Further, it might be revealing
> to hack the following into linux-2.6.11-xen0/kernel/printk.c:vprintk to
> see how far the dom0 boot is getting.
>
>   /* Emit the output into the temporary buffer */
>   printed_len = vscnprintf(printk_buf, sizeof(printk_buf), fmt, args);
> + HYPERVISOR_console_io(CONSOLEIO_write, strlen(printk_buf), printk_buf);

Did all three. Full results are here:

http://www.theshore.net/~caker/xen/BUGdomain-dmesg2.txt

Here's the relevant part:

(XEN) Scrubbing Free RAM: ........................................done.
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to Xen).
Linux version 2.6.11-xen0 (root@host27.linode.com) (gcc version 3.4.3 20041212 (Red Hat 3.4.3-9.EL4)) #1 Fri Apr 8 05:49:40 EDT 2005
<6>BIOS-provided physical RAM map:
Xen: 0000000000000000 - 0000000008000000 (usable)
<5>128MB LOWMEM available.
<7>On node 0 totalpages: 32768
<7> DMA zone: 32768 pages, LIFO batch:8
<7> Normal zone: 0 pages, LIFO batch:1
<7> HighMem zone: 0 pages, LIFO batch:1
<6>DMI present.
(XEN) (file=/root/bk/testing/xen/include/asm/mm.h, line=157) Error pfn 000f7ef9: ed=fc5a8d20, sd=00000000, caf=00000000, taf=00000000
(XEN) DOM0: (file=memory.c, line=1756) ptwr: Could not re-validate l1 page
(XEN)
(XEN) Guest EIP is c0114cd1
(XEN) 00000001 00000001 f7efa000 c0495e12 fbeeb000 000f7ef9 00000063 fbff2190
(XEN) f7ef9000 c03b0478 00000000 c049492a f7ef9000 00000478 fbff2190 f7ef9000
(XEN) c03b0478 c0489f48 c0494b1e f7ef9000 00000478 00000024 c0494bcf fbffb000
(XEN) 00241000 494d445f 0478525f f7ef9000 00000024 00008000 00018000 00000020
(XEN) 00000000 c0494cd8 c0494bcf c0491f90 c04db6a0 c03baa05 00000000 00000000
(XEN) c0488000 00000000 c0119eee 00000000 00000088 ffffff78 c04de9a5 00000000
(XEN) c04de9a6 00000085 c0119da8 0000000a 00000400 c03af9e0 c0489fe8 00000000
(XEN) 0002080b c0973200 c04db180 0002080b c0973200 c04db180 00000000 c048a58a
(XEN) c0489ff8 00000000 00000000 00000000 00000000 c04db6a0 c0100066
(XEN) BUG at domain.c:145
(XEN) CPU: 0
(XEN) EIP: 0808:[<fc505853>]
(XEN) EFLAGS: 00210292
(XEN) eax: 00000000 ebx: fc5a8d20 ecx: 00000000 edx: 00000000
(XEN) esi: 00000000 edi: 00000000 ebp: 00000000 esp: fc503ee4
(XEN) ds: 0810 es: 0810 fs: 0810 gs: 0810 ss: 0810
(XEN) Stack trace from ESP=fc503ee4:
(XEN) fc5329d8 fc532a69 00000091 00000000 00000bac fc5a8d20 00000000 [fc5221d2]
(XEN) ffbda000 00000000 000006dc 00000008 00200096 000d2535 00000000 000d2535
(XEN) 00000000 15d5ae1e 00000454 c001dbac fc5a8d94 00000000 00000000 fc5a8d20
(XEN) 00000000 ffbda000 00000000 c001dbad fef00074 00000000 f7ef9063 0181d061
(XEN) feffbfbc 0181d066 0181d061 fc5a8d20 fbffc001 fc503fb8 c001dbad [fc52cb9d]
(XEN) 00000000 fcfec000 00000010 c04de920 [fc513324] fcfec000 00000000 fc5a8d20
(XEN) f7efa000 00000113 00000000 [fc5311fe] fc503fb8 00000001 fbeeb000 fbeeb000
(XEN) f7efa000 00000113 00000000 fbffc000 000e0000 c0114cd1 00000819 00210286
(XEN) c0489ee4 00000821 00000821 00000821 00000000 00000000 fc5a8d20
(XEN) Call Trace from ESP=fc503ee4: [<fc5221d2>]
[<fc52cb9d>] [<fc513324>] [<fc5311fe>]

****************************************
CPU0 FATAL TRAP: vector = 6 (invalid operand)
[error_code=0000]
Aieee! CPU0 is toast...
****************************************

> Is there anything 'unusual' about the machine you're using?

No, not at all (SuperMicro 6013P-i). I would have expected others to have encountered the same problem by now, but I'm not quite sure how to interpret these results.

From: "Keir Fraser" <Keir.Fraser@cl.cam.ac.uk>
> See Ian's email also, but this definitely looks like new start-of-day
> code in 2.6.11 that needs fixing for running on Xen. If you can get a
> stack backtrace for Linux, or some printk output (Ian's email explains
> how to do both) then we can try narrowing down further.
>
> I'm surprised it fails on your Xeon system but not ours. :-(

Same here. Hopefully this means something to you guys :-)

-Chris
On 8 Apr 2005, at 09:00, Ian Pratt wrote:

> Thanks Chris, this is very useful.
>
> It would be helpful if you could use a debug=y build of Xen and add a
> show_guest_stack() just before the BUG(). Further, it might be
> revealing to hack the following into
> linux-2.6.11-xen0/kernel/printk.c:vprintk to see how far the dom0
> boot is getting.

It's important to ensure you are using a debug build of Xen (debug=y make). Also, the guest backtrace will not be in a particularly pretty format. You may just want to post it here but we will definitely also require a link to your vmlinux file (i.e., non-compressed Linux image that has not been stripped of symbol info). We can then match likely addresses in the backtrace to code points in the objdump'ed kernel image.

-- Keir
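Matching backtrace words to code points amounts to a nearest-preceding-symbol lookup. A minimal Python sketch, assuming a System.map-style listing with an address, a type letter, and a name on each line. The symbol names and addresses in the demo map are made up for illustration and are not taken from Chris's kernel; load_symbols, resolve, and the max_offset cutoff are likewise hypothetical.

```python
import bisect
from typing import List, Optional, Tuple

Symbol = Tuple[int, str]  # (start address, name)


def load_symbols(system_map_text: str) -> List[Symbol]:
    """Parse 'address type name' lines, keeping only text (code) symbols."""
    symbols = []
    for line in system_map_text.splitlines():
        parts = line.split()
        if len(parts) >= 3 and parts[1] in ("t", "T"):
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def resolve(addr: int, symbols: List[Symbol], max_offset: int = 0x10000) -> Optional[str]:
    """Return 'name+0xoff' for the nearest symbol at or below addr, else None.

    max_offset is an arbitrary cutoff so that stack words that are plainly
    not code addresses (small integers, data pointers) are not reported.
    """
    starts = [start for start, _ in symbols]
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return None
    start, name = symbols[i]
    offset = addr - start
    if offset > max_offset:
        return None
    return f"{name}+0x{offset:x}"


# Made-up map for illustration only; real names come from your System.map.
DEMO_MAP = """\
c0100000 T _text
c0114c80 T hypothetical_func_a
c0114d40 T hypothetical_func_b
c03b0478 D hypothetical_data
"""

symbols = load_symbols(DEMO_MAP)
for word in (0x00000001, 0xC0114CD1, 0xF7EFA000):
    print(f"{word:08x} -> {resolve(word, symbols)}")
```

Words that do not resolve are most likely data or saved registers; the ones that do are candidate return addresses worth checking in the disassembly.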
> It's important to ensure you are using a debug build of Xen (debug=y
> make).

I edited the Rules.mk file and changed verbose and debug to y.

> Also, the guest backtrace will not be in a particularly pretty
> format. You may just want to post it here but we will definitely also
> require a link to your vmlinux file (i.e., non-compressed Linux image
> that has not been stripped of symbol info). We can then match likely
> addresses in the backtrace to code points in the objdump'ed kernel
> image.

http://www.theshore.net/~caker/xen/BUGdomain/

-Chris
> > It's important to ensure you are using a debug build of Xen (debug=y
> > make).
>
> I edited the Rules.mk file and changed verbose and debug to y.
>
> > Also, the guest backtrace will not be in a particularly pretty
> > format. You may just want to post it here but we will definitely also
> > require a link to your vmlinux file (i.e., non-compressed Linux image
> > that has not been stripped of symbol info). We can then match likely
> > addresses in the backtrace to code points in the objdump'ed kernel
> > image.
>
> http://www.theshore.net/~caker/xen/BUGdomain/

Okay, this is progress. The domain is dying because it is trying to map a page that does not belong to it -- in fact it is a reserved page in the ACPI NVS (Non-Volatile Store) area. Unfortunately we batch page mappings and they get validated some time after the problem code was actually executed. :-(

To get a fault at the actual point the mapping is requested, you need to change a line in linux/include/asm-xen/asm-i386/pgtable-2level.h. The line is:

#define set_pte(pteptr, pteval) (*(pteptr) = pteval)

and should be changed to:

#define set_pte(pteptr, pteval) \
    xen_l1_entry_update((pteptr), (pteval).pte_low)

If you build and retry, we should get a guest backtrace at the code point that is making the invalid mapping.

I'm going to be away for the next week, but I will look at your new trace when I get email access. Alternatively Ian or Christian may have time to decipher the backtrace. :-)

-- Keir
> To get a fault at the actual point the mapping is requested, you need
> to change a line in linux/include/asm-xen/asm-i386/pgtable-2level.h.
> The line is:
> #define set_pte(pteptr, pteval) (*(pteptr) = pteval)
> and should be changed to:
> #define set_pte(pteptr, pteval) \
>     xen_l1_entry_update((pteptr), (pteval).pte_low)
>
> If you build and retry, we should get a guest backtrace at the code
> point that is making the invalid mapping.

Done, results and binaries:

http://www.theshore.net/~caker/xen/BUGdomain/BUGdomain-dmesg3.txt
http://www.theshore.net/~caker/xen/BUGdomain/

However, it doesn't appear to be different.

> I'm going to be away for the next week, but I will look at your new
> trace when I get email access. Alternatively Ian or Christian may have
> time to decipher the backtrace. :-)

Ok, thanks!

-Chris
> > To get a fault at the actual point the mapping is requested, you need
> > to change a line in linux/include/asm-xen/asm-i386/pgtable-2level.h.
> > The line is:
> > #define set_pte(pteptr, pteval) (*(pteptr) = pteval)
> > and should be changed to:
> > #define set_pte(pteptr, pteval) \
> >     xen_l1_entry_update((pteptr), (pteval).pte_low)
> >
> > If you build and retry, we should get a guest backtrace at the code
> > point that is making the invalid mapping.
>
> Done, results and binaries:
>
> http://www.theshore.net/~caker/xen/BUGdomain/BUGdomain-dmesg3.txt
> http://www.theshore.net/~caker/xen/BUGdomain/
>
> However, it doesn't appear to be different.

I'm pretty sure the vmlinuz xen0 binary you booted didn't have this change in it -- it still seems to be using ptwr instead of the queued interface.

The easiest (but slowest) thing to do is to edit the file in the sparse tree, then do a 'make -j4 world'. [If you edited the file in-place and did make world you'd have lost the change.]

Thanks,
Ian
On 8 Apr 2005, at 20:54, Christopher S. Aker wrote:

> Done, results and binaries:
>
> http://www.theshore.net/~caker/xen/BUGdomain/BUGdomain-dmesg3.txt
> http://www.theshore.net/~caker/xen/BUGdomain/
>
> However, it doesn't appear to be different.

md5sum of the vmlinux file has not changed, so you were probably running the old Linux binary. You need to rebuild it.

-- Keir
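The same check can be scripted: if the checksum of the image is unchanged, the edit never made it into the binary. A small Python sketch using the standard hashlib module; md5_of and was_rebuilt are hypothetical helper names, and the two paths are whatever copies of the image you kept from before and after the rebuild.

```python
import hashlib


def md5_of(path: str) -> str:
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def was_rebuilt(old_path: str, new_path: str) -> bool:
    """True if the two images differ, i.e. the rebuild produced a new binary."""
    return md5_of(old_path) != md5_of(new_path)
```

A differing checksum only shows that the binary changed, not that the intended edit is the reason, so it is a necessary check rather than a sufficient one.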
OK, started with a fresh -testing tree and made the changes there. Here are the results and binaries:

http://www.theshore.net/~caker/xen/BUGdomain2/

Most notably different in the output this time is:

<1>Failed to execute MMU updates.
(XEN) (file=extable.c, line=71) Pre-exception: fc53106e -> fc531124
(XEN) (file=traps.c, line=463) Page fault: fc531139 -> fc505880

Thanks!
-Chris
From: "Christopher S. Aker" <caker@theshore.net>
> OK, Started with a fresh -testing tree and made the changes there. Here are the
> results and binaries:
>
> http://www.theshore.net/~caker/xen/BUGdomain2/
>
> Most notably different in the output this time is:
>
> <1>Failed to execute MMU updates.
> (XEN) (file=extable.c, line=71) Pre-exception: fc53106e -> fc531124
> (XEN) (file=traps.c, line=463) Page fault: fc531139 -> fc505880

Just curious if this was the information you were looking for. Please let me know if I can do anything else to help get this resolved. I have 40 machines identical to this one, and it would be great to be able to continue with moving to Xen from UML.

I appreciate it!

-Chris
On 13 Apr 2005, at 19:16, Christopher S. Aker wrote:

> Just curious if this was the information you were looking for. Please
> let me know if I can do anything else to help get this resolved, I
> have 40 machines identical to this one, and it would be great to be
> able to continue with moving to Xen from UML.
>
> I appreciate it!

We need to make Xen crash at the point the MMU update fails. Stick this at the end of do_mmu_update() in arch/x86/mm.c:

if ( rc != 0 ) {
    show_guest_stack();
    BUG();
}

Then post the new crash output -- you need only recompile Xen; no need to recompile XenLinux this time.

-- Keir
> We need to make Xen crash at the point the MMU update fails. Stick this
> at the end of do_mmu_update() in arch/x86/mm.c:
>
> if ( rc != 0 ) {
>     show_guest_stack();
>     BUG();
> }
>
> Then post the new crash output -- you need only recompile Xen; no need
> to recompile XenLinux this time.

Ok, edited xen/arch/x86/memory.c; here are the new results and binaries:

http://www.theshore.net/~caker/xen/BUGdomain3/

Thanks,
-Chris
> > We need to make Xen crash at the point the MMU update fails. Stick
> > this at the end of do_mmu_update() in arch/x86/mm.c:
> >
> > if ( rc != 0 ) {
> >     show_guest_stack();
> >     BUG();
> > }
> >
> > Then post the new crash output -- you need only recompile Xen; no
> > need to recompile XenLinux this time.
>
> Ok, edited xen/arch/x86/memory.c; here are the new results and binaries:
>
> http://www.theshore.net/~caker/xen/BUGdomain3/

Thanks. Please can you put the vmlinux image or System.map there too: the stack trace we're interested in is the guest's.

Thanks,
Ian
> Thanks. Please can you put the vmlinux image or System.map there too:
> the stack trace we're interested in is the guest's.

Since XenLinux didn't change from the last run, they're here:

http://www.theshore.net/~caker/xen/BUGdomain2/

-Chris
> > Ok, edited xen/arch/x86/memory.c; here's the new results and binaries:
> > http://www.theshore.net/~caker/xen/BUGdomain3/

Sorry, I made a mistake: where I asked you to test for and crash on rc != 0, please instead test for rc < 0. do_mmu_update() sometimes validly returns a positive non-zero value.

This mistake meant you were crashing earlier than the point at which the real bug occurs. If you make the above change then domain0 should print '<6> DMI present' immediately before Xen prints the backtrace and then crashes.

-- Keir
> > http://www.theshore.net/~caker/xen/BUGdomain3/
>
> Sorry, I made a mistake: where I asked you to test for and crash on rc
> != 0, please instead test for rc < 0. do_mmu_update() sometimes validly
> returns a positive non-zero value.
>
> This mistake meant you were crashing earlier than the point at which
> the real bug occurs. If you make the above change then domain0 should
> print '<6> DMI present' immediately before Xen prints the backtrace and
> then crashes.

No biggie. Got the correct output this time, I believe. Results and binaries:

http://www.theshore.net/~caker/xen/BUGdomain3/?M=D

Thanks,
-Chris
On 14 Apr 2005, at 03:13, Christopher S. Aker wrote:

> No biggie. Got the correct output this time, I believe. Results and
> binaries:
>
> http://www.theshore.net/~caker/xen/BUGdomain3/?M=D

Ah, got it. I need to be a bit more clever with not-ordinary-RAM pages. Your e820 map is a tiny bit unusual in that there is a small piece of usable RAM just after the ACPI areas. Since I currently assume non-RAM only occurs at the very end of the map (no holes) Xen isn't allowing domain0 to map the BIOS DMI tables (it thinks it is ordinary RAM so performs normal ownership checks, and the pages do not belong to dom0).

I'll sort out a fix tomorrow...

-- Keir
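The difference between the two assumptions can be illustrated with a toy model. This Python sketch is not Xen code: the map below is hypothetical (it is not Chris's real e820 map), and both function names are made up. It only shows why treating everything below the top of RAM as RAM misclassifies a page such as pfn 000f7ef9 when a piece of usable RAM follows the ACPI areas.

```python
from typing import List, Tuple

# Standard e820 address range type codes.
E820_RAM = 1
E820_RESERVED = 2
E820_ACPI = 3
E820_NVS = 4

Entry = Tuple[int, int, int]  # (start, end_exclusive, type)

# Hypothetical layout: a small usable region sits after the ACPI areas.
HYPOTHETICAL_MAP: List[Entry] = [
    (0x00000000, 0x0009FC00, E820_RAM),
    (0x00100000, 0xF7EE0000, E820_RAM),
    (0xF7EE0000, 0xF7EF0000, E820_ACPI),
    (0xF7EF0000, 0xF7F00000, E820_NVS),
    (0xF7F00000, 0xF7F80000, E820_RAM),
    (0xF7F80000, 0xF8000000, E820_RESERVED),
]


def is_ram_assuming_no_holes(addr: int, e820: List[Entry]) -> bool:
    """Simplified 'no holes' view: anything below the top of RAM is RAM."""
    top_of_ram = max(end for _, end, kind in e820 if kind == E820_RAM)
    return addr < top_of_ram


def is_ram_per_entry(addr: int, e820: List[Entry]) -> bool:
    """Check the entry that actually contains the address."""
    return any(start <= addr < end and kind == E820_RAM
               for start, end, kind in e820)


addr = 0x000F7EF9 << 12  # pfn from the error message, 4 KiB pages
print(hex(addr), is_ram_assuming_no_holes(addr, HYPOTHETICAL_MAP),
      is_ram_per_entry(addr, HYPOTHETICAL_MAP))
```

Under the first view the NVS page looks like ordinary RAM, so an ownership check would apply and fail for dom0; under the second it is recognised as non-RAM.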
> Ah, got it. I need to be a bit more clever with not-ordinary-RAM
> pages. Your e820 map is a tiny bit unusual in that there is a small
> piece of usable RAM just after the ACPI areas. Since I currently
> assume non-RAM only occurs at the very end of the map (no holes) Xen
> isn't allowing domain0 to map the BIOS DMI tables (it thinks it is
> ordinary RAM so performs normal ownership checks, and the pages do not
> belong to dom0).
>
> I'll sort out a fix tomorrow...

I've pushed a fix into the 2.0-testing repository. I'm about to merge it into the unstable repository as well.

-- Keir
> > I'll sort out a fix tomorrow...
>
> I've pushed a fix into the 2.0-testing repository. I'm about to merge
> it into the unstable repository as well.
>
> -- Keir

Just booted -testing and it works great! Nicely done.

Thanks,
-Chris