Hi Guys,

Well, I have all my Xen patches ready, but in final testing dom0 doesn't boot in PVH mode. Without a debugger it's hard to debug, so I added kdb to it, and then it boots fine. So back to debugging without kdb. Unfortunately, it hangs so badly during boot that printk's won't work either.

I am able to boot in PV mode and boot PV/PVH/HVM guests, so hopefully it's just one issue. Anyways, JFYI.

Darn software, last 10% takes forever.. :) :) ....

Thanks,
Mukesh
On Tue, 8 Jan 2013 18:37:56 -0800 Mukesh Rathor <mukesh.rathor@oracle.com> wrote:

> Hi Guys,
>
> Well, I've all my xen patches ready, but in final testing, the dom0
> doesn't boot in PVH mode. Without debugger, it's hard to debug. So I
> added kdb to it, and it boots fine. So back to debugging without kdb.
> Unfortunately, it's hanging badly during boot that printk's won't work
> either. I am able to boot in PV mode and boot PV/PVH/HVM guests. so
> hopefully, it's just one issue.

Ah, debug is on by default. It works with debug=n. I've never tested with debug=y. It gives me a double fault right at the beginning:

(XEN) *** DOUBLE FAULT ***
(XEN) ----[ Xen-4.3-unstable x86_64 debug=y Not tainted ]----
(XEN) CPU: 0
(XEN) RIP: e008:[<ffff82c4802a6700>] construct_dom0+0x1d/0x2d9e
(XEN) RFLAGS: 0000000000010292
(XEN) rax: 0000000000000001 rbx: ffff83013ffc4000 rcx: ffff83000008cf10
(XEN) rdx: 0000000000dfb000 rsi: ffff83000008cf00 rdi: ffff83013ffc4000
(XEN) rbp: ffff82c4802c7e48 rsp: ffff82c4802c5998 r8: ffff82c48029997d
(XEN) r9: ffff82c4802b4660 r10: 0000000000000020 r11: ffff82c4802d3e40
(XEN) r12: ffff83000008cf00 r13: ffff83013fff2370 r14: 0000000000dfb000
(XEN) r15: ffff82c480265780 cs: 000000000000e008 ss: 0000000000000000
(XEN) Valid stack range: ffff82c4802c6000-ffff82c4802c8000, sp=ffff82c4802c5998, tss.esp0=ffff82c4802c7fc0

Starting to look at what debug=y does.

Thanks,
Mukesh
On Tue, 8 Jan 2013 18:50:30 -0800 Mukesh Rathor <mukesh.rathor@oracle.com> wrote:

Strange. So the latest is, things are OK with debug=n. With debug=y, I get a DOUBLE FAULT at:

ffff82c4802d4710: construct_dom0+d subq $0x2488, %rsp

where rsp == ffff82c4802efe20 is mapped, but (rsp - 0x2488), i.e. 0xffff82c4802ed998, is not. But the subtract instruction should not cause an exception like that, IMO. ss is 0, but that should be OK. Hmm... I am at a loss on this one!

Thanks,
Mukesh
On 10/01/2013 02:20, "Mukesh Rathor" <mukesh.rathor@oracle.com> wrote:

> On Tue, 8 Jan 2013 18:50:30 -0800
> Mukesh Rathor <mukesh.rathor@oracle.com> wrote:
>
> Strange. So the latest is, things are OK with debug=n. With debug=y,
> I get DOUBLE FAULT at:
>
> ffff82c4802d4710: construct_dom0+d subq $0x2488, %rsp
>
> where rsp == ffff82c4802efe20 is there, but (rsp - 0x2488), ie,
> 0xffff82c4802ed998 is not there. But, the subtract instruction
> should not cause an exception like that IMO. ss is 0, but that
> should be OK. Hmm... I am at a loss on this one!

This one's not rocket science, Mukesh. The hypervisor stack is 8kB, and construct_dom0() is trying to allocate a stack frame bigger than 8kB. Debug builds enforce the 8kB limit with guard pages. You will actually be crashing on the first stack-writing instruction after the subq, but the double fault report is imprecise (in fact the reported cs:eip is undefined for a double fault).

You're allocating a ridiculously big local variable on construct_dom0's stack. So just don't do that.

 -- Keir

> thanks,
> mukesh
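The arithmetic behind Keir's diagnosis can be checked directly against the numbers in the double-fault dump. The sketch below is only an illustration: the constants are copied from the dump and from the `subq` operand quoted in the thread, while the helper names (`stack_size`, `sp_in_stack`, `overrun_bytes`) are made up for this example and are not Xen functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the double-fault dump earlier in the thread. */
#define STACK_LOW   UINT64_C(0xffff82c4802c6000)  /* bottom of "Valid stack range" */
#define STACK_HIGH  UINT64_C(0xffff82c4802c8000)  /* top of "Valid stack range" */
#define FAULT_RSP   UINT64_C(0xffff82c4802c5998)  /* rsp reported in the dump */
#define FRAME_SIZE  UINT64_C(0x2488)              /* operand of "subq $0x2488, %rsp" */

/* Size of the valid stack range in bytes (hypothetical helper). */
static uint64_t stack_size(uint64_t low, uint64_t high)
{
    assert(high > low);
    return high - low;
}

/* True when sp still lies within [low, high] (hypothetical helper). */
static bool sp_in_stack(uint64_t sp, uint64_t low, uint64_t high)
{
    return sp >= low && sp <= high;
}

/* Bytes by which sp has dropped below the stack bottom; 0 if it has not. */
static uint64_t overrun_bytes(uint64_t sp, uint64_t low)
{
    return sp < low ? low - sp : 0;
}
```

With these values, `stack_size(STACK_LOW, STACK_HIGH)` is 0x2000 (8kB), while `FRAME_SIZE` is 0x2488, so the frame alone is larger than the whole stack. `FAULT_RSP + FRAME_SIZE` is still inside the valid range, but `FAULT_RSP` itself is 0x668 bytes below it, which in a debug build lands in the guard page, so the first write through the new rsp faults.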
On Thu, 10 Jan 2013 07:34:27 +0000 Keir Fraser <keir.xen@gmail.com> wrote:

> On 10/01/2013 02:20, "Mukesh Rathor" <mukesh.rathor@oracle.com> wrote:
>
> This one's not rocket science, Mukesh. The hypervisor stack is 8kB,
> and construct_dom0() is trying to allocate a stack frame bigger than
> 8kB. Debug builds enforce the 8kB limit with guard pages. You will
> actually be crashing on the first stack writing instruction after the
> subq, but double fault is imprecise (in fact reported cs:eip is
> undefined for a double fault).

Thanks Keir. Of course that was my first thought, and I tried confirming it with debug code in the PF handler. When I noticed the PF was not happening, I started to think it must be something else, combined with the fact that the subq man page doesn't say anything about RSP being a special case, and the instruction is just decrementing rsp, not accessing memory through it. Anyways, the whole x86 is a rocket to me :)...

> You're allocating a ridiculously big local variable on
> construct_dom0's stack. So just don't do that.

Yup, I was accidentally allocating a large char array on the stack. Fixed.

Thanks,
Mukesh
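The thread does not show the actual fix, so the following is only a minimal sketch of the general pattern: take a large scratch buffer from the heap instead of declaring it as a local array. Everything named here is hypothetical (`SCRATCH_SIZE`, `copy_via_heap_scratch`), the buffer size is invented because the real array size is not given, and standard `malloc`/`free` stand in for whatever allocator the hypervisor code would really use.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical size, chosen only because it exceeds an 8kB stack. */
#define SCRATCH_SIZE 9000

/*
 * Copies text into a heap-allocated scratch buffer and reports its length.
 * A local "char scratch[SCRATCH_SIZE];" here would put the whole buffer in
 * this function's stack frame, which is the pattern to avoid.
 * Returns 0 on success, -1 if the allocation fails.
 */
static int copy_via_heap_scratch(const char *text, size_t *len_out)
{
    assert(text != NULL && len_out != NULL);

    char *scratch = malloc(SCRATCH_SIZE);
    if (scratch == NULL)
        return -1;

    /* snprintf truncates rather than overflowing the buffer. */
    snprintf(scratch, SCRATCH_SIZE, "%s", text);
    *len_out = strlen(scratch);

    free(scratch);
    return 0;
}
```

The trade-off is an allocation that can fail and must be freed on every exit path, in exchange for a stack frame that stays a few dozen bytes regardless of the buffer size.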