There are (at least) two things which prevent one from debugging the hypervisor with gdb by setting crash_debug=y in xen/Rules.mk.

arch/x86/gdbstub.c uses copy_{from,to}_user to access hypervisor memory. This is a bit strange, as the comments nearby suggest: it does this so that it can reuse the fixup facility which allows pagefaults to be turned into error returns. However, evidently since this code was last used by anyone an additional check was added to copy_{from,to}_user: access_ok is called to prevent copy_{from,to}_user from being tricked (by an errant guest) into reading or writing hypervisor memory. This is not appropriate for the gdbstubs. The effect is that all attempts to read and write hypervisor memory from the gdb session fail. So in the patch below I have changed the calls to use __copy_{from,to}_user, bypassing access_ok.

gdb attempts to read inaccessible memory at startup, at least in my tests. This ought not to be a problem as the aforementioned page fault fixup magic ought just to turn these into errors, which gdb copes with. However __spurious_page_fault is called fairly early on by the page fault handler - before looking for a fixup. And __spurious_page_fault in turn calls map_domain_page. map_domain_page asserts !in_irq(), which assertion can fail when the page fault happens while debugging, for example if the debugger was entered via the `%' hypervisor console keystroke.

The best way to deal with this seemed to me to make __spurious_page_fault always return false when the debugger is attached (ie, when we are running the debugging stub). The comments surrounding __spurious_page_fault seem to suggest that these only occur when page tables are changed. If that's true then this change is correct since the debugging stubs don't change page tables. However, if general speculative memory accesses may generate spurious page faults then this change is wrong.
I'm sure someone more familiar with the contents of the CPU architecture manuals can answer this question with less effort than me :-). In that case it may be necessary to put the debugger check in the very-widely-used map_domain_page. (Luckily you only bear the cost if you compile with -DCRASH_DEBUG.)

With these two changes, I can get gdb hypervisor debugging to work - at least, in my test I was able to trap into the debugging mode with `%', attach with gdb, and examine some hypervisor static variables. I haven't spent any effort on getting it to be able to resume normal operation after such a %-interrupt, and that didn't work for me even though it's apparently intended to be supported.

The attached patch also changes the error message printed when debugging is requested with `%' but no gdb=... was provided on the command line from dbg_printk (generally a noop) to a printk, and provides a confirmation that `%' was pressed. This makes debugger entry a little more positive and the latter may help diagnosis when debugger entry fails for some other reason.

For the record, to get to a working gdb prompt I did these things:
 * Set debug=y in Config.mk
 * Set crash_debug=y in xen/Rules.mk
 * Make the changes in the attached patch, and build.
 * Pass gdb=com1 as a hypervisor command line argument (I already have com1=38400,8n1 console=com1,vga sync_console)
 * Boot the system with minicom connected from my workstation via a null modem cable in the usual way
 * In minicom, give the escape character (^A by default) three times to talk to Xen (Xen prints `(XEN) *** Serial input -> Xen...').
 * Press % and observe the messages
     (XEN) '%' pressed -> trapping into debugger
     (XEN) GDB connection activated.
     (XEN) Waiting for GDB to attach...
 * Disconnect from minicom without allowing minicom to send any modem control sequences.
 * Start gdb with
     gdb /path/to/build/tree/xen/xen-syms
   and then
     (gdb) set remotebaud 38400
     (gdb) target remote /dev/ttyS0
     0x614d12ff in ?? ()
     (gdb)

As you can see at least the EIP seen by gdb doesn't seem accurate, but that's not important to me right now.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
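For readers unfamiliar with the checked/unchecked split described in the message above, here is a minimal, self-contained sketch of the idea. It is not Xen code: the toy_* names, the 256-byte "address space" and the boundary at 128 are all made up for illustration. The checked variant refuses any range that touches "hypervisor" addresses (the role access_ok plays), while the unchecked variant skips that test and only reports an error when the access itself cannot be done (standing in for the page fault fixup).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy layout, NOT Xen's: addresses 0..127 are "guest", 128..255 "hypervisor". */
#define TOY_MEM_SIZE ((size_t)256)
#define TOY_HV_START ((size_t)128)

static unsigned char toy_mem[256];

/* Stand-in for access_ok(): accept only ranges wholly below TOY_HV_START. */
static int toy_access_ok(size_t addr, size_t len)
{
    return len <= TOY_HV_START && addr <= TOY_HV_START - len;
}

/*
 * Stand-in for __copy_from_user(): no range check. Returns the number of
 * bytes NOT copied. The bounds test models a faulting access being turned
 * into an error return by the fixup machinery.
 */
static size_t toy_unchecked_copy_from(void *dst, size_t addr, size_t len)
{
    assert(dst != NULL);
    if (len > TOY_MEM_SIZE || addr > TOY_MEM_SIZE - len)
        return len;
    memcpy(dst, &toy_mem[addr], len);
    return 0;
}

/* Stand-in for copy_from_user(): range check first, then the raw copy. */
static size_t toy_checked_copy_from(void *dst, size_t addr, size_t len)
{
    if (!toy_access_ok(addr, len))
        return len;   /* refused: the range touches "hypervisor" memory */
    return toy_unchecked_copy_from(dst, addr, len);
}
```

With this model, a debugger-style read of address 200 fails through toy_checked_copy_from but succeeds through toy_unchecked_copy_from, which mirrors the symptom and the fix described in the message.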
On 11/12/07 17:41, "Ian Jackson" <Ian.Jackson@eu.citrix.com> wrote:

> The best way to deal with this seemed to me to make
> __spurious_page_fault always return false when the debugger is
> attached (ie, when we are running the debugging stub). The comments
> surrounding __spurious_page_fault seem to suggest that these only
> occur when page tables are changed. If that's true then this change
> is correct since the debugging stubs don't change page tables.

I think it's more obvious to check for in_irq(). That predicate already exists, and pagetables are never updated in IRQ context. And it's exactly the case that __spurious_page_fault() cannot handle. Apart from that the patch is good.

 -- Keir
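The suggestion above can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: every toy_* name is hypothetical, the flag toy_in_irq stands in for the in_irq() predicate, and toy_walk_says_spurious stands in for whatever the real check concludes. The point it shows is the ordering: test for IRQ context first and return false, so the mapping helper that asserts !in_irq() is never reached in that case.

```c
#include <assert.h>
#include <stdbool.h>

static bool toy_in_irq;              /* stand-in for the in_irq() predicate */
static bool toy_walk_says_spurious;  /* stand-in for the real check's verdict */
static int  toy_map_calls;           /* counts calls to the mapping helper */

/* Stand-in for map_domain_page(): must never be called in IRQ context. */
static void toy_map_domain_page(void)
{
    assert(!toy_in_irq);
    toy_map_calls++;
}

/* Stand-in for __spurious_page_fault() with the in_irq() guard added. */
static bool toy_spurious_page_fault(void)
{
    /* In IRQ context the fault is treated as not spurious, so the caller
     * falls through to its normal handling (such as looking for a fixup). */
    if (toy_in_irq)
        return false;

    toy_map_domain_page();
    return toy_walk_says_spurious;
}
```

In this model a fault taken while toy_in_irq is set returns false without touching toy_map_domain_page, so the assertion cannot fire; outside IRQ context the behaviour is unchanged.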
Keir Fraser writes ("Re: [Xen-devel] [PATCH] fix gdb debugging of hypervisor"):
> I think it's more obvious to check for in_irq(). That predicate already
> exists, and pagetables are never updated in IRQ context. And it's exactly
> the case that __spurious_page_fault() cannot handle. Apart from that the
> patch is good.

I'll resubmit it. Also, the problem with the register values was an endianness error which will be easy to fix.

Ian.
Ian Jackson
2007-Dec-12 11:15 UTC
[Xen-devel] [PATCH] fix gdb debugging of hypervisor (revised)
This is the revised version of the patch in my message:

  Subject: [PATCH] fix gdb debugging of hypervisor
  Date: Tue, 11 Dec 2007 17:41:52 +0000
  Message-ID: <18270.52192.288867.395215@mariner.uk.xensource.com>

This patch:
 * enables the gdbstubs to properly access hypervisor memory;
 * prevents an assertion failure in __spurious_page_fault's call to map_domain_page if such accesses fail, by testing in_irq();
 * prints some additional helpful messages;
 * fixes the endianness of register transfers from the gdbstubs so that gdb is much less confused;
 * fixes the documentation in docs/misc/crashdb.txt.

Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>

Ian.
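On the endianness item in the list above: the gdb remote serial protocol transmits each register as hex digits, two per byte, with the bytes in target byte order, which on x86 means least significant byte first. The sketch below shows that encoding for a 32-bit register. It is an illustration only, not the code from the patch; toy_hex_le32 is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Write the four bytes of `reg` as eight lowercase hex digits, least
 * significant byte first (x86 target byte order), followed by a NUL.
 * `out` must have room for nine characters.
 */
static void toy_hex_le32(uint32_t reg, char out[9])
{
    static const char digits[] = "0123456789abcdef";

    assert(out != NULL);
    for (int i = 0; i < 4; i++) {
        unsigned byte = (reg >> (8 * i)) & 0xff;  /* byte i, LSB first */
        out[2 * i]     = digits[byte >> 4];       /* high nibble */
        out[2 * i + 1] = digits[byte & 0xf];      /* low nibble */
    }
    out[8] = '\0';
}
```

For example, the value 0x12345678 is encoded as 78563412. A stub that sent the digits in the opposite byte order would make gdb display byte-swapped register values.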
Ian Jackson
2007-Dec-12 11:34 UTC
Re: [Xen-devel] [PATCH] fix gdb debugging of hypervisor (revised)
iwj writes ("[Xen-devel] [PATCH] fix gdb debugging of hypervisor (revised)"):
> diff -r 38a45b7c6cb5 xen/arch/x86/traps.c
> --- a/xen/arch/x86/traps.c Mon Dec 10 11:37:13 2007 +0000
> +++ b/xen/arch/x86/traps.c Wed Dec 12 11:12:36 2007 +0000
> @@ -784,8 +784,8 @@ asmlinkage int do_invalid_op(struct cpu_
>      predicate = is_kernel(bug_str.str) ? (char *)bug_str.str : "<unknown>";
>      printk("Assertion '%s' failed at %.50s:%d\n",
>             predicate, filename, lineno);
> +    show_execution_state(regs);
>      DEBUGGER_trap_fatal(TRAP_invalid_op, regs);
> -    show_execution_state(regs);
>      panic("Assertion '%s' failed at %.50s:%d\n",
>            predicate, filename, lineno);

Please don't apply that hunk, it was included by mistake.

Ian.