Ryan Harper
2006-Apr-19 22:09 UTC
[Xen-devel] gdbserver-xen x86_64 paravirt guest debugging
I've attempted to debug live x86_64 domU domains with little success.
gdbserver-xen segfaults, and I've started running gdb on gdbserver-xen
to see where things are going south.

I kick off the server under gdb, then run the gdb client and attach
remotely. This appears to succeed, and gives me:

[New Thread 0]
[Switching to Thread 0]
0xffffffff8014e258 in softlockup_tick (regs=0xffff880026c6fcd8) at
kernel/softlockup.c:50
50 unsigned long timestamp = per_cpu(timestamp, this_cpu);

But when I ask for a backtrace:

(gdb) bt
#0 0xffffffff8014e258 in softlockup_tick (regs=0xffff880026c6fcd8) at
kernel/softlockup.c:50
Ignoring packet error, continuing...
Reply contains invalid hex digit 116

On the other side (gdbserver-xen) I see:

(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) y

Starting program:
/home/rharper/work/openhype/xen/unstable/hg/d/tools/debugger/gdb/gdb-6.2.1-linux-x86_64-xen/gdb/gdbserver/gdbserver-xen 127.0.0.1:9999 --attach 1
domain currently paused
Attached; pid = 1
Listening on port 9999
Remote debugging from host 127.0.0.1

Program received signal SIGSEGV, Segmentation fault.
0x00002b2611bf410a in map_domain_va_64 (xc_handle=7, cpu=0,
guest_va=0xffffffff80364ed0, perm=1) at xc_ptrace.c:295
295 l3p = page_array[l3p];

Some inspection:

(gdb) p l3p
$4 = 796261
(gdb) p nr_pages
$5 = 196608

page_array only has 196k entries AFAICT, so this seems problematic.

Anyone have x86_64 paravirt guest debugging working on latest unstable?

--
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253 T/L: 678-9253
ryanh@us.ibm.com

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
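The inspection above shows an index of 796261 being used on an array of 196608 entries (196608 pages of 4 KiB is 768 MiB of guest memory). A minimal sketch of a bounds-checked lookup follows. The helper name lookup_page_array and its signature are hypothetical and are not part of libxc; it only illustrates how the out-of-range access could be turned into an error return instead of a segfault.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of libxc: read page_array[pfn] only if
 * pfn is a valid index. Returns 0 and stores the entry in *mfn on
 * success, or -1 (leaving *mfn untouched) if pfn is out of range.
 */
static int lookup_page_array(const unsigned long *page_array,
                             unsigned long nr_pages,
                             unsigned long pfn,
                             unsigned long *mfn)
{
    assert(page_array != NULL && mfn != NULL);

    if (pfn >= nr_pages)
        return -1;      /* e.g. pfn 796261 with nr_pages 196608 */

    *mfn = page_array[pfn];
    return 0;
}
```

With the values from the gdb session, a call with pfn = 796261 and nr_pages = 196608 would take the error path rather than read past the end of the array.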
Kamble, Nitin A
2006-Apr-19 22:16 UTC
RE: [Xen-devel] gdbserver-xen x86_64 paravirt guest debugging
Hi Ryan,

I am writing a paper on debugging the Linux kernel using Xen and
gdbserver; it will give you more details.

As a quick fix for your problem, you need to run the "set architecture
i386:x86-64:intel" command in gdb before attaching.

Thanks & Regards,
Nitin
Open Source Technology Center, Intel Corp
Ryan Harper
2006-Apr-20 14:31 UTC
Re: [Xen-devel] gdbserver-xen x86_64 paravirt guest debugging
* Kamble, Nitin A <nitin.a.kamble@intel.com> [2006-04-19 17:17]:
> Hi Ryan,
> I am writing a paper for debugging Linux kernel using the Xen and the
> gdbserver. So with that you will get more details.
>
> For the quick solution for your problem, you need to run "set
> architecture i386:x86-64:intel" command in the gdb before attaching.

Thanks for the response, I'll give that a try. What exactly does that
command do?

--
Ryan Harper
Ryan Harper
2006-Apr-20 22:23 UTC
Re: [Xen-devel] gdbserver-xen x86_64 paravirt guest debugging
* Kamble, Nitin A <nitin.a.kamble@intel.com> [2006-04-19 17:17]:
> Hi Ryan,
> I am writing a paper for debugging Linux kernel using the Xen and the
> gdbserver. So with that you will get more details.
>
> For the quick solution for your problem, you need to run "set
> architecture i386:x86-64:intel" command in the gdb before attaching.

This didn't help at all, but thank you for the suggestion.

I'm not sure if anyone has tried to get debugging of paravirtual x86_64
guests to work. I encountered several issues along the way to getting a
successful 'bt' from gdb on a live x86_64 domU. Note, this is not a hvm
domain I am debugging.

The first issue was the paging_enabled check on
tools/libxc/xc_ptrace.c:384

    if (!paging_enabled(&ctxt[cpu])) {
        static void * v;
        unsigned long page;

        if ( v != NULL )
            munmap(v, PAGE_SIZE);

        page = page_array[va >> PAGE_SHIFT] << PAGE_SHIFT;

Specifically, the va >> PAGE_SHIFT was too large for the page_array of
pfn-to-mfns. Examining cr0 (ctxt->ctrlreg[0]) in gdb showed that cr0
was 0x00000008. Any reason why cr0 wouldn't at least have paging and
protected mode bits set for paravirtual guests?

I hacked up xc_linux_build.c and
linux-2.6-xen-sparse/drivers/xen/core/smpboot.c to set paging and
protected bits in cr0 and I made it further. Next stop was at
xc_ptrace.c:284

    #ifdef __x86_64__
    static void *
    map_domain_va_64(
        int xc_handle,
        int cpu,
        void *guest_va,
        int perm)
    {
        unsigned long l3p, l2p, l1p, l1e, p, va = (unsigned long)guest_va;
        uint64_t *l4, *l3, *l2, *l1;
        static void *v;

        if ((ctxt[cpu].ctrlreg[4] & 0x20) == 0 ) /* legacy ia32 mode */
            return map_domain_va_32(xc_handle, cpu, guest_va, perm);

cr4's value was 0, which forced me down the map_domain_va_32, which was
not the right path since I'm on a 64-bit guest.
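The two register checks described above come down to single-bit tests. A minimal sketch, assuming the architectural x86 bit positions (CR0.PE is bit 0, CR0.TS is bit 3, CR0.PG is bit 31, CR4.PAE is bit 5, i.e. the 0x20 mask quoted above). The macro and function names are illustrative and are not taken from xc_ptrace.c.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural x86 control-register bits. */
#define X86_CR0_PE  (UINT64_C(1) << 0)   /* protected mode enable */
#define X86_CR0_TS  (UINT64_C(1) << 3)   /* task switched */
#define X86_CR0_PG  (UINT64_C(1) << 31)  /* paging enable */
#define X86_CR4_PAE (UINT64_C(1) << 5)   /* physical address extension */

/* Compile-time check that the PAE macro is the 0x20 mask from the thread. */
static_assert(X86_CR4_PAE == 0x20, "CR4.PAE must equal the 0x20 mask");

/* Illustrative helper: nonzero if the saved cr0 has paging enabled. */
static int cr0_paging_enabled(uint64_t cr0)
{
    return (cr0 & X86_CR0_PG) != 0;
}

/* Illustrative helper: nonzero if the saved cr0 has protected mode enabled. */
static int cr0_protected_mode(uint64_t cr0)
{
    return (cr0 & X86_CR0_PE) != 0;
}

/* Illustrative helper: nonzero if the saved cr4 has PAE enabled. */
static int cr4_pae_enabled(uint64_t cr4)
{
    return (cr4 & X86_CR4_PAE) != 0;
}
```

With the values reported above, cr0 = 0x00000008 has only TS set, so both cr0 tests return 0, and cr4 = 0 fails the PAE test, which is why the tool treated a 64-bit paravirtual guest as unpaged and then as legacy ia32.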
Commenting that check out, I then blew up processing the page table
pointers, xc_ptrace.c:292

        l4 = xc_map_foreign_range( xc_handle, current_domid, PAGE_SIZE,
                PROT_READ, ctxt[cpu].ctrlreg[3] >> PAGE_SHIFT);
        if ( l4 == NULL )
            return NULL;

        l3p = l4[l4_table_offset(va)] >> PAGE_SHIFT;
        l3p = page_array[l3p]; // lineno 292

In a paravirt domain, the entries in the l4 page table are going to be
mfns, not gpfns, in which case, we don't need to try to convert the
entries in the l4 table into mfns via the page_array pfn list. The same
holds true for the rest of the tables.

I removed all page_array index lines and I was finally able to get a
backtrace of the guest.

(gdb) bt
#0 0xffffffff801348d5 in do_timer (regs=0xffff88002ea4bf58) at
   kernel/timer.c:775
#1 0xffffffff8010f31b in timer_interrupt (irq=782548824,
   dev_id=0x989680, regs=0xffff88002ea4bf58)
   at arch/x86_64/kernel/../../i386/kernel/time-xen.c:672
#2 0xffffffff8014e5b9 in handle_IRQ_event (irq=256,
   regs=0xffff88002ea4bf58, action=0xffff880001dfe600)
   at kernel/irq/handle.c:88
#3 0xffffffff8014e6b2 in __do_IRQ (irq=256, regs=0xffff88002ea4bf58) at
   kernel/irq/handle.c:173
#4 0xffffffff8010d6ae in do_IRQ (regs=0xffff88002ea4bf58) at
   arch/x86_64/kernel/irq-xen.c:105
#5 0xffffffff80249c50 in evtchn_do_upcall (regs=0xffff88002ea4bf58) at
   drivers/xen/core/evtchn.c:215
#6 0xffffffff8010b87e in do_hypervisor_callback ()
#7 0xffff88002ea4bf58 in ?? ()
#8 0x0000000000000000 in ?? ()

My guess is that those conversions are needed if the guest is in shadow
paging mode. If so, then we need to add a shadow and non-shadow mode to
the map_domain_va_64() call so we can debug non-hvm 64-bit guests via
gdb.

--
Ryan Harper
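The shadow versus non-shadow split proposed above could be sketched as a per-entry translation step. Everything named here is an assumption for illustration: the pt_mode enum, the entry_to_mfn helper, and the PTE_FRAME_MASK constant (bits 12 to 51 of an x86_64 page-table entry) do not appear in xc_ptrace.c, and how the mode would actually be detected is left open, as it is in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12
/* Assumed mask: bits 12..51 of an x86_64 page-table entry hold the frame. */
#define PTE_FRAME_MASK UINT64_C(0x000ffffffffff000)

/* Hypothetical mode flag; the thread only suggests that one is needed. */
enum pt_mode {
    PT_MODE_DIRECT,     /* paravirt: entries already hold mfns */
    PT_MODE_TRANSLATED  /* shadow/HVM: entries hold guest pfns */
};

/*
 * Hypothetical helper: turn one page-table entry into the machine frame
 * to map next. Returns 0 on success, or -1 if a guest pfn that needs
 * translating falls outside page_array.
 */
static int entry_to_mfn(uint64_t entry, enum pt_mode mode,
                        const unsigned long *page_array,
                        unsigned long nr_pages, uint64_t *mfn)
{
    uint64_t frame = (entry & PTE_FRAME_MASK) >> PAGE_SHIFT;

    assert(mfn != NULL);

    if (mode == PT_MODE_DIRECT) {
        *mfn = frame;               /* no page_array lookup at all */
        return 0;
    }

    if (page_array == NULL || frame >= nr_pages)
        return -1;                  /* would have been the segfault */

    *mfn = page_array[frame];       /* pfn -> mfn conversion */
    return 0;
}
```

Each level of the walk (l4, l3, l2, l1) would call the same helper, so the direct and translated paths differ in one place only. Masking the entry before shifting also drops the flag bits at both ends of the entry, which the plain `>> PAGE_SHIFT` in the quoted code does not do for the high bits.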
Kamble, Nitin A
2006-Apr-20 23:53 UTC
RE: [Xen-devel] gdbserver-xen x86_64 paravirt guest debugging
Hi Ryan,

You are right. It is needed for fully virtualized (HVM) guests, and I
incorrectly thought that you were running an HVM domain. I have not
tried debugging a 64-bit para-virtualized domain with gdbserver.

Your debug findings make sense to me. I think the map_domain_va
functions and the CR register handling need more code to support
para-virtualized domains.

Thanks & Regards,
Nitin
Open Source Technology Center, Intel Corp