I''ve been trying to use xenalyze to investigate some problems on our Xen 4.2.2 based product and have run into a couple of issues: First, I ran into problem running xenalyze on 64-bit Linux because some of the trace data structures were not properly packed - I worked around this by adding the packed attribute where needed but I suspect there is a more elegant solution (see attached patch). With that out of the way, I am now running it on a trace file and it craps out in the following code because e->pf_case has a large -ve value (presumably uninitialized): void shadow_mmio_postprocess(struct hvm_data *h) { struct pf_xen_extra *e = &h->inflight.pf_xen; if ( opt.summary_info ) { if(e->pf_case) => update_summary(&h->summary.pf_xen[e->pf_case], h->arc_cycles); else fprintf(warn, "Strange, pf_case 0!\n"); hvm_update_short_summary(h, HVM_SHORT_SUMMARY_MMIO); } if(opt.with_mmio_enumeration) enumerate_mmio(h); } The output from running the program is: Starting program: /home/sgraham/sandbox/Community/xenalyze/xenalyze ~/tmp/e.evt No output defined, using summary. Using VMX hardware-assisted virtualization. scan_for_new_pcpu: Activating pcpu 0 at offset 0 Creating vcpu 0 for dom 32768 scan_for_new_pcpu: Activating pcpu 1 at offset 1560 Creating vcpu 1 for dom 32768 scan_for_new_pcpu: Activating pcpu 2 at offset 1772 Creating vcpu 2 for dom 32768 scan_for_new_pcpu: Activating pcpu 3 at offset 68428 Creating vcpu 3 for dom 32768 scan_for_new_pcpu: Activating pcpu 5 at offset 68520 Creating vcpu 5 for dom 32768 scan_for_new_pcpu: Activating pcpu 10 at offset 73632 Creating vcpu 10 for dom 32768 scan_for_new_pcpu: Activating pcpu 11 at offset 90588 Creating vcpu 11 for dom 32768 init_pcpus: through first trace write, done for now. hvm_generic_postprocess_init: Strange, h->postprocess set! Program received signal SIGSEGV, Segmentation fault. 0x0000000000402888 in update_summary (s=0x7fe8aa769ca8, c=23368) at xenalyze.c:2244 This is using Xen 4.2.2 with several VMs running (32-bit Win7, 64-bit HVM Linux, 64-bit Win8). The trace file was generated by a 10s run of ''xentrace -e all'' and is 200MB so I haven''t attached it! I suspect the problem is related to the specific set of trace records being output but I''m having a hard time grok''ing the code - Any suggestions welcome! Simon Graham _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
On Thu, Jun 6, 2013 at 12:35 PM, Simon Graham <simon.graham@citrix.com> wrote:> I''ve been trying to use xenalyze to investigate some problems on our Xen 4.2.2 based product and have run into a couple of issues: > > First, I ran into problem running xenalyze on 64-bit Linux because some of the trace data structures were not properly packed - I worked around this by adding the packed attribute where needed but I suspect there is a more elegant solution (see attached patch). > > With that out of the way, I am now running it on a trace file and it craps out in the following code because e->pf_case has a large -ve value (presumably uninitialized): > > void shadow_mmio_postprocess(struct hvm_data *h) > { > struct pf_xen_extra *e = &h->inflight.pf_xen; > if ( opt.summary_info ) > { > if(e->pf_case) > => update_summary(&h->summary.pf_xen[e->pf_case], > h->arc_cycles); > else > fprintf(warn, "Strange, pf_case 0!\n"); > > hvm_update_short_summary(h, HVM_SHORT_SUMMARY_MMIO); > } > > if(opt.with_mmio_enumeration) > enumerate_mmio(h); > }> > The output from running the program is: > > Starting program: /home/sgraham/sandbox/Community/xenalyze/xenalyze ~/tmp/e.evt > No output defined, using summary. > Using VMX hardware-assisted virtualization. > scan_for_new_pcpu: Activating pcpu 0 at offset 0 > Creating vcpu 0 for dom 32768 > scan_for_new_pcpu: Activating pcpu 1 at offset 1560 > Creating vcpu 1 for dom 32768 > scan_for_new_pcpu: Activating pcpu 2 at offset 1772 > Creating vcpu 2 for dom 32768 > scan_for_new_pcpu: Activating pcpu 3 at offset 68428 > Creating vcpu 3 for dom 32768 > scan_for_new_pcpu: Activating pcpu 5 at offset 68520 > Creating vcpu 5 for dom 32768 > scan_for_new_pcpu: Activating pcpu 10 at offset 73632 > Creating vcpu 10 for dom 32768 > scan_for_new_pcpu: Activating pcpu 11 at offset 90588 > Creating vcpu 11 for dom 32768 > init_pcpus: through first trace write, done for now. > hvm_generic_postprocess_init: Strange, h->postprocess set!Hmm, it looks like this is the problem here. Can you try adding this patch, and then running the following? xenalyze -a -s --tolerance=1 [filename] > [filename.dump]? The resulting file may be pretty big; I''ll just need the last 50 or so lines. -George _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
> > init_pcpus: through first trace write, done for now. > > hvm_generic_postprocess_init: Strange, h->postprocess set! > > Hmm, it looks like this is the problem here. > > Can you try adding this patch, and then running the following? > > xenalyze -a -s --tolerance=1 [filename] > [filename.dump]? > > The resulting file may be pretty big; I''ll just need the last 50 or so lines. >Actually it was pretty small - must be the 1st record that is bad! Tolerating errors at or below 1 scan_for_new_pcpu: Activating pcpu 0 at offset 0 Creating vcpu 0 for dom 32768 scan_for_new_pcpu: Activating pcpu 1 at offset 1560 Creating vcpu 1 for dom 32768 scan_for_new_pcpu: Activating pcpu 2 at offset 1772 Creating vcpu 2 for dom 32768 scan_for_new_pcpu: Activating pcpu 3 at offset 68428 Creating vcpu 3 for dom 32768 scan_for_new_pcpu: Activating pcpu 5 at offset 68520 Creating vcpu 5 for dom 32768 scan_for_new_pcpu: Activating pcpu 10 at offset 73632 Creating vcpu 10 for dom 32768 scan_for_new_pcpu: Activating pcpu 11 at offset 90588 Creating vcpu 11 for dom 32768 init_pcpus: through first trace write, done for now. ] .... . x. d32768v10 vmexit exit_reason EXCEPTION_NMI eip fffff88005a25ad0 ] .... . x. d32768v10 fast mmio va fffff88008808df0 ] .... . x. d32768v10 cpuid [ 1 206c2 400800 83b82203 178bfbff ] hvm_generic_postprocess_init: Strange, h->postprocess set! Simon
On Fri, Jun 7, 2013 at 10:14 AM, Simon Graham <simon.graham@citrix.com> wrote:>> > init_pcpus: through first trace write, done for now. >> > hvm_generic_postprocess_init: Strange, h->postprocess set! >> >> Hmm, it looks like this is the problem here. >> >> Can you try adding this patch, and then running the following? >> >> xenalyze -a -s --tolerance=1 [filename] > [filename.dump]? >> >> The resulting file may be pretty big; I''ll just need the last 50 or so lines. >> > > Actually it was pretty small - must be the 1st record that is bad! > > Tolerating errors at or below 1 > scan_for_new_pcpu: Activating pcpu 0 at offset 0 > Creating vcpu 0 for dom 32768 > scan_for_new_pcpu: Activating pcpu 1 at offset 1560 > Creating vcpu 1 for dom 32768 > scan_for_new_pcpu: Activating pcpu 2 at offset 1772 > Creating vcpu 2 for dom 32768 > scan_for_new_pcpu: Activating pcpu 3 at offset 68428 > Creating vcpu 3 for dom 32768 > scan_for_new_pcpu: Activating pcpu 5 at offset 68520 > Creating vcpu 5 for dom 32768 > scan_for_new_pcpu: Activating pcpu 10 at offset 73632 > Creating vcpu 10 for dom 32768 > scan_for_new_pcpu: Activating pcpu 11 at offset 90588 > Creating vcpu 11 for dom 32768 > init_pcpus: through first trace write, done for now. > ] .... . x. d32768v10 vmexit exit_reason EXCEPTION_NMI eip fffff88005a25ad0 > ] .... . x. d32768v10 fast mmio va fffff88008808df0 > ] .... . x. d32768v10 cpuid [ 1 206c2 400800 83b82203 178bfbff ] > hvm_generic_postprocess_init: Strange, h->postprocess set!...I''m trying to figure out how on earth you got a trace like that. It looks like: 1. The VM took a page fault (EXCEPTION_NMI) 2. Xen determined that it was an MMIO, which would mean that the guest PT was valid, but pointed to PFN space that Xen didn''t recognize 3. It somehow emulated a CPUID instruction??? CPUID doesn''t do any memory accesses, so it shouldn''t be able to trigger an MMIO fault like this. Do you have any way of telling what instruction was at the address at the EIP (fffff88005a25ad0)? Also, could you run xenalyze as follows, and attach the first couple hundred lines: xenalyze -a [filename] > [filename.dump] (Not having the -s will remove the path that is crashing, to allow us to see what happened after this trace.) -George
> > init_pcpus: through first trace write, done for now. > > ] .... . x. d32768v10 vmexit exit_reason EXCEPTION_NMI eip > fffff88005a25ad0 > > ] .... . x. d32768v10 fast mmio va fffff88008808df0 > > ] .... . x. d32768v10 cpuid [ 1 206c2 400800 83b82203 178bfbff ] > > hvm_generic_postprocess_init: Strange, h->postprocess set! > > ...I''m trying to figure out how on earth you got a trace like that. > It looks like: > 1. The VM took a page fault (EXCEPTION_NMI) > 2. Xen determined that it was an MMIO, which would mean that the guest > PT was valid, but pointed to PFN space that Xen didn''t recognize > 3. It somehow emulated a CPUID instruction??? > > CPUID doesn''t do any memory accesses, so it shouldn''t be able to > trigger an MMIO fault like this.According to xentrace, the first VMEXIT was a TRAP_nodevice followed indeed by a page fault that triggered emulation of a CPUID which is odd... CPU2 243316272356568 (+ 26064) VMENTRY CPU2 243316272362080 (+ 5512) VMEXIT [ exitcode = 0x00000000, rIP = 0xfffff80188e57e2e ] CPU2 243316272362080 (+ 0) TRAP [ vector = 0x00000007 ] CPU2 243316272366728 (+ 4648) VMENTRY CPU2 243316272377592 (+ 10864) VMEXIT [ exitcode = 0x00000000, rIP = 0xfffff88005a25ad0 ] CPU2 243316272377592 (+ 0) shadow_fast_mmio [ va = 0xfffff88008808df0 ] CPU2 243316272377592 (+ 0) CPUID [ func = 0x00000001, eax = 0x000206c2, ebx = 0x00400800, ecx=0x83b82203, edx = 0x178bfbff ] CPU2 243316272377592 (+ 0) PF_INJECT [ errorcode = 0x0b, virt = 0xfffff88008808df0 ] followed by many repeats of the same CPUID... this might explain my performance problem!!! BTW: I should say that this system is setup to use shadow for VRAM tracking and does NOT have HAP enabled (because that slows down VRAM tracking too much) I''ve seen problems in the past where the emulation setup in the shadow code is woefully limited leading to just this sort of repeated loop...> > Do you have any way of telling what instruction was at the address at > the EIP (fffff88005a25ad0)? >Probably with some effort - I think this was a 64-bit Win8 VM and I can use windbag to figure this out: will get back to you on that!> Also, could you run xenalyze as follows, and attach the first couple > hundred lines: > > xenalyze -a [filename] > [filename.dump] >Attached. Simon _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel