thr3ads.net - Xen devel - Xenalyze questions [Jun 2013]

If this information is useful, please help other people find it:
Share via:

Simon Graham

2013-Jun-06 11:35 UTC

Xenalyze questions

I''ve been trying to use xenalyze to investigate some problems on our
Xen 4.2.2 based product and have run into a couple of issues:

First, I ran into problem running xenalyze on 64-bit Linux because some of the
trace data structures were not properly packed  - I worked around this by adding
the packed attribute where needed but I suspect there is a more elegant solution
(see attached patch).

With that out of the way, I am now running it on a trace file and it craps out
in the following code because e->pf_case has a large -ve value (presumably
uninitialized):
 
void shadow_mmio_postprocess(struct hvm_data *h)
{
    struct pf_xen_extra *e = &h->inflight.pf_xen;
    if ( opt.summary_info )
    {
        if(e->pf_case)
=>          update_summary(&h->summary.pf_xen[e->pf_case],
                           h->arc_cycles);
        else
            fprintf(warn, "Strange, pf_case 0!\n");
 
        hvm_update_short_summary(h, HVM_SHORT_SUMMARY_MMIO);
    }
 
    if(opt.with_mmio_enumeration)
        enumerate_mmio(h);
}
 
The output from running the program is:
 
Starting program: /home/sgraham/sandbox/Community/xenalyze/xenalyze ~/tmp/e.evt
No output defined, using summary.
Using VMX hardware-assisted virtualization.
scan_for_new_pcpu: Activating pcpu 0 at offset 0
Creating vcpu 0 for dom 32768
scan_for_new_pcpu: Activating pcpu 1 at offset 1560
Creating vcpu 1 for dom 32768
scan_for_new_pcpu: Activating pcpu 2 at offset 1772
Creating vcpu 2 for dom 32768
scan_for_new_pcpu: Activating pcpu 3 at offset 68428
Creating vcpu 3 for dom 32768
scan_for_new_pcpu: Activating pcpu 5 at offset 68520
Creating vcpu 5 for dom 32768
scan_for_new_pcpu: Activating pcpu 10 at offset 73632
Creating vcpu 10 for dom 32768
scan_for_new_pcpu: Activating pcpu 11 at offset 90588
Creating vcpu 11 for dom 32768
init_pcpus: through first trace write, done for now.
hvm_generic_postprocess_init: Strange, h->postprocess set!
 
Program received signal SIGSEGV, Segmentation fault.
0x0000000000402888 in update_summary (s=0x7fe8aa769ca8, c=23368) at
xenalyze.c:2244
 
This is using Xen 4.2.2 with several VMs running (32-bit Win7, 64-bit HVM Linux,
64-bit Win8). The trace file was generated by a 10s run of ''xentrace -e
all'' and is 200MB so I haven''t attached it!
 
I suspect the problem is related to the specific set of trace records being
output but I''m having a hard time grok''ing the code - Any
suggestions welcome!

Simon Graham





_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

George Dunlap

2013-Jun-06 13:46 UTC

head link

Re: Xenalyze questions

On Thu, Jun 6, 2013 at 12:35 PM, Simon Graham <simon.graham@citrix.com>
wrote:> I''ve been trying to use xenalyze to investigate some problems on
our Xen 4.2.2 based product and have run into a couple of issues:
>
> First, I ran into problem running xenalyze on 64-bit Linux because some of
the trace data structures were not properly packed  - I worked around this by
adding the packed attribute where needed but I suspect there is a more elegant
solution (see attached patch).
>
> With that out of the way, I am now running it on a trace file and it craps
out in the following code because e->pf_case has a large -ve value
(presumably uninitialized):
>
> void shadow_mmio_postprocess(struct hvm_data *h)
> {
>     struct pf_xen_extra *e = &h->inflight.pf_xen;
>     if ( opt.summary_info )
>     {
>         if(e->pf_case)
> =>          update_summary(&h->summary.pf_xen[e->pf_case],
>                            h->arc_cycles);
>         else
>             fprintf(warn, "Strange, pf_case 0!\n");
>
>         hvm_update_short_summary(h, HVM_SHORT_SUMMARY_MMIO);
>     }
>
>     if(opt.with_mmio_enumeration)
>         enumerate_mmio(h);
> }

>
> The output from running the program is:
>
> Starting program: /home/sgraham/sandbox/Community/xenalyze/xenalyze
~/tmp/e.evt
> No output defined, using summary.
> Using VMX hardware-assisted virtualization.
> scan_for_new_pcpu: Activating pcpu 0 at offset 0
> Creating vcpu 0 for dom 32768
> scan_for_new_pcpu: Activating pcpu 1 at offset 1560
> Creating vcpu 1 for dom 32768
> scan_for_new_pcpu: Activating pcpu 2 at offset 1772
> Creating vcpu 2 for dom 32768
> scan_for_new_pcpu: Activating pcpu 3 at offset 68428
> Creating vcpu 3 for dom 32768
> scan_for_new_pcpu: Activating pcpu 5 at offset 68520
> Creating vcpu 5 for dom 32768
> scan_for_new_pcpu: Activating pcpu 10 at offset 73632
> Creating vcpu 10 for dom 32768
> scan_for_new_pcpu: Activating pcpu 11 at offset 90588
> Creating vcpu 11 for dom 32768
> init_pcpus: through first trace write, done for now.
> hvm_generic_postprocess_init: Strange, h->postprocess set!
Hmm, it looks like this is the problem here.

Can you try adding this patch, and then running the following?

xenalyze -a -s --tolerance=1 [filename] > [filename.dump]?

The resulting file may be pretty big; I''ll just need the last 50 or so
lines.

 -George


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Simon Graham

2013-Jun-07 09:14 UTC

head link

Re: Xenalyze questions

> > init_pcpus: through first trace write, done for now.
> > hvm_generic_postprocess_init: Strange, h->postprocess set!
> 
> Hmm, it looks like this is the problem here.
> 
> Can you try adding this patch, and then running the following?
> 
> xenalyze -a -s --tolerance=1 [filename] > [filename.dump]?
> 
> The resulting file may be pretty big; I''ll just need the last 50
or so lines.
> 
Actually it was pretty small - must be the 1st record that is bad!

Tolerating errors at or below 1
scan_for_new_pcpu: Activating pcpu 0 at offset 0
Creating vcpu 0 for dom 32768
scan_for_new_pcpu: Activating pcpu 1 at offset 1560
Creating vcpu 1 for dom 32768
scan_for_new_pcpu: Activating pcpu 2 at offset 1772
Creating vcpu 2 for dom 32768
scan_for_new_pcpu: Activating pcpu 3 at offset 68428
Creating vcpu 3 for dom 32768
scan_for_new_pcpu: Activating pcpu 5 at offset 68520
Creating vcpu 5 for dom 32768
scan_for_new_pcpu: Activating pcpu 10 at offset 73632
Creating vcpu 10 for dom 32768
scan_for_new_pcpu: Activating pcpu 11 at offset 90588
Creating vcpu 11 for dom 32768
init_pcpus: through first trace write, done for now.
]               .... .    x. d32768v10 vmexit exit_reason EXCEPTION_NMI eip
fffff88005a25ad0
]               .... .    x. d32768v10 fast mmio va fffff88008808df0
]               .... .    x. d32768v10 cpuid [ 1 206c2 400800 83b82203 178bfbff
]
hvm_generic_postprocess_init: Strange, h->postprocess set!

Simon

George Dunlap

2013-Jun-07 11:07 UTC

head link

Re: Xenalyze questions

On Fri, Jun 7, 2013 at 10:14 AM, Simon Graham <simon.graham@citrix.com>
wrote:>> > init_pcpus: through first trace write, done for now.
>> > hvm_generic_postprocess_init: Strange, h->postprocess set!
>>
>> Hmm, it looks like this is the problem here.
>>
>> Can you try adding this patch, and then running the following?
>>
>> xenalyze -a -s --tolerance=1 [filename] > [filename.dump]?
>>
>> The resulting file may be pretty big; I''ll just need the last
50 or so lines.
>>
>
> Actually it was pretty small - must be the 1st record that is bad!
>
> Tolerating errors at or below 1
> scan_for_new_pcpu: Activating pcpu 0 at offset 0
> Creating vcpu 0 for dom 32768
> scan_for_new_pcpu: Activating pcpu 1 at offset 1560
> Creating vcpu 1 for dom 32768
> scan_for_new_pcpu: Activating pcpu 2 at offset 1772
> Creating vcpu 2 for dom 32768
> scan_for_new_pcpu: Activating pcpu 3 at offset 68428
> Creating vcpu 3 for dom 32768
> scan_for_new_pcpu: Activating pcpu 5 at offset 68520
> Creating vcpu 5 for dom 32768
> scan_for_new_pcpu: Activating pcpu 10 at offset 73632
> Creating vcpu 10 for dom 32768
> scan_for_new_pcpu: Activating pcpu 11 at offset 90588
> Creating vcpu 11 for dom 32768
> init_pcpus: through first trace write, done for now.
> ]               .... .    x. d32768v10 vmexit exit_reason EXCEPTION_NMI eip
fffff88005a25ad0
> ]               .... .    x. d32768v10 fast mmio va fffff88008808df0
> ]               .... .    x. d32768v10 cpuid [ 1 206c2 400800 83b82203
178bfbff ]
> hvm_generic_postprocess_init: Strange, h->postprocess set!
...I''m trying to figure out how on earth you got a trace like that.
It looks like:
1. The VM took a page fault (EXCEPTION_NMI)
2. Xen determined that it was an MMIO, which would mean that the guest
PT was valid, but pointed to PFN space that Xen didn''t recognize
3. It somehow emulated a CPUID instruction???

CPUID doesn''t do any memory accesses, so it shouldn''t be able
to
trigger an MMIO fault like this.

Do you have any way of telling what instruction was at the address at
the EIP (fffff88005a25ad0)?

Also, could you run xenalyze as follows, and attach the first couple
hundred lines:

xenalyze -a [filename] > [filename.dump]

(Not having the -s will remove the path that is crashing, to allow us
to see what happened after this trace.)

 -George

Simon Graham

2013-Jun-07 11:46 UTC

head link

Re: Xenalyze questions

> > init_pcpus: through first trace write, done for now.
> > ]               .... .    x. d32768v10 vmexit exit_reason
EXCEPTION_NMI eip
> fffff88005a25ad0
> > ]               .... .    x. d32768v10 fast mmio va fffff88008808df0
> > ]               .... .    x. d32768v10 cpuid [ 1 206c2 400800 83b82203
178bfbff ]
> > hvm_generic_postprocess_init: Strange, h->postprocess set!
> 
> ...I''m trying to figure out how on earth you got a trace like
that.
> It looks like:
> 1. The VM took a page fault (EXCEPTION_NMI)
> 2. Xen determined that it was an MMIO, which would mean that the guest
> PT was valid, but pointed to PFN space that Xen didn''t recognize
> 3. It somehow emulated a CPUID instruction???
> 
> CPUID doesn''t do any memory accesses, so it shouldn''t be
able to
> trigger an MMIO fault like this.
According to xentrace, the first VMEXIT was a TRAP_nodevice followed indeed by a
page fault that triggered emulation of a CPUID which is odd...

CPU2  243316272356568 (+   26064)  VMENTRY
CPU2  243316272362080 (+    5512)  VMEXIT      [ exitcode = 0x00000000, rIP  =
0xfffff80188e57e2e ]
CPU2  243316272362080 (+       0)  TRAP        [ vector = 0x00000007 ]
CPU2  243316272366728 (+    4648)  VMENTRY
CPU2  243316272377592 (+   10864)  VMEXIT      [ exitcode = 0x00000000, rIP  =
0xfffff88005a25ad0 ]
CPU2  243316272377592 (+       0)  shadow_fast_mmio                  [ va =
0xfffff88008808df0 ]
CPU2  243316272377592 (+       0)  CPUID       [ func = 0x00000001, eax =
0x000206c2, ebx = 0x00400800, ecx=0x83b82203, edx = 0x178bfbff ]
CPU2  243316272377592 (+       0)  PF_INJECT   [ errorcode = 0x0b, virt =
0xfffff88008808df0 ]

followed by many repeats of the same CPUID... this might explain my performance
problem!!!

BTW: I should say that this system is setup to use shadow for VRAM tracking and
does NOT have HAP enabled (because that slows down VRAM tracking too much)

I''ve seen problems in the past where the emulation setup in the shadow
code is woefully limited leading to just this sort of repeated loop...
> 
> Do you have any way of telling what instruction was at the address at
> the EIP (fffff88005a25ad0)?
> 
Probably with some effort - I think this was a 64-bit Win8 VM and I can use
windbag to figure this out: will get back to you on that!
> Also, could you run xenalyze as follows, and attach the first couple
> hundred lines:
> 
> xenalyze -a [filename] > [filename.dump]
> 
Attached.

Simon


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Xen devel - Jun 2013 - Xenalyze questions

Xenalyze questions

Re: Xenalyze questions

Re: Xenalyze questions

Re: Xenalyze questions

Re: Xenalyze questions