thr3ads.net - Xen devel - [Xen-devel] A different probklem with save/restore on C/S 14823. [Apr 2007]

Petersson, Mats

2007-Apr-13 16:24 UTC

[Xen-devel] A different probklem with save/restore on C/S 14823.

I'm not seeing the problem that Fan Zhao is reporting; instead I get
this one. Not sure if it's the same one or a different problem... This
happens with my simple-guest [i.e. not using hvmloader, as I described
before]. This worked fine yesterday.

(XEN) event_channel.c:178:d0 EVTCHNOP failure: domain 0, error -22, line 178
(XEN) bad shared page: 0
(XEN) domain_crash_sync called from platform.c:844
(XEN) Domain 20 (vcpu#0) crashed on cpu#1:
(XEN) ----[ Xen-3.0-unstable  x86_64  debug=n  Tainted:    C ]----
(XEN) CPU:    1
(XEN) RIP:    0010:[<0000000000102638>]
(XEN) RFLAGS: 0000000000000002   CONTEXT: hvm
(XEN) rax: 0000000000002520   rbx: 000000000000234e   rcx: 0000000000107180
(XEN) rdx: 0000000000000020   rsi: 0000000000105b40   rdi: 000000000000234d
(XEN) rbp: 0000000000105844   rsp: 000000000010582c   r8:  0000000000000000
(XEN) r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000000
(XEN) r12: 0000000000000000   r13: 0000000000000000   r14: 0000000000000000
(XEN) r15: 0000000000000000   cr0: 0000000000050033   cr4: 0000000000000640
(XEN) cr3: 0000000000000000   cr2: 0000000000000000
(XEN) ds: 0008   es: 0008   fs: 0008   gs: 0008   ss: 0008   cs: 0010

--
Mats



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Tim Deegan

2007-Apr-13 16:35 UTC

Re: [Xen-devel] A different probklem with save/restore on C/S 14823.

Hi, 

At 18:24 +0200 on 13 Apr (1176488676), Petersson, Mats wrote:
> I'm not seeing the problem that Fan Zhao is reporting, instead I get
> this one. Not sure if it's the same one or a different problem... This
> happens with my simple-guest [i.e. not using hvmloader, as I described
> before]. This worked fine yesterday.

This looks like the same problem (but caught in Xen instead of
crashing).  The restore path isn't setting the ioreq page's PFN
properly.  Have you reinstalled your tools (in particular libxenguest)
since cset 14830:e3b3800c769a?

Cheers,

Tim.

> (XEN) event_channel.c:178:d0 EVTCHNOP failure: domain 0, error -22, line 178
> (XEN) bad shared page: 0
> (XEN) domain_crash_sync called from platform.c:844
> (XEN) Domain 20 (vcpu#0) crashed on cpu#1:

-- 
Tim Deegan <Tim.Deegan@xensource.com>, XenSource UK Limited
Registered office c/o EC2Y 5EB, UK; company number 05334508


Tim Deegan

2007-Apr-13 16:40 UTC

Re: [Xen-devel] A different probklem with save/restore on C/S 14823.

At 17:35 +0100 on 13 Apr (1176485709), Tim Deegan wrote:
> Have you reinstalled your tools (in particular libxenguest)
> since cset 14830:e3b3800c769a ?

...which hasn't made it out of the staging tree yet; my apologies.
This issue should be fixed when it does.

Tim.

-- 
Tim Deegan <Tim.Deegan@xensource.com>, XenSource UK Limited
Registered office c/o EC2Y 5EB, UK; company number 05334508


Keir Fraser

2007-Apr-13 16:43 UTC

Re: [Xen-devel] A different probklem with save/restore on C/S 14823.

On 13/4/07 17:35, "Tim Deegan" <Tim.Deegan@xensource.com> wrote:
> At 18:24 +0200 on 13 Apr (1176488676), Petersson, Mats wrote:
>> I'm not seeing the problem that Fan Zhao is reporting, instead I get
>> this one. Not sure if it's the same one or a different problem... This
>> happens with my simple-guest [i.e. not using hvmloader, as I described
>> before]. This worked fine yesterday.
> 
> This looks like the same problem (but caught in Xen instead of
> crashing).  The restore path isn't setting the ioreq page's PFN
> properly.   Have you reinstalled your tools (in particular libxenguest)
> since cset 14830:e3b3800c769a ?

It is also somewhat odd that Xen got a chance to catch the problem (probably
the printed guest EIP is an I/O port operation? In which case Xen caught the
problem in send_pio_req), rather than crashing in hvm_do_resume() with a
NULL pointer dereference, which is what Fan Zhao saw. Either the guest
started executing without passing through hvm_do_resume(), or there was a
valid page mapping at address 0 in Xen's address space when you executed
hvm_do_resume(). Neither of these possibilities is good. It might be worth
doing a bit of digging to find out why you didn't repro the exact same crash
as Fan Zhao.

 -- Keir




Petersson, Mats

2007-Apr-13 16:47 UTC

RE: [Xen-devel] A different probklem with save/restore on C/S 14823.

> -----Original Message-----
> From: Keir Fraser [mailto:Keir.Fraser@cl.cam.ac.uk] 
> Sent: 13 April 2007 17:43
> To: Tim Deegan; Petersson, Mats
> Cc: xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] A different probklem with 
> save/restore on C/S 14823.
> 
> On 13/4/07 17:35, "Tim Deegan" <Tim.Deegan@xensource.com> wrote:
> 
> > At 18:24 +0200 on 13 Apr (1176488676), Petersson, Mats wrote:
> >> I'm not seeing the problem that Fan Zhao is reporting, instead I get
> >> this one. Not sure if it's the same one or a different problem... This
> >> happens with my simple-guest [i.e. not using hvmloader, as I described
> >> before]. This worked fine yesterday.
> > 
> > This looks like the same problem (but caught in Xen instead of
> > crashing).  The restore path isn't setting the ioreq page's PFN
> > properly.   Have you reinstalled your tools (in particular libxenguest)
> > since cset 14830:e3b3800c769a ?
> 
> It is also somewhat odd that Xen got a chance to catch the problem (probably
> the printed guest EIP is an I/O port operation? In which case Xen caught the
> problem in send_pio_req), rather than crashing in hvm_do_resume() with a
> NULL pointer dereference, which is what Fan Zhao saw. Either the guest
> started executing without passing through hvm_do_resume(), or there was a
> valid page mapping at address 0 in Xen's address space when you executed
> hvm_do_resume(). Neither of these possibilities is good. It might be worth
> doing a bit of digging to find out why you didn't repro the exact same crash
> as Fan Zhao.

See my other reply, although you may have a point about mapping - my
guest is running with the HVMloader's map, which probably maps all memory
available to guest linearly, including address zero (as that's where
real-mode puts the interrupt vector table, which can be useful to have
mapped - just a little bit ;-) ).

So maybe we need an earlier/different test to kill guest? Or do you
think this is such a critical error that hypervisor should die?

--
Mats



Petersson, Mats

2007-Apr-13 16:47 UTC

RE: [Xen-devel] A different probklem with save/restore on C/S 14823.

> -----Original Message-----
> From: Tim Deegan [mailto:Tim.Deegan@xensource.com] 
> Sent: 13 April 2007 17:35
> To: Petersson, Mats
> Cc: xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] A different probklem with 
> save/restore on C/S 14823.
> 
> Hi, 
> 
> At 18:24 +0200 on 13 Apr (1176488676), Petersson, Mats wrote:
> > I'm not seeing the problem that Fan Zhao is reporting, instead I get
> > this one. Not sure if it's the same one or a different problem... This
> > happens with my simple-guest [i.e. not using hvmloader, as I described
> > before]. This worked fine yesterday.
> 
> This looks like the same problem (but caught in Xen instead of
> crashing).  The restore path isn't setting the ioreq page's PFN
> properly.   Have you reinstalled your tools (in particular libxenguest)
> since cset 14830:e3b3800c769a ?

After some more testing, I would agree that it's probably the same
problem - I found that I could reproduce the same problem if I use a
sles93 guest rather than the simple guest - not sure why it's different
(perhaps because the sles guest uses MMIO and my simple guest never uses
MMIO?)

It would perhaps be nice to understand why one type of guest only
crashes the guest (kind of not so bad) and the other crashes the
hypervisor (quite bad), and make BOTH crash the guest only? [Not that it
REALLY fixes a problem, but leaving the hypervisor running if the guest
is broken is a good thing, I think.] [Perhaps not a priority for 3.0.5,
but I think it should be fixed - I'll have a look to see if I can
understand what's going on and cobble together something that
catches both cases.]

I'm always a bit wary of using (wholesale) changesets from the staging
tree, under the assumption that if they don't pass the basic testing,
it's quite possible that they won't work for me either. Obviously, if I
understand a particular change, and I think it won't need a whole bunch
of others, I may pick a single changeset (or small subset) from staging
for test purposes. But 14823 is the latest available on unstable. [You
must be mind-reading, as you just apologized for not realizing it's not
out of staging yet.]

--
Mats

> Cheers,
> 
> Tim.
> 
> > (XEN) event_channel.c:178:d0 EVTCHNOP failure: domain 0, error -22, line 178
> > (XEN) bad shared page: 0
> > (XEN) domain_crash_sync called from platform.c:844
> > (XEN) Domain 20 (vcpu#0) crashed on cpu#1:
> 
> -- 
> Tim Deegan <Tim.Deegan@xensource.com>, XenSource UK Limited
> Registered office c/o EC2Y 5EB, UK; company number 05334508


Keir Fraser

2007-Apr-13 16:55 UTC

Re: [Xen-devel] A different probklem with save/restore on C/S 14823.

On 13/4/07 17:47, "Petersson, Mats" <Mats.Petersson@amd.com> wrote:
> See my other reply, although you may have a point about mapping - my
> guest is running with the HVMloaders map, which probably maps all memory
> available to guest linearly, including address zero (as that's where
> real-mode puts the interrupt vector table, which can be useful to have
> mapped - just a little bit ;-) ).
> 
> So maybe we need an earlier/different test to kill guest? Or do you
> think this is such a critical error that hypervisor should die?

The NULL dereference is inside the hypervisor in hvm_do_resume(). At that
point you are running in Xen's address space, not the guest's. And Xen
should have no mapping at address zero.

The issue here is that shared_page_va is not initialised, so it contains 0.
hvm_do_resume() should be getting a pointer derived from this value via
get_vio(). When it dereferences it, Xen should crash. That didn't happen for
you and that is scarily inexplicable.

I suggest adding some tracing to hvm_do_resume() to find out whether it is
being called at all and, if it is, what value it sets its local variable 'p'
to. Also what value is in v->domain->arch.hvm_domain.shared_page_va.

The bugs that cause this condition should all be fixed in xen-unstable
staging tip, by the way. I just think this situation should be investigated
before you upgrade in case you've uncovered another latent bug. Because you
really should be crashing in hvm_do_resume() in this scenario.

 -- Keir
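
[Editor's sketch] The failure mode described above (a per-vcpu pointer derived from a shared page address that was never initialised) can be modelled in isolation. The following is a minimal sketch and not Xen's code: `struct ioreq_sketch`, `struct hvm_domain_sketch` and `get_vio_checked()` are hypothetical names invented for illustration, and the one-slot-per-vcpu layout is an assumption. It only shows the idea of refusing to hand out a pointer when the base address is still 0.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; these are NOT Xen's definitions. */
struct ioreq_sketch {
    uint64_t addr;
    uint64_t data;
    uint8_t  state;
};

struct hvm_domain_sketch {
    uintptr_t shared_page_va;   /* 0 means "never initialised" */
};

/*
 * Derive the per-vcpu I/O request slot from the shared page address.
 * Returns NULL instead of a bogus pointer when shared_page_va is 0,
 * so that the caller can crash the guest rather than touch address 0.
 * Assumes (for this sketch only) one slot per vcpu, laid out as an array.
 */
static inline struct ioreq_sketch *
get_vio_checked(const struct hvm_domain_sketch *hd, unsigned int vcpu_id)
{
    if (hd->shared_page_va == 0)
        return NULL;
    return (struct ioreq_sketch *)hd->shared_page_va + vcpu_id;
}
```

A caller would test the result for NULL before using it. Whether such a check is cheap enough for the real resume path is exactly the open question in this thread.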


Petersson, Mats

2007-Apr-13 17:24 UTC

RE: [Xen-devel] A different probklem with save/restore on C/S 14823.

> -----Original Message-----
> From: Keir Fraser [mailto:Keir.Fraser@cl.cam.ac.uk] 
> Sent: 13 April 2007 17:56
> To: Petersson, Mats; Tim Deegan
> Cc: xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] A different probklem with 
> save/restore on C/S 14823.
> 
> On 13/4/07 17:47, "Petersson, Mats" <Mats.Petersson@amd.com> wrote:
> 
> > See my other reply, although you may have a point about mapping - my
> > guest is running with the HVMloaders map, which probably maps all memory
> > available to guest linearly, including address zero (as that's where
> > real-mode puts the interrupt vector table, which can be useful to have
> > mapped - just a little bit ;-) ).
> > 
> > So maybe we need an earlier/different test to kill guest? Or do you
> > think this is such a critical error that hypervisor should die?
> 
> The NULL dereference is inside the hypervisor in hvm_do_resume(). At that
> point you are running in Xen's address space, not the guest's. And Xen
> should have no mapping at address zero.

Yes, of course - me not thinking right - sorry [it is late-ish on a
Friday, that's my excuse and I'm sticking to it].
>
> The issue here is that shared_page_va is not initialised, so it contains 0.
> hvm_do_resume() should be getting a pointer derived from this value via
> get_vio(). When it dereferences it, Xen should crash. That didn't happen for
> you and that is scarily inexplicable.

Yes, I follow that.

However, my guest does A LOT of IOIO exits (it's an IDE test-app), with
some HLT and IRQ exits thrown in for good measure. So if the guest is
doing IOIO exit it would end up in platform.c:844 before it gets to
hvm_do_resume?  Or are you saying that we should crash as soon as the
guest restarts, because that's done through hvm_do_resume?
>
> I suggest adding some tracing to hvm_do_resume() to find out whether it is
> being called at all and, if it is, what value it sets its local variable 'p'
> to. Also what value is in v->domain->arch.hvm_domain.shared_page_va.

Would a check for zero in get_vio() with domain_crash_synchronous() be a
"good thing" here, or is that too time-consuming in a relatively
time-critical path of HVM?

I will look at it on Monday (before I update to the new version, just to
make sure I can reproduce it still ;-) ).

>
> The bugs that cause this condition should all be fixed in xen-unstable
> staging tip, by the way. I just think this situation should be investigated
> before you upgrade in case you've uncovered another latent bug. Because you
> really should be crashing in hvm_do_resume() in this scenario.
> 
>  -- Keir



Keir Fraser

2007-Apr-13 17:34 UTC

Re: [Xen-devel] A different probklem with save/restore on C/S 14823.

On 13/4/07 18:24, "Petersson, Mats" <Mats.Petersson@amd.com> wrote:
> However, my guest does A LOT of IOIO exits (it's an IDE test-app), with
> some HLT and IRQ exits thrown in for good measure. So if the guest is
> doing IOIO exit it would end up in platform.c:844 before it gets to
> hvm_do_resume?  Or are you saying that we should crash as soon as the
> guest restarts, because that's done through hvm_do_resume?

Exactly. hvm_do_resume() should always be executed when an HVM VCPU gets
scheduled onto a physical CPU (it's part of the schedule_tail). So it should
execute before your first vmentry, and hence before your first vmexit. And
shared_page_va is sticky: once it's set it stays set. It shouldn't ever get
zapped to zero while a guest is running.

> Would a check for zero in get_vio() with domain_crash_synchronous() be a
> "good thing" here, or is that too time-consuming in a relatively
> time-critical path of HVM?

Could do. But hvm_do_resume() is the one to concentrate on.

 -- Keir

> I will look at it on Monday (before I update to the new version, just to
> make sure I can reproduce it still ;-) ).



Petersson, Mats

2007-Apr-16 17:20 UTC

RE: [Xen-devel] A different probklem with save/restore on C/S 14823.

> Could do. But hvm_do_resume() is the one to concentrate on.
> 
>  -- Keir
> 
> > I will look at it on Monday (before I update to the new version, just to
> > make sure I can reproduce it still ;-) ).

Ok, so some further checks, and it looks like address zero (0-0xfff) is
mapped in, read/write. I haven't looked at the page-table, just tried to
write to the pointer that is zero and it didn't "crash".

Any thoughts on where I should head off on this for tomorrow?

[I'm sorry I didn't get more done on this, but I forgot on Friday that I
had meetings most of the day today (one with a customer from 10AM to
3PM, and then two internal AMD meetings of around 1hr each).]

--
Mats




Keir Fraser

2007-Apr-16 17:42 UTC

Re: [Xen-devel] A different probklem with save/restore on C/S 14823.

On 16/4/07 18:20, "Petersson, Mats" <Mats.Petersson@amd.com> wrote:
>>> I will look at it on Monday (before I update to the new version, just to
>>> make sure I can reproduce it still ;-) ).
>> 
> Ok, so some further checks, and it looks like address zero (0-0xfff) is
> mapped in, read/write. I haven't looked at the page-table, just tried to
> write to the pointer that is zero and it didn't "crash".
> 
> Any thoughts on where I should head off on this for tomorrow.

What sub-arch is Xen built for (32, pae, 64)? Are you using NPT or shadow
mode? I'd suggest writing a bit of code to dump %cr3, and check, e.g., are you
running on the v->arch.monitor_table. Dump all entries in the top-level page
directory -- are they all populated, or is entry 0 the only lowmem one to be
populated? Dump the pagetable walk all the way down to the mapping of
address 0: what machine address is mapped there?

I.e., basically just dump some interesting stuff and let's narrow it down
from there.

 -- Keir
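
[Editor's sketch] For readers following the suggested walk: on x86_64 with 4 KiB pages, each of the four table levels is indexed by 9 bits of the virtual address, above a 12-bit page offset. The helper below is a minimal sketch of that index calculation; `pt_index()` and the macro names are chosen here for illustration and are not taken from the Xen source.

```c
#include <stdint.h>

#define PAGE_SHIFT  12                      /* 4 KiB pages */
#define LEVEL_BITS  9                       /* 512 entries per table */
#define LEVEL_MASK  ((1u << LEVEL_BITS) - 1)

/*
 * Index into the page table at `level` for virtual address `va`.
 * level 4 is the top-level table, level 1 is the leaf table.
 */
static inline unsigned int pt_index(uint64_t va, int level)
{
    return (unsigned int)(va >> (PAGE_SHIFT + LEVEL_BITS * (level - 1)))
           & LEVEL_MASK;
}
```

For virtual address 0 every level yields index 0, which is why a walk "down to the mapping of address 0" inspects slot 0 of each table in turn.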




Petersson, Mats

2007-Apr-17 08:54 UTC

RE: [Xen-devel] A different probklem with save/restore on C/S 14823.

> -----Original Message-----
> From: Keir Fraser [mailto:Keir.Fraser@cl.cam.ac.uk] 
> Sent: 16 April 2007 18:43
> To: Petersson, Mats; Tim Deegan
> Cc: xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] A different probklem with 
> save/restore on C/S 14823.
> 
> On 16/4/07 18:20, "Petersson, Mats" <Mats.Petersson@amd.com> wrote:
> 
> >>> I will look at it on Monday (before I update to the new version, just to
> >>> make sure I can reproduce it still ;-) ).
> >> 
> > Ok, so some further checks, and it looks like address zero (0-0xfff) is
> > mapped in, read/write. I haven't looked at the page-table, just tried to
> > write to the pointer that is zero and it didn't "crash".
> > 
> > Any thoughts on where I should head off on this for tomorrow.
> 
> What sub-arch is Xen built for (32, pae, 64)? Are you using NPT or shadow
> mode?

64-bit, shadow mode. [Haven't even got a machine with NPT :-(]

> I'd suggest write a bit of code to dump %cr3, and check e.g., are you
> running on the v->arch.monitor_table. Dump all entries in the top-level page
> directory -- are they all populated, or is entry 0 the only lowmem one to be
> populated? Dump the pagetable walk all the way down to the mapping of
> address 0: what machine address is mapped there?

Whilst I agree this is a good path to go down, I'm not quite sure why
cr3 would point anywhere but to monitor_table; is there any (legal) case
where cr3 isn't this value when in the hypervisor?

--
Mats
>
> I.e., basically just dump some interesting stuff and let's narrow it down
> from there.
> 
>  -- Keir



Keir Fraser

2007-Apr-17 11:17 UTC

Re: [Xen-devel] A different probklem with save/restore on C/S 14823.

On 17/4/07 09:54, "Petersson, Mats" <Mats.Petersson@amd.com> wrote:
>> populated? Dump the pagetable walk all the way down to the mapping of
>> address 0: what machine address is mapped there?
> 
> Whilst I agree this is a good path to go down, I'm not quite sure why
> cr3 would point anywhere but to monitor_table, is there any (legal) case
> where cr3 isn't this value when in the hypervisor?

I don't think so. But equally, Xen should never create itself a mapping at
address zero. So it's worth dumping some obvious things and sanity checking
them.

 -- Keir




Petersson, Mats

2007-Apr-17 14:26 UTC

RE: [Xen-devel] A different probklem with save/restore on C/S 14823.

> -----Original Message-----
> From: Keir Fraser [mailto:Keir.Fraser@cl.cam.ac.uk] 
> Sent: 17 April 2007 12:18
> To: Petersson, Mats; Tim Deegan
> Cc: xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] A different probklem with 
> save/restore on C/S 14823.
> 
> 
> 
> 
> On 17/4/07 09:54, "Petersson, Mats" <Mats.Petersson@amd.com> wrote:
> 
> >> populated? Dump the pagetable walk all the way down to the mapping of
> >> address 0: what machine address is mapped there?
> > 
> > Whilst I agree this is a good path to go down, I'm not quite sure why
> > cr3 would point anywhere but to monitor_table, is there any (legal) case
> > where cr3 isn't this value when in the hypervisor?
> 
> I don't think so. But equally, Xen should never create itself a mapping at
> address zero. So it's worth dumping some obvious things and sanity checking
> them.

Here's some debug output, hopefully sufficiently self-explanatory:

(XEN) hvm.c:debug_stuff:125: cr3=00000000559cf000, arch.monitor_table=00000000001c1920
(XEN) Pagetable walk from 0000000000000000:
(XEN)  L4[0x000] = 00000000559ce063 000000000001069c
(XEN)  L3[0x000] = 00000000559cd063 000000000001069b
(XEN)  L2[0x000] = 0000000000000000 ffffffffffffffff
(XEN) bad shared page: 0
(XEN) Trying to write to NULL!
(XEN) domain_crash_sync called from hvm.c:154
(XEN) Domain 2 (vcpu#0) crashed on cpu#1:
(XEN) ----[ Xen-3.0-unstable  x86_64  debug=n  Tainted:    C ]----
(XEN) CPU:    1
(XEN) RIP:    0010:[<00000000001022b1>]
(XEN) RFLAGS: 0000000000000297   CONTEXT: hvm
(XEN) rax: 0000000000001a18   rbx: 0000000000000000   rcx: 00000000000001f7
(XEN) rdx: 0000000000000050   rsi: 0000000000001e13   rdi: 00000000001030c2
(XEN) rbp: 00000000001058a8   rsp: 0000000000105880   r8:  0000000000000000
(XEN) r9:  0000000000000000   r10: 0000000000000000   r11: 0000000000000000
(XEN) r12: 0000000000000000   r13: 0000000000000000   r14: 0000000000000000
(XEN) r15: 0000000000000000   cr0: 0000000000050033   cr4: 0000000000000640
(XEN) cr3: 0000000000000000   cr2: 0000000000000000
(XEN) ds: 0008   es: 0008   fs: 0008   gs: 0008   ss: 0008   cs: 0010
(XEN) event_channel.c:178:d0 EVTCHNOP failure: domain 0, error -22, line 178

[The RIP of the guest indicates that it's at a HLT - which is very much
the expected place to be].

arch.monitor_table looks VERY different from CR3, but maybe I've done
something wrong when transforming from virtual to physical address, or
some such?

--
Mats



Keir Fraser

2007-Apr-17 14:41 UTC

Re: [Xen-devel] A different probklem with save/restore on C/S 14823.

On 17/4/07 15:26, "Petersson, Mats" <Mats.Petersson@amd.com> wrote:
> Here's some debug output, hopefully sufficiently self-explanatory:
> (XEN) hvm.c:debug_stuff:125: cr3=00000000559cf000, arch.monitor_table=00000000001c1920
> (XEN) Pagetable walk from 0000000000000000:
> (XEN)  L4[0x000] = 00000000559ce063 000000000001069c
> (XEN)  L3[0x000] = 00000000559cd063 000000000001069b
> (XEN)  L2[0x000] = 0000000000000000 ffffffffffffffff

Nothing mapped at address 0 according to that walk. If you try to write
address 0 immediately before the walk, and that doesn't crash, yet you get
the above walk, something weird is going on! cr3 being != monitor_table is
also rather strange, but it's worth probing into why the write of address
zero isn't crashing first.

 -- Keir
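
[Editor's sketch] The "nothing mapped" reading of the walk follows from the low bits of each entry: bit 0 of an x86 page-table entry is the present bit, and an all-zero entry is not present. The helpers below are a minimal sketch for decoding the first column of the walk output; the macro and function names are chosen here for illustration and are not Xen's.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86-64 page-table entry bits (4 KiB pages). */
#define PTE_PRESENT   (1ULL << 0)
#define PTE_RW        (1ULL << 1)
#define PTE_USER      (1ULL << 2)
#define PTE_ACCESSED  (1ULL << 5)
#define PTE_DIRTY     (1ULL << 6)
#define PTE_ADDR_MASK 0x000ffffffffff000ULL   /* bits 12..51 */

static inline bool pte_present(uint64_t e)  { return (e & PTE_PRESENT) != 0; }
static inline bool pte_writable(uint64_t e) { return (e & PTE_RW) != 0; }
static inline bool pte_user(uint64_t e)     { return (e & PTE_USER) != 0; }

/* Frame number of the next-level table (or of the mapped page at L1). */
static inline uint64_t pte_frame(uint64_t e)
{
    return (e & PTE_ADDR_MASK) >> 12;
}
```

Applied to the walk quoted above, the L4 and L3 entries (flags 0x063) decode as present and writable, while the L2 entry of 0 is not present, so the walk stops there.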




Petersson, Mats

2007-Apr-17 15:49 UTC

RE: [Xen-devel] A different probklem with save/restore on C/S 14823.

> -----Original Message-----
> From: Keir Fraser [mailto:Keir.Fraser@cl.cam.ac.uk] 
> Sent: 17 April 2007 15:41
> To: Petersson, Mats; Tim Deegan
> Cc: xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] A different probklem with 
> save/restore on C/S 14823.
> 
> 
> 
> 
> On 17/4/07 15:26, "Petersson, Mats" <Mats.Petersson@amd.com> wrote:
> 
> > Here's some debug output, hopefully sufficiently self-explanatory:
> > (XEN) hvm.c:debug_stuff:125: cr3=00000000559cf000, arch.monitor_table=00000000001c1920
> > (XEN) Pagetable walk from 0000000000000000:
> > (XEN)  L4[0x000] = 00000000559ce063 000000000001069c
> > (XEN)  L3[0x000] = 00000000559cd063 000000000001069b
> > (XEN)  L2[0x000] = 0000000000000000 ffffffffffffffff

Got another one that looks like this:

(XEN) About to write to NULL
(XEN) Done
(XEN) Pagetable walk from 0000000000000000:
(XEN)  L4[0x000] = 00000000472ea063 000000000000f6ea
(XEN)  L3[0x000] = 00000000472e9063 000000000000f6e9
(XEN)  L2[0x000] = 00000000472e8067 000000000000f6e8
(XEN)  L1[0x000] = 00000000485ae067 0000000000000000
>
> Nothing mapped at address 0 according to that walk. If you try to write
> address 0 immediately before the walk, and that doesn't crash, yet you get
> the above walk, something weird is going on! cr3 being != monitor_table is
> also rather strange, but it's worth probing into why the write of address
> zero isn't crashing first.

Just to make sure, monitor_table contains a virtual address, right, and
it should be made into a physical address? What is the right way to do
that? I'm using virt_to_maddr, but I'm not entirely sure that's the
right thing to do.

--
Mats



Keir Fraser

2007-Apr-17 16:16 UTC

Re: [Xen-devel] A different probklem with save/restore on C/S 14823.

On 17/4/07 16:49, "Petersson, Mats" <Mats.Petersson@amd.com> wrote:
> Got another one that looks like this:
> (XEN) About to write to NULL
> (XEN) Done
> (XEN) Pagetable walk from 0000000000000000:
> (XEN)  L4[0x000] = 00000000472ea063 000000000000f6ea
> (XEN)  L3[0x000] = 00000000472e9063 000000000000f6e9
> (XEN)  L2[0x000] = 00000000472e8067 000000000000f6e8
> (XEN)  L1[0x000] = 00000000485ae067 0000000000000000

Okay, I think this is expected behaviour from what I can understand of the
monitor_table logic. I'll sort it out with Tim.

 Thanks,
 Keir




Petersson, Mats

2007-Apr-17 16:22 UTC

RE: [Xen-devel] A different probklem with save/restore on C/S 14823.

> -----Original Message-----
> From: Keir Fraser [mailto:Keir.Fraser@cl.cam.ac.uk] 
> Sent: 17 April 2007 17:17
> To: Petersson, Mats; Tim Deegan
> Cc: xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] A different probklem with 
> save/restore on C/S 14823.
> 
> On 17/4/07 16:49, "Petersson, Mats" <Mats.Petersson@amd.com> wrote:
> 
> > Got another one that looks like this:
> > (XEN) About to write to NULL
> > (XEN) Done
> > (XEN) Pagetable walk from 0000000000000000:
> > (XEN)  L4[0x000] = 00000000472ea063 000000000000f6ea
> > (XEN)  L3[0x000] = 00000000472e9063 000000000000f6e9
> > (XEN)  L2[0x000] = 00000000472e8067 000000000000f6e8
> > (XEN)  L1[0x000] = 00000000485ae067 0000000000000000
> 
> Okay, I think this is expected behaviour from what I can understand of the
> monitor_table logic. I'll sort it out with Tim.

And when it comes to CR3 and monitor_table, I didn't have the right
thing in the output - I printed the ADDRESS of monitor_table, not the
actual PFN of the monitor table - changing that, and I can see that CR3
and monitor_table are the same thing (aside from one being a frame number
and the other a real address). Sorry for that confusion.
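
[Editor's sketch] The mix-up described here, comparing a physical address with a frame number, comes down to a shift by the page size. The helper below is a minimal sketch assuming 4 KiB pages; `cr3_matches_frame()` is a hypothetical name used for illustration only and does not appear in Xen.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12   /* 4 KiB pages */

/*
 * True when cr3 (a physical address, whose low 12 bits may carry flags)
 * refers to the page whose frame number is `frame`.
 */
static inline bool cr3_matches_frame(uint64_t cr3, uint64_t frame)
{
    return (cr3 >> PAGE_SHIFT) == frame;
}
```

With the cr3 value from the earlier dump, 0x559cf000, the matching frame number is 0x559cf. The value 0x1c1920 printed earlier was, per the message above, the address of the monitor_table field rather than a frame number, so the comparison was never meaningful.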

So just to confirm, you think that this should be fixed (i.e. the
null-access should not be possible), but I should test the latest to see
if save/restore works better there, as there is no need to search
further for the actual cause of the "write to zero is possible" problem?

--
Mats



Keir Fraser

2007-Apr-17 16:35 UTC

Re: [Xen-devel] A different probklem with save/restore on C/S 14823.

On 17/4/07 17:22, "Petersson, Mats" <Mats.Petersson@amd.com> wrote:
> So just to confirm, you think that this should be fixed (i.e. the
> null-access should not be possible), but I should test the latest to see
> if save/restore works better there, as there is no need to search
> further for the actual cause of the "write to zero is possible" problem?

Yeah, I'm pretty sure this behaviour is intentional. Slot 0 of the
monitor_table top-level table points at a shadow linear map. So feel free to
upgrade your tree now -- no more tracking down to be done. :-)

 -- Keir


