Petersson, Mats
2007-Apr-13 16:24 UTC
[Xen-devel] A different probklem with save/restore on C/S 14823.
I'm not seeing the problem that Fan Zhao is reporting; instead I get this one. Not sure if it's the same one or a different problem... This happens with my simple-guest [i.e. not using hvmloader, as I described before]. This worked fine yesterday.

(XEN) event_channel.c:178:d0 EVTCHNOP failure: domain 0, error -22, line 178
(XEN) bad shared page: 0
(XEN) domain_crash_sync called from platform.c:844
(XEN) Domain 20 (vcpu#0) crashed on cpu#1:
(XEN) ----[ Xen-3.0-unstable x86_64 debug=n Tainted: C ]----
(XEN) CPU: 1
(XEN) RIP: 0010:[<0000000000102638>]
(XEN) RFLAGS: 0000000000000002 CONTEXT: hvm
(XEN) rax: 0000000000002520 rbx: 000000000000234e rcx: 0000000000107180
(XEN) rdx: 0000000000000020 rsi: 0000000000105b40 rdi: 000000000000234d
(XEN) rbp: 0000000000105844 rsp: 000000000010582c r8: 0000000000000000
(XEN) r9: 0000000000000000 r10: 0000000000000000 r11: 0000000000000000
(XEN) r12: 0000000000000000 r13: 0000000000000000 r14: 0000000000000000
(XEN) r15: 0000000000000000 cr0: 0000000000050033 cr4: 0000000000000640
(XEN) cr3: 0000000000000000 cr2: 0000000000000000
(XEN) ds: 0008 es: 0008 fs: 0008 gs: 0008 ss: 0008 cs: 0010

-- Mats
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Tim Deegan
2007-Apr-13 16:35 UTC
Re: [Xen-devel] A different probklem with save/restore on C/S 14823.
Hi,

At 18:24 +0200 on 13 Apr (1176488676), Petersson, Mats wrote:
> I'm not seeing the problem that Fan Zhao is reporting, instead I get
> this one. Not sure if it's the same one or a different problem... This
> happens with my simple-guest [i.e. not using hvmloader, as I described
> before]. This worked fine yesterday.

This looks like the same problem (but caught in Xen instead of crashing). The restore path isn't setting the ioreq page's PFN properly. Have you reinstalled your tools (in particular libxenguest) since cset 14830:e3b3800c769a?

Cheers,

Tim.

> (XEN) event_channel.c:178:d0 EVTCHNOP failure: domain 0, error -22, line 178
> (XEN) bad shared page: 0
> (XEN) domain_crash_sync called from platform.c:844
> (XEN) Domain 20 (vcpu#0) crashed on cpu#1:

--
Tim Deegan <Tim.Deegan@xensource.com>, XenSource UK Limited
Registered office c/o EC2Y 5EB, UK; company number 05334508
Tim Deegan
2007-Apr-13 16:40 UTC
Re: [Xen-devel] A different probklem with save/restore on C/S 14823.
At 17:35 +0100 on 13 Apr (1176485709), Tim Deegan wrote:
> Have you reinstalled your tools (in particular libxenguest)
> since cset 14830:e3b3800c769a ?

...which hasn't made it out of the staging tree yet; my apologies. This issue should be fixed when it does.

Tim.
Keir Fraser
2007-Apr-13 16:43 UTC
Re: [Xen-devel] A different probklem with save/restore on C/S 14823.
On 13/4/07 17:35, "Tim Deegan" <Tim.Deegan@xensource.com> wrote:
> This looks like the same problem (but caught in Xen instead of
> crashing). The restore path isn't setting the ioreq page's PFN
> properly. Have you reinstalled your tools (in particular libxenguest)
> since cset 14830:e3b3800c769a ?

It is also somewhat odd that Xen got a chance to catch the problem (probably the printed guest EIP is an I/O port operation? In which case Xen caught the problem in send_pio_req), rather than crashing in hvm_do_resume() with a NULL pointer dereference, which is what Fan Zhao saw. Either the guest started executing without passing through hvm_do_resume(), or there was a valid page mapping at address 0 in Xen's address space when you executed hvm_do_resume(). Neither of these possibilities is good. It might be worth doing a bit of digging to find out why you didn't repro the exact same crash as Fan Zhao.

-- Keir
Petersson, Mats
2007-Apr-13 16:47 UTC
RE: [Xen-devel] A different probklem with save/restore on C/S 14823.
> -----Original Message-----
> From: Keir Fraser [mailto:Keir.Fraser@cl.cam.ac.uk]
> Sent: 13 April 2007 17:43
> To: Tim Deegan; Petersson, Mats
> Cc: xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] A different probklem with save/restore on C/S 14823.
>
> It is also somewhat odd that Xen got a chance to catch the problem (probably
> the printed guest EIP is an I/O port operation? In which case Xen caught the
> problem in send_pio_req), rather than crashing in hvm_do_resume() with a
> NULL pointer dereference, which is what Fan Zhao saw. Either the guest
> started executing without passing through hvm_do_resume(), or there was a
> valid page mapping at address 0 in Xen's address space when you executed
> hvm_do_resume(). Neither of these possibilities is good. It might be worth
> doing a bit of digging to find out why you didn't repro the exact same crash
> as Fan Zhao.

See my other reply, although you may have a point about mapping - my guest is running with the HVMloader's map, which probably maps all memory available to the guest linearly, including address zero (as that's where real-mode puts the interrupt vector table, which can be useful to have mapped - just a little bit ;-) ).

So maybe we need an earlier/different test to kill the guest? Or do you think this is such a critical error that the hypervisor should die?

-- Mats
Petersson, Mats
2007-Apr-13 16:47 UTC
RE: [Xen-devel] A different probklem with save/restore on C/S 14823.
> -----Original Message-----
> From: Tim Deegan [mailto:Tim.Deegan@xensource.com]
> Sent: 13 April 2007 17:35
> To: Petersson, Mats
> Cc: xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] A different probklem with save/restore on C/S 14823.
>
> This looks like the same problem (but caught in Xen instead of
> crashing). The restore path isn't setting the ioreq page's PFN
> properly. Have you reinstalled your tools (in particular libxenguest)
> since cset 14830:e3b3800c769a ?

After some more testing, I would agree that it's probably the same problem - I found that I could reproduce the same problem if I use a sles93 guest rather than the simple guest - not sure why it's different (perhaps because the sles guest uses MMIO and my simple guest never uses MMIO?)

It would perhaps be nice to understand why one type of guest only crashes the guest (kind of not so bad) and the other crashes the hypervisor (quite bad), and make BOTH crash the guest only? [Not that it REALLY fixes a problem, but leaving the hypervisor running if the guest is broken is a good thing, I think. Perhaps not a priority for 3.0.5, but I think it should be fixed - I'll have a look to see if I can understand what's going on and cobble together something that catches both cases.]

I'm always a bit wary of using (wholesale) changesets from the staging tree, under the assumption that if they don't pass the basic testing, it's quite possible that they won't work for me either. Obviously, if I understand a particular change, and I think it won't need a whole bunch of others, I may pick a single changeset (or a small subset) from staging for test purposes. But 14823 is the latest available on unstable. [You must be mind-reading, as you just apologized for not realizing it's not out of staging yet.]

-- Mats
Keir Fraser
2007-Apr-13 16:55 UTC
Re: [Xen-devel] A different probklem with save/restore on C/S 14823.
On 13/4/07 17:47, "Petersson, Mats" <Mats.Petersson@amd.com> wrote:
> See my other reply, although you may have a point about mapping - my
> guest is running with the HVMloaders map, which probably maps all memory
> available to guest linearly, including address zero (as that's where
> real-mode puts the interrupt vector table, which can be useful to have
> mapped - just a little bit ;-) ).
>
> So maybe we need an earlier/different test to kill guest? Or do you
> think this is such a critical error that hypervisor should die?

The NULL dereference is inside the hypervisor in hvm_do_resume(). At that point you are running in Xen's address space, not the guest's. And Xen should have no mapping at address zero.

The issue here is that shared_page_va is not initialised, so it contains 0. hvm_do_resume() should be getting a pointer derived from this value via get_vio(). When it dereferences it, Xen should crash. That didn't happen for you and that is scarily inexplicable.

I suggest adding some tracing to hvm_do_resume() to find out whether it is being called at all and, if it is, what value it sets its local variable 'p' to. Also what value is in v->domain->arch.hvm_domain.shared_page_va.

The bugs that cause this condition should all be fixed in xen-unstable staging tip, by the way. I just think this situation should be investigated before you upgrade in case you've uncovered another latent bug. Because you really should be crashing in hvm_do_resume() in this scenario.

-- Keir
Petersson, Mats
2007-Apr-13 17:24 UTC
RE: [Xen-devel] A different probklem with save/restore on C/S 14823.
> -----Original Message-----
> From: Keir Fraser [mailto:Keir.Fraser@cl.cam.ac.uk]
> Sent: 13 April 2007 17:56
> To: Petersson, Mats; Tim Deegan
> Cc: xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] A different probklem with save/restore on C/S 14823.
>
> The NULL dereference is inside the hypervisor in hvm_do_resume(). At that
> point you are running in Xen's address space, not the guest's. And Xen
> should have no mapping at address zero.

Yes, of course - me not thinking right - sorry [it is late-ish on a Friday, that's my excuse and I'm sticking to it].

> The issue here is that shared_page_va is not initialised, so it contains 0.
> hvm_do_resume() should be getting a pointer derived from this value via
> get_vio(). When it dereferences it, Xen should crash. That didn't happen for
> you and that is scarily inexplicable.

Yes, I follow that.

However, my guest does A LOT of IOIO exits (it's an IDE test-app), with some HLT and IRQ exits thrown in for good measure. So if the guest is doing an IOIO exit, it would end up in platform.c:844 before it gets to hvm_do_resume? Or are you saying that we should crash as soon as the guest restarts, because that's done through hvm_do_resume?

> I suggest adding some tracing to hvm_do_resume() to find out whether it is
> being called at all and, if it is, what value it sets its local variable 'p'
> to. Also what value is in v->domain->arch.hvm_domain.shared_page_va.

Would a check for zero in get_vio() with domain_crash_synchronous() be a "good thing" here, or is that too time-consuming in a relatively time-critical path of HVM?

I will look at it on Monday (before I update to the new version, just to make sure I can reproduce it still ;-) ).
Keir Fraser
2007-Apr-13 17:34 UTC
Re: [Xen-devel] A different probklem with save/restore on C/S 14823.
On 13/4/07 18:24, "Petersson, Mats" <Mats.Petersson@amd.com> wrote:
> However, my guest does A LOT of IOIO exits (it's an IDE test-app), with
> some HLT and IRQ exits thrown in for good measure. So if the guest is
> doing IOIO exit it would end up in platform.c:844 before it gets to
> hvm_do_resume? Or are you saying that we should crash as soon as the
> guest restarts, because that's done through hvm_do_resume?

Exactly. hvm_do_resume() should always be executed when an HVM VCPU gets scheduled onto a physical CPU (it's part of the schedule_tail). So it should execute before your first vmentry, and hence before your first vmexit. And shared_page_va is sticky: once it's set it stays set. It shouldn't ever get zapped to zero while a guest is running.

> Would a check for zero in get_vio() with domain_crash_synchronous() be a
> "good thing" here, or is that too time-consuming in a relatively
> time-critical path of HVM?

Could do. But hvm_do_resume() is the one to concentrate on.

-- Keir
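[Editorial illustration] The zero check discussed above could look roughly like the sketch below. This is not the Xen code: the types and the names toy_get_vio() and toy_do_resume() are hypothetical stand-ins, and returning -1 only marks the place where a hypervisor would crash the domain (e.g. via domain_crash_synchronous()). The only point it makes is that testing shared_page_va before deriving a pointer from it catches the uninitialised case whether or not address 0 happens to be mapped.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_VCPUS 4u

/* Hypothetical stand-ins for a per-VCPU I/O request slot and the shared page. */
struct toy_vcpu_iodata {
    uint64_t state;
};

struct toy_shared_iopage {
    struct toy_vcpu_iodata vcpu_iodata[TOY_MAX_VCPUS];
};

/*
 * Return the I/O slot for `cpu`, or NULL if the shared page address was
 * never set (still 0) or the VCPU index is out of range. The check runs
 * before any pointer is derived from the base address.
 */
struct toy_vcpu_iodata *toy_get_vio(uintptr_t shared_page_va, unsigned int cpu)
{
    if (shared_page_va == 0 || cpu >= TOY_MAX_VCPUS)
        return NULL;
    return &((struct toy_shared_iopage *)shared_page_va)->vcpu_iodata[cpu];
}

/*
 * Toy resume path: returns 0 on success, or -1 at the point where a
 * hypervisor would crash the offending domain instead of dereferencing
 * a pointer derived from address 0.
 */
int toy_do_resume(uintptr_t shared_page_va, unsigned int cpu)
{
    struct toy_vcpu_iodata *p = toy_get_vio(shared_page_va, cpu);

    if (p == NULL)
        return -1;
    p->state = 0;
    return 0;
}
```

The cost is one compare and branch per call, which is the trade-off Mats is asking about for a time-critical path.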
Petersson, Mats
2007-Apr-16 17:20 UTC
RE: [Xen-devel] A different probklem with save/restore on C/S 14823.
> Could do. But hvm_do_resume() is the one to concentrate on.
>
> -- Keir
>
> > I will look at it on Monday (before I update to the new version, just to
> > make sure I can reproduce it still ;-) ).

OK, so some further checks, and it looks like address zero (0-0xfff) is mapped in, read/write. I haven't looked at the page-table, just tried to write to the pointer that is zero and it didn't "crash".

Any thoughts on where I should head off on this for tomorrow?

[I'm sorry I didn't get more done on this, but I forgot on Friday that I had meetings most of the day today: one with a customer from 10AM to 3PM, and then two internal AMD meetings of around 1hr each.]

-- Mats
Keir Fraser
2007-Apr-16 17:42 UTC
Re: [Xen-devel] A different probklem with save/restore on C/S 14823.
On 16/4/07 18:20, "Petersson, Mats" <Mats.Petersson@amd.com> wrote:
> Ok, so some further checks, and it looks like address zero (0-0xfff) is
> mapped in, read/write. I haven't looked at the page-table, just tried to
> write to the pointer that is zero and it didn't "crash".
>
> Any thoughts on where I should head off on this for tomorrow.

What sub-arch is Xen built for (32, pae, 64)? Are you using NPT or shadow mode?

I'd suggest writing a bit of code to dump %cr3, and checking e.g. whether you are running on the v->arch.monitor_table. Dump all entries in the top-level page directory -- are they all populated, or is entry 0 the only lowmem one to be populated? Dump the pagetable walk all the way down to the mapping of address 0: what machine address is mapped there?

I.e., basically just dump some interesting stuff and let's narrow it down from there.

-- Keir
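[Editorial illustration] For readers following the suggested walk: with 4-level x86-64 paging and 4 KiB pages, each level consumes 9 bits of the virtual address above the 12-bit page offset. The helper below is a minimal sketch of that arithmetic only; pt_index() is a hypothetical name, not a Xen function.

```c
#include <stdint.h>

#define PT_PAGE_SHIFT 12u /* 4 KiB pages: low 12 bits are the page offset */
#define PT_INDEX_BITS 9u  /* 512 entries per table at every level */

/*
 * Index into the level-`level` page table (4 = top level, 1 = leaf)
 * that a walk for virtual address `va` would use.
 * Level 4 uses bits 47:39, level 3 bits 38:30, level 2 bits 29:21,
 * level 1 bits 20:12.
 */
unsigned int pt_index(uint64_t va, unsigned int level)
{
    unsigned int shift = PT_PAGE_SHIFT + PT_INDEX_BITS * (level - 1u);

    return (unsigned int)((va >> shift) & 0x1ffu);
}
```

For address 0 every index is 0, so the walk of interest goes through entry 0 of each table from the top level down; the whole range 0-0xfff shares that same leaf entry.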
Petersson, Mats
2007-Apr-17 08:54 UTC
RE: [Xen-devel] A different probklem with save/restore on C/S 14823.
> -----Original Message-----
> From: Keir Fraser [mailto:Keir.Fraser@cl.cam.ac.uk]
> Sent: 16 April 2007 18:43
> To: Petersson, Mats; Tim Deegan
> Cc: xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] A different probklem with save/restore on C/S 14823.
>
> What sub-arch is Xen built for (32, pae, 64)? Are you using NPT or shadow
> mode?

64-bit, shadow mode. [Haven't even got a machine with NPT :-(]

> I'd suggest write a bit of code to dump %cr3, and check e.g., are you
> running on the v->arch.monitor_table. Dump all entries in the top-level page
> directory -- are they all populated, or is entry 0 the only lowmem one to be
> populated? Dump the pagetable walk all the way down to the mapping of
> address 0: what machine address is mapped there?

Whilst I agree this is a good path to go down, I'm not quite sure why cr3 would point anywhere but to monitor_table. Is there any (legal) case where cr3 isn't this value when in the hypervisor?

-- Mats
Keir Fraser
2007-Apr-17 11:17 UTC
Re: [Xen-devel] A different probklem with save/restore on C/S 14823.
On 17/4/07 09:54, "Petersson, Mats" <Mats.Petersson@amd.com> wrote:
>> populated? Dump the pagetable walk all the way down to the mapping of
>> address 0: what machine address is mapped there?
>
> Whilst I agree this is a good path to go down, I'm not quite sure why
> cr3 would point anywhere but to monitor_table, is there any (legal) case
> where cr3 isn't this value when in the hypervisor?

I don't think so. But equally, Xen should never create itself a mapping at address zero. So it's worth dumping some obvious things and sanity checking them.

-- Keir
Petersson, Mats
2007-Apr-17 14:26 UTC
RE: [Xen-devel] A different probklem with save/restore on C/S 14823.
> -----Original Message-----
> From: Keir Fraser [mailto:Keir.Fraser@cl.cam.ac.uk]
> Sent: 17 April 2007 12:18
> To: Petersson, Mats; Tim Deegan
> Cc: xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] A different probklem with save/restore on C/S 14823.
>
> I don't think so. But equally, Xen should never create itself a mapping at
> address zero. So it's worth dumping some obvious things and sanity checking
> them.

Here's some debug output, hopefully sufficiently self-explanatory:

(XEN) hvm.c:debug_stuff:125: cr3=00000000559cf000, arch.monitor_table=00000000001c1920
(XEN) Pagetable walk from 0000000000000000:
(XEN) L4[0x000] = 00000000559ce063 000000000001069c
(XEN) L3[0x000] = 00000000559cd063 000000000001069b
(XEN) L2[0x000] = 0000000000000000 ffffffffffffffff
(XEN) bad shared page: 0
(XEN) Trying to write to NULL!
(XEN) domain_crash_sync called from hvm.c:154
(XEN) Domain 2 (vcpu#0) crashed on cpu#1:
(XEN) ----[ Xen-3.0-unstable x86_64 debug=n Tainted: C ]----
(XEN) CPU: 1
(XEN) RIP: 0010:[<00000000001022b1>]
(XEN) RFLAGS: 0000000000000297 CONTEXT: hvm
(XEN) rax: 0000000000001a18 rbx: 0000000000000000 rcx: 00000000000001f7
(XEN) rdx: 0000000000000050 rsi: 0000000000001e13 rdi: 00000000001030c2
(XEN) rbp: 00000000001058a8 rsp: 0000000000105880 r8: 0000000000000000
(XEN) r9: 0000000000000000 r10: 0000000000000000 r11: 0000000000000000
(XEN) r12: 0000000000000000 r13: 0000000000000000 r14: 0000000000000000
(XEN) r15: 0000000000000000 cr0: 0000000000050033 cr4: 0000000000000640
(XEN) cr3: 0000000000000000 cr2: 0000000000000000
(XEN) ds: 0008 es: 0008 fs: 0008 gs: 0008 ss: 0008 cs: 0010
(XEN) event_channel.c:178:d0 EVTCHNOP failure: domain 0, error -22, line 178

[The RIP of the guest indicates that it's at a HLT - which is very much the expected place to be].

arch.monitor_table looks VERY different from CR3, but maybe I've done something wrong when transforming from virtual to physical address, or some such?

-- Mats
Keir Fraser
2007-Apr-17 14:41 UTC
Re: [Xen-devel] A different probklem with save/restore on C/S 14823.
On 17/4/07 15:26, "Petersson, Mats" <Mats.Petersson@amd.com> wrote:
> Here's some debug output, hopefully sufficiently self-explanatory:
> (XEN) hvm.c:debug_stuff:125: cr3=00000000559cf000, arch.monitor_table=00000000001c1920
> (XEN) Pagetable walk from 0000000000000000:
> (XEN) L4[0x000] = 00000000559ce063 000000000001069c
> (XEN) L3[0x000] = 00000000559cd063 000000000001069b
> (XEN) L2[0x000] = 0000000000000000 ffffffffffffffff

Nothing mapped at address 0 according to that walk. If you try to write address 0 immediately before the walk, and that doesn't crash, yet you get the above walk, something weird is going on! cr3 being != monitor_table is also rather strange, but it's worth probing into why the write of address zero isn't crashing first.

-- Keir
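[Editorial illustration] The "nothing mapped" reading follows from the low bits of each entry in the walk. On x86-64, bit 0 of a page-table entry is the present bit, bit 1 is read/write, bit 2 is user/supervisor, and bits 51:12 hold the frame address. The helpers below are a sketch of that decoding with hypothetical names (they are not Xen functions), and they say nothing about the second column of the dump.

```c
#include <stdint.h>

/* Architectural x86-64 page-table entry bits used in this sketch. */
#define PTE_PRESENT   0x001ULL /* bit 0: entry is valid */
#define PTE_RW        0x002ULL /* bit 1: writable */
#define PTE_USER      0x004ULL /* bit 2: user-mode accessible */
#define PTE_ADDR_MASK 0x000ffffffffff000ULL /* bits 51:12: frame address */

/* Nonzero if the entry is marked present. */
int pte_present(uint64_t entry)
{
    return (entry & PTE_PRESENT) != 0;
}

/* Nonzero if the entry allows writes. */
int pte_writable(uint64_t entry)
{
    return (entry & PTE_RW) != 0;
}

/* Nonzero if the entry allows user-mode access. */
int pte_user(uint64_t entry)
{
    return (entry & PTE_USER) != 0;
}

/* Physical (machine) address of the frame or next-level table. */
uint64_t pte_frame_addr(uint64_t entry)
{
    return entry & PTE_ADDR_MASK;
}
```

Applied to the walk quoted above: the L4 entry 00000000559ce063 is present and writable and points at 0x559ce000, while the L2 entry is all zeroes, so it is not present and the walk stops there.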
Petersson, Mats
2007-Apr-17 15:49 UTC
RE: [Xen-devel] A different probklem with save/restore on C/S 14823.
> -----Original Message-----
> From: Keir Fraser [mailto:Keir.Fraser@cl.cam.ac.uk]
> Sent: 17 April 2007 15:41
> To: Petersson, Mats; Tim Deegan
> Cc: xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] A different probklem with save/restore on C/S 14823.
>
> > (XEN) Pagetable walk from 0000000000000000:
> > (XEN) L4[0x000] = 00000000559ce063 000000000001069c
> > (XEN) L3[0x000] = 00000000559cd063 000000000001069b
> > (XEN) L2[0x000] = 0000000000000000 ffffffffffffffff

Got another one that looks like this:

(XEN) About to write to NULL
(XEN) Done
(XEN) Pagetable walk from 0000000000000000:
(XEN) L4[0x000] = 00000000472ea063 000000000000f6ea
(XEN) L3[0x000] = 00000000472e9063 000000000000f6e9
(XEN) L2[0x000] = 00000000472e8067 000000000000f6e8
(XEN) L1[0x000] = 00000000485ae067 0000000000000000

> Nothing mapped at address 0 according to that walk. If you try to write
> address 0 immediately before the walk, and that doesn't crash, yet you get
> the above walk, something weird is going on! cr3 being != monitor_table is
> also rather strange, but it's worth probing into why the write of address
> zero isn't crashing first.

Just to make sure: monitor_table contains a virtual address, right, and it should be made into a physical address? What is the right way to do that? I'm using virt_to_maddr, but I'm not entirely sure that's the right thing to do.

-- Mats
Keir Fraser
2007-Apr-17 16:16 UTC
Re: [Xen-devel] A different probklem with save/restore on C/S 14823.
On 17/4/07 16:49, "Petersson, Mats" <Mats.Petersson@amd.com> wrote:
> Got another one that looks like this:
> (XEN) About to write to NULL
> (XEN) Done
> (XEN) Pagetable walk from 0000000000000000:
> (XEN) L4[0x000] = 00000000472ea063 000000000000f6ea
> (XEN) L3[0x000] = 00000000472e9063 000000000000f6e9
> (XEN) L2[0x000] = 00000000472e8067 000000000000f6e8
> (XEN) L1[0x000] = 00000000485ae067 0000000000000000

Okay, I think this is expected behaviour from what I can understand of the monitor_table logic. I'll sort it out with Tim.

Thanks,
Keir
Petersson, Mats
2007-Apr-17 16:22 UTC
RE: [Xen-devel] A different probklem with save/restore on C/S 14823.
> -----Original Message-----
> From: Keir Fraser [mailto:Keir.Fraser@cl.cam.ac.uk]
> Sent: 17 April 2007 17:17
> To: Petersson, Mats; Tim Deegan
> Cc: xen-devel@lists.xensource.com
> Subject: Re: [Xen-devel] A different probklem with save/restore on C/S 14823.
>
> Okay, I think this is expected behaviour from what I can understand of the
> monitor_table logic. I'll sort it out with Tim.

And when it comes to CR3 and monitor_table, I didn't have the right thing in the output - I printed the ADDRESS of monitor_table, not the actual PFN of the monitor table. Changing that, I can see that CR3 and monitor_table are the same thing (aside from one being a FN and the other a real address). Sorry for that confusion.

So just to confirm, you think that this should be fixed (i.e. the null-access should not be possible), but I should test the latest to see if save/restore works better there, as there is no need to search further for the actual cause of the "write to zero is possible" problem?

-- Mats
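[Editorial illustration] The "frame number versus real address" difference mentioned above is a shift by the page size. A minimal sketch, assuming 4 KiB pages; the function names are hypothetical and this is not how Xen itself spells the conversion.

```c
#include <stdint.h>

#define PAGE_SHIFT_4K 12u /* assumption: 4 KiB pages */

/* Convert a page frame number to the address of the start of that frame. */
uint64_t frame_to_addr(uint64_t frame)
{
    return frame << PAGE_SHIFT_4K;
}

/* Convert an address to the number of the frame containing it. */
uint64_t addr_to_frame(uint64_t addr)
{
    return addr >> PAGE_SHIFT_4K;
}
```

With the cr3 value from the earlier dump, 00000000559cf000, the frame number is 0x559cf, which is the form a frame-number field would be expected to hold.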
Keir Fraser
2007-Apr-17 16:35 UTC
Re: [Xen-devel] A different probklem with save/restore on C/S 14823.
On 17/4/07 17:22, "Petersson, Mats" <Mats.Petersson@amd.com> wrote:
> So just to confirm, you think that this should be fixed (i.e. the
> null-access should not be possible), but I should test the latest to see
> if save/restore works better there, as there is no need to search
> further for the actual cause of the "write to zero is possible" problem?

Yeah, I'm pretty sure this behaviour is intentional. Slot 0 of the monitor_table top-level table points at a shadow linear map. So feel free to upgrade your tree now -- no more tracking down to be done. :-)

-- Keir