Petersson, Mats
2006-Feb-09 15:48 UTC
[Xen-devel] X86_64 "assert" when booting 64-bit image.
I can boot a 64-bit HVM image on my SVM-system, and it goes quite a long way before it crashes (due to not finding the hard-disk image and thus no root FS - which I think may well be the configuration file). But if I set debug=y in the Rules.mk, I don''t get very far at all - it crashes well before it gets to any user-mode (Ring3) - in fact, I think it''s crashing pretty much as soon as it''s beginning to set up page-tables on the guest. See dump below. Also copy of source code enclosed. The same binary works just fine when running 32-bit HVM guest. My real question is, is the "assert" stuff in shadow.h:1222 valid? If it is valid, what is actually being checked? [yes, I have read the code, but I confess I don''t understand enough about the internal workings of the shadow page-table to know what''s going on in this bit of code]. The code is based on XenSource changeset 8677:b721455c5e44. -- Mats (XEN) d->id=1 gpfn=0 gmfn=567a7 stype=a0000000 c=80000004 t=00000000f0000002 mfn_out_of_sync(gmfn)=0 mfn_is_page_table(gmfn)=0 (XEN) BUG at /root/xen-unstable-x64.hg/xen/include/asm/shadow.h:1222 (XEN) CPU: 0 (XEN) RIP: e010:[<ffff830000185ab8>] (XEN) RFLAGS: 0000000000010286 CONTEXT: hypervisor (XEN) rax: ffff8300001c7984 rbx: 0000000000000000 rcx: 00000000000017d7 (XEN) rdx: 0000000000000000 rsi: 0000000000000001 rdi: ffff8300001c7984 (XEN) rbp: ffff8300001e3b38 rsp: ffff8300001e3ac0 r8: 00000000ffffffff (XEN) r9: 00000000ffffffff r10: ffff8300001eeb3f r11: ffff8300001ee77a (XEN) r12: 0000000000000001 r13: 0000010100000000 r14: 0000010000000000 (XEN) r15: ffffffff80395fe0 cr0: 000000008005003b cr3: 000000000fe5c000 (XEN) Xen stack trace from rsp=ffff8300001e3ad8: (XEN) 00000000f0000002 0000000000000000 0000000000000000 00000000000567a7 (XEN) 00000000000567a7 000000080018f51e 00000000000567a7 00000000a0000000 (XEN) 0000000000000000 ffff830000fd0080 0000000000000000 ffffffff8035521f (XEN) ffff8300001e3b88 ffff83000019000f ffff83000026c0e0 0000000000000000 (XEN) 00000000000000e3 0000000040000000 0000000200000104 ffff8300001e3ba8 (XEN) ffff8300001e3b98 ffff830000fd0080 ffff8300001e3be8 ffff83000018bea1 (XEN) 00000000000000e3 0000000000057b6f 0000000000057b6f 000000000000fa54 (XEN) 0000000000000000 0000000040000000 0000000200000104 ffff83000fa540a0 (XEN) ffff830057b6f0a0 ffff830000fd0080 ffff8300001e3cc8 ffff830000187d43 (XEN) 0000000400db49e8 0000000040000000 00000000e0000008 00000004f0000002 (XEN) ffff83000fa4d000 00000000e0000008 ffff830057b6f000 ffff83000fa54000 (XEN) e0000007e0000008 ffffffff00db49d0 ffff8300001e3c68 00000000000001ff (XEN) 00000000000001ff 0000000000000000 0000000000000000 0000000000000020 (XEN) ffff83000fa4d000 ffff83000fa54000 ffff830057b6f000 000000000000fa54 (XEN) 00000014001e3cc8 ffff830000264128 40000000001e3cc8 ffff830000fd0080 (XEN) ffffffffffffffff 0000000000000000 ffff8300001e3d38 ffff8300001881ef (XEN) ffff800000001d10 0000000000000000 0000000000000000 ffff800000000810 (XEN) 0000000057b72101 00000008001e3df8 0000000057b72103 ffff8300001e3d58 (XEN) ffff83000fa56808 00000001001811d2 0000000000000000 ffff830000fd0080 (XEN) ffff8300001e3d58 ffff830000190aec ffff800000000808 ffff830000fd0080 (XEN) Xen call trace: (XEN) [<ffff830000185ab8>] __shadow_status+0x2f0/0x31e (XEN) [<ffff83000019000f>] entry_propagate_from_guest+0x97/0xe6 (XEN) [<ffff83000018bea1>] validate_entry_change+0x41/0x125 (XEN) [<ffff830000187d43>] resync_all+0x5e3/0x86c (XEN) [<ffff8300001881ef>] sync_all+0x223/0x29d (XEN) [<ffff830000190aec>] __shadow_sync_all+0x21/0x23 (XEN) [<ffff830000181395>] shadow_sync_all+0x86/0x176 (XEN) [<ffff83000017ebe1>] mov_to_cr+0x217/0x4d6 (XEN) [<ffff83000017f122>] svm_cr_access+0x282/0x4d7 (XEN) [<ffff8300001809b6>] svm_vmexit_handler+0x75f/0x9ff (XEN) (XEN) (XEN) **************************************** (XEN) Panic on CPU 0: (XEN) CPU0 FATAL TRAP: vector = 6 (invalid operand) (XEN) [error_code=0000] (XEN) **************************************** Selected section from "hg annotate xen/include/asm-x86/shadow.h |grep -n $ | grep 12[0-9][0-9]" 1213:4239: #ifndef NDEBUG 1214:4799: if ( ___shadow_status(d, gpfn, stype) != 0 ) 1215:4799: { 1216:6039: printk("d->id=%d gpfn=%lx gmfn=%lx stype=%lx c=%x t=%" P\ Rtype_info " " 1217:4799: "mfn_out_of_sync(gmfn)=%d mfn_is_page_table(gmfn)\ =%d\n", 1218:4877: d->domain_id, gpfn, gmfn, stype, 1219:8400: pfn_to_page(gmfn)->count_info, 1220:8400: pfn_to_page(gmfn)->u.inuse.type_info, 1221:4799: mfn_out_of_sync(gmfn), mfn_is_page_table(gmfn)); 1222:4799: BUG(); 1223:4799: } 1224:4799: 1225:4799: // Undo the affects of the above call to ___shadow_status()''\ s perf 1226:4799: // counters, since that call is really just part of an asser\ tion. 1227:4239: // 1228:4239: perfc_decrc(shadow_status_calls); 1229:4239: perfc_decrc(shadow_status_miss); 1230:4239: #endif _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2006-Feb-09 16:03 UTC
Re: [Xen-devel] X86_64 "assert" when booting 64-bit image.
On 9 Feb 2006, at 15:48, Petersson, Mats wrote:> My real question is, is the "assert" stuff in shadow.h:1222 valid? If > it > is valid, what is actually being checked? > > [yes, I have read the code, but I confess I don''t understand enough > about the internal workings of the shadow page-table to know what''s > going on in this bit of code]. > > The code is based on XenSource changeset 8677:b721455c5e44.That debug code is totally ancient. Ian may know whether it has any relevance any more. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Petersson, Mats
2006-Feb-09 16:16 UTC
RE: [Xen-devel] X86_64 "assert" when booting 64-bit image.
> -----Original Message----- > From: Keir Fraser [mailto:Keir.Fraser@cl.cam.ac.uk] > Sent: 09 February 2006 16:03 > To: Petersson, Mats > Cc: xen-devel@lists.xensource.com > Subject: Re: [Xen-devel] X86_64 "assert" when booting 64-bit image. > > > On 9 Feb 2006, at 15:48, Petersson, Mats wrote: > > > My real question is, is the "assert" stuff in shadow.h:1222 > valid? If > > it is valid, what is actually being checked? > > > > [yes, I have read the code, but I confess I don''t understand enough > > about the internal workings of the shadow page-table to know what''s > > going on in this bit of code]. > > > > The code is based on XenSource changeset 8677:b721455c5e44. > > That debug code is totally ancient. Ian may know whether it > has any relevance any more.Thanks Keir. I #if 0''d out the test and it flies through the rest of the stuff until the point where it got without debug. But I don''t think the checking code was added purely because it seemed fun to add in the first place, so I''m still a bit concerned that it may actually be pointing at something that causes a problem... Is it really safe to remove it? -- Mats> > -- Keir > > >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2006-Feb-09 16:30 UTC
Re: [Xen-devel] X86_64 "assert" when booting 64-bit image.
On 9 Feb 2006, at 16:16, Petersson, Mats wrote:>> That debug code is totally ancient. Ian may know whether it >> has any relevance any more. > > Thanks Keir. > > I #if 0''d out the test and it flies through the rest of the stuff until > the point where it got without debug. But I don''t think the checking > code was added purely because it seemed fun to add in the first place, > so I''m still a bit concerned that it may actually be pointing at > something that causes a problem... Is it really safe to remove it?Mats, I think that the correct thing to do is to remove that whole middle portion of __shadow_status(). That is, the entire outermost ''if'' statement. (That is, the ''if ( VALID_MFN()....'' all the way to ''return 0; }''). Can you please try that out and see how it works for you? I actually think there is another problem here. PGT_fl1_shadow shadow pages are looked up by the first guest pfn in that superpage extent, but that first guest pfn may itself be a pagetable page, and no pfn can currently have more than one ''shadow status''. That needs more investigation though... -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Petersson, Mats
2006-Feb-09 16:52 UTC
RE: [Xen-devel] X86_64 "assert" when booting 64-bit image.
> -----Original Message----- > From: Keir Fraser [mailto:Keir.Fraser@cl.cam.ac.uk] > Sent: 09 February 2006 16:31 > To: Petersson, Mats > Cc: xen-devel@lists.xensource.com > Subject: Re: [Xen-devel] X86_64 "assert" when booting 64-bit image. > > > On 9 Feb 2006, at 16:16, Petersson, Mats wrote: > > >> That debug code is totally ancient. Ian may know whether > it has any > >> relevance any more. > > > > Thanks Keir. > > > > I #if 0''d out the test and it flies through the rest of the stuff > > until the point where it got without debug. But I don''t think the > > checking code was added purely because it seemed fun to add in the > > first place, so I''m still a bit concerned that it may actually be > > pointing at something that causes a problem... Is it really > safe to remove it? > > Mats, > > I think that the correct thing to do is to remove that whole > middle portion of __shadow_status(). That is, the entire > outermost ''if'' > statement. (That is, the ''if ( VALID_MFN()....'' all the way > to ''return 0; }''). > > Can you please try that out and see how it works for you?I''ve done that [I did it first using a #if 0, but I''ve now hit the "delete" key for it...] - Patch attached. I don''t know if I need to add this for removing existing lines of code: Signed off by: Mats Petersson mats.petersson@amd.com -- Mats> > I actually think there is another problem here. > PGT_fl1_shadow shadow pages are looked up by the first guest > pfn in that superpage extent, but that first guest pfn may > itself be a pagetable page, and no pfn can currently have > more than one ''shadow status''. That needs more investigation though... > > -- Keir > > >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2006-Feb-09 17:06 UTC
Re: [Xen-devel] X86_64 "assert" when booting 64-bit image.
On 9 Feb 2006, at 16:52, Petersson, Mats wrote:>> I think that the correct thing to do is to remove that whole >> middle portion of __shadow_status(). That is, the entire >> outermost ''if'' >> statement. (That is, the ''if ( VALID_MFN()....'' all the way >> to ''return 0; }''). >> >> Can you please try that out and see how it works for you? > > I''ve done that [I did it first using a #if 0, but I''ve now hit the > "delete" key for it...] - Patch attached. > > I don''t know if I need to add this for removing existing lines of code: > Signed off by: Mats Petersson mats.petersson@amd.comNo, please try removing the if statement *as well*. That whole conditional return of 0 should go away -- the only return statement in that function should be the one on the final final of the function. Let me know how that works out. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Petersson, Mats
2006-Feb-09 17:16 UTC
RE: [Xen-devel] X86_64 "assert" when booting 64-bit image.
> -----Original Message----- > From: Keir Fraser [mailto:Keir.Fraser@cl.cam.ac.uk] > Sent: 09 February 2006 17:06 > To: Petersson, Mats > Cc: xen-devel@lists.xensource.com > Subject: Re: [Xen-devel] X86_64 "assert" when booting 64-bit image. > > > On 9 Feb 2006, at 16:52, Petersson, Mats wrote: > > >> I think that the correct thing to do is to remove that > whole middle > >> portion of __shadow_status(). That is, the entire outermost ''if'' > >> statement. (That is, the ''if ( VALID_MFN()....'' all the way to > >> ''return 0; }''). > >> > >> Can you please try that out and see how it works for you? > > > > I''ve done that [I did it first using a #if 0, but I''ve now hit the > > "delete" key for it...] - Patch attached. > > > > I don''t know if I need to add this for removing existing > lines of code: > > Signed off by: Mats Petersson mats.petersson@amd.com > > No, please try removing the if statement *as well*. That > whole conditional return of 0 should go away -- the only > return statement in that function should be the one on the > final final of the function. Let me know how that works out.So this is an updated patch... I can''t tell any difference - but then I''m not sure what type of cases it''s trying to catch... ;-) -- Mats> > -- Keir > > >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2006-Feb-09 17:27 UTC
Re: [Xen-devel] X86_64 "assert" when booting 64-bit image.
On 9 Feb 2006, at 17:16, Petersson, Mats wrote:>> No, please try removing the if statement *as well*. That >> whole conditional return of 0 should go away -- the only >> return statement in that function should be the one on the >> final final of the function. Let me know how that works out. > > So this is an updated patch... > > I can''t tell any difference - but then I''m not sure what type of cases > it''s trying to catch... ;-)Yes, that''s the patch I wanted. You tested it and it works? -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Nakajima, Jun
2006-Feb-09 17:44 UTC
RE: [Xen-devel] X86_64 "assert" when booting 64-bit image.
Keir Fraser wrote:> On 9 Feb 2006, at 17:16, Petersson, Mats wrote: > >>> No, please try removing the if statement *as well*. That >>> whole conditional return of 0 should go away -- the only >>> return statement in that function should be the one on the >>> final final of the function. Let me know how that works out. >> >> So this is an updated patch... >> >> I can''t tell any difference - but then I''m not sure what type of >> cases it''s trying to catch... ;-) > > Yes, that''s the patch I wanted. You tested it and it works? > > -- Keir > >I think that particular check is valid; it''s basically saying "we shadow page tables only". I feel mfn_is_page_table can be wrong with the type PGT_fl1_shadow, which is bigger than PGT_l4_shadow. Let me check and come back. I slightly remember I needed to kill the check when I was debugging the code, but not sure if it was triggered by a bug. if ( VALID_MFN(mfn) && mfn_valid(mfn) && (stype != PGT_writable_pred) && ((stype == PGT_snapshot) ? !mfn_out_of_sync(mfn) : !mfn_is_page_table(mfn)) ) { perfc_incrc(shadow_status_shortcut); #ifndef NDEBUG if ( ___shadow_status(d, gpfn, stype) != 0 ) { printk("d->id=%d gpfn=%lx mfn=%lx stype=%lx c=%x t=%" PRtype_info " " "mfn_out_of_sync(mfn)=%d mfn_is_page_table(mfn)=%d\n", d->domain_id, gpfn, mfn, stype, mfn_to_page(mfn)->count_info, mfn_to_page(mfn)->u.inuse.type_info, mfn_out_of_sync(mfn), mfn_is_page_table(mfn)); BUG(); } // Undo the affects of the above call to ___shadow_status()''s perf // counters, since that call is really just part of an assertion. // perfc_decrc(shadow_status_calls); perfc_decrc(shadow_status_miss); #endif return 0; } Jun --- Intel Open Source Technology Center _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2006-Feb-09 17:56 UTC
Re: [Xen-devel] X86_64 "assert" when booting 64-bit image.
On 9 Feb 2006, at 17:44, Nakajima, Jun wrote:> I think that particular check is valid; it''s basically saying "we > shadow > page tables only". I feel mfn_is_page_table can be wrong with the type > PGT_fl1_shadow, which is bigger than PGT_l4_shadow. Let me check and > come back. I slightly remember I needed to kill the check when I was > debugging the code, but not sure if it was triggered by a bug.Yeah, you killed the check by calling ___shadow_status() directly in one place, but there''s also a path via sync_all which still was bad. I just killed the whole function and renamed ___shadow_status() to take its place. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Nakajima, Jun
2006-Feb-09 18:07 UTC
RE: [Xen-devel] X86_64 "assert" when booting 64-bit image.
Keir Fraser wrote:> On 9 Feb 2006, at 17:44, Nakajima, Jun wrote: > >> I think that particular check is valid; it''s basically saying "we >> shadow page tables only". I feel mfn_is_page_table can be wrong with >> the type PGT_fl1_shadow, which is bigger than PGT_l4_shadow. Let me >> check and come back. I slightly remember I needed to kill the check >> when I was debugging the code, but not sure if it was triggered by a >> bug. > > Yeah, you killed the check by calling ___shadow_status() directly in > one place, but there''s also a path via sync_all which still was bad. > > I just killed the whole function and renamed ___shadow_status() to > take its place. > > -- KeirSounds sensible. That check was overkill for the non-debug Xen. I checked if we can remove the whole section, and it worked fine (i.e. 64-bit hvm guest on the 64-bit Xen). Jun --- Intel Open Source Technology Center _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Nakajima, Jun
2006-Feb-09 18:48 UTC
RE: [Xen-devel] X86_64 "assert" when booting 64-bit image.
Nakajima, Jun wrote:> Keir Fraser wrote: >> On 9 Feb 2006, at 17:44, Nakajima, Jun wrote: >> >>> I think that particular check is valid; it''s basically saying "we >>> shadow page tables only". I feel mfn_is_page_table can be wrong with >>> the type PGT_fl1_shadow, which is bigger than PGT_l4_shadow. Let me >>> check and come back. I slightly remember I needed to kill the check >>> when I was debugging the code, but not sure if it was triggered by a >>> bug. >> >> Yeah, you killed the check by calling ___shadow_status() directly in >> one place, but there''s also a path via sync_all which still was bad. >> >> I just killed the whole function and renamed ___shadow_status() to >> take its place. >> >> -- Keir > > Sounds sensible. That check was overkill for the non-debug Xen. I > checked if we can remove the whole section, and it worked fine (i.e. > 64-bit hvm guest on the 64-bit Xen).BTW, the xen built with debug=y also worked fine with this.> > Jun > --- > Intel Open Source Technology CenterJun --- Intel Open Source Technology Center _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel