Hi, $subject says all. Running latest bits from the 3.0.2 testing tree, trying to boot a 3.0.3 linux guest kernel with the compat option enabled, doesn''t boot on x86_64 (32bit works ok). Kernel dies very early at boot: (XEN) ----[ Xen-3.0.2-3 Not tainted ]---- (XEN) CPU: 1 (XEN) RIP: e033:[<ffffffff8010734a>] (XEN) RFLAGS: 0000000000000246 CONTEXT: guest (XEN) rax: 0000000000000000 rbx: ffffffff7fffffff rcx: ffffffff8010734a (XEN) rdx: 0000000000000000 rsi: 0000000000000001 rdi: ffffffff8041ff70 (XEN) rbp: ffffffff8041ff90 rsp: ffffffff8041ff10 r8: 0000000000000000 (XEN) r9: 0000000000000000 r10: 0000000000007ff0 r11: 0000000000000246 (XEN) r12: ffffffff8040c000 r13: 0000000000000000 r14: 0000000000000000 (XEN) r15: 0000000000000000 cr0: 000000008005003b cr3: 0000000021bdc000 (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e02b cs: e033 (XEN) Guest stack trace from rsp=ffffffff8041ff10: (XEN) ffffffff8010734a 0000000000000246 0000000000000011 ffffffff8010734a (XEN) 000000010000e030 0000000000010046 ffffffff8041ff58 000000000000e02b (XEN) 0000000021bdbff8 0000000000000000 0000000000000101 ffffffff80116f9b (XEN) 0000000000000005 0000000000021bdc ffffffff8040c000 0000000000000000 (XEN) ffffffff8041ffb0 ffffffff80110424 0000000000000000 0000000000000000 (XEN) ffffffff8041fff0 ffffffff804210e1 ffff800000000000 ffff804000000000 (XEN) 00000007ffffffff 0000000000000000 0000000000000000 0000000000000000 (XEN) 0000000000000000 0000000000000000 ### (XEN) RIP: e033:[<ffffffff8010734a>] ### (XEN) Guest stack trace from rsp=ffffffff8041ff10: ffffffff80116f9b - func ffffffff80116f4a + 51 xen_pt_switch ffffffff80110424 - func ffffffff80110357 + cd pda_init - call xen_pt_switch ffffffff804210e1 - func ffffffff80421000 + e1 x86_64_start_kernel - call pda_init -- Gerd Hoffmann <kraxel@suse.de> http://www.suse.de/~kraxel/julika-dora.jpeg _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann wrote:> Hi, > > $subject says all. Running latest bits from the 3.0.2 testing tree, > trying to boot a 3.0.3 linux guest kernel with the compat option > enabled, doesn''t boot on x86_64 (32bit works ok). Kernel dies very > early at boot:changeset 11226 is the culprit (stop setting _PAGE_USER for kernel pages). cheers, Gerd -- Gerd Hoffmann <kraxel@suse.de> http://www.suse.de/~kraxel/julika-dora.jpeg _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 13/11/06 16:09, "Gerd Hoffmann" <kraxel@suse.de> wrote:>> $subject says all. Running latest bits from the 3.0.2 testing tree, >> trying to boot a 3.0.3 linux guest kernel with the compat option >> enabled, doesn''t boot on x86_64 (32bit works ok). Kernel dies very >> early at boot: > > changeset 11226 is the culprit (stop setting _PAGE_USER for kernel pages).To fix this we''d need to make all the KERNPG_XXX macros into variables and poke in PAGE_USER if running on an older version of Xen. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser wrote:>>> $subject says all. Running latest bits from the 3.0.2 testing tree, >>> trying to boot a 3.0.3 linux guest kernel with the compat option >>> enabled, doesn''t boot on x86_64 (32bit works ok). Kernel dies very >>> early at boot: >> changeset 11226 is the culprit (stop setting _PAGE_USER for kernel pages). > > To fix this we''d need to make all the KERNPG_XXX macros into variables and > poke in PAGE_USER if running on an older version of Xen.As xen must be able to deal with PAGE_USER being set anyway (to deal with old guests) I''d simply make that a compile time option depending on CONFIG_XEN_COMPAT_030002, so we can avoid the extra cost of checking some variable at runtime ... What was the reason for that change btw? Just make the differences between native and paravirtualized smaller? cheers, Gerd -- Gerd Hoffmann <kraxel@suse.de> http://www.suse.de/~kraxel/julika-dora.jpeg _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 13/11/06 16:47, "Gerd Hoffmann" <kraxel@suse.de> wrote:>> To fix this we''d need to make all the KERNPG_XXX macros into variables and >> poke in PAGE_USER if running on an older version of Xen. > > As xen must be able to deal with PAGE_USER being set anyway (to deal > with old guests) I''d simply make that a compile time option depending on > CONFIG_XEN_COMPAT_030002, so we can avoid the extra cost of checking > some variable at runtime ... > > What was the reason for that change btw? Just make the differences > between native and paravirtualized smaller?Yes, and to allow fewer TLB entries to be flushed when switching between guest kernel and guest user. That optimisation is foiled if PAGE_USER is set everywhere. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser wrote:> > Yes, and to allow fewer TLB entries to be flushed when switching between > guest kernel and guest user. That optimisation is foiled if PAGE_USER is set > everywhere.Ok, so the extra cost to decide that at runtime (if CONFIG_XEN_COMPAT_030002 is set) probably is outweighed by the tlb flush optimization ... cheers, Gerd -- Gerd Hoffmann <kraxel@suse.de> http://www.suse.de/~kraxel/julika-dora.jpeg _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 13/11/06 17:08, "Gerd Hoffmann" <kraxel@suse.de> wrote:>> Yes, and to allow fewer TLB entries to be flushed when switching between >> guest kernel and guest user. That optimisation is foiled if PAGE_USER is set >> everywhere. > > Ok, so the extra cost to decide that at runtime (if > CONFIG_XEN_COMPAT_030002 is set) probably is outweighed by the tlb flush > optimization ...Definitely! The only potential problem is I don''t know whether any code depends on those definitions being compile-time constant. If not, it should be a straightforward patch. By the way, the test of whether to poke in PAGE_USER can be done by looking at one of the initial mappings provided by the domain builder. If one of those ptes contains PAGE_USER, you know you need to use PAGE_USER for all kernel mappings. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
>By the way, the test of whether to poke in PAGE_USER can be done by looking >at one of the initial mappings provided by the domain builder. If one of >those ptes contains PAGE_USER, you know you need to use PAGE_USER for all >kernel mappings.How would this work? adjust_guest_l1e() (and BASE_PROT for dom0) forces PAGE_USER on even on 3.0.3, so I still shouldn''t find any L1 page table entries not having this bit set when looking at them from kernel code. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 14/11/06 11:14, "Jan Beulich" <jbeulich@novell.com> wrote:>> By the way, the test of whether to poke in PAGE_USER can be done by looking >> at one of the initial mappings provided by the domain builder. If one of >> those ptes contains PAGE_USER, you know you need to use PAGE_USER for all >> kernel mappings. > > How would this work? adjust_guest_l1e() (and BASE_PROT for dom0) forces > PAGE_USER on even on 3.0.3, so I still shouldn''t find any L1 page table > entries > not having this bit set when looking at them from kernel code.Oh, good point! Okay then the more straightforward XENVER_version check will have to be used: 3.0.3 and newer, versus 3.0.2 and older. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
>>> Keir Fraser <keir@xensource.com> 14.11.06 12:50 >>> >On 14/11/06 11:14, "Jan Beulich" <jbeulich@novell.com> wrote: > >>> By the way, the test of whether to poke in PAGE_USER can be done by looking >>> at one of the initial mappings provided by the domain builder. If one of >>> those ptes contains PAGE_USER, you know you need to use PAGE_USER for all >>> kernel mappings. >> >> How would this work? adjust_guest_l1e() (and BASE_PROT for dom0) forces >> PAGE_USER on even on 3.0.3, so I still shouldn''t find any L1 page table >> entries >> not having this bit set when looking at them from kernel code. > >Oh, good point! Okay then the more straightforward XENVER_version check will >have to be used: 3.0.3 and newer, versus 3.0.2 and older.More strait forward? The .3 is part of extraversion, so in order to do a comparison one would have to parse that string (and hence make certain assumptions). That''s not nice for use in (early) feature detection. Maybe it''d be better to try and write a page table entry without PAGE_USER, and check whether that bit got turned on implicitly... Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 14/11/06 12:32, "Jan Beulich" <jbeulich@novell.com> wrote:>> Oh, good point! Okay then the more straightforward XENVER_version check will >> have to be used: 3.0.3 and newer, versus 3.0.2 and older. > > More strait forward? The .3 is part of extraversion, so in order to do a > comparison > one would have to parse that string (and hence make certain assumptions). > That''s > not nice for use in (early) feature detection. Maybe it''d be better to try and > write > a page table entry without PAGE_USER, and check whether that bit got turned > on implicitly...Yes, that sounds reasonable. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Oh, as I''m changing this - is there a reason for pmd_bad() to use PAGE_MASK rather than PTE_MASK? (I already agreed with Andi that pmd_bad() in native should be cleaned up to match pgd_bad/pud_bad). Jan>>> Keir Fraser <keir@xensource.com> 14.11.06 13:42 >>>On 14/11/06 12:32, "Jan Beulich" <jbeulich@novell.com> wrote:>> Oh, good point! Okay then the more straightforward XENVER_version check will >> have to be used: 3.0.3 and newer, versus 3.0.2 and older. > > More strait forward? The .3 is part of extraversion, so in order to do a > comparison > one would have to parse that string (and hence make certain assumptions). > That''s > not nice for use in (early) feature detection. Maybe it''d be better to try and > write > a page table entry without PAGE_USER, and check whether that bit got turned > on implicitly...Yes, that sounds reasonable. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Not as far as I know. On 14/11/06 13:02, "Jan Beulich" <jbeulich@novell.com> wrote:> Oh, as I''m changing this - is there a reason for pmd_bad() to use PAGE_MASK > rather than PTE_MASK? (I already agreed with Andi that pmd_bad() in native > should be cleaned up to match pgd_bad/pud_bad). Jan > >>>> Keir Fraser <keir@xensource.com> 14.11.06 13:42 >>> > On 14/11/06 12:32, "Jan Beulich" <jbeulich@novell.com> wrote: > >>> Oh, good point! Okay then the more straightforward XENVER_version check will >>> have to be used: 3.0.3 and newer, versus 3.0.2 and older. >> >> More strait forward? The .3 is part of extraversion, so in order to do a >> comparison >> one would have to parse that string (and hence make certain assumptions). >> That''s >> not nice for use in (early) feature detection. Maybe it''d be better to try >> and >> write >> a page table entry without PAGE_USER, and check whether that bit got turned >> on implicitly... > > Yes, that sounds reasonable. > > -- Keir > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich wrote:> not nice for use in (early) feature detection. Maybe it''d be better to try and write > a page table entry without PAGE_USER, and check whether that bit got turned > on implicitly...Patch attached, seems to work ok, survived quick test with ttylinux on both 3.0.3 and 3.0.2 without crashing and detected both versions correctly. cheers, Gerd -- Gerd Hoffmann <kraxel@suse.de> http://www.suse.de/~kraxel/julika-dora.jpeg _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 14/11/06 15:11, "Gerd Hoffmann" <kraxel@suse.de> wrote:>> not nice for use in (early) feature detection. Maybe it''d be better to try >> and write >> a page table entry without PAGE_USER, and check whether that bit got turned >> on implicitly... > > Patch attached, seems to work ok, survived quick test with ttylinux on > both 3.0.3 and 3.0.2 without crashing and detected both versions correctly. > > cheers, > GerdKernel_pages_need_user_flag seems unnecessary. The one consumer of that flag could simply do ''flags |= kernel_page_user'' unconditionally. Also, is it necessary to default to 3.0.2 behaviour? Could we have kernel_page_user==0 initially and then change the value only if 3.0.2 is detected? This would provide a sanity check that check_page_user_flag() is being executed early enough. We could even set kernel_page_user to a garbage value initially, like ~0. So far we have maintained the COMPAT code to be easily entirely strippable. So the change to pmd_bad() should either use kernel_page_user rather than _PAGE_USER, or its definition should be conditional on the COMPAT flag. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser wrote:> Kernel_pages_need_user_flag seems unnecessary. The one consumer of that flag > could simply do ''flags |= kernel_page_user'' unconditionally.Yep, noticed that too meanwhile ;)> Also, is it necessary to default to 3.0.2 behaviour?I''ve started coding it that way before I found the final place for the test, just to be sure it doesn''t crash on pgtable ops before the test. Defaulting to 3.0.3 behavior should work though as test happens early enough.> So far we have maintained the COMPAT code to be easily entirely strippable. > So the change to pmd_bad() should either use kernel_page_user rather than > _PAGE_USER, or its definition should be conditional on the COMPAT flag.Both ways should have the same effect as kernel_page_user is defined to 0 for the non-compat case. cheers, Gerd -- Gerd Hoffmann <kraxel@suse.de> http://www.suse.de/~kraxel/julika-dora.jpeg _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
>Patch attached, seems to work ok, survived quick test with ttylinux on >both 3.0.3 and 3.0.2 without crashing and detected both versions correctly.Hmm, just saw that you already posted a patch - though in addition to Keir''s remark regarding kernel_pages_need_user_flag I''d also like to point out that changing __PAGE_KERNEL_LARGE{,_EXEC} is unnecessary as those already derive from __PAGE_KERNEL{,_EXEC}. Additionally I think your change to arch/i386/mm/ioremap-xen.c makes the code more complicated than it needs to be (I know, it had been that way before). Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
>Also, is it necessary to default to 3.0.2 behaviour? Could we have >kernel_page_user==0 initially and then change the value only if 3.0.2 is >detected? This would provide a sanity check that check_page_user_flag() is >being executed early enough. We could even set kernel_page_user to a garbage >value initially, like ~0.I think it''s better the way Gerd had it (my patch also does it that way) - adding extra _PAGE_USER when not needed is not wrong afaics, only hurts performance, whereas missing to add it when needed would crash the kernel. While that might help verify that the check is done early enough, it doesn''t guarantee anything since certain code paths may not be taken, and so we may enjoy false safety. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Gerd Hoffmann wrote:>> So far we have maintained the COMPAT code to be easily entirely strippable. >> So the change to pmd_bad() should either use kernel_page_user rather than >> _PAGE_USER, or its definition should be conditional on the COMPAT flag. > > Both ways should have the same effect as kernel_page_user is defined to > 0 for the non-compat case.Well, thinko in there, wasn''t that simple. Went with the #ifdef, new version attached. cheers, Gerd -- Gerd Hoffmann <kraxel@suse.de> http://www.suse.de/~kraxel/julika-dora.jpeg _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 14/11/06 16:06, "Gerd Hoffmann" <kraxel@suse.de> wrote:>> Both ways should have the same effect as kernel_page_user is defined to >> 0 for the non-compat case. > > Well, thinko in there, wasn''t that simple. Went with the #ifdef, new > version attached.I''ve applied a version that is part yours, part Jan''s, and part mine. Changeset 12404:bb76a76985fe. It would be great if you could test that it actually still works on Xen 3.0.2! :-) -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser wrote:> On 14/11/06 16:06, "Gerd Hoffmann" <kraxel@suse.de> wrote: > > I''ve applied a version that is part yours, part Jan''s, and part mine. > Changeset 12404:bb76a76985fe. It would be great if you could test that it > actually still works on Xen 3.0.2! :-)Works for me, thanks. Maybe that should be added to the regression tests? BTW: what is the state of the testing mercurial tree? It hasn''t seen updates for quite some time. I''d expect bugfixes like this one being queued for 3.0.3 too ... cheers, Gerd -- Gerd Hoffmann <kraxel@suse.de> http://www.suse.de/~kraxel/julika-dora.jpeg _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On 15/11/06 10:02, "Gerd Hoffmann" <kraxel@suse.de> wrote:>> I''ve applied a version that is part yours, part Jan''s, and part mine. >> Changeset 12404:bb76a76985fe. It would be great if you could test that it >> actually still works on Xen 3.0.2! :-) > > Works for me, thanks. Maybe that should be added to the regression tests? > > BTW: what is the state of the testing mercurial tree? It hasn''t seen > updates for quite some time. I''d expect bugfixes like this one being > queued for 3.0.3 too ...We currently maintain a patch queue against 3.0.3-testing and we''ll apply patches to 3.0.3-testing from that. -- Keir _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel